Supervised learning policy networks