Deep neural networks are the state of the art in many tasks, such as image classification, image detection, speech recognition, and text analysis. Here we set out to better understand ‘end-to-end’ learning for an autonomous vehicle, which refers to directly learning the decision that results from the perception of the scene. For example, we consider learning a binary ‘stop’/‘go’ decision with respect to pedestrians, given the input image. In this work we propose to use additional information, referred to as ‘proxy supervision’, to improve learning, and we study its effects on overall performance. We show that the proxy labels significantly improve the robustness of learning, while achieving accuracy as good as, or better than, that of the original binary classification task.