Automated vehicles require a comprehensive understanding of traffic situations to ensure safe and comfortable driving. In this context, predicting pedestrian behavior is particularly challenging because it can be influenced by multiple factors. In this paper, we analyze the requirements for pedestrian behavior prediction in automated driving using a system-level approach. To this end, we first investigate real-world pedestrian-vehicle interactions involving human drivers. Based on the observed human driving behavior, we then derive appropriate reaction patterns for an automated vehicle. On this basis, we determine requirements for pedestrian prediction, including a novel metric tailored to measuring prediction performance from a system-level perspective. Furthermore, we present a pedestrian prediction model based on a Conditional Variational Auto-Encoder (CVAE) that incorporates multiple contextual cues to achieve accurate long-term prediction. Evaluated on a large-scale data set comprising thousands of real-world pedestrian-vehicle interactions, the CVAE outperforms a baseline prediction model. Finally, we investigate the impact of different contextual cues on prediction performance in an ablation study, whose results can guide future research on the perception of relevant pedestrian attributes.