



The emergence of extremely large-scale antenna arrays (ELAA) in millimeter-wave (mmWave) communications, particularly in high-mobility scenarios, makes near-field beam prediction increasingly important. In contrast to the conventional far-field setting, near-field beam prediction requires codebooks that jointly sample the angular and distance domains, which dramatically increases pilot overhead. Moreover, unlike the far-field case, where the optimal beam evolves smoothly over time, the optimal near-field beam index exhibits abrupt and nonlinear dynamics because it depends jointly on user angle and distance, which poses significant challenges for temporal modeling. To address these challenges, we propose a novel near-field beam prediction framework based on a convolutional neural network and Generative Pre-trained Transformer 2 (CNN-GPT2). Specifically, an uplink pilot transmission strategy is designed to enable efficient channel probing through wide-beam analog precoding and frequency-varying digital precoding. The received pilot signals are preprocessed and passed through a CNN-based feature extractor, followed by a GPT-2 model that captures temporal dependencies across multiple frames and directly predicts the near-field beam index in an end-to-end manner.
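
To illustrate why jointly sampling angle and distance enlarges the codebook, the following is a minimal sketch and not the codebook used in this work. It assumes a uniform linear array with half-wavelength spacing and a spherical-wave steering model; the function names (`near_field_steering`, `build_codebook`, `best_beam_index`) and all numerical values are hypothetical choices made for illustration only.

```python
import cmath
import math


def near_field_steering(n_ant, wavelength, angle_rad, distance_m):
    """Unit-norm spherical-wave steering vector (assumed half-wavelength ULA)."""
    spacing = wavelength / 2.0
    vec = []
    for n in range(n_ant):
        # Element position along the array, measured from the array centre.
        pos = (n - (n_ant - 1) / 2.0) * spacing
        # Exact distance from the focal point (angle, distance) to this element.
        r_n = math.sqrt(
            distance_m ** 2 + pos ** 2
            - 2.0 * distance_m * pos * math.sin(angle_rad)
        )
        # Phase relative to the array centre.
        phase = -2.0 * math.pi * (r_n - distance_m) / wavelength
        vec.append(cmath.exp(1j * phase) / math.sqrt(n_ant))
    return vec


def build_codebook(n_ant, wavelength, angles_rad, distances_m):
    """One codeword per (angle, distance) pair, ordered angle-major."""
    return [
        near_field_steering(n_ant, wavelength, a, r)
        for a in angles_rad
        for r in distances_m
    ]


def best_beam_index(codebook, channel):
    """Index of the codeword with the largest beamforming gain |w^H h|."""
    gains = [
        abs(sum(w.conjugate() * h for w, h in zip(codeword, channel)))
        for codeword in codebook
    ]
    return max(range(len(gains)), key=gains.__getitem__)


# Hypothetical sampling grid: 5 angles x 3 distances.
angles = [math.radians(a) for a in (-60, -30, 0, 30, 60)]
distances = [2.0, 5.0, 20.0]
codebook = build_codebook(64, 0.01, angles, distances)
```

With this grid the codebook holds one codeword per angle-distance pair, so an exhaustive sweep needs as many pilots as the product of the two grid sizes, whereas an angle-only codebook would need only one pilot per angle. Because the index is ordered angle-major in this sketch, a small change in user distance or angle can move the best index by a large step, which is one way to picture the abrupt index dynamics described above.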