Yizhu Zhao

Multi-Modal Large Models Based Beam Prediction: An Example Empowered by DeepSeek

Jun 06, 2025

Yizhu Zhao, Li Yu, Lianzheng Shi, Jianhua Zhang, Guangyi Liu

Abstract: Beam prediction is an effective approach to reducing training overhead in massive multiple-input multiple-output (MIMO) systems. However, existing beam prediction models still generalize poorly across diverse scenarios, which remains a critical challenge. In this paper, we propose MLM-BP, a beam prediction framework based on the multi-modal large model released by DeepSeek, with full consideration of multi-modal environmental information. Specifically, sensing devices capture the distribution of scatterers that affect the optimal beam. Positions are then tokenized to generate text-based representations, and multi-view images are processed by an image encoder, fine-tuned with low-rank adaptation (LoRA), to extract environmental embeddings. Finally, these embeddings are fed into the large model, and an output projection module is designed to determine the optimal beam index. Simulation results show that MLM-BP achieves 98.1% Top-1 accuracy on the simulation dataset. It also demonstrates few-shot generalization on a real-world dataset, achieving 72.7% Top-1 accuracy and 92.4% Top-3 accuracy with only 30% of the dataset, outperforming existing small models by over 15%.
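The Top-1 and Top-3 figures quoted in the abstract are Top-k accuracies: a sample counts as correct when the true optimal beam index is among the k highest-scored candidates. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `top_k_accuracy`, the four-beam codebook, and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    best_beams: Sequence[int],
    k: int = 1,
) -> float:
    """Fraction of samples whose true best beam is among the k top-scored beams.

    scores[i][j] is the model's score for beam j on sample i (hypothetical
    layout); best_beams[i] is the ground-truth optimal beam index.
    """
    if len(scores) != len(best_beams):
        raise ValueError("scores and best_beams must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, target in zip(scores, best_beams):
        # Rank beam indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Toy data (illustrative only): three samples, four candidate beams.
scores = [
    [0.1, 0.7, 0.2, 0.0],
    [0.5, 0.1, 0.3, 0.1],
    [0.2, 0.2, 0.1, 0.5],
]
best_beams = [1, 2, 0]

top1 = top_k_accuracy(scores, best_beams, k=1)
top3 = top_k_accuracy(scores, best_beams, k=3)
print(f"Top-1: {top1:.3f}, Top-3: {top3:.3f}")
```

Top-3 accuracy is always at least as high as Top-1, which is why reporting both indicates how often the correct beam would be found by sweeping a short list of candidates rather than a single one.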
