Abstract: Designing a 6G-oriented universal model capable of processing multi-modal data and executing diverse air interface tasks has emerged as a common goal in future wireless systems. Building on our prior work in communication multi-modal alignment and a telecom large language model (LLM), we propose a scalable, task-aware artificial intelligence-air interface multi-modal universal model (AI2MMUM), which flexibly and effectively performs various physical layer tasks according to subtle task instructions. The LLM backbone provides robust contextual comprehension and generalization capabilities, while a fine-tuning approach is adopted to incorporate domain-specific knowledge. To enhance task adaptability, task instructions consist of fixed task keywords and learnable, implicit prefix prompts. Frozen radio modality encoders extract universal representations, and adapter layers subsequently bridge the radio and language modalities. Moreover, lightweight task-specific heads are designed to directly output task objectives. Comprehensive evaluations demonstrate that AI2MMUM achieves state-of-the-art (SOTA) performance across five representative physical environment/wireless channel-based downstream tasks using the WAIR-D and DeepMIMO datasets.
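The abstract above outlines the AI2MMUM architecture only at a high level; the following is a minimal PyTorch-style sketch of how its components could fit together (frozen radio encoder, adapter, LLM backbone with fixed task keywords plus learnable prefix prompts, and a lightweight task head). All module names, dimensions, and task names here are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class AI2MMUM(nn.Module):
    """Illustrative layout of the abstract's pipeline; not the authors' implementation."""

    def __init__(self, llm_backbone, radio_encoder,
                 radio_dim=512, llm_dim=4096, prefix_len=8, num_task_outputs=64):
        super().__init__()
        # Frozen radio-modality encoder extracting universal representations.
        self.radio_encoder = radio_encoder
        for p in self.radio_encoder.parameters():
            p.requires_grad = False
        # Adapter layer bridging the radio and language modalities.
        self.adapter = nn.Linear(radio_dim, llm_dim)
        # Pre-trained LLM backbone (assumed to accept inputs_embeds, HF-style).
        self.llm = llm_backbone
        # Learnable, implicit prefix prompts, one set per task (task names are hypothetical).
        self.prefix_prompts = nn.ParameterDict({
            "beam_prediction": nn.Parameter(torch.randn(prefix_len, llm_dim)),
            "positioning": nn.Parameter(torch.randn(prefix_len, llm_dim)),
        })
        # Lightweight task-specific head that outputs the task objective directly.
        self.task_head = nn.Linear(llm_dim, num_task_outputs)

    def forward(self, radio_input, task_keyword_embeds, task_name):
        # Radio features -> language embedding space: (batch, tokens, llm_dim).
        radio_tokens = self.adapter(self.radio_encoder(radio_input))
        prefix = self.prefix_prompts[task_name].unsqueeze(0).expand(radio_input.size(0), -1, -1)
        # Task instruction = fixed task-keyword embeddings + learnable prefix prompts.
        llm_inputs = torch.cat([task_keyword_embeds, prefix, radio_tokens], dim=1)
        hidden = self.llm(inputs_embeds=llm_inputs).last_hidden_state
        # Task objective regressed/classified from the final hidden state.
        return self.task_head(hidden[:, -1])
```

In this sketch only the adapter, prefix prompts, task head, and any fine-tuned LLM parameters would receive gradients, consistent with the abstract's frozen-encoder, fine-tuned-backbone description.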
Abstract: Existing works on machine learning (ML)-empowered wireless communication primarily focus on specific scenarios and single tasks. However, with the rapid growth of communication task classes, coupled with diverse task requirements in future 6G systems, this working pattern is clearly unsustainable. Therefore, identifying a groundbreaking paradigm that enables a universal model to solve multiple physical layer tasks across diverse scenarios is crucial for future system evolution. This paper aims to fundamentally address the curse of ML model generalization across diverse scenarios and tasks by unleashing the multi-modal feature integration capabilities of future systems. Given the universality of electromagnetic propagation theory, the communication process is determined by the scattering environment, which can be characterized more comprehensively through cross-modal perception, thus providing sufficient information for all communication tasks across varied environments. This fact motivates us to propose a transformative two-stage multi-modal pre-training and downstream task adaptation paradigm...