Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tianyu Jiao

Towards Native Intelligence: 6G-LLM Trained with Reinforcement Learning from NDT Feedback

Jan 15, 2026

Zhuoran Xiao, Tao Tao, Chenhui Ye, Yunbo Hu, Yijia Feng, Tianyu Jiao, Liyu Cai

Abstract:Owing to its comprehensive understanding of upper-layer application requirements and the capabilities of practical communication systems, the 6G-LLM (6G domain large language model) offers a promising pathway toward realizing network native intelligence. Serving as the system orchestrator, the 6G-LLM drives a paradigm shift that fundamentally departs from existing rule-based approaches, which primarily rely on modular, experience-driven optimization. By contrast, the 6G-LLM substantially enhances network flexibility and adaptability. Nevertheless, current efforts to construct 6G-LLMs are constrained by their reliance on large-scale, meticulously curated, human-authored corpora, which are impractical to obtain in real-world scenarios. Moreover, purely offline-trained models lack the capacity for continual self-improvement, limiting their ability to adapt to the highly dynamic requirements of wireless communication environments. To overcome these limitations, we propose a novel training paradigm termed RLDTF (Reinforcement Learning from Digital Twin Feedback) for 6G-LLMs. This framework leverages network digital twins to generate reward signals based on orchestration outcomes, while employing reinforcement learning to guide the model toward optimal decision-making dynamically. Furthermore, we introduce a weighted token mechanism to improve output accuracy. Comprehensive experimental results demonstrate that our proposed framework significantly outperforms state-of-the-art baselines in orchestration accuracy and solution optimality.

* The paper has been accepted IEEE WCNC 2026

Via

Access Paper or Ask Questions

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Nov 18, 2025

MiroMind Team, Song Bai, Lidong Bing, Carson Chen, Guanzheng Chen, Yuntao Chen, Zhe Chen, Ziyi Chen, Jifeng Dai, Xuan Dong(+45 more)

Abstract:We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that only scale up model size or context length, MiroThinker explores interaction scaling at the model level, systematically training the model to handle deeper and more frequent agent-environment interactions as a third dimension of performance improvement. Unlike LLM test-time scaling, which operates in isolation and risks degradation with longer reasoning chains, interactive scaling leverages environment feedback and external information acquisition to correct errors and refine trajectories. Through reinforcement learning, the model achieves efficient interaction scaling: with a 256K context window, it can perform up to 600 tool calls per task, enabling sustained multi-turn reasoning and complex real-world research workflows. Across four representative benchmarks-GAIA, HLE, BrowseComp, and BrowseComp-ZH-the 72B variant achieves up to 81.9%, 37.7%, 47.1%, and 55.6% accuracy respectively, surpassing previous open-source agents and approaching commercial counterparts such as GPT-5-high. Our analysis reveals that MiroThinker benefits from interactive scaling consistently: research performance improves predictably as the model engages in deeper and more frequent agent-environment interactions, demonstrating that interaction depth exhibits scaling behaviors analogous to model size and context length. These findings establish interaction scaling as a third critical dimension for building next-generation open research agents, complementing model capacity and context windows.

* Technical Report

Via

Access Paper or Ask Questions

AI2MMUM: AI-AI Oriented Multi-Modal Universal Model Leveraging Telecom Domain Large Model

May 15, 2025

Tianyu Jiao, Zhuoran Xiao, Yihang Huang, Chenhui Ye, Yijia Feng, Liyu Cai, Jiang Chang, Fangkun Liu, Yin Xu, Dazhi He(+2 more)

Abstract:Designing a 6G-oriented universal model capable of processing multi-modal data and executing diverse air interface tasks has emerged as a common goal in future wireless systems. Building on our prior work in communication multi-modal alignment and telecom large language model (LLM), we propose a scalable, task-aware artificial intelligence-air interface multi-modal universal model (AI2MMUM), which flexibility and effectively perform various physical layer tasks according to subtle task instructions. The LLM backbone provides robust contextual comprehension and generalization capabilities, while a fine-tuning approach is adopted to incorporate domain-specific knowledge. To enhance task adaptability, task instructions consist of fixed task keywords and learnable, implicit prefix prompts. Frozen radio modality encoders extract universal representations and adapter layers subsequently bridge radio and language modalities. Moreover, lightweight task-specific heads are designed to directly output task objectives. Comprehensive evaluations demonstrate that AI2MMUM achieves SOTA performance across five representative physical environment/wireless channel-based downstream tasks using the WAIR-D and DeepMIMO datasets.

Via

Access Paper or Ask Questions

Addressing the Curse of Scenario and Task Generalization in AI-6G: A Multi-Modal Paradigm

Apr 07, 2025

Tianyu Jiao, Zhuoran Xiao, Yin Xu, Chenhui Ye, Yihang Huang, Zhiyong Chen, Liyu Cai, Jiang Chang, Dazhi He, Yunfeng Guan(+2 more)

Abstract:Existing works on machine learning (ML)-empowered wireless communication primarily focus on monolithic scenarios and single tasks. However, with the blooming growth of communication task classes coupled with various task requirements in future 6G systems, this working pattern is obviously unsustainable. Therefore, identifying a groundbreaking paradigm that enables a universal model to solve multiple tasks in the physical layer within diverse scenarios is crucial for future system evolution. This paper aims to fundamentally address the curse of ML model generalization across diverse scenarios and tasks by unleashing multi-modal feature integration capabilities in future systems. Given the universality of electromagnetic propagation theory, the communication process is determined by the scattering environment, which can be more comprehensively characterized by cross-modal perception, thus providing sufficient information for all communication tasks across varied environments. This fact motivates us to propose a transformative two-stage multi-modal pre-training and downstream task adaptation paradigm...

Via

Access Paper or Ask Questions

AFDM-Enabled Integrated Sensing and Communication: Theoretical Framework and Pilot Design

Feb 20, 2025

Fan Zhang, Zhaocheng Wang, Tianqi Mao, Tianyu Jiao, Yinxiao Zhuo, Miaowen Wen, Wei Xiang, Sheng Chen, George K. Karagiannidis

Figure 1 for AFDM-Enabled Integrated Sensing and Communication: Theoretical Framework and Pilot Design

Figure 2 for AFDM-Enabled Integrated Sensing and Communication: Theoretical Framework and Pilot Design

Figure 3 for AFDM-Enabled Integrated Sensing and Communication: Theoretical Framework and Pilot Design

Figure 4 for AFDM-Enabled Integrated Sensing and Communication: Theoretical Framework and Pilot Design

Abstract:The integrated sensing and communication (ISAC) has been envisioned as one representative usage scenario of sixth-generation (6G) network. However, the unprecedented characteristics of 6G, especially the doubly dispersive channel, make classical ISAC waveforms rather challenging to guarantee a desirable performance level. The recently proposed affine frequency division multiplexing (AFDM) can attain full diversity even under doubly dispersive effects, thus becoming a competitive candidate for next-generation ISAC waveforms. Relevant investigations are still at an early stage, which involve only straightforward design lacking explicit theoretical analysis. This paper provides an in-depth investigation on AFDM waveform design for ISAC applications. Specifically, the closed-form Cr\'{a}mer-Rao bounds of target detection for AFDM are derived, followed by a demonstration on its merits over existing counterparts. Furthermore, we formulate the ambiguity function of the pilot-assisted AFDM waveform for the first time, revealing conditions for stable sensing performance. To further enhance both the communication and sensing performance of the AFDM waveform, we propose a novel pilot design by exploiting the characteristics of AFDM signals. The proposed design is analytically validated to be capable of optimizing the ambiguity function property and channel estimation accuracy simultaneously as well as overcoming the sensing and channel estimation range limitation originated from the pilot spacing. Numerical results have verified the superiority of the proposed pilot design in terms of dual-functional performance.

Via

Access Paper or Ask Questions