Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shuo Liu

$X$-evolve: Solution space evolution powered by large language models

Aug 11, 2025

Yi Zhai, Zhiqiang Wei, Ruohan Li, Keyu Pan, Shuo Liu, Lu Zhang, Jianmin Ji, Wuyang Zhang, Yu Zhang, Yanyong Zhang

$Figure 1 for $X$-evolve: Solution space evolution powered by large language models$

$Figure 2 for $X$-evolve: Solution space evolution powered by large language models$

$Figure 3 for $X$-evolve: Solution space evolution powered by large language models$

$Figure 4 for $X$-evolve: Solution space evolution powered by large language models$

Abstract:While combining large language models (LLMs) with evolutionary algorithms (EAs) shows promise for solving complex optimization problems, current approaches typically evolve individual solutions, often incurring high LLM call costs. We introduce $X$-evolve, a paradigm-shifting method that instead evolves solution spaces $X$ (sets of individual solutions) - subsets of the overall search space $S$. In $X$-evolve, LLMs generate tunable programs wherein certain code snippets, designated as parameters, define a tunable solution space. A score-based search algorithm then efficiently explores this parametrically defined space, guided by feedback from objective function scores. This strategy enables broader and more efficient exploration, which can potentially accelerate convergence at a much lower search cost, requiring up to two orders of magnitude fewer LLM calls than prior leading methods. We demonstrate $X$-evolve's efficacy across three distinct hard optimization problems. For the cap set problem, we discover a larger partial admissible set, establishing a new tighter asymptotic lower bound for the cap set constant ($C \ge 2.2203$). In information theory, we uncover a larger independent set for the 15-vertex cycle graph ($\mathcal{C}_{15}^{\boxtimes 5}$, size 19,946), thereby raising the known lower bound on its Shannon capacity. Furthermore, for the NP-hard online bin packing problem, we generate heuristics that consistently outperform standard strategies across established benchmarks. By evolving solution spaces, our method considerably improves search effectiveness, making it possible to tackle high-dimensional problems that were previously computationally prohibitive.

Via

Access Paper or Ask Questions

LLM Collaboration With Multi-Agent Reinforcement Learning

Aug 06, 2025

Shuo Liu, Zeyu Liang, Xueguang Lyu, Christopher Amato

Abstract:A large amount of work has been done in Multi-Agent Systems (MAS) for modeling and solving problems with multiple interacting agents. However, most LLMs are pretrained independently and not specifically optimized for coordination. Existing LLM fine-tuning frameworks rely on individual rewards, which require complex reward designs for each agent to encourage collaboration. To address these challenges, we model LLM collaboration as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. We develop a multi-agent, multi-turn algorithm, Multi-Agent Group Relative Policy Optimization (MAGRPO), to solve it, building on current RL approaches for LLMs as well as MARL techniques. Our experiments on LLM writing and coding collaboration demonstrate that fine-tuning MAS with MAGRPO enables agents to generate high-quality responses efficiently through effective cooperation. Our approach opens the door to using other MARL methods for LLMs and highlights the associated challenges.

Via

Access Paper or Ask Questions

VIMS: A Visual-Inertial-Magnetic-Sonar SLAM System in Underwater Environments

Jun 18, 2025

Bingbing Zhang, Huan Yin, Shuo Liu, Fumin Zhang, Wen Xu

Figure 1 for VIMS: A Visual-Inertial-Magnetic-Sonar SLAM System in Underwater Environments

Figure 2 for VIMS: A Visual-Inertial-Magnetic-Sonar SLAM System in Underwater Environments

Figure 3 for VIMS: A Visual-Inertial-Magnetic-Sonar SLAM System in Underwater Environments

Figure 4 for VIMS: A Visual-Inertial-Magnetic-Sonar SLAM System in Underwater Environments

Abstract:In this study, we present a novel simultaneous localization and mapping (SLAM) system, VIMS, designed for underwater navigation. Conventional visual-inertial state estimators encounter significant practical challenges in perceptually degraded underwater environments, particularly in scale estimation and loop closing. To address these issues, we first propose leveraging a low-cost single-beam sonar to improve scale estimation. Then, VIMS integrates a high-sampling-rate magnetometer for place recognition by utilizing magnetic signatures generated by an economical magnetic field coil. Building on this, a hierarchical scheme is developed for visual-magnetic place recognition, enabling robust loop closure. Furthermore, VIMS achieves a balance between local feature tracking and descriptor-based loop closing, avoiding additional computational burden on the front end. Experimental results highlight the efficacy of the proposed VIMS, demonstrating significant improvements in both the robustness and accuracy of state estimation within underwater environments.

* This work has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)

Via

Access Paper or Ask Questions

dots.llm1 Technical Report

Jun 06, 2025

Bi Huo, Bin Tu, Cheng Qin, Da Zheng, Debing Zhang, Dongjie Zhang, En Li, Fu Guo, Jian Yao, Jie Lou(+17 more)

Abstract:Mixture of Experts (MoE) models have emerged as a promising paradigm for scaling language models efficiently by activating only a subset of parameters for each input token. In this report, we present dots.llm1, a large-scale MoE model that activates 14B parameters out of a total of 142B parameters, delivering performance on par with state-of-the-art models while reducing training and inference costs. Leveraging our meticulously crafted and efficient data processing pipeline, dots.llm1 achieves performance comparable to Qwen2.5-72B after pretraining on 11.2T high-quality tokens and post-training to fully unlock its capabilities. Notably, no synthetic data is used during pretraining. To foster further research, we open-source intermediate training checkpoints at every one trillion tokens, providing valuable insights into the learning dynamics of large language models.

Via

Access Paper or Ask Questions

Fixing Incomplete Value Function Decomposition for Multi-Agent Reinforcement Learning

May 15, 2025

Andrea Baisero, Rupali Bhati, Shuo Liu, Aathira Pillai, Christopher Amato

Abstract:Value function decomposition methods for cooperative multi-agent reinforcement learning compose joint values from individual per-agent utilities, and train them using a joint objective. To ensure that the action selection process between individual utilities and joint values remains consistent, it is imperative for the composition to satisfy the individual-global max (IGM) property. Although satisfying IGM itself is straightforward, most existing methods (e.g., VDN, QMIX) have limited representation capabilities and are unable to represent the full class of IGM values, and the one exception that has no such limitation (QPLEX) is unnecessarily complex. In this work, we present a simple formulation of the full class of IGM values that naturally leads to the derivation of QFIX, a novel family of value function decomposition models that expand the representation capabilities of prior models by means of a thin "fixing" layer. We derive multiple variants of QFIX, and implement three variants in two well-known multi-agent frameworks. We perform an empirical evaluation on multiple SMACv2 and Overcooked environments, which confirms that QFIX (i) succeeds in enhancing the performance of prior methods, (ii) learns more stably and performs better than its main competitor QPLEX, and (iii) achieves this while employing the simplest and smallest mixing models.

Via

Access Paper or Ask Questions

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

May 12, 2025

Xiaomi LLM-Core Team, :, Bingquan Xia, Bowen Shen, Cici, Dawei Zhu, Di Zhang, Gang Wang, Hailin Zhang, Huaqiu Liu(+55 more)

Abstract:We present MiMo-7B, a large language model born for reasoning tasks, with optimization across both pre-training and post-training stages. During pre-training, we enhance the data preprocessing pipeline and employ a three-stage data mixing strategy to strengthen the base model's reasoning potential. MiMo-7B-Base is pre-trained on 25 trillion tokens, with additional Multi-Token Prediction objective for enhanced performance and accelerated inference speed. During post-training, we curate a dataset of 130K verifiable mathematics and programming problems for reinforcement learning, integrating a test-difficulty-driven code-reward scheme to alleviate sparse-reward issues and employing strategic data resampling to stabilize training. Extensive evaluations show that MiMo-7B-Base possesses exceptional reasoning potential, outperforming even much larger 32B models. The final RL-tuned model, MiMo-7B-RL, achieves superior performance on mathematics, code and general reasoning tasks, surpassing the performance of OpenAI o1-mini. The model checkpoints are available at https://github.com/xiaomimimo/MiMo.

Via

Access Paper or Ask Questions

Rapid diagnostics of reconfigurable intelligent surfaces using space-time-coding modulation

May 06, 2025

Yi Ning Zheng, Lei Zhang, Xiao Qing Chen, Marco Rossi, Giuseppe Castaldi, Shuo Liu, Tie Jun Cui, Vincenzo Galdi

Abstract:Reconfigurable intelligent surfaces (RISs) have emerged as a key technology for shaping smart wireless environments in next-generation wireless communication systems. To support the large-scale deployment of RISs, a reliable and efficient diagnostic method is essential to ensure optimal performance. In this work, a robust and efficient approach for RIS diagnostics is proposed using a space-time coding strategy with orthogonal codes. The method encodes the reflected signals from individual RIS elements into distinct code channels, enabling the recovery of channel power at the receiving terminals for fault identification. Theoretical analysis shows that the normally functioning elements generate high power in their respective code channels, whereas the faulty elements exhibit significantly lower power. This distinction enables rapid and accurate diagnostics of elements' operational states through simple signal processing techniques. Simulation results validate the effectiveness of the proposed method, even under high fault ratios and varying reception angles. Proof-of-principle experiments on two RIS prototypes are conducted, implementing two coding strategies: direct and segmented. Experimental results in a realistic scenario confirm the reliability of the diagnostic method, demonstrating its potential for large-scale RIS deployment in future wireless communication systems and radar applications.

* 30 pages, 6 figures, 1 table, supporting information

Via

Access Paper or Ask Questions

AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings

Apr 29, 2025

Guoqing Hu, An Zhang, Shuo Liu, Zhibo Cai, Xun Yang, Xiang Wang

Figure 1 for AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings

Figure 2 for AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings

Figure 3 for AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings

Figure 4 for AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings

Abstract:Recent advancements in sequential recommendation have underscored the potential of Large Language Models (LLMs) for enhancing item embeddings. However, existing approaches face three key limitations: 1) the degradation of the semantic space when high-dimensional language embeddings are mapped to lower-dimensional ID embeddings, 2) the underutilization of language embeddings, and 3) the reliance on additional trainable parameters, such as an adapter, to bridge the gap between the semantic and behavior spaces. In this paper, we introduce AlphaFuse, a simple but effective language-guided learning strategy that addresses these challenges by learning ID embeddings within the null space of language embeddings. Specifically, we decompose the semantic space of language embeddings via Singular Value Decomposition (SVD), distinguishing it into a semantic-rich row space and a semantic-sparse null space. Collaborative signals are then injected into the null space, while preserving the rich semantics of the row space. AlphaFuse prevents degradation of the semantic space, integrates the retained language embeddings into the final item embeddings, and eliminates the need for auxiliary trainable modules, enabling seamless adaptation to any sequential recommendation framework. We validate the effectiveness and flexibility of AlphaFuse through extensive experiments on three benchmark datasets, including cold-start user and long-tail settings, showcasing significant improvements in both discriminative and diffusion-based generative sequential recommenders. Our codes and datasets are available at https://github.com/Hugo-Chinn/AlphaFuse.

* Accepted by SIGIR'25

Via

Access Paper or Ask Questions

Control Barrier Functions via Minkowski Operations for Safe Navigation among Polytopic Sets

Apr 01, 2025

Yi-Hsuan Chen, Shuo Liu, Wei Xiao, Calin Belta, Michael Otte

Figure 1 for Control Barrier Functions via Minkowski Operations for Safe Navigation among Polytopic Sets

Figure 2 for Control Barrier Functions via Minkowski Operations for Safe Navigation among Polytopic Sets

Figure 3 for Control Barrier Functions via Minkowski Operations for Safe Navigation among Polytopic Sets

Figure 4 for Control Barrier Functions via Minkowski Operations for Safe Navigation among Polytopic Sets

Abstract:Safely navigating around obstacles while respecting the dynamics, control, and geometry of the underlying system is a key challenge in robotics. Control Barrier Functions (CBFs) generate safe control policies by considering system dynamics and geometry when calculating safe forward-invariant sets. Existing CBF-based methods often rely on conservative shape approximations, like spheres or ellipsoids, which have explicit and differentiable distance functions. In this paper, we propose an optimization-defined CBF that directly considers the exact Signed Distance Function (SDF) between a polytopic robot and polytopic obstacles. Inspired by the Gilbert-Johnson-Keerthi (GJK) algorithm, we formulate both (i) minimum distance and (ii) penetration depth between polytopic sets as convex optimization problems in the space of Minkowski difference operations (the MD-space). Convenient geometric properties of the MD-space enable the derivatives of implicit SDF between two polytopes to be computed via differentiable optimization. We demonstrate the proposed framework in three scenarios including pure translation, initialization inside an unsafe set, and multi-obstacle avoidance. These three scenarios highlight the generation of a non-conservative maneuver, a recovery after starting in collision, and the consideration of multiple obstacles via pairwise CBF constraint, respectively.

* 8 pages, 3 figures

Via

Access Paper or Ask Questions

Effective and Efficient Representation Learning for Flight Trajectories

Dec 21, 2024

Shuo Liu, Wenbin Li, Di Yao, Jingping Bi

Figure 1 for Effective and Efficient Representation Learning for Flight Trajectories

Figure 2 for Effective and Efficient Representation Learning for Flight Trajectories

Figure 3 for Effective and Efficient Representation Learning for Flight Trajectories

Figure 4 for Effective and Efficient Representation Learning for Flight Trajectories

Abstract:Flight trajectory data plays a vital role in the traffic management community, especially for downstream tasks such as trajectory prediction, flight recognition, and anomaly detection. Existing works often utilize handcrafted features and design models for different tasks individually, which heavily rely on domain expertise and are hard to extend. We argue that different flight analysis tasks share the same useful features of the trajectory. Jointly learning a unified representation for flight trajectories could be beneficial for improving the performance of various tasks. However, flight trajectory representation learning (TRL) faces two primary challenges, \ie unbalanced behavior density and 3D spatial continuity, which disable recent general TRL methods. In this paper, we propose Flight2Vec , a flight-specific representation learning method to address these challenges. Specifically, a behavior-adaptive patching mechanism is used to inspire the learned representation to pay more attention to behavior-dense segments. Moreover, we introduce a motion trend learning technique that guides the model to memorize not only the precise locations, but also the motion trend to generate better representations. Extensive experimental results demonstrate that Flight2Vec significantly improves performance in downstream tasks such as flight trajectory prediction, flight recognition, and anomaly detection.

* Accepted by AAAI 2025

Via

Access Paper or Ask Questions