Qingbiao Li

RAP: Runtime-Adaptive Pruning for LLM Inference

May 26, 2025

Huanrong Liu, Chunlin Tian, Xuyang Wei, Jiaheng Dai, Qin Liu, Tianqi Wei, Qingbiao Li, Li Li

Abstract: Large language models (LLMs) excel at language understanding and generation, but their enormous computational and memory requirements hinder deployment. Compression offers a potential solution to mitigate these constraints. However, most existing methods rely on fixed heuristics and thus fail to adapt to runtime memory variations or heterogeneous KV-cache demands arising from diverse user requests. To address these limitations, we propose RAP, an elastic pruning framework driven by reinforcement learning (RL) that dynamically adjusts compression strategies in a runtime-aware manner. Specifically, RAP dynamically tracks the evolving ratio between model parameters and KV-cache during practical execution. Recognizing that FFNs house most parameters, whereas parameter-light attention layers dominate KV-cache formation, the RL agent retains only those components that maximize utility within the current memory budget, conditioned on instantaneous workload and device state. Extensive experimental results demonstrate that RAP outperforms state-of-the-art baselines and is the first approach to jointly consider model weights and KV-cache on the fly.
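
To make the parameter versus KV-cache trade-off concrete, the following is a minimal back-of-the-envelope sketch, not the RAP method itself. It uses the standard accounting for a decoder-only transformer (one key and one value tensor per layer). The model configuration and the 7-billion-parameter figure are hypothetical values chosen for illustration and do not come from the paper.

```python
def param_bytes(num_params: int, bytes_per_value: int = 2) -> int:
    """Memory taken by the model weights (default: 16-bit values)."""
    return num_params * bytes_per_value


def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int,
                   bytes_per_value: int = 2) -> int:
    """Memory taken by the KV-cache.

    The leading factor 2 accounts for storing both a key and a value
    tensor in every layer, for every cached token of every request.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


def kv_share(num_params: int, **kv_config: int) -> float:
    """Fraction of (weights + KV-cache) memory that the KV-cache occupies."""
    kv = kv_cache_bytes(**kv_config)
    return kv / (kv + param_bytes(num_params))


if __name__ == "__main__":
    # Hypothetical configuration, assumed for illustration only.
    config = dict(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)
    for batch in (1, 4, 16):
        share = kv_share(7_000_000_000, batch_size=batch, **config)
        print(f"batch={batch:2d}  KV-cache share of memory: {share:.2f}")
```

Under these assumed numbers the KV-cache takes 2 GiB for a single 4096-token request but 32 GiB for sixteen of them, which exceeds the 14 GB of 16-bit weights. This is the kind of shifting ratio that a fixed compression heuristic cannot follow.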

TacGNN: Learning Tactile-based In-hand Manipulation with a Blind Robot

Apr 03, 2023

Linhan Yang, Bidan Huang, Qingbiao Li, Ya-Yen Tsai, Wang Wei Lee, Chaoyang Song, Jia Pan

Abstract: In this paper, we propose a novel framework for tactile-based dexterous manipulation learning with a blind anthropomorphic robotic hand, i.e., without visual sensing. First, object-related states are extracted from the raw tactile signals by a graph-based perception model, TacGNN. The resulting tactile features are then used in the second stage for policy learning of an in-hand manipulation task. The method is evaluated on a Baoding ball task: simultaneously rotating two spheres around each other by 180 degrees in hand. We conduct experiments on object state prediction and in-hand manipulation using a reinforcement learning algorithm (PPO). Results show that TacGNN is effective in predicting object-related states during manipulation, reducing the prediction RMSE to 0.096 cm compared with other methods such as MLP, CNN, and GCN. Finally, the robot hand can complete an in-hand manipulation task relying solely on the robot's own perception, i.e., tactile sensing and proprioception. In addition, our methods are tested on three tasks with different difficulty levels and transferred to the real robot without further training.

* 8 pages, 4 figures, accepted by RAL

Accelerating Multi-Agent Planning Using Graph Transformers with Bounded Suboptimality

Jan 20, 2023

Chenning Yu, Qingbiao Li, Sicun Gao, Amanda Prorok

Abstract: Conflict-Based Search is one of the most popular methods for multi-agent path finding. Though it is complete and optimal, it does not scale well. Recent works have proposed accelerating it by introducing various heuristics. However, whether these heuristics can apply to non-grid-based problem settings while maintaining their effectiveness remains an open question. In this work, we find that the answer is likely no. To this end, we propose a learning-based component, i.e., the Graph Transformer, as a heuristic function to accelerate the planning. The proposed method is provably complete and bounded-suboptimal with any desired factor. We conduct extensive experiments on two environments with dense graphs. Results show that the proposed Graph Transformer can be trained on problem instances with relatively few agents and generalizes well to a larger number of agents, while achieving better performance than state-of-the-art methods.

* Accepted by ICRA 2023

Learning to Navigate using Visual Sensor Networks

Aug 05, 2022

Jan Blumenkamp, Qingbiao Li, Binyu Wang, Zhe Liu, Amanda Prorok

Abstract: We consider the problem of navigating a mobile robot towards a target in an unknown environment that is endowed with visual sensors, where neither the robot nor the sensors have access to global positioning information and only use first-person-view images. While prior work in sensor network-based navigation uses explicit mapping and planning techniques, and is often aided by external positioning systems, we propose a vision-only learning approach that leverages a Graph Neural Network (GNN) to encode and communicate relevant viewpoint information to the mobile robot. During navigation, the robot is guided by a model that we train through imitation learning to approximate optimal motion primitives, thereby predicting the effective cost-to-go (to the target). In our experiments, we first demonstrate generalizability to previously unseen environments with various sensor layouts. The results show that communication among the sensors and robot facilitates a significant improvement in success rate while decreasing path detour mean and variability. This is done without requiring a global map, positioning data, or pre-calibration of the sensor network. Second, we perform a zero-shot transfer of our model from simulation to the real world. To this end, we train a 'translator' model that translates between latent encodings of real and simulated images so that the navigation policy (which is trained entirely in simulation) can be used directly on the real robot, without additional fine-tuning. Physical experiments demonstrate the feasibility of our approach in various cluttered environments.

A Framework for Real-World Multi-Robot Systems Running Decentralized GNN-Based Policies

Nov 02, 2021

Jan Blumenkamp, Steven Morad, Jennifer Gielis, Qingbiao Li, Amanda Prorok

Abstract: Graph Neural Networks (GNNs) are a paradigm-shifting neural architecture to facilitate the learning of complex multi-agent behaviors. Recent work has demonstrated remarkable performance in tasks such as flocking, multi-agent path planning and cooperative coverage. However, the policies derived through GNN-based learning schemes have not yet been deployed to the real world on physical multi-robot systems. In this work, we present the design of a system that allows for fully decentralized execution of GNN-based policies. We create a framework based on ROS2 and elaborate on its details in this paper. We demonstrate our framework on a case study that requires tight coordination between robots, and present first-of-a-kind results that show successful real-world deployment of GNN-based policies on a decentralized multi-robot system relying on ad hoc communication. A video demonstration of this case study can be found online: https://www.youtube.com/watch?v=COh-WLn4iO4

* Submitted to IEEE ICRA (International Conference on Robotics and Automation) 2022

Graph Neural Network Guided Local Search for the Traveling Salesperson Problem

Oct 12, 2021

Benjamin Hudson, Qingbiao Li, Matthew Malencia, Amanda Prorok

Abstract: Solutions to the Traveling Salesperson Problem (TSP) have practical applications to processes in transportation, logistics, and automation, yet must be computed with minimal delay to satisfy the real-time nature of the underlying tasks. However, solving large TSP instances quickly without sacrificing solution quality remains challenging for current approximate algorithms. To close this gap, we present a hybrid data-driven approach for solving the TSP based on Graph Neural Networks (GNNs) and Guided Local Search (GLS). Our model predicts the regret of including each edge of the problem graph in the solution; GLS uses these predictions in conjunction with the original problem graph to find solutions. Our experiments demonstrate that this approach converges to optimal solutions at a faster rate than state-of-the-art learning-based approaches and non-learning GLS algorithms for the TSP, notably finding optimal solutions to 96% of the 50-node problem set, 7% more than the next best benchmark, and to 20% of the 100-node problem set, 4.5x more than the next best benchmark. When generalizing from 20-node problems to the 100-node problem set, our approach finds solutions with an average optimality gap of 2.5%, a 10x improvement over the next best learning-based benchmark.
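
The notion of per-edge regret can be illustrated on a toy instance. The sketch below is a brute-force illustration only and does not reproduce the paper's GNN or its GLS procedure. It assumes that the regret of an edge is the relative extra cost of the best tour forced to contain that edge, compared with the optimal tour; the abstract does not spell out the definition, so treat this as an assumption. All function names are hypothetical.

```python
from itertools import permutations
from math import dist


def tour_cost(tour, points):
    """Length of the closed tour visiting `points` in the given order."""
    n = len(tour)
    return sum(dist(points[tour[i]], points[tour[(i + 1) % n]])
               for i in range(n))


def tour_edges(tour):
    """Undirected edges used by a closed tour."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}


def edge_regrets(points):
    """Brute-force regret of every edge (feasible for tiny instances only).

    Assumed definition: regret(e) = cost(best tour containing e) / cost(optimal tour) - 1,
    so edges on an optimal tour have regret 0.
    """
    n = len(points)
    # Fix node 0 as the start to avoid enumerating rotations of the same tour.
    tours = [(0,) + rest for rest in permutations(range(1, n))]
    scored = [(tour_cost(t, points), tour_edges(t)) for t in tours]
    optimal = min(cost for cost, _ in scored)
    regrets = {}
    for i in range(n):
        for j in range(i + 1, n):
            edge = frozenset((i, j))
            best_with_edge = min(cost for cost, edges in scored if edge in edges)
            regrets[(i, j)] = best_with_edge / optimal - 1.0
    return regrets


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    for edge, regret in sorted(edge_regrets(square).items()):
        print(edge, round(regret, 4))
```

On the unit square, the four sides have zero regret and the two diagonals have positive regret. A learned model that approximates these values lets a local search focus on penalizing high-regret edges without solving the instance exactly.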

The Holy Grail of Multi-Robot Planning: Learning to Generate Online-Scalable Solutions from Offline-Optimal Experts

Jul 26, 2021

Amanda Prorok, Jan Blumenkamp, Qingbiao Li, Ryan Kortvelesy, Zhe Liu, Ethan Stump

Abstract: Many multi-robot planning problems are burdened by the curse of dimensionality, which compounds the difficulty of applying solutions to large-scale problem instances. The use of learning-based methods in multi-robot planning holds great promise as it enables us to offload the online computational burden of expensive, yet optimal solvers, to an offline learning procedure. Simply put, the idea is to train a policy to copy an optimal pattern generated by a small-scale system, and then transfer that policy to much larger systems, in the hope that the learned strategy scales, while maintaining near-optimal performance. Yet, a number of issues impede us from leveraging this idea to its full potential. This blue-sky paper elaborates some of the key challenges that remain.

Graph Neural Networks for Decentralized Multi-Robot Submodular Action Selection

May 18, 2021

Lifeng Zhou, Vishnu D. Sharma, Qingbiao Li, Amanda Prorok, Alejandro Ribeiro, Vijay Kumar

Abstract: In this paper, we develop a learning-based approach for decentralized submodular maximization. We focus on applications where robots are required to jointly select actions, e.g., motion primitives, to maximize team submodular objectives with local communications only. Such applications are essential for large-scale multi-robot coordination such as multi-robot motion planning for area coverage, environment exploration, and target tracking. However, current decentralized submodular maximization algorithms either require assumptions on the inter-robot communication or lose some suboptimality guarantees. In this work, we propose a general-purpose learning architecture for submodular maximization at scale, with decentralized communications. In particular, our learning architecture leverages a graph neural network (GNN) to capture local interactions of the robots and learns decentralized decision-making for the robots. We train the learning model by imitating an expert solution and implement the resulting model for decentralized action selection involving local observations and communications only. We demonstrate the performance of our GNN-based learning approach in a scenario of active target coverage with large networks of robots. The simulation results show that our approach nearly matches the coverage performance of the expert algorithm, yet runs several orders of magnitude faster with more than 30 robots. The results also demonstrate our approach's generalization capability in previously unseen scenarios, e.g., larger environments and larger networks of robots.

Decentralized Control with Graph Neural Networks

Dec 29, 2020

Fernando Gama, Qingbiao Li, Ekaterina Tolstaya, Amanda Prorok, Alejandro Ribeiro

Abstract: Dynamical systems consisting of a set of autonomous agents face the challenge of having to accomplish a global task, relying only on local information. While centralized controllers are readily available, they face limitations in terms of scalability and implementation, as they do not respect the distributed information structure imposed by the network system of agents. Given the difficulties in finding optimal decentralized controllers, we propose a novel framework using graph neural networks (GNNs) to learn these controllers. GNNs are well-suited for the task since they are naturally distributed architectures and exhibit good scalability and transferability properties. The problems of flocking and multi-agent path planning are explored to illustrate the potential of GNNs in learning decentralized controllers.

* Submitted to IEEE Transactions on Signal Processing

Message-Aware Graph Attention Networks for Large-Scale Multi-Robot Path Planning

Nov 26, 2020

Qingbiao Li, Weizhe Lin, Zhe Liu, Amanda Prorok

Abstract: The domains of transport and logistics are increasingly relying on autonomous mobile robots for the handling and distribution of passengers or resources. At large system scales, finding decentralized path planning and coordination solutions is key to efficient system performance. Recently, Graph Neural Networks (GNNs) have become popular due to their ability to learn communication policies in decentralized multi-agent systems. Yet, vanilla GNNs rely on simplistic message aggregation mechanisms that prevent agents from prioritizing important information. To tackle this challenge, we extend our previous work that utilizes GNNs in multi-agent path planning by incorporating a novel mechanism to allow for message-dependent attention. Our Message-Aware Graph Attention neTwork (MAGAT) is based on a key-query-like mechanism that determines the relative importance of features in the messages received from various neighboring robots. We show that MAGAT is able to achieve a performance close to that of a coupled centralized expert algorithm. Further, ablation studies and comparisons to several benchmark models show that our attention mechanism is very effective across different robot densities and performs stably under different communication bandwidth constraints. Experiments demonstrate that our model is able to generalize well to previously unseen problem instances, and it achieves a 47% improvement over the benchmark success rate, even in very large-scale instances that are 100x larger than the training instances.
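
As a rough illustration of what a key-query-like weighting of neighbor messages can look like, the sketch below uses generic dot-product attention in plain Python. It is not MAGAT's actual formulation: the projection matrices `W_query` and `W_key`, the function names, and the softmax normalization are all assumptions made for the example.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(own_feature, messages, W_query, W_key):
    """Weight neighbor messages by their relevance to the receiving robot.

    The robot's own feature is projected to a query, each incoming message
    to a key; the softmax of their dot products gives the attention weights.
    Returns the weights and the weighted sum of the messages.
    """
    query = matvec(W_query, own_feature)
    scores = [sum(q * k for q, k in zip(query, matvec(W_key, m)))
              for m in messages]
    weights = softmax(scores)
    dim = len(messages[0])
    aggregated = [sum(w * m[d] for w, m in zip(weights, messages))
                  for d in range(dim)]
    return weights, aggregated


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned projections
    weights, aggregated = attend([1.0, 0.0],
                                 [[1.0, 0.0], [0.0, 1.0]],
                                 identity, identity)
    print(weights, aggregated)
```

In contrast to a plain sum or mean over neighbors, the weights here depend on the content of each message, so a message aligned with the receiver's query contributes more to the aggregate.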

* This work has been submitted to the IEEE Robotics and Automation Letters (RA-L) for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
