Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shengcai Liu

OD-DEAL: Dynamic Expert-Guided Adversarial Learning with Online Decomposition for Scalable Capacitated Vehicle Routing

Feb 03, 2026

Dongbin Jiao, Zisheng Chen, Xianyi Wang, Jintao Shi, Shengcai Liu, Shi Yan

Abstract:Solving large-scale capacitated vehicle routing problems (CVRP) is hindered by the high complexity of heuristics and the limited generalization of neural solvers on massive graphs. We propose OD-DEAL, an adversarial learning framework that tightly integrates hybrid genetic search (HGS) and online barycenter clustering (BCC) decomposition, and leverages high-fidelity knowledge distillation to transfer expert heuristic behavior. OD-DEAL trains a graph attention network (GAT)-based generative policy through a minimax game, in which divide-and-conquer strategies from a hybrid expert are distilled into dense surrogate rewards. This enables high-quality, clustering-free inference on large-scale instances. Empirical results demonstrate that OD-DEAL achieves state-of-the-art (SOTA) real-time CVRP performance, solving 10000-node instances with near-constant neural scaling. This uniquely enables the sub-second, heuristic-quality inference required for dynamic large-scale deployment.

Via

Access Paper or Ask Questions

Personalized Treatment Outcome Prediction from Scarce Data via Dual-Channel Knowledge Distillation and Adaptive Fusion

Oct 30, 2025

Wenjie Chen, Li Zhuang, Ziying Luo, Yu Liu, Jiahao Wu, Shengcai Liu

Abstract:Personalized treatment outcome prediction based on trial data for small-sample and rare patient groups is critical in precision medicine. However, the costly trial data limit the prediction performance. To address this issue, we propose a cross-fidelity knowledge distillation and adaptive fusion network (CFKD-AFN), which leverages abundant but low-fidelity simulation data to enhance predictions on scarce but high-fidelity trial data. CFKD-AFN incorporates a dual-channel knowledge distillation module to extract complementary knowledge from the low-fidelity model, along with an attention-guided fusion module to dynamically integrate multi-source information. Experiments on treatment outcome prediction for the chronic obstructive pulmonary disease demonstrates significant improvements of CFKD-AFN over state-of-the-art methods in prediction accuracy, ranging from 6.67\% to 74.55\%, and strong robustness to varying high-fidelity dataset sizes. Furthermore, we extend CFKD-AFN to an interpretable variant, enabling the exploration of latent medical semantics to support clinical decision-making.

Via

Access Paper or Ask Questions

Quality-Diversity Red-Teaming: Automated Generation of High-Quality and Diverse Attackers for Large Language Models

Jun 08, 2025

Ren-Jian Wang, Ke Xue, Zeyu Qin, Ziniu Li, Sheng Tang, Hao-Tian Li, Shengcai Liu, Chao Qian

Abstract:Ensuring safety of large language models (LLMs) is important. Red teaming--a systematic approach to identifying adversarial prompts that elicit harmful responses from target LLMs--has emerged as a crucial safety evaluation method. Within this framework, the diversity of adversarial prompts is essential for comprehensive safety assessments. We find that previous approaches to red-teaming may suffer from two key limitations. First, they often pursue diversity through simplistic metrics like word frequency or sentence embedding similarity, which may not capture meaningful variation in attack strategies. Second, the common practice of training a single attacker model restricts coverage across potential attack styles and risk categories. This paper introduces Quality-Diversity Red-Teaming (QDRT), a new framework designed to address these limitations. QDRT achieves goal-driven diversity through behavior-conditioned training and implements a behavioral replay buffer in an open-ended manner. Additionally, it trains multiple specialized attackers capable of generating high-quality attacks across diverse styles and risk categories. Our empirical evaluation demonstrates that QDRT generates attacks that are both more diverse and more effective against a wide range of target LLMs, including GPT-2, Llama-3, Gemma-2, and Qwen2.5. This work advances the field of LLM safety by providing a systematic and effective approach to automated red-teaming, ultimately supporting the responsible deployment of LLMs.

Via

Access Paper or Ask Questions

Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets

May 17, 2025

Ning Lu, Shengcai Liu, Jiahao Wu, Weiyu Chen, Zhirui Zhang, Yew-Soon Ong, Qi Wang, Ke Tang

Abstract:Large language models (LLMs) have shown great potential as general-purpose AI assistants across various domains. To fully leverage this potential in specific applications, many companies provide fine-tuning API services, enabling users to upload their own data for LLM customization. However, fine-tuning services introduce a new safety threat: user-uploaded data, whether harmful or benign, can break the model's alignment, leading to unsafe outputs. Moreover, existing defense methods struggle to address the diversity of fine-tuning datasets (e.g., varying sizes, tasks), often sacrificing utility for safety or vice versa. To address this issue, we propose Safe Delta, a safety-aware post-training defense method that adjusts the delta parameters (i.e., the parameter change before and after fine-tuning). Specifically, Safe Delta estimates the safety degradation, selects delta parameters to maximize utility while limiting overall safety loss, and applies a safety compensation vector to mitigate residual safety loss. Through extensive experiments on four diverse datasets with varying settings, our approach consistently preserves safety while ensuring that the utility gain from benign datasets remains unaffected.

* ICML 2025 Camera Ready

Via

Access Paper or Ask Questions

Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs

May 16, 2025

Zhangying Feng, Qianglong Chen, Ning Lu, Yongqian Li, Siqi Cheng, Shuangmu Peng, Duyu Tang, Shengcai Liu, Zhirui Zhang

Abstract:The development of reasoning capabilities represents a critical frontier in large language models (LLMs) research, where reinforcement learning (RL) and process reward models (PRMs) have emerged as predominant methodological frameworks. Contrary to conventional wisdom, empirical evidence from DeepSeek-R1 demonstrates that pure RL training focused on mathematical problem-solving can progressively enhance reasoning abilities without PRM integration, challenging the perceived necessity of process supervision. In this study, we conduct a systematic investigation of the relationship between RL training and PRM capabilities. Our findings demonstrate that problem-solving proficiency and process supervision capabilities represent complementary dimensions of reasoning that co-evolve synergistically during pure RL training. In particular, current PRMs underperform simple baselines like majority voting when applied to state-of-the-art models such as DeepSeek-R1 and QwQ-32B. To address this limitation, we propose Self-PRM, an introspective framework in which models autonomously evaluate and rerank their generated solutions through self-reward mechanisms. Although Self-PRM consistently improves the accuracy of the benchmark (particularly with larger sample sizes), analysis exposes persistent challenges: The approach exhibits low precision (<10\%) on difficult problems, frequently misclassifying flawed solutions as valid. These analyses underscore the need for continued RL scaling to improve reward alignment and introspective accuracy. Overall, our findings suggest that PRM may not be essential for enhancing complex reasoning, as pure RL not only improves problem-solving skills but also inherently fosters robust PRM capabilities. We hope these findings provide actionable insights for building more reliable and self-aware complex reasoning models.

Via

Access Paper or Ask Questions

SemDiff: Generating Natural Unrestricted Adversarial Examples via Semantic Attributes Optimization in Diffusion Models

Apr 16, 2025

Zeyu Dai, Shengcai Liu, Rui He, Jiahao Wu, Ning Lu, Wenqi Fan, Qing Li, Ke Tang

Abstract:Unrestricted adversarial examples (UAEs), allow the attacker to create non-constrained adversarial examples without given clean samples, posing a severe threat to the safety of deep learning models. Recent works utilize diffusion models to generate UAEs. However, these UAEs often lack naturalness and imperceptibility due to simply optimizing in intermediate latent noises. In light of this, we propose SemDiff, a novel unrestricted adversarial attack that explores the semantic latent space of diffusion models for meaningful attributes, and devises a multi-attributes optimization approach to ensure attack success while maintaining the naturalness and imperceptibility of generated UAEs. We perform extensive experiments on four tasks on three high-resolution datasets, including CelebA-HQ, AFHQ and ImageNet. The results demonstrate that SemDiff outperforms state-of-the-art methods in terms of attack success rate and imperceptibility. The generated UAEs are natural and exhibit semantically meaningful changes, in accord with the attributes' weights. In addition, SemDiff is found capable of evading different defenses, which further validates its effectiveness and threatening.

Via

Access Paper or Ask Questions

Memetic Search for Green Vehicle Routing Problem with Private Capacitated Refueling Stations

Apr 06, 2025

Rui Xu, Xing Fan, Shengcai Liu, Wenjie Chen, Ke Tang

Figure 1 for Memetic Search for Green Vehicle Routing Problem with Private Capacitated Refueling Stations

Figure 2 for Memetic Search for Green Vehicle Routing Problem with Private Capacitated Refueling Stations

Figure 3 for Memetic Search for Green Vehicle Routing Problem with Private Capacitated Refueling Stations

Figure 4 for Memetic Search for Green Vehicle Routing Problem with Private Capacitated Refueling Stations

Abstract:The green vehicle routing problem with private capacitated alternative fuel stations (GVRP-PCAFS) extends the traditional green vehicle routing problem by considering refueling stations limited capacity, where a limited number of vehicles can refuel simultaneously with additional vehicles must wait. This feature presents new challenges for route planning, as waiting times at stations must be managed while keeping route durations within limits and reducing total travel distance. This article presents METS, a novel memetic algorithm (MA) with separate constraint-based tour segmentation (SCTS) and efficient local search (ELS) for solving GVRP-PCAFS. METS combines global and local search effectively through three novelties. For global search, the SCTS strategy splits giant tours to generate diverse solutions, and the search process is guided by a comprehensive fitness evaluation function to dynamically control feasibility and diversity to produce solutions that are both diverse and near-feasible. For local search, ELS incorporates tailored move operators with constant-time move evaluation mechanisms, enabling efficient exploration of large solution neighborhoods. Experimental results demonstrate that METS discovers 31 new best-known solutions out of 40 instances in existing benchmark sets, achieving substantial improvements over current state-of-the-art methods. Additionally, a new large-scale benchmark set based on real-world logistics data is introduced to facilitate future research.

Via

Access Paper or Ask Questions

Cascaded Large-Scale TSP Solving with Unified Neural Guidance: Bridging Local and Population-based Search

Jan 24, 2025

Shengcai Liu, Haoze Lv, Zhiyuan Wang, Ke Tang

Figure 1 for Cascaded Large-Scale TSP Solving with Unified Neural Guidance: Bridging Local and Population-based Search

Figure 2 for Cascaded Large-Scale TSP Solving with Unified Neural Guidance: Bridging Local and Population-based Search

Figure 3 for Cascaded Large-Scale TSP Solving with Unified Neural Guidance: Bridging Local and Population-based Search

Figure 4 for Cascaded Large-Scale TSP Solving with Unified Neural Guidance: Bridging Local and Population-based Search

Abstract:The traveling salesman problem (TSP) is a fundamental NP-hard optimization problem. This work presents UNiCS, a novel unified neural-guided cascaded solver for solving large-scale TSP instances. UNiCS comprises a local search (LS) phase and a population-based search (PBS) phase, both guided by a learning component called unified neural guidance (UNG). Specifically, UNG guides solution generation across both phases and determines appropriate phase transition timing to effectively combine the complementary strengths of LS and PBS. While trained only on simple distributions with relatively small-scale TSP instances, UNiCS generalizes effectively to challenging TSP benchmarks containing much larger instances (10,000-71,009 nodes) with diverse node distributions entirely unseen during training. Experimental results on the large-scale TSP instances demonstrate that UNiCS consistently outperforms state-of-the-art methods, with its advantage remaining consistent across various runtime budgets.

Via

Access Paper or Ask Questions

Domain-Agnostic Co-Evolution of Generalizable Parallel Algorithm Portfolios

Jan 08, 2025

Zhiyuan Wang, Shengcai Liu, Peng Yang, Ke Tang

Abstract:Generalization is the core objective when training optimizers from data. However, limited training instances often constrain the generalization capability of the trained optimizers. Co-evolutionary approaches address this challenge by simultaneously evolving a parallel algorithm portfolio (PAP) and an instance population to eventually obtain PAPs with good generalization. Yet, when applied to a specific problem class, these approaches have a major limitation. They require practitioners to provide instance generators specially tailored to the problem class, which is often non-trivial to design. This work proposes a general-purpose, off-the-shelf PAP construction approach, named domain-agnostic co-evolution of parameterized search (DACE), for binary optimization problems where decision variables take values of 0 or 1. The key innovation of DACE lies in its neural network-based domain-agnostic instance representation and generation mechanism that delimitates the need for domain-specific instance generators. The strong generality of DACE is validated across three real-world binary optimization problems: the complementary influence maximization problem (CIMP), the compiler arguments optimization problem (CAOP), and the contamination control problem (CCP). Given only a small set of training instances from these classes, DACE, without requiring any domain knowledge, constructs PAPs with better generalization performance than existing approaches on all three classes, despite their use of domain-specific instance generators.

Via

Access Paper or Ask Questions

Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges

Oct 25, 2024

Zubin Zheng, Shengcai Liu, Yew-Soon Ong

Figure 1 for Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges

Figure 2 for Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges

Figure 3 for Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges

Figure 4 for Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges

Abstract:With growing environmental concerns, electric vehicles for logistics have gained significant attention within the computational intelligence community in recent years. This work addresses an emerging and significant extension of the electric vehicle routing problem (EVRP), namely EVRP with time windows, simultaneous pickup-delivery, and partial recharges (EVRP-TW-SPD), which has wide real-world applications. We propose a hybrid memetic algorithm (HMA) for solving EVRP-TW-SPD. HMA incorporates two novel components: a parallel-sequential station insertion procedure for handling partial recharges that can better avoid local optima compared to purely sequential insertion, and a cross-domain neighborhood search that explores solution spaces of both electric and non-electric problem domains simultaneously. These components can also be easily applied to various EVRP variants. To bridge the gap between existing benchmarks and real-world scenarios, we introduce a new, large-scale EVRP-TW-SPD benchmark set derived from real-world applications, containing instances with many more customers and charging stations than existing benchmark instances. Extensive experiments demonstrate the significant performance advantages of HMA over existing algorithms across a wide range of problem instances. Both the benchmark set and HMA will be open-sourced to facilitate further research in this area.

Via

Access Paper or Ask Questions