Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jiawei Li

Mullet

Entropy-Guided Reasoning Compression

Nov 18, 2025

Hourun Zhu, Yang Gao, Wenlong Fei, Jiawei Li, Huashan Sun

Figure 1 for Entropy-Guided Reasoning Compression

Figure 2 for Entropy-Guided Reasoning Compression

Figure 3 for Entropy-Guided Reasoning Compression

Figure 4 for Entropy-Guided Reasoning Compression

Abstract:Large reasoning models have demonstrated remarkable performance on complex reasoning tasks, yet the excessive length of their chain-of-thought outputs remains a major practical bottleneck due to high computation cost and poor deployability. Existing compression methods have achieved partial success but overlook a crucial phenomenon in the training process -- the entropy conflict. During compression training, entropy decreases, leading to shorter reasoning but limited exploration, while accuracy-oriented objectives increase entropy, lengthening reasoning chains. This can cause the model to get stuck in a local dilemma. Our analysis further reveals the origin of the entropy conflict: many high-entropy tokens are logical connectors that receive larger gradients and are encouraged under the performance objective, while the compression objective simultaneously penalizes these potentially redundant connectors. This opposing pressure creates a direct source of entropy conflict. To address these issues, we adopt an entropy-guided training framework. As entropy descends, the model is guided toward efficient reasoning by encouraging concise thought steps; as entropy rises, exploration is reinforced under the compact reasoning mode to improve robustness. Experiments on six mathematical benchmarks show that our method compresses reasoning length to 20% of the original while maintaining or even surpassing baseline accuracy. Code and models will be released publicly.

* 10pages, 4 figures

Via

Access Paper or Ask Questions

MonoPartNeRF:Human Reconstruction from Monocular Video via Part-Based Neural Radiance Fields

Aug 12, 2025

Yao Lu, Jiawei Li, Ming Jiang

Abstract:In recent years, Neural Radiance Fields (NeRF) have achieved remarkable progress in dynamic human reconstruction and rendering. Part-based rendering paradigms, guided by human segmentation, allow for flexible parameter allocation based on structural complexity, thereby enhancing representational efficiency. However, existing methods still struggle with complex pose variations, often producing unnatural transitions at part boundaries and failing to reconstruct occluded regions accurately in monocular settings. We propose MonoPartNeRF, a novel framework for monocular dynamic human rendering that ensures smooth transitions and robust occlusion recovery. First, we build a bidirectional deformation model that combines rigid and non-rigid transformations to establish a continuous, reversible mapping between observation and canonical spaces. Sampling points are projected into a parameterized surface-time space (u, v, t) to better capture non-rigid motion. A consistency loss further suppresses deformation-induced artifacts and discontinuities. We introduce a part-based pose embedding mechanism that decomposes global pose vectors into local joint embeddings based on body regions. This is combined with keyframe pose retrieval and interpolation, along three orthogonal directions, to guide pose-aware feature sampling. A learnable appearance code is integrated via attention to model dynamic texture changes effectively. Experiments on the ZJU-MoCap and MonoCap datasets demonstrate that our method significantly outperforms prior approaches under complex pose and occlusion conditions, achieving superior joint alignment, texture fidelity, and structural continuity.

Via

Access Paper or Ask Questions

T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models

May 05, 2025

Yunfeng Ge, Jiawei Li, Yiji Zhao, Haomin Wen, Zhao Li, Meikang Qiu, Hongyan Li, Ming Jin, Shirui Pan

Abstract:Text-to-Time Series generation holds significant potential to address challenges such as data sparsity, imbalance, and limited availability of multimodal time series datasets across domains. While diffusion models have achieved remarkable success in Text-to-X (e.g., vision and audio data) generation, their use in time series generation remains in its nascent stages. Existing approaches face two critical limitations: (1) the lack of systematic exploration of general-proposed time series captions, which are often domain-specific and struggle with generalization; and (2) the inability to generate time series of arbitrary lengths, limiting their applicability to real-world scenarios. In this work, we first categorize time series captions into three levels: point-level, fragment-level, and instance-level. Additionally, we introduce a new fragment-level dataset containing over 600,000 high-resolution time series-text pairs. Second, we propose Text-to-Series (T2S), a diffusion-based framework that bridges the gap between natural language and time series in a domain-agnostic manner. T2S employs a length-adaptive variational autoencoder to encode time series of varying lengths into consistent latent embeddings. On top of that, T2S effectively aligns textual representations with latent embeddings by utilizing Flow Matching and employing Diffusion Transformer as the denoiser. We train T2S in an interleaved paradigm across multiple lengths, allowing it to generate sequences of any desired length. Extensive evaluations demonstrate that T2S achieves state-of-the-art performance across 13 datasets spanning 12 domains.

* Accepted by the 34th International Joint Conference on Artificial Intelligence (IJCAI 2025)

Via

Access Paper or Ask Questions

GaussTrap: Stealthy Poisoning Attacks on 3D Gaussian Splatting for Targeted Scene Confusion

Apr 29, 2025

Jiaxin Hong, Sixu Chen, Shuoyang Sun, Hongyao Yu, Hao Fang, Yuqi Tan, Bin Chen, Shuhan Qi, Jiawei Li

Abstract:As 3D Gaussian Splatting (3DGS) emerges as a breakthrough in scene representation and novel view synthesis, its rapid adoption in safety-critical domains (e.g., autonomous systems, AR/VR) urgently demands scrutiny of potential security vulnerabilities. This paper presents the first systematic study of backdoor threats in 3DGS pipelines. We identify that adversaries may implant backdoor views to induce malicious scene confusion during inference, potentially leading to environmental misperception in autonomous navigation or spatial distortion in immersive environments. To uncover this risk, we propose GuassTrap, a novel poisoning attack method targeting 3DGS models. GuassTrap injects malicious views at specific attack viewpoints while preserving high-quality rendering in non-target views, ensuring minimal detectability and maximizing potential harm. Specifically, the proposed method consists of a three-stage pipeline (attack, stabilization, and normal training) to implant stealthy, viewpoint-consistent poisoned renderings in 3DGS, jointly optimizing attack efficacy and perceptual realism to expose security risks in 3D rendering. Extensive experiments on both synthetic and real-world datasets demonstrate that GuassTrap can effectively embed imperceptible yet harmful backdoor views while maintaining high-quality rendering in normal views, validating its robustness, adaptability, and practical applicability.

Via

Access Paper or Ask Questions

Detecting Instruction Fine-tuning Attack on Language Models with Influence Function

Apr 12, 2025

Jiawei Li

Figure 1 for Detecting Instruction Fine-tuning Attack on Language Models with Influence Function

Figure 2 for Detecting Instruction Fine-tuning Attack on Language Models with Influence Function

Figure 3 for Detecting Instruction Fine-tuning Attack on Language Models with Influence Function

Figure 4 for Detecting Instruction Fine-tuning Attack on Language Models with Influence Function

Abstract:Instruction fine-tuning attacks pose a significant threat to large language models (LLMs) by subtly embedding poisoned data in fine-tuning datasets, which can trigger harmful or unintended responses across a range of tasks. This undermines model alignment and poses security risks in real-world deployment. In this work, we present a simple and effective approach to detect and mitigate such attacks using influence functions, a classical statistical tool adapted for machine learning interpretation. Traditionally, the high computational costs of influence functions have limited their application to large models and datasets. The recent Eigenvalue-Corrected Kronecker-Factored Approximate Curvature (EK-FAC) approximation method enables efficient influence score computation, making it feasible for large-scale analysis. We are the first to apply influence functions for detecting language model instruction fine-tuning attacks on large-scale datasets, as both the instruction fine-tuning attack on language models and the influence calculation approximation technique are relatively new. Our large-scale empirical evaluation of influence functions on 50,000 fine-tuning examples and 32 tasks reveals a strong association between influence scores and sentiment. Building on this, we introduce a novel sentiment transformation combined with influence functions to detect and remove critical poisons -- poisoned data points that skew model predictions. Removing these poisons (only 1% of total data) recovers model performance to near-clean levels, demonstrating the effectiveness and efficiency of our approach. Artifact is available at https://github.com/lijiawei20161002/Poison-Detection. WARNING: This paper contains offensive data examples.

Via

Access Paper or Ask Questions

An Enhanced Iterative Deepening Search Algorithm for the Unrestricted Container Rehandling Problem

Apr 12, 2025

Ruoqi Wang, Jiawei Li

Figure 1 for An Enhanced Iterative Deepening Search Algorithm for the Unrestricted Container Rehandling Problem

Figure 2 for An Enhanced Iterative Deepening Search Algorithm for the Unrestricted Container Rehandling Problem

Figure 3 for An Enhanced Iterative Deepening Search Algorithm for the Unrestricted Container Rehandling Problem

Figure 4 for An Enhanced Iterative Deepening Search Algorithm for the Unrestricted Container Rehandling Problem

Abstract:In container terminal yards, the Container Rehandling Problem (CRP) involves rearranging containers between stacks under specific operational rules, and it is a pivotal optimization challenge in intelligent container scheduling systems. Existing CRP studies primarily focus on minimizing reallocation costs using two-dimensional bay structures, considering factors such as container size, weight, arrival sequences, and retrieval priorities. This paper introduces an enhanced deepening search algorithm integrated with improved lower bounds to boost search efficiency. To further reduce the search space, we design mutually consistent pruning rules to avoid excessive computational overhead. The proposed algorithm is validated on three widely used benchmark datasets for the Unrestricted Container Rehandling Problem (UCRP). Experimental results demonstrate that our approach outperforms state-of-the-art exact algorithms in solving the more general UCRP variant, particularly exhibiting superior efficiency when handling containers within the same priority group under strict time constraints.

* 10 pages

Via

Access Paper or Ask Questions

A2RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Dec 18, 2024

Jiawei Li, Hongwei Yu, Jiansheng Chen, Xinlong Ding, Jinlong Wang, Jinyuan Liu, Bochao Zou, Huimin Ma

Figure 1 for A2RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Figure 2 for A2RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Figure 3 for A2RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Figure 4 for A2RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Abstract:Infrared and visible image fusion (IVIF) is a crucial technique for enhancing visual performance by integrating unique information from different modalities into one fused image. Exiting methods pay more attention to conducting fusion with undisturbed data, while overlooking the impact of deliberate interference on the effectiveness of fusion results. To investigate the robustness of fusion models, in this paper, we propose a novel adversarial attack resilient network, called $\textrm{A}^{\textrm{2}}$RNet. Specifically, we develop an adversarial paradigm with an anti-attack loss function to implement adversarial attacks and training. It is constructed based on the intrinsic nature of IVIF and provide a robust foundation for future research advancements. We adopt a Unet as the pipeline with a transformer-based defensive refinement module (DRM) under this paradigm, which guarantees fused image quality in a robust coarse-to-fine manner. Compared to previous works, our method mitigates the adverse effects of adversarial perturbations, consistently maintaining high-fidelity fusion results. Furthermore, the performance of downstream tasks can also be well maintained under adversarial attacks. Code is available at https://github.com/lok-18/A2RNet.

* 9 pages, 8 figures, The 39th Annual AAAI Conference on Artificial Intelligence

Via

Access Paper or Ask Questions

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Dec 16, 2024

Guoxuan Chen, Han Shi, Jiawei Li, Yihang Gao, Xiaozhe Ren, Yimeng Chen, Xin Jiang, Zhenguo Li, Weiyang Liu, Chao Huang

Figure 1 for SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Figure 2 for SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Figure 3 for SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Figure 4 for SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Abstract:Large Language Models (LLMs) have exhibited exceptional performance across a spectrum of natural language processing tasks. However, their substantial sizes pose considerable challenges, particularly in computational demands and inference speed, due to their quadratic complexity. In this work, we have identified a key pattern: certain seemingly meaningless special tokens (i.e., separators) contribute disproportionately to attention scores compared to semantically meaningful tokens. This observation suggests that information of the segments between these separator tokens can be effectively condensed into the separator tokens themselves without significant information loss. Guided by this insight, we introduce SepLLM, a plug-and-play framework that accelerates inference by compressing these segments and eliminating redundant tokens. Additionally, we implement efficient kernels for training acceleration. Experimental results across training-free, training-from-scratch, and post-training settings demonstrate SepLLM's effectiveness. Notably, using the Llama-3-8B backbone, SepLLM achieves over 50% reduction in KV cache on the GSM8K-CoT benchmark while maintaining comparable performance. Furthermore, in streaming settings, SepLLM effectively processes sequences of up to 4 million tokens or more while maintaining consistent language modeling capabilities.

Via

Access Paper or Ask Questions

$\textrm{A}^{\textrm{2}}$RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Dec 13, 2024

Jiawei Li, Hongwei Yu, Jiansheng Chen, Xinlong Ding, Jinlong Wang, Jinyuan Liu, Bochao Zou, Huimin Ma

$Figure 1 for $\textrm{A}^{\textrm{2}}$RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion$

$Figure 2 for $\textrm{A}^{\textrm{2}}$RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion$

$Figure 3 for $\textrm{A}^{\textrm{2}}$RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion$

$Figure 4 for $\textrm{A}^{\textrm{2}}$RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion$

* 9 pages, 8 figures, The 39th Annual AAAI Conference on Artificial Intelligence

Via

Access Paper or Ask Questions

Delta-Influence: Unlearning Poisons via Influence Functions

Nov 20, 2024

Wenjie Li, Jiawei Li, Christian Schroeder de Witt, Ameya Prabhu, Amartya Sanyal

Figure 1 for Delta-Influence: Unlearning Poisons via Influence Functions

Figure 2 for Delta-Influence: Unlearning Poisons via Influence Functions

Figure 3 for Delta-Influence: Unlearning Poisons via Influence Functions

Figure 4 for Delta-Influence: Unlearning Poisons via Influence Functions

Abstract:Addressing data integrity challenges, such as unlearning the effects of data poisoning after model training, is necessary for the reliable deployment of machine learning models. State-of-the-art influence functions, such as EK-FAC, often fail to accurately attribute abnormal model behavior to the specific poisoned training data responsible for the data poisoning attack. In addition, traditional unlearning algorithms often struggle to effectively remove the influence of poisoned samples, particularly when only a few affected examples can be identified. To address these challenge, we introduce $\Delta$-Influence, a novel approach that leverages influence functions to trace abnormal model behavior back to the responsible poisoned training data using as little as just one poisoned test example. $\Delta$-Influence applies data transformations that sever the link between poisoned training data and compromised test points without significantly affecting clean data. This allows $\Delta$-Influence to detect large negative shifts in influence scores following data transformations, a phenomenon we term as influence collapse, thereby accurately identifying poisoned training data. Unlearning this subset, e.g. through retraining, effectively eliminates the data poisoning. We validate our method across three vision-based poisoning attacks and three datasets, benchmarking against four detection algorithms and five unlearning strategies. We show that $\Delta$-Influence consistently achieves the best unlearning across all settings, showing the promise of influence functions for corrective unlearning. Our code is publicly available at: \url{https://github.com/andyisokay/delta-influence}

* Accepted at NeurIPS Workshop on Attributing Model Behavior at Scale (ATTRIB @ NeurIPS 2024)

Via

Access Paper or Ask Questions