Mengnan Du

TAROT: Task-Adaptive Refinement of LLM-prior Graphs for Few-shot Tabular Learning

Jun 10, 2026
Via arXiv

SAIGuard: Communication-State Simulation for Proactive Defense of LLM Multi-Agent Systems

Jun 10, 2026
Via arXiv

DynaCF: Mitigating Shortcut Learning in Reward Models via Dynamic Counterfactual Sensitivity

Jun 08, 2026
Via arXiv

SAEExplainer: Interpreting SAE Features with Activation-Guided Preference Optimization

Jun 07, 2026
Via arXiv

RASFT: Rollout-Adaptive Supervised Fine-Tuning for Reasoning

Jun 05, 2026
Via arXiv

HARVE: Hacking-Aware Reward-Head Vector Editing for Robust Reward Models

Jun 02, 2026
Via arXiv

Law of Neural Interaction: Depth-Width Shape, Interaction Efficiency, and Generalization

May 27, 2026
Via arXiv

Universal Activation Verbalizer: A Unified Framework for Cross-Model Activation Explanation

May 25, 2026
Via arXiv

Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs

Mar 03, 2026
Via arXiv

FinAnchor: Aligned Multi-Model Representations for Financial Prediction

Feb 24, 2026
Via arXiv