Xiaozhi Wang

Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders

May 26, 2026
Via arXiv

StoryAlign: Evaluating and Training Reward Models for Story Generation

May 06, 2026
Via arXiv

WildReward: Learning Reward Models from In-the-Wild Human Interactions

Feb 09, 2026
Via arXiv

On the Paradoxical Interference between Instruction-Following and Task Solving

Jan 29, 2026
Via arXiv

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

Jun 11, 2025
Via arXiv

AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

May 22, 2025
Via arXiv

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks

Apr 26, 2025
Via arXiv

Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs

Mar 03, 2025
Via arXiv

Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models

Feb 27, 2025
Via arXiv

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Feb 26, 2025
Via arXiv