Huawei Shen

CAS Key Laboratory of AI Safety, Institute of Computing Technology, CAS, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

GIFT: Games as Informal Training for Generalizable LLMs

Jan 09, 2026
Via arXiv

Learning from Mistakes: Negative Reasoning Samples Enhance Out-of-Domain Generalization

Jan 08, 2026
Via arXiv

Fine-tuning Done Right in Model Editing

Sep 26, 2025
Via arXiv

Stop Spinning Wheels: Mitigating LLM Overthinking via Mining Patterns for Early Reasoning Exit

Aug 25, 2025
Via arXiv

LLM4MEA: Data-free Model Extraction Attacks on Sequential Recommenders via Large Language Models

Jul 22, 2025
Via arXiv

From Outcomes to Processes: Guiding PRM Learning from ORM for Inference-Time Alignment

Jun 14, 2025
Via arXiv

Inference-time Alignment in Continuous Space

May 26, 2025
Via arXiv

Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs

May 23, 2025
Via arXiv

Distilling the Implicit Multi-Branch Structure in LLMs' Reasoning via Reinforcement Learning

May 22, 2025
Via arXiv

InfoNCE is a Free Lunch for Semantically guided Graph Contrastive Learning

May 07, 2025
Via arXiv