Bing Qin

MPO: Multilingual Safety Alignment via Reward Gap Optimization

May 22, 2025
Via arXiv

When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners

May 21, 2025
Via arXiv

Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment

May 21, 2025
Via arXiv

Investigating and Enhancing the Robustness of Large Multimodal Models Against Temporal Inconsistency

May 20, 2025
Via arXiv

Beyond Frameworks: Unpacking Collaboration Strategies in Multi-Agent Systems

May 18, 2025
Via arXiv

UFO-RL: Uncertainty-Focused Optimization for Efficient Reinforcement Learning Data Selection

May 18, 2025
Via arXiv

Simulating Before Planning: Constructing Intrinsic User World Model for User-Tailored Dialogue Policy Planning

Apr 18, 2025
Via arXiv

Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models

Apr 17, 2025
Via arXiv

AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender

Apr 13, 2025
Via arXiv

Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter

Mar 07, 2025
Via arXiv