Yulin Hu

MPO: Multilingual Safety Alignment via Reward Gap Optimization

May 22, 2025
Via arXiv

Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment

May 21, 2025
Via arXiv

When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners

May 21, 2025
Via arXiv

AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender

Apr 13, 2025
Via arXiv

Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter

Mar 07, 2025
Via arXiv

Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs

Feb 28, 2025
Via arXiv

Lens: Rethinking Multilingual Enhancement for Large Language Models

Oct 06, 2024
Via arXiv

Towards Comprehensive and Efficient Post Safety Alignment of Large Language Models via Safety Patching

May 22, 2024
Via arXiv

Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence

Feb 15, 2024
Via arXiv

DAPT: A Dual Attention Framework for Parameter-Efficient Continual Learning of Large Language Models

Jan 16, 2024
Via arXiv