Yanyan Zhao

Psychological Counseling Cannot Be Achieved Overnight: Automated Psychological Counseling Through Multi-Session Conversations

Jun 07, 2025

How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation

May 24, 2025

MPO: Multilingual Safety Alignment via Reward Gap Optimization

May 22, 2025

When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners

May 21, 2025

Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment

May 21, 2025

AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender

Apr 13, 2025

Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter

Mar 07, 2025

Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs

Feb 28, 2025

Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits

Dec 17, 2024

Separate the Wheat from the Chaff: A Post-Hoc Approach to Safety Re-Alignment for Fine-Tuned Language Models

Dec 15, 2024