Zeye Sun

Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following

Oct 16, 2025
Via arXiv

Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation

Feb 27, 2025
Via arXiv

Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following

Feb 24, 2025
Via arXiv

Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models

Jan 09, 2025
Via arXiv

Minor DPO reject penalty to increase training robustness

Aug 22, 2024
Via arXiv

Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation

Aug 20, 2024
Via arXiv