Yuanshun Yao

Toward Optimal LLM Alignments Using Two-Player Games

Jun 16, 2024

Label Smoothing Improves Machine Unlearning

Jun 11, 2024

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

Mar 14, 2024

Learning to Watermark LLM-generated Text via Reinforcement Learning

Mar 13, 2024

Fair Classifiers Without Fair Training: An Influence-Guided Data Sampling Approach

Feb 20, 2024

Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting

Feb 16, 2024

Rethinking Machine Unlearning for Large Language Models

Feb 15, 2024

Human-Instruction-Free LLM Self-Alignment with Limited Samples

Jan 06, 2024

Large Language Model Unlearning

Oct 14, 2023

Fair Classifiers that Abstain without Harm

Oct 09, 2023