Yuanshun Yao

Robust Multi-bit Text Watermark with LLM-based Paraphrasers

Dec 04, 2024

ACC-Debate: An Actor-Critic Approach to Multi-Agent Debate

Nov 04, 2024

Toward Optimal LLM Alignments Using Two-Player Games

Jun 16, 2024

Label Smoothing Improves Machine Unlearning

Jun 11, 2024

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

Mar 14, 2024

Learning to Watermark LLM-generated Text via Reinforcement Learning

Mar 13, 2024

Fair Classifiers Without Fair Training: An Influence-Guided Data Sampling Approach

Feb 20, 2024

Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting

Feb 16, 2024

Rethinking Machine Unlearning for Large Language Models

Feb 15, 2024

Human-Instruction-Free LLM Self-Alignment with Limited Samples

Jan 06, 2024