Yangsibo Huang

Quantifying Cross-Modality Memorization in Vision-Language Models

Jun 05, 2025

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations

Feb 10, 2025

Scaling Embedding Layers in Language Models

Feb 03, 2025

Scaling Laws for Differentially Private Language Models

Jan 31, 2025

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

Jan 13, 2025

On Evaluating the Durability of Safeguards for Open-Weight LLMs

Dec 10, 2024

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Dec 09, 2024

On Memorization of Large Language Models in Logical Reasoning

Oct 30, 2024

An Adversarial Perspective on Machine Unlearning for AI Safety

Sep 26, 2024

ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty

Aug 26, 2024