Yejin Choi

ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge

Oct 21, 2025
Via arXiv

Opt-ICL at LeWiDi-2025: Maximizing In-Context Signal from Rater Examples via Meta-Learning

Oct 08, 2025
Via arXiv

From Supervision to Exploration: What Does Protein Language Model Learn During Reinforcement Learning?

Oct 02, 2025
Via arXiv

Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs

Sep 26, 2025
Via arXiv

Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation

Sep 09, 2025
Via arXiv

UQ: Assessing Language Models on Unsolved Questions

Aug 25, 2025
Via arXiv

Zero-shot Multimodal Document Retrieval via Cross-modal Question Generation

Aug 23, 2025
Via arXiv

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025
Via arXiv

Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations

Jun 23, 2025
Via arXiv

Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers

Jun 16, 2025
Via arXiv