Sarah Zhang

Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

Feb 27, 2026
Via arXiv

"Meet My Sidekick!": Effects of Separate Identities and Control of a Single Robot in HRI

Feb 07, 2026
Via arXiv

WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

Nov 17, 2025
Via arXiv

Stabilizing Reinforcement Learning for Honesty Alignment in Language Models on Deductive Reasoning

Nov 12, 2025
Via arXiv

A Reasoning-Focused Legal Retrieval Benchmark

May 06, 2025
Via arXiv

A Dataset and Benchmark for Automatically Answering and Generating Machine Learning Final Exams

Jun 11, 2022
Via arXiv