Zhilin Wang

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Sep 18, 2025

Synthesizing Sheet Music Problems for Evaluation and Reinforcement Learning

Sep 04, 2025

Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas

May 20, 2025

HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages

May 16, 2025

Llama-Nemotron: Efficient Reasoning Models

May 02, 2025

SEE: Continual Fine-tuning with Sequential Ensemble of Experts

Apr 09, 2025

Adversarial Training of Reward Models

Apr 08, 2025

Lost in Literalism: How Supervised Training Shapes Translationese in LLMs

Mar 06, 2025

Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks

Mar 06, 2025

Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing

Feb 21, 2025