Chi Jin

Frontier LLMs Still Struggle with Simple Reasoning Tasks

Jul 09, 2025

Learning World Models for Interactive Video Generation

May 28, 2025

Principled Out-of-Distribution Generalization via Simplicity

May 28, 2025

Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities

May 19, 2025

PokéChamp: an Expert-level Minimax Language Agent

Mar 06, 2025

Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving

Feb 11, 2025

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations

Feb 10, 2025

Generative Diffusion Modeling: A Practical Handbook

Dec 22, 2024

DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization

Dec 20, 2024

Benign Overfitting in Out-of-Distribution Generalization of Linear Models

Dec 19, 2024