Picture for Ilia Kulikov

Ilia Kulikov

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

Add code
Aug 18, 2025
Viaarxiv icon

Learning to Reason for Factuality

Add code
Aug 07, 2025
Viaarxiv icon

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks

Add code
Jul 31, 2025
Viaarxiv icon

NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks

Add code
Jul 02, 2025
Viaarxiv icon

Bridging Offline and Online Reinforcement Learning for LLMs

Add code
Jun 26, 2025
Viaarxiv icon

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Add code
May 15, 2025
Viaarxiv icon

NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions

Add code
Feb 18, 2025
Viaarxiv icon

LLM Pretraining with Continuous Concepts

Add code
Feb 12, 2025
Viaarxiv icon

Diverse Preference Optimization

Add code
Jan 31, 2025
Figure 1 for Diverse Preference Optimization
Figure 2 for Diverse Preference Optimization
Figure 3 for Diverse Preference Optimization
Figure 4 for Diverse Preference Optimization
Viaarxiv icon

Adaptive Decoding via Latent Preference Optimization

Add code
Nov 14, 2024
Figure 1 for Adaptive Decoding via Latent Preference Optimization
Figure 2 for Adaptive Decoding via Latent Preference Optimization
Figure 3 for Adaptive Decoding via Latent Preference Optimization
Figure 4 for Adaptive Decoding via Latent Preference Optimization
Viaarxiv icon