Lei Cui

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

Mar 08, 2026
Via arXiv

UniM: A Unified Any-to-Any Interleaved Multimodal Benchmark

Mar 05, 2026
Via arXiv

Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking

Feb 24, 2026
Via arXiv

LongReasonArena: A Long Reasoning Benchmark for Large Language Models

Aug 26, 2025
Via arXiv

Geometric-Mean Policy Optimization

Jul 28, 2025
Via arXiv

WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models

Jun 12, 2025
Via arXiv

Think Only When You Need with Large Hybrid-Reasoning Models

May 21, 2025
Via arXiv

Model as a Game: On Numerical and Spatial Consistency for Generative Games

Mar 27, 2025
Via arXiv

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs

Jan 10, 2025
Via arXiv

MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark

Dec 19, 2024
Via arXiv