
Yankai Lin

Less Noise, More Voice: Reinforcement Learning for Reasoning via Instruction Purification

Jan 29, 2026
Via arXiv

DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution

Jan 21, 2026
Via arXiv

AtomMem: Learnable Dynamic Agentic Memory with Atomic Memory Operation

Jan 13, 2026
Via arXiv

Forest Before Trees: Latent Superposition for Efficient Visual Reasoning

Jan 11, 2026
Via arXiv

MiniCPM4: Ultra-Efficient LLMs on End Devices

Jun 09, 2025
Via arXiv

Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning

Jun 09, 2025
Via arXiv

LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models

May 25, 2025
Via arXiv

ToLeaP: Rethinking Development of Tool Learning with Large Language Models

May 17, 2025
Via arXiv

DeepCritic: Deliberate Critique with Large Language Models

May 01, 2025
Via arXiv

Learning to Generate Structured Output with Schema Reinforcement Learning

Feb 26, 2025
Via arXiv