Wenbo Su

Logics-STEM: Empowering LLM Reasoning via Failure-Driven Post-Training and Document Knowledge Enhancement

Jan 08, 2026
Via arXiv

One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

Jan 06, 2026
Via arXiv

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Dec 31, 2025
Via arXiv

RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure

Dec 27, 2025
Via arXiv

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

Dec 19, 2025
Via arXiv

RecGPT-V2 Technical Report

Dec 16, 2025
Via arXiv

MeSH: Memory-as-State-Highways for Recursive Transformers

Oct 09, 2025
Via arXiv

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

Oct 02, 2025
Via arXiv

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Aug 11, 2025
Via arXiv

RecGPT Technical Report

Jul 30, 2025
Via arXiv