Weixun Wang

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Dec 31, 2025

RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure

Dec 27, 2025

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

Dec 19, 2025

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

Oct 02, 2025

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Aug 11, 2025

Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library

Jun 06, 2025

Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models

May 26, 2025

USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models

May 26, 2025

Think-J: Learning to Think for Generative LLM-as-a-Judge

May 20, 2025

Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation

Mar 20, 2025