Shibo Hao

K2-Think: A Parameter-Efficient Reasoning System

Sep 09, 2025

Vision-G1: Towards General Vision Language Reasoning with Multi-Domain Data Curation

Aug 18, 2025

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Jun 17, 2025

Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models

May 19, 2025

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

May 18, 2025

LLM Pretraining with Continuous Concepts

Feb 12, 2025

Linear Correlation in LM's Compositional Generalization and Hallucination

Feb 06, 2025

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Dec 20, 2024

Training Large Language Models to Reason in a Continuous Latent Space

Dec 09, 2024

Pandora: Towards General World Model with Natural Language Actions and Video States

Jun 12, 2024