Shibo Hao

K2-Think: A Parameter-Efficient Reasoning System

Sep 09, 2025
Via arXiv

Vision-G1: Towards General Vision Language Reasoning with Multi-Domain Data Curation

Aug 18, 2025
Via arXiv

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Jun 17, 2025
Via arXiv

Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models

May 19, 2025
Via arXiv

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

May 18, 2025
Via arXiv

LLM Pretraining with Continuous Concepts

Feb 12, 2025
Via arXiv

Linear Correlation in LM's Compositional Generalization and Hallucination

Feb 06, 2025
Via arXiv

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Dec 20, 2024
Via arXiv

Training Large Language Models to Reason in a Continuous Latent Space

Dec 09, 2024
Via arXiv

Pandora: Towards General World Model with Natural Language Actions and Video States

Jun 12, 2024
Via arXiv