Shibo Hao

Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models

May 19, 2025
Via arXiv

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

May 18, 2025
Via arXiv

LLM Pretraining with Continuous Concepts

Feb 12, 2025
Via arXiv

Linear Correlation in LM's Compositional Generalization and Hallucination

Feb 06, 2025
Via arXiv

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Dec 20, 2024
Via arXiv

Training Large Language Models to Reason in a Continuous Latent Space

Dec 09, 2024
Via arXiv

Pandora: Towards General World Model with Natural Language Actions and Video States

Jun 12, 2024
Via arXiv

Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking

Jun 09, 2024
Via arXiv

LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models

Apr 08, 2024
Via arXiv

Reasoning with Language Model is Planning with World Model

May 24, 2023
Via arXiv