Haitao Mi

HunyuanProver: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving

Dec 30, 2024
Via arXiv

Teaching LLMs to Refine with Tools

Dec 22, 2024
Via arXiv

Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens

Nov 26, 2024
Via arXiv

Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

Oct 09, 2024
Via arXiv

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

Oct 04, 2024
Via arXiv

HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows

Sep 25, 2024
Via arXiv

SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models

Aug 28, 2024
Via arXiv

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Jun 30, 2024
Via arXiv

LiteSearch: Efficacious Tree Search for LLM

Jun 29, 2024
Via arXiv

Scaling Synthetic Data Creation with 1,000,000,000 Personas

Jun 28, 2024
Via arXiv