Kuikun Liu

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

Jul 22, 2025
Via arXiv

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Jul 17, 2025
Via arXiv

RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy

Mar 31, 2025
Via arXiv

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Feb 10, 2025
Via arXiv

Are Your LLMs Capable of Stable Reasoning?

Dec 17, 2024
Via arXiv

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

Jul 29, 2024
Via arXiv

CIBench: Evaluating Your LLMs with a Code Interpreter Plugin

Jul 15, 2024
Via arXiv

AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data

May 29, 2024
Via arXiv

InternLM2 Technical Report

Mar 26, 2024
Via arXiv

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

Mar 19, 2024
Via arXiv