Lidong Bing

DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation

Jan 14, 2026
Via arXiv

EverMemOS: A Self-Organizing Memory Operating System for Structured Long-Horizon Reasoning

Jan 05, 2026
Via arXiv

On the Role of Discreteness in Diffusion LLMs

Dec 27, 2025
Via arXiv

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Nov 18, 2025
Via arXiv

Multi-Agent Tool-Integrated Policy Optimization

Oct 06, 2025
Via arXiv

MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search

May 25, 2025
Via arXiv

MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback

May 23, 2025
Via arXiv

100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models

May 01, 2025
Via arXiv

Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations

Apr 18, 2025
Via arXiv

FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving

Feb 27, 2025
Via arXiv