
Ruochen Xu

Unifying Language Agent Algorithms with Graph-based Orchestration Engine for Reproducible Agent Research

May 30, 2025

VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model

Apr 10, 2025

Grasping by Spiraling: Reproducing Elephant Movements with Rigid-Soft Robot Synergy

Apr 02, 2025

The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding?

Feb 19, 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

Nov 25, 2024

OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding

Jul 06, 2024

Preserving Knowledge in Large Language Model: A Model-Agnostic Self-Decompression Approach

Jun 17, 2024

Rho-1: Not All Tokens Are What You Need

Apr 11, 2024

ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models

Mar 08, 2024

DyVal 2: Dynamic Evaluation of Large Language Models by Meta Probing Agents

Feb 21, 2024