Wenjun Li

HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel

Mar 05, 2026
Via arXiv

SRR-Judge: Step-Level Rating and Refinement for Enhancing Search-Integrated Reasoning in Search Agents

Feb 08, 2026
Via arXiv

VLA-AN: An Efficient and Onboard Vision-Language-Action Framework for Aerial Navigation in Complex Environments

Dec 19, 2025
Via arXiv

Doc-Researcher: A Unified System for Multimodal Document Parsing and Deep Research

Oct 24, 2025
Via arXiv

Understanding R1-Zero-Like Training: A Critical Perspective

Mar 26, 2025
Via arXiv

Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Mar 12, 2025
Via arXiv

Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger

Feb 18, 2025
Via arXiv

Improving Environment Novelty Quantification for Effective Unsupervised Environment Design

Feb 08, 2025
Via arXiv

A Survey of Foundation Models for Music Understanding

Sep 15, 2024
Via arXiv

A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks

Aug 02, 2024
Via arXiv