Yiwen Tang

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Dec 11, 2025

REVISION: Reflective Intent Mining and Online Reasoning Auxiliary for E-commerce Visual Search System Optimization

Oct 26, 2025

Hume: Introducing System-2 Thinking in Visual-Language-Action Model

May 29, 2025

EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models

May 28, 2025

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

May 19, 2025

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Apr 10, 2025

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

Feb 25, 2025

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Feb 13, 2025

FreeGaussian: Guidance-free Controllable 3D Gaussian Splats with Flow Derivatives

Oct 29, 2024

Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding

Apr 11, 2024