Rui Qian

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Sep 19, 2025
Via arXiv

STARC: See-Through-Wall Augmented Reality Framework for Human-Robot Collaboration in Emergency Response

Sep 19, 2025
Via arXiv

Energy-Constrained Navigation for Planetary Rovers under Hybrid RTG-Solar Power

Sep 18, 2025
Via arXiv

CogStream: Context-guided Streaming Video Question Answering

Jun 12, 2025
Via arXiv

Seed1.5-VL Technical Report

May 11, 2025
Via arXiv

FA-BARF: Frequency Adapted Bundle-Adjusting Neural Radiance Fields

Mar 15, 2025
Via arXiv

DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation

Mar 13, 2025
Via arXiv

DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning

Mar 09, 2025
Via arXiv

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Jan 09, 2025
Via arXiv

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

Jan 06, 2025
Via arXiv