Yuankai Qi

Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification

Jan 29, 2026
Via arXiv

Visual Marker Search for Autonomous Drone Landing in Diverse Urban Environments

Jan 16, 2026
Via arXiv

Teaching Prompts to Coordinate: Hierarchical Layer-Grouped Prompt Tuning for Continual Learning

Nov 15, 2025
Via arXiv

Experiences from Benchmarking Vision-Language-Action Models for Robotic Manipulation

Nov 14, 2025
Via arXiv

Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding

May 21, 2025
Via arXiv

Learning to Reason and Navigate: Parameter Efficient Action Planning with Large Language Models

May 12, 2025
Via arXiv

FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing

May 02, 2025
Via arXiv

SDVPT: Semantic-Driven Visual Prompt Tuning for Open-World Object Counting

Apr 24, 2025
Via arXiv

ProgRoCC: A Progressive Approach to Rough Crowd Counting

Apr 18, 2025
Via arXiv

The Devil is in the Distributions: Explicit Modeling of Scene Content is Key in Zero-Shot Video Captioning

Mar 31, 2025
Via arXiv