Zhi-Qi Cheng

Cell Behavior Video Classification Challenge, a benchmark for computer vision methods in time-lapse microscopy

Jan 15, 2026

Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding

Jan 09, 2026

HCMA: Hierarchical Cross-model Alignment for Grounded Text-to-Image Generation

May 15, 2025

Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions

Apr 16, 2025

Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models

Apr 10, 2025

HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard

Mar 18, 2025

MaxSup: Overcoming Representation Collapse in Label Smoothing

Feb 18, 2025

A Video-grounded Dialogue Dataset and Metric for Event-driven Activities

Jan 30, 2025

UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval

Dec 14, 2024

StableAnimator: High-Quality Identity-Preserving Human Image Animation

Nov 26, 2024