Picture for Danlei Huang

Danlei Huang

AD-AVSR: Asymmetric Dual-stream Enhancement for Robust Audio-Visual Speech Recognition

Add code
Aug 11, 2025
Viaarxiv icon

eMotions: A Large-Scale Dataset and Audio-Visual Fusion Network for Emotion Analysis in Short-form Videos

Add code
Aug 09, 2025
Viaarxiv icon

HOLA: Enhancing Audio-visual Deepfake Detection via Hierarchical Contextual Aggregations and Efficient Pre-training

Add code
Jul 30, 2025
Viaarxiv icon

HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs

Add code
Jun 16, 2025
Viaarxiv icon

ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations

Add code
May 20, 2025
Viaarxiv icon

3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations

Add code
Dec 10, 2024
Viaarxiv icon