Xuecheng Wu

TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training

Aug 25, 2025
Via arXiv

A Trustworthy Method for Multimodal Emotion Recognition

Aug 11, 2025
Via arXiv

AD-AVSR: Asymmetric Dual-stream Enhancement for Robust Audio-Visual Speech Recognition

Aug 11, 2025
Via arXiv

eMotions: A Large-Scale Dataset and Audio-Visual Fusion Network for Emotion Analysis in Short-form Videos

Aug 09, 2025
Via arXiv

HOLA: Enhancing Audio-visual Deepfake Detection via Hierarchical Contextual Aggregations and Efficient Pre-training

Jul 30, 2025
Via arXiv

HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs

Jun 16, 2025
Via arXiv

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

May 22, 2025
Via arXiv

ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations

May 20, 2025
Via arXiv

TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs

Apr 10, 2025
Via arXiv

3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations

Dec 10, 2024
Via arXiv