Yuping Wang

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

May 19, 2025
Via arXiv

Generative AI for Autonomous Driving: Frontiers and Opportunities

May 13, 2025
Via arXiv

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Apr 11, 2025
Via arXiv

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving

Mar 31, 2025
Via arXiv

Can Large Vision Language Models Read Maps Like a Human?

Mar 18, 2025
Via arXiv

FwNet-ECA: Facilitating Window Attention with Global Receptive Fields through Fourier Filtering Operations

Feb 25, 2025
Via arXiv

Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization

Feb 18, 2025
Via arXiv

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Feb 06, 2025
Via arXiv

A Privacy-Preserving Domain Adversarial Federated learning for multi-site brain functional connectivity analysis

Feb 03, 2025
Via arXiv

Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model

Jan 13, 2025
Via arXiv