Yuxuan Wang

Sherman

Probing and Inducing Combinational Creativity in Vision-Language Models

Apr 17, 2025
Via arXiv

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Apr 11, 2025
Via arXiv

OrchMLLM: Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training

Mar 31, 2025
Via arXiv

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Mar 29, 2025
Via arXiv

QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions

Mar 26, 2025
Via arXiv

Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context

Mar 19, 2025
Via arXiv

A Parallel Hybrid Action Space Reinforcement Learning Model for Real-world Adaptive Traffic Signal Control

Mar 18, 2025
Via arXiv

PBR3DGen: A VLM-guided Mesh Generation with High-quality PBR Texture

Mar 14, 2025
Via arXiv

NsBM-GAT: A Non-stationary Block Maximum and Graph Attention Framework for General Traffic Crash Risk Prediction

Mar 06, 2025
Via arXiv

From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Feb 26, 2025
Via arXiv