Yiming Wang

Breaking the Overscaling Curse: Thinking Parallelism Before Parallel Thinking

Jan 29, 2026

ExoGS: A 4D Real-to-Sim-to-Real Framework for Scalable Manipulation Data Collection

Jan 26, 2026

Semantic-Guided Unsupervised Video Summarization

Jan 21, 2026

Spherical Geometry Diffusion: Generating High-quality 3D Face Geometry via Sphere-anchored Representations

Jan 19, 2026

Unifying Speech Recognition, Synthesis and Conversion with Autoregressive Transformers

Jan 15, 2026

PALM-Bench: A Comprehensive Benchmark for Personalized Audio-Language Models

Jan 07, 2026

PhysSFI-Net: Physics-informed Geometric Learning of Skeletal and Facial Interactions for Orthognathic Surgical Outcome Prediction

Jan 06, 2026

RGMP: Recurrent Geometric-prior Multimodal Policy for Generalizable Humanoid Robot Manipulation

Nov 12, 2025

LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

Oct 16, 2025

Direct Simultaneous Translation Activation for Large Audio-Language Models

Sep 19, 2025