Yiming Wang

Semantic-Guided Unsupervised Video Summarization

Jan 21, 2026

Spherical Geometry Diffusion: Generating High-quality 3D Face Geometry via Sphere-anchored Representations

Jan 19, 2026

Unifying Speech Recognition, Synthesis and Conversion with Autoregressive Transformers

Jan 15, 2026

PALM-Bench: A Comprehensive Benchmark for Personalized Audio-Language Models

Jan 07, 2026

PhysSFI-Net: Physics-informed Geometric Learning of Skeletal and Facial Interactions for Orthognathic Surgical Outcome Prediction

Jan 06, 2026

RGMP: Recurrent Geometric-prior Multimodal Policy for Generalizable Humanoid Robot Manipulation

Nov 12, 2025

LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

Oct 16, 2025

Direct Simultaneous Translation Activation for Large Audio-Language Models

Sep 19, 2025

Enhancing Retrieval Augmentation via Adversarial Collaboration

Sep 18, 2025

MemEvo: Memory-Evolving Incremental Multi-view Clustering

Sep 18, 2025