
Yifan Yang

P/D-Device: Disaggregated Large Language Model between Cloud and Devices

Aug 12, 2025

WB-DH: Towards Whole Body Digital Human Bench for the Generation of Whole-body Talking Avatar Videos

Aug 12, 2025

Phi-Ground Tech Report: Advancing Perception in GUI Grounding

Jul 31, 2025

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Jul 17, 2025

MK-Pose: Category-Level Object Pose Estimation via Multimodal-Based Keypoint Learning

Jul 09, 2025

SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes

Jun 26, 2025

StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling

Jun 14, 2025

Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning

Jun 06, 2025

Knowledge-guided Contextual Gene Set Analysis Using Large Language Models

Jun 04, 2025

ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL

May 30, 2025