Chenyu Yang

LeVo: High-Quality Song Generation with Multi-Preference Alignment

Jun 09, 2025
Via arXiv

SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement

Jun 09, 2025
Via arXiv

ZeroGUI: Automating Online GUI Learning at Zero Human Cost

May 29, 2025
Via arXiv

MAPLE: Encoding Dexterous Robotic Manipulation Priors Learned From Egocentric Videos

Apr 08, 2025
Via arXiv

ORCA: An Open-Source, Reliable, Cost-Effective, Anthropomorphic Robotic Hand for Uninterrupted Dexterous Task Learning

Apr 05, 2025
Via arXiv

SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor

Dec 18, 2024
Via arXiv

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

Dec 12, 2024
Via arXiv

VQ-ACE: Efficient Policy Search for Dexterous Robotic Manipulation via Action Chunking Embedding

Nov 05, 2024
Via arXiv

Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning

Jun 11, 2024
Via arXiv

CRAG -- Comprehensive RAG Benchmark

Jun 07, 2024
Via arXiv