Yichong Leng

Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model

Dec 23, 2025

Virtual Width Networks

Nov 17, 2025

EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse

May 29, 2025

Kimi-Audio Technical Report

Apr 25, 2025

MoonCast: High-Quality Zero-Shot Podcast Generation

Mar 19, 2025

The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation

Mar 06, 2025

Qwen2-Audio Technical Report

Jul 15, 2024

Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation

Apr 23, 2024

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Mar 05, 2024

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Feb 12, 2024