Yifu Chen

Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness

Mar 16, 2026
Via arXiv

WavBench: Benchmarking Reasoning, Colloquialism, and Paralinguistics for End-to-End Spoken Dialogue Models

Feb 13, 2026
Via arXiv

WavReward: Spoken Dialogue Models With Generalist Reward Evaluators

May 14, 2025
Via arXiv

WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models

Feb 20, 2025
Via arXiv

Speech Watermarking with Discrete Intermediate Representations

Dec 18, 2024
Via arXiv

WavChat: A Survey of Spoken Dialogue Models

Nov 26, 2024
Via arXiv

Improving Text-guided Object Inpainting with Semantic Pre-inpainting

Sep 12, 2024
Via arXiv

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

Aug 29, 2024
Via arXiv

OS-FPI: A Coarse-to-Fine One-Stream Network for UAV Geo-Localization

Mar 10, 2024
Via arXiv

Skywork: A More Open Bilingual Foundation Model

Oct 30, 2023
Via arXiv