Zhijie Yan

CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens

Jul 09, 2024

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Mar 28, 2024

Large Language Models Powered Context-aware Motion Prediction

Mar 17, 2024

Advancing VAD Systems Based on Multi-Task Learning with Improved Model Structures

Dec 19, 2023

Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models

Nov 14, 2023

LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

Oct 11, 2023

The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR

Sep 24, 2023

Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System

May 25, 2023

MUG: A General Meeting Understanding and Generation Benchmark

Mar 27, 2023

Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge

Mar 24, 2023