Zeyu Xie

Covo-Audio Technical Report

Feb 10, 2026
Via arXiv

Overview of the Amphion Toolkit (v0.2)

Jan 26, 2025
Via arXiv

DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation

Jul 18, 2024
Via arXiv

AudioTime: A Temporally-aligned Audio-text Benchmark Dataset

Jul 03, 2024
Via arXiv

PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

Jul 03, 2024
Via arXiv

FakeSound: Deepfake General Audio Detection

Jun 12, 2024
Via arXiv

A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds

Mar 07, 2024
Via arXiv

Enhancing Audio Generation Diversity with Visual Information

Mar 02, 2024
Via arXiv

Phonetic and Lexical Discovery of a Canine Language using HuBERT

Feb 25, 2024
Via arXiv

Improving Audio Caption Fluency with Automatic Error Correction

Jun 16, 2023
Via arXiv