Chutong Meng

RepCodec: A Speech Representation Codec for Speech Tokenization

Aug 31, 2023

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

Mar 30, 2023

CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning

Oct 08, 2022

GigaST: A 10,000-hour Pseudo Speech Translation Corpus

Apr 08, 2022