"speech": models, code, and papers

Data-Efficient French Language Modeling with CamemBERTa

Jun 02, 2023
Wissam Antoun, Benoît Sagot, Djamé Seddah

Via arXiv

NLPositionality: Characterizing Design Biases of Datasets and Models

Jun 02, 2023
Sebastin Santy, Jenny T. Liang, Ronan Le Bras, Katharina Reinecke, Maarten Sap

Via arXiv

Efficient Speech Translation with Dynamic Latent Perceivers

Oct 28, 2022
Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussá

Via arXiv

Efficient Speech Translation with Pre-trained Models

Nov 09, 2022
Zhaolin Li, Jan Niehues

Via arXiv

Imitator: Personalized Speech-driven 3D Facial Animation

Dec 30, 2022
Balamurugan Thambiraja, Ikhsanul Habibie, Sadegh Aliakbarian, Darren Cosker, Christian Theobalt, Justus Thies

Via arXiv

Bridging Speech and Textual Pre-trained Models with Unsupervised ASR

Nov 06, 2022
Jiatong Shi, Chan-Jan Hsu, Holam Chung, Dongji Gao, Paola Garcia, Shinji Watanabe, Ann Lee, Hung-yi Lee

Via arXiv

A Whisper transformer for audio captioning trained with synthetic captions and transfer learning

May 15, 2023
Marek Kadlčík, Adam Hájek, Jürgen Kieslich, Radosław Winiecki

Via arXiv

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding

May 23, 2023
Tian-Hao Zhang, Hai-Bo Qin, Zhi-Hao Lai, Song-Lu Chen, Qi Liu, Feng Chen, Xinyuan Qian, Xu-Cheng Yin

Via arXiv

Code-Switching Text Generation and Injection in Mandarin-English ASR

Mar 20, 2023
Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng

Via arXiv

Pragmatically Appropriate Diversity for Dialogue Evaluation

Apr 06, 2023
Katherine Stasaski, Marti A. Hearst

Via arXiv