"speech": models, code, and papers

DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors

Dec 07, 2023
Federico Landini, Mireia Diez, Themos Stafylakis, Lukáš Burget

SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis

Nov 29, 2023
Ziqiao Peng, Wentao Hu, Yue Shi, Xiangyu Zhu, Xiaomei Zhang, Hao Zhao, Jun He, Hongyan Liu, Zhaoxin Fan

Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer

Oct 05, 2023
Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot

Multi-dimensional Speech Quality Assessment in Crowdsourcing

Sep 14, 2023
Babak Naderi, Ross Cutler, Nicolae-Catalin Ristea

Analysis of Visual Features for Continuous Lipreading in Spanish

Nov 21, 2023
David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation

Nov 08, 2023
Guinan Su, Yanwu Yang, Zhifeng Li

XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words

Oct 08, 2023
Robin Algayres, Pablo Diego-Simon, Benoit Sagot, Emmanuel Dupoux

Connecting Speech Encoder and Large Language Model for ASR

Sep 26, 2023
Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Towards Conceptualization of "Fair Explanation": Disparate Impacts of anti-Asian Hate Speech Explanations on Content Moderators

Oct 23, 2023
Tin Nguyen, Jiannan Xu, Aayushi Roy, Hal Daumé III, Marine Carpuat

MUST: A Multilingual Student-Teacher Learning approach for low-resource speech recognition

Oct 29, 2023
Muhammad Umar Farooq, Rehan Ahmad, Thomas Hain
