"speech": models, code, and papers

Tuning Large language model for End-to-end Speech Translation

Oct 03, 2023
Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Xiaolin Jiao

DiffusionSat: A Generative Foundation Model for Satellite Imagery

Dec 06, 2023
Samar Khanna, Patrick Liu, Linqi Zhou, Chenlin Meng, Robin Rombach, Marshall Burke, David Lobell, Stefano Ermon

Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer

Oct 25, 2023
Jianwei Zhang, Suren Jayasuriya, Visar Berisha

AdaMesh: Personalized Facial Expressions and Head Poses for Speech-Driven 3D Facial Animation

Oct 11, 2023
Liyang Chen, Weihong Bao, Shun Lei, Boshi Tang, Zhiyong Wu, Shiyin Kang, Haozhi Huang

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

Oct 10, 2023
Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System

Oct 27, 2023
Abdul Waheed, Bashar Talafha, Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed

On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments

Oct 09, 2023
William Ravenscroft, Stefan Goetze, Thomas Hain

Unsupervised Speech Recognition with N-Skipgram and Positional Unigram Matching

Oct 03, 2023
Liming Wang, Mark Hasegawa-Johnson, Chang D. Yoo

Guided Flows for Generative Modeling and Decision Making

Nov 22, 2023
Qinqing Zheng, Matt Le, Neta Shaul, Yaron Lipman, Aditya Grover, Ricky T. Q. Chen

Test-Time Training for Speech

Sep 19, 2023
Sri Harsha Dumpala, Chandramouli Sastry, Sageev Oore
