"speech": models, code, and papers

High-precision Voice Search Query Correction via Retrievable Speech-text Embeddings

Jan 08, 2024
Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar Aleksic

Resolving Transcription Ambiguity in Spanish: A Hybrid Acoustic-Lexical System for Punctuation Restoration

Feb 05, 2024
Xiliang Zhu, Chia-Tien Chang, Shayna Gardiner, David Rossouw, Jonas Robertson

Whispering in Norwegian: Navigating Orthographic and Dialectic Challenges

Feb 02, 2024
Per E Kummervold, Javier de la Rosa, Freddy Wetjen, Rolv-Arild Braaten, Per Erik Solberg

Proactive Detection of Voice Cloning with Localized Watermarking

Jan 30, 2024
Robin San Roman, Pierre Fernandez, Alexandre Défossez, Teddy Furon, Tuan Tran, Hady Elsahar

Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech

Jan 19, 2024
Abhinav Garg, Jiyeon Kim, Sushil Khyalia, Chanwoo Kim, Dhananjaya Gowda

Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience

Feb 06, 2024
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani

Boosting Large Language Model for Speech Synthesis: An Empirical Study

Dec 30, 2023
Hongkun Hao, Long Zhou, Shujie Liu, Jinyu Li, Shujie Hu, Rui Wang, Furu Wei

Evaluating Speech-in-Speech Perception via a Humanoid Robot

Dec 19, 2023
Luke Meyer, Gloria Araiza-Illan, Laura Rachman, Etienne Gaudrain, Deniz Baskent

EmphAssess : a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

Dec 21, 2023
Maureen de Seyssel, Antony D'Avirro, Adina Williams, Emmanuel Dupoux

Who Said What? An Automated Approach to Analyzing Speech in Preschool Classrooms

Jan 14, 2024
Anchen Sun, Juan J Londono, Batya Elbaum, Luis Estrada, Roberto Jose Lazo, Laura Vitale, Hugo Gonzalez Villasanti, Riccardo Fusaroli, Lynn K Perry, Daniel S Messinger
