"speech recognition": models, code, and papers

EMO-SUPERB: An In-depth Look at Speech Emotion Recognition

Feb 22, 2024
Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee

Non-verbal information in spontaneous speech -- towards a new framework of analysis

Mar 06, 2024
Tirza Biron, Moshe Barboy, Eran Ben-Artzy, Alona Golubchik, Yanir Marmor, Smadar Szekely, Yaron Winter, David Harel

Language and Speech Technology for Central Kurdish Varieties

Mar 04, 2024
Sina Ahmadi, Daban Q. Jaff, Md Mahfuz Ibn Alam, Antonios Anastasopoulos

Classist Tools: Social Class Correlates with Performance in NLP

Mar 07, 2024
Amanda Cercas Curry, Giuseppe Attanasio, Zeerak Talat, Dirk Hovy

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

Jan 08, 2024
Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng

PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings

Mar 04, 2024
Joonas Kalda, Clément Pagés, Ricard Marxer, Tanel Alumäe, Hervé Bredin

RADIA -- Radio Advertisement Detection with Intelligent Analytics

Mar 06, 2024
Jorge Álvarez, Juan Carlos Armenteros, Camilo Torrón, Miguel Ortega-Martín, Alfonso Ardoiz, Óscar García, Ignacio Arranz, Íñigo Galdeano, Ignacio Garrido, Adrián Alonso, Fernando Bayón, Oleg Vorontsov

LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition

Jan 12, 2024
Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang

AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition

Jan 18, 2024
Ju Lin, Niko Moritz, Yiteng Huang, Ruiming Xie, Ming Sun, Christian Fuegen, Frank Seide

Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation

Feb 19, 2024
Nineli Lashkarashvili, Wen Wu, Guangzhi Sun, Philip C. Woodland
