"speech recognition": models, code, and papers

Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking

Mar 13, 2024
Ming Dong, Yujing Chen, Miao Zhang, Hao Sun, Tingting He

AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition

Jan 18, 2024
Ju Lin, Niko Moritz, Yiteng Huang, Ruiming Xie, Ming Sun, Christian Fuegen, Frank Seide

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

Jan 08, 2024
Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng

Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline

Jan 26, 2024
Seonmin Koo, Chanjun Park, Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition

Jan 12, 2024
Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition

Dec 18, 2023
Peng Shen, Xugang Lu, Hisashi Kawai

SeMaScore : a new evaluation metric for automatic speech recognition tasks

Jan 15, 2024
Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar

SALAD: Smart AI Language Assistant Daily

Feb 13, 2024
Ragib Amin Nihal, Tran Dong Huu Quoc, Lin Zirui, Xu Yimimg, Liu Haoran, An Zhaoyi, Kyou Ma

Listening to Multi-talker Conversations: Modular and End-to-end Perspectives

Feb 14, 2024
Desh Raj

Mel-FullSubNet: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR

Feb 22, 2024
Rui Zhou, Xian Li, Ying Fang, Xiaofei Li
