"speech": models, code, and papers

Instructing Hierarchical Tasks to Robots by Verbal Commands

Nov 30, 2023
P. Telkes, A. Angleraud, R. Pieters

Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR

Nov 30, 2023
Jintao Jiang, Yingbo Gao, Zoltan Tuske

Fast Word Error Rate Estimation Using Self-Supervised Representations For Speech And Text

Oct 12, 2023
Chanho Park, Chengsong Lu, Mingjie Chen, Thomas Hain

LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models

Nov 28, 2023
Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao

Learning Speech Representation From Contrastive Token-Acoustic Pretraining

Sep 06, 2023
Chunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

Self Generated Wargame AI: Double Layer Agent Task Planning Based on Large Language Model

Dec 02, 2023
Y. Sun, C. Yu, J. Zhao, W. Wang, X. Zhou

Layer-Adapted Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

Oct 06, 2023
Yan Zhao, Yuan Zong, Jincen Wang, Hailun Lian, Cheng Lu, Li Zhao, Wenming Zheng

Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

Nov 05, 2023
Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

Average Token Delay: A Duration-aware Latency Metric for Simultaneous Translation

Nov 27, 2023
Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Overview of the VLSP 2022 -- Abmusu Shared Task: A Data Challenge for Vietnamese Abstractive Multi-document Summarization

Nov 27, 2023
Mai-Vu Tran, Hoang-Quynh Le, Duy-Cat Can, Quoc-An Nguyen
