Rama Doddipatla

Geodesic interpolation of frame-wise speaker embeddings for the diarization of meeting scenarios

Jan 08, 2024

Evaluating Large Language Models for Document-grounded Response Generation in Information-Seeking Dialogues

Sep 21, 2023

Frame-wise and overlap-robust speaker embeddings for meeting diarization

Jun 01, 2023

A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures

Jun 01, 2023

Adversarial learning of neural user simulators for dialogue policy optimisation

Jun 01, 2023

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition

Apr 24, 2023

Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding

Apr 21, 2023

Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer

Jul 29, 2022

Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition

May 09, 2022

On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training

May 03, 2022