Rita Singh

OleSpeech-IV: A Large-Scale Multispeaker and Multilingual Conversational Speech Dataset with Diverse Topics

Sep 04, 2025
Via arXiv

Deciphering GunType Hierarchy through Acoustic Analysis of Gunshot Recordings

Jun 25, 2025
Via arXiv

CoLMbo: Speaker Language Model for Descriptive Profiling

Jun 11, 2025
Via arXiv

CAARMA: Class Augmentation with Adversarial Mixup Regularization

Mar 20, 2025
Via arXiv

A New Benchmark for Few-Shot Class-Incremental Learning: Redefining the Upper Bound

Mar 13, 2025
Via arXiv

Mellow: a small audio language model for reasoning

Mar 11, 2025
Via arXiv

On the Robust Approximation of ASR Metrics

Feb 18, 2025
Via arXiv

Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models

Feb 18, 2025
Via arXiv

ADIFF: Explaining audio difference using natural language

Feb 06, 2025
Via arXiv

Tessellated Linear Model for Age Prediction from Voice

Jan 16, 2025
Via arXiv