Bhiksha Raj

Language Technologies Institute, Carnegie Mellon University, Mohammed bin Zayed University of AI

Did You Hear That? Introducing AADG: A Framework for Generating Benchmark Data in Audio Anomaly Detection

Oct 04, 2024

ImageFolder: Autoregressive Image Generation with Folded Tokens

Oct 02, 2024

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Sep 24, 2024

Revisiting Acoustic Features for Robust ASR

Sep 24, 2024

DeWinder: Single-Channel Wind Noise Reduction using Ultrasound Sensing

Sep 10, 2024

PDAF: A Phonetic Debiasing Attention Framework For Speaker Verification

Sep 09, 2024

Efficient Autoregressive Audio Modeling via Next-Scale Prediction

Aug 16, 2024

Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization?

Aug 12, 2024

Audio Entailment: Assessing Deductive Reasoning for Audio Understanding

Jul 25, 2024

SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios

Jul 22, 2024