Anurag Kumar

Interspeech 2025 URGENT Speech Enhancement Challenge

May 29, 2025

Learning to Highlight Audio by Watching Movies

May 17, 2025

Hearing Anywhere in Any Environment

Apr 14, 2025

Quickest change detection for UAV-based sensing

Apr 10, 2025

Efficient Audiovisual Speech Processing via MUTUD: Multimodal Training and Unimodal Deployment

Jan 30, 2025

SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models

Jan 14, 2025

Bridging Context Gaps: Enhancing Comprehension in Long-Form Social Conversations Through Contextualized Excerpts

Dec 28, 2024

Scaling Concept With Text-Guided Diffusion Models

Oct 31, 2024

Using RLHF to align speech enhancement approaches to mean-opinion quality scores

Oct 17, 2024

Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation

Oct 09, 2024