Vamsi Krishna Ithapu

Sound Event Detection with Boundary-Aware Optimization and Inference

Jan 07, 2026

Hearing Anywhere in Any Environment

Apr 14, 2025

Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement

Nov 04, 2024

Spherical World-Locking for Audio-Visual Localization in Egocentric Videos

Aug 09, 2024

Hearing Loss Detection from Facial Expressions in One-on-one Conversations

Jan 17, 2024

The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

Dec 20, 2023

Egocentric Auditory Attention Localization in Conversations

Mar 28, 2023

Novel-View Acoustic Synthesis

Jan 23, 2023

Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations

Jan 04, 2023

LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders

Nov 20, 2022