Nima Mesgarani

Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation

Mar 27, 2024
Xilin Jiang, Cong Han, Nima Mesgarani

Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience

Feb 06, 2024
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani

Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain

Jan 31, 2024
Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani

Exploring Self-Supervised Contrastive Learning of Spatial Sound Event Representation

Sep 27, 2023
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Sep 18, 2023
Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani

SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs

Jul 18, 2023
Yinghao Aaron Li, Cong Han, Nima Mesgarani

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Jun 13, 2023
Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani

DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes

May 29, 2023
Xilin Jiang, Yinghao Aaron Li, Nima Mesgarani

Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network

Mar 13, 2023
Cong Han, Nima Mesgarani

Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation

Feb 11, 2023
Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani
