Rohan Badlani

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

Feb 02, 2024
Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro

Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages

Jan 29, 2024
Akshit Arora, Rohan Badlani, Sungwon Kim, Rafael Valle, Bryan Catanzaro

VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation

Mar 14, 2023
Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro

Multilingual Multiaccented Multispeaker TTS with RADTTS

Jan 24, 2023
Rohan Badlani, Rafael Valle, Kevin J. Shih, João Felipe Santos, Siddharth Gururani, Bryan Catanzaro

Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows

Mar 07, 2022
Kevin J. Shih, Rafael Valle, Rohan Badlani, João Felipe Santos, Bryan Catanzaro

One TTS Alignment To Rule Them All

Aug 23, 2021
Rohan Badlani, Adrian Łańcucki, Kevin J. Shih, Rafael Valle, Wei Ping, Bryan Catanzaro

Relation Extraction with Contextualized Relation Embedding (CRE)

Nov 19, 2020
Xiaoyu Chen, Rohan Badlani

Framework for evaluation of sound event detection in web videos

Apr 04, 2018
Rohan Badlani, Ankit Shah, Benjamin Elizalde, Anurag Kumar, Bhiksha Raj

An Approach for Self-Training Audio Event Detectors Using Web Data

Jun 27, 2017
Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian Lane
