Bryan Catanzaro

RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models

Aug 15, 2023
Jie Huang, Wei Ping, Peng Xu, Mohammad Shoeybi, Kevin Chen-Chuan Chang, Bryan Catanzaro

GraPhSyM: Graph Physical Synthesis Model

Aug 07, 2023
Ahmed Agiza, Rajarshi Roy, Teodor Dumitru Ene, Saad Godil, Sherief Reda, Bryan Catanzaro

Progressive Learning of 3D Reconstruction Network from 2D GAN Data

May 18, 2023
Aysegul Dundar, Jun Gao, Andrew Tao, Bryan Catanzaro

Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models

May 17, 2023
Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji

Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study

Apr 13, 2023
Boxin Wang, Wei Ping, Peng Xu, Lawrence McAfee, Zihan Liu, Mohammad Shoeybi, Yi Dong, Oleksii Kuchaiev, Bo Li, Chaowei Xiao, Anima Anandkumar, Bryan Catanzaro

VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation

Mar 14, 2023
Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro

Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models

Feb 14, 2023
Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

Feb 09, 2023
Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, Ming-Yu Liu, Yuke Zhu, Mohammad Shoeybi, Bryan Catanzaro, Chaowei Xiao, Anima Anandkumar

Multilingual Multiaccented Multispeaker TTS with RADTTS

Jan 24, 2023
Rohan Badlani, Rafael Valle, Kevin J. Shih, João Felipe Santos, Siddharth Gururani, Bryan Catanzaro

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

Nov 17, 2022
Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu
