Adam Polyak

Audio Language Modeling using Perceptually-Guided Discrete Representations

Nov 04, 2022
Via arXiv

AudioGen: Textually Guided Audio Generation

Sep 30, 2022
Via arXiv

Make-A-Video: Text-to-Video Generation without Text-Video Data

Sep 29, 2022
Via arXiv

KNN-Diffusion: Image Generation via Large-Scale Retrieval

Apr 06, 2022
Via arXiv

Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors

Mar 24, 2022
Via arXiv

Locally Shifted Attention With Early Global Integration

Dec 22, 2021
Via arXiv

Textless Speech Emotion Conversion using Decomposed and Discrete Representations

Nov 14, 2021
Via arXiv

fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit

Sep 14, 2021
Via arXiv

Text-Free Prosody-Aware Generative Spoken Language Modeling

Sep 07, 2021
Via arXiv

Direct speech-to-speech translation with discrete units

Jul 12, 2021
Via arXiv