Alexei Baevski

Toward Joint Language Modeling for Speech Units and Text

Oct 12, 2023
Ju-Chieh Chou, Chung-Ming Chien, Wei-Ning Hsu, Karen Livescu, Arun Babu, Alexis Conneau, Alexei Baevski, Michael Auli

Scaling Speech Technology to 1,000+ Languages

May 22, 2023
Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli

OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNav

Mar 14, 2023
Karmesh Yadav, Arjun Majumdar, Ram Ramrakhya, Naoki Yokoyama, Alexei Baevski, Zsolt Kira, Oleksandr Maksymets, Dhruv Batra

AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations

Feb 10, 2023
Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

Dec 14, 2022
Alexei Baevski, Arun Babu, Wei-Ning Hsu, Michael Auli

Introducing Semantics into Speech Encoders

Nov 15, 2022
Derek Xu, Shuyan Dong, Changhan Wang, Suyoun Kim, Zhaojiang Lin, Akshat Shrivastava, Shang-Wen Li, Liang-Hsuan Tseng, Alexei Baevski, Guan-Ting Lin, Hung-yi Lee, Yizhou Sun, Wei Wang

Masked Autoencoders that Listen

Jul 13, 2022
Po-Yao Huang, Hu Xu, Juncheng Li, Alexei Baevski, Michael Auli, Wojciech Galuba, Florian Metze, Christoph Feichtenhofer

Wav2Vec-Aug: Improved self-supervised training with limited data

Jun 27, 2022
Anuroop Sriram, Michael Auli, Alexei Baevski

Offline Visual Representation Learning for Embodied Navigation

Apr 27, 2022
Karmesh Yadav, Ram Ramrakhya, Arjun Majumdar, Vincent-Pierre Berges, Sachit Kuhar, Dhruv Batra, Alexei Baevski, Oleksandr Maksymets

On-demand compute reduction with stochastic wav2vec 2.0

Apr 25, 2022
Apoorv Vyas, Wei-Ning Hsu, Michael Auli, Alexei Baevski
