Guillaume Wenzek

NLLB Team

Seamless: Multilingual Expressive and Streaming Speech Translation

Dec 08, 2023
Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek, Yilin Yang, Ethan Ye, Ivan Evtimov, Pierre Fernandez, Cynthia Gao, Prangthip Hansanti, Elahe Kalbassi, Amanda Kallet, Artyom Kozhevnikov, Gabriel Mejia Gonzalez, Robin San Roman, Christophe Touret, Corinne Wong, Carleigh Wood, Bokai Yu, Pierre Andrews, Can Balioglu, Peng-Jen Chen, Marta R. Costa-jussà, Maha Elbayad, Hongyu Gong, Francisco Guzmán, Kevin Heffernan, Somya Jain, Justine Kao, Ann Lee, Xutai Ma, Alex Mourachko, Benjamin Peloquin, Juan Pino, Sravya Popuri, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Anna Sun, Paden Tomasello, Changhan Wang, Jeff Wang, Skyler Wang, Mary Williamson

Via arXiv

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

Aug 23, 2023
Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Pengwei Li, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Ethan Ye, Bapi Akula, Peng-Jen Chen, Naji El Hachem, Brian Ellis, Gabriel Mejia Gonzalez, Justin Haaheim, Prangthip Hansanti, Russ Howes, Bernie Huang, Min-Jae Hwang, Hirofumi Inaguma, Somya Jain, Elahe Kalbassi, Amanda Kallet, Ilia Kulikov, Janice Lam, Daniel Li, Xutai Ma, Ruslan Mavlyutov, Benjamin Peloquin, Mohamed Ramadan, Abinesh Ramakrishnan, Anna Sun, Kevin Tran, Tuan Tran, Igor Tufanov, Vish Vogeti, Carleigh Wood, Yilin Yang, Bokai Yu, Pierre Andrews, Can Balioglu, Marta R. Costa-jussà, Onur Celebi, Maha Elbayad, Cynthia Gao, Francisco Guzmán, Justine Kao, Ann Lee, Alexandre Mourachko, Juan Pino, Sravya Popuri, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Paden Tomasello, Changhan Wang, Jeff Wang, Skyler Wang

Via arXiv

No Language Left Behind: Scaling Human-Centered Machine Translation

Jul 11, 2022
NLLB team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Jeff Wang

Via arXiv

How Robust is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training?

Apr 29, 2022
Shiyue Zhang, Vishrav Chaudhary, Naman Goyal, James Cross, Guillaume Wenzek, Mohit Bansal, Francisco Guzman

Via arXiv

The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

Jun 06, 2021
Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc'Aurelio Ranzato, Francisco Guzman, Angela Fan

Via arXiv

Generating Fact Checking Briefs

Nov 10, 2020
Angela Fan, Aleksandra Piktus, Fabio Petroni, Guillaume Wenzek, Marzieh Saeidi, Andreas Vlachos, Antoine Bordes, Sebastian Riedel

Via arXiv

Beyond English-Centric Multilingual Machine Translation

Oct 21, 2020
Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

Via arXiv

CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data

Nov 15, 2019
Guillaume Wenzek, Marie-Anne Lachaux, Alexis Conneau, Vishrav Chaudhary, Francisco Guzmán, Armand Joulin, Edouard Grave

Via arXiv

CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB

Nov 10, 2019
Holger Schwenk, Guillaume Wenzek, Sergey Edunov, Edouard Grave, Armand Joulin

Via arXiv

Unsupervised Cross-lingual Representation Learning at Scale

Nov 05, 2019
Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov

Via arXiv