Alert button
Picture for Shruti Bhosale

Shruti Bhosale

Alert button

NLLB Team

Effective Long-Context Scaling of Foundation Models

Sep 27, 2023
Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

Figure 1 for Effective Long-Context Scaling of Foundation Models
Figure 2 for Effective Long-Context Scaling of Foundation Models
Figure 3 for Effective Long-Context Scaling of Foundation Models
Figure 4 for Effective Long-Context Scaling of Foundation Models
Viaarxiv icon

Llama 2: Open Foundation and Fine-Tuned Chat Models

Jul 19, 2023
Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom

Figure 1 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Figure 2 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Figure 3 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Figure 4 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Viaarxiv icon

Revisiting Machine Translation for Cross-lingual Classification

May 23, 2023
Mikel Artetxe, Vedanuj Goswami, Shruti Bhosale, Angela Fan, Luke Zettlemoyer

Figure 1 for Revisiting Machine Translation for Cross-lingual Classification
Figure 2 for Revisiting Machine Translation for Cross-lingual Classification
Figure 3 for Revisiting Machine Translation for Cross-lingual Classification
Figure 4 for Revisiting Machine Translation for Cross-lingual Classification
Viaarxiv icon

Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference

Mar 10, 2023
Haiyang Huang, Newsha Ardalani, Anna Sun, Liu Ke, Hsien-Hsin S. Lee, Anjali Sridhar, Shruti Bhosale, Carole-Jean Wu, Benjamin Lee

Figure 1 for Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference
Figure 2 for Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference
Figure 3 for Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference
Figure 4 for Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference
Viaarxiv icon

Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation

Dec 15, 2022
Maha Elbayad, Anna Sun, Shruti Bhosale

Figure 1 for Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation
Figure 2 for Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation
Figure 3 for Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation
Figure 4 for Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation
Viaarxiv icon

Causes and Cures for Interference in Multilingual Translation

Dec 14, 2022
Uri Shaham, Maha Elbayad, Vedanuj Goswami, Omer Levy, Shruti Bhosale

Figure 1 for Causes and Cures for Interference in Multilingual Translation
Figure 2 for Causes and Cures for Interference in Multilingual Translation
Figure 3 for Causes and Cures for Interference in Multilingual Translation
Figure 4 for Causes and Cures for Interference in Multilingual Translation
Viaarxiv icon

No Language Left Behind: Scaling Human-Centered Machine Translation

Jul 11, 2022
NLLB team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Jeff Wang

Figure 1 for No Language Left Behind: Scaling Human-Centered Machine Translation
Figure 2 for No Language Left Behind: Scaling Human-Centered Machine Translation
Figure 3 for No Language Left Behind: Scaling Human-Centered Machine Translation
Figure 4 for No Language Left Behind: Scaling Human-Centered Machine Translation
Viaarxiv icon

Multilingual Machine Translation with Hyper-Adapters

May 22, 2022
Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale

Figure 1 for Multilingual Machine Translation with Hyper-Adapters
Figure 2 for Multilingual Machine Translation with Hyper-Adapters
Figure 3 for Multilingual Machine Translation with Hyper-Adapters
Figure 4 for Multilingual Machine Translation with Hyper-Adapters
Viaarxiv icon

Data Selection Curriculum for Neural Machine Translation

Mar 25, 2022
Tasnim Mohiuddin, Philipp Koehn, Vishrav Chaudhary, James Cross, Shruti Bhosale, Shafiq Joty

Figure 1 for Data Selection Curriculum for Neural Machine Translation
Figure 2 for Data Selection Curriculum for Neural Machine Translation
Figure 3 for Data Selection Curriculum for Neural Machine Translation
Figure 4 for Data Selection Curriculum for Neural Machine Translation
Viaarxiv icon

Efficient Large Scale Language Modeling with Mixtures of Experts

Dec 20, 2021
Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

Figure 1 for Efficient Large Scale Language Modeling with Mixtures of Experts
Figure 2 for Efficient Large Scale Language Modeling with Mixtures of Experts
Figure 3 for Efficient Large Scale Language Modeling with Mixtures of Experts
Figure 4 for Efficient Large Scale Language Modeling with Mixtures of Experts
Viaarxiv icon