Kshitij Gupta

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Mar 30, 2024
Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Mar 26, 2024
Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats L. Richter, Quentin Anthony, Timothée Lesort, Eugene Belilovsky, Irina Rish

Continual Pre-Training of Large Language Models: How to (re)warm your model?

Aug 08, 2023
Kshitij Gupta, Benjamin Thérien, Adam Ibrahim, Mats L. Richter, Quentin Anthony, Eugene Belilovsky, Irina Rish, Timothée Lesort

ARB: Advanced Reasoning Benchmark for Large Language Models

Jul 28, 2023
Tomohiro Sawada, Daniel Paleka, Alexander Havrilla, Pranav Tadepalli, Paula Vidas, Alexander Kranias, John J. Nay, Kshitij Gupta, Aran Komatsuzaki

Broken Neural Scaling Laws

Nov 10, 2022
Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger

Data Augmentation for Automated Essay Scoring using Transformer Models

Oct 29, 2022
Kshitij Gupta

MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation

Oct 01, 2022
Kshitij Gupta

cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation

Jun 09, 2022
Kshitij Gupta, Devansh Gautam, Radhika Mamidi
