Sujay Sanghavi

Pre-training Small Base LMs with Fewer Tokens

Apr 12, 2024
Sunny Sanyal, Sujay Sanghavi, Alexandros G. Dimakis

Time Weaver: A Conditional Time Series Generation Model

Mar 05, 2024
Sai Shankar Narasimhan, Shubhankar Agarwal, Oguzhan Akcin, Sujay Sanghavi, Sandeep Chinchali

In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness

Feb 18, 2024
Liam Collins, Advait Parulekar, Aryan Mokhtari, Sujay Sanghavi, Sanjay Shakkottai

Towards Quantifying the Preconditioning Effect of Adam

Feb 11, 2024
Rudrajit Das, Naman Agarwal, Sujay Sanghavi, Inderjit S. Dhillon

Understanding the Training Speedup from Sampling with Approximate Losses

Feb 10, 2024
Rudrajit Das, Xi Chen, Bertram Ieong, Parikshit Bansal, Sujay Sanghavi

Contrastive Approach to Prior Free Positive Unlabeled Learning

Feb 08, 2024
Anish Acharya, Sujay Sanghavi

Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity

Jul 31, 2023
Charlie Hou, Kiran Koshy Thekumparampil, Michael Shavlovsky, Giulia Fanti, Yesh Dattatreya, Sujay Sanghavi

Logarithmic Bayes Regret Bounds

Jun 15, 2023
Alexia Atsidakou, Branislav Kveton, Sumeet Katariya, Constantine Caramanis, Sujay Sanghavi

Understanding the Effectiveness of Early Weight Averaging for Training Large Language Models

Jun 05, 2023
Sunny Sanyal, Jean Kaddour, Abhishek Kumar, Sujay Sanghavi

Understanding Self-Distillation in the Presence of Label Noise

Jan 30, 2023
Rudrajit Das, Sujay Sanghavi
