Harsh Mehta

Shammie

The Road Less Scheduled

May 24, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

When, Why and How Much? Adaptive Learning Rate Scheduling by Refinement

Oct 11, 2023

Mechanic: A Learning Rate Tuner

Jun 02, 2023

Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

Feb 11, 2023

Simplifying and Understanding State Space Models with Diagonal Linear RNNs

Dec 07, 2022

Differentially Private Image Classification from Features

Nov 24, 2022

Convexifying Transformers: Improving optimization and understanding of transformer networks

Nov 20, 2022

Long Range Language Modeling via Gated State Spaces

Jul 02, 2022