Darshil Doshi

To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets

Oct 19, 2023
Darshil Doshi, Aritra Das, Tianyu He, Andrey Gromov

AutoInit: Automatic Initialization via Jacobian Tuning

Jun 27, 2022
Tianyu He, Darshil Doshi, Andrey Gromov

Critical initialization of wide and deep neural networks through partial Jacobians: general theory and applications to LayerNorm

Nov 30, 2021
Darshil Doshi, Tianyu He, Andrey Gromov
