Eshaan Nichani

How Transformers Learn Causal Structure with Gradient Descent

Feb 22, 2024
Eshaan Nichani, Alex Damian, Jason D. Lee

Via arXiv

Learning Hierarchical Polynomials with Three-Layer Neural Networks

Nov 23, 2023
Zihao Wang, Eshaan Nichani, Jason D. Lee

Via arXiv

Fine-Tuning Language Models with Just Forward Passes

May 27, 2023
Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, Sanjeev Arora

Via arXiv

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

May 18, 2023
Alex Damian, Eshaan Nichani, Rong Ge, Jason D. Lee

Via arXiv

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks

May 11, 2023
Eshaan Nichani, Alex Damian, Jason D. Lee

Via arXiv

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability

Sep 30, 2022
Alex Damian, Eshaan Nichani, Jason D. Lee

Via arXiv

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials

Jun 08, 2022
Eshaan Nichani, Yu Bai, Jason D. Lee

Via arXiv

Do Deeper Convolutional Networks Perform Better?

Oct 19, 2020
Eshaan Nichani, Adityanarayanan Radhakrishnan, Caroline Uhler

Via arXiv

Balancedness and Alignment are Unlikely in Linear Neural Networks

Mar 13, 2020
Adityanarayanan Radhakrishnan, Eshaan Nichani, Daniel Bernstein, Caroline Uhler

Via arXiv