Stefanie Jegelka

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA

Understanding the Role of Equivariance in Self-supervised Learning

Nov 10, 2024

Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs

Nov 08, 2024

An Information Criterion for Controlled Disentanglement of Multimodal Data

Oct 31, 2024

What is Wrong with Perplexity for Long-context Language Modeling?

Oct 31, 2024

On the Role of Depth and Looping for In-Context Learning with Task Diversity

Oct 29, 2024

Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness

Oct 27, 2024

Computing Optimal Regularizers for Online Linear Optimization

Oct 22, 2024

Simplicity Bias via Global Convergence of Sharpness Minimization

Oct 21, 2024

Learning Linear Attention in Polynomial Time

Oct 14, 2024

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?

Oct 10, 2024