Song Mei

Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization

Jun 12, 2024
Via arXiv

U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models

May 01, 2024
Via arXiv

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

Apr 11, 2024
Via arXiv

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

Apr 08, 2024
Via arXiv

Statistical Estimation in the Spiked Tensor Model via the Quantum Approximate Optimization Algorithm

Feb 29, 2024
Via arXiv

Mean-field variational inference with the TAP free energy: Geometric and statistical properties in linear models

Nov 14, 2023
Via arXiv

How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations

Oct 16, 2023
Via arXiv

Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining

Oct 12, 2023
Via arXiv

Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models

Sep 20, 2023
Via arXiv

What can a Single Attention Layer Learn? A Study Through the Random Features Lens

Jul 21, 2023
Via arXiv