
Zixiang Chen

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?

Oct 11, 2023

Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP

Oct 02, 2023

Benign Overfitting for Two-layer ReLU Networks

Mar 07, 2023

Learning High-Dimensional Single-Neuron ReLU Networks with Finite Samples

Mar 03, 2023

ISA-Net: Improved spatial attention network for PET-CT tumor segmentation

Nov 04, 2022

A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning

Sep 30, 2022

Towards Understanding Mixture of Experts in Deep Learning

Aug 04, 2022

Benign Overfitting in Two-layer Convolutional Neural Networks

Feb 14, 2022

Faster Perturbed Stochastic Gradient Methods for Finding Local Minima

Oct 25, 2021

Self-training Converts Weak Learners to Strong Learners in Mixture Models

Jul 16, 2021