Yiwen Kou

Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent

Apr 18, 2024

Guided Discrete Diffusion for Electronic Health Record Generation

Apr 18, 2024

Fast Sampling via De-randomization for Discrete Diffusion Models

Dec 14, 2023

Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data

Oct 29, 2023

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?

Oct 11, 2023

Benign Overfitting for Two-layer ReLU Networks

Mar 07, 2023