Yiwen Kou

Fast Sampling via De-randomization for Discrete Diffusion Models

Dec 14, 2023
Zixiang Chen, Huizhuo Yuan, Yongqian Li, Yiwen Kou, Junkai Zhang, Quanquan Gu

Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data

Oct 29, 2023
Yiwen Kou, Zixiang Chen, Quanquan Gu

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?

Oct 11, 2023
Zixiang Chen, Junkai Zhang, Yiwen Kou, Xiangning Chen, Cho-Jui Hsieh, Quanquan Gu

Benign Overfitting for Two-layer ReLU Networks

Mar 07, 2023
Yiwen Kou, Zixiang Chen, Yuanzhou Chen, Quanquan Gu
