Zhewei Yao

Rethinking Batch Normalization in Transformers

Mar 17, 2020
Via arXiv

PyHessian: Neural Networks Through the Lens of the Hessian

Jan 02, 2020
Via arXiv

ZeroQ: A Novel Zero Shot Quantization Framework

Jan 01, 2020
Via arXiv

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks

Nov 10, 2019
Via arXiv

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

Sep 25, 2019
Via arXiv

ANODEV2: A Coupled Neural ODE Evolution Framework

Jun 10, 2019
Via arXiv

Residual Networks as Nonlinear Systems: Stability Analysis using Linearization

May 31, 2019
Via arXiv

HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision

Apr 29, 2019
Via arXiv

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks

Apr 07, 2019
Via arXiv

Inefficiency of K-FAC for Large Batch Size Training

Mar 14, 2019
Via arXiv