Ritchie Zhao


Microscaling Data Formats for Deep Learning

Oct 19, 2023
Bita Darvish Rouhani, Ritchie Zhao, Ankit More, Mathew Hall, Alireza Khodamoradi, Summer Deng, Dhruv Choudhary, Marius Cornea, Eric Dellinger, Kristof Denolf, Stosic Dusan, Venmugil Elango, Maximilian Golub, Alexander Heinecke, Phil James-Roxby, Dharmesh Jani, Gaurav Kolhe, Martin Langhammer, Ada Li, Levi Melnick, Maral Mesmakhosroshahi, Andres Rodriguez, Michael Schulte, Rasoul Shafipour, Lei Shao, Michael Siu, Pradeep Dubey, Paulius Micikevicius, Maxim Naumov, Colin Verrilli, Ralph Wittig, Doug Burger, Eric Chung

Shared Microexponents: A Little Shifting Goes a Long Way

Feb 16, 2023
Bita Rouhani, Ritchie Zhao, Venmugil Elango, Rasoul Shafipour, Mathew Hall, Maral Mesmakhosroshahi, Ankit More, Levi Melnick, Maximilian Golub, Girish Varatkar, Lei Shao, Gaurav Kolhe, Dimitry Melts, Jasmine Klar, Renee L'Heureux, Matt Perry, Doug Burger, Eric Chung, Zhaoxia Deng, Sam Naghshineh, Jongsoo Park, Maxim Naumov

Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations

Feb 17, 2020
Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang

Overwrite Quantization: Opportunistic Outlier Handling for Neural Network Accelerators

Oct 13, 2019
Ritchie Zhao, Christopher De Sa, Zhiru Zhang

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

Jan 30, 2019
Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang

Improving Neural Network Quantization using Outlier Channel Splitting

Jan 28, 2019
Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang

Building Efficient Deep Neural Networks with Unitary Group Convolutions

Nov 19, 2018
Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration

Jul 15, 2017
Jeng-Hau Lin, Tianwei Xing, Ritchie Zhao, Zhiru Zhang, Mani Srivastava, Zhuowen Tu, Rajesh K. Gupta
