Guy Boudoukh

An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs

Jun 28, 2023
Haihao Shen, Hengyu Meng, Bo Dong, Zhe Wang, Ofir Zafrir, Yi Ding, Yu Luo, Hanwen Chang, Qun Gao, Ziheng Wang, Guy Boudoukh, Moshe Wasserblat

Fast DistilBERT on CPUs

Oct 27, 2022
Haihao Shen, Ofir Zafrir, Bo Dong, Hengyu Meng, Xinyu Ye, Zhe Wang, Yi Ding, Hanwen Chang, Guy Boudoukh, Moshe Wasserblat

Prune Once for All: Sparse Pre-Trained Language Models

Nov 10, 2021
Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat

Q8BERT: Quantized 8Bit BERT

Oct 17, 2019
Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
