Peter Milder

Stony Brook University

On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers

Jun 02, 2021
Via arXiv

Medusa: A Scalable Interconnect for Many-Port DNN Accelerators and Wide DRAM Controller Interfaces

Jul 11, 2018
Via arXiv