Huihong Shi

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Jun 22, 2024

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Jun 11, 2024

P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer
May 30, 2024

Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
May 06, 2024

An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT
Mar 29, 2024

A Computationally Efficient Neural Video Compression Accelerator Based on a Sparse CNN-Transformer Hybrid Network
Dec 19, 2023

S2R: Exploring a Double-Win Transformer-Based Framework for Ideal and Blind Super-Resolution
Aug 16, 2023

ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
Jun 10, 2023

ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention
Nov 09, 2022

NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks
Oct 24, 2022