Haikuo Shao

APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
Aug 26, 2025

FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization
May 25, 2025

Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores
Sep 26, 2024

Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment
Jul 16, 2024

Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
May 06, 2024

An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT
Mar 29, 2024