Haikuo Shao

APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration

Aug 26, 2025

FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization

May 25, 2025

Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores

Sep 26, 2024

Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment

Jul 16, 2024

Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer

May 06, 2024

An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT

Mar 29, 2024