Picture for Shaobo Ma

Shaobo Ma

APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration

Add code
Aug 26, 2025
Viaarxiv icon

FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization

Add code
May 25, 2025
Viaarxiv icon

FlashForge: Ultra-Efficient Prefix-Aware Attention for LLM Decoding

Add code
May 23, 2025
Viaarxiv icon

Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores

Add code
Sep 26, 2024
Viaarxiv icon

Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment

Add code
Jul 16, 2024
Viaarxiv icon