
Chao Fang

A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems

Jan 07, 2026

P3-LLM: An Integrated NPU-PIM Accelerator for LLM Inference Using Hybrid Numerical Formats

Nov 16, 2025

Precision-Scalable Microscaling Datapaths with Optimized Reduction Tree for Efficient NPU Integration

Nov 09, 2025

APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration

Aug 26, 2025

Efficient Precision-Scalable Hardware for Microscaling (MX) Processing in Robotics Learning

May 28, 2025

Enable Lightweight and Precision-Scalable Posit/IEEE-754 Arithmetic in RISC-V Cores for Transprecision Computing

May 25, 2025

FlashForge: Ultra-Efficient Prefix-Aware Attention for LLM Decoding

May 23, 2025

A Novel P-bit-based Probabilistic Computing Approach for Solving the 3-D Protein Folding Problem

Feb 27, 2025

Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format

Nov 24, 2024

Jamming Detection and Channel Estimation for Spatially Correlated Beamspace Massive MIMO

Oct 18, 2024