Zebin Yang

KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning

Feb 27, 2026

DySL-VLA: Efficient Vision-Language-Action Model Inference via Dynamic-Static Layer-Skipping for Robot Manipulation

Feb 26, 2026

LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design

Feb 21, 2025

Inherently Interpretable Tree Ensemble Learning

Oct 24, 2024

MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers

Oct 23, 2024

FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference

May 25, 2024

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

Feb 21, 2024

AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology

Jan 21, 2024

PiML Toolbox for Interpretable Machine Learning Model Development and Validation

May 07, 2023

Explainable Recommendation Systems by Generalized Additive Models with Manifest and Latent Interactions

Dec 15, 2020