Guihai Chen

Pre$^3$: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation

Jun 04, 2025, via arXiv

B2LoRa: Boosting LoRa Transmission for Satellite-IoT Systems with Blind Coherent Combining

May 30, 2025, via arXiv

Query Routing for Retrieval-Augmented Language Models

May 29, 2025, via arXiv

Automated Privacy Information Annotation in Large Language Model Interactions

May 27, 2025, via arXiv

DVD-Quant: Data-free Video Diffusion Transformers Quantization

May 24, 2025, via arXiv

FlashForge: Ultra-Efficient Prefix-Aware Attention for LLM Decoding

May 23, 2025, via arXiv

Low-bit Model Quantization for Deep Neural Networks: A Survey

May 08, 2025, via arXiv

Responsive DNN Adaptation for Video Analytics against Environment Shift via Hierarchical Mobile-Cloud Collaborations

Apr 30, 2025, via arXiv

Symmetry-Preserving Architecture for Multi-NUMA Environments (SPANE): A Deep Reinforcement Learning Approach for Dynamic VM Scheduling

Apr 21, 2025, via arXiv

Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions

Apr 17, 2025, via arXiv