Cheng Luo

R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration
May 30, 2025

OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions
May 27, 2025

REACT 2025: the Third Multiple Appropriate Facial Reaction Generation Challenge
May 22, 2025

MOM: Memory-Efficient Offloaded Mini-Sequence Inference for Long Context Language Models
Apr 16, 2025

Algorithm Design and Prototype Validation for Reconfigurable Intelligent Sensing Surface: Forward-Only Transmission
Mar 31, 2025

Low-Complexity Beamforming Design for Null Space-based Simultaneous Wireless Information and Power Transfer Systems
Mar 11, 2025

Bedrock Models in Communication and Sensing: Advancing Generalization, Transferability, and Performance
Mar 11, 2025

CaseGen: A Benchmark for Multi-Stage Legal Case Documents Generation
Feb 25, 2025

Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts
Feb 18, 2025

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
Feb 18, 2025