Murali Annavaram

Differentially Private Retrieval-Augmented Generation

Feb 16, 2026
Via arXiv

LRD-MPC: Efficient MPC Inference through Low-rank Decomposition

Feb 16, 2026
Via arXiv

Striking the Right Balance between Compute and Copy: Improving LLM Inferencing Under Speculative Decoding

Nov 15, 2025
Via arXiv

DuetServe: Harmonizing Prefill and Decode for LLM Serving via Adaptive GPU Multiplexing

Nov 06, 2025
Via arXiv

Memory-Efficient Differentially Private Training with Gradient Random Projection

Jun 18, 2025
Via arXiv

DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding

Apr 08, 2025
Via arXiv

Efficient LLM Inference with I/O-Aware Partial KV Cache Recomputation

Nov 26, 2024
Via arXiv

Characterizing Context Influence and Hallucination in Summarization

Oct 03, 2024
Via arXiv

Adaptively Private Next-Token Prediction of Large Language Models

Oct 02, 2024
Via arXiv

CADC: Encoding User-Item Interactions for Compressing Recommendation Model Training Data

Jul 11, 2024
Via arXiv