Amey Agrawal

Microsoft

Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve

Mar 04, 2024
Amey Agrawal, Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S. Gulavani, Alexey Tumanov, Ramachandran Ramjee


SARATHI: Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills

Aug 31, 2023
Amey Agrawal, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S. Gulavani, Ramachandran Ramjee


DynaQuant: Compressing Deep Learning Training Checkpoints via Dynamic Quantization

Jun 20, 2023
Amey Agrawal, Sameer Reddy, Satwik Bhattamishra, Venkata Prabhakara Sarath Nookala, Vidushi Vashishth, Kexin Rong, Alexey Tumanov


Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads

Feb 21, 2022
Dharma Shukla, Muthian Sivathanu, Srinidhi Viswanatha, Bhargav Gulavani, Rimma Nehme, Amey Agrawal, Chen Chen, Nipun Kwatra, Ramachandran Ramjee, Pankaj Sharma, Atul Katiyar, Vipul Modi, Vaibhav Sharma, Abhishek Singh, Shreshth Singhal, Kaustubh Welankar, Lu Xun, Ravi Anupindi, Karthik Elangovan, Hasibur Rahman, Zhou Lin, Rahul Seetharaman, Cheng Xu, Eddie Ailijiang, Suresh Krishnappa, Mark Russinovich


Singularity: Planet-Scale, Preemptible, Elastic Scheduling of AI Workloads

Feb 16, 2022
Dharma Shukla, Muthian Sivathanu, Srinidhi Viswanatha, Bhargav Gulavani, Rimma Nehme, Amey Agrawal, Chen Chen, Nipun Kwatra, Ramachandran Ramjee, Pankaj Sharma, Atul Katiyar, Vipul Modi, Vaibhav Sharma, Abhishek Singh, Shreshth Singhal, Kaustubh Welankar, Lu Xun, Ravi Anupindi, Karthik Elangovan, Hasibur Rahman, Zhou Lin, Rahul Seetharaman, Cheng Xu, Eddie Ailijiang, Suresh Krishnappa, Mark Russinovich


Learning Digital Circuits: A Journey Through Weight Invariant Self-Pruning Neural Networks

Sep 05, 2019
Amey Agrawal, Rohit Karlupia
