
Murali Emani

MoE-Inference-Bench: Performance Evaluation of Mixture of Expert Large Language and Vision Models

Aug 24, 2025
Via arXiv

LangVision-LoRA-NAS: Neural Architecture Search for Variable LoRA Rank in Vision Language Models

Aug 17, 2025
Via arXiv

BaKlaVa -- Budgeted Allocation of KV cache for Long-context Inference

Feb 18, 2025
Via arXiv

LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators

Oct 31, 2024
Via arXiv

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Oct 11, 2023
Via arXiv

A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators

Oct 06, 2023
Via arXiv

Data Race Detection Using Large Language Models

Aug 15, 2023
Via arXiv

A Survey of Techniques for Optimizing Transformer Inference

Jul 16, 2023
Via arXiv

LM4HPC: Towards Effective Language Model Application in High-Performance Computing

Jun 26, 2023
Via arXiv

A Multi-Level, Multi-Scale Visual Analytics Approach to Assessment of Multifidelity HPC Systems

Jun 15, 2023
Via arXiv