Baris Kasikci

M*: A Modular, Extensible, Serving System for Multimodal Models

Jun 10, 2026
Via arXiv

Ekka: Automated Diagnosis of Silent Errors in LLM Inference

Jun 03, 2026
Via arXiv

MURMUR: An Efficient Inference System for Long-Form ASR

May 31, 2026
Via arXiv

VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?

May 07, 2026
Via arXiv

VoxServe: Streaming-Centric Serving System for Speech Language Models

Jan 30, 2026
Via arXiv

Magneton: Optimizing Energy Efficiency of ML Systems via Differential Energy Debugging

Dec 09, 2025
Via arXiv

Fake Runs, Real Fixes -- Analyzing xPU Performance Through Simulation

Mar 18, 2025
Via arXiv

TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval

Feb 28, 2025
Via arXiv

LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation

Feb 27, 2025
Via arXiv

Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs

Feb 17, 2025
Via arXiv