Alexandros Kouris

WhiFlash: Accelerating Speculative Decoding with Token-Level Cross-Paradigm Routing

Jun 05, 2026
Via arXiv

Speculative Decoding with a Speculative Vocabulary

Feb 14, 2026
Via arXiv

Attentive Feature Aggregation or: How Policies Learn to Stop Worrying about Robustness and Attend to Task-Relevant Visual Cues

Nov 13, 2025
Via arXiv

Progressive Mixed-Precision Decoding for Efficient LLM Inference

Oct 17, 2024
Via arXiv

Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads

Oct 17, 2023
Via arXiv

The Future of Consumer Edge-AI Computing

Oct 19, 2022
Via arXiv

Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs

Sep 27, 2022
Via arXiv

Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design

Sep 20, 2022
Via arXiv

Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions

Jun 09, 2021
Via arXiv

Multi-Exit Semantic Segmentation Networks

Jun 07, 2021
Via arXiv