Huamin Chen

Dual-Pool Token-Budget Routing for Cost-Efficient and Reliable LLM Serving

Apr 09, 2026
Via arXiv

From Inference Routing to Agent Orchestration: Declarative Policy Compilation with Cross-Layer Verification

Mar 28, 2026
Via arXiv

Knowledge Access Beats Model Size: Memory Augmented Routing for Persistent AI Agents

Mar 24, 2026
Via arXiv

The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project

Mar 22, 2026
Via arXiv

Conflict-Free Policy Languages for Probabilistic ML Predicates: A Framework and Case Study with the Semantic Router DSL

Mar 18, 2026
Via arXiv

Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents

Mar 16, 2026
Via arXiv

Outcome-Aware Tool Selection for Semantic Routers: Latency-Constrained Learning Without LLM Inference

Mar 13, 2026
Via arXiv

Adaptive Vision-Language Model Routing for Computer Use Agents

Mar 13, 2026
Via arXiv

98× Faster LLM Routing Without a Dedicated GPU: Flash Attention, Prompt Compression, and Near-Streaming for the vLLM Semantic Router

Mar 13, 2026
Via arXiv

Building Trust: Foundations of Security, Safety and Transparency in AI

Nov 19, 2024
Via arXiv