Wonjin Shin

POP: Online Structural Pruning Enables Efficient Inference of Large Foundation Models

Feb 06, 2026

V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval

Dec 24, 2025