Song Han

University of Connecticut

OckBench: Measuring the Efficiency of LLM Reasoning

Nov 07, 2025

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Oct 10, 2025

LongLive: Real-time Interactive Long Video Generation

Sep 26, 2025

3D Aware Region Prompted Vision Language Model

Sep 16, 2025

Artificial Intelligence-Based Multiscale Temporal Modeling for Anomaly Detection in Cloud Services

Aug 20, 2025

EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos

Jul 16, 2025

Scaling RL to Long Videos

Jul 10, 2025

Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation

Jul 02, 2025

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

Jun 24, 2025

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

May 28, 2025