Xinru Tang

Designing Spatial Architectures for Sparse Attention: STAR Accelerator via Cross-Stage Tiling

Dec 24, 2025
Via arXiv

"It's trained by non-disabled people": Evaluating How Image Quality Affects Product Captioning with VLMs

Nov 12, 2025
Via arXiv

Tackling the Dynamicity in a Production LLM Serving System with SOTA Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient Meta-kernels

Dec 24, 2024
Via arXiv