Suyeon Jang

T-SAR: A Full-Stack Co-design for CPU-Only Ternary LLM Inference via In-Place SIMD ALU Reorganization

Nov 17, 2025
Via arXiv

QUILL: An Algorithm-Architecture Co-Design for Cache-Local Deformable Attention

Nov 17, 2025
Via arXiv