Tamoghno Das

T-SAR: A Full-Stack Co-design for CPU-Only Ternary LLM Inference via In-Place SIMD ALU Reorganization

Nov 17, 2025

QUILL: An Algorithm-Architecture Co-Design for Cache-Local Deformable Attention

Nov 17, 2025

ASTER: Attention-based Spiking Transformer Engine for Event-driven Reasoning

Nov 10, 2025

LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation

Apr 15, 2025