Grouped Query Attention


Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization

Jun 16, 2025
Via arXiv

Unleashing Diffusion and State Space Models for Medical Image Segmentation

Jun 15, 2025
Via arXiv

Hardware-Efficient Attention for Fast Decoding

May 27, 2025
Via arXiv

XDementNET: An Explainable Attention Based Deep Convolutional Network to Detect Alzheimer Progression from MRI data

May 20, 2025
Via arXiv

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

May 21, 2025
Via arXiv

Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers

May 20, 2025
Via arXiv

Stable Reinforcement Learning for Efficient Reasoning

May 23, 2025
Via arXiv

MC3D-AD: A Unified Geometry-aware Reconstruction Model for Multi-category 3D Anomaly Detection

May 04, 2025
Via arXiv

Geometry-Informed Neural Operator Transformer

Apr 29, 2025
Via arXiv

Cost-Optimal Grouped-Query Attention for Long-Context LLMs

Mar 12, 2025
Via arXiv