Grouped Query Attention


No Generation without Representation: Efficient Causal Protein Language Models Enable Zero-Shot Fitness Estimation

Feb 02, 2026
Via arXiv

Low-Rank Key Value Attention

Jan 16, 2026
Via arXiv

Mugi: Value Level Parallelism For Efficient LLMs

Jan 15, 2026
Via arXiv

Pairing-free Group-level Knowledge Distillation for Robust Gastrointestinal Lesion Classification in White-Light Endoscopy

Jan 14, 2026
Via arXiv

Nonparametric Kernel Clustering with Bandit Feedback

Jan 12, 2026
Via arXiv

Mixture of Attention Schemes (MoAS): Learning to Route Between MHA, GQA, and MQA

Dec 16, 2025
Via arXiv

Leveraging Parameter Space Symmetries for Reasoning Skill Transfer in LLMs

Nov 13, 2025
Via arXiv

Knocking-Heads Attention

Oct 27, 2025
Via arXiv

Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space

Oct 06, 2025
Via arXiv

MCP-RiskCue: Can LLM Infer Risk Information From MCP Server System Logs?

Nov 12, 2025
Via arXiv