ALiBi
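
ALiBi (Attention with Linear Biases; Press et al., 2021) drops position embeddings entirely and instead subtracts a head-specific, linearly growing penalty from each pre-softmax attention score based on query-key distance, which is what lets models trained on short contexts extrapolate to longer ones. A minimal PyTorch sketch of those biases, with illustrative function names; the slope formula assumes the power-of-two head counts used in the paper:

```python
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Geometric sequence from the ALiBi paper: for 8 heads the slopes
    # are 1/2, 1/4, ..., 1/256 (ratio 2^(-8/num_heads) in general).
    start = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([start ** (h + 1) for h in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # distance[i, j] = j - i, so a key j positions behind query i
    # contributes -slope * (i - j): farther keys get larger penalties.
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]          # (seq, seq)
    slopes = alibi_slopes(num_heads)                # (heads,)
    return slopes[:, None, None] * distance[None]   # (heads, seq, seq)

# Added to attention scores before the causal mask and softmax:
#   scores = q @ k.transpose(-2, -1) / sqrt(d) + alibi_bias(H, T)
```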


Mitigating Position-Shift Failures in Text-Based Modular Arithmetic via Position Curriculum and Template Diversity

Jan 07, 2026

Group Representational Position Encoding

Dec 08, 2025

HoPE: Hyperbolic Rotary Positional Encoding for Stable Long-Range Dependency Modeling in Large Language Models

Sep 05, 2025

A standard transformer and attention with linear biases for molecular conformer generation

Jun 24, 2025

SeqPE: Transformer with Sequential Position Encoding

Jun 16, 2025

Theoretical Analysis of Positional Encodings in Transformer Models: Impact on Expressiveness and Generalization

Jun 05, 2025

Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation

May 28, 2025

Context-aware Biases for Length Extrapolation

Mar 11, 2025

Sliding Window Attention Training for Efficient Large Language Models

Feb 26, 2025

Wavelet-based Positional Representation for Long Context

Feb 04, 2025