Alibi


Context-aware Biases for Length Extrapolation

Mar 11, 2025

Sliding Window Attention Training for Efficient Large Language Models

Feb 26, 2025

Wavelet-based Positional Representation for Long Context

Feb 04, 2025

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Dec 23, 2024

Flex Attention: A Programming Model for Generating Optimized Attention Kernels

Dec 07, 2024

The Nature of Mathematical Modeling and Probabilistic Optimization Engineering in Generative AI

Oct 24, 2024

Linear Recency Bias During Training Improves Transformers' Fit to Reading Times

Sep 17, 2024

On the Interchangeability of Positional Embeddings in Multilingual Neural Machine Translation Models

Aug 21, 2024

BEACON: Benchmark for Comprehensive RNA Tasks and Language Models

Jun 14, 2024

Mitigate Position Bias in Large Language Models via Scaling a Single Dimension

Jun 04, 2024