Andy Yang

Probability Distributions Computed by Hard-Attention Transformers

Oct 31, 2025

Simulating Hard Attention Using Soft Attention

Dec 13, 2024

A Formal Framework for Understanding Length Generalization in Transformers

Oct 03, 2024

Counting Like Transformers: Compiling Temporal Counting Logic Into Softmax Transformers

Apr 05, 2024

Masked Hard-Attention Transformers and Boolean RASP Recognize Exactly the Star-Free Languages

Oct 21, 2023