Baohong Lv

Scaling TransNormer to 175 Billion Parameters

Jul 27, 2023
Via arXiv

cosFormer: Rethinking Softmax in Attention

Feb 17, 2022
Via arXiv