Viacheslav Yusupov

Token Homogenization under Positional Bias

Aug 23, 2025

Viacheslav Yusupov, Danil Maksimov, Ameliia Alaeva, Tatiana Zaitceva, Antipina Anna, Anna Vasileva, Chenlin Liu, Rayuth Chheng, Danil Sazanakov, Andrey Chetvergov (+2 more)

Abstract: This paper investigates token homogenization, the convergence of token representations toward uniformity across transformer layers, and its relationship to positional bias in large language models. We empirically examine whether homogenization occurs and how positional bias amplifies this effect. Through layer-wise similarity analysis and controlled experiments, we demonstrate that tokens systematically lose distinctiveness during processing, particularly when biased toward extremal positions. Our findings confirm both the existence of homogenization and its dependence on positional attention mechanisms.
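
As a rough illustration of what a layer-wise similarity analysis could look like, the sketch below computes the mean pairwise cosine similarity between token vectors at each layer. The metric choice, the function names (`cosine`, `mean_pairwise_cosine`), and the toy hidden states are all assumptions made for this example; the abstract does not specify the authors' exact measure.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(tokens):
    """Average cosine similarity over all unordered token pairs.

    Values near 1 mean the token vectors point in nearly the same
    direction, i.e. they have become hard to tell apart.
    """
    pairs = list(combinations(tokens, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical hidden states: one list of token vectors per layer.
layers = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],  # early layer: distinct directions
    [[1.0, 0.2], [0.8, 0.6], [0.6, 0.8]],   # middle layer: partially aligned
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],   # late layer: collinear tokens
]

# One similarity value per layer; a rising curve would suggest homogenization.
homogenization = [mean_pairwise_cosine(layer) for layer in layers]
```

With real models, the per-layer token vectors would come from the model's hidden states rather than hand-written lists, and the comparison of interest would be how the curve changes when attention is biased toward particular positions.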

Knowledge Graph Completion with Mixed Geometry Tensor Factorization

Apr 03, 2025

Viacheslav Yusupov, Maxim Rakhuba, Evgeny Frolov

Abstract: In this paper, we propose a new geometric approach for knowledge graph completion via low-rank tensor approximation. We augment a pretrained and well-established Euclidean model based on a Tucker tensor decomposition with a novel hyperbolic interaction term. This correction captures distributional properties of the data in a way that is better aligned with real-world knowledge graphs. By combining the two geometries, our approach improves the expressivity of the resulting model, achieving new state-of-the-art link prediction accuracy with significantly fewer parameters than previous Euclidean and hyperbolic models.

* Accepted to AISTATS 2025
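
To make the two ingredients named in the abstract concrete, the sketch below shows a standard Tucker-style trilinear score for a (subject, relation, object) triple alongside the Poincaré-ball distance, a common hyperbolic quantity. The way they are combined in `mixed_score` is purely an assumption for illustration, as are all function names and toy values; the abstract does not give the form of the authors' hyperbolic interaction term.

```python
import math


def tucker_score(core, subj, rel, obj):
    """Trilinear Tucker score: sum_ijk core[i][j][k] * subj[i] * rel[j] * obj[k]."""
    return sum(
        core[i][j][k] * subj[i] * rel[j] * obj[k]
        for i in range(len(subj))
        for j in range(len(rel))
        for k in range(len(obj))
    )


def poincare_distance(u, v):
    """Distance between two points inside the unit Poincare ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def mixed_score(core, subj, rel, obj, hyp_subj, hyp_obj):
    """Hypothetical combination: Euclidean Tucker score minus a hyperbolic
    distance penalty. This is an illustrative assumption, not the paper's model."""
    return tucker_score(core, subj, rel, obj) - poincare_distance(hyp_subj, hyp_obj)


# Toy 2x2x2 core tensor with two nonzero entries (hypothetical values).
core = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
core[0][0][0] = 1.0
core[1][1][1] = 2.0

euclidean = tucker_score(core, [1.0, 2.0], [3.0, 4.0], [5.0, 6.0])
hyperbolic = poincare_distance([0.0, 0.0], [0.5, 0.0])
combined = mixed_score(core, [1.0, 2.0], [3.0, 4.0], [5.0, 6.0],
                       [0.0, 0.0], [0.5, 0.0])
```

In a link prediction setting, a score like this would be computed for every candidate object entity and the candidates ranked by it; the embeddings and core tensor would be learned rather than fixed by hand.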
