Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ezequiel Lopez-Rubio

Enhanced QKNorm normalization for neural transformers with the Lp norm

Feb 04, 2026

Ezequiel Lopez-Rubio, Javier Montes-Perez, Esteban Jose Palomo

Abstract:The normalization of query and key vectors is an essential part of the Transformer architecture. It ensures that learning is stable regardless of the scale of these vectors. Some normalization approaches are available. In this preliminary work, a generalization of the QKNorm normalization scheme is proposed. The approach is based on the Lp norm, allowing non-Euclidean norms to be employed. Experimental results demonstrate the suitability of the method for a simple problem.

Via

Access Paper or Ask Questions

Alternative positional encoding functions for neural transformers

Dec 22, 2025

Ezequiel Lopez-Rubio, Macoris Decena-Gimenez, Rafael Marcos Luque-Baena

Abstract:A key module in neural transformer-based deep architectures is positional encoding. This module enables a suitable way to encode positional information as input for transformer neural layers. This success has been rooted in the use of sinusoidal functions of various frequencies, in order to capture recurrent patterns of differing typical periods. In this work, an alternative set of periodic functions is proposed for positional encoding. These functions preserve some key properties of sinusoidal ones, while they depart from them in fundamental ways. Some tentative experiments are reported, where the original sinusoidal version is substantially outperformed. This strongly suggests that the alternative functions may have a wider use in other transformer architectures.

Via

Access Paper or Ask Questions

Representation of the structure of graphs by sequences of instructions

Dec 13, 2025

Ezequiel Lopez-Rubio

Abstract:The representation of graphs is commonly based on the adjacency matrix concept. This formulation is the foundation of most algebraic and computational approaches to graph processing. The advent of deep learning language models offers a wide range of powerful computational models that are specialized in the processing of text. However, current procedures to represent graphs are not amenable to processing by these models. In this work, a new method to represent graphs is proposed. It represents the adjacency matrix of a graph by a string of simple instructions. The instructions build the adjacency matrix step by step. The transformation is reversible, i.e., given a graph the string can be produced and vice versa. The proposed representation is compact, and it maintains the local structural patterns of the graph. Therefore, it is envisaged that it could be useful to boost the processing of graphs by deep learning models. A tentative computational experiment is reported, demonstrating improved classification performance and faster computation times with the proposed representation.

Via

Access Paper or Ask Questions