Abstract: Equivariant neural networks incorporate symmetries through group actions, embedding them as an inductive bias to improve performance on a wide variety of tasks. However, existing equivariant methods can be computationally intensive, have high parameter counts, and are often tied to a specific architecture. We propose a simple zero-parameter approach that imposes approximate equivariance for a finite group in the latent representation as an additional term in the loss function. We conduct experiments that allow the network to learn a group representation on the latent space and show that in every case it prefers to learn the regular representation. Fixing this action on the latent space yields a simple method for imposing approximate equivariance as an additional loss penalty. We benchmark our approach on three datasets and compare it against several existing equivariant methods, showing that in many cases it achieves similar or better performance for a fraction of the parameters.
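A minimal sketch of what such a loss penalty could look like, assuming a cyclic group C_n acting on image inputs by 90-degree rotations (n = 4) and acting on the latent space by cyclically permuting n equal blocks, i.e. copies of the regular representation. The function name, the block convention, and the weighting are hypothetical, not the paper's exact formulation.

```python
import torch

def equivariance_penalty(encoder, x, n=4):
    """Approximate-equivariance penalty for C_n (illustrative sketch).

    Assumes the group acts on inputs by 90-degree rotations and on the latent
    vector by a cyclic shift of n equal blocks (regular-representation action).
    """
    z = encoder(x)                       # (batch, d), d assumed divisible by n
    blocks = z.chunk(n, dim=-1)          # split latent into n blocks
    penalty = 0.0
    for k in range(1, n):
        x_rot = torch.rot90(x, k, dims=(-2, -1))                 # group action on the input
        z_rot = encoder(x_rot)                                    # encode the transformed input
        z_perm = torch.cat(blocks[-k:] + blocks[:-k], dim=-1)     # regular-rep action on the latent
        penalty = penalty + (z_rot - z_perm).pow(2).mean()        # mismatch between the two paths
    return penalty / (n - 1)

# Total objective would then be, e.g.: loss = task_loss + lambda_eq * equivariance_penalty(encoder, x)
```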
Abstract: Transformer models have revolutionized fields from natural language processing to computer vision, yet their internal computational dynamics remain poorly understood, raising concerns about predictability and robustness. In this work, we introduce Entropy-Lens, a scalable, model-agnostic framework that leverages information theory to interpret frozen, off-the-shelf large-scale transformers. By quantifying the evolution of Shannon entropy within intermediate residual streams, our approach extracts computational signatures that distinguish model families, categorize task-specific prompts, and correlate with output accuracy. We further demonstrate the generality of our method by extending the analysis to vision transformers. Our results suggest that entropy-based metrics can serve as a principled tool for unveiling the inner workings of modern transformer architectures.
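A minimal sketch of one way to compute such a layer-wise entropy profile, assuming a logit-lens-style readout: each intermediate residual stream is projected through the model's output head and the Shannon entropy of the resulting vocabulary distribution is recorded. The function and argument names are hypothetical, and this is not necessarily the exact Entropy-Lens recipe.

```python
import torch
import torch.nn.functional as F

def layerwise_entropy(hidden_states, unembed, ln_f=None):
    """Shannon entropy of the token distribution read off each layer.

    `hidden_states`: list of residual-stream tensors (batch, seq, d_model),
    e.g. from a Hugging Face model called with output_hidden_states=True.
    `unembed`: the output embedding / lm_head; `ln_f`: the final layer norm.
    """
    entropies = []
    for h in hidden_states:
        if ln_f is not None:
            h = ln_f(h)
        probs = F.softmax(unembed(h), dim=-1)                 # vocabulary distribution per position
        ent = -(probs * torch.log(probs + 1e-12)).sum(-1)     # Shannon entropy per token
        entropies.append(ent.mean().item())                   # average over batch and positions
    return entropies                                          # one value per layer: the entropy profile
```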
Abstract: Clifford Group Equivariant Neural Networks (CGENNs) leverage Clifford algebras and multivectors as an alternative route to incorporating group equivariance, ensuring symmetry constraints in neural representations. In principle, this formulation generalizes to orthogonal groups and preserves equivariance regardless of the metric signature. However, previous works have restricted internal network representations to Euclidean or Minkowski (pseudo-)metrics, handpicked depending on the problem at hand. In this work, we propose an alternative method that enables the metric to be learned in a data-driven fashion, allowing CGENNs to learn more flexible representations. Specifically, we fully populate the metric matrices, ensuring they are symmetric by construction, and leverage eigenvalue decomposition to integrate this additional learnable component into the original CGENN formulation in a principled manner. Additionally, we motivate our method using insights from category theory, which enables us to explain Clifford algebras as a categorical construction and guarantee the mathematical soundness of our approach. We validate our method on various tasks and showcase the advantages of learning more flexible latent metric representations. The code and data are available at https://github.com/rick-ali/Metric-Learning-for-CGENNs
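A minimal sketch of the learnable-metric component described above, assuming a PyTorch module: a full parameter matrix is symmetrized by construction and an eigenvalue decomposition exposes the diagonalized form that a CGENN-style layer could consume. The class name, initialization, and return convention are hypothetical illustrations, not the released implementation.

```python
import torch
import torch.nn as nn

class LearnableMetric(nn.Module):
    """Data-driven symmetric metric for a Clifford algebra (illustrative sketch)."""

    def __init__(self, dim):
        super().__init__()
        self.raw = nn.Parameter(0.1 * torch.randn(dim, dim))   # fully populated raw parameters

    def forward(self):
        metric = 0.5 * (self.raw + self.raw.T)                 # symmetric by construction
        eigvals, eigvecs = torch.linalg.eigh(metric)           # real spectrum of a symmetric matrix
        return metric, eigvals, eigvecs                        # eigvecs diagonalize the learned metric
```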