Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Patrick Mendoza

ONG: Orthogonal Natural Gradient Descent

Aug 24, 2025

Yajat Yadav, Jathin Korrapati, Patrick Mendoza

Abstract:Orthogonal gradient descent has emerged as a powerful method for continual learning tasks. However, its Euclidean projections overlook the underlying information-geometric structure of the space of distributions parametrized by neural networks, which can lead to suboptimal convergence in learning tasks. To counteract this, we combine it with the idea of the natural gradient and present ONG (Orthogonal Natural Gradient Descent). ONG preconditions each new task gradient with an efficient EKFAC approximation of the inverse Fisher information matrix, yielding updates that follow the steepest descent direction under a Riemannian metric. To preserve performance on previously learned tasks, ONG projects these natural gradients onto the orthogonal complement of prior task gradients. We provide a theoretical justification for this procedure, introduce the ONG algorithm, and benchmark its performance on the Permuted and Rotated MNIST datasets. All code for our experiments/reproducibility can be found at https://github.com/yajatyadav/orthogonal-natural-gradient.

* Code at https://github.com/yajatyadav/orthogonal-natural-gradient

Via

Access Paper or Ask Questions

Can Transformers Break Encryption Schemes via In-Context Learning?

Aug 13, 2025

Jathin Korrapati, Patrick Mendoza, Aditya Tomar, Abein Abraham

Abstract:In-context learning (ICL) has emerged as a powerful capability of transformer-based language models, enabling them to perform tasks by conditioning on a small number of examples presented at inference time, without any parameter updates. Prior work has shown that transformers can generalize over simple function classes like linear functions, decision trees, even neural networks, purely from context, focusing on numerical or symbolic reasoning over underlying well-structured functions. Instead, we propose a novel application of ICL into the domain of cryptographic function learning, specifically focusing on ciphers such as mono-alphabetic substitution and Vigen\`ere ciphers, two classes of private-key encryption schemes. These ciphers involve a fixed but hidden bijective mapping between plain text and cipher text characters. Given a small set of (cipher text, plain text) pairs, the goal is for the model to infer the underlying substitution and decode a new cipher text word. This setting poses a structured inference challenge, which is well-suited for evaluating the inductive biases and generalization capabilities of transformers under the ICL paradigm. Code is available at https://github.com/adistomar/CS182-project.

Via

Access Paper or Ask Questions