Ayush Goel

In-Context Learning in Linear vs. Quadratic Attention Models: An Empirical Study on Regression Tasks

Feb 19, 2026

Ayush Goel, Arjun Kohli, Sarvagya Somvanshi

Abstract: Recent work has demonstrated that transformers and linear attention models can perform in-context learning (ICL) on simple function classes, such as linear regression. In this paper, we empirically study how these two attention mechanisms differ in their ICL behavior on the canonical linear-regression task of Garg et al. We evaluate learning quality (MSE), convergence, and generalization behavior of each architecture. We also analyze how increasing model depth affects ICL performance. Our results illustrate both the similarities and limitations of linear attention relative to quadratic attention in this setting.
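To make the two mechanisms concrete, here is a minimal NumPy sketch, not the authors' code. It assumes a simplified prompt in which each token concatenates an input `x_i` with its label `y_i = w . x_i`, and it uses random projection matrices as hypothetical stand-ins for learned weights. The function names and sizes are illustrative only. "Quadratic" attention builds the full n-by-n softmax score matrix, while the softmax-free linear variant can be regrouped so that no n-by-n matrix is ever formed.

```python
import numpy as np

def make_regression_prompt(n_points, dim, rng):
    """Sample one in-context linear-regression task: y_i = w . x_i."""
    w = rng.standard_normal(dim)
    xs = rng.standard_normal((n_points, dim))
    ys = xs @ w
    return xs, ys, w

def softmax_attention(q, k, v):
    """Quadratic attention: materializes the full (n x n) score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def linear_attention(q, k, v):
    """Softmax-free attention: (q k^T) v regrouped as q (k^T v)."""
    return q @ (k.T @ v)

rng = np.random.default_rng(0)
xs, ys, w = make_regression_prompt(n_points=16, dim=4, rng=rng)
tokens = np.concatenate([xs, ys[:, None]], axis=1)  # (16, 5): each row is [x_i, y_i]

# Assumption: random projections standing in for learned query/key/value weights.
d_model = tokens.shape[1]
wq, wk, wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
q, k, v = tokens @ wq, tokens @ wk, tokens @ wv

out_quadratic = softmax_attention(q, k, v)
out_linear = linear_attention(q, k, v)
```

The regrouping in `linear_attention` costs on the order of n·d² operations rather than n²·d, which is the usual motivation for dropping the softmax. This sketch omits causal masking and multiple layers, both of which a real ICL experiment would need.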

An Empirical Review of Adversarial Defenses

Dec 10, 2020

Ayush Goel

Abstract: From face recognition systems installed in phones to self-driving cars, the field of AI is witnessing rapid transformations and is being integrated into our everyday lives at an incredible pace. Any major failure in these systems' predictions could be devastating, leaking sensitive information or even costing lives (as in the case of self-driving cars). However, deep neural networks, which form the basis of such systems, are highly susceptible to a specific type of attack, called an adversarial attack. A hacker can, even with minimal computation, generate adversarial examples (images or data points that belong to another class but consistently fool the model into misclassifying them as genuine) and undermine such algorithms. In this paper, we compile and test numerous approaches to defend against such adversarial attacks. Of those explored, we found two effective techniques, namely Dropout and Denoising Autoencoders, and show their success in preventing such attacks from fooling the model. We demonstrate that these techniques are also resistant to both higher noise levels and different kinds of adversarial attacks (although we did not test against all of them). We also develop a framework for deciding which defense technique to use against attacks, based on the nature of the application and the resource constraints of the deep neural network.

* 19 pages, 8 Figures, Report Reviewed by Vivek Menon
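
As a rough illustration of how cheaply an adversarial example can be produced, the sketch below perturbs the input of a toy linear classifier along the sign of the score gradient. The abstract does not name the attacks that were tested, so this sign-gradient step is a swapped-in example, and the classifier, weights, and function names are all hypothetical.

```python
import numpy as np

def predict(w, b, x):
    """Binary linear classifier: returns 1 if the score is positive, else 0."""
    return int(x @ w + b > 0)

def sign_gradient_perturbation(w, b, x, eps):
    """Shift every coordinate of x by eps so the score moves toward the other class."""
    score = x @ w + b
    # For a linear model, the gradient of the score with respect to x is w.
    direction = -np.sign(score) * np.sign(w)
    return x + eps * direction

rng = np.random.default_rng(0)
w = rng.standard_normal(100)  # assumption: random weights standing in for a trained model
b = 0.0
x = rng.standard_normal(100)

score = x @ w + b
# Smallest per-coordinate budget that flips a linear model, plus a 1% margin.
eps = abs(score) / np.abs(w).sum() * 1.01
x_adv = sign_gradient_perturbation(w, b, x, eps)

print(predict(w, b, x), predict(w, b, x_adv), eps)
```

Because each coordinate contributes to the score, the required per-coordinate change shrinks as the input dimension grows, which is one common explanation for why high-dimensional models are easy to fool. Defenses such as those in the abstract aim to make the prediction less sensitive to this kind of small, structured shift.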
