Abstract: Foundation language models show a remarkable ability to learn new concepts at inference time from context data. However, similar work for images lags behind. To address this gap, we introduce FLoWN, a flow matching model that learns to generate neural network parameters for different tasks. Our approach models the flow in a latent space while conditioning the process on context data. Experiments verify that FLoWN attains various desiderata for a meta-learning model. In addition, it matches or exceeds baselines on in-distribution tasks, provides better initializations for classifier training, and performs well on out-of-distribution few-shot tasks, with a fine-tuning mechanism to further improve performance.
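As a rough illustration of the kind of objective described above, the following is a minimal conditional flow matching sketch in PyTorch. The network architecture, latent dimensions, and context embedding are placeholders for exposition and are not FLoWN's actual implementation; in particular, `VelocityNet` and `flow_matching_loss` are hypothetical names.

```python
# Minimal conditional flow matching sketch (illustrative; not the authors' code).
# Assumptions: target parameters are already encoded into a latent vector z1,
# and `context` is an embedding of the task's context data.
import torch
import torch.nn as nn


class VelocityNet(nn.Module):
    """Predicts the flow velocity from a noisy latent, a time step, and a context embedding."""

    def __init__(self, latent_dim: int, context_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + context_dim + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, hidden),
            nn.SiLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, z_t, t, context):
        # Concatenate the interpolated latent, context embedding, and scalar time.
        return self.net(torch.cat([z_t, context, t], dim=-1))


def flow_matching_loss(model, z1, context):
    """Standard flow matching objective on the latent space with a linear path."""
    z0 = torch.randn_like(z1)                          # noise sample
    t = torch.rand(z1.size(0), 1, device=z1.device)    # random time in [0, 1)
    z_t = (1.0 - t) * z0 + t * z1                      # linear interpolation path
    target_velocity = z1 - z0                          # constant velocity along the path
    pred = model(z_t, t, context)
    return ((pred - target_velocity) ** 2).mean()


# Example usage with toy shapes: batch of 8 latents of size 64, context embeddings of size 32.
model = VelocityNet(latent_dim=64, context_dim=32)
loss = flow_matching_loss(model, torch.randn(8, 64), torch.randn(8, 32))
```

Training would then average this loss over tasks, pairing each encoded parameter latent with an embedding of its context set; generation integrates the learned velocity field from noise to a latent that is decoded back into network parameters.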
Abstract: Bit-flipping attacks are a class of attacks on neural networks, and numerous defense mechanisms have been proposed to mitigate them. Given the importance of ensuring the robustness of these defenses, we perform an empirical study of the Aegis framework. We evaluate the baseline mechanisms of Aegis on low-entropy data (MNIST), and we evaluate a pre-trained model with the mechanisms fine-tuned on MNIST. We also compare data augmentation with the robustness training of Aegis, and examine how Aegis performs under other adversarial attacks, such as the generation of adversarial examples. We find that both the dynamic-exit strategy and the robustness training of Aegis have drawbacks. In particular, we observe drops in accuracy when testing on perturbed data and on adversarial examples, as compared to baselines. Moreover, we find that the dynamic-exit strategy loses its uniformity when tested on simpler datasets. The code for this project is available on GitHub.
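For context, adversarial examples of the kind referenced above are commonly generated with gradient-based methods such as FGSM. The sketch below is a generic illustration under that assumption and does not reflect the study's exact attack configuration or the Aegis models themselves.

```python
# Generic FGSM sketch for generating adversarial examples (illustrative only).
import torch
import torch.nn.functional as F


def fgsm_example(model, x, y, epsilon=0.1):
    """Perturbs input x in the direction of the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # loss w.r.t. the true labels
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()   # one-step perturbation of size epsilon
    return x_adv.clamp(0.0, 1.0).detach() # keep pixels in the valid [0, 1] range (e.g., MNIST)
```

The perturbed inputs would then be fed to the defended model to measure the accuracy drop relative to clean data.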