Bahram Zonooz

PhysVid: Physics Aware Local Conditioning for Generative Video Models

Mar 27, 2026

Saurabh Pathak, Elahe Arani, Mykola Pechenizkiy, Bahram Zonooz

Abstract: Generative video models achieve high visual fidelity but often violate basic physical principles, limiting reliability in real-world settings. Prior attempts to inject physics rely on conditioning: frame-level signals are domain-specific and short-horizon, while global text prompts are coarse and noisy, missing fine-grained dynamics. We present PhysVid, a physics-aware local conditioning scheme that operates over temporally contiguous chunks of frames. Each chunk is annotated with physics-grounded descriptions of states, interactions, and constraints, which are fused with the global prompt via chunk-aware cross-attention during training. At inference, we introduce negative physics prompts (descriptions of locally relevant law violations) to steer generation away from implausible trajectories. On VideoPhy, PhysVid improves physical commonsense scores by $\approx 33\%$ over baseline video generators, and by up to $\approx 8\%$ on VideoPhy2. These results show that local, physics-aware guidance substantially increases physical plausibility in generative video and marks a step toward physics-grounded video models.

* Accepted for CVPR 2026
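
The abstract does not say how the negative physics prompts enter the sampler. A common way to use negative prompts in diffusion-style generators is classifier-free guidance, where the prediction conditioned on the negative prompt takes the place of the unconditional one. The sketch below illustrates that generic rule only; the function name, the flat list inputs, and the guidance scale are assumptions made for illustration and are not taken from PhysVid.

```python
def guided_prediction(pred_positive, pred_negative, guidance_scale):
    """Generic classifier-free guidance with a negative prompt (illustrative).

    pred_positive:  model output conditioned on the desired prompt
    pred_negative:  model output conditioned on the negative prompt
    guidance_scale: values above 1 push the result further from the negative
    """
    # Start from the negative-prompt prediction and move along the direction
    # that points from it toward the positive-prompt prediction.
    return [
        neg + guidance_scale * (pos - neg)
        for pos, neg in zip(pred_positive, pred_negative)
    ]


if __name__ == "__main__":
    positive = [1.0, 2.0]
    negative = [0.0, 1.0]
    print(guided_prediction(positive, negative, guidance_scale=3.0))
```

With a guidance scale of 1 the rule returns the positive-prompt prediction unchanged; larger scales extrapolate away from the negative-prompt prediction.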

Continual Learning Beyond Experience Rehearsal and Full Model Surrogates

May 28, 2025

Prashant Bhat, Laurens Niesten, Elahe Arani, Bahram Zonooz

Abstract: Continual learning (CL) has remained a significant challenge for deep neural networks, as learning new tasks erases previously acquired knowledge, either partially or completely. Existing solutions often rely on experience rehearsal or full model surrogates to mitigate catastrophic forgetting (CF). While effective, these approaches introduce substantial memory and computational overhead, limiting their scalability and applicability in real-world scenarios. To address this, we propose SPARC, a scalable CL approach that eliminates the need for experience rehearsal and full-model surrogates. By effectively combining task-specific working memories and task-agnostic semantic memory for cross-task knowledge consolidation, SPARC achieves remarkable parameter efficiency, using only 6% of the parameters required by full-model surrogates. Despite its lightweight design, SPARC achieves superior performance on Seq-TinyImageNet and matches rehearsal-based methods on various CL benchmarks. Additionally, weight re-normalization in the classification layer mitigates task-specific biases, establishing SPARC as a practical and scalable solution for CL under stringent efficiency constraints.

* 23 pages, 9 figures

Parameter Efficient Continual Learning with Dynamic Low-Rank Adaptation

May 17, 2025

Prashant Shivaram Bhat, Shakib Yazdani, Elahe Arani, Bahram Zonooz

Abstract: Catastrophic forgetting has remained a critical challenge for deep neural networks in Continual Learning (CL), as it undermines consolidated knowledge when learning new tasks. Parameter-efficient fine-tuning CL techniques are gaining traction for their effectiveness in addressing catastrophic forgetting with a lightweight training schedule while avoiding degradation of consolidated knowledge in pre-trained models. However, low-rank adapters (LoRA) in these approaches are highly sensitive to rank selection, which can lead to sub-optimal resource allocation and performance. To this end, we introduce PEARL, a rehearsal-free CL framework that entails dynamic rank allocation for LoRA components during CL training. Specifically, PEARL leverages reference task weights and adaptively determines the rank of task-specific LoRA components based on the current task's proximity to reference task weights in parameter space. To demonstrate the versatility of PEARL, we evaluate it across three vision architectures (ResNet, Separable Convolutional Network and Vision Transformer) and a multitude of CL scenarios, and show that PEARL outperforms all considered baselines by a large margin.

* 27 pages, 5 figures
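
To make the rank sensitivity mentioned in the abstract concrete, here is a minimal sketch of a standard LoRA update, in which a frozen weight matrix W is adjusted by a scaled low-rank product B·A. It shows only the generic LoRA formulation and how the trainable parameter count grows with the rank; PEARL's dynamic rank allocation rule is not reproduced here, and all function names and the `alpha` scaling argument are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: frozen weight, shape (d_out, d_in)
    b: adapter factor, shape (d_out, r)
    a: adapter factor, shape (r, d_in)
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in one adapter: B has d_out*r, A has r*d_in."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]
    b = [[1.0], [2.0]]   # rank-1 adapter
    a = [[3.0, 4.0]]
    print(lora_effective_weight(w, b, a, alpha=1.0))
    for r in (2, 8, 32):
        print(r, lora_param_count(768, 768, r))
```

The adapter cost scales linearly with the rank, which is why a per-task choice of rank changes how the parameter budget is spent.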

Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method

Jun 23, 2024

Kishaan Jeeveswaran, Elahe Arani, Bahram Zonooz

Abstract: Domain incremental learning (DIL) poses a significant challenge in real-world scenarios, as models need to be sequentially trained on diverse domains over time, all the while avoiding catastrophic forgetting. Mitigating representation drift, which refers to the phenomenon of learned representations undergoing changes as the model adapts to new tasks, can help alleviate catastrophic forgetting. In this study, we propose a novel DIL method named DARE, featuring a three-stage training process: Divergence, Adaptation, and REfinement. This process gradually adapts the representations associated with new tasks into the feature space spanned by samples from previous tasks, simultaneously integrating task-specific decision boundaries. Additionally, we introduce a novel strategy for buffer sampling and demonstrate the effectiveness of our proposed method, combined with this sampling strategy, in reducing representation drift within the feature encoder. This contribution effectively alleviates catastrophic forgetting across multiple DIL benchmarks. Furthermore, our approach prevents sudden representation drift at task boundaries, resulting in a well-calibrated DIL model that maintains the performance on previous tasks.

* Accepted at 41st International Conference on Machine Learning (ICML 2024)

Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning

May 22, 2024

Prashant Bhat, Bharath Renjith, Elahe Arani, Bahram Zonooz

Abstract: Continual learning (CL) remains a significant challenge for deep neural networks, as they are prone to forgetting previously acquired knowledge. Several approaches have been proposed in the literature, such as experience rehearsal, regularization, and parameter isolation, to address this problem. Although almost zero forgetting can be achieved in task-incremental learning, class-incremental learning remains highly challenging due to the problem of inter-task class separation. Limited access to previous task data makes it difficult to discriminate between classes of current and previous tasks. To address this issue, we propose 'Attention-Guided Incremental Learning' (AGILE), a novel rehearsal-based CL approach that incorporates compact task attention to effectively reduce interference between tasks. AGILE utilizes lightweight, learnable task projection vectors to transform the latent representations of a shared task attention module toward task distribution. Through extensive empirical evaluation, we show that AGILE significantly improves generalization performance by mitigating task interference and outperforms rehearsal-based approaches in several CL scenarios. Furthermore, AGILE can scale well to a large number of tasks with minimal overhead while remaining well-calibrated with reduced task-recency bias.

* Published at 3rd Conference on Lifelong Learning Agents (CoLLAs 2024)

Beyond Unimodal Learning: The Importance of Integrating Multiple Modalities for Lifelong Learning

May 04, 2024

Fahad Sarfraz, Bahram Zonooz, Elahe Arani

Abstract: While humans excel at continual learning (CL), deep neural networks (DNNs) exhibit catastrophic forgetting. A salient feature of the brain that allows effective CL is that it utilizes multiple modalities for learning and inference, which is underexplored in DNNs. Therefore, we study the role and interactions of multiple modalities in mitigating forgetting and introduce a benchmark for multimodal continual learning. Our findings demonstrate that leveraging multiple views and complementary information from multiple modalities enables the model to learn more accurate and robust representations. This makes the model less vulnerable to modality-specific regularities and considerably mitigates forgetting. Furthermore, we observe that individual modalities exhibit varying degrees of robustness to distribution shift. Finally, we propose a method for integrating and aligning the information from different modalities by utilizing the relational structural similarities between the data points in each modality. Our method sets a strong baseline that enables both single- and multimodal inference. Our study provides a promising case for further exploring the role of multiple modalities in enabling CL and provides a standard benchmark for future research.

* Accepted at 3rd Conference on Lifelong Learning Agents (CoLLAs), 2024

IMEX-Reg: Implicit-Explicit Regularization in the Function Space for Continual Learning

Apr 28, 2024

Prashant Bhat, Bharath Renjith, Elahe Arani, Bahram Zonooz

Abstract: Continual learning (CL) remains one of the long-standing challenges for deep neural networks due to catastrophic forgetting of previously acquired knowledge. Although rehearsal-based approaches have been fairly successful in mitigating catastrophic forgetting, they suffer from overfitting on buffered samples and prior information loss, hindering generalization under low-buffer regimes. Inspired by how humans learn using strong inductive biases, we propose IMEX-Reg to improve the generalization performance of experience rehearsal in CL under low-buffer regimes. Specifically, we employ a two-pronged implicit-explicit regularization approach using contrastive representation learning (CRL) and consistency regularization. To further leverage the global relationship between representations learned using CRL, we propose a regularization strategy to guide the classifier toward the activation correlations in the unit hypersphere of the CRL. Our results show that IMEX-Reg significantly improves generalization performance and outperforms rehearsal-based approaches in several CL scenarios. It is also robust to natural and adversarial corruptions with less task-recency bias. Additionally, we provide theoretical insights to further support our design decisions.

* Published in Transactions on Machine Learning Research

Can We Break Free from Strong Data Augmentations in Self-Supervised Learning?

Apr 15, 2024

Shruthi Gowda, Elahe Arani, Bahram Zonooz

Abstract: Self-supervised learning (SSL) has emerged as a promising solution for addressing the challenge of limited labeled data in deep neural networks (DNNs), offering scalability potential. However, the impact of design dependencies within the SSL framework remains insufficiently investigated. In this study, we comprehensively explore SSL behavior across a spectrum of augmentations, revealing their crucial role in shaping SSL model performance and learning mechanisms. Leveraging these insights, we propose a novel learning approach that integrates prior knowledge, with the aim of curtailing the need for extensive data augmentations and thereby amplifying the efficacy of learned representations. Notably, our findings underscore that SSL models imbued with prior knowledge exhibit reduced texture bias, diminished reliance on shortcuts and augmentations, and improved robustness against both natural and adversarial corruptions. These findings not only illuminate a new direction in SSL research, but also pave the way for enhancing DNN performance while concurrently alleviating the imperative for intensive data augmentation, thereby enhancing scalability and real-world problem-solving capabilities.

The Effectiveness of Random Forgetting for Robust Generalization

Feb 18, 2024

Vijaya Raghavan T Ramkumar, Bahram Zonooz, Elahe Arani

Abstract: Deep neural networks are susceptible to adversarial attacks, which can compromise their performance and accuracy. Adversarial Training (AT) has emerged as a popular approach for protecting neural networks against such attacks. However, a key challenge of AT is robust overfitting, where the network's robust performance on test data deteriorates with further training, thus hindering generalization. Motivated by the concept of active forgetting in the brain, we introduce a novel learning paradigm called "Forget to Mitigate Overfitting (FOMO)". FOMO alternates between the forgetting phase, which randomly forgets a subset of weights and regulates the model's information through weight reinitialization, and the relearning phase, which emphasizes learning generalizable features. Our experiments on benchmark datasets and adversarial attacks show that FOMO alleviates robust overfitting by significantly reducing the gap between the best and last robust test accuracy while improving the state-of-the-art robustness. Furthermore, FOMO provides a better trade-off between standard and robust accuracy, outperforming baseline adversarial methods. Finally, our framework is robust to AutoAttacks and increases generalization in many real-world scenarios.

* Published as a conference paper at ICLR 2024
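
The forgetting phase described in the abstract, randomly forgetting a subset of weights through reinitialization, can be sketched roughly as follows. This is only an illustration under stated assumptions: the flat weight list, the `forget_fraction` parameter, the Gaussian reinitialization with `init_scale`, and the function name are all hypothetical choices, and the relearning phase and adversarial training loop are omitted.

```python
import random


def forget_step(weights, forget_fraction, rng, init_scale=0.1):
    """Reinitialize a random subset of weights (illustrative forgetting step).

    weights:         flat list of current weight values
    forget_fraction: share of weights to reinitialize, between 0 and 1
    rng:             random.Random instance, passed in for reproducibility
    init_scale:      standard deviation of the fresh Gaussian values (assumed)
    Returns the new weight list and the sorted indices that were reset.
    """
    n_forget = int(len(weights) * forget_fraction)
    forgotten = rng.sample(range(len(weights)), n_forget)
    new_weights = list(weights)  # leave the caller's list untouched
    for i in forgotten:
        new_weights[i] = rng.gauss(0.0, init_scale)
    return new_weights, sorted(forgotten)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [1.0] * 10
    new_weights, forgotten = forget_step(weights, forget_fraction=0.3, rng=rng)
    print(forgotten)
    print(new_weights)
```

In a full training loop, a step like this would alternate with ordinary training epochs, so that the reset weights are relearned from data.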

Conserve-Update-Revise to Cure Generalization and Robustness Trade-off in Adversarial Training

Jan 26, 2024

Shruthi Gowda, Bahram Zonooz, Elahe Arani

Abstract: Adversarial training improves the robustness of neural networks against adversarial attacks, albeit at the cost of a trade-off between standard and robust generalization. To unveil the underlying factors driving this phenomenon, we examine the layer-wise learning capabilities of neural networks during the transition from a standard to an adversarial setting. Our empirical findings demonstrate that selectively updating specific layers while preserving others can substantially enhance the network's learning capacity. We therefore propose CURE, a novel training framework that leverages a gradient prominence criterion to perform selective conservation, updating, and revision of weights. Importantly, CURE is designed to be dataset- and architecture-agnostic, ensuring its applicability across various scenarios. It effectively tackles both memorization and overfitting issues, thus enhancing the trade-off between robustness and generalization. Additionally, this training approach aids in mitigating "robust overfitting". Furthermore, our study provides valuable insights into the mechanisms of selective adversarial training and offers a promising avenue for future research.

* Accepted as a conference paper at ICLR 2024
