Mary Isabelle Wisell

GRASP: Gradient-Aligned Sequential Parameter Transfer for Memory-Efficient Multi-Source Learning

Jun 12, 2026

Mary Isabelle Wisell, Nicholas Jacobs, Aayush Manandhar, Salimeh Yasaei Sekeh

Abstract: Multi-source transfer learning faces a fundamental scalability bottleneck: existing approaches either load all K source models into memory simultaneously during parameter fusion, which takes O(K) memory, or deploy all models at inference time, which makes production deployment infeasible. We propose GRASP (Gradient-Aligned Sequential Parameter Transfer), which achieves superior knowledge integration while maintaining O(1) memory consumption through three key innovations: (1) sequential processing that merges one source at a time into an evolving target model, (2) parameter-wise gradient alignment that transfers only those parameters whose optimization directions align with the target domain, avoiding negative transfer, and (3) iterative fine-tuning that adapts transferred knowledge before integrating the next source. Extensive experiments across three continual learning benchmarks (Yearbook, CLEAR-10, CLEAR-100) spanning 10- to 108-year temporal distribution shifts and four architectures (1.3M to 25.6M parameters) demonstrate that GRASP achieves 93.5% mean accuracy over all datasets and architectures, compared with 71.7% for the ensemble method, while requiring only constant memory versus K models for standard multi-source fusion. Critically, GRASP's sequential processing does not retain previously merged models and scales to arbitrarily many sources without memory growth, making it uniquely suitable for resource-constrained deployment and continually evolving source domains.
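
The three steps listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear model, the mean-squared-error loss, and the alignment rule (copy a source parameter only when moving toward it is a descent direction for the target loss) are assumptions chosen for illustration, and all function names are hypothetical.

```python
import numpy as np

def mse_grad(w, X, y):
    """Gradient of mean squared error for a linear model y ~ X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def merge_aligned(target_w, source_w, grad):
    """Assumed alignment rule: copy a source parameter only where the step
    from target to source points along the negative target gradient."""
    direction = source_w - target_w
    aligned = (-grad * direction) > 0
    return np.where(aligned, source_w, target_w), aligned

def fine_tune(w, X, y, lr=0.05, steps=50):
    """Plain gradient descent on the target data."""
    for _ in range(steps):
        w = w - lr * mse_grad(w, X, y)
    return w

def sequential_transfer(target_w, source_loaders, X, y):
    """Merge sources one at a time. `source_loaders` is an iterable of
    zero-argument callables (hypothetical), so only one source's weights
    are held in memory at any point."""
    for load in source_loaders:
        source_w = load()                       # load a single source
        grad = mse_grad(target_w, X, y)         # target-domain gradient
        target_w, _ = merge_aligned(target_w, source_w, grad)
        target_w = fine_tune(target_w, X, y)    # adapt before the next source
        del source_w                            # release the source weights
    return target_w
```

Passing loaders rather than loaded weights is what keeps the peak memory independent of the number of sources in this sketch; the target model is the only state carried between iterations.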

Ghost-Connect Net: A Generalization-Enhanced Guidance For Sparse Deep Networks Under Distribution Shifts

Nov 14, 2024

Mary Isabelle Wisell, Salimeh Yasaei Sekeh

Abstract: Sparse deep neural networks (DNNs) excel in real-world applications like robotics and computer vision by reducing the computational demands that hinder usability. However, recent studies that aim to boost DNN efficiency by trimming redundant neurons or filters based on task relevance neglect the adaptability of the resulting networks to distribution shifts. We aim to enhance these existing techniques by introducing a companion network, Ghost Connect-Net (GC-Net), which monitors the connections in the original network and offers an advantage in generalizing across distributions. GC-Net's weights represent connectivity measurements between consecutive layers of the original network. After pruning GC-Net, the pruned locations are mapped back to the original network as pruned connections, allowing for the combination of magnitude and connectivity-based pruning methods. Experimental results on common DNN benchmarks, such as CIFAR-10, Fashion MNIST, and Tiny ImageNet, show promising results for hybridizing the method: using GC-Net guidance for later layers of a network and direct pruning on earlier layers. We provide theoretical foundations for GC-Net's approach to improving generalization under distribution shifts.
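
The mapping from a pruned companion network back to the original one can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's method: the connectivity scores are taken as given arrays with the same shape as each weight matrix (the abstract does not say how they are computed), and `magnitude_mask`, `hybrid_prune`, and `first_guided_layer` are hypothetical names.

```python
import numpy as np

def magnitude_mask(values, sparsity):
    """Boolean mask that drops the smallest-|value| entries.
    `sparsity` is the fraction to prune; ties at the threshold are pruned too."""
    k = int(round(sparsity * values.size))
    if k == 0:
        return np.ones(values.shape, dtype=bool)
    threshold = np.sort(np.abs(values), axis=None)[k - 1]
    return np.abs(values) > threshold

def hybrid_prune(weights, connectivity, sparsity, first_guided_layer):
    """Earlier layers: direct magnitude pruning on their own weights.
    Layers from `first_guided_layer` on: prune the companion scores and
    map the pruned locations back onto the original weights."""
    pruned = []
    for i, (W, C) in enumerate(zip(weights, connectivity)):
        scores = C if i >= first_guided_layer else W
        pruned.append(W * magnitude_mask(scores, sparsity))
    return pruned
```

With `first_guided_layer=0` every layer follows the companion scores, and with a value equal to the number of layers the sketch reduces to ordinary magnitude pruning.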

* 21 pages, 4 figures, 3 subfigures, 42 tables
