Marcin Osial

Efficient Multi-Source Knowledge Transfer by Model Merging

Aug 26, 2025

Marcin Osial, Bartosz Wójcik, Bartosz Zieliński, Sebastian Cygert

Abstract: While transfer learning is an advantageous strategy, it overlooks the opportunity to leverage knowledge from numerous available models online. Addressing this multi-source transfer learning problem is a promising path to boost adaptability and cut re-training costs. However, existing approaches are inherently coarse-grained, lacking the necessary precision for granular knowledge extraction and the aggregation efficiency required to fuse knowledge from either a large number of source models or those with high parameter counts. We address these limitations by leveraging Singular Value Decomposition (SVD) to first decompose each source model into its elementary, rank-one components. A subsequent aggregation stage then selects only the most salient components from all sources, thereby overcoming the previous efficiency and precision limitations. To best preserve and leverage the synthesized knowledge base, our method adapts to the target task by fine-tuning only the principal singular values of the merged matrix. In essence, this process only recalibrates the importance of top SVD components. The proposed framework allows for efficient transfer learning, is robust to perturbations both at the input level and in the parameter space (e.g., noisy or pruned sources), and scales well computationally.
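
The decompose-then-select idea in this abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function names `merge_top_components` and `reconstruct` are hypothetical, and ranking components by singular-value magnitude is an assumed stand-in for whatever saliency criterion the paper actually uses.

```python
import numpy as np

def merge_top_components(source_weights, k):
    """Pool rank-one SVD components from several same-shaped weight
    matrices and keep the k with the largest singular values.
    Assumption: saliency == singular-value magnitude."""
    us, ss, vts = [], [], []
    for w in source_weights:
        # Thin SVD: w == (u * s) @ vt, one rank-one term per singular value.
        u, s, vt = np.linalg.svd(w, full_matrices=False)
        us.append(u)
        ss.append(s)
        vts.append(vt)
    U = np.concatenate(us, axis=1)    # left vectors of all sources, column-wise
    S = np.concatenate(ss)            # all singular values
    Vt = np.concatenate(vts, axis=0)  # right vectors of all sources, row-wise
    top = np.argsort(S)[::-1][:k]     # indices of the k largest singular values
    return U[:, top], S[top], Vt[top, :]

def reconstruct(U, s, Vt):
    """Rebuild a matrix as the sum of the selected rank-one components."""
    return (U * s) @ Vt

rng = np.random.default_rng(0)
sources = [rng.normal(size=(6, 4)) for _ in range(3)]
U, s, Vt = merge_top_components(sources, k=4)
merged = reconstruct(U, s, Vt)  # shape (6, 4)
```

In this sketch, adapting to a target task would mean treating `s` as the only trainable vector while `U` and `Vt` stay frozen, which mirrors the abstract's description of recalibrating the importance of the top components. Note that vectors pooled from different sources are generally not mutually orthogonal, so `U`, `s`, `Vt` do not form a true SVD of `merged`.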

Parameter-Efficient Interventions for Enhanced Model Merging

Dec 22, 2024

Marcin Osial, Daniel Marczak, Bartosz Zieliński

Abstract: Model merging combines knowledge from task-specific models into a unified multi-task model to avoid joint training on all task data. However, current methods face challenges due to representation bias, which can interfere with task performance. As a remedy, we propose IntervMerge, a novel approach to multi-task model merging that effectively mitigates representation bias across the model using task-specific interventions. To further enhance its efficiency, we introduce mini-interventions, which modify only part of the representation, thereby reducing the additional parameters without compromising performance. Experimental results demonstrate that IntervMerge consistently outperforms the state-of-the-art approaches while using fewer parameters.

* 10 pages, 6 figures, SIAM International Conference on Data Mining (SDM) 2025

A deep cut into Split Federated Self-supervised Learning

Jun 12, 2024

Marcin Przewięźlikowski, Marcin Osial, Bartosz Zieliński, Marek Śmieja

Abstract: Collaborative self-supervised learning has recently become feasible in highly distributed environments by dividing the network layers between client devices and a central server. However, state-of-the-art methods, such as MocoSFL, are optimized for network division at the initial layers, which decreases the protection of the client data and increases communication overhead. In this paper, we demonstrate that splitting depth is crucial for maintaining privacy and communication efficiency in distributed training. We also show that MocoSFL suffers from catastrophic quality deterioration at the minimal communication overhead. As a remedy, we introduce Momentum-Aligned contrastive Split Federated Learning (MonAcoSFL), which aligns online and momentum client models during training. Consequently, we achieve state-of-the-art accuracy while significantly reducing the communication overhead, making MonAcoSFL more practical in real-world scenarios.

* Accepted to European Conference on Machine Learning (ECML) 2024
