Abstract: This technical report describes an experiment on the autoregressive pre-training of the Gemma2 2-billion-parameter large language model (LLM) on 10\% of the Lithuanian language component of CulturaX, from the perspective of continual learning. We apply elastic weight consolidation (EWC) to the full set of the model's parameters and evaluate language understanding benchmarks consisting of the ARC, Belebele, GSM8K, HellaSwag, MMLU, TruthfulQA, and Winogrande sets (in both English and Lithuanian versions), as well as perplexity benchmarks. We empirically demonstrate that EWC regularisation not only mitigates catastrophic forgetting effects but is also potentially beneficial for learning the new task with LLMs.
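For context, the standard EWC objective (Kirkpatrick et al., 2017) augments the loss on the new task with a quadratic penalty anchoring parameters that were important for the previous task; in this setting, the anchor parameters $\theta^{*}$ and the diagonal Fisher information $F_i$ would be taken from the original pre-trained model, and $\lambda$ sets the regularisation strength. This is the general formulation, not necessarily the exact variant used in the report:
\[
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\text{new}}(\theta) \;+\; \sum_i \frac{\lambda}{2}\, F_i \left(\theta_i - \theta^{*}_i\right)^2 .
\]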
Abstract: Probabilistic Boolean Networks (PBNs) have been proposed for estimating the behaviour of dynamical systems, as they combine rule-based modelling with uncertainty principles. Inferring PBNs directly from gene data is challenging, however, especially when the data are costly to collect and/or noisy, as in the case of gene expression profile data. In this paper, we present a reproducible method for inferring PBNs directly from real gene expression data measurements taken while the system was at a steady state. The steady-state dynamics of PBNs are of special interest in the analysis of biological machinery. The proposed approach does not rely on reconstructing the state evolution of the network, which is computationally intractable for larger networks. We demonstrate the method on samples of real gene expression profiling data from a well-known study on metastatic melanoma. The pipeline is implemented in Python, and we make it publicly available.
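As a reminder of the model class (the standard PBN definition, not specific to this paper's inference procedure), each node $x_i \in \{0,1\}$ of a PBN updates according to one of several Boolean predictor functions, selected at random with fixed probabilities:
\[
x_i(t+1) \;=\; f_i^{(j)}\bigl(x_1(t), \ldots, x_n(t)\bigr) \quad \text{with probability } c_i^{(j)}, \qquad \sum_j c_i^{(j)} = 1 .
\]
The steady state referred to above is the stationary distribution of the Markov chain that this stochastic update rule induces on the $2^n$ network states.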