Abstract:Domain-adaptive pretraining (DAPT) offers a practical path to specializing large language models for high-value domains without full retraining. We conduct an early-stage scaling-law analysis of continued pretraining on U.S. SEC filings, training 1B and 3B-parameter Llama-3.2 models on a 400M-token financial corpus with validation checkpoints at 50M, 100M, 200M, and 400M tokens. Results show consistent improvements in SEC-domain validation loss for both models, with the largest gains occurring within the first 200M tokens and diminishing returns thereafter. Power-law fits reveal shallow exponents, indicating that financial language is highly regular and efficiently learnable under continued pretraining. General-domain validation loss remains effectively unchanged across all token budgets, suggesting minimal drift and no signs of catastrophic forgetting. A data-efficiency frontier further shows that both models move toward improved specialization with negligible mixed-domain degradation. Together, these findings provide early empirical guidance for scaling financial foundation models, suggesting that meaningful domain adaptation can be achieved with comparatively modest token budgets and that larger model scales (7B-70B) remain tractable under projected data requirements.




Abstract:Given recent breakthroughs in Generative Artificial Intelligence (GAI) and Large Language Models (LLMs), more and more professional services are being augmented through Artificial Intelligence (AI), which once seemed impossible to automate. This paper presents a modular model for leveraging GAI in developing strategic plans for large scale government organizations and evaluates leading machine learning techniques in their application towards one of the identified modules. Specifically, the performance of BERTopic and Non-negative Matrix Factorization (NMF) are evaluated in their ability to use topic modeling to generate themes representative of Vision Elements within a strategic plan. To accomplish this, BERTopic and NMF models are trained using a large volume of reports from the Government Accountability Office (GAO). The generated topics from each model are then scored for similarity against the Vision Elements of a published strategic plan and the results are compared. Our results show that these techniques are capable of generating themes similar to 100% of the elements being evaluated against. Further, we conclude that BERTopic performs best in this application with more than half of its correlated topics achieving a "medium" or "strong" correlation. A capability of GAI-enabled strategic plan development impacts a multi-billion dollar industry and assists the federal government in overcoming regulatory requirements which are crucial to the public good. Further work will focus on the operationalization of the concept proven in this study as well as viability of the remaining modules in the proposed model for GAI-generated strategic plans.