Abstract: Large Language Models (LLMs) achieve strong performance across a growing range of domains, yet their scale poses deployment challenges in applications where latency and cost constraints are critical. This paper derives empirical scaling laws for domain-specific LLM compression, quantifying how in-domain and general-knowledge performance scale with dataset size, compression ratio, supervision format, and iterative pruning schedule. Using quantitative finance as our application domain, we compare logit-based and LoRA-based distillation under iterative structural pruning, and we introduce a blended chain-of-thought supervision loss that stabilizes KL-divergence distillation over reasoning traces. In-domain task quality degrades predictably under compression, whereas general-knowledge benchmarks collapse much earlier; supervision format is the key driver of this tradeoff, with chain-of-thought supervision actively recovering general knowledge that pruning erases. We release the headline dataset FinHeadlineMix, scaling law results, and practical recommendations to provide a reusable framework for domain-specific compression decisions.
Abstract: Multiresolution analysis has applications across many disciplines in the study of complex systems and their dynamics. Financial markets are among the most complex entities in our environment, yet mainstream quantitative models operate at a predetermined scale, rely on linear correlation measures, and struggle to recognize non-linear or causal structures. In this paper, we combine neural networks, which are known to capture non-linear associations, with a multiscale decomposition to facilitate a better understanding of financial market data substructures. Quantization keeps our decompositions calibrated to the market at every scale. We illustrate our approach in the context of seven use cases.