Bingcheng Mao

ArchAgent: Scalable Legacy Software Architecture Recovery with LLMs

Jan 19, 2026

Rusheng Pan, Bingcheng Mao, Tianyi Ma, Zhenhua Ling

Abstract: Recovering accurate architecture from large-scale legacy software is hindered by architectural drift, missing relations, and the limited context of Large Language Models (LLMs). We present ArchAgent, a scalable agent-based framework that combines static analysis, adaptive code segmentation, and LLM-powered synthesis to reconstruct multiview, business-aligned architectures from cross-repository codebases. ArchAgent introduces scalable diagram generation with contextual pruning and integrates cross-repository data to identify business-critical modules. Evaluations of typical large-scale GitHub projects show significant improvements over existing benchmarks. An ablation study confirms that dependency context improves the accuracy of generated architectures of production-level repositories, and a real-world case study demonstrates effective recovery of critical business logic from legacy projects. The dataset is available at https://github.com/panrusheng/arch-eval-benchmark.

* to be published in ICASSP 2026

Integrating Stock Features and Global Information via Large Language Models for Enhanced Stock Return Prediction

Oct 09, 2023

Yujie Ding, Shuai Jia, Tianyi Ma, Bingcheng Mao, Xiuze Zhou, Liuliu Li, Dongming Han

Abstract: The remarkable achievements and rapid advancements of Large Language Models (LLMs) such as ChatGPT and GPT-4 have showcased their immense potential in quantitative investment. Traders can effectively leverage these LLMs to analyze financial news and predict stock returns accurately. However, integrating LLMs into existing quantitative models presents two primary challenges: the insufficient utilization of semantic information embedded within LLMs and the difficulties in aligning the latent information within LLMs with pre-existing quantitative stock features. We propose a novel framework consisting of two components to surmount these challenges. The first component, the Local-Global (LG) model, introduces three distinct strategies for modeling global information. These approaches are based, respectively, on stock features, the capabilities of LLMs, and a hybrid method combining the two paradigms. The second component, Self-Correlated Reinforcement Learning (SCRL), focuses on aligning the embeddings of financial news generated by LLMs with stock features within the same semantic space. By implementing our framework, we have demonstrated superior performance in Rank Information Coefficient and returns, particularly compared to models relying only on stock features in the China A-share market.

* 8 pages, International Joint Conferences on Artificial Intelligence, 2023
