Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiao Yang

R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization

May 21, 2025

Yuante Li, Xu Yang, Xiao Yang, Minrui Xu, Xisen Wang, Weiqing Liu, Jiang Bian

Figure 1 for R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization

Figure 2 for R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization

Figure 3 for R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization

Figure 4 for R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization

Abstract:Financial markets pose fundamental challenges for asset return prediction due to their high dimensionality, non-stationarity, and persistent volatility. Despite advances in large language models and multi-agent systems, current quantitative research pipelines suffer from limited automation, weak interpretability, and fragmented coordination across key components such as factor mining and model innovation. In this paper, we propose R&D-Agent for Quantitative Finance, in short RD-Agent(Q), the first data-centric multi-agent framework designed to automate the full-stack research and development of quantitative strategies via coordinated factor-model co-optimization. RD-Agent(Q) decomposes the quant process into two iterative stages: a Research stage that dynamically sets goal-aligned prompts, formulates hypotheses based on domain priors, and maps them to concrete tasks, and a Development stage that employs a code-generation agent, Co-STEER, to implement task-specific code, which is then executed in real-market backtests. The two stages are connected through a feedback stage that thoroughly evaluates experimental outcomes and informs subsequent iterations, with a multi-armed bandit scheduler for adaptive direction selection. Empirically, RD-Agent(Q) achieves up to 2X higher annualized returns than classical factor libraries using 70% fewer factors, and outperforms state-of-the-art deep time-series models on real markets. Its joint factor-model optimization delivers a strong balance between predictive accuracy and strategy robustness. Our code is available at: https://github.com/microsoft/RD-Agent.

Via

Access Paper or Ask Questions

R&D-Agent: Automating Data-Driven AI Solution Building Through LLM-Powered Automated Research, Development, and Evolution

May 20, 2025

Xu Yang, Xiao Yang, Shikai Fang, Bowen Xian, Yuante Li, Jian Wang, Minrui Xu, Haoran Pan, Xinpeng Hong, Weiqing Liu(+3 more)

Abstract:Recent advances in AI and ML have transformed data science, yet increasing complexity and expertise requirements continue to hinder progress. While crowdsourcing platforms alleviate some challenges, high-level data science tasks remain labor-intensive and iterative. To overcome these limitations, we introduce R&D-Agent, a dual-agent framework for iterative exploration. The Researcher agent uses performance feedback to generate ideas, while the Developer agent refines code based on error feedback. By enabling multiple parallel exploration traces that merge and enhance one another, R&D-Agent narrows the gap between automated solutions and expert-level performance. Evaluated on MLE-Bench, R&D-Agent emerges as the top-performing machine learning engineering agent, demonstrating its potential to accelerate innovation and improve precision across diverse data science applications. We have open-sourced R&D-Agent on GitHub: https://github.com/microsoft/RD-Agent.

* 7 pages, 1 figure, 1 table

Via

Access Paper or Ask Questions

How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM

Apr 08, 2025

Jirong Zha, Yuxuan Fan, Xiao Yang, Chen Gao, Xinlei Chen

Abstract:3D spatial understanding is essential in real-world applications such as robotics, autonomous vehicles, virtual reality, and medical imaging. Recently, Large Language Models (LLMs), having demonstrated remarkable success across various domains, have been leveraged to enhance 3D understanding tasks, showing potential to surpass traditional computer vision methods. In this survey, we present a comprehensive review of methods integrating LLMs with 3D spatial understanding. We propose a taxonomy that categorizes existing methods into three branches: image-based methods deriving 3D understanding from 2D visual data, point cloud-based methods working directly with 3D representations, and hybrid modality-based methods combining multiple data streams. We systematically review representative methods along these categories, covering data representations, architectural modifications, and training strategies that bridge textual and 3D modalities. Finally, we discuss current limitations, including dataset scarcity and computational challenges, while highlighting promising research directions in spatial perception, multi-modal fusion, and real-world applications.

* 9 pages, 5 figures

Via

Access Paper or Ask Questions

Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm

Mar 06, 2025

Yuqing Wang, Xiao Yang

Figure 1 for Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm

Figure 2 for Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm

Figure 3 for Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm

Figure 4 for Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm

Abstract:With the rapid expansion of cloud computing infrastructure, energy consumption has become a critical challenge, driving the need for accurate and efficient prediction models. This study proposes a novel Vector Weighted Average Kernel Extreme Learning Machine (VWAA-KELM) model to enhance energy consumption prediction in cloud computing environments. By integrating a vector weighted average algorithm (VWAA) with kernel extreme learning machine (KELM), the proposed model dynamically adjusts feature weights and optimizes kernel functions, significantly improving prediction accuracy and generalization. Experimental results demonstrate the superior performance of VWAA-KELM: 94.7% of test set prediction errors fall within [0, 50] units, with only three cases exceeding 100 units, indicating strong stability. The model achieves a coefficient of determination (R2) of 0.987 in the training set (RMSE = 28.108, RPD = 8.872) and maintains excellent generalization with R2 = 0.973 in the test set (RMSE = 43.227, RPD = 6.202). Visual analysis confirms that predicted values closely align with actual energy consumption trends, avoiding overfitting while capturing nonlinear dependencies. A key innovation of this study is the introduction of adaptive feature weighting, allowing the model to dynamically assign importance to different input parameters, thereby enhancing high-dimensional data processing. This advancement provides a scalable and efficient approach for optimizing cloud data center energy consumption. Beyond cloud computing, the proposed hybrid framework has broader applications in Internet of Things (IoT) and edge computing, supporting real-time energy management and intelligent resource allocation.

Via

Access Paper or Ask Questions

Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning

Feb 26, 2025

Yuqing Wang, Xiao Yang

Figure 1 for Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning

Figure 2 for Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning

Figure 3 for Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning

Figure 4 for Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning

Abstract:This study addresses the challenge of resource scheduling optimization in edge-cloud collaborative computing using deep reinforcement learning (DRL). The proposed DRL-based approach improves task processing efficiency, reduces overall processing time, enhances resource utilization, and effectively controls task migrations. Experimental results demonstrate the superiority of DRL over traditional scheduling algorithms, particularly in managing complex task allocation, dynamic workloads, and multiple resource constraints. Despite its advantages, further improvements are needed to enhance learning efficiency, reduce training time, and address convergence issues. Future research should focus on increasing the algorithm's fault tolerance to handle more complex and uncertain scheduling scenarios, thereby advancing the intelligence and efficiency of edge-cloud computing systems.

Via

Access Paper or Ask Questions

Research on Enhancing Cloud Computing Network Security using Artificial Intelligence Algorithms

Feb 25, 2025

Yuqing Wang, Xiao Yang

Figure 1 for Research on Enhancing Cloud Computing Network Security using Artificial Intelligence Algorithms

Figure 2 for Research on Enhancing Cloud Computing Network Security using Artificial Intelligence Algorithms

Figure 3 for Research on Enhancing Cloud Computing Network Security using Artificial Intelligence Algorithms

Figure 4 for Research on Enhancing Cloud Computing Network Security using Artificial Intelligence Algorithms

Abstract:Cloud computing environments are increasingly vulnerable to security threats such as distributed denial-of-service (DDoS) attacks and SQL injection. Traditional security mechanisms, based on rule matching and feature recognition, struggle to adapt to evolving attack strategies. This paper proposes an adaptive security protection framework leveraging deep learning to construct a multi-layered defense architecture. The proposed system is evaluated in a real-world business environment, achieving a detection accuracy of 97.3%, an average response time of 18 ms, and an availability rate of 99.999%. Experimental results demonstrate that the proposed method significantly enhances detection accuracy, response efficiency, and resource utilization, offering a novel and effective approach to cloud computing security.

Via

Access Paper or Ask Questions

Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM

Feb 25, 2025

Yuqing Wang, Xiao Yang

Figure 1 for Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM

Figure 2 for Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM

Figure 3 for Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM

Figure 4 for Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM

Abstract:Traditional security protection methods struggle to address sophisticated attack vectors in large-scale distributed systems, particularly when balancing detection accuracy with data privacy concerns. This paper presents a novel distributed security threat detection system that integrates federated learning with multimodal large language models (LLMs). Our system leverages federated learning to ensure data privacy while employing multimodal LLMs to process heterogeneous data sources including network traffic, system logs, images, and sensor data. Experimental evaluation on a 10TB distributed dataset demonstrates that our approach achieves 96.4% detection accuracy, outperforming traditional baseline models by 4.1 percentage points. The system reduces both false positive and false negative rates by 1.8 and 2.4 percentage points respectively. Performance analysis shows that our system maintains efficient processing capabilities in distributed environments, requiring 180 seconds for model training and 3.8 seconds for threat detection across the distributed network. These results demonstrate significant improvements in detection accuracy and computational efficiency while preserving data privacy, suggesting strong potential for real-world deployment in large-scale security systems.

Via

Access Paper or Ask Questions

Machine Learning-Based Cloud Computing Compliance Process Automation

Feb 22, 2025

Yuqing Wang, Xiao Yang

Figure 1 for Machine Learning-Based Cloud Computing Compliance Process Automation

Figure 2 for Machine Learning-Based Cloud Computing Compliance Process Automation

Figure 3 for Machine Learning-Based Cloud Computing Compliance Process Automation

Figure 4 for Machine Learning-Based Cloud Computing Compliance Process Automation

Abstract:Cloud computing adoption across industries has revolutionized enterprise operations while introducing significant challenges in compliance management. Organizations must continuously meet evolving regulatory requirements such as GDPR and ISO 27001, yet traditional manual review processes have become increasingly inadequate for modern business scales. This paper presents a novel machine learning-based framework for automating cloud computing compliance processes, addressing critical challenges including resource-intensive manual reviews, extended compliance cycles, and delayed risk identification. Our proposed framework integrates multiple machine learning technologies, including BERT-based document processing (94.5% accuracy), One-Class SVM for anomaly detection (88.7% accuracy), and an improved CNN-LSTM architecture for sequential compliance data analysis (90.2% accuracy). Implementation results demonstrate significant improvements: reducing compliance process duration from 7 days to 1.5 days, improving accuracy from 78% to 93%, and decreasing manual effort by 73.3%. A real-world deployment at a major securities firm validated these results, processing 800,000 daily transactions with 94.2% accuracy in risk identification.

Via

Access Paper or Ask Questions

Flat U-Net: An Efficient Ultralightweight Model for Solar Filament Segmentation in Full-disk H$α$ Images

Feb 11, 2025

GaoFei Zhu, GangHua Lin, Xiao Yang, Cheng Zeng

Abstract:Solar filaments are one of the most prominent features observed on the Sun, and their evolutions are closely related to various solar activities, such as flares and coronal mass ejections. Real-time automated identification of solar filaments is the most effective approach to managing large volumes of data. Existing models of filament identification are characterized by large parameter sizes and high computational costs, which limit their future applications in highly integrated and intelligent ground-based and space-borne observation devices. Consequently, the design of more lightweight models will facilitate the advancement of intelligent observation equipment. In this study, we introduce Flat U-Net, a novel and highly efficient ultralightweight model that incorporates simplified channel attention (SCA) and channel self-attention (CSA) convolutional blocks for the segmentation of solar filaments in full-disk H$\alpha$ images. Feature information from each network layer is fully extracted to reconstruct interchannel feature representations. Each block effectively optimizes the channel features from the previous layer, significantly reducing parameters. The network architecture presents an elegant flattening, improving its efficiency, and simplifying the overall design. Experimental validation demonstrates that a model composed of pure SCAs achieves a precision of approximately 0.93, with dice similarity coefficient (DSC) and recall rates of 0.76 and 0.64, respectively, significantly outperforming the classical U-Net. Introducing a certain number of CSA blocks improves the DSC and recall rates to 0.82 and 0.74, respectively, which demonstrates a pronounced advantage, particularly concerning model weight size and detection effectiveness. The data set, models, and code are available as open-source resources.

* 15 pages, 5 figures, 3 tables, accepted for publication in ApJ

Via

Access Paper or Ask Questions

Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails

Feb 09, 2025

Yijun Yang, Lichao Wang, Xiao Yang, Lanqing Hong, Jun Zhu

Figure 1 for Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails

Figure 2 for Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails

Figure 3 for Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails

Figure 4 for Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails

Abstract:Vision Large Language Models (VLLMs) integrate visual data processing, expanding their real-world applications, but also increasing the risk of generating unsafe responses. In response, leading companies have implemented Multi-Layered safety defenses, including alignment training, safety system prompts, and content moderation. However, their effectiveness against sophisticated adversarial attacks remains largely unexplored. In this paper, we propose MultiFaceted Attack, a novel attack framework designed to systematically bypass Multi-Layered Defenses in VLLMs. It comprises three complementary attack facets: Visual Attack that exploits the multimodal nature of VLLMs to inject toxic system prompts through images; Alignment Breaking Attack that manipulates the model's alignment mechanism to prioritize the generation of contrasting responses; and Adversarial Signature that deceives content moderators by strategically placing misleading information at the end of the response. Extensive evaluations on eight commercial VLLMs in a black-box setting demonstrate that MultiFaceted Attack achieves a 61.56% attack success rate, surpassing state-of-the-art methods by at least 42.18%.

Via

Access Paper or Ask Questions