Renato Cordeiro Ferreira

University of São Paulo, Jheronimus Academy of Data Science, Technical University of Eindhoven, Tilburg University

A Tale of Two Systems: Characterizing Architectural Complexity on Machine Learning-Enabled Systems

Jun 12, 2025

Renato Cordeiro Ferreira

Abstract: How can the complexity of ML-enabled systems be managed effectively? The goal of this research is to investigate how complexity affects ML-Enabled Systems (MLES). To address this question, this research aims to introduce a metrics-based architectural model to characterize the complexity of MLES. The goal is to support architectural decisions, providing a guideline for the inception and growth of these systems. This paper brings, side-by-side, the architecture representation of two systems that can be used as case studies for creating the metrics-based architectural model: the SPIRA and the Ocean Guard MLES.

* 8 pages, 3 figures (3 diagrams), submitted to ECSA 2025. arXiv admin note: substantial text overlap with arXiv:2506.08153

A Metrics-Oriented Architectural Model to Characterize Complexity on Machine Learning-Enabled Systems

Jun 09, 2025

Renato Cordeiro Ferreira

Abstract: How can the complexity of ML-enabled systems be managed effectively? The goal of this research is to investigate how complexity affects ML-Enabled Systems (MLES). To address this question, this research aims to introduce a metrics-based architectural model to characterize the complexity of MLES. The goal is to support architectural decisions, providing a guideline for the inception and growth of these systems. This paper showcases the first step for creating the metrics-based architectural model: an extension of a reference architecture that can describe MLES to collect their metrics.

* 4 pages, 3 figures (2 diagrams, 1 table), to be published in CAIN 2025

Is Your Training Pipeline Production-Ready? A Case Study in the Healthcare Domain

Jun 07, 2025

Daniel Lawand, Lucas Quaresma, Roberto Bolgheroni, Alfredo Goldman, Renato Cordeiro Ferreira

Abstract: Deploying a Machine Learning (ML) training pipeline into production requires robust software engineering practices. This differs significantly from experimental workflows. This experience report investigates this challenge in SPIRA, a project whose goal is to create an ML-Enabled System (MLES) to pre-diagnose respiratory insufficiency via speech analysis. The first version of SPIRA's training pipeline lacked critical software quality attributes. This paper presents an overview of the MLES, then compares three versions of the architecture of the Continuous Training subsystem, which evolved from a Big Ball of Mud, to a Modular Monolith, towards Microservices, adopting different design principles and patterns to enhance its maintainability, robustness, and extensibility. In this way, the paper seeks to offer insights for both ML Engineers tasked with productionizing ML training pipelines and Data Scientists seeking to adopt MLOps practices.

* 9 pages, 3 figures (2 diagrams, 1 code listing), submitted to the workshop SADIS 2025

MLOps with Microservices: A Case Study on the Maritime Domain

Jun 06, 2025

Renato Cordeiro Ferreira, Rowanne Trapmann, Willem-Jan van den Heuvel

Abstract: This case study describes challenges and lessons learned in building Ocean Guard: a Machine Learning-Enabled System (MLES) for anomaly detection in the maritime domain. First, the paper presents the system's specification and architecture. Ocean Guard was designed with a microservices architecture to enable multiple teams to work on the project in parallel. Then, the paper discusses how the developers adapted contract-based design to MLOps to achieve that goal. As an MLES, Ocean Guard employs code, model, and data contracts to establish guidelines between its services. This case study hopes to inspire software engineers, machine learning engineers, and data scientists to leverage similar approaches for their systems.

* 13 pages, 3 figures, to be published in SummerSOC 2025

Leveraging XP and CRISP-DM for Agile Data Science Projects

May 27, 2025

Andre Massahiro Shimaoka, Renato Cordeiro Ferreira, Alfredo Goldman

Abstract: This study explores the integration of eXtreme Programming (XP) and the Cross-Industry Standard Process for Data Mining (CRISP-DM) in agile Data Science projects. We conducted a case study at the e-commerce company Elo7 to answer the research question: How can the agility of the XP method be integrated with CRISP-DM in Data Science projects? Data was collected through interviews and questionnaires with a Data Science team consisting of data scientists, ML engineers, and data product managers. The results show that 86% of the team frequently or always applies CRISP-DM, while 71% adopt XP practices in their projects. Furthermore, the study demonstrates that it is possible to combine CRISP-DM with XP in Data Science projects, providing a structured and collaborative approach. Finally, the study generated improvement recommendations for the company.
