AI Pipelines: A Scalable Architecture for Dynamic Data Processing

A scalable, modular AI pipeline architecture that uses the Heartbeat-Update-Synchronize-Ingest-and-Register (HUSIR) protocol for orchestration, fault tolerance, and dynamic job routing. Evaluated over two years in production at Esoft, the architecture replaced a manual pre-processing cycle of up to two weeks with automated, A/B-testable AI workflows and matured from Technology Readiness Level (TRL) 3 to TRL 7.

research
AI pipelines
software architecture
distributed systems
microservices
2025 IEEE 22nd International Conference on Software Architecture Companion (ICSA-C), pp. 85–93.
Authors

Jakob Hviid

Anders Launer Bæk-Petersen

Emil Stubbe Kolvig-Raun

Juan Francisco Marín Vega

Published

April 3, 2025

Publication Esoft (industrial partner)

Abstract

This paper introduces a scalable and dynamic architecture for AI pipelines designed to address challenges in distributed systems and industrial AI applications. The architecture employs a modular, service-oriented structure integrated with the Heartbeat-Update-Synchronize-Ingest-and-Register (HUSIR) protocol to enable efficient task orchestration, fault tolerance, and dynamic routing of jobs.

Automated workflows replace manual data processing, significantly improving scalability, adaptability, and efficiency. Mechanisms for edge-case data collection and non-disruptive real-time testing align AI development with production needs, fostering continuous model improvement.

The proposed system was evaluated over two years in collaboration with Esoft, a global leader in visual content solutions for real estate. The evaluation demonstrates substantial improvements in processing speed, resource utilisation, and production capacity, highlighting the architecture’s effectiveness in scaling automated production systems and handling diverse, dynamic workflows.

Key contributions

  • A scalable architecture enabling horizontal scaling and dynamic workflows for AI services.
  • Automation and adaptability through pipelines for data ingestion and pre-processing, reducing manual intervention.
  • Improved AI model robustness via mechanisms for collecting and processing edge-case data, plus real-time A/B testing without production disruption.
  • Maturation of the architecture from TRL 3 to TRL 7 over a three-year period.
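
To make the orchestration idea more concrete, the sketch below illustrates the general pattern of services registering with an orchestrator, sending heartbeats, and receiving jobs routed by liveness and load. It is not the HUSIR specification from the paper: the class names (`Worker`, `Registry`), the timeout value, the job type `"denoise"`, and the least-loaded routing rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Worker:
    """A hypothetical AI service instance known to the orchestrator."""
    name: str
    job_type: str          # kind of job this worker accepts (assumed label)
    last_heartbeat: float  # timestamp of the most recent heartbeat
    active_jobs: int = 0   # load figure last reported by the worker


class Registry:
    """Toy orchestrator-side registry: register, heartbeat, route."""

    def __init__(self, timeout: float = 10.0) -> None:
        # A worker silent for longer than `timeout` seconds is treated as down.
        self.timeout = timeout
        self.workers: Dict[str, Worker] = {}

    def register(self, name: str, job_type: str, now: float) -> None:
        """Add (or re-add) a worker; registration counts as a heartbeat."""
        self.workers[name] = Worker(name, job_type, now)

    def heartbeat(self, name: str, active_jobs: int, now: float) -> bool:
        """Record a heartbeat; return False if the worker is unknown."""
        worker = self.workers.get(name)
        if worker is None:
            return False  # an unknown worker has to register first
        worker.last_heartbeat = now
        worker.active_jobs = active_jobs
        return True

    def alive(self, now: float) -> List[Worker]:
        """Workers whose last heartbeat is within the timeout window."""
        return [w for w in self.workers.values()
                if now - w.last_heartbeat <= self.timeout]

    def route(self, job_type: str, now: float) -> Optional[str]:
        """Pick the least-loaded live worker for a job type, if any."""
        candidates = [w for w in self.alive(now) if w.job_type == job_type]
        if not candidates:
            return None  # caller may queue the job or retry later
        chosen = min(candidates, key=lambda w: (w.active_jobs, w.name))
        chosen.active_jobs += 1
        return chosen.name


registry = Registry(timeout=10.0)
registry.register("worker-a", "denoise", now=0.0)
registry.register("worker-b", "denoise", now=0.0)
registry.heartbeat("worker-a", active_jobs=2, now=8.0)

# worker-b has been silent since t=0, so at t=12 only worker-a is eligible,
# even though it reports a higher load.
print(registry.route("denoise", now=12.0))  # prints "worker-a"
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic; a real deployment would also need persistence, authentication, and a way to reschedule jobs held by workers that stop responding.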

Why this matters

Esoft’s previous Digital Asset Management system required manual pre-processing taking up to two weeks per iteration. The new pipeline replaced that with automated, real-time-testable AI workflows – a concrete example of treating pipeline orchestration as a platform concern rather than a per-project artefact.

Citation

Hviid, J., Bæk-Petersen, A. L., Kolvig-Raun, E. S., & Marín Vega, J. F. (2025). AI Pipelines: A Scalable Architecture for Dynamic Data Processing. In 2025 IEEE 22nd International Conference on Software Architecture Companion (ICSA-C), pp. 85–93. IEEE. https://doi.org/10.1109/ICSA-C65153.2025.00020