
Stifel’s approach to scalable Data Pipeline Orchestration in a Data Mesh


This is a guest post by Hossein Johari, Lead and Senior Architect at Stifel Financial Corp, and Srinivas Kandi and Ahmad Rawashdeh, Senior Architects at Stifel, in partnership with AWS.

Stifel Financial Corp, a diversified financial services holding company, is expanding a data landscape that requires an orchestration solution capable of managing increasingly complex data pipeline operations across multiple business domains. Traditional time-based scheduling systems fall short in addressing the dynamic interdependencies between data products, which calls for event-driven orchestration. Key challenges include coordinating cross-domain dependencies, maintaining data consistency across business units, meeting stringent SLAs, and scaling effectively as data volumes grow. Without a flexible orchestration solution, these issues can lead to delayed business operations and insights, increased operational overhead, and heightened compliance risks due to manual interventions and rigid scheduling mechanisms that cannot adapt to evolving business needs.

In this post, we walk through how Stifel Financial Corp, in collaboration with AWS ProServe, addressed these challenges by building a modular, event-driven orchestration solution using AWS native services that enables precise triggering of data pipelines based on dependency satisfaction, supporting near real-time responsiveness and cross-domain coordination.

Data platform orchestration

Stifel and AWS technology teams identified several key requirements that would guide their solution architecture and overcome the challenges, listed above, of traditional data pipeline orchestration.

Coordinated pipeline execution across multiple data domains based on events

  • The orchestration solution must support triggering data pipelines across multiple business domains based on events such as data product publication or completion of upstream jobs.

Intelligent dependency management

  • The solution should intelligently manage pipeline dependencies across domains and accounts.
  • It must ensure that downstream pipelines wait for all required upstream data products, regardless of which team or AWS account owns them.
  • Dependency logic should be dynamic and adaptable to changes in data availability.

Business-aligned configuration

  • A no-code architecture should allow business users and data owners to define pipeline dependencies and triggers using metadata.
  • All changes to dependency configurations should be version-controlled, traceable, and auditable.

Scalable and flexible architecture

  • The orchestration solution should support hundreds of pipelines across multiple domains without performance degradation.
  • It should be easy to onboard new domains, define new dependencies, and integrate with existing data mesh components.

Visibility and monitoring

  • Business users and data owners should have access to dashboards showing pipeline status, including success, failure, and progress.
  • Alerts and notifications should be sent when issues occur, with clear diagnostics to support quick resolution.

Example Scenario

The following illustrates a cross-domain data dependency scenario, where data products in domains D1 and D2 rely on the timely refresh of data products from other domains, each operating on distinct schedules. Upon completion, these upstream data products emit refresh events that automatically trigger the execution of a dependent downstream pipeline.

  • Dataset DS1 for Domain D1 depends on RD1 and RD2 from the raw data domain, which are refreshed at different times T1 and T2.
  • Dataset DS2 for Domain D1 depends on RD3 from the raw data domain, which is refreshed at time T3.
  • Dataset DS3 for Domain D1 depends on the data refresh of datasets DS1 and DS2 from Domain D1.
  • Dataset DS4 for Domain D1 depends on dataset DS3 from Domain D1 and dataset DS1 from Domain D2, which is refreshed at time T4.

Solution Overview

The orchestration solution comprises two main components.

1. Cross-account event sharing

The following diagram illustrates the architecture for distributing data refresh events across domains within the orchestration solution using Amazon EventBridge. Data producers emit refresh events to a centralized event bus upon completing their updates. These events are then propagated to all subscribing domains. Each domain evaluates incoming events against its pipeline dependency configurations, enabling precise and timely triggering of downstream data pipelines.

Cross-account event publishing using Amazon EventBridge

The following snippet shows how a data producer publishes a data refresh event to the central event bus:
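A minimal sketch of the producer side, assuming a central bus named central-data-mesh-bus; the source, detail type, and payload fields shown here are illustrative rather than the exact schema:

```python
import json

import boto3

events = boto3.client("events")

# Publish a data refresh event to the central event bus after a data product
# refresh completes. Bus name, source, detail type, and payload fields are
# placeholders; in practice they come from the orchestration metadata.
response = events.put_events(
    Entries=[
        {
            "EventBusName": "central-data-mesh-bus",
            "Source": "stifel.datamesh.raw-data-domain",
            "DetailType": "DataRefreshEvent",
            "Detail": json.dumps(
                {
                    "domain": "raw-data-domain",
                    "data_product": "RD1",
                    "refresh_timestamp": "2025-10-25T06:15:00Z",
                    "status": "COMPLETED",
                }
            ),
        }
    ]
)
print("Failed entries:", response["FailedEntryCount"])
```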


Sample EventBridge cross-account event forward rule
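A sketch of such a forward rule created with boto3; the rule name, bus names, account IDs, and IAM role are placeholders. The rule matches data refresh events on the central bus and forwards them to a consumer domain's event bus in another account:

```python
import json

import boto3

events = boto3.client("events")

# Rule on the central bus that matches data refresh events.
events.put_rule(
    Name="forward-data-refresh-to-domain-d1",
    EventBusName="central-data-mesh-bus",
    State="ENABLED",
    EventPattern=json.dumps({"detail-type": ["DataRefreshEvent"]}),
)

# Target the domain D1 event bus in the consumer account. The role lives in
# the central account and grants events:PutEvents on the destination bus.
events.put_targets(
    Rule="forward-data-refresh-to-domain-d1",
    EventBusName="central-data-mesh-bus",
    Targets=[
        {
            "Id": "domain-d1-event-bus",
            "Arn": "arn:aws:events:us-east-1:222222222222:event-bus/domain-d1-bus",
            "RoleArn": "arn:aws:iam::111111111111:role/eventbridge-cross-account-forward",
        }
    ],
)
```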

The following screenshot depicts a sample data refresh event that is broadcast to consumer data domains.


2. Data pipeline orchestration

The following diagram describes the technical architecture of the orchestration solution using several AWS services, including Amazon EventBridge, Amazon SQS, AWS Lambda, AWS Glue, Amazon SNS, and Amazon Aurora.

The orchestration solution revolves around five core processors.

Data product pipeline scheduler

The scheduler is a daily scheduled Glue job that finds data products that are due for a data refresh based on orchestration metadata. For each identified data product, the scheduler retrieves both internal and external dependencies and stores them in the orchestration state management database tables with a status of WAITING.
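A minimal sketch of the scheduler logic, assuming the orchestration metadata lives in Aurora Serverless and is accessed through the RDS Data API; the table and column names mirror the reference data model below but are illustrative:

```python
import boto3

rds = boto3.client("rds-data")
DB = {
    "resourceArn": "arn:aws:rds:us-east-1:111111111111:cluster:orchestration-db",
    "secretArn": "arn:aws:secretsmanager:us-east-1:111111111111:secret:orchestration-db-creds",
    "database": "orchestration",
}

# Find data products that are due for a refresh today.
due_products = rds.execute_statement(
    sql="""SELECT dp.data_product_id
           FROM data_product dp
           JOIN data_product_schedule s ON s.data_product_id = dp.data_product_id
           WHERE s.next_run_date = CURRENT_DATE""",
    **DB,
)["records"]

for record in due_products:
    product_id = record[0]["longValue"]
    # Seed the state tables: every internal and external dependency starts in
    # WAITING until its data refresh event arrives.
    rds.execute_statement(
        sql="""INSERT INTO data_product_dependencies_events_status
                   (data_product_id, dependency_name, status)
               SELECT data_product_id, dependency_name, 'WAITING'
               FROM data_product_dependencies
               WHERE data_product_id = :product_id""",
        parameters=[{"name": "product_id", "value": {"longValue": product_id}}],
        **DB,
    )
```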

Data refresh events processor

Data refresh events are emitted from a central event bus and routed to domain-specific event buses. These domain buses deliver the events to a message queue for asynchronous processing. Any undeliverable events are redirected to a dead-letter queue for further inspection and recovery.

The event processor Lambda function consumes messages from the queue and evaluates whether the incoming event corresponds to any defined dependencies within the domain. If a match is found, the dependency status is updated from WAITING to ARRIVED. The processor also checks whether all dependencies for a given data product have been satisfied. If so, it starts the corresponding pipeline execution workflow by triggering an AWS Step Functions state machine.
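A condensed sketch of that Lambda function under the same assumptions (RDS Data API access, illustrative table and column names, environment variables for the ARNs):

```python
import json
import os

import boto3

sfn = boto3.client("stepfunctions")
rds = boto3.client("rds-data")
DB = {
    "resourceArn": os.environ["DB_CLUSTER_ARN"],
    "secretArn": os.environ["DB_SECRET_ARN"],
    "database": "orchestration",
}


def handler(event, context):
    """Process data refresh events delivered from the domain bus via SQS."""
    for record in event["Records"]:
        detail = json.loads(record["body"])["detail"]
        dependency = detail["data_product"]

        # Mark any matching WAITING dependencies as ARRIVED.
        rds.execute_statement(
            sql="""UPDATE data_product_dependencies_events_status
                   SET status = 'ARRIVED'
                   WHERE dependency_name = :dep AND status = 'WAITING'""",
            parameters=[{"name": "dep", "value": {"stringValue": dependency}}],
            **DB,
        )

        # Find data products whose dependencies are now all satisfied
        # (a real implementation would also exclude already-started pipelines).
        ready = rds.execute_statement(
            sql="""SELECT data_product_id
                   FROM data_product_dependencies_events_status
                   GROUP BY data_product_id
                   HAVING COUNT(*) = SUM(CASE WHEN status = 'ARRIVED' THEN 1 ELSE 0 END)""",
            **DB,
        )["records"]

        # Start the pipeline execution state machine for each ready product.
        for row in ready:
            sfn.start_execution(
                stateMachineArn=os.environ["PIPELINE_STATE_MACHINE_ARN"],
                input=json.dumps({"data_product_id": row[0]["longValue"]}),
            )
    return {"status": "ok"}
```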

Data product pipeline processor

This processor retrieves orchestration metadata to find the pipeline configuration and the associated Glue job and parameters for the target data product. It then triggers the Glue job using the retrieved configuration and parameters, ensuring that the pipeline is launched with the correct context and input values. It also captures the Glue job run ID and updates the data product status to PROCESSING in the orchestration state management database, enabling downstream monitoring and status tracking.
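A sketch of that task as it might appear inside the Step Functions workflow; table, column, and parameter names are again illustrative:

```python
import json
import os

import boto3

glue = boto3.client("glue")
rds = boto3.client("rds-data")
DB = {
    "resourceArn": os.environ["DB_CLUSTER_ARN"],
    "secretArn": os.environ["DB_SECRET_ARN"],
    "database": "orchestration",
}


def handler(event, context):
    """Step Functions task: launch the Glue job configured for a data product."""
    product_id = event["data_product_id"]

    # Look up the Glue job name and its run parameters from orchestration metadata.
    config = rds.execute_statement(
        sql="""SELECT c.glue_job_name, p.parameters_json
               FROM data_pipeline_config c
               JOIN data_pipeline_parameters p ON p.pipeline_id = c.pipeline_id
               WHERE c.data_product_id = :pid""",
        parameters=[{"name": "pid", "value": {"longValue": product_id}}],
        **DB,
    )["records"][0]
    job_name = config[0]["stringValue"]
    job_args = json.loads(config[1]["stringValue"])

    # Start the Glue job and record the run ID so its state change events can
    # be correlated back to this data product.
    run = glue.start_job_run(JobName=job_name, Arguments=job_args)

    rds.execute_statement(
        sql="""UPDATE data_product_status
               SET status = 'PROCESSING', glue_job_run_id = :run_id
               WHERE data_product_id = :pid""",
        parameters=[
            {"name": "run_id", "value": {"stringValue": run["JobRunId"]}},
            {"name": "pid", "value": {"longValue": product_id}},
        ],
        **DB,
    )
    return {"data_product_id": product_id, "job_run_id": run["JobRunId"]}
```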

Data product pipeline status processor

Each domain’s EventBridge is configured to listen for AWS Glue job state change events, which are routed to a message queue for asynchronous processing. A processing function evaluates incoming job state events:

  • For successful job completions, the corresponding pipeline status is updated from PROCESSING to COMPLETED in the orchestration state database. If the pipeline is configured to publish downstream events, a data refresh event is emitted to the central event bus.
  • For failed jobs, the pipeline status is updated from PROCESSING to ERROR, enabling downstream systems to handle exceptions or retry the failed job.

The following is a sample Glue job state change event for a successful completion; the Glue job name from the event is used to update the status of the data product.
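A sketch of how the status processor might evaluate such an event when it arrives from the queue; the bus name, source, and the data refresh payload fields are illustrative:

```python
import json

import boto3

events = boto3.client("events")


def handler(event, context):
    """Consume Glue job state change events from SQS and update pipeline status."""
    for record in event["Records"]:
        detail = json.loads(record["body"])["detail"]
        job_name = detail["jobName"]  # used to look up the data product
        state = detail["state"]       # e.g. SUCCEEDED, FAILED, TIMEOUT

        if state == "SUCCEEDED":
            # Update PROCESSING -> COMPLETED in the orchestration state database
            # (database update omitted here), then publish a downstream data
            # refresh event if the pipeline is configured to do so.
            events.put_events(
                Entries=[
                    {
                        "EventBusName": "central-data-mesh-bus",
                        "Source": "stifel.datamesh.domain-d1",
                        "DetailType": "DataRefreshEvent",
                        "Detail": json.dumps(
                            {"data_product": job_name, "status": "COMPLETED"}
                        ),
                    }
                ]
            )
        elif state in ("FAILED", "TIMEOUT"):
            # Update PROCESSING -> ERROR so downstream systems can handle the
            # exception or retry the failed job (omitted here).
            pass
```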

Data product pipeline monitor

The pipeline monitoring system operates through an EventBridge scheduled trigger that fires every 10 minutes to scan the orchestration state. During this scan, it identifies data products with satisfied dependencies but pending pipeline execution and initiates those pipelines automatically. When pipeline reruns are necessary, the system resets the orchestration state, allowing the monitor to reassess dependencies and trigger the appropriate pipelines. Any pipeline failures are promptly captured as exception notifications and directed to a dedicated notification queue for analysis and team alerting.
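A minimal sketch of the 10-minute schedule that drives the monitor; the rule name and Lambda ARN are placeholders, and the Lambda resource policy that allows EventBridge to invoke the function is omitted:

```python
import boto3

events = boto3.client("events")

# Scheduled rule on the default bus that fires every 10 minutes.
events.put_rule(
    Name="data-pipeline-monitor-schedule",
    ScheduleExpression="rate(10 minutes)",
    State="ENABLED",
)

# Invoke the monitor Lambda, which scans the orchestration state tables.
events.put_targets(
    Rule="data-pipeline-monitor-schedule",
    Targets=[
        {
            "Id": "pipeline-monitor-lambda",
            "Arn": "arn:aws:lambda:us-east-1:111111111111:function:data-pipeline-monitor",
        }
    ],
)
```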

Orchestration metadata data model

The following diagram describes the reference data model for storing the dependencies and state management of the data pipelines. The tables in the model are described below.

  • data_product: Stores information about the data product and its settings, such as the publishing event for the data product.
  • data_product_dependencies: Stores the data product dependencies for both internal and external data products.
  • data_product_schedule: Stores the data product run schedule (for example, daily or weekly).
  • data_pipeline_config: Stores information about the Glue job used for the data pipeline (for example, the name of the Glue job and its connections).
  • data_pipeline_parameters: Stores the Glue job parameters.
  • data_product_status: Tracks the execution status of the data product pipeline, transitioning from WAITING to either COMPLETED or ERROR based on runtime outcomes.
  • data_product_dependencies_events_status: Stores the refresh status of data dependencies; it keeps track of the dependencies and is updated as data refresh events arrive.
  • data_product_status_history: Stores the history of data product pipeline executions for audit and reporting.
  • data_product_dependencies_events_status_history: Stores the history of data product dependency status for audit and reporting.
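For illustration, two of the state tables could be defined as follows through the RDS Data API; the column names and types are assumptions, not the actual schema:

```python
import boto3

rds = boto3.client("rds-data")
DB = {
    "resourceArn": "arn:aws:rds:us-east-1:111111111111:cluster:orchestration-db",
    "secretArn": "arn:aws:secretsmanager:us-east-1:111111111111:secret:orchestration-db-creds",
    "database": "orchestration",
}

# Illustrative DDL for the pipeline status and dependency status tables.
for ddl in [
    """CREATE TABLE IF NOT EXISTS data_product_status (
           data_product_id  BIGINT,
           glue_job_run_id  VARCHAR(255),
           status           VARCHAR(20),   -- WAITING | PROCESSING | COMPLETED | ERROR
           updated_at       TIMESTAMP DEFAULT CURRENT_TIMESTAMP
       )""",
    """CREATE TABLE IF NOT EXISTS data_product_dependencies_events_status (
           data_product_id  BIGINT,
           dependency_name  VARCHAR(255),
           status           VARCHAR(20),   -- WAITING | ARRIVED
           arrived_at       TIMESTAMP
       )""",
]:
    rds.execute_statement(sql=ddl, **DB)
```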

Outcome

With data pipeline orchestration and the use of AWS serverless services, Stifel was able to speed up the data refresh process by cutting down the lag time associated with fixed-schedule triggering of data pipelines, as well as increase the parallelism of pipeline execution, which had been a constraint of the on-premises data platform. This approach offers:

  • Scalability by supporting coordination across multiple data domains.
  • Reliability through automated monitoring and resolution of pipeline dependencies.
  • Timeliness by ensuring pipelines are executed precisely when their prerequisites are met.
  • Cost optimization by leveraging AWS serverless technologies (Lambda for compute, EventBridge for event routing, Aurora Serverless for database operations, and Step Functions for workflow orchestration), paying only for actual usage rather than provisioned capacity while providing automatic scaling to handle varying workloads.

Conclusion

In this post, we showed how a modular, event-driven orchestration solution can effectively manage cross-domain data pipelines. Organizations can refer to this blog post to build robust data pipeline orchestration that avoids rigid schedules and dependencies by leveraging event-based triggers.

Special thanks: This implementation’s success is the result of close collaboration between the Stifel Financial leadership team (Kyle Broussard, Managing Director; Martin Nieuwoudt, Director of Data Strategy & Analytics), AWS ProServe, and the AWS account team. We would like to thank Stifel Financial Executives and the Leadership Team for their strong sponsorship and direction.

About the authors

Amit Maindola

Amit is a Senior Data Architect with the AWS ProServe team focused on data engineering, analytics, and AI/ML. He helps customers in their digital transformation journey and enables them to build highly scalable, robust, and secure cloud-based analytical solutions on AWS to gain timely insights and make critical business decisions.

Srinivas Kandi

Srinivas is a Senior Architect at Stifel specializing in delivering the next generation of cloud data platforms on AWS. Prior to joining Stifel, Srini was a delivery specialist in cloud data analytics at AWS, helping several customers in their transformational journey into the AWS cloud. In his free time, Srini likes to explore cooking, travel, and learn about new developments and innovations in AI and cloud computing.

Hossein Johari

Hossein is a seasoned data and analytics leader with over 25 years of experience architecting enterprise-scale platforms. As Lead and Senior Architect at Stifel Financial Corp. in St. Louis, Missouri, he spearheads initiatives in Data Platforms and Strategic Solutions, driving the design and implementation of innovative frameworks that support enterprise-wide analytics, strategic decision-making, and digital transformation. Known for aligning technical vision with business objectives, he works closely with cross-functional teams to deliver scalable, forward-looking solutions that advance organizational agility and performance.

Ahmad Rawashdeh

Ahmad is a Senior Architect at Stifel Financial. He helps Stifel and its clients design, implement, and build scalable and reliable data architectures on Amazon Web Services (AWS), with a strong focus on data lake strategies, database services, and efficient data ingestion and transformation pipelines.

Lei Meng

Lei is a data architect at Stifel. He focuses on designing and implementing scalable and secure data solutions on AWS and on supporting Stifel’s cloud migration from on-premises systems.
