Informative Article | Computer Science and Information Technology | United States of America | Volume 13 Issue 9, September 2024 | Rating: 6.4 / 10
Leveraging Event-Based Architecture, AWS Step Functions, AWS Batch, and DynamoDB to Run ETL or ELT Jobs Concurrently While Allowing Granular Replay Capabilities
Akshay Prabhu
Abstract: Traditional Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) jobs are often perceived as hardware-intensive, requiring persistent EC2 instances to handle large data sets. This conventional approach has drawbacks: long-running jobs must be monitored manually, and a failed job cannot be replayed from a specific point or stage in the ETL/ELT process. In addition, each ETL/ELT phase has its own potential failure points, which complicates the operational management of these workflows. AWS provides a suite of serverless and managed services, such as EventBridge, S3, SNS, Lambda, Step Functions, and Batch, that can be combined to create a scalable and resilient ETL/ELT architecture. This paper explores how integrating these services can transform traditional ETL/ELT processes into a more flexible, state-managed saga (1) with granular replay capabilities. The goal is to show software architects and engineers how this architecture can enhance traditional data processing workflows, with a focus on concurrent job execution and precise error recovery.
Keywords: Event-Based Architecture, AWS Step Functions, AWS Batch, Amazon DynamoDB, ETL (Extract, Transform, Load), ELT (Extract, Load, Transform), Serverless Computing
Edition: Volume 13 Issue 9, September 2024
Pages: 25-28
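The abstract's idea of a state-managed workflow with granular replay can be illustrated with a minimal Python sketch. It is not the paper's implementation: the stage names, the `JobStateStore` class (an in-memory stand-in for a DynamoDB table keyed by job and stage), and the `run_job` function are all hypothetical, and plain Python callables take the place of Step Functions states and Batch jobs. The sketch only shows the bookkeeping that makes replay from a chosen stage possible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stage names; the abstract does not enumerate the stages.
STAGES: List[str] = ["extract", "transform", "load"]


@dataclass
class JobStateStore:
    """In-memory stand-in (assumption) for a DynamoDB table keyed by (job_id, stage)."""
    items: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def put(self, job_id: str, stage: str, status: str) -> None:
        self.items[(job_id, stage)] = status

    def get(self, job_id: str, stage: str) -> Optional[str]:
        return self.items.get((job_id, stage))


def run_job(
    job_id: str,
    handlers: Dict[str, Callable[[str], None]],
    store: JobStateStore,
    replay_from: Optional[str] = None,
) -> List[str]:
    """Run the stages in order and return the stages that completed in this call.

    Stages already marked SUCCEEDED are skipped, so a rerun resumes at the
    first unfinished stage. Passing replay_from forces that stage and every
    later stage to run again, even if they succeeded before.
    """
    if replay_from is not None and replay_from not in STAGES:
        raise ValueError(f"unknown stage: {replay_from}")

    executed: List[str] = []
    force = False
    for stage in STAGES:
        if stage == replay_from:
            force = True
        if not force and store.get(job_id, stage) == "SUCCEEDED":
            continue  # finished in an earlier run; nothing to redo
        store.put(job_id, stage, "RUNNING")
        try:
            handlers[stage](job_id)
        except Exception:
            store.put(job_id, stage, "FAILED")
            break  # stop here; later stages depend on this one
        store.put(job_id, stage, "SUCCEEDED")
        executed.append(stage)
    return executed


if __name__ == "__main__":
    attempts = {"transform": 0}

    def flaky_transform(job_id: str) -> None:
        # Fails on the first attempt only, to simulate a transient error.
        attempts["transform"] += 1
        if attempts["transform"] == 1:
            raise RuntimeError("simulated transient failure")

    handlers = {
        "extract": lambda job_id: None,
        "transform": flaky_transform,
        "load": lambda job_id: None,
    }
    store = JobStateStore()
    print(run_job("job-1", handlers, store))  # ['extract']
    print(run_job("job-1", handlers, store))  # ['transform', 'load']
    print(run_job("job-1", handlers, store, replay_from="transform"))  # ['transform', 'load']
```

Because state is keyed by job identifier, each job's progress is tracked independently, which is what would let several jobs run concurrently without interfering with one another's replay points.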