International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064





Review Papers | Computer Science & Engineering | United States of America | Volume 13 Issue 7, July 2024


Event Driven Data Architecture: Design and Implementation with Kinesis and Spark Streaming

Arjun Mantri


Abstract: This paper reviews the design and implementation of an event-driven data architecture built on Amazon Kinesis and Apache Spark Streaming. Advances in real-time data processing let organizations handle and analyze data as it arrives. In this architecture, Amazon Kinesis handles data ingestion and Apache Spark Streaming provides high-throughput, fault-tolerant stream processing; combining the two yields a scalable, low-latency, fault-tolerant system. The paper draws on case studies from OTT streaming services, travel booking platforms, and social media networks to illustrate practical applications and benefits. For example, Netflix uses this architecture to personalize content recommendations and monitor service quality, Expedia uses it for real-time availability and pricing updates, and LinkedIn uses it to monitor user activity and detect trends in real time. Implementation details cover setting up Kinesis for real-time data ingestion and configuring Spark Streaming for processing and analytics. Scalability comes from dynamically adjusting the number of Kinesis shards and Spark executors, and fault tolerance from data replication and checkpointing. The findings indicate that integrating Amazon Kinesis with Apache Spark Streaming improves operational efficiency and supports advanced analytics, giving organizations a basis for building scalable, real-time data pipelines for modern data-driven applications.
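
The abstract attributes scalability partly to adjusting the number of Kinesis shards. As a rough illustration of why the shard count matters, the sketch below models how a record's partition key could be routed to a shard. It is not taken from the paper: it assumes the 128-bit hash key space is split into equal, contiguous ranges, and that the MD5 digest of the partition key is read as a big-endian integer. The function names (`shard_ranges`, `shard_for_key`, `distribute`) are hypothetical, and no AWS service is called.

```python
import hashlib

# Kinesis maps partition keys into a 128-bit hash key space using MD5.
MAX_HASH_KEY = 2**128 - 1


def shard_ranges(shard_count):
    """Split the hash key space into equal, contiguous ranges (an assumption;
    real streams can have uneven ranges after splits and merges)."""
    size = (MAX_HASH_KEY + 1) // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = MAX_HASH_KEY if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key, ranges):
    """Return the index of the range that contains the key's MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")  # byte order is an assumption
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key outside the configured ranges")


def distribute(keys, shard_count):
    """Count how many of the given partition keys land on each shard."""
    ranges = shard_ranges(shard_count)
    counts = [0] * shard_count
    for key in keys:
        counts[shard_for_key(key, ranges)] += 1
    return counts


if __name__ == "__main__":
    user_keys = [f"user-{n}" for n in range(1000)]
    print("2 shards:", distribute(user_keys, 2))
    print("4 shards:", distribute(user_keys, 4))
```

Under these assumptions, doubling the shard count halves each range, so the keys that previously shared one shard are spread across two. The same property is why a poorly chosen partition key (for example, a constant) would defeat resharding: every record hashes to the same point in the key space.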


Keywords: Real-time data processing, Event-driven architecture, Amazon Kinesis, Apache Spark Streaming, Scalable data pipelines


Edition: Volume 13 Issue 7, July 2024


Pages: 653 - 655


