Analysis Study Research Paper | Computer Science & Engineering | India | Volume 9 Issue 3, March 2020
Chaos Engineering for Building Resilient Distributed Systems
Venkata Naga Sai Kiran Challa
Abstract: Chaos Engineering is a methodology for ensuring the reliability and fault tolerance of distributed systems. By deliberately introducing faults, it tests how systems behave under real-world failure conditions and identifies vulnerabilities that traditional testing may miss. This proactive approach helps organizations such as Netflix, Amazon, Google, and Microsoft maintain high availability of their services. Integrating Machine Learning (ML) into Chaos Engineering further enhances its effectiveness by predicting anomalies, automating experiments, and improving observability. This combined strategy promotes a culture of continuous learning and resilience, which is crucial for modern, complex systems.
Keywords: Chaos Engineering, fault tolerance, distributed systems, Machine Learning, resilience
Edition: Volume 9 Issue 3, March 2020
Pages: 1678 - 1689