Chaos engineering was born somewhat accidentally in 2008, when Netflix moved from the data center to the cloud. The move didn’t go as planned. The thinking at the time was that the data center locked Netflix into an architecture of single points of failure, like large databases and vertically scaled components. Moving to […]
Gremlin Brings Chaos Monkey Testing to Spinnaker CD Platform
Gremlin this week announced it will integrate its hosted chaos engineering service with continuous delivery platforms, starting with the open source Spinnaker project developed by Netflix. Chaos engineering traces its roots to a resiliency testing philosophy that posits that applications should be able to keep functioning regardless of any service or infrastructure failure. Most IT organizations, […]
Inject failure to make your systems more reliable
Injecting failure into your infrastructure to test your services’ resilience has been gaining popularity. Most people think of Netflix and their use of “Chaos Monkey”; in fact, they have an entire “Simian Army” of tests and drills they run to make their systems more reliable. At PagerDuty, we couldn’t just copy what has been published […]



