Taiichi Ohno was one of the inventors of the Toyota Production System. His book, Toyota Production System: Beyond Large-Scale Production, is a fascinating read, even though it’s decidedly non-practical. After reading it, you might not even realize that there are cars involved in Toyota’s business.
When something goes wrong, we tend to see it as a crisis and look for someone to blame. A better way is to see it as a learning opportunity, not in the existential sense of general self-improvement, but in a concrete one: we can use the technique of asking why five times to get to the root cause of the problem.
Let’s say you notice that your website is down. Obviously, your first priority is to get it back up. Once the crisis has passed, you start asking why the site went down, and suppose the answers lead you to the immediate cause: a new bit of code contained an infinite loop! So far, this isn’t much different from the kind of analysis any competent operations team would conduct for a site outage. I have come to believe that this technique should be used for all kinds of defects, not just site outages. Each time, we use the defect as an opportunity to find out what’s wrong with our process, and make a small adjustment. By continuously adjusting, we eventually build up a robust series of defenses that prevent problems from happening.