    • I was fortunate enough to read the 2018 book MELTDOWN by Chris Clearfield and András Tilcsik. Its subtitle?

      What plane crashes, oil spills, and dumb business decisions can teach us about how to succeed at work and at home.

      Both Chris and András are experts in their respective fields with quite diverse backgrounds. Chris is a "former derivatives trader who worked in New York, Hong Kong, and Tokyo," while András "holds the Canada Research Chair in Strategy, Organizations, and Society at the University of Toronto's Rotman School of Management." Together, they've pooled their knowledge to help readers understand why crisis situations occur and how we can prevent them in the future.

    • The book begins with the 2009 Washington DC Metro Train 112 tragedy, described from the perspective of passengers who boarded and were affected. "Ann and David Wherley boarded the first car of Metro Train 112, bound for Washington DC, on their way home from an orientation for hospital volunteers." Shortly thereafter, their peaceful commute was brought to a crushing halt by a collision that drove a 13-foot wall of debris into the train - a collision with an invisible train that had halted on the tracks ahead, and that all the system's checks, balances, and operators had failed to catch.

      The authors then bring in additional context: at any given moment, we are surrounded by crisis.

      "Hardly a week goes by without a handful of meltdowns. One week it's an industrial accident, another it's a bankruptcy, and another it's an awful medical error. Even small issues can wreak great havoc...these failures - and even large-scale meltdowns like BP's oil spill in the Gulf of Mexico, the Fukushima nuclear disaster, and the global financial crisis - seem to stem from very different problems. But their underlying causes turn out to be surprisingly similar. These events have a shared DNA, one that researchers are just beginning to understand."

    • What is in the DNA of failure? And how can we prevent it?

      With vividly illustrated examples ranging from a holiday social media promotion gone awry to Three Mile Island, András and Chris outline these common factors and show how we can build better outcomes in any kind of situation.

      Charles Perrow is introduced to us early on in the book. A brilliant sociologist, Perrow combed through vast amounts of accident data to research what causes systems to fail.

      "For years, Perrow and his team of students trudged through the details of hundreds of accidents, from airplane crashes to chemical plant explosions. And the same pattern showed up over and over again. Different parts of a system unexpectedly interacted with one another, small failures combined in unanticipated ways, and people didn't understand what was happening. Perrow's theory was that two factors make systems susceptible to these kinds of failures. If we understand those factors we can figure out which systems are most vulnerable."

      What are these two factors?

      Complexity and Tight Coupling. Complexity can be understood as the layers of interconnection and opacity within an organization, entity, ecosystem, environment... you name it. In a complex system, "we can't go in to take a look at what's happening in the belly of the beast. We need to rely on indirect indicators to assess most situations...we can see some things but not everything." As for what Tight Coupling refers to, "in tightly coupled systems, it's not enough to get things MOSTLY right. The quantity of inputs must be precise, and they need to be combined in a particular order and time frame. Redoing a task if it's not done correctly the first time isn't usually an option. Substitutes or alternative methods rarely work...everything happens quickly, and we just can't turn off the system while we deal with a problem."

      So what does that look like in action?

    • In practice, Perrow plotted systems on a chart with two axes: how complex they are, and how tightly coupled they are.

      As our authors put it, "The danger zone in Perrow's chart is in the upper-right quadrant. It's the combination of complexity and tight coupling that causes meltdowns. Small errors are inevitable in complex systems, and once things begin to go south, such systems produce baffling symptoms...Perrow called these meltdowns normal accidents...Such accidents are normal not in the sense of being frequent but in the sense of being natural and inevitable...but Perrow's framework also helps us understand these accidents: complexity and tight coupling contribute to preventable meltdowns too."

      So what does this have to do with burned turkey? Chris and András point out that Thanksgiving is a perfect microcosm of what causes so many failures. Holiday travel fixes the meal to a tight schedule, which makes tight coupling inevitable. The meal itself is complex, with most chefs juggling a variety of dishes as well as guests. You're under pressure and on the clock! So:

      "To avoid Thanksgiving disasters, some experts recommend simplifying the part of the system that's most clearly in the danger zone...as a result, the turkey becomes a less complex system. The various parts are less connected, and it's easier to see what's going on with each of them. Tight coupling is also reduced...if unexpected issues arise, you can just focus on the problem at hand - without having to worry about a whole complex system of white meats, dark meats, stuffing and all. This approach - reducing complexity and adding slack - helps us escape from the danger zone."

    • András and Chris dig into many more examples, like the Knight Capital software glitch and the Deepwater Horizon incident, as well as lesser-known system breakdowns like the Horizon Post Office software scandal in the United Kingdom. While digging into these fascinating situations, we're given concrete skills we can take away to help address and prevent failures and system meltdowns in our own lives.

      This book is definitely worth reading and reviewing for yourself. You can learn to create your own pairwise Wiki survey. You can harness prospective hindsight to help you make better decisions. You can build a company culture that harnesses the speed bump effect for the greater good.

    • Very fascinating. One of those disasters, the Deepwater Horizon platform disaster, hits close to home for me because I used to be in the energy industry and we had awful disasters like refinery fires and the Exxon Valdez crisis.

      I thought at the time it was complexity and exposure to weather and dangerous chemicals. It's one reason I left the industry; several of my friends had been killed, so I fled for the safety of Silicon Valley.

      I didn't realize that Silicon Valley would give rise to tech like Facebook which may have enabled even greater disasters. 😢

      My very favorite book about getting things right is from Atul Gawande.