Drift into Failure

From Hunting Broken Components to Understanding Complex Systems

Sidney Dekker

About This Book

What does the collapse of sub-prime lending have in common with a broken jackscrew in an airliner's tailplane? Or the oil spill disaster in the Gulf of Mexico with the burn-up of Space Shuttle Columbia? These were systems that drifted into failure. While pursuing success in a dynamic, complex environment with limited resources and multiple goal conflicts, a succession of small, everyday decisions eventually produced breakdowns on a massive scale. We have trouble grasping the complexity and normality that gives rise to such large events. We hunt for broken parts, fixable properties, people we can hold accountable. Our analyses of complex system breakdowns remain depressingly linear, depressingly componential - imprisoned in the space of ideas once defined by Newton and Descartes. The growth of complexity in society has outpaced our understanding of how complex systems work and fail. Our technologies have gotten ahead of our theories. We are able to build things - deep-sea oil rigs, jackscrews, collateralized debt obligations - whose properties we understand in isolation. But in competitive, regulated societies, their connections proliferate, their interactions and interdependencies multiply, their complexities mushroom. This book explores complexity theory and systems thinking to understand better how complex systems drift into failure. It studies sensitive dependence on initial conditions, unruly technology, tipping points, diversity - and finds that failure emerges opportunistically, non-randomly, from the very webs of relationships that breed success and that are supposed to protect organizations from disaster. It develops a vocabulary that allows us to harness complexity and find new ways of managing drift.

1
Failure is Always an Option

Accidents are the effect of a systematic migration of organizational behavior under the influence of pressure toward cost-effectiveness in an aggressive, competitive environment.1
Rasmussen and Svedung

Who Messed up Here?

If only there were an easy, unequivocal answer to that question. In June 2010, the U.S. Geological Survey calculated that as much as 50,000 barrels, or 2.1 million gallons of oil a day, were flowing into the Gulf of Mexico out of the well left over from a sunken oil platform. The Deepwater Horizon oil rig had exploded in April 2010, killing 11 people, and then sank to the bottom of the sea. The explosion triggered a spill that lasted for months, as the severed riser pipe kept spewing oil deep into the sea.
Anger over the deaths and unprecedented ecological destruction turned to a hunt for culprits – Tony Hayward, the British CEO of BP, which used the rig (the rig was run by Transocean, a smaller exploration company) or Carl-Henric Svanberg, its Swedish chairman, or people at the federal Minerals Management Service. As we wade deeper into the mess of accidents like these, the story quickly grows murkier, branching out into multiple possible versions. The "accidental" seems to become less obvious, and the roles of human agency, decision-making and organizational trade-offs appear to grow in importance. But the possible interpretations of why these decisions and trade-offs caused an oil rig to blow up are book-ended by two dramatically different families of versions of the story.
Ultimately, these families of explanations have their roots in entirely different assumptions about the nature of knowledge (and, by extension, human decision-making). These families present different premises about how events are related to each other through cause and effect, and about the foreseeability and preventability of disasters and other outcomes. In short, they take very different views of how the world can be known, how the world works, and how it can be controlled or influenced. These assumptions tacitly inform much of what either family sees as common-sense: which stones it should look for and turn over to find the sources of disaster. When we respond to failure, we may not even realize that we are operating firmly inside one family or the other. It seems so natural, so obvious, so taken-for-granted to ask the questions we ask, to look for causes in the places we do.
One family of explanations goes back to how the entire petroleum industry is rotten to the core, how it is run by callous men and not controlled by toothless regulators and corruptible governments. More powerful than many of the states in which it operates, the industry has governments in its pocket. Managers spend their days making amoral trade-offs to the detriment of nature and humanity. Worker safety gets sacrificed, as do environmental concerns, all in the single-minded and greedy pursuit of ever greater profits.2 Certain managers are more ruthless than others, certain regulators more hapless than others, some workers more willing to cut corners than others, and certain governments easier to buy than others. But that is where the differences essentially end. The central, common problem is one of culprits, driven by production, expediency and profit, and their unethical decisions. Fines and criminal trials will deal with them. Or at least they will make us feel better.
The family of explanations that identifies bad causes (bad people, bad decisions, broken parts) for bad outcomes is firmly quartered in the epistemological space3 once established by titans of the scientific revolution – Isaac Newton (1642–1727) and René Descartes (1596–1650). The model itself is founded in, and constantly nourished by, a vision of how the world works that is at least three centuries old, and which we have equated with "analytic" and "scientific" and "rational" ever since. In this book, I call it the Newtonian–Cartesian vision.4
Nowadays, this epistemological space is populated by theories that faithfully reproduce Cartesian and Newtonian ideas, and that make us think about failure in their terms. We might not even be aware of it, and, more problematically, we might even call these theories "systemic." Thinking about risk in terms of energy-to-be-contained, which requires barriers or layers of defense, is one of those faithful reproductions. The linear sequence of events (of causes and effects) that breaks through these barriers is another. The belief that, by applying the right method or the best method, we can approximate the true story of what happened is Newtonian too: it assumes that there is a final, most accurate description of the world. And underneath all of this, of course, is a reproduction of the strongest Newtonian commitment of all: reductionism. If you want to understand how something works or fails, you have to take it apart and look at the functioning or non-functioning of the parts inside it (for example, holes in a layer of defense). That will explain why the whole failed or worked.

Rational Choice Theory

The Newtonian vision has had enormous consequences for our thinking even in the case of systems that are not as linear and closed as Newton's basic model – the planetary system. Human decision-making and its role in the creation of failure and success is one area where Newtonian thought appears very strongly. For its psychological and moral nourishment, this family of explanations runs on a variant of rational choice theory. In the words of Scott Page:
In the literature on institutions, rational choice has become the benchmark behavioral assumption. Individuals, parties, and firms are assumed to take actions that optimize their utilities conditional on their information and the actions of others. This is not inconsistent with the fact that, ex post, many actions appear to be far from optimal.5
Rational choice theory says that operators and managers and other people in organizations make decisions by systematically and consciously weighing all possible outcomes along all relevant criteria. They know that failure is always an option, but the costs and benefits of decision alternatives that make such failure more or less likely are worked out and listed. Then people make a decision based on the outcome that provides the highest utility, or the highest return on the criteria that matter most, the greatest benefit for the least cost. If decisions after the fact ("ex post" as Scott Page calls it) don't seem to be optimal, then something was wrong with how people inside organizations gathered and weighed information. They should or could have tried harder. BP, for example, hardly seems to have achieved an optimum in any utilitarian terms with its decision to skimp on safety systems and adequate blowout protection in its deepwater oil pumping. A few more million dollars in investment here and there (a couple of hours of earnings, really) pretty much pales in comparison to the billions in claims, drop in share price, consumer boycotts and the immeasurable cost in reputation it suffered instead — not to mention the 11 dead workers and destroyed eco-systems that will affect people way beyond BP or its future survival.
The rational decision-maker, when she or he achieves the optimum, meets a number of criteria. The first is that the decision-maker is completely informed: she or he knows all the possible alternatives and knows which courses of action will lead to which alternative. The decision-maker is also capable of an objective, logical analysis of all available evidence on what would constitute the smartest alternative, and is capable of seeing the finest differences between choice alternatives. Finally, the decision-maker is fully rational and able to rank the alternatives according to their utility relative to the goals the decision-maker finds important. These criteria were once formalized in what was called Subjective Expected Utility Theory. It was devised by economists and mathematicians to explain (and even guide) human decision-making. Its four basic assumptions were that people have a clearly defined utility function that allows them to index alternatives according to their desirability, that they have an exhaustive view of decision alternatives, that they can foresee the probability of each alternative scenario and that they can choose among those to achieve the highest subjective utility.
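To make those four assumptions concrete, here is a minimal formal sketch of the standard textbook formulation of subjective expected utility; the notation is illustrative and not drawn from this chapter. An alternative a is scored by weighting the utility u of each possible outcome by the decision-maker's subjective probability p of the state of the world s that produces it, and the fully rational decision-maker then picks the highest-scoring alternative:

\[
SEU(a) \;=\; \sum_{s} p(s \mid a)\, u(a, s),
\qquad
a^{*} \;=\; \arg\max_{a} SEU(a)
\]

On this view, a decision that looks bad after the fact can only mean that the probabilities, the utilities, or the search over alternatives was handled poorly.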
A strong case can be made that BP should have known all of this, and thus should have known better. U.S. House Representative Henry Waxman, whose Energy and Commerce Committee had searched 30,000 BP documents looking for evidence of attention to the risks of the Deepwater well, told the BP chairman, "There is not a single email or document that shows you paid even the slightest attention to the dangers at the well. You cut corner after corner to save a million dollars here and a few hours there. And now the whole Gulf Coast is paying the price."6 This sounded like amoral calculation — of willingly, consciously putting production before safety, of making a deliberate, rational calculation of rewards and drawbacks and deciding for saving money and against investing in safety.
And it wasn't as if there was no precedent to interpret BP's actions in those terms. There was a felony conviction after an illegal waste-dumping in Alaska in 1999, criminal convictions after the 2005 refinery blast that killed 15 people in Texas City, and criminal convictions after a 2006 Prudhoe Bay pipeline spill that released some 200,000 gallons of oil onto the North Slope. After the 2005 Texas City explosion, an independent expert committee concluded that "significant process safety issues exist at all five U.S. refineries, not just Texas City," and that "instances of a lack of operating discipline, toleration of serious deviations from safe operating practices, and apparent complacency toward serious process safety risk existed at each refinery."7 The panel had identified systemic problems in the maintenance and inspection of various BP sites, and found a disconnect between management's stated commitment to safety and what it actually was willing to invest. Unacceptable maintenance backlogs had ballooned in Alaska and elsewhere. BP had to get serious about addressing the underlying integrity issues, otherwise any other action would only have a very limited or temporary effect.
It could all be read as amoral calculation. In fact, that's what the report came up with: "Many of the people interviewed ... felt pressured to put production ahead of safety and quality."8 The panel concluded that BP had neglected to clean and check pressure valves, emergency shutoff valves, automatic emergency shutdown mechanisms and gas and fire safety detection devices (something that would show up in the Gulf of Mexico explosion again), all of them essential to preventing a major explosion. It warned management of the need to update those systems, because of their immediate safety or environmental impact. Yet workers who came forward with concerns about safety were sanctioned (even fired in one case), which quickly shut down the flow of safety-related information.
Even before getting the BP chairman to testify, the U.S. Congress weighed in with its interpretation that bad rational choices were made, saying "it appears that BP repeatedly chose risky procedures in order to reduce costs and save time, and made minimal efforts to contain the added risk." Many people expressed later that they felt pressure from BP to save costs where they could, particularly on maintenance and testing. Even contractors received a 25 percent bonus tied to BP's production numbers, which sent a pretty clear message about where the priorities lay. Contractors were discouraged from reporting high occupational health and safety statistics too, as this would ultimately interfere with production.9
Rational choice theory is an essentially economic model of decision-making that keeps percolating into our understanding of how people and organizations work and mess up. Despite findings in psychology and sociology that deny that people have the capacity to work in a fully rational way, it is so pervasive and so subtle that we might hardly notice it. It affects where we look for the causes of disaster (in people's bad decisions or other broken parts). And it affects how we assess the morality of, and accountability for, those decisions. We can expect people involved in a safety-critical activity to know its risks, to know possible outcomes, or to at least do their best to achieve as great a level of knowledge about it as possible. What it takes on their part is an effort to understand those risks and possible outcomes, to plot them out. And it takes a moral commitment to avoid the worst of them. If people knew in advance what the benefits and costs of particular decision alternatives were, but went ahead anyway, then we can call them amoral.
The amoral calculator idea has been at the head of the most common family of explanations of failure ever since the early 1970s. During that time, in response to large and high-visibility disasters (Tenerife, Three Mile Island), a historical shift occurred in how societies understood accidents.10 Rather than as acts of God, or fate, or meaningless (that is, truly "accidental") coincidences of space and time, accidents began to be seen as failures of risk management. Increasingly, accidents were constructed as human failures, as organizational failures. As moral failures.
The idea of the amoral calculator, of course, works only if we can prove that people knew, or could reasonably have known, that things were going to go wrong as a result of their decisions. Since the 1970s, we have "proven" this time and again in accident inquiries (for which the public costs have risen sharply since the 1970s) and courts of law. Our conclusions are most often that bad or miscreant people made amoral trade-offs, that they didn't invest enough effort, or that they were negligent in their understanding of how their own system worked. Such findings not only instantiate, but keep reproducing the Newtonian–Cartesian logic that is so common-sense to us. We hardly see it anymore; it has become almost transparent. Our activities in the wake of failure are steeped in the language of this worldview. Accident inquiries are supposed to return probable "causes." The people who participate in them are expected by media and industry to explain themselves and their work in terms of broken parts (we have found what was wrong: here it is). Even so-called "systemic" accident models serve as a vehicle to find broken parts, though higher upstream, away from the sharp end (deficient supervision, insufficient leadership). In courts, we argue that people could reasonably have foreseen harm, and that harm was indeed "caused" by their action or omission. We couple assessments of the extent of negligence, or the depth of the moral depravity of people's decisions, to the size of the outcome. If the outcome was worse (more oil leakage, more dead bodies), then the actions that led up to it must have been really, really bad. The fine gets higher, the prison sentence longer.
It is not, of course, that applying this family of explanations leads to results that are simply false. That would be an unsustainable and useless position to take. If the worldview behind these explanations remains invisible to us, however, we will never be able to discover just how it influences our own rationalities. We will not be able to question it, nor our own assumptions. We might simply assume that this is the only way to look at the world. And that is a severe restriction, a restriction that matters. Applying this worldview, after all, leads to particular results. It doesn't really allow us to escape the epistemological space established more than 300 years ago. And because of that, it necessarily excludes other readings and other results. By not considering those (and not even knowing that we can consider those alternatives) we may well short-change ourselves. It may leave us less diverse, less able to respond in novel or more useful ways. And it could be that disasters repeat themselves because of that.

Technology has Developed More Quickly Than Theory

The message of this book is simple. The growth of complexity in society has got ah...
