Chapter 1
Mismanaging the Unexpected
"A breakdown is not a negative situation to be avoided, but a situation of nonobviousness."1
–Terry Winograd and Fernando Flores
"Danger, disquiet, anxiety attend the unknown – the first instinct is to eliminate those distressing states. First principle: any explanation is better than none. … The first idea which explains that the unknown is in fact the known does so much good that one 'holds it for true.'"2
–Friedrich Nietzsche
Nonobvious breakdowns happen all the time. Some are a big deal. Most are not. But which are which? The answer to that question is hazy because we tend to settle for the "first explanation" that makes us feel in control. That explanation turns the unknown into the known, which makes the explanation appear to be "true." That can be a serious misjudgment. This book is about what we could call "the second explanation," the one that, discomforting though it may be, treats the unknown as knowable. This second explanation is built from processes that produce an ongoing focus on failures, simplifications, operations, options, and expertise. Organizing that incorporates processes with these five areas of focus helps make breakdowns more knowable. These processes are an effortful means to maintain reliable performance, and previous work on high reliability organizations (HROs) shows that effortful processes like these make breakdowns more obvious at earlier stages in their development.
Our ideas come from an evolving body of work that originated with studies of safe operations on the flight decks of aircraft carriers, the generation and transmission of electrical power, and the dispatching of aircraft at an en route air traffic control center.3 The common problem faced by all three was a million accidents waiting to happen that didn't. In each case the question was, How were the units organized to accomplish this outcome? Among the answers that have been proposed are the existence of a unique culture, capability for self-design, networks built on expertise, hybrid structures with special attention to redundancy, training and routines, situation awareness, mind-sets involved in sensemaking, relational strategies, and information processing.4 In an effort to synthesize a workable set of principles from this rich array, we focused on processes that were mixtures of variety and stability or, as the late Michael Cohen called them, "patterns in variety."5 One pattern that seemed to recur was a sustained focus on small failures, less abstract specifics, ongoing operations, alternative pathways to keep going, and the mobilization of expertise. The variety within this pattern came from local customizing that produced meaningful practices that did not compromise the adaptive capacity that the pattern generated.
Once that adaptive capacity weakens, reliability suffers. To illustrate how problems with reliability develop over time, in this chapter we analyze the collapse of the Washington Mutual Bank (WaMu). Although this example involves the financial industry, the problems and lessons apply to other industries as well.6 This wider application occurs because all of us, just as was true for those at WaMu, have to act in situations we can't possibly understand.7 And the reason we can't understand them is because all of us "have to apply limited conceptions to unlimited interdependencies."8 The conceptions and the ways we apply them are what matter. If we change these conceptions, then we change our ability to function under conditions of nonobviousness. As we will see, WaMu underestimated its interdependencies and overestimated its conceptual grasp of those interdependencies it did see.
Washington Mutual Mismanages the Unexpected
Washington Mutual Bank (WaMu) failed and was seized by the Federal Deposit Insurance Corporation (FDIC) on September 25, 2008, at 6 PM, and sold to JP Morgan Chase. We take a closer look at a sample of surprises at WaMu that affected its reliability. And we describe one way to think about these fluctuations in reliability. Our interpretation is grounded in the idea that managing the unexpected is an ongoing effort to define and monitor weak signals9 of potentially more serious threats and to take adaptive action as those signals begin to crystallize into more complex chains of unintended consequences. The phrase "begin to crystallize" is crucial to our argument because managing is an active process that is spread over time as the signals and situations change. As a problem begins to unfold, weak signals are hard to detect but easy to remedy. As time passes, this state of affairs tends to reverse. Signals become easy to detect but hard to remedy. As weak signals change, so do the requirements for adaptive functioning. It is that adapting that became more and more flawed at WaMu.
Overview of Washington Mutual Bank Failure10
During the 1980s WaMu, nearly 100 years old, was a retail savings and loan (S&L) bank that, under chief executive officer (CEO) Louis Pepper, had grown from 35 branches to 50 and from $2 billion in assets to $7 billion. The organization was held together by five values, all nouns: ethics, respect, teamwork, innovation, and excellence.11 When Pepper was replaced in December 1988 by Kerry Killinger, the values were changed to three adjectives: fair, caring, and human.12 Later, as the bank aggressively tried to become the largest at several lines of business (largest S&L, largest mortgage lender,13 and largest home equity lender14) and focused increasingly on high-risk, subprime loans, two new adjectives replaced all other values: dynamic and driven.15 These last two values were christened "The WaMu way."16
In 1998 WaMu acquired Long Beach Mortgage (LB), a small subprime lender with $328 million in assets. Subprime lending had become fashionable in the banking industry. WaMu had never made these kinds of loans, although they appeared to be more profitable than conventional mortgages, albeit riskier. Subprime loans were more profitable because banks charged higher interest rates and higher fees, but they were riskier because borrowers couldn't qualify for regular prime mortgages.
An early weak signal of unexpected events occurred in the summer of 2003. A sampling of 270 LB loans reviewed by the compliance department revealed that 40 percent were deemed "unacceptable because of a critical error."17 Underwriting standards had been loosened to sell more loans. An internal flyer had said "a thin file is a good file,"18 suggesting that less effort spent on documentation meant more time to sell more loans. For example, one loan application had a picture of a mariachi singer, and his income was "stated" as being in six figures. However, the picture was not a picture of the borrower, nor was that the borrower's income.19
As the bank moved i...