Black Box Thinking


  What Popper had in mind here is that Adler’s theories were compatible with anything. If, say, a man saves a drowning child, then, according to Adler, he is proving to himself that he has the courage to risk his life by jumping into a river. If the same man refuses to save the drowning child, he is proving to himself that he has the courage to risk social disapproval. In both cases, he has overcome his inferiority complex. The theory is confirmed, whatever happens. As Popper put it:

  I could not think of any human behavior which could not be interpreted in terms of the theory. It was precisely this fact—that they always fitted, that they were always confirmed—which in the eyes of their admirers constituted the strongest argument in favor of the theory. It began to dawn on me that this apparent strength was in fact its weakness.

  Most closed loops exist because people deny failure or try to spin it. With pseudosciences the problem is more structural. They have been designed, wittingly or otherwise, to make failure impossible. That is why, to their adherents, they are so mesmerizing. They are compatible with everything that happens. But that also means they cannot learn from anything.

  This hints, in turn, at a subtle difference between confirmation and falsification. Science has often been regarded as a quest for confirmation. Scientists observe nature, create theories, and then seek to prove them by amassing as much supporting evidence as possible. But we can now see that this is only a part of the truth. Science is not just about confirmation; it is also about falsification. Knowledge does not progress merely by gathering confirmatory data, but by looking for contradictory data.

  Take the hypothesis that water boils at 100°C. This seems true enough. But, as we now know, the hypothesis breaks down when water is boiled at altitude. By finding the places where a theory fails, we set the stage for the creation of a new, more powerful theory: a theory that explains both why water boils at 100°C at ground level and why it boils at a different temperature at altitude. This is the stuff of scientific progress.

  This also reveals a subtle asymmetry between confirmation and falsification, between success and failure. If you are careful enough to limit your observations to low altitudes and open containers, you could find countless instances where water does indeed boil at 100°C. But none of this successful “evidence” would have expanded our knowledge very much. Indeed, in one sense, it would not have even increased the probability of the assertion “water always boils at 100°C.”8

  This point was originally made by the Scottish philosopher David Hume in the eighteenth century, and popularized recently by Nassim Nicholas Taleb, the essayist and scholar.9 Taleb has pointed out that you could observe a million white swans, but this would not prove the proposition that all swans are white. The observation of a single black swan, on the other hand, would conclusively demonstrate its falsehood.
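
  The same asymmetry can be put in the notation of elementary logic; this is a sketch for illustration rather than anything in the original argument, with W(x) standing for “x is white” and s_1, . . . , s_n for the observed swans. No finite run of confirming observations entails the universal claim, while a single counterexample refutes it:

  \[
    W(s_1) \wedge W(s_2) \wedge \cdots \wedge W(s_n) \;\nRightarrow\; \forall x\, W(x),
    \qquad
    \neg W(s_{n+1}) \;\Rightarrow\; \neg\, \forall x\, W(x)
  \]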

  Failure, then, is hardwired into both the logic and spirit of scientific progress. Mankind’s most successful discipline has grown by challenging orthodoxy and by subjecting ideas to testing. Individual scientists may sometimes be dogmatic but, as a community, scientists recognize that theories, particularly those at the frontiers of our knowledge, are often fallible or incomplete. It is by testing our ideas, subjecting them to failure, that we set the stage for growth.

  Aviation is different from science but it is underpinned by a similar spirit. After all, an airplane journey represents a kind of hypothesis: namely, that this aircraft, with this design, these pilots, and this system of air traffic control, will reach its destination safely. Each flight represents a kind of test. A crash, in a certain sense, represents a falsification of the hypothesis. That is why accidents have a particular significance in improving system safety, rather as falsification drives science.

  What is true at the level of the system also has echoes at the level of the individual. Indeed, this framework explains one of the deepest paradoxes in modern psychology. It is well known that experts with thousands of hours of practice can perform with almost miraculous accuracy. Chess grandmasters can identify a strong move almost instantly; top tennis players can predict where the ball is going before their opponent has even hit it; experienced pediatric nurses can make almost instant diagnoses, which are invariably confirmed by later testing.

  These individuals have practiced not for weeks or months, but often for years. They have slowly but surely built up intuitions that enable them to perform with remarkable accuracy. These findings have led to the conclusion that expertise is, at least in part, about practice (the so-called 10,000-hour rule). Not everyone has the potential to become world champion, but most people can develop mastery with training and application.*

  But further studies seemed to contradict this finding. It turns out that there are many professions where practice and experience do not have any effect. People train for months and sometimes years without improving at all. Research on psychotherapists, for instance, finds that trainees obtain results that are as good as those of licensed “experts.” Similar results have been found with regard to college admissions officers, personnel selectors, and clinical psychologists.* 10

  Why is this? How can experience be so valuable in some professions but almost worthless in others?

  To see why, suppose that you are playing golf. You are out on the driving range, hitting balls toward a target. You are concentrating, and every time you fire the ball wide you adjust your technique in order to get it closer to where you want it to go. This is how practice happens in sport. It is a process of trial and error.

  But now suppose that instead of practicing in daylight, you practice at night—in the pitch-black. In these circumstances, you could practice for ten years or ten thousand years without improving at all. How could you progress if you don’t have a clue where the ball has landed? With each shot, it could have gone long, short, left, or right. Every shot has been swallowed by the night. You wouldn’t have any data to improve your accuracy.

  This metaphor solves the apparent mystery of expertise. Think about being a chess player. When you make a poor move, you are instantly punished by your opponent. Think of being a clinical nurse. When you make a mistaken diagnosis, you are rapidly alerted by the condition of the patient (and by later testing). The intuitions of nurses and chess players are constantly checked and challenged by their errors. They are forced to adapt, to improve, to restructure their judgments. This is a hallmark of what is called deliberate practice.

  For psychotherapists things are radically different. Their job is to improve the mental functioning of their patients. But how can they tell when their interventions are going wrong or, for that matter, right? Where is the feedback? Most psychotherapists gauge how their clients are responding to treatment not with objective data, but by observing them in clinic. But these data are highly unreliable. After all, patients might be inclined to exaggerate how well they are to please the therapist, a well-known issue in psychotherapy.

  But there is a deeper problem. Psychotherapists rarely track their clients after therapy has finished. This means that they do not get any feedback on the lasting impact of their interventions. They have no idea if their methods are working or failing—if the client’s long-term mental functioning is actually improving. And that is why the clinical judgments of many practitioners don’t improve over time. They are effectively playing golf in the dark.11

  Or take radiologists, who try to identify breast tumors by examining low-dose X-rays known as mammograms. When they diagnose a malignancy they obtain feedback on whether they are right or wrong only after exploratory surgery is undertaken sometime later. But by then they may have largely forgotten the reasons for the original diagnosis and become preoccupied by new cases. Feedback, when delayed, is considerably less effective in improving intuitive judgment.*

  But more seriously, suppose that the doctor fails to diagnose a malignancy and the patient goes home, relieved. If, some months or years later, this diagnosis turns out to be mistaken and the cancer has developed, the radiologist may never find out about his original mistake. That means that radiologists can’t learn from the error. This explains, in part, why junior doctors learn so slowly, gradually approaching, but rarely exceeding, 70 percent diagnostic accuracy.12

  If we wish to improve the judgment of aspiring experts, then we shouldn’t just focus on conventional issues like motivation and commitment. In many cases, the only way to drive improvement is to find a way of “turning the lights on.” Without access to the “error signal,” one could spend years in training or in a profession without improving at all.

  In the case of radiologists, imagine a training system in which students have access to a library of digitized mammograms for which the correct diagnoses have already been confirmed. Students would be able to make diagnoses on an hour-by-hour basis and would receive instant feedback about their judgments. They would fail more, but this is precisely why they would learn more. The library of mammograms could also be indexed to encourage the student to examine a series of related cases to facilitate detection of some critical feature or type of tumor.13

  And this takes us back to science, a discipline that has also learned from failure. Just look at the number of scientific theories that have come and gone: the emission theory of vision, Ptolemy’s law of refraction, the luminiferous aether theory, the hollow earth theory, the planetary model of the atom, the caloric doctrine, the phlogiston theory, the miasma theory of disease, the doctrine of maternal impression, and dozens more.

  Some of these theories were, in practical terms, not much better than astrology. But the crucial difference is that they made predictions that could be tested. That is why they were superseded by better theories. They were, in effect, vital stepping stones to the successful theories we see today.

  But notice one final thing: students don’t study these “failed” scientific theories anymore. Why would they? There is a lot to learn in science without studying all the ideas that have been jettisoned over time. But this tendency creates a blind spot. By looking only at the theories that have survived, we don’t notice the failures that made them possible.

  This blind spot is not limited to science; it is a basic property of our world and it accounts, to a large extent, for our skewed attitude to failure. Success is always the tip of an iceberg. We learn the theories currently in vogue, we fly in astonishingly safe aircraft, we marvel at the virtuosity of true experts.

  But beneath the surface of success—outside our view, often outside our awareness—is a mountain of necessary failure.

  III

  In 2002, Dr. Gary S. Kaplan, the recently appointed chief executive of the Virginia Mason Health System in Seattle, visited Japan with fellow executives. He was keen to observe organizations outside health care in action: anything that might challenge his assumptions and those of his senior team.

  It was while at the Toyota plant that he had a revelation. Toyota has a rather unusual production process. If anybody on the production line is having a problem or observes an error, that person pulls a cord that halts production across the plant.

  Senior executives rush over to see what has gone wrong and, if an employee is having difficulty performing her job, she is helped as needed by executives. The error is then assessed, lessons learned, and the system adapted. It is called the Toyota Production System, or TPS, and is one of the most successful techniques in industrial history.

  “The system was about cars, which are very different from people,” Kaplan says when we meet for an interview. “But the underlying principle is transferable. If a culture is open and honest about mistakes, the entire system can learn from them. That is the way you gain improvements.”

  Kaplan has bright eyes and a restless curiosity. As he talks, his hands move animatedly. “We introduced the same kind of system in Seattle when I returned from Japan,” he says. “We knew that medical errors cost thousands of lives across America and we were determined to reduce them.”

  One of his key reforms was to encourage staff to make a report whenever they spotted an error that could harm patients. It was almost identical to the reporting system in aviation and at Toyota. He instituted a twenty-four-hour hotline as well as an online reporting system. He called them Patient Safety Alerts.

  The new system represented a huge cultural shift for staff. Mistakes were frowned on at Virginia Mason, just like elsewhere in health care. And because of the steep hierarchy, nurses and junior doctors were fearful of reporting senior colleagues. To Kaplan’s surprise and disappointment, few reports were made. An enlightened innovation had bombed due to a conflict with the underlying culture.*

  As Cathie Furman, who served as senior vice president for Quality, Safety and Compliance at Virginia Mason for fourteen years, put it: “In health care around the world the culture has been one of blame and hierarchy. It [can prove] very difficult to overcome that.”14

  But in November 2004 everything changed at Virginia Mason. Mary McClinton, sixty-nine, a mother of four, died after she was inadvertently injected with a toxic antiseptic called chlorhexidine, instead of a harmless marker dye, during a brain aneurysm operation. The two substances had been placed side by side in identical stainless-steel containers and the syringe had drawn from the wrong one.15 One of her legs was amputated and she died from multiple organ failure nineteen days later.

  Gary Kaplan responded not by evading or spinning, but by publishing a full and frank apology—the opposite of what happened after the death of Elaine Bromiley. “We just can’t say how appalled we are at ourselves,” it read. “You can’t understand something you hide.” The apology was welcomed by relatives and helped them to understand what had happened to a beloved family member.

  But the death provided something else, too: a wake-up call for the 5,500 staff members at Virginia Mason. “It was a tough time, but the death was like a rallying cry,” Kaplan says. “It gave us the cultural push we needed to recognize how serious an issue this is.”

  Suddenly, Patient Safety Alerts started to fly in. Those who reported mistakes were surprised to learn that, except in situations in which they had been clearly reckless, they were praised, not punished. Dr. Henry Otero, an oncologist, made a report after being told by a colleague that he had failed to spot the low magnesium level of a patient. “I missed it,” he told a newspaper. “I didn’t know how I missed it. But I realized it’s not about me, it’s about the patient. The process needs to stop me making a mistake. I need to be able to say, ‘I might be the reason, fix me.’”16

  Today, there are around a thousand Patient Safety Alerts issued per month at Virginia Mason. A report by the U.S. Department of Health found that these have uncovered latent errors in everything from prescription to care. “After a pharmacist and nurse misinterpreted an illegible pharmacy order, leading to patient harm, the medical center developed a step-by-step protocol that eliminates the likelihood of such incidents occurring,” the report said.

  Another alert warned about wristbands: “After a newly admitted patient received a color-coded wristband signifying ‘Do Not Resuscitate’ instead of one indicating drug allergies (as a result of a nurse being color blind), the medical center added text to the wristbands.”

  In 2002, when Kaplan became CEO, Virginia Mason was already a competent Washington hospital. In 2013, however, it was rated as one of the safest hospitals in the world. In the same year, it won the Distinguished Hospital Award for Clinical Excellence, the Outstanding Patient Experience Award, and was named a Top Hospital by the influential Leapfrog Group for the eighth successive year. Since the new approach was adopted, the hospital has seen a 74 percent reduction in liability insurance premiums.17

  This success is not a one-off or a fluke; it is a method. Properly instituted learning cultures have transformed the performance of hospitals around the world. Claims and lawsuits made against the University of Michigan Health System, for example, dropped from 262 in August 2001 to 83 in 2007 following the introduction of an open disclosure policy.18 The number of malpractice claims against the University of Illinois Medical Center fell by half in two years after the creation of an open-reporting system.19

  The example of the Virginia Mason system reveals a crucial truth: namely, that learning from mistakes has two components. The first is a system. Errors can be thought of as the gap between what we hoped would happen and what actually did happen. Cutting-edge organizations are always seeking to close this gap, but in order to do so they have to have a system geared up to take advantage of these learning opportunities. This system may itself change over time: most experts are already trialing methods that they hope will surpass the Toyota Production System. But each system has a basic structure at its heart: mechanisms that guide learning and self-correction.

  The second component is a mindset, because an enlightened system on its own is sometimes not enough. Even the most beautifully constructed system will not work if professionals do not share the information that enables it to flourish. In the beginning at Virginia Mason, the staff did not file Patient Safety Alerts. They were so fearful of blame and reputational damage that they kept the information to themselves. Mechanisms designed to learn from mistakes are impotent in many contexts if people won’t admit to them. It was only when the mindset of the organization changed that the system started to deliver amazing results.

  Think back to science. Science has a structure that is self-correcting. By making testable predictions, scientists are able to see when their theories are going wrong, which, in turn, hands them the impetus to create new theories. But if scientists as a community ignored inconvenient evidence, or spun it, or covered it up, they would achieve nothing.

  Science is not just about a method, then; it is also about a mindset. At its best, it is driven forward by a restless spirit, an intellectual courage, a willingness to face up to failures and to be honest about key data, even when it undermines cherished beliefs. It is about method and mindset.