You can even see the basic contours of this perspective in the modern history of artificial intelligence. When the chess grand-master Garry Kasparov was defeated by Deep Blue in the famous “victory of the machine” match in 1997, it created a storm. The popular interpretation was “computers are better than humans!”
In fact, the real surprise was that Kasparov came so close. Humans can only search three or so moves per second. Deep Blue could search two hundred million moves per second. It was designed to look deep into the various possibilities. But, crucially, it could not search every possibility due to the vast number of permutations (chess is characterized by a certain kind of complexity). Moreover, although it had been preprogrammed with a great deal of chess knowledge, it couldn’t learn from its own mistakes as it played the games.
This gave Kasparov a fighting chance, because he had something the computer largely lacked: practical knowledge developed through trial and error. He could look at the configuration of pieces on a board, recognize its meaning based upon long experience, and then instantly select moves. It was this practical knowledge which almost propelled him to victory despite a formidable computational deficit. Deep Blue won the series three and a half to two and a half.
But artificial intelligence has moved on since then.7 One of the vogue ideas is called temporal difference learning. When designers created TD-Gammon, a program to play backgammon, they did not provide it with any preprogrammed backgammon knowledge or capacity to conduct deep searches. Instead, it made moves, predicted what would happen next, and then looked at how wide of the mark its expectations had been. That enabled it to update its expectations, which it took into the next game.
In effect, TD-Gammon was a trial-and-error program. It was left to play day and night against itself, developing practical knowledge. When it was let loose on human opponents, it defeated the best in the world. The software that enabled it to learn from error was sophisticated, but its main strength was that it didn’t need to sleep, so it could practice all the time.
In other words, it had the opportunity to fail more often.
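To make that learning loop concrete, here is a minimal sketch of temporal difference learning in Python. It is only an illustration, not TD-Gammon itself: the toy “race to 10” dice game, the table of values, and the constants GOAL and ALPHA are assumptions standing in for real backgammon positions and the neural network the original program used.

    import random

    # Toy stand-in for backgammon: two players take turns rolling a die
    # and advancing; the first to reach GOAL wins.  The state is
    # (player 0's position, player 1's position, whose turn it is).
    GOAL = 10
    ALPHA = 0.1   # learning rate: how far each prediction is nudged

    V = {}        # V[state] = learned estimate that player 0 wins from here

    def value(state):
        return V.get(state, 0.5)   # unseen positions start as a coin flip

    def play_one_game():
        """Self-play one game, updating predictions after every move (TD(0))."""
        state = (0, 0, 0)
        while True:
            p0, p1, turn = state
            roll = random.randint(1, 6)
            if turn == 0:
                next_state = (min(p0 + roll, GOAL), p1, 1)
            else:
                next_state = (p0, min(p1 + roll, GOAL), 0)

            # The "target" is the final result if the game just ended,
            # otherwise the prediction made from the new position.
            if next_state[0] == GOAL:
                target = 1.0
            elif next_state[1] == GOAL:
                target = 0.0
            else:
                target = value(next_state)

            # Temporal-difference update: compare the old expectation with
            # the newer, better-informed one, and nudge it toward the latter.
            V[state] = value(state) + ALPHA * (target - value(state))

            if next_state[0] == GOAL or next_state[1] == GOAL:
                return
            state = next_state

    # "Left to play day and night against itself."
    for _ in range(50_000):
        play_one_game()

    print(round(value((0, 0, 0)), 2))   # learned chance that the first mover wins

The whole method lives in the single update line: the program is never told the true worth of a position; it corrects each expectation using its own later expectations and the eventual result, which is why every extra game, and every error within it, makes it better.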
III
Before we go on to look at what all this means in practice, and how we might harness the evolutionary process in organizations and in our lives, let us deal with a question that immediately arises: isn’t it just obvious that we should test our assumptions if there is a cost-effective way of doing so? Why would any business leader, politician, or, indeed, sports team do otherwise?
It turns out, however, that there is a profound obstacle to testing, a barrier that prevents many of us from harnessing the upsides of the evolutionary process. It can be summarized simply, although the ramifications are surprisingly deep: we are hardwired to think that the world is simpler than it really is. And if the world is simple, why bother to conduct tests? If we already have the answers, why would we feel inclined to challenge them?
This tendency to underestimate the complexity around us is now a well-studied aspect of human psychology and it is underpinned, in part, by the so-called narrative fallacy. This term was coined by the philosopher Nassim Nicholas Taleb and has been studied by the Nobel Prize–winner Daniel Kahneman: it refers to our propensity to create stories about what we see after the event.
You see the narrative fallacy in operation when an economist pops up on the early-evening news and explains why the markets moved in a particular direction during the day. His arguments are often immaculately presented. They are intuitive and easy to follow. But they raise a question: if the market movements are so easy to understand, why was he unable to predict them in advance? Why is he generally playing catch-up?
Another example of the narrative fallacy comes from sports punditry. In December 2007, Fabio Capello, an Italian, became head coach of the England soccer team. He was a disciplinarian. He ordered players to arrive at meetings five minutes early, clamped down on cell phones, and even banned tomato ketchup in the cafeteria. These actions were highly visible and well reported. This is what psychologists call “salience.” And the results on the pitch were, at the outset, very good.
Rather like the economists on the early evening news, soccer journalists began to tell a simple and convincing story as to why the team was doing well: it was about Capello’s authoritarian manner. His methods were eulogized. Finally, a coach who was willing to give the players a kick up the rear! At last, a coach who has provided discipline to those slackers! One flattering headline read: “The Boss!”
But at the 2010 FIFA World Cup, the biggest competition in the sport, England bombed. They limped through the group stage before being decisively eliminated with a 4–1 defeat by Germany. Almost instantly the narrative flipped. Capello is too tough! He is taking the fun out of the game! The Italian is treating our players like children! Many soccer journalists didn’t even notice that they had attempted to explain contradictory effects with the same underlying cause.
That is the power of the narrative fallacy. We are so eager to impose patterns upon what we see, so hardwired to provide explanations that we are capable of “explaining” opposite outcomes with the same cause without noticing the inconsistency.
In truth, England’s soccer results were caused not by the salient features of Capello’s actions, but by myriad factors that were not predictable in advance. That is why soccer journalists who are brilliant at explaining why teams won or lost after the event are not much better than amateurs at predicting who is going to win or lose beforehand. Daniel Kahneman has said:
Narrative fallacies arise inevitably from our continuous attempt to make sense of the world. The explanatory stories that people find compelling are simple; are concrete rather than abstract; assign a larger role to talent, stupidity, and intentions than to luck; and focus on a few striking events that happened rather than on the countless events that failed to happen. Any recent salient event is a candidate to become the kernel of a causal narrative.8
But think about what this means in practice. If we view the world as simple, we are going to expect to understand it without the need for testing and learning. The narrative fallacy, in effect, biases us toward top-down rather than bottom-up. We are going to trust our hunches, our existing knowledge, and the stories that we tell ourselves about the problems we face, rather than testing our assumptions, seeing their flaws, and learning.
But this tendency, in turn, changes the psychological dynamic of organizations and systems. The greatest difficulty that many people face, as we have seen, is in admitting to their personal failures, and thus learning from them. We have looked at cognitive dissonance, which becomes so severe that we often reframe, spin, and sometimes even edit out our mistakes.
Now think of the Unilever biologists. They didn’t regard the rejected nozzles as failures because they were part and parcel of how they learned. All those rejected designs were regarded as central to their strategy of cumulative selection, not as an indictment of their judgment. They knew they would have dozens of failures and were therefore not fazed by them.
But when we are misled into regarding the world as simpler than it really is, we not only resist testing our top-down strategies and assumptions, we also become more defensive when they are challenged by our peers or by the data. After all, if the world is simple, you would have to be pretty stupid not to understand it.
Think back to the divide between aviation and health care. In aviation there is a profound respect for complexity. Pilots and system experts are deeply aware that they are dealing with a world they do not fully understand, and never will. They regard failures as an inevitable consequence of the mismatch between the complexity of the system and their capacity to understand it.
This reduces the dissonance of mistakes, increases the motivation to test assumptions in simulators and elsewhere, and makes it “safe” for people to speak up when they spot issues of concern. The entire system is about preventing failure, about doing everything possible to stop mistakes happening, but this runs alongside the sense that failures are, in a sense, “normal.”
In health care, the assumptions are very different. Failures are seen not as an inevitable consequence of complexity, but as indictments of those who make them, particularly among senior doctors whose self-esteem is bound up with the notion of their infallibility. It is difficult to speak up about concerns, because powerful egos come into play. The consequence is simple: the system doesn’t evolve.
Now, let us take these insights into the real world and, in particular, the rapidly growing industry of high technology.
IV
Drew Houston was getting frustrated. A young computer programmer from Massachusetts, he had a creative idea for a high-tech start-up. It was an online file-sharing and storage service that would seamlessly upload files and replicate them across all of a user’s computers and devices.
Houston thought of the idea while traveling on a bus from Boston to New York. He opened his laptop but realized he had forgotten his flash drive, which meant that he couldn’t do the work he wanted to. “I had a big list of things I wanted to get done. I fished around in my pockets only to find out I’d forgotten my thumb drive,” he said. “I was like: ‘I never want to have this problem again.’”9
He was so annoyed with himself that he started to write some code that would remove the need for a flash drive. Then he realized that this was something everyone could benefit from. “This wasn’t a problem unique to me; it was a problem that everyone faced. As a product, it might really sell,” he said.
Houston toured venture capital companies but they kept raising the same issue. The market for storage and file sharing was already pretty crowded. Houston explained that these alternative products were rarely used because they were clunky and time-consuming. A more streamlined product would be different, he said. But he couldn’t get through.
“It was a challenge to raise our first money because these investors would say: ‘There are a hundred of these storage companies. Why does the world need another one of them?’ I would respond with: ‘Yes, there are a lot of these companies out there, but do you use any of them?’ And invariably, they would say: ‘Well, no.’”
Houston was clever enough to know that his product wasn’t a guaranteed winner. Predicting whether consumers will actually buy a product is often treacherous. But he was quietly confident and wanted to give it a go. However, after a year he wondered if he would ever get a shot. He was close to desperate.
• • •
Let us leave Houston for a moment or two and look at two other tech entrepreneurs, Andre Vanier and Mike Slemmer, who were grappling with a different problem. They had an idea for a free online information service called 1-800-411-SAVE. Unlike Houston, they had the money to develop the software. But they had very different ideas about how to write the code, as the author Peter Sims reveals in his book Little Bets.10
Vanier, a former consultant with McKinsey, thought they should spend plenty of time in the office getting the software absolutely right, so that it was capable of supporting all the millions of users they hoped to attract. He believed that the people at the company had great ability and, given time, would come up with bug-free and efficient software. This is the old perspective on development, with its emphasis on rigorous top-down planning.
Slemmer had a different view. He had already started two tech companies and realized something profound: it is pretty much impossible to come up with perfect code the first time around. It is only when people are using the software, putting it under strain, that you see the bugs and deficiencies you could never have anticipated. By putting the code out there and subjecting it to trial and error you learn the insights that create progress. Why, he asked Vanier, would you try to answer every question before you have a single user?
The debate between Slemmer and Vanier echoes the contrast between the biologists and mathematicians at Unilever (and, at a higher level of abstraction, between Kealey’s idea of progress and those who think progress always emerges from theoretical advance): it pits top-down against bottom-up. Vanier wanted to get everything right via a blueprint, while Slemmer wanted to test early and then iterate rapidly while receiving feedback from consumers, thus developing new insights. He wanted to test his assumptions.
Slemmer’s arguments won out. The company got the software out at an early stage of development, and rapidly learned the inevitable flaws in their pre-market reasoning. They had to rewrite large sections, learning new insights that increased in direct proportion to the growing user base. Ultimately they developed arguably the most sophisticated software in the industry.
“Although they competed against substantially larger, better-resourced companies . . . they were consistently first to identify new features and services such as driving directions and integrated web-phone promotional offers,” Sims, who followed the company’s progress, has written. “As Vanier explains, if he can launch ten features in the same time it takes a competitor to launch one, he’ll have ten times the amount of experience to draw from in figuring out what has failed the test of customer acceptance and what has succeeded.”11
This story hints at the dangers of “perfectionism”: of trying to get things right the first time. The story of Rick, a brilliant computer scientist living in Silicon Valley, will highlight the problem even more starkly.
Rick had the idea of creating a Web service that would allow people to post simple text articles online. He had this idea well before the blogging revolution. He could sense the potential and worked on it fifteen hours a day. Soon he had a working prototype. But instead of giving consumers a chance to use it, perceive its weaknesses, and then make changes, he decided the software would run more efficiently if he could design a more sophisticated programming language. He spent the next four years designing this new language. It proved disastrous. Two psychologists, Ryan Babineaux and John Krumboltz, have written:
Over the next four years, he got more and more mired in technical details and lost sight of his original idea. Meanwhile, other entrepreneurs began to build blogging platforms that were neither perfect nor technologically advanced. The difference was that they quickly put their flawed efforts out into the world for others to try. In doing so, they received crucial feedback, evolved their software, and made millions of dollars.12
The desire for perfection rests upon two fallacies. The first resides in the miscalculation that you can create the optimal solution sitting in a bedroom or ivory tower and thinking things through rather than getting out into the real world and testing assumptions, thus finding their flaws. It is the problem of valuing top-down over bottom-up.
The second fallacy is the fear of failure. Earlier on we looked at situations where people fail and then proceed to either ignore or conceal those failures. Perfectionism is, in many ways, more extreme. You spend so much time designing and strategizing that you don’t get a chance to fail at all, at least until it is too late. It is pre-closed loop behavior. You are so worried about messing up that you never even get on the field of play.
In their book Art and Fear David Bayles and Ted Orland tell the story of a ceramics teacher who announced on the opening day of class that he was dividing the students into two groups. Half were told that they would be graded on quantity. On the final day of term, the teacher said he would come to class with some scales and weigh the pots they had made. They would get an “A” for 50 lbs of pots, a “B” for 40 lbs, and so on. The other half would be graded on quality. They just had to bring along their one, perfect pot.
The results were emphatic: the works of highest quality were all produced by the group graded for quantity. As Bayles and Orland put it: “It seems that while the ‘quantity’ group was busily churning out piles of work—and learning from their mistakes—the ‘quality’ group had sat theorizing about perfection, and in the end had little more to show for their efforts than grandiose theories and a pile of dead clay.”13
You see this in politics, too. Politicians come up with theories (bordering on ideologies) about whether, say, wearing school uniform improves discipline. They talk to psychologists and debate the issue in high-level meetings. It is an elaborate, top-down waste of time. They end up with dead clay. They should conduct a test, see what works and what doesn’t. They will fail more, but that is precisely why they will learn more.
Babineaux and Krumboltz, the two psychologists, have some advice for those who are prone to the curse of perfectionism. It involves stating the following mantras: “If I want to be a great musician, I must first play a lot of bad music.” “If I want to become a great tennis player, I must first lose lots of tennis games.” “If I want to become a top commercial architect known for energy-efficient, minimalist designs, I must first design inefficient, clunky buildings.”
The notion of getting into the trial-and-error process early informs one of the most elegant ideas to have emerged from the high-tech revolution: the lean start-up. This approach contains a great deal of jargon, but is based upon a simple insight: the value of testing and adapting. High-tech entrepreneurs are often brilliant theorists. They can perform complex mathematics in their sleep. But the lean start-up approach forces them to fuse these skills with what they can discover from failure.
How does it work? Instead of designing a product from scratch, techies attempt to create a “minimum viable product” or MVP. This is a prototype with sufficient features in common with the proposed final product that it can be tested on early adopters (the kind of consumers who buy products early in the life cycle and who influence other people in the market).