Reactions to “We Have Learned Nothing”

Colossus recently published my essay on the seeming ineffectiveness of the startup methods introduced over the last 20-odd years. It got a lot of reaction. Some of it was emotional, as you’d expect: people have devoted a lot of time and energy assuming things like Lean Startup and customer interviewing work; of course they would reflexively reject criticism. But much of it was thoughtful and some of it was illuminating. I’ve collected the criticisms and solicited more from some of the smartest people I know.1

First, some context. This essay was not what I set out to write. My previous essay for Colossus had done pretty well and Jeremy, the editor there, asked what I was thinking about writing next. I always have a list of four or five things I am thinking about. Top of that list was the changing locus of technological revolutions and the preconditions for where the next would be.2 This is boring, but I’m curious. Bottom of the list was a criticism of scientism in entrepreneurship. I did not have a thesis about this—other than that I think entrepreneurs should act more like engineers than scientists and that the question of what an epistemology of entrepreneurship would look like tied in to some of my other hobby horses, like tinkering, non-analytical decision-making, and the difference between theory and practice—so I was unprepared to write it. But I unwisely sent the entire list over and Jeremy, who is clearly more on top of the zeitgeist than I am, plucked it from the bottom.

Having trapped myself into writing something I was unprepared to write, I decided to look at the data about trends in how entrepreneurs are taught and what they do to make sure I was writing about a real thing. I looked at the evolution of entrepreneurship textbook contents over the last fifty years and, while it was hard to say whether entrepreneurs are urged to be more “scientific” over that time, the change in emphasis overall was striking, if not surprising. Starting sometime in the 2000s, entrepreneurship pedagogy moved from small business management + innovation to method-based frameworks like lean and customer development. Before looking to rate the scientism of all these, I wanted to see how much they had changed outcomes.

When I took the BLS startup failure rate data and looked at it slightly differently than the way it is presented on their site, I was shocked.

What this chart shows is that the new methods adopted over the past 20 years or so have had no effect on the success rates of startup companies. I include lean, customer development, the business model canvas, design thinking, and effectuation in what I called the “new punditry.” I did not anticipate this and I did not welcome it. I had been teaching entrepreneurship at Columbia for 15 years, more than 1000 students, and I had taught all of them these methods. If they didn’t work, I had wasted their time and my own.

This is why this topic is important. A lot of money and time is spent teaching these methods. Not just in universities, but in government programs, inside corporations, at incubators, and by investors. If they don’t work, it’s all wasted. My essay drew attention not because it was new information or very good or, even, probably correct, but because the question of whether they work or not had already been simmering under the surface. What I hoped for was that people would start talking about it in the open because, as Richard Feynman said, “We are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress.”3

I make some recommendations in the article:

  1. Base your ideas on some data. No one should believe a method works without some proof. Sounding reasonable is not enough.
  2. Figure out what works, build on that.
  3. Strategy is being different.

None of this is surprising. If you strip the narrative out of the article, it’s all pretty commonplace. And yet the reaction has been mixed. Some people refuse to engage and just tell me I don’t know what I’m talking about. These are, as Upton Sinclair put it, people whose salary depends on them not understanding it. Others said they knew it all along, which is another way of not engaging.

But there have been nuanced takes for and against. In support: academics have long known there was no evidence that these things worked in the real world. I think the article was left for me to write because it is not always a great career move for an academic to spit into the wind. The specific criticisms are worth talking about in more detail because, while some of them are bad, some of them are valid.


Bad data

The criticism: You’ve used the wrong data or the data is wrong or incomplete.

Not valid: The data is too broad because that chart looks at all startups, not just tech startups. Or, the data is too narrow. Or, my company used these methods and they worked. First off, I agree. That’s why the first thing I say in the essay is that we need more data. If we are going to invest so much in this, we should see if they are working. I could also argue about anecdotes and that even if the data is too broad there should be some effect. But the bottom line is that the null hypothesis is that these methods don’t work, and a flat curve is consistent with that.

Also not valid: This data doesn’t prove they don’t work. I got this a lot, and it was truly confusing. How could you prove they don’t work? I mean, sometimes people use them and they succeed. Sometimes people use them and they don’t. On average, the success rate stays the same. What do the people advancing this criticism propose we do? Not teach the methods when they will end up not working? If so, then I agree! Let’s go gather the data to see where they work and where they don’t. (But I don’t think this is what they were saying.)

Valid: The data is weird. It is too flat. Something is going on there that isn’t fully explained. This is true, and it really bugs me. Why is the curve so flat? Morten Ansteensen, who invited me to speak at NTNU, ran the numbers for startups in Norway, both all startups and only tech startups, and those curves were also flat and at a very similar failure rate as in the US. Why? Very few things are stable in the economy over that period of time. There is some mechanism here.

Also valid: There is some countervailing data. Repeat founders seem to do better, so entrepreneurs are learning something. What, exactly, is unclear. This, again, deserves some research.


Wrong definition of success

Not valid: Success is failing faster, or learning faster. Those are valid definitions of success, but they are hard to measure directly. I thought about this (and several other ways to define success) as I was writing and, in the end, realized that these would show up in the success data. Most entrepreneurs don’t use lean to discover an idea won’t work and then shut their company down and start another; they pivot. This means that if the method works, the success rate of companies should increase because they jettison a bad idea before it bankrupts them.

Also not valid: Success is bigger outcomes. I also looked at this. There definitely are bigger outcomes than there used to be, but by looking at the power law distribution of outcomes (which hasn’t changed) it seems this is a result of more draws from the distribution, not a change in its shape. This seemed like a point that would be difficult to explain in magazine format, although I would be amused to be the first to publish some calculus in Colossus. The main intuition is that if results are power law distributed, then the more draws you make from the distribution the higher the largest draw will be, because power law distributions have fat tails. So when more companies are started, the biggest outcome (and even the average outcome) will be bigger than when fewer companies are started. The math is here, if you’re interested: Power Laws in Venture.
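The more-draws intuition is easy to check with a toy simulation. A minimal sketch, assuming a hypothetical Pareto distribution with tail index 1.5 (the specific parameters are illustrative, not from the essay’s data):

```python
import random

def pareto_draws(n, alpha=1.5, seed=0):
    """n samples from a Pareto(alpha) distribution via inverse-CDF sampling."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

# The distribution never changes; only the number of draws does.
few_startups = max(pareto_draws(100))      # biggest outcome among 100 "companies"
many_startups = max(pareto_draws(10_000))  # biggest outcome among 10,000 "companies"
# With a shared seed the 10,000 draws extend the first 100, so the
# maximum can only grow as more companies are started -- bigger
# outcomes without any change in the shape of the distribution.
```

Running this repeatedly with different seeds shows the same pattern: the maximum grows with the number of draws, which is the fat-tail point in the paragraph above.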

Valid: Maybe success means more companies, more entrepreneurs, more or bigger innovation. This is a good point. What is it we want, exactly? As a society we want more entrepreneurs, more startups, more innovation, and more progress. I taught this in my first class every semester. The students, though, were interested but eager to move on to how they could succeed as entrepreneurs. If what we are doing is not trying to make entrepreneurs more successful but, rather, ourselves more successful, well, I don’t think that’s wrong. But it also shouldn’t be a secret agenda.


Confounding factors

Not valid: The data shows no effect, but that’s because the methods were offset by [fill in the blank]: more entrepreneurs lowering the quality, less fundamental innovation, macroeconomic factors. The problem with this objection is that the curve is flat. Flat over thirty years, through expanding and contracting founder pools, through booms and busts. Also, I think the “more entrepreneurs means lower average quality of entrepreneur” objection sounds good but fundamentally misunderstands what is happening in startupland. There is no test showing entrepreneurial aptitude that we are skimming the cream from; we don’t really know why some people are better than others at entrepreneurship. People self-select in primarily because they find they can’t work for someone else. You’d really have to contort your brain to think this has some innate correlation to the skills needed to build a large successful company; common sense suggests the opposite. Best case, we’re simply drawing randomly from a pool of people, so you’d expect the quality to remain constant.

Also not valid: Don’t reject the methods, just amend them. This is epicycle reasoning. You can fix any theory by adding new rules for each exception, making it more and more convoluted.

Valid: The world is genuinely messy, and I personally wouldn’t know where to begin with an RCT given the confounding factors. And, as mentioned, the flatness is genuinely strange. If there is some sort of autocorrection going on, then it is an overriding confounding factor. These methods might work 100% of the time but the limiting mechanism then kicks in. I don’t think this is happening, but there’s definitely something here I don’t understand.


Prevalence of methods

Not valid: People know of them but don’t use them. I tried to address this by showing that people know of them. But yes, there is data that shows that most people don’t use them. This raises a puzzle: if people know them and they work, why would people not use them? They’re not hard to use, certainly not if you think the alternative is failure. My guess is that as founders get into it, they realize they’re not applicable to their situation. But this is just a guess. The formal refutation of this criticism is that if even a small percentage of startups use them, that should still show in the data.
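The partial-adoption point is just arithmetic. A sketch with hypothetical numbers (none of these figures come from the essay’s data; they only show the shape of the argument):

```python
# Toy arithmetic: even partial adoption of an effective method
# should move the aggregate success rate.
baseline = 0.50  # hypothetical survival rate without the method
boosted = 0.65   # hypothetical survival rate for adopters, if the method works
adoption = 0.20  # hypothetical share of startups using the method

# Aggregate rate is the adoption-weighted average of the two groups.
aggregate = adoption * boosted + (1 - adoption) * baseline
# aggregate = 0.53: a three-point rise the flat BLS curve should
# register, even with only one startup in five using the method.
```

If the methods worked and even a fifth of founders used them, the aggregate curve would drift upward; it doesn’t.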

Also not valid: People aren’t using them correctly. This is just the No True Scotsman argument.

Valid: I say in the essay that maybe there is no effect because the methods are obvious and everybody already used them. Companies have always talked to customers, tested products and iterated, designed for use. If that argument is true, or if the Red Queen argument is true—that once everyone knows a technique it no longer gives competitive advantage—then you still have to learn these techniques. Either because it will save you time once you are operating your business, or because otherwise you are at a competitive disadvantage. So it could be that they work but don’t raise the overall success rate. This should be discoverable with more research.


Positivist philosophy

Not valid: Using the natural sciences as the analogy is wrong. This sort of broke my brain. Is the objection that they work but aren’t measurable? My follow-up question has been, are you saying that what we are telling entrepreneurs to do does not lead to observable differences in outcomes? Because, if so, what is the point of it in the first place?

Valid: Despite my desire for a measurable way to help entrepreneurs succeed—I’m an engineer, after all—this objection has gotten to me a little. Herbert Simon, one of my intellectual influences, wrote in The Sciences of the Artificial about a way to create a non-positivist approach to science, though not necessarily a non-empirical one. And Saras Sarasvathy, a student of Simon’s, took an approach to entrepreneurship very different from the others, effectuation, that goes down that path. Perhaps it shouldn’t be tarred with the same brush. I’ll come back to that.

I won’t let go of the idea that if you dictate an approach, it’s fair game to try to measure if it works. But it’s possible there’s something here I’m just not getting.


Where do we go from here

1. Not to repeat this over and over, but we need more data. It’s expensive and difficult so there needs to be funding and institutional support. The first step is probably convincing people that what currently exists is not good enough, which is what the essay was trying to do. I don’t have the knowledge or organization to do this. I hope someone does.

2. On the other hand, we have a ton of data already. I personally have thirty years of experience with more than a hundred companies just floating around in my head, and I’m not even an especially prolific investor. The problem with my data is that I have no framework to order and communicate it. It’s not really data at all; it’s just a bunch of facts. Researchers have the complementary problem. The people in the field and the people with the big brains need to talk to each other more. This is partly a cultural problem. I sat down with two academics who had read my book. I brought up some observations from my career. One of them said, “It’s not your job to interpret what you experienced; that’s what we do.” I’m not interested in just being the object of observation. No one thinks harder about the situations I was in than I did. We need real partnerships. On the flip side, practitioners don’t read the academic literature, partly because most of it’s behind a paywall they lost access to the day they graduated. Somehow researchers have to get their ideas into the hands of people who can use them. Practitioners are hungry for thoughtful advice.

3. Which brings me to effectuation. I’ve always admired Sarasvathy. She looked at what entrepreneurs actually do, and told us. That’s good work. It kind of pained me to include her in the article, but I felt I had to, to be intellectually honest. But some people, especially my friend Rob Wuebker at the U of U, have pushed me toward the idea that effectuation is actually doing what I was asking for in the Red Queen argument: that flowcharts for success won’t work, but that effectuation as design patterns for generating diverse strategies might. Of course, effectuation hasn’t moved the line on the chart either. But it was the least-taught method of all the methods I surveyed. And when it is taught, it’s usually in the framework-y “five steps to a successful company” way that removes the necessary nuance. I’m not convinced yet, but I think it deserves more thinking. It’s certainly the best we have right now.

4. Teaching entrepreneurship is sort of funny. I taught it for 15 years without a PhD. This is typical. Places like Columbia have PhDs teaching theoretical work while adjuncts who have been practitioners teach the how-to. This is fine, but it means there is often no overlap or even social connection between the two groups. Entrepreneurship studies should be descriptive, and theoretical, and also somehow help entrepreneurs and VCs do their jobs better. Universities have departments of theoretical physics and departments of applied physics. Maybe we need something like that for entrepreneurship to move theory through the pipeline to use and things learned in practice the other way back to theory.

5. Frank Knight said that the function of the entrepreneur is to bear the uncertainty that others won’t, and that what makes good entrepreneurs good is good judgment—something that can be learned but not taught. That always sounded like mysticism to me, and a lot of my motivation over the years has been trying to find something better. The essay didn’t help: it seems to lead right back to where Knight landed 100 years ago. As a “positivist,” I find this sits poorly with me. I’m still optimistic we can think of something better than judgment.


  1. NTNU asked me to give a talk on the essay because, I think, it brings an interesting debate that has taken place within academia to a wider audience. These notes are the ones I made for that talk. 

  2. If you wish I had written that, read Nicolas Colin: https://www.linkedin.com/posts/nicolas-colin-drift-signal_there-are-two-competing-theses-about-what-share-7456952808110632960-Ll9l?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAAFGHgBBSuW8qJtCsPnlbvGUwYNTyprsFE 

  3. Feynman, The Character of Physical Law, the audiobook, as intended.