## Power Laws in Venture

…The more rightward-skewed the distribution is, whether Pareto-Levy, log normal, or some related form, the more difficult it is to hedge against risk by supporting sizable portfolios of innovation projects. The potential variability of economic outcomes with Pareto-Levy distributions is so great that large portfolio draws from year to year can have consequences for the macroeconomy.[1]

We live in a Normal world. Most phenomena have a central tendency, and things that are not average tend not to be too far from average. Almost all people are within a fairly narrow range of heights; there are a few outliers, but only a few. If you are meeting an American man for the first time, you are confident that most of the time he will be between the heights of 5’7″ and 6’1″, almost all of the time he will be between 5’4″ and 6’4″, and you would be pretty surprised if he were shorter than 5’1″ or taller than 6’7″. Height in a population is normally distributed: it follows a bell curve. US adult men have an average height of 5’10″ with a standard deviation of 3″. If you’re looking for men taller than 7’4″, you’ll find perhaps one for every billion people. A normal curve drops off sharply as you move away from its mean.
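
The height figures above can be checked directly against the normal curve. Here is a minimal Python sketch, assuming the mean of 70 inches (5’10″) and standard deviation of 3 inches quoted in the text; the function names are illustrative and not from any library.

```python
import math

MEAN_IN = 70.0  # 5'10" in inches, as quoted in the text
SD_IN = 3.0     # standard deviation in inches, as quoted in the text

def fraction_taller_than(height_in, mean=MEAN_IN, sd=SD_IN):
    """Upper-tail probability of a normal distribution above height_in."""
    z = (height_in - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def fraction_between(low_in, high_in, mean=MEAN_IN, sd=SD_IN):
    """Share of a normal population with height between low_in and high_in."""
    return fraction_taller_than(low_in, mean, sd) - fraction_taller_than(high_in, mean, sd)

if __name__ == "__main__":
    print(f"5'7\" to 6'1\": {fraction_between(67, 73):.1%}")
    print(f"5'4\" to 6'4\": {fraction_between(64, 76):.1%}")
    print(f"taller than 7'4\": about 1 in {1 / fraction_taller_than(88):,.0f}")
```

Under these assumptions roughly two thirds of men fall in the first range and about 95% in the second. The 7’4″ threshold sits six standard deviations above the mean, which works out to about one in a billion.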

Normal distributions are normal because they are everywhere. And they are everywhere because of a mathematical property called the Central Limit Theorem: a large number of independent, random inputs that all add up to a single outcome results in an approximately normal distribution, regardless of the individual distributions of the inputs (so long as each input has a finite variance).[2] So something like height, determined by a large number of different randomly distributed factors, ends up normally distributed.
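
A small simulation makes the theorem easier to believe. The sketch below is only an illustration: it assumes each outcome is the sum of 50 independent draws from an exponential distribution (a deliberately skewed input chosen for this example), and it checks how closely the sums follow the familiar 68% and 95% bands of a bell curve.

```python
import random
import statistics

def simulate_sums(n_inputs=50, n_trials=10_000, seed=0):
    """Each trial adds n_inputs independent exponential draws (a skewed input)."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n_inputs))
            for _ in range(n_trials)]

def share_within(values, n_sd):
    """Fraction of values within n_sd standard deviations of their mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) <= n_sd * sd for v in values) / len(values)

if __name__ == "__main__":
    sums = simulate_sums()
    # A normal distribution puts about 68% within 1 sd and 95% within 2 sd.
    print(f"within 1 sd: {share_within(sums, 1):.1%}")
    print(f"within 2 sd: {share_within(sums, 2):.1%}")
```

Even though no single input looks anything like a bell curve, the sums come out close to one.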

Normal distributions are well-understood, and easy to work with. Almost all of modern finance theory is built around the assumption that things like prices and returns are normally distributed (or lognormally distributed: a lognormal distribution becomes a normal distribution if you take the logarithm of the x-axis, useful when an increase in x is multiplicative rather than arithmetic). Normal distributions underlie insurance and allow investors to minimize risk using modern portfolio theory.

But not everything is normally distributed; other processes can lead to other distributions. Earthquakes do not have a typical size: there is no central tendency. Cities do not have a typical size, wars do not have a typical intensity. These things are power-law distributed.

A power law distribution is a curve that looks like this:

Small outcomes are most likely, and large outcomes less likely. The formula for the line is $$p(x) = Cx^{-\alpha}$$, where $$\alpha$$ (alpha)[3] defines the shape of the power law and C is a normalization constant that makes the total area under the curve equal 1.[4]
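
For a concrete feel for the normalization constant, here is a minimal Python sketch. It assumes the distribution starts at a minimum value `x_min` (a parameter not yet introduced at this point in the text; 1 is used as the default), in which case $$C = (\alpha - 1)x_{min}^{\alpha - 1}$$. The function names are illustrative.

```python
import math

def power_law_pdf(x, alpha, x_min=1.0):
    """p(x) = C * x**(-alpha) for x >= x_min, with C chosen so the area is 1."""
    if x < x_min:
        return 0.0
    c = (alpha - 1.0) * x_min ** (alpha - 1.0)
    return c * x ** (-alpha)

def area_under_curve(alpha, x_min=1.0, x_max=1e6, steps=20_000):
    """Midpoint-rule integral of the pdf from x_min to x_max on a log-spaced grid."""
    lo, hi = math.log(x_min), math.log(x_max)
    du = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = math.exp(lo + (k + 0.5) * du)
        total += power_law_pdf(x, alpha, x_min) * x * du  # dx = x du
    return total

if __name__ == "__main__":
    for alpha in (1.8, 2.0, 2.5):
        print(f"alpha = {alpha}: area = {area_under_curve(alpha):.5f}")
```

The numerical area comes out just under 1 for each alpha because the integral stops at a finite `x_max`; the small shortfall is the probability sitting in the tail beyond it.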

Power laws have a property that normal distributions do not: they have “fat tails.” Normal curves fall off much more quickly the further out the x-axis you get.

Here’s some detail on the tail, so you can see the faster drop-off of the normal distribution.

The most important thing about a power law distribution is the alpha. The smaller the alpha, the heavier the right tail of the curve is.
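
One way to put numbers on "heavier" is to ask how much probability lies beyond some large value. The sketch below assumes a power law with a minimum value of 1 (an assumption made for this example), for which the chance of exceeding x is $$x^{-(\alpha - 1)}$$. The function name is illustrative.

```python
def tail_probability(x, alpha, x_min=1.0):
    """P(X > x) for a power law with exponent alpha and minimum value x_min."""
    if x <= x_min:
        return 1.0
    return (x / x_min) ** (-(alpha - 1.0))

if __name__ == "__main__":
    for alpha in (1.8, 2.0, 3.0):
        p = tail_probability(100.0, alpha)
        print(f"alpha = {alpha}: P(X > 100) = {p:.4%}")
```

At an alpha of 3 an outcome above 100 happens about once in ten thousand draws; at an alpha of 1.8 it happens in roughly 2.5% of draws.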

Here is some detail on the tails, so you can see more clearly that lower alphas mean a heavier tail.

Some phenomena thought to follow power laws, and their alphas:

| Phenomenon | Alpha | Note |
|---|---|---|
| Intensity of wars | 1.80 | [5] |
| Solar flare intensity | 1.83 | [5] |
| Frequency of use of words | 2.20 | [5] |
| Population of U.S. cities | 2.30 | [5] |
| Magnitude of earthquakes | 3.04 | [5] |
| Protein interaction degree | 3.1 | [6] |
| Email address book size | 3.5 | [6] |
| Sales of books | 3.7 | [6] |
| Papers authored | 4.3 | [6] |

### Are Venture Capital Returns Power-Law Distributed?

The professional innovation community takes it as a given that venture returns are power-law distributed. In Peter Thiel’s class at Stanford he said “…actual returns are incredibly skewed. The more a VC understands this skew pattern, the better the VC. Bad VCs tend to think the dashed line is flat, i.e. that all companies are created equal, and some just fail, spin wheels, or grow. In reality you get a power law distribution.”

Not everyone agrees.[7] Power law distributions can be hard to distinguish from the tail of lognormal distributions or from a distribution built out of several exponential distributions. People fit the data to all of these.

Lognormal and normal distributions are so widespread they may seem universal (they are also well studied and easier to work with, so generally the path of least resistance), and many theoreticians prefer them to the relative novelty of the power law. Power law proponents, on the other hand, liken the effort to devise non-intuitive, ad-hoc distributions to fit the data to Ptolemaic astronomy. Benoit Mandelbrot, of fractal fame, in a comment on an academic paper that fitted a multi-part exponential to financial data better fit by a power law,[8] wrote “An acknowledged feature of financial prices is that, compared to Gaussianity and independence or Markovian behaviour, their variability is extremely ‘anomalous’…power laws acknowledge that the anomaly extends, at least, to the scale of centuries. If so, everything is simplified at no cost by using a model that implies that the anomaly extends forever.” He concluded “Power-law behaviours exemplify a ‘wildly random’ phenomenon. They do not go away by only looking at them through hasty and ad hoc approximations that exemplify ‘mild randomness’ and underestimate the difficulty and the conceptual novelty of the field.”

But pretending power law behavior is a variant of normal ignores the reality of extreme outcomes.

The professors who live by the bell curve adopted it for mathematical convenience, not realism. It asserts that when you measure the world, the numbers that result hover around the mediocre; big departures from the mean are so rare that their effect is negligible. This focus on averages works well with everyday physical variables such as height and weight, but not when it comes to finance. One can disregard the odds of a person’s being miles tall or tons heavy, but similarly excessive observations can never be ruled out in economic life…In other words, we live in a world of winner-take-all extreme concentration. Similarly, a very small number of days accounts for the bulk of stock market movements: Just ten trading days can represent half the returns of a decade.

The economic world is driven primarily by random jumps. Yet the common tools of finance were designed for random walks in which the market always moves in baby steps. Despite increasing empirical evidence that concentration and jumps better characterize market reality, the reliance on the random walk, the bell-shaped curve, and their spawn of alphas and betas is accelerating, widening a tragic gap between reality and the standard tools of financial measurement.[9]

Venture capital returns are not normal. Most investments return a small multiple or lose money, but many return larger multiples. And a few have had outcomes well outside what any normal curve would encompass. Venture returns are best described by a power law distribution.

### What Would Cause Returns to be Power Law Distributed?

Here is a simple model of company growth and exit[10] that creates a power law distribution of venture capital return multiples (i.e. 1x, 2x, 3x, etc.). It is only meant to hold for returns of more than 1,[11] because no matter what the distribution of actual company outcomes looks like, preference provisions standard in VC contracts distort the part below 1x.[12]

At time 0 a company’s value as a multiple of its initial value is 1. Value then grows continuously at a rate[13] g, so at time T the company’s value multiple is $$X = e^{gT}$$.

Time to exit is exponentially distributed, with an average time to exit of i. So the probability of exit at time t is $$p(T=t) = \frac{1}{i}e^{-t/i}$$.

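Taking these two together, and changing variables from t to $$x = e^{gt}$$ (so that $$t = \ln(x)/g$$ and $$dt/dx = 1/(gx)$$), the probability of an exit value of x is:

$$p(X = x) = \frac{1}{gi} x^{-(\frac{1}{gi} + 1)}$$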

This is a power law distribution with $$\alpha = 1 / {gi} + 1$$.
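
The mechanism can be checked by simulation. The Python sketch below draws exponentially distributed exit times, grows each company at rate g until its exit, and then estimates alpha from the resulting multiples. The estimator is the standard maximum-likelihood formula for a continuous power law, $$\hat{\alpha} = 1 + n / \sum \ln(x_k / x_{min})$$, which is my choice for this illustration rather than something the model prescribes. The function names, seed and parameter values are likewise illustrative.

```python
import math
import random

def simulate_exit_multiples(g, i, n, seed=0):
    """Draw n exit times with mean i and return the multiples X = exp(g * T)."""
    rng = random.Random(seed)
    # expovariate takes the rate (1 / mean), so 1 / i gives an average exit time of i.
    return [math.exp(g * rng.expovariate(1.0 / i)) for _ in range(n)]

def estimate_alpha(multiples, x_min=1.0):
    """Maximum-likelihood estimate of alpha for multiples at or above x_min."""
    tail = [x for x in multiples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

if __name__ == "__main__":
    g, i = 0.26, 4.0  # continuously compounded growth rate, average years to exit
    multiples = simulate_exit_multiples(g, i, n=100_000)
    print(f"model alpha:     {1.0 / (g * i) + 1.0:.3f}")
    print(f"estimated alpha: {estimate_alpha(multiples):.3f}")
```

With a large sample the estimated alpha lands close to $$1/gi + 1$$, even though nothing in the simulation mentions a power law.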

OK, math aside: this model says that a power law return can be generated by a simple mechanism that relies only on growth and time to exit. It also says that as the product of growth and expected time to exit gets larger, the tail of the power law distribution gets fatter.

Here’s a chart showing the model’s prediction of alpha at various average times to exit and year over year growth rates.[14]

Our model does generate a power law distribution of returns. But is it intuitively reasonable? It is simplistic.[15] It glosses over details that every practitioner knows: growth rates slow as companies get larger, exits cluster around raises, etc. But sometimes simple models can contain a good part of the explanatory power of more complicated models. Exponential growth is reasonable, at least during the time-frame of a venture investment, and an exponential time to exit distribution seems reasonable.[16] A better model could be built, but this one’s a good first order approximation.

### Venture Capital Power Law Distributions

Let’s plug in some rules of thumb and see what the model predicts.

Venture capitalists hold investments for an average of 4 years. They expect year over year growth of about 30%, meaning a continuously compounded growth rate of 26%. With these the model gives us an alpha of (1/(.26 * 4)) + 1 = 1.96. How does this compare to the real world?
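
The arithmetic above is easy to reproduce. This is a minimal Python sketch with illustrative function names. It converts a year over year growth rate to a continuously compounded one and plugs it into the model’s formula for alpha.

```python
import math

def continuous_rate(yoy_growth):
    """Continuously compounded rate equivalent to a year over year growth rate."""
    return math.log(1.0 + yoy_growth)

def model_alpha(g, avg_years_to_exit):
    """The model's alpha, 1/(g*i) + 1, for a continuously compounded rate g."""
    return 1.0 / (g * avg_years_to_exit) + 1.0

if __name__ == "__main__":
    g = continuous_rate(0.30)
    print(f"continuous rate for 30% yoy growth: {g:.4f}")
    print(f"alpha with g rounded to 0.26:       {model_alpha(0.26, 4):.2f}")
    print(f"alpha with unrounded g:             {model_alpha(g, 4):.2f}")
```

Rounding g to 0.26 reproduces the 1.96 in the text; carrying the unrounded rate through gives about 1.95. The difference is well inside the precision of the rules of thumb.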

Below are estimates of venture capital power law alphas from various sources. Note that some of them look at the distribution of returns, some look at the distribution of values, and some look at the distribution of revenues. I believe the alpha in all approaches should be the same, if we assume reasonable initial valuations and that value tends to track revenue in most companies.[17] I calculated the italicized alphas myself using thin data found in various summary charts; I did not have access to the underlying data. You should assume that the ones I calculated are less precise (on the order of +/- 0.1 to 0.2).

Sorted by alpha from smallest to largest:

| Source | Alpha | Note |
|---|---|---|
| Return multiples, fund size <$100m | *1.68* | [18] |
| Total Value to Paid In, Small Funds ($50m-$250m), 1981-2003 | *1.75* | [19] |
| PSED Study, revenue growth yr 2 to 5 | 1.76 | [20] |
| Total Value to Paid In, Large Funds (>$250m), 1981-2003 | *1.78* | [21] |
| Kauffman Study, revenue growth yr 2 to 5 | 1.8 | [22] |
| North American angel investment returns | *1.8* | [23] |
| Return multiples, fund size $250m-$500m | *1.84* | [24] |
| Return multiples, fund size $100m-$250m | *1.85* | [24] |
| Inc 500, revenue growth year 2 to 5 | 1.86 | [25] |
| Derived from Correlation Ventures data | *1.88* | [26] |
| Return multiples, fund size > $1b | *1.89* | [27] |
| All VC-backed startups, per Horsley-Keogh | 1.9 | [28] |
| All VC-backed startups, per Venture Economics | 1.97 | [28] |
| British angel investment returns | *1.97* | [29] |
| Unicorn valuations | *2.13* | [30] |
| Return multiples, fund size $500m-$1b | *2.27* | [31] |

As per our model, these cluster around the value of 1.96 we calculated. Some other things to note:

1. The alphas are all relatively close to 2. Looking at the chart of non-financial alphas above, you can see that this is not a universal feature of power law distributions.
2. Larger alphas seem to be correlated to later stage and less risky portfolios. The correlation is weak, but supports the ideas behind the model. If you look at the chart of non-VC financial alphas below, you can see this trend more clearly.
3. This implies that venture capitalists ‘choose’ their alpha as a primary driver of their strategy.
4. To target an alpha you can either target an average time to exit, i, and a growth rate, g, or simply target a return multiple, $$m = e^{gi}$$. Since $$\alpha = 1 + 1/gi$$, the targeted multiple should be $$m = e^{1/(\alpha-1)}$$. If you want to target an alpha of 1.95, then you should invest in companies you think will return 2.9 times your investment. This approximates another VC rule of thumb.

Some alphas for non-VC innovative activity:

| Source | Alpha | Note |
|---|---|---|
| Value of patents | 1.3 | [32] |
| U.S. Patents | 1.43 | [33] |
| Value of patents | 1.45-1.67 | [34] |
| Harvard Patents | 1.71 | [35] |
| German Patents | 1.87 | [35] |
| Size of all U.S. Firms | 2.06 | [36] |
| Corporate R&D (simulation from sparse data) | 2.21 | [37] |
| Pharmaceutical development-1970s | 2.22 | [37] |
| Size of Largest 500 US Firms | 2.25 | [38] |
| Pharmaceutical development-1980s | 2.36 | [39] |
| Movies with stars | 2.72 | [40] |
| Movie income | 2.91 | [40] |
| Movies without stars | 3.26 | [40] |

### Infinite Mean

Power law distributions must have an alpha greater than one.[41] They do not have a finite standard deviation if alpha is three or less. They do not have a finite average if alpha is two or less.

What does not having an average mean? Think about a normal distribution: if you make a large number of picks from a normal distribution, the average will be right in the middle of the distribution. If you measure the height of 100,000 U.S. men, the average will be 5’10″, I guarantee it. The more picks you make, the closer the average of your picks gets to the average of the normal distribution.

If you do the same thing with a power law distribution with an $$\alpha < 2$$, the average will tend to grow as you make more picks. If you make an infinite number of picks, the average will be infinite. This is strange behavior.[42] It is also what makes power laws so hard to intuit.

The largest value you are likely to get from a power law distribution depends on the number of picks you take from it:[43] $$\langle x_{max} \rangle \sim n^{1/(\alpha - 1)}$$. The graph to the left shows $$\langle x_{max} \rangle$$ for various alphas, given a certain number of picks.

When $$\alpha = 2$$, the mean value of the largest pick is on the order of n. That is, if you invest in 10 companies, the likeliest largest multiple is 10x. When $$\alpha < 2$$, the mean value of the largest pick is greater than n. In other words, if alpha is less than or equal to 2, one company is likely to return the entire amount invested in all of the successful companies. With some luck, it returns the fund. Of course, when alpha is larger than 2, the mean value of the largest pick is much smaller. When alpha equals 3, for instance, $$\langle x_{max} \rangle$$ grows as the square root of the number of picks.
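
The scaling of the largest pick can be sketched in a few lines of Python. Two assumptions are mine rather than the text’s. The distribution has a minimum multiple of 1. The code also tracks the median of the largest pick rather than its mean, because the mean of the largest pick is not finite once alpha is two or less, while the median always exists and follows the same $$n^{1/(\alpha - 1)}$$ scaling. Function names are illustrative.

```python
import random

def median_largest_pick(n, alpha):
    """Median of the largest of n draws from a power law with minimum 1.

    Uses P(X <= x) = 1 - x**-(alpha - 1) and solves P(max <= x) = 0.5.
    """
    return (1.0 - 0.5 ** (1.0 / n)) ** (-1.0 / (alpha - 1.0))

def simulated_median_largest(n, alpha, trials=2001, seed=0):
    """Monte Carlo check: median over many portfolios of the largest multiple."""
    rng = random.Random(seed)

    def draw():
        # Inverse-CDF draw; 1 - rng.random() lies in (0, 1], so no division by zero.
        return (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))

    maxima = sorted(max(draw() for _ in range(n)) for _ in range(trials))
    return maxima[trials // 2]

if __name__ == "__main__":
    for alpha in (1.8, 2.0, 3.0):
        row = ", ".join(f"n={n}: {median_largest_pick(n, alpha):,.1f}x"
                        for n in (10, 100, 1000))
        print(f"alpha = {alpha}: {row}")
    print(f"simulated, n=10, alpha=2: {simulated_median_largest(10, 2.0):.1f}x")
```

At an alpha of 2 the typical largest multiple grows in step with the number of investments. At an alpha of 3 a hundredfold increase in portfolio size only raises it about tenfold.
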
The average of all the picks grows quickly as alpha gets smaller. In our model, where we have an expected growth rate, g, and an average time to exit, i, it would make sense to expect an average return multiple on a given company of $$m = e^{gi}$$ (I’ll call this deterministic growth). If venture capital were normal, that would be true. But the mean of a power law distribution with a minimum value of 1 is $$(\alpha - 1)/(\alpha - 2)$$ when $$\alpha > 2$$. Using our model’s result of $$\alpha = 1/gi + 1$$ to substitute into the power law mean formula, we can compare the deterministic mean to the power law mean.

It’s no surprise that these are similar when gi is close to zero (equivalent to a high alpha). But the power law mean grows much more quickly than the deterministic mean as growth rates get larger. If the average time to exit is four years, then at a growth rate of 20%, the power law mean is more than twice the deterministic mean. This upside surprise is what draws investors to low alpha power laws.

But this strategy comes with risk. In his book The Black Swan, Taleb warns against the financial sector using risk measurement tools like VAR and Black-Scholes that were built on an expectation of normal or lognormal returns. Normal and lognormal distributions give too little weight to the fat tails of many of the actual financial sector probability distributions. Events that seem wildly unlikely in the Normal world (what Taleb calls “Mediocristan”) are not that improbable in a power law world (“Extremistan”), and ignoring them can result in what looks in hindsight like reckless behavior. Taleb ascribes the failure of Long Term Capital Management to this, brought down by events so many standard deviations away from the mean that it would have been safe to ignore them in a Normal world. In a power law world ignoring them meant economy-shaking losses.

If the public financial markets are Extremistan, then venture capital is Absurdistan.
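
To make the earlier comparison between the deterministic mean and the power law mean concrete, here is a minimal Python sketch. Substituting $$\alpha = 1/gi + 1$$ into $$(\alpha - 1)/(\alpha - 2)$$ simplifies to $$1/(1 - gi)$$, which is finite only while gi is below 1. The sketch treats the growth rate as continuously compounded, which is an assumption on my part about how to read the 20% figure; the function names are illustrative.

```python
import math

def deterministic_mean(g, i):
    """Multiple after growing at continuous rate g for exactly i years."""
    return math.exp(g * i)

def power_law_mean(g, i):
    """Mean multiple implied by the model: (alpha-1)/(alpha-2) = 1/(1 - g*i)."""
    gi = g * i
    if gi >= 1.0:
        return math.inf  # alpha is two or less, so the mean is not finite
    return 1.0 / (1.0 - gi)

if __name__ == "__main__":
    i = 4.0
    for g in (0.05, 0.10, 0.20, 0.26):
        print(f"g = {g:.2f}: deterministic {deterministic_mean(g, i):.2f}x, "
              f"power law {power_law_mean(g, i):.2f}x")
```

With g = 0.20 and i = 4 the power law mean is 5x against a deterministic 2.2x. At the rule-of-thumb values of g = 0.26 and i = 4 the product gi exceeds 1, alpha drops below 2, and the mean is infinite.
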
The fat tails in the public markets lead to black swans, but they’re nowhere near as fat as the tails in venture capital.

### Alphas Close to Two

Betting against a power law return (as at LTCM) can cause some nasty surprises, but going long on a fat tail is a good bet, so long as you can make enough investments and be patient enough to find the rare anomaly. Sure, you sacrifice predictability, and that’s an issue for the investors in your fund. But once you’ve gone under two, why not keep going? The fatter the tail, the higher the probability of outsize events. Once you’ve sacrificed predictability, you’re in for a penny. Why not be in for a pound?

Why do the VC alphas cluster so closely around 2, the alpha where the mean goes to infinity? Why not even lower?

One reason is timing. If VCs have a 10-year fund life and they invest in the first two or three years, they have seven or eight years to realize gains. If exits are distributed exponentially, then if VCs want to exit 80% of their investments within eight years of making them, they need to have an average time to exit of about 5 years. If they want to exit 90%, they need an average time to exit of about 3.5 years.[44] This means that investing in patents, with an alpha somewhere between 1.3 and 1.7, is out: it would take too long to realize the investment.

This points to the real problem: look at the chart a few pages ago of year/year growth as a function of time to exit. For a given alpha, a shorter time to exit requires a larger growth rate. If it takes 20 years to exit a patent (alpha = 1.5) it implies a year over year growth rate in value of about 10%. If you wanted to exit in five years you would need a year over year growth rate of closer to 50%. To get to an alpha close to 2, as in venture capital, with an average time to exit of 5 years, the year over year growth rate of the portfolio companies needs to be 22%. For a time to exit of 3.5 years, the growth rate needs to be 33%.
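
The timing and growth figures in this section follow from two formulas already in the post: the exponential exit-time distribution and $$\alpha = 1/gi + 1$$ solved for g. A minimal Python sketch, with illustrative function names:

```python
import math

def avg_exit_time(fraction_exited, years):
    """Average time to exit i such that fraction_exited of exits occur within years.

    Solves 1 - exp(-years / i) = fraction_exited for i.
    """
    return -years / math.log(1.0 - fraction_exited)

def required_yoy_growth(alpha, avg_years_to_exit):
    """Year over year growth needed to hit alpha at a given average time to exit."""
    g = 1.0 / ((alpha - 1.0) * avg_years_to_exit)  # continuously compounded rate
    return math.exp(g) - 1.0

if __name__ == "__main__":
    print(f"80% exited in 8 years: i = {avg_exit_time(0.8, 8):.2f} years")
    print(f"90% exited in 8 years: i = {avg_exit_time(0.9, 8):.2f} years")
    for alpha, i in ((1.5, 20), (1.5, 5), (2.0, 5), (2.0, 3.5)):
        print(f"alpha = {alpha}, i = {i} years: "
              f"{required_yoy_growth(alpha, i):.0%} year over year growth")
```

Halving the time available to exit roughly doubles the continuously compounded growth rate needed to hold alpha constant, which is why short fund lives and low alphas pull against each other.
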
These are high growth rates, and if the best VCs are the ones who can maintain the lowest alphas[45] then they are the ones who have the highest growth rates in their portfolios.

At a given alpha, the more investments you make, the better, because your mean return multiple increases with the number of investments, as does the likeliest highest multiple. Dave McClure makes this case:

“Most VC funds are far too concentrated in a small number (<20–40) of companies. The industry would be better served by doubling or tripling the average # of investments in a portfolio, particularly for early-stage investors where startup attrition is even greater. If unicorns happen only 1–2% of the time, it logically follows that portfolio size should include a minimum of 50–100+ companies in order to have a reasonable shot at capturing these elusive and mythical creatures.”

Peter Thiel flatly contradicts this:

“Given a big power law distribution, you want to be fairly concentrated. If you invest in 100 companies to try and cover your bases through volume, there’s probably sloppy thinking somewhere. There just aren’t that many businesses that you can have the requisite high degree of conviction about.”

McClure believes he can find hundreds of companies with high enough growth to maintain his requisite alpha. Thiel thinks this is not possible. Venture capitalists have always faced this tension: the average growth rate of all small businesses in the US is closer to 7.5% than 30%. The pool of companies that can grow fast enough is limited. How many companies can you find that will grow fast enough, knowing that when you’re wrong about the growth rate, you’re probably wildly wrong?

### But why 2?

A lower alpha is better, but getting a lower alpha is constrained by finding enough companies who can generate the required amount of growth in the time a VC has to go through a cycle of investing and exiting.
But it seems a bit coincidental that these things balance out so close to the point where the power law distribution mean goes infinite. The best explanation is supply and demand. When alphas of less than two are available, because the supply of fast-growth companies has increased, venture capitalists have an incentive to make more investments, so they raise more money and start more funds, increasing the demand for these companies until the alpha returns to 2.[46]

### Unresolved Questions

#### 1. Failure rates

Chris Dixon notes that better fund returns (implying a fatter tail) are tied to more failures (implying a fatter head). This is not power law distribution behavior. The area under a power law distribution sums to one, so if the tail gets fatter, the rest of the distribution gets thinner, including the head. Look at the fourth chart in this post to see this. But the model we are using applies only to companies with a return multiple of more than 1, only those that succeed. It is not clear whether failure rates should follow the implied power law that drives the returns power law distribution.

Picking growth rates is an inherently uncertain process and the venture capitalist is likely to be wrong. Our model, ironically, assumes that when picking growth rates there is a central tendency and that errors cancel each other out, that is, that the growth rates are ‘normally’ distributed around the picked growth rate. There is no evidence this is true. An alternative, one that many practitioners subscribe to, is that companies that do not achieve their targeted growth rates simply fail. While this behavior does not seem to fit with any model of underlying firm growth,[47] it could arise from the staged-funding model of venture capital: companies that underperform compared to expectation cannot raise further funds and go out of business. Measuring the relationship between alpha and failure rate would help shed some light on this.

#### 2. Growth

In our model, varying amounts of VC imply varying distributions of growth rates of early-stage companies. Work on the distribution of growth rates has been focused on growth in firm size (measured by revenues, employees, or the like), not on firm value.[48] The distribution of firm size growth seems relatively stable over time. If this is true then any increase (or decrease) in venture capital funding is due to anticipated growth (decline) in the ratio of firm value to firm size. This, on the one hand, seems obvious. But, on the other, it seems not to account for new industry creation. A time study of the evolution of firm growth rate distributions in an emerging industry would lead to useful predictions of money available for venture capital.

#### 3. How much is a power law option worth?

Early-stage venture capital valuations are higher than standard finance theory would predict. No reasonable discounted cash flow model would assign $5m+ valuations to a person with a bright idea. When asked, VCs often cite the ‘option value’ of the investment.

Black-Scholes, the standard option pricing model, was built on the assumption that prices move normally (and specifically, that there is a finite variance to the underlying asset’s return distribution). While option formulas assuming non-normal distributions have been proposed,[49] there seems to be no work connecting startup valuations to the pricing of options on power law outcomes. While a theory like this would probably not influence actual VC valuations, it would be valuable in the debate over how much we should spend on R&D. My guess is that the rational amount to spend is quite a bit larger than the amount we spend today.

#### 4. Is it really a power law distribution?

What does an infinite mean imply? The quote that started this post said “The potential variability of economic outcomes…is so great that large portfolio draws from year to year can have consequences for the macroeconomy.” If returns are power-law distributed up to very high multiples (and I have not seen any data suggesting a tail-off, a la earthquakes) then this is undoubtedly true. If you think of value as economic value, not dollar value, then there is perhaps no limit to the largest multiplier possible. One of the consequences of more companies being funded today–if the industry is maintaining an alpha less than two–is the increased probability that we will see something so far outside Mediocristan, so far along the fat tail, that it will fundamentally change how we live.

1. Scherer, F. (1998). The Size Distribution of Profits from Innovation. Annales d’Economie et de Statistique. Retrieved from http://www.jstor.org/stable/20076127

2. See any beginning stats textbook for a more formal description and derivation. I like Bulmer’s Principles of Statistics as a simple reference.

3. Do not confuse this with the alpha of the capital asset pricing model, used to denote the amount that an investor’s prowess is different than luck.

4. Many treatments use $$-(\alpha + 1)$$ as the exponent instead of $$-\alpha$$ because it makes for simpler formulae in some calculations. We will use $$-\alpha$$ and all alphas cited have been adjusted to reflect this.

5. Newman, M. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, (1). Retrieved from http://www.tandfonline.com/doi/abs/10.1080/00107510500052444

6. Clauset, A., Shalizi, C.R. & Newman, M.E.J. Power-law distributions in empirical data. SIAM Review 51, 661–703 (2009).

7. There is an entire paper dedicated to debunking many claimed power laws: Clauset, A., Shalizi, C., & Newman, M. (2009). Power-law distributions in empirical data. SIAM Review. Retrieved from http://epubs.siam.org/doi/abs/10.1137/070710111. But, as an amusing example of the back and forth, see the Wikipedia entry on Gibrat’s Law, which spends far more time on whether or not the population of cities is power law or lognormal distributed than it does on Gibrat’s Law.

8. Mandelbrot, B. B. (2001). Stochastic volatility, power laws and long memory. Quantitative Finance, 1(6), 558–559. http://doi.org/10.1080/713665999

9. Mandelbrot, Benoit and Nassim Nicholas Taleb, “How the Finance Gurus Get Risk All Wrong”, Fortune Magazine, July 11, 2005, http://archive.fortune.com/magazines/fortune/fortune_archive/2005/07/11/8265256/index.htm

10. How a combination of exponentials leads to a power law distribution is discussed in more detail in Newman, M. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, (1). Retrieved from http://www.tandfonline.com/doi/abs/10.1080/00107510500052444

11. Denoted X0. We need X0 > 0 because you can’t normalize a power law distribution when X0 is zero.

12. For instance, if a VC owned 20% of a company, any company-level return between .2x and 1x could result in a 1x return to the VC.

13. This does not need to be a deterministic process. A stochastic process that gives the same result would be a linear birth and death process. A quick overview of birth and death processes is here. A specific example in this context is in Reed, W., & Hughes, B. (2002). From gene families and genera to incomes and internet file sizes: Why power laws are so common in nature. Physical Review E, 66(6), 067103. http://doi.org/10.1103/PhysRevE.66.067103

14. Note that the growth rate in the alpha equation, g, is the continuously compounded growth rate: that is, a company would grow by $$e^g$$ each time period. So the growth in value between time x+1 and time x is $$e^{g}-1$$. If g is 26% per year, then year over year growth is about 30%. Since we all tend to think of things this way because it’s easier to compute, this chart is presented in those units.

15. It may, in fact, be the simplest possible model that roughly fits the data I have. It certainly has the fewest assumptions and the least hand-waviness. That’s because all the hard work is encompassed in the growth rate. What determines the growth rate in a company? The huge number of variables that feed into growth is where the next level of complexity lies. I would dig further into that if I had more time and it didn’t hurt my ego to become so quickly lost in its models. In the meantime, I take comfort in the story of Rob Shaw, who one day realized that he could model the time between drips from a faucet with a pretty simple–albeit chaotic–equation, one that didn’t account for all the microscopic physics that determine it (I don’t remember where I read that story, but Shaw is pretty famous among complexity researchers for his work on dripping faucets. Cf. http://amzn.to/1BOp3Mw.)

16. It assumes that there is a constant small probability of exit: if you haven’t exited by time $$t_n$$ then you have a fixed probability of exiting at time $$t_{n+\delta}$$, and if you don’t exit at time $$t_{n+\delta}$$, then you have the same fixed probability of exiting at time $$t_{n+2\delta}$$. In the limit of $$\delta \to 0$$ you have an exponential distribution.

17. A power law chart like the one in this Ben Thompson post is different: it is indeed a power law, but not a power law distribution. It roughly corresponds to a power law distribution with the x and y axes swapped.

18. My calculation of alpha from StepStone data, reported on Seth Levine’s blog: http://www.sethlevine.com/wp/2014/08/some-more-data-on-venture-outcomes

19. My calculation of alpha from Preqin data in Weber, Sven, et al, “Dialing Down: Venture Capital Returns to Smaller Size Funds”, https://www.svb.com/Publications/Industry-Trends/Venture-Capital-Update/Dialing-Down–Venture-Capital-Returns-to-Smaller-Size-Funds-(PDF)/

20. Crawford, G.C., and B. McKelvey, “Strategic Implications of Power-Law Distributions in the Creation and Emergence of New Ventures”, Frontiers of Entrepreneurship Research: Vol 32, Iss 12, Article 1 (2012).

21. My calculation of alpha from Preqin data in Weber, Sven, et al, “Dialing Down: Venture Capital Returns to Smaller Size Funds”, https://www.svb.com/Publications/Industry-Trends/Venture-Capital-Update/Dialing-Down–Venture-Capital-Returns-to-Smaller-Size-Funds-(PDF)/

22. Crawford, G.C., and B. McKelvey, “Strategic Implications of Power-Law Distributions in the Creation and Emergence of New Ventures”, Frontiers of Entrepreneurship Research: Vol 32, Iss 12, Article 1 (2012).

23. My calculation of alpha from Wiltbank, Robert E., “Returns to Angel Investors in Groups”, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1028592

24. My calculation of alpha from StepStone data, reported on Seth Levine’s blog: http://www.sethlevine.com/wp/2014/08/some-more-data-on-venture-outcomes

25. Crawford, G.C., and B. McKelvey, “Strategic Implications of Power-Law Distributions in the Creation and Emergence of New Ventures”, Frontiers of Entrepreneurship Research: Vol 32, Iss 12, Article 1 (2012).

26. My calculation of alpha from Correlation Ventures data, reported on Seth Levine’s blog: http://www.sethlevine.com/wp/2014/08/venture-outcomes-are-even-more-skewed-than-you-think

27. My calculation of alpha from StepStone data, reported on Seth Levine’s blog: http://www.sethlevine.com/wp/2014/08/some-more-data-on-venture-outcomes

28. Scherer, F.M., Harhoff, D. & Kukies, J., “Uncertainty and the size distribution of rewards from innovation”, Journal of Evolutionary Economics 10, 175-200 (2000).

29. My calculation of alpha from Wiltbank, Robert E., “Siding With the Angels”, https://www.nesta.org.uk/sites/default/files/siding_with_the_angels.pdf

30. My calculation of alpha from Fortune’s “Unicorn List”, http://fortune.com/unicorns/

31. My calculation of alpha from StepStone data, reported on Seth Levine’s blog: http://www.sethlevine.com/wp/2014/08/some-more-data-on-venture-outcomes

32. Nordhaus, W.D., (1989). “Comment on Zvi Griliches’ ‘Patents: Recent Trends and Puzzles'”, Brookings Papers on Economic Activity: Microeconomics, pp.320-325.

33. Scherer, F.M., Harhoff, D. & Kukies, J., “Uncertainty and the size distribution of rewards from innovation”, Journal of Evolutionary Economics 10, 175-200 (2000).

34. Scherer, F.M., “The Size Distribution of Profits from Innovation”, Annales d’Economie et de Statistique, No. 49/50, (Jan-Jun 1998), pp. 496-516.

35. Scherer, F.M., Harhoff, D. & Kukies, J., “Uncertainty and the size distribution of rewards from innovation”, Journal of Evolutionary Economics 10, 175-200 (2000).

36. Axtell, R., “Zipf Distribution of U.S. Firm Sizes”, Science Vol. 293, No. 5536, pp. 1818-1820 (2001).

37. Scherer, F.M., Harhoff, D. & Kukies, J., “Uncertainty and the size distribution of rewards from innovation”, Journal of Evolutionary Economics 10, 175-200 (2000).

38. Axtell, R., “Zipf Distribution of U.S. Firm Sizes”, Science Vol. 293, No. 5536, pp. 1818-1820 (2001).

39. Scherer, F.M., Harhoff, D. & Kukies, J., “Uncertainty and the size distribution of rewards from innovation”, Journal of Evolutionary Economics 10, 175-200 (2000).

40. De Vany, A. & Walls, W.D. “Uncertainty in the Movie Industry: Does Star Power Reduce the Terror of the Box Office?” Journal of Cultural Economics 23, 285-318 (1999).

41. If $$\alpha \le 1$$, the line does not fall off fast enough as x goes to infinity: the area under it grows without bound, so it cannot be normalized to sum to one, as it must in a probability distribution.
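
A sketch of why the normalization fails, assuming the pure power-law form $$p(x) = C x^{-\alpha}$$ above some lower cutoff $$x_{min} > 0$$ (the cutoff and the constant C are notation introduced here for illustration, in the style of Newman (2005)):

$$\int_{x_{min}}^{\infty} C x^{-\alpha}\,dx = \frac{C\,x_{min}^{1-\alpha}}{\alpha - 1} \quad \text{for } \alpha > 1$$

Setting the integral equal to one gives $$C = (\alpha - 1)\,x_{min}^{\alpha - 1}$$. For $$\alpha \le 1$$ the integral diverges (logarithmically at $$\alpha = 1$$), so no finite C makes the total probability equal to one.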

42. Power law distributions often have some sort of tail-off at high x to cope with reality. Earthquake sizes, for instance, can’t be infinite: there’s only so much deformational energy stored in the earth’s crust (see Zaliapin, I. V., Kagan, Y. Y., & Schoenberg, F. P. (2005). Approximating the distribution of Pareto sums. Pure and Applied Geophysics, 162(6-7), 1187–1228. http://doi.org/10.1007/s00024-004-2666-3, p.1190). So the earthquake power law has another term that decreases the tail at high x.

43. See Newman (2005) for the math.

44. The cumulative distribution function of the exponential distribution is $$1 - e^{-t/i}$$. Setting this equal to 80% and t equal to 8 years, we find i = -8/ln(1 - 0.8) = 4.97 years. Similarly for 90%, we get 3.47 years.
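
The arithmetic in this note can be checked with a few lines of Python. This is only a sketch: the function name `exponential_mean_time` and its parameter names are made up for illustration, and it simply inverts the CDF above for i.

```python
import math


def exponential_mean_time(fraction_done: float, horizon_years: float) -> float:
    """Solve 1 - exp(-t / i) = fraction_done for i, with t = horizon_years.

    Hypothetical helper for illustration; i is the mean of the
    exponential distribution, in years.
    """
    if not 0.0 < fraction_done < 1.0:
        raise ValueError("fraction_done must be strictly between 0 and 1")
    return -horizon_years / math.log(1.0 - fraction_done)


# 80% and 90% of outcomes realized within 8 years, as in the note.
mean_80 = exponential_mean_time(0.80, 8.0)
mean_90 = exponential_mean_time(0.90, 8.0)

print(f"80% within 8 years -> i = {mean_80:.2f} years")  # 4.97
print(f"90% within 8 years -> i = {mean_90:.2f} years")  # 3.47
```

Plugging either value of i back into $$1 - e^{-8/i}$$ recovers the 80% or 90% figure, which is a quick sanity check on the inversion.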

45. This, in fact, would be a better way of rating venture capital firms than IRR or cash-on-cash returns. The latter two are too dependent on luck, even in the medium-term. Say two firms each invest in one of two companies with identical prospects; one of these firms becomes Facebook or Google. The firm that invested in that one becomes one of the top-returning firms, the other does not. This is luck, not skill. Measuring alpha would distill out luck and focus purely on skill. Luck is, as the saying goes, better than being good, but it’s hard to measure. I am actually extremely lucky, and it runs in the family, in my mother’s line. But I don’t expect any investor in venture funds would believe me, no matter how strong the empirical evidence.

46. This hypothesis should be testable by looking at industry alphas around dramatic changes in the amount invested in VC.

47. For instance, Fu, D., Pammolli, F., Buldyrev, S. V, Riccaboni, M., Matia, K., Yamasaki, K., & Stanley, H. E. (2005). The growth of business firms: theoretical framework and empirical evidence. Proceedings of the National Academy of Sciences of the United States of America, 102(52), 18801–18806. http://doi.org/10.1073/pnas.0509543102, etc.

48. Cf. Stanley, M., Amaral, L., Buldyrev, S., Havlin, S., Leschhorn, H., Maass, P., … Stanley, H. E. (1996). Scaling Behaviour in the Growth of Companies. Nature. Retrieved from http://cps-www.bu.edu/hes/articles/sabhlmss96.pdf

49. Cf. Borland, L. (2002). A Theory of Non-Gaussian Option Pricing, 1–52. http://doi.org/10.1080/14697688.2002.0000009

wonderful!! post.

re: value of options, one of the problems is the imperfection of the market, and the reality that the most successful companies “know it” and they’re more demanding of investors on terms, engagement and, of course, valuation. 500 Startups doesn’t get to invest in the “next Facebook” for the same reason they didn’t get to invest in Facebook itself (or Google, or Uber, or …).

IMHO the most interesting model is YC, where they both “go wide” and yet enjoy a number of big winners like AirBnB.

2. The problem at seed stage is Bayesian. You don’t know the probability of success. Hence, investing in a wide swath of seed companies and then pressing your investment in the ones that make it, given the conditional probability after the first investment, allows you to take advantage of the power law theory you illustrate above.

This is why the rule of thumb for seed investors is to make at least 20 investments, because 4 will cover your entire portfolio.

But, to your point, there isn’t a Fama-esque efficiency to seed investing. It’s because risk is iid.