Uncertainty

Trial and Error, Part 1

Uncertainty results in unpredictability, and two types of things cause unpredictability: novelty and complexity. Novelty uncertainty is when you can’t predict something because it has never been done before. You might not know the existing conditions, all the actions you can take, how those actions might change conditions, and/or what the possible outcomes are. We’ve talked about this before. You mitigate novelty uncertainty by learning the things you don’t know; if you lack knowledge, you acquire it. Sometimes you can only learn by doing: you try things and see what you learn, then you try other things based on that. This process is called trial and error. Trial and error is one of the fundamental building blocks of knowledge.

As such, it is woefully taken for granted. Probably because it seems so easy: you try things and see what happens, what’s the big deal? But if you’ve ever used trial and error, you know that it works well with simple problems and quickly becomes unwieldy with more complicated problems. If you want your learning to be more than brute force searching, you have to structure the process. This post (and part 2) explores how to do that, in part by discussing two processes that have been explicated in far more depth than generic trial and error, but are, in fact, simply subsets of that more fundamental process.

The first is the scientific method and the second is biological evolution. In both there is a process of (i) generation of tests, (ii) testing, and (iii) selection, and both are (iv) repeated processes. But soon after this, their similarities wane. This hasn’t prevented each being applied as a metaphor to various other fields. In business, for example, the Lean methodology likens itself to the scientific method while evolutionary economics uses evolution. Each of these metaphors is apt, to a point, but only because business trial and error is a sister process to these two trial and error processes. Presenting it as a sub-process is inherently limiting.

This is why trial and error is worthy of serious consideration: the best way for you to find the knowledge you need is not necessarily to adopt the scientific method or evolution, it is to design a trial and error process that suits your needs.

This essay is heavily indebted to Herbert Simon, especially his Sciences of the Artificial. At some points in writing it I considered stopping because I was really just recapitulating Simon. But in the end I want to avoid his eventual conclusion: that trial and error in a complex world ends up requiring satisficing. There are a lot of efficiencies to be gained before that point. I also want to thank Ben Reinhardt and Jason Crawford for helping me think things through.

Last, this was taking me forever to write, so I broke it into two pieces. This part is about trial and error, and what the scientific method can teach us. The second part will be about biological evolution, and how to apply these lessons generally in designing a trial and error process.

Why Trial and Error

You have questions. Maybe “Will customers buy my product and how much will they pay?” This sort of thing is often very hard to predict if your product has never been sold before, if it solves a new problem, or an old problem in a very different way, or a hundred other reasons. The facts you need and the processes through which they flow probably exist, but they are difficult to observe, both because ascertaining them would require resources and because they are mainly in people’s heads, the quintessential black box.

We have talked previously about induction and deduction being the two ways to create knowledge from previous knowledge. If deduction or induction are to describe our world, they must begin with observations about our world. Many of these observations can be had by merely, well, observing. Though: observation can be difficult and complex; it might require telescopes or microscopes or the disassembly of some structure so you can observe its constituent structures.

All of the resulting observations are knowledge and can be used inductively or deductively to create more knowledge. You might observe some facts and induce or deduce that other facts must exist, or that certain processes must exist. You might observe some processes and induce or deduce other processes, or that certain facts must exist for these processes to work. You should, of course, then try to observe these induced or deduced facts or processes directly to see if your induction or deduction was correct.

But you might do all the observation, induction, and deduction you can do and still not know the answer. Perhaps the observations you need are simply not in evidence. Perhaps they can’t be observed directly. They might be unobservable using current technology, or they might be taking place inside some black box that can’t be opened. Gaining knowledge of these things requires a different method: you must create the observations you need through trial and error.

Trial and error is one of the fundamental ways of generating the knowledge we need to predict what will happen, of mitigating uncertainty. You create knowledge in one of four ways: you observe, you deduce, you induce, and you create new observations. Each of these things is absolutely necessary to get to know any system with even the slightest bit of workings. You might argue that “creating new observations” is not the same as “trial and error”: the ‘error’ part seems like a tack-on. But in most circumstances you are not interested in just creating new observations, you are interested in creating useful observations. To do this with any hope of getting what you are looking for, you must embed the trials in a process, where results guide you to better future trials. The results are what we refer to as ‘errors’, even though they might not be errors at all. “Trial and results” might be a more appropriate description, but I’m not going to introduce needless new terminology.

John Dewey, when describing inquiry, described a trial and error process:

  1. a felt difficulty;
  2. its location and definition;
  3. suggestion of possible solution;
  4. development by reasoning of the bearing of the suggestions;
  5. further observation and experiment leading to its acceptance or rejection.
Dewey, J. (1910). How We Think. United Kingdom: D.C. Heath & Company, p. 72.

You might note the resemblance of this process to the scientific method, as you learned it in grade school. Thinkers have always regarded trial and error, in the guise of the scientific method, or of inquiry, or in any of its other manifestations, as one of the deepest tools in existence. It deserves some thought.

Imagine you find yourself in the middle of a hedge maze. What do you do? Having no knowledge of where the exit is, where you are, what the design principles of the maze are, or its general shape or size, you have no other option but to try different paths. At each junction you choose one of the possible paths and see where it leads you. If you’re smart, you’ll break twigs as you go so you can tell if you’re retracing your steps. Sooner or later, you’ll find the exit.

This is a trial and error process. It is trial and error because at each junction you pick one of the available options to try and the erroneous ones lead you to a dead end or back to someplace you’ve already been. It is a process because it consists of many trials. You must use a trial and error process because you have no other way of finding the exit. And while the maze is a stark example of trial and error, it is also a metaphor for problem solving in general.

The process [of human problem solving] can be—and often has been—described as a search through a maze…The process ordinarily involves much trial and error. Various paths are tried; some are abandoned, others are pushed further. Before a solution is found, many paths of the maze may be explored. The more difficult and novel the problem, the greater is likely to be the amount of trial and error required to find a solution. At the same time, the trial and error is not completely random or blind; it is, in fact, rather highly selective…Indications of progress spur further search in the same direction; lack of progress signals the abandonment of a line of search. Problem solving requires selective trial and error.

Simon, H.A. (1969). The Sciences of the Artificial. Cambridge, MA: The MIT Press, pp. 95-96.

Not all problem solving, though. Some problems can be solved because you already know how to solve them. And some can be solved because you know enough to deduce a solution. Trial and error is useful when you don’t know enough to solve the problem or enough to figure it out analytically. Trial and error is useful when there is novelty uncertainty: when something is unknown because it has never been done before (at least, by you.) When there is novelty uncertainty around a problem, trial and error is the primary way of finding a solution. Donald Campbell goes so far as to say

A blind-variation-and-selective-retention process is fundamental to all inductive achievements, to all genuine increases in knowledge, to all increases in fit of system to environment.

Campbell, D.T. (November 1960). Blind variation and selective retention in creative thoughts as in other knowledge processes. Psychological Review. 67 (6): 380–400.

This is why trial and error is so ubiquitous: deduction and induction rely on observations, and if observations are not readily available, you must do something to evince them.

But trial and error is, at its core, a brute force way of searching for knowledge. It is the least efficient way to solve a problem: try, try again, until by chance you stumble on the solution. Trial and error is far less powerful than the other ways of finding a solution and when the problem-space is large, should be considered a last resort.

In real life, processes that are based on trial and error use features of their problem space to make the search through it more efficient. Interesting problem spaces often have these types of features, because they are generated by more fundamental mechanisms. They are, in some sense, ‘compressible.’

For instance, running a maze is very close to a brute force process, but even here there are ways to make it more efficient. Contra Campbell, above, running a maze doesn’t really use ‘selective retention’ of paths so much as selective deletion. This may seem like the same thing, but in a complicated maze you find a lot more bad paths than good paths. And you probably only care about finding a single path, because once you reach the exit you’ll probably feel no urge to go back into the maze and see where the paths you did not follow go. So while choosing which path to try in a maze might be blind—that is, you choose at random at first—as you proceed in the process, you mark off dead ends and loops so you don’t try them again. This interim learning makes each trial a bit less blind than the one before. As Simon said, problem solving requires ‘selective’ trial and error.
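
The twig-breaking strategy can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the maze is a made-up dictionary of junctions, and the names `find_exit`, `visited`, and `path` are hypothetical. The `visited` set plays the role of the broken twigs, and popping a junction off `path` is the selective deletion of a dead end.

```python
from typing import Dict, List, Optional, Set

# Hypothetical maze: each junction maps to the junctions reachable from it.
Maze = Dict[str, List[str]]


def find_exit(maze: Maze, start: str, exit_: str) -> Optional[List[str]]:
    """Try paths one junction at a time, never retrying a marked junction."""
    visited: Set[str] = set()  # the "broken twigs": junctions already tried
    path: List[str] = []       # the path currently being tried

    def explore(junction: str) -> bool:
        if junction in visited:
            return False       # a loop: we have been here before
        visited.add(junction)
        path.append(junction)
        if junction == exit_:
            return True        # success: stop searching
        for next_junction in maze.get(junction, []):
            if explore(next_junction):
                return True
        path.pop()             # dead end: delete this branch from the path
        return False

    return path if explore(start) else None


maze = {
    "A": ["B", "C"],
    "B": ["D"],       # D is a dead end
    "C": ["A", "E"],  # C loops back to A and also leads on to E
    "D": [],
    "E": ["EXIT"],
}
print(find_exit(maze, "A", "EXIT"))  # ['A', 'C', 'E', 'EXIT']
```

The order in which paths are tried is still blind (it is just the order they happen to be listed in), but every marked junction makes the remaining search a little smaller.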

There may be other clues and strategies that can make your search slightly less brute force. For instance, look at the below maze, where you know the size and shape of the maze and where the exit and entrances are. Once you have traversed part of the path in the middle diagram, you know you have made a mistake, even without completing the path.

The maze is a simple example, but it begins to show that trial and error processes can have some nuance:

  • trial generation is not completely random;
  • errors might not be evident until more trials are done; and
  • results (those that are not either errors or successes) help inform future trials.

Simon thought of trial and error as “generate and test.”1 But, just like thinking of the process as step one: trial, step two: error, this oversimplifies. For instance, generating ideas in the maze example is easy: there are only a few paths you can take at each junction, and you generally have no way to weight them. But in more complex problems, finding things to test can be both too easy and too hard. It’s easy to come up with all sorts of stupid things to try, and it’s hard to choose the really salient tests from them. And then, having tested the idea, how do you decide if it was an ‘error’ or a success? What does success even mean, and what should you do with it?

Let’s think harder about the process. What do you need to do? Here’s a list I made:

  1. Setup
    • Articulate your goals, marshal your resources, and think about your stopping conditions;
  2. Generate trial
    • Determine available trials: what is it possible to do?
    • Decide which of the available trials would be the best to try first, and whether you should try one at a time or many in parallel;
    • Configure the trial to maximize what you want from it: what types of observations would be most useful to make?
  3. Test
    • Figure out how you can run the selected trial;
    • Run the trial;
    • Observe the results;
    • Update your knowledge, however it is structured;
  4. Decide if you need to, or should, iterate.
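
As a rough sketch, the list above maps onto a loop like the one below. Everything specific in it is an assumption made for illustration: the function name `trial_and_error`, the made-up price-testing scenario, and especially the rule for picking the next trial (try whatever is closest to the best result so far), which is just one simple way of letting results guide future trials.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Result = Tuple[float, float]  # (what was tried, how well it scored)


def trial_and_error(
    candidates: Sequence[float],
    run_trial: Callable[[float], float],
    good_enough: float,
    max_trials: int,
    seed: int = 0,
) -> Tuple[Optional[Result], List[Result]]:
    # 1. Setup: the goal is `good_enough`, the resource budget is `max_trials`.
    rng = random.Random(seed)
    remaining = list(candidates)      # available trials
    knowledge: List[Result] = []      # everything observed so far
    best: Optional[Result] = None

    for _ in range(max_trials):
        if not remaining:
            break
        # 2. Generate trial: blind at first, then guided by earlier results.
        if best is None:
            trial = rng.choice(remaining)
        else:
            anchor = best[0]
            trial = min(remaining, key=lambda c: abs(c - anchor))
        remaining.remove(trial)

        # 3. Test: run the trial, observe the result, update knowledge.
        score = run_trial(trial)
        knowledge.append((trial, score))
        if best is None or score > best[1]:
            best = (trial, score)

        # 4. Decide whether to iterate.
        if best[1] >= good_enough:
            break

    return best, knowledge


# Hypothetical example: which price (0 to 10) do customers respond to best?
best, history = trial_and_error(
    candidates=[float(p) for p in range(11)],
    run_trial=lambda price: -(price - 7.0) ** 2,  # stand-in for a real test
    good_enough=0.0,
    max_trials=11,
)
print(best, len(history))
```

In a real problem each of the four steps hides most of the difficulty; in particular, `run_trial` here is a formula, whereas the whole point of trial and error is that you do not have one.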

By far the most mysterious part of this process is deciding what to try. There are different ways to approach this, depending on what you are trying to find. You might be trying to figure out how something works, or you might be trying to figure out what will work. This is a somewhat fine distinction2 but how questions are the kind where you have plenty of facts but don’t know how they fit together; additional facts are evidence, but not usually dispositive. Creating more observations only helps if they are the right observations. What questions are the kind where it is harder to get facts, but each fact can be dispositive.

A how question might be “what will customers do with our product?” What each customer will do is a what question, but what customers as a whole will do is a how question. The answer is often many-fold: customers do not care about the product, they care about their problems and may use your product in ways you did not imagine. If you understand the users’ problems, you can understand how they will decide to use the product. (You could, of course, just ask them, avoiding trial and error and using simple observation. But customers can’t always articulate their problems and you might have to figure them out indirectly.)

A what question might be “which configuration of our product will be most useful?” Often there is one configuration that just works better than any other. Sometimes you can home in on this through expertise, but many products have had breakthrough moments where a change in configuration causes a sea-change in use. These breakthroughs are unpredictable.

Science primarily addresses how questions (though it does, of course, also address what questions: what is the charge of the electron, what is the speed of light) while biological evolution primarily addresses what questions. Each process is optimized for its particular type of problem.

The Scientific Method

Feynman at the board

If the method of trial and error is developed more and more consciously, then it begins to take on the characteristic features of the ‘scientific method’. This ‘method’ can briefly be described as follows. Faced with a certain problem, the scientist offers, tentatively, some sort of solution—a theory. This theory science only accepts provisionally, if at all; and it is most characteristic of the scientific method that scientists will spare no pains to criticize and test the theory in question. Criticizing and testing go hand in hand; the theory is criticized from very many different sides in order to bring out those points that may be vulnerable. And the testing of the theory proceeds by exposing these vulnerable points to as severe an examination as possible…success depends mainly on three conditions, namely, that sufficiently numerous (and ingenious) theories should be offered, that the theories offered should be sufficiently varied, and that sufficiently severe tests should be made.

Popper, K.R. (2002). Conjectures and refutations. United Kingdom: Routledge, pp. 420-421. Originally published 1963.

Richard Feynman explains the scientific method during the last of his Cornell Messenger Lectures like this: guess, compute consequences, compare against experiment, if unexpected results then iterate. This is essentially the same scientific method you learned in high school:

  1. Make an observation.
  2. Ask a question.
  3. Form a hypothesis, or testable explanation.
  4. Make a prediction based on the hypothesis.
  5. Test the prediction.
  6. Iterate: use the results to make new hypotheses or predictions.3

Each experiment is a trial, and experiments that don’t agree with the predictions made beforehand are “errors.” Of course, scientists don’t call them errors: the point of an experiment is to learn, and you can learn from failures as well as successes (though, usually, not as much). They are results, not errors, as we discussed. Scientists experiment for many reasons: confirming the predictions of a theory, sharpening a theory by exploring some of its components (how they interact, what they do in unusual circumstances, etc.), and so on. Most experiments have some of both exploration and confirmation, and the relative amount affects the design of the experiment.

I need to point out, before we start, that science is more than just the scientific method. Scientists’ insistence on experiment as the sine qua non is, perhaps, just institutional insecurity left over from science’s split from natural philosophy.4 Experiment is a way of grounding reasoning in the natural world and, since the aim of science is understanding the natural world, this is crucially important. But science is more than just experiment. Einstein, after all, conducted no experiments of note, but he was an extraordinarily successful scientist. He used observations others had made to reason out how the world must be working. This building upon observation is as important as making the observations themselves. Saying one is more fundamental than the other, though, is just arguing where a circle begins. Note also that while all experiments generate observations, some are primarily designed to elicit unknown information while others are primarily designed to test whether hypotheses are true. Science, like any developed knowledge-generation process, uses trial and error alongside direct observation, deduction, and induction.

Science also includes ways to make the process of science work better. Ways of achieving consensus around the truth of an observation, disseminating observations and hypotheses, parallelizing experiments, rewarding success, etc. These can be interesting as ways to understand how to make trial and error work more effectively, even if they are not part of Trial-and-Error per se.

I mentioned that figuring out what to try is one of the hardest parts of the trial and error process. It is often the first question you’ll struggle with when running a trial and error process. Popper said:

The initial stage, the act of conceiving or inventing a theory, seems to me to neither call for logical analysis nor be susceptible of it.

Popper, K.R. (2002). The Logic of Scientific Discovery. New York: Routledge Classics, p. 7. Originally published 1935.

But, in fact, science has a well-developed way of choosing trials. Simple observations lead to questions that lead to more observations. These observations, at some point, can be collected into a theory (composed of mechanisms, models, or both). The set of observations can often be explained by more than one theory, so experiments are designed to test the points of difference and determine if one or the other is wrong.
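
One way to picture “testing the points of difference” is the sketch below. It is a toy under stated assumptions: the two theories (`linear` and `curved`), the list of conditions, the observed value, and the tolerance are all invented, and the function names are hypothetical. The idea is only that the most useful experiment is the one where the competing theories disagree most.

```python
from typing import Callable, Dict, List, Sequence

Theory = Callable[[float], float]  # maps an experimental condition to a prediction


def most_discriminating(a: Theory, b: Theory, conditions: Sequence[float]) -> float:
    """Pick the condition at which the two theories' predictions differ most."""
    return max(conditions, key=lambda x: abs(a(x) - b(x)))


def surviving(theories: Dict[str, Theory], condition: float,
              observed: float, tolerance: float) -> List[str]:
    """Keep only the theories whose prediction matches the observation."""
    return [name for name, theory in theories.items()
            if abs(theory(condition) - observed) <= tolerance]


# Two made-up theories that nearly agree for small x and diverge for large x.
theories: Dict[str, Theory] = {
    "linear": lambda x: x,
    "curved": lambda x: x + 0.1 * x ** 2,
}

conditions = [0.0, 1.0, 2.0, 5.0]
where = most_discriminating(theories["linear"], theories["curved"], conditions)
print(where)  # 5.0: the theories disagree most here

# Hypothetical measurement at that condition.
print(surviving(theories, where, observed=5.1, tolerance=0.5))  # ['linear']
```

An experiment at x = 0 or x = 1 would have cost just as much to imagine but could not have told the theories apart.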

Empirical constraints are simply not enough to explain theoretical relevance.
-Bechtel and Richardson5

Why did Popper think this was mysterious? Perhaps because infinitely many explanations can be fit to any set of data. But scientists are practical people. They are happy putting forth theories they know are incomplete, or even slightly wrong, hoping the details will be filled in and errors corrected later. They can choose the simplest explanation for a given set of data and leave it to stand until enough new data contradicts it. Because of the hierarchical nature of science the imperfect theory can be incomplete, vague, and even wrong and still be useful. Further experiments will improve the theory. These further observations might refine the theory or might point to an entirely new theory. Kuhn said, “The unit of scientific achievement is the solved problem.”6 But I think it is more accurate to say that the goal of each trial is theory improvement. Problems are never quite solved.

Figuring out how the world works is similar, in a very high-level sense, to figuring out how anything else works. Sometimes you can look at the thing and see how it works. You can use direct observation to describe it. You know if you are correct if your explanation allows you to predict how that thing will work in the future. But sometimes you can’t just look at a thing and see how it works. The observables may not fully specify the thing. This may be because you can’t really see the thing in its entirety. Perhaps it is not yet directly observable, like microorganisms once and like dark matter today, so you can only observe its effects on other things. Perhaps it is a type of black box that you can’t open, like the proton, so you can’t see its workings. Perhaps there are many possible ways to describe the observations you have made and you don’t have enough information to pick one over another.

What do you do then? What does a scientist do then? Mostly you and the scientist would do the same sorts of things: if you can’t observe something directly, you observe it indirectly by seeing its effects on other things. If you can’t open the black box you shake it and prod it and see what happens. If there are many possible mechanisms for something, you settle on one that seems likely and see if it is predictive…if it isn’t, you try a different one. Direct observation is better, of course: progress in biology went much more quickly once the microscope was invented. But we have to try to use the observations we have to imagine a mechanism that might produce them and build a model that will predict them. Experimentation is vital in this process in two ways: in generating observations to use and in winnowing out mechanisms and models that fit previous observations but are wrong.

The space of all possible experiments is too large for brute force: scientists have to find patterns, and use those patterns to choose experiments to run. Fortunately, there are patterns. Most of what we observe is generated by some deeper process, more fundamental mechanisms. The task of science, after observing the world, is to make sense of these mechanisms. There are also very many of these mechanisms, of course, because most mechanisms are produced by even deeper mechanisms. This hierarchical structure of reality is extremely lucky for us, as Simon observes: “This…construction of science from the roof down to the yet unconstructed foundation was possible because the behavior of the system at each level depended on only a very approximate, simplified, abstracted characterization of the system at the level next beneath.”7 If we could not make progress from our simplest observations to models we could use to predict, before explaining why those models worked, we would not have been able to do anything with science at all until we had finally explained everything. That is, the trial and error that we call the scientific method takes advantage of this hierarchy of mechanisms to avoid being simply a brute force search. This is what makes science, science; the most interesting things the scientific method adds to trial and error are the ways it searches for structure: the how, not just the what.

Theorizing often takes the form of thought experiments. These are not experiments in the Popperian sense: they do not generate data; nor are they, in a strict sense, trial and error. On the other hand, they can falsify hypotheses. They do so through deduction, in some sense of that word: they evaluate theories for plausibility, fit to known data, and internal coherence.8

Einstein came up with his theory of special relativity by trying to reconcile observations about the invariability of the speed of light with what Maxwell’s equations described and then deducing how this affected time and space for different observers. He reasoned: if this is true and this is true, what else also must be true. Notably, the observations Einstein was using could be explained by many possible theories. His “thought experiment” mentally sorted this plethora of theories, casting out the ones that could not be true and the ones he believed were not true. This was pure deduction.

Thought experiments winnow models. Experiments also winnow models. But thought experiments are important because they are generally far cheaper than actual experiments. Einstein could discard hundreds of conceivable models without leaving his desk at the patent office by simply assuming Maxwell’s equations to be true. In areas where thought experiments are more expensive than physical experiments, like testing the effects of a new drug, physical experiments predominate.

Thought experiments can contradict observations in interesting ways. Galileo contradicted Aristotle’s rule that heavy things fall faster than light things using a thought experiment: if a light thing and a heavy thing are chained together, the composite object is heavier than either. If light things fell slower than heavy ones, then the light object would slow down the heavy object by pulling on the chain. So the assembly of light thing and heavy thing should fall at a speed somewhere in between: faster than the light thing but slower than the heavy thing. But the composite thing, the heavy thing and light thing chained together, is heavier than the heavy thing, so it should fall faster than it. The composite thing would have to fall both slower and faster than the heavy thing, which is impossible. Aristotle probably proposed his rule using induction: heavy things generally do fall faster than light things. (If you don’t believe me because your high school physics teacher told you different, try it yourself in a bathtub filled with water.) The thought experiment contradicted observation! This sort of anomaly is an important way that scientists find new phenomena to explore.
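
Written out as a short derivation, Galileo’s argument looks like this. The notation is mine, not Galileo’s: $v(m)$ is the speed at which a body of mass $m$ falls, $m_L$ and $m_H$ are the light and heavy bodies, and $m_C$ is the two chained together.

```latex
\begin{align*}
&\text{Aristotle's rule (assumed): } m_1 < m_2 \implies v(m_1) < v(m_2). \\
&\text{Chain the bodies together: } m_C = m_L + m_H > m_H. \\
&\text{Rule applied to the whole: } v(m_C) > v(m_H). \\
&\text{Rule applied to the parts (the light body holds the heavy one back): }
  v(m_L) < v(m_C) < v(m_H). \\
&\text{Hence } v(m_C) > v(m_H) \text{ and } v(m_C) < v(m_H), \text{ a contradiction.}
\end{align*}
```

No measurement appears anywhere in the derivation, which is what makes it a thought experiment: the rule is rejected because it disagrees with itself.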

They reason theoretically, without demonstrating experimentally, and errors are the result.
-Michael Faraday9

This is not to say that thought experiments are always superior to actual experiments. Not least, as Thomas Huxley said, a beautiful hypothesis can be slain by an ugly fact.10 But deduction (and induction) can make science more efficient, because there is deeper structure. Any system that has deeper structure should pair trial and error with other ways of reasoning to generate trials more efficiently.

By deeper structure I mean that models can make understanding easier because they allow you to compress the solution space. Imagine throwing a ball through the air and noting its x, y, and z coordinates every millisecond along the way. This would be a lot of numbers. If you had this list of numbers and you were asked where the ball was at time t, you could look it up. More generally, you can imagine a four-dimensional geometric space that has every combination of imaginable paths of the ball. Call this the solution space. This space would show the path of the ball if you threw it hard in one direction and if you threw it softly in another, etc. For each combination of direction and speed of throw, the path the ball would take would be in the solution space. Finding the path of the ball through naive experimentation is like searching through this space. But the solution space would also have many paths that the ball could never take: reversing direction, defying gravity, going faster than light, etc. These paths could be removed from the solution space, making it smaller and easier to search. The pruned-down solution space is much, much smaller than the initial solution space because we have imposed constraints. But in science we can take this a step further. The constrained solution space can be made even smaller because it is highly redundant: given two four-dimensional points on a path, the rest of the points can be found. The entire solution space can be collapsed into a formula, a mathematical description of a model. Science expects this to be true.
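
To make the compression concrete, here is a minimal sketch of recovering a whole path from two observed points. It assumes the simplest possible model, which is my choice for illustration: constant gravity along the z axis and no air resistance. The names `fit_path` and `true_position`, and the particular throw used to generate the observations, are hypothetical.

```python
from typing import Callable, Tuple

Vec = Tuple[float, float, float]
Observation = Tuple[float, Vec]  # (time, (x, y, z))

G = 9.81                 # assumed gravitational acceleration, m/s^2
ACCEL = (0.0, 0.0, -G)   # assumed: gravity acts only along -z, no drag


def fit_path(p1: Observation, p2: Observation) -> Callable[[float], Vec]:
    """Recover position(t) for all t from two observed points on the path."""
    (t1, r1), (t2, r2) = p1, p2
    # From r(t) = r0 + v0*t + a*t^2/2 evaluated at t1 and t2:
    v0 = tuple((r2[i] - r1[i]) / (t2 - t1) - ACCEL[i] * (t1 + t2) / 2
               for i in range(3))
    r0 = tuple(r1[i] - v0[i] * t1 - ACCEL[i] * t1 ** 2 / 2 for i in range(3))

    def position(t: float) -> Vec:
        return tuple(r0[i] + v0[i] * t + ACCEL[i] * t ** 2 / 2
                     for i in range(3))

    return position


# A made-up throw, used only to generate two "observations".
def true_position(t: float) -> Vec:
    return (3.0 * t, 4.0 * t, 1.0 + 5.0 * t - 0.5 * G * t * t)


position = fit_path((0.2, true_position(0.2)), (0.7, true_position(0.7)))
print(position(1.0))       # predicted from two points and the model
print(true_position(1.0))  # what the throw actually does
```

The long table of millisecond-by-millisecond coordinates has been replaced by two points and a formula. With drag, wind, or spin the model would need more parameters and more observations, but the principle is the same.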

Why this should be so is an open question. Of course, simple systems can give rise to complex phenomena, so in most worlds observations will be more complex than the system giving rise to them. It may be that we keep peeling back layers of the onion, saying “look it’s actually simpler under here!” and that at some point we will peel back a layer and find irreducible complexity. Or there may be some other explanation.11

Models allow us to make sense of observations and suggest which observations to make. Einstein said “If a researcher would approach things without a preconceived opinion, how would he be able to pick the facts from the tremendous richness of the most complicated experiences that are simple enough to reveal their connections through laws?”12 If the model does not explain the phenomena completely or precisely enough, then scientists will do experiments to expand or sharpen the model. Moreover, with any set of data, there is always more than one theory that can account for it. Occam’s Razor doesn’t apply to data, but it does to models.

If you are examining a complex system, brute force trial and error may be enormously inefficient. If you can’t generalize, there may be too many observations that can be made; if you don’t theorize how the non-observable state of the system changes, the same observation may have different results. The map might have to be as big as the territory if you can’t find some way to compress it. But if the complex system is generated by some simpler system, you can make trial and error much more efficient, as science has done.

Scientists also take other considerations into account when choosing trials.

There are some experiments that should not be conducted because they are unethical. This may be because there are externalities generated by the experiment that are borne by those who did not choose to bear them. This is true when an experiment is irreversible and may have an existential outcome, for instance. But it can be true of smaller experiments as well. Unethical experiments should be eliminated from consideration.

If experiments are ethical, cost/benefit comes into play. Given the same potential benefit, scientists will (rationally) do the cheaper experiments first. Cheaper, but not necessarily cheap. The Large Hadron Collider was not cheap, but it seemed the cheapest way to make the observations particle physicists needed to progress. Thought experiments are always cheapest, and money is always a constraint. This may be why theorists are so preeminent in physics.

Time is another constraint. Scientists will generally prefer to put the knowledge they already have to use, if they can, rather than spend the time learning an entirely new set of things. This is a rational cost calculation.

Costs are usually more apparent than benefits. This is part and parcel of uncertainty. Benefits may sometimes be known, at least to an extent. The benefits of cold fusion can be analytically estimated, and would be enormous. Sometimes benefits can be ranked relative to others: knowing the mass of the electron might have been, idk, more valuable than knowing the mass of the neutrino. If the cost of determining either was approximately the same, this allows a determination of which trial to prefer. Etc.

We can determine more interesting benefits by thinking about the process. Experiments that can rule out swathes of other experiments have the benefit of not having to incur the costs of those experiments. Experiments that can open up new, more productive areas of exploration have value that others don’t. Etc. When scientists calculate benefits, they include the follow-on benefits: if an experiment must be performed before other important experiments, then this value has to be tallied. These follow-on benefits may be more important than the immediate result. Calculating the benefit of an experiment as simply the value of its observed outcome is short-sighted in a repeated process. You must think about the benefit to the course of the process itself.
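One rough way to see how much follow-on benefits can matter is to tally them explicitly. The sketch below is only an illustration: the `Experiment` class, the experiment names, and every number are invented for the example, and real benefits are rarely this easy to quantify. It values an experiment as its direct benefit minus its cost, plus the net value of any worthwhile experiments it makes possible, plus the costs of the experiments it lets you skip.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Experiment:
    # All fields are hypothetical quantities in arbitrary units.
    name: str
    cost: float
    direct_benefit: float
    # Experiments that can only be run after this one.
    unlocks: List["Experiment"] = field(default_factory=list)
    # Experiments that this one would make unnecessary.
    rules_out: List["Experiment"] = field(default_factory=list)


def naive_value(e: Experiment) -> float:
    """Value of the observed outcome alone, net of cost."""
    return e.direct_benefit - e.cost


def process_value(e: Experiment) -> float:
    """Net value including follow-on benefits.

    Adds the net value of each unlocked experiment (only if it is worth
    running, hence the max with zero) and the avoided cost of each
    experiment that is ruled out.
    """
    follow_on = sum(max(process_value(u), 0.0) for u in e.unlocks)
    avoided = sum(r.cost for r in e.rules_out)
    return e.direct_benefit - e.cost + follow_on + avoided


# Made-up example: a cheap screening run versus a standalone measurement.
decisive = Experiment("decisive test", cost=10, direct_benefit=50)
screening = Experiment(
    "screening run",
    cost=5,
    direct_benefit=2,
    unlocks=[decisive],
    rules_out=[
        Experiment("dead end A", cost=8, direct_benefit=0),
        Experiment("dead end B", cost=7, direct_benefit=0),
    ],
)
standalone = Experiment("standalone measurement", cost=5, direct_benefit=9)

for e in (screening, standalone):
    print(e.name, naive_value(e), process_value(e))
```

Judged by its observed outcome alone, the screening run loses money (2 − 5 = −3) and the standalone measurement looks better (9 − 5 = 4). Once the unlocked experiment and the avoided dead ends are counted, the screening run comes out far ahead (52 versus 4). The sketch assumes each follow-on experiment is unlocked by only one predecessor; if several experiments could unlock the same one, this tally would double count it.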

When does the process stop? When running a maze it’s easy to know when to stop: when you exit the maze. It’s harder in science. Science seems to endlessly experiment on and refine models. But, in reality, some experimental paths are no longer pursued when what they are exploring starts to be exhausted or unproductive. Some models are no longer refined because they have been abandoned (the aether, for example). Some models so accurately represent and predict observations that scientists stop thinking about them: chemical bonds, for example.13

But, then, neither of the examples I just used holds up to scrutiny: if you go to Google Scholar and search for the aether or chemical bonds you’ll find plenty of papers written on both in the past year alone. There are scientists working on refining models that seem to be entirely correct, like Maxwell’s equations.14 Because observations might incompletely specify solutions, problems are never quite ‘solved.’ Indeed, as Kuhn points out, problems may seem closer and closer to being solved and then be upended entirely. The process of solving a specific scientific problem doesn’t stop abruptly; it sort of peters out. Resources shift to areas where models are less accurate or precise because these are often the glaring priorities, or to areas that are more productive or that promise bigger benefits. Sometimes a new observation later reopens a problem and resources again shift to explaining it.

Science balances out more promising and less promising areas of research through parallelism. Different scientists work on different things, though they also often overlap. This is made more efficient by sharing knowledge through publication. And it is motivated by acknowledging the first person or people to make an important advance (even though this is often such an artificial distinction that it seems almost a gamification).

Scientists often work in groups, either formal or informal. Models (hypotheses and theories) are shared in the group and group members generate ideas to test and further the models. The size and number of different groups can modulate the trial and error search process between broad and deep. Having many scientists doing research in parallel is, on the one hand, inefficient. There must be an enormous amount of trialing that is done several times more often than it needs to be: partly because the scientists may need to see the result first-hand, partly because interim results may be kept secret to protect a path towards a bigger result, and partly because failed trials are often not reported. The flip side is that quicker advances can be made because many eyes are looking at the problem, because many different theories can be pursued at the same time, and because competition is a great motivator.

Not all trial and error processes can be parallelized. Running a maze might be difficult to parallelize because it requires sequential trials (depth-first, not breadth-first, search). And some science (large particle accelerator experiments, for instance) is too costly to parallelize much. (Cost and parallelism are tradeoffs when money is a constraint; this is why there is far more parallelism in theorizing.) But parallelism can greatly increase the speed of a trial and error process.

Science has customized trial and error in several interesting ways: trial generation and choice, what it considers an ‘error’, iteration and stopping conditions, and parallelism. These contradict several naive views on trial and error.

The naïve view of trial and error is: you try random things until something works (“blind variation…”). In reality, we try things that seem most likely to succeed. Science does this, but also tries things most likely to generate more knowledge, knowledge that might help future trials. Understanding why things happen, rather than what happens, is more general and more valuable. After a trial’s results are in, deduction and induction are used to help decide what to try next.

We can also refine the naïve view by noting that “most likely to succeed” must be modulated with cost calculations and benefit calculations and, when possible, comparing the two.

The naïve view is that when a trial doesn’t work as planned, it is an error and you should backtrack and try something else. Science learns from things that don’t work, as well as things that do. Both help explain how things work. Any piece of evidence can help narrow the space of possible explanations.

The naïve view is that the error part of trial and error is easy to spot. The truth is that it often isn’t. Science can go a long way down blind alleys. In addition, scientists often misinterpret their results, even when they are careful to set out what constitutes local success or failure. Real-world results are often somewhat ambiguous, and can increase or decrease the probability of a model without fully proving or disproving it. And because models are built on observations and observations are taken to build models, scientists can find themselves using data to support entirely wrong models by expanding those models to encompass new observations. It is not always easy to know what is true.
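The point about ambiguous results can be made concrete with Bayes’ rule, which is one standard way (not the only one, and not something this post has relied on so far) to say how much a piece of evidence should shift belief in a model. The function name and all of the probabilities below are invented for illustration.

```python
def update_belief(prior: float, p_result_if_true: float, p_result_if_false: float) -> float:
    """Posterior probability that a model is true after one result (Bayes' rule).

    prior: belief in the model before the result, between 0 and 1.
    p_result_if_true: chance of seeing this result if the model is true.
    p_result_if_false: chance of seeing this result if the model is false.
    """
    numerator = p_result_if_true * prior
    evidence = numerator + p_result_if_false * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("this result is impossible under both hypotheses")
    return numerator / evidence


belief = 0.5  # hypothetical starting point: no strong opinion either way

# A result the model predicts 80% of the time, but which would also turn up
# 40% of the time if the model were wrong: support, but far from proof.
belief = update_belief(belief, 0.8, 0.4)
print(round(belief, 2))  # about 0.67

# A later result that the model predicts only 20% of the time, and which
# would turn up 60% of the time if the model were wrong.
belief = update_belief(belief, 0.2, 0.6)
print(round(belief, 2))  # about 0.40
```

Neither result settles anything. The first raises the probability of the model and the second lowers it below where it started, which is what it looks like for evidence to shift the odds without proving or disproving the model.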

The naïve view is that you try things until you find something that works. Science, on the other hand, does not stop when it finds something that works; it continues to test and refine it. Science is an open-ended process. The goal is to increase knowledge. Rather than thinking of stopping conditions, it makes more sense to think of resource allocation between different trial and error processes. When one process seems less interesting, either because it has bogged down or because it has adequately explained the phenomena (and both may be because costs have become too high or benefits too low), resources move elsewhere.

Scientists must also include the task of convincing others in what they consider success. If they don’t, then their experiments are wasted.

Parallelism doesn’t contradict a naïve view: most people know you can try many things at once. Science tries many things at once, but in a constrained way. There are prevailing schools of thought that both compete and coordinate, and within those schools are sub-schools that also compete and coordinate. This narrows the scope of parallelism. Instead of many people doing unrelated things until somebody makes a breakthrough (a breadth-first approach), those people organize into groups (formal or informal) to go deeper on specific approaches. This balance between depth-first and breadth-first makes sense when you might have to go quite deep before you have enough evidence of your model to discard other models.

All these things broaden our view of what trial and error is and how it can be configured. In the next post we will talk about a different trial and error process—biological evolution—to see another way. Then we will talk about the levers you can pull and dials you can turn when designing ad hoc trial and error processes, including business processes.

Bibliographical Note

I bit off more than I can chew with these posts. It seemed to me that trial and error would be pretty straightforward. But I’ve convinced myself while writing it that trial and error is as fundamental a knowledge-gaining mechanism as, say, induction. And a whole lot more has been written about induction. Maybe because everyone else has made the same mistake?

Anyway, I did a lot of reading for this, and I looked for others’ thoughts about it quite broadly. I didn’t find a lot but I did end up doing a ton more reading than I planned. Where I have quoted or drawn directly I have cited works, but several other books and papers fed into the process and some books fed into the process far more than what I quoted would suggest.

  • Bechtel, W., Richardson, R. C. (2010). Discovering Complexity: Decomposition and Localization as Strategies in Scientific Research. USA: MIT Press.
  • Glennan, S. (2017). The New Mechanical Philosophy. United Kingdom: Oxford University Press.
  • Machamer, P.K., Darden, L., Craver, C.F. (2000). “Thinking about Mechanisms”, Philosophy of Science, 67:1–25.
  • Hull, D.L. (1988). Science as a Process: An Evolutionary Account of the Social and Conceptual Development of Science. Chicago: University of Chicago Press. (Hull argues, following a comment by Kuhn, that science is like evolution. He then describes how a selection process works. I think his explication would be more powerful if he jettisoned the idea that science is evolutionary and just called his “selection” trial and error. The Kuhn remark, towards the end of Structure, avers that the choice between scientific paradigms is made through “selection by conflict within the scientific community of the fittest way to practice future science.” This seems to gloss over that the loser in a biological bout of survival of the fittest dies. This is a definitive way to resolve a conflict. What do the followers of Kuhn imagine actually decides a scientific conflict? If the answer is anything like “people are swayed by a more convincing argument,” it is really hard to see how this is remotely akin to biological evolution, whose sine qua non is that it is an emergent process, where nobody makes decisions. Like Kuhn, Hull does not tie the selection process back to the ultimate arbiter of truth: that the science explains the phenomena.)
  • Galison, P. (1987). How Experiments End. Chicago: University of Chicago Press. (This book is more about how scientists decide to end the trial and error process because they have been convinced they are correct, as opposed to ending it because it is unfruitful.)
  • Shapin, S., Schaffer, S. (1985). Leviathan and the Air-Pump. New Joisey: Princeton University Press. (This book was eye-opening. Hobbes’ rejection of experiment as a preeminent way to create knowledge, as opposed to reason alone, is difficult to entirely dismiss. Especially if you’ve gone through the kind of educational regime where they make you read too much philosophy. The idea that real knowledge must be bottom-up, from first principles, is appealing. Top-down knowledge has, after all, frequently led us astray. Aside from that, Boyle’s program of building science itself points out some things that are non-obvious because we take them for granted, like the idea that finding a fact isn’t enough, you must also convince others that you have found it.)

I mean, the thing you should know is, the less sure I am about what I am saying, the more I look for backup. The length of the reading list on this one should tell you all you need to know about my own opinion of it. This all seems like a natural extension of Herbert Simon’s work, and I have to believe that someone smarter than me has extended it. If I had found that work, I would have summarized it for you. I did not. Maybe it exists? Let me know if you find it.


  1. For instance, in Simon, H. A. (1964). On the Concept of Organizational Goal. Administrative Science Quarterly, 9(1), 1–22. https://doi.org/10.2307/2391519 

  2. And the two are analogous in the sense that finding what is determining a point in a solution space while finding how is the same but in a somewhat more abstract solution space, where whats have been mapped to causes. 

  3. Khan Academy, “The Scientific Method”, https://www.khanacademy.org/science/high-school-biology/hs-biology-foundations/hs-biology-and-the-scientific-method/a/the-science-of-biology 

  4. For background, see Shapin, S., Schaffer, S. (2017). Leviathan and the Air-Pump. Princeton: Princeton University Press, though I in no way attribute this view to the authors. 

  5. In Bechtel, W., Richardson, R. C. (2010). Discovering Complexity: Decomposition and Localization as Strategies in Scientific Research. USA: MIT Press, p. 5. 

  6. Kuhn, T. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press (2nd Edition, 1970), p. 170. 

  7. Simon, H. A. (1969). The Sciences of the Artificial. Cambridge, MA: MIT Press, p. 17. 

  8. There is probably a Kuhnian counter-point here, that theory choice is not really so rational, but I think the irrationality is simply a failure of the deductive process or, in fact, Simon’s satisficing. Since this essay is working towards the normative, I’m avoiding that discussion. 

  9. In a letter to Benjamin Abbott, Feb 23, 1815; given in James, F.A.J.L., ed., The Correspondence of Michael Faraday, vol. 1, London: Institution of Engineering and Technology, 1991, p. 128. 

  10. Huxley, T. (2011). Biogenesis and Abiogenesis [1870]. In Collected Essays (Cambridge Library Collection – Philosophy, p. 244). Cambridge: Cambridge University Press. doi:10.1017/CBO9781139149273.009 

  11. https://aeon.co/essays/why-is-simplicity-so-unreasonably-effective-at-scientific-explanation 

  12. Einstein, A., “Induction and Deduction in Physics”, Berliner Tageblatt, 25 December 1919, p. 1, Beiblatt. In The Collected Papers of Albert Einstein, Volume 7: The Berlin Years: Writings, 1918-1921 (English translation supplement), translated by Alfred Engel; https://einsteinpapers.press.princeton.edu/vol7-trans/124 

  13. See How Experiments End, for more on this. 

  14. Such as, I assume, this. I can’t really tell you for sure because I’m having a little trouble making sense of the abstract.