Logics of discovery and justification

An ideal theory of scientific method would consist of instructions that could lead an investigator from ignorance to knowledge. Descartes and Bacon sometimes wrote as if they could offer so ideal a theory, but after the mid-20th century the orthodox view was that this is too much to ask for. Following Hans Reichenbach (1891–1953), philosophers often distinguished between the “context of discovery” and the “context of justification.” Once a hypothesis has been proposed, there are canons of logic that determine whether or not it should be accepted—that is, there are rules of method that hold in the context of justification. There are, however, no such rules that will guide someone to formulate the right hypothesis, or even hypotheses that are plausible or fruitful. The logical empiricists were led to this conclusion by reflecting on cases in which scientific discoveries were made either by imaginative leaps or by lucky accidents; a favourite example was the hypothesis by August Kekulé (1829–96) that benzene molecules have a hexagonal structure, allegedly formed as he was dozing in front of a fire in which the live coals seemed to resemble a snake devouring its own tail.

Although the idea that there cannot be a logic of scientific discovery often assumed the status of orthodoxy, it was not unquestioned. As will become clear below (see Scientific change), one of the implications of the influential work of Thomas Kuhn (1922–96) in the philosophy of science was that considerations of the likelihood of future discoveries of particular kinds are sometimes entangled with judgments of evidence, so discovery can be dismissed as an irrational process only if one is prepared to concede that the irrationality also infects the context of justification itself.

Sometimes in response to Kuhn and sometimes for independent reasons, philosophers tried to analyze particular instances of complex scientific discoveries, showing how the scientists involved appear to have followed identifiable methods and strategies. The most ambitious response to the empiricist orthodoxy tried to do exactly what was abandoned as hopeless—to wit, specify formal procedures for producing hypotheses in response to an available body of evidence. So, for example, the American philosopher Clark Glymour and his associates wrote computer programs to generate hypotheses in response to statistical evidence, hypotheses that often introduced new variables that did not themselves figure in the data. These programs were applied in various traditionally difficult areas of natural and social scientific research. Perhaps, then, logical empiricism was premature in writing off the context of discovery as beyond the range of philosophical analysis.

In contrast, logical empiricists worked vigorously on the problem of understanding scientific justification. Inspired by the thought that Frege, Russell, and Hilbert had given a completely precise specification of the conditions under which premises deductively imply a conclusion, philosophers of science hoped to offer a “logic of confirmation” that would identify, with equal precision, the conditions under which a body of evidence supported a scientific hypothesis. They recognized, of course, that a series of experimental reports on the expansion of metals under heat would not deductively imply the general conclusion that all metals expand when heated—for even if all the reports were correct, it would still be possible that the very next metal to be examined failed to expand under heat. Nonetheless, it seemed that a sufficiently large and sufficiently varied collection of reports would provide some support, even strong support, for the generalization. The philosophical task was to make precise this intuitive judgment about support.

During the 1940s, two prominent logical empiricists, Rudolf Carnap (1891–1970) and Carl Hempel (1905–97), made influential attempts to solve this problem. Carnap offered a valuable distinction between various versions of the question. The “qualitative” problem of confirmation seeks to specify the conditions under which a body of evidence E supports, to some degree, a hypothesis H. The “comparative” problem seeks to determine when one body of evidence E supports a hypothesis H more than a body of evidence E* supports a hypothesis H* (here E and E* might be the same, or H and H* might be the same). Finally, the “quantitative” problem seeks a function that assigns a numerical measure of the degree to which E supports H. The comparative problem attracted little attention, but Hempel attacked the qualitative problem while Carnap concentrated on the quantitative problem.

It would be natural to assume that the qualitative problem is the easier of the two, and even that it is quite straightforward. Many scientists (and philosophers) were attracted to the idea of hypothetico-deductivism, or the hypothetico-deductive method: scientific hypotheses are confirmed by deducing from them predictions about empirically determinable phenomena, and, when the predictions hold good, support accrues to the hypotheses from which those predictions derive. Hempel’s explorations revealed why so simple a view could not be maintained. An apparently innocuous point about support seems to be that, if E confirms H, then E confirms any statement that can be deduced from H. Suppose, then, that H deductively implies E, and E has been ascertained by observation or experiment. If H is now conjoined with any arbitrary statement, the resulting conjunction will also deductively imply E. Hypothetico-deductivism says that this conjunction is confirmed by the evidence. By the innocuous point, E confirms any deductive consequence of the conjunction. One such deductive consequence is the arbitrary statement. So one reaches the conclusion that E, which might be anything whatsoever, confirms any arbitrary statement.
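The steps of this argument can be set out schematically. In the sketch below, X stands for the arbitrary statement and the turnstile for deductive implication; the labels HD (for the hypothetico-deductive criterion) and SC (for the "innocuous point" about deductive consequences) are introduced here only for convenience and are not part of the original discussion.

```latex
\begin{align*}
&1.\ H \vdash E && \text{assumption: $H$ deductively implies $E$} \\
&2.\ (H \wedge X) \vdash E && \text{from 1, for an arbitrary statement $X$} \\
&3.\ E \text{ confirms } H \wedge X && \text{from 2 by HD, since $E$ has been observed} \\
&4.\ (H \wedge X) \vdash X && \text{a conjunction implies each of its conjuncts} \\
&5.\ E \text{ confirms } X && \text{from 3 and 4 by SC}
\end{align*}
```

Each step taken singly looks harmless; the trouble lies in combining HD with SC, since nothing in steps 2 through 5 depends on what X says.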

To see how bad this is, consider one of the great predictive theories—for example, Newton’s account of the motions of the heavenly bodies. Hypothetico-deductivism looks promising in cases like this, precisely because Newton’s theory seems to yield many predictions that can be checked and found to be correct. But if one tacks on to Newtonian theory any doctrine one pleases—perhaps the claim that global warming is the result of the activities of elves at the North Pole—then the expanded theory will equally yield the old predictions. On the account of confirmation just offered, the predictions confirm the expanded theory and any statement that follows deductively from it, including the elfin warming theory.

Hempel’s work showed that this was only the start of the complexities of the problem of qualitative confirmation, and, although he and later philosophers made headway in addressing the difficulties, it seemed to many confirmation theorists that the quantitative problem was more tractable. Carnap’s own attempts to tackle that problem, carried out in the 1940s and ’50s, aimed to emulate the achievements of deductive logic. Carnap considered artificial systems whose expressive power falls dramatically short of the languages actually used in the practice of the sciences, and he hoped to define for any pair of statements in his restricted languages a function that would measure the degree to which the second supports the first. His painstaking research made it apparent that there were infinitely many functions (indeed, continuum many, a “larger” infinity corresponding to the size of the set of real numbers) satisfying the criteria he considered admissible, so that those criteria failed to single out any one measure of confirmation. Despite the failure of the official project, however, he argued in detail for a connection between confirmation and probability, showing that, given certain apparently reasonable assumptions, the degree-of-confirmation function must satisfy the axioms of the probability calculus.

Bayesian confirmation

That conclusion was extended in the most prominent contemporary approach to issues of confirmation, so-called Bayesianism, named for the English clergyman and mathematician Thomas Bayes (1702–61). The guiding thought of Bayesianism is that acquiring evidence modifies the probability rationally assigned to a hypothesis.

For a simple version of the thought, a hackneyed example will suffice. If one is asked what probability should be assigned to drawing the king of hearts from a standard deck of 52 cards, one would almost certainly answer 1/52. Suppose now that one obtains information to the effect that a face card (counting the ace along with the king, queen, and jack) will be drawn; now the probability shifts from 1/52 to 1/16. If one learns in addition that the card will be red, the probability increases to 1/8. Adding the information that the card is neither an ace nor a queen makes the probability 1/4. As the evidence comes in, one forms a probability that is conditional on the information one now has, and in this case the evidence drives the probability upward. (This need not have been the case: if one had learned that the card drawn was a jack, the probability of drawing the king of hearts would have plummeted to 0.)
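The arithmetic of the card example can be checked by brute enumeration. The following is a minimal sketch, assuming a uniformly shuffled deck and following the example's convention that the ace counts as a face card; the names `probability_given`, `is_face`, `is_red`, and `not_ace_or_queen` are hypothetical helpers chosen for illustration.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs, assumed equally likely

KING_OF_HEARTS = ("K", "hearts")


def probability_given(evidence):
    """Probability of drawing the king of hearts, conditional on `evidence`.

    `evidence` is a predicate on cards; only cards satisfying it remain possible.
    """
    possible = [card for card in DECK if evidence(card)]
    favourable = [card for card in possible if card == KING_OF_HEARTS]
    return Fraction(len(favourable), len(possible))


def is_face(card):
    # Convention of the example in the text: the ace counts as a face card.
    return card[0] in {"A", "K", "Q", "J"}


def is_red(card):
    return card[1] in {"hearts", "diamonds"}


def not_ace_or_queen(card):
    return card[0] not in {"A", "Q"}


no_information = probability_given(lambda card: True)
face = probability_given(is_face)
face_and_red = probability_given(lambda c: is_face(c) and is_red(c))
all_three = probability_given(
    lambda c: is_face(c) and is_red(c) and not_ace_or_queen(c)
)
jack = probability_given(lambda c: c[0] == "J")

print(no_information, face, face_and_red, all_three, jack)
```

Run as written, the enumeration reproduces the values given in the text: 1/52, 1/16, 1/8, 1/4, and, for the information that the card is a jack, 0.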

Bayes is renowned for a theorem that explains an important relationship between conditional probabilities. Suppose that, at a particular stage in an inquiry, a scientist assigns a probability $\Pr(H)$ to the hypothesis $H$ (call this the prior probability of $H$) and assigns probabilities to the evidential reports conditionally on the truth of $H$, $\Pr_H(E)$, and conditionally on the falsehood of $H$, $\Pr_{-H}(E)$. Bayes’s theorem then gives a value for the probability of the hypothesis $H$ conditionally on the evidence $E$ by the formula

$$\Pr_E(H) = \frac{\Pr(H)\,\Pr_H(E)}{\Pr(H)\,\Pr_H(E) + \Pr(-H)\,\Pr_{-H}(E)}.$$

One of the attractive features of this approach to confirmation is that when the evidence would be highly improbable if the hypothesis were false (that is, when $\Pr_{-H}(E)$ is extremely small) it is easy to see how a hypothesis with a quite low prior probability can acquire a probability close to 1 when the evidence comes in. (This holds even when $\Pr(H)$ is quite small and $\Pr(-H)$, the probability that $H$ is false, correspondingly large; if $E$ follows deductively from $H$, $\Pr_H(E)$ will be 1; hence, if $\Pr_{-H}(E)$ is tiny, the numerator of the right side of the formula will be very close to the denominator, and the value of the right side thus approaches 1.)
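A small numerical sketch may make the point concrete. The function below simply evaluates the formula for a hypothesis and its negation; the particular numbers (a prior of 1 in 100, and evidence that would be a one-in-a-million occurrence if the hypothesis were false) are assumptions chosen for illustration, not values drawn from any actual scientific case.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes's theorem for a hypothesis H and its negation.

    prior:               the prior probability of H
    likelihood_if_true:  probability of the evidence E if H is true
    likelihood_if_false: probability of the evidence E if H is false
    """
    numerator = prior * likelihood_if_true
    denominator = numerator + (1 - prior) * likelihood_if_false
    return numerator / denominator


# Illustrative numbers only: a low prior, evidence that follows deductively
# from H, and evidence that would be very improbable if H were false.
surprising_evidence = posterior(0.01, 1.0, 1e-6)

# Same prior, but evidence that is almost as likely if H is false.
unsurprising_evidence = posterior(0.01, 1.0, 0.9)

print(surprising_evidence)
print(unsurprising_evidence)
```

With the first set of numbers the resulting probability exceeds 0.999, whereas with the second it barely moves from the prior of 0.01. On this sketch, what matters is how improbable the evidence would be were the hypothesis false, not merely that the hypothesis predicts it.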

Any use of Bayes’s theorem to reconstruct scientific reasoning plainly depends on the idea that scientists can assign the pertinent probabilities, both the prior probabilities and the probabilities of the evidence conditional on various hypotheses. But how should scientists conclude that the probability of an interesting hypothesis takes on a particular value or that a certain evidential finding would be extremely improbable if the interesting hypothesis were false? The simple example about drawing from a deck of cards is potentially misleading in this respect, because in this case there seems to be available a straightforward means of calculating the probability that a specific card, such as the king of hearts, will be drawn. There is no obvious analogue with respect to scientific hypotheses. It would seem foolish, for example, to suppose that there is some list of potential scientific hypotheses, each of which is equally likely to hold true of the universe.

Bayesians are divided in their responses to this difficulty. A relatively small minority—the so-called “objective” Bayesians—hope to find objective criteria for the rational assignment of prior probabilities. The majority position—“subjective” Bayesianism, sometimes also called personalism—supposes, by contrast, that no such criteria are to be found. The only limits on rational choice of prior probabilities stem from the need to give each truth of logic and mathematics the probability 1 and to provide a value different from both 0 and 1 for every empirical statement. The former proviso reflects the view that the laws of logic and mathematics cannot be false; the latter embodies the idea that any statement whose truth or falsity is not determined by the laws of logic and mathematics might turn out to be true (or false).

On the face of it, subjective Bayesianism appears incapable of providing any serious reconstruction of scientific reasoning. Thus, imagine two scientists of the late 17th century who differ in their initial assessments of Newton’s account of the motions of the heavenly bodies. One begins by assigning the Newtonian hypothesis a small but significant probability; the other attributes a probability that is truly minute. As they collect evidence, both modify their probability judgments in accordance with Bayes’s theorem, and, in both instances, the probability of the Newtonian hypothesis goes up. For the first scientist it approaches 1. The second, however, has begun with so minute a probability that, even with a large body of positive evidence for the Newtonian hypothesis, the final value assigned is still tiny. From the subjective Bayesian perspective, both have proceeded impeccably. Yet, at the end of the day, they diverge quite radically in their assessment of the hypothesis.
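The divergence can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that each of 20 evidential reports follows deductively from the hypothesis and would have probability 0.5 if the hypothesis were false, and that the two scientists begin with priors of 0.05 and 10^-30; none of these figures comes from the historical record.

```python
def update(probability, likelihood_if_true, likelihood_if_false):
    """One application of Bayes's theorem to the current probability of H."""
    numerator = probability * likelihood_if_true
    return numerator / (numerator + (1 - probability) * likelihood_if_false)


def after_evidence(prior, reports, likelihood_if_true=1.0, likelihood_if_false=0.5):
    """Probability of H after conditioning on `reports` favourable reports in turn."""
    probability = prior
    for _ in range(reports):
        probability = update(probability, likelihood_if_true, likelihood_if_false)
    return probability


# Hypothetical priors: "small but significant" versus "truly minute".
first_scientist = after_evidence(prior=0.05, reports=20)
second_scientist = after_evidence(prior=1e-30, reports=20)

print(first_scientist)
print(second_scientist)
```

Under these assumptions both probabilities rise with every report, yet the first ends above 0.9999 while the second remains below 10^-20. Both agents have updated in exactly the same way; only their starting points differ.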

If one supposes that the evidence obtained is like that acquired in the decades after the publication of Newton’s hypothesis in his Principia (Philosophiae naturalis principia mathematica, 1687), it may seem possible to resolve the issue as follows: even though both investigators were initially skeptical (both assigned small prior probabilities to Newton’s hypothesis), one gave the hypothesis a serious chance and the other did not; the inquirer who started with the truly minute probability made an irrational judgment that infects the conclusion. No subjective Bayesian can tolerate this diagnosis, however. The Newtonian hypothesis is not a logical or mathematical truth (or a logical or mathematical falsehood), and both scientists give it a probability different from 0 and 1. By subjective Bayesian standards, that is all rational inquirers are asked to do.

The orthodox response to worries of this type is to offer mathematical theorems that demonstrate how individuals starting with different prior probabilities will eventually converge on a common value. Indeed, were the imaginary investigators to keep going long enough, their eventual assignments of probability would differ by an amount as tiny as one cared to make it. In the long run, scientists who lived by Bayesian standards would agree. But, as the English economist (and contributor to the theory of probability and confirmation) John Maynard Keynes (1883–1946) once observed, “in the long run we are all dead.” Scientific decisions are inevitably made in a finite period of time, and the same mathematical explorations that yield convergence theorems will also show that, given a fixed period for decision making, however long it may be, there can be people who satisfy the subjective Bayesian requirements and yet remain about as far apart as possible, even at the end of the evidence-gathering period.
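Both halves of this observation can be seen in a simple two-hypothesis model, which is offered only as an illustration and is not a statement of the convergence theorems themselves. The sketch assumes that every report multiplies the odds on the hypothesis by a fixed factor of 2; the names `posterior_after` and `skeptical_prior`, and all the numerical settings, are hypothetical. Exact fractions are used so that extremely small priors are not lost to rounding.

```python
from fractions import Fraction


def posterior_after(prior, reports, bayes_factor):
    """Probability of H after `reports` favourable reports.

    Each report multiplies the odds on H by `bayes_factor`: the probability of
    the report if H is true divided by its probability if H is false.
    """
    odds = prior / (1 - prior) * bayes_factor ** reports
    return odds / (1 + odds)


def skeptical_prior(reports, ceiling, bayes_factor):
    """A prior strictly between 0 and 1 that ends exactly at `ceiling`
    after `reports` favourable reports."""
    odds = ceiling / (1 - ceiling) / bayes_factor ** reports
    return odds / (1 + odds)


BAYES_FACTOR = 2               # assumed: each report is twice as likely if H is true
OPEN_MINDED = Fraction(1, 20)  # assumed prior of the first inquirer
CEILING = Fraction(1, 100)     # where the skeptic is to stand at the deadline

# Fixed deadlines: however long the period, some admissible prior keeps the
# skeptic at 1/100 while the open-minded inquirer is nearly certain.
for deadline in (20, 200, 2000):
    skeptic = skeptical_prior(deadline, CEILING, BAYES_FACTOR)
    print(
        deadline,
        float(posterior_after(OPEN_MINDED, deadline, BAYES_FACTOR)),
        float(posterior_after(skeptic, deadline, BAYES_FACTOR)),
    )

# The long run: the skeptic tailored to a deadline of 20 reports, allowed to
# gather 200 instead, ends up close to 1 as well.
long_run = posterior_after(
    skeptical_prior(20, CEILING, BAYES_FACTOR), 200, BAYES_FACTOR
)
print(float(long_run))
```

For any fixed prior, enough reports push the probability toward 1, which is the convergence result in miniature. For any fixed number of reports, however, a sufficiently small prior, still different from 0 and so still admissible by subjective Bayesian standards, leaves its holder far from agreement when the deadline arrives.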