`It has been claimed, most prominently by Dr.
Hugh Ross on his web site,
that the so-called "fine-tuning" of
the constants of physics supports a supernatural origin of the universe.
Specifically, it is claimed that many of the constants of physics must be
within a very small range of their actual values, or else life could not
exist in our universe. Since it is alleged that this range is very small,
and since our very existence shows that our universe has values of these
constants that would allow life to exist, it is argued that the probability
that our universe arose by chance is so small that we must seek a supernatural
origin of the universe.`

`In this article we will show that this argument
is wrong. Not only is it wrong, but in fact we will show that the observation
that the universe is "fine-tuned" in this sense can only count
against a supernatural origin of the universe. And we shall furthermore
show that with certain theologies suggested by deities that are both inscrutable
and very powerful, the more "finely-tuned" the universe is, the
more a supernatural origin of the universe is undermined.`

`[ Note added 020106: We have learned that
the philosopher of science, Elliott Sober, has made some similar points
in a recent article written for the Blackwell Guide to Philosophy of
Religion. A recent copy can be obtained here:
We have some slight differences with Professor Sober (in particular, we
think that his condition (A3) is too strong, and that a weaker version of
(A3) actually gives a stronger result), but he has an excellent discussion
of the role that selection bias plays where the bias is due to self-selection
by sentient observers.]`

`Our basic argument starts with a few very simple
assumptions. We believe that anyone who accepts that the universe is "fine-tuned"
for life would find it difficult not to accept these assumptions. They are:`

a) Our universe exists and contains life.

b) Our universe is "life friendly," that is, the conditions in our universe (such as physical laws, etc.) permit or are compatible with life existing naturalistically.

c) Life cannot exist in a universe that is governed solely by naturalistic law unless that universe is "life-friendly."

`In this FAQ we will discuss only the Weak Anthropic
Principle (WAP), since it is uncontroversial and generally accepted. We
will not discuss the Strong Anthropic Principle (SAP), much less the Completely
Ridiculous Anthropic Principle :-)`

`According to the WAP, which is embodied in assumption
(c), the fact that life (and we as intelligent life along with it) exists
in our universe, coupled with the assumption that the universe is governed
by naturalistic law, implies that those laws must be "life-friendly."
If they were not "life-friendly," then it is obvious that life
could not exist in a universe governed solely by naturalistic law. However,
it should be noted that a sufficiently powerful supernatural principle or
entity (deity) could sustain life in a universe with laws that are not "life-friendly,"
simply by virtue of that entity's will and power.`

`We will show that if assumptions (a-c) are true,
then the observation that our universe is "life-friendly" can
never be evidence against the hypothesis that the universe
is governed solely by naturalistic law. Moreover, "fine-tuning,"
in the sense that "life-friendly" laws are claimed to represent
only a very small fraction of possible universes, can even undermine the
hypothesis of a supernatural origin of the universe; and the more "finely-tuned"
the universe is, the more this hypothesis can be undermined.`

`There are a number of traditional arguments
that have been made against the "fine-tuning" argument. We will
state them here, and we think that they are valid, although our main interest
will be directed towards some new insights arising from a deeper understanding
of probability theory.`

`1) In proving our main result, we do not assume
or contemplate that universes other than our own exist (e.g., as in cosmologies
such as those proposed by A. Vilenkin ["Quantum creation of the universe,"
Phys Rev D Vol. 30, pp. 509-511 (1984)], André Linde
["The self-reproducing inflationary universe," Scientific American,
November 1994, pp. 48-55], and most recently, Lee Smolin [Life of the
Cosmos, Oxford University Press (1997)], or as in some kinds of "many
worlds" quantum models). One argument against Ross has been to claim
that there may be many universes with many different combinations of physical
constants. If there are enough of them, a few would be able to support life
solely by chance. It is hypothesized that we live in one of those few. Thus,
this argument seeks to overcome the low probability of having a universe
with life in it with a multiplicity of universes. A recent technical discussion
of this idea by Garriga and Vilenkin can be found at http://xxx.lanl.gov/abs/gr-qc/0102010.`

`2) Others have argued against the assumption
that the universe must have very narrowly constrained values of certain
physical constants for life to exist in it. They have argued that life could
exist in universes that are very different from ours, but it is only our
insular ignorance of the physics of such universes that misleads us into
thinking that a universe must be much like our own to sustain life. Indeed,
virtually nothing is known about the possibility of life in universes that
are very different from ours. It could well be that most universes could
support life, even if it is of a type that is completely unfamiliar to us.
To assert that only universes very like our own could support life goes
well beyond anything that we know today.`

`Indeed, it might well be that a fundamental
"theory of everything" in physics would predict that only a very
narrow range of physical constants, or even no range at all, would be possible.
If this turns out to be the case, then the entire "fine-tuning"
argument would be moot.`

`While recognizing the force and validity of
these arguments, the main points we will make go in quite different directions,
and show that even if Ross is correct about "fine-tuning"
and even if ours is the only universe that exists, the "fine-tuning"
argument fails.`

`In this section, we will introduce some necessary
notation and discuss some basic probability theory needed in order to understand
our points.`

`First, some notation. We introduce several predicates,
(statements which can have values true or false).`

`Let L="The universe exists and contains
Life." L is clearly true for our universe (assumption a).`

`Let F="The conditions in the universe are
'life-Friendly,' in the sense described above." Ross, in his arguments,
certainly assumes that F is true. So will we (assumption b). The negation,
~F, would be that the conditions are such that life cannot exist naturalistically,
so that if life is present it must be because of some supernatural principle
or entity.`

`Let N="The universe is governed solely
by Naturalistic law." The negation, ~N, is that it is not governed
solely by naturalistic law, that is, some non-naturalistic (supernaturalistic)
principle or entity is involved. N and ~N are not assumptions; they are
hypotheses to be tested. However, we do not rule out either possibility
at the outset; rather, we assume that each of them has some non-zero a priori
probability of being true.`

`Probability theory now allows us to write down
some important relationships between these predicates. For example, assumption
(c) can be written mathematically as N&L==>F ('==>' means logical
implication). In the language of probability theory, this can be expressed
as`

P(F|N&L)=1

`where P(A|B) is the probability that A is true,
given that B is true [see footnote 1 for a formal mathematical definition],
and '&' is logical conjunction.`

`Expressed in the language of probability theory,
we understand the "fine-tuning" argument to claim that if naturalistic
law applies, then the probability that a randomly-selected universe would
be "life-friendly" is very small, or in mathematical terms, P(F|N)<<1.
Notice that this condition is not a predicate like L, N and F; rather, it
is a statement about the probability distribution P(F|N), considered
as it applies to all possible universes. For this reason, it is not possible
to express the "fine-tuning" condition in terms of one of the
arguments A or B of a probability function P(A|B). It is, rather, a statement
about how large those probabilities are.`

`The "fine-tuning" argument then reasons
that if P(F|N)<<1, then it follows that P(N|F)<<1. In ordinary
English, this says that if the probability that a randomly-selected universe
would be life-friendly (given naturalism) is very small, then the probability
that naturalism is true, given the observed fact that the universe is "life-friendly,"
is also very small. This, however, is an elementary if common blunder in
probability theory. One cannot simply exchange the two arguments in a probability
like P(F|N) and get a valid result. A simple example will suffice to show
this.`

Example

Let A="I am holding a Royal Flush."

Let B="I will win the poker hand."

It is evident that P(A|B) is nearly 0. Almost all poker hands are won with hands other than a Royal Flush. On the other hand, it is equally clear that P(B|A) is nearly 1. If you have a Royal Flush, you are virtually certain to win the poker hand.
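The asymmetry in this example can be made concrete with a few lines of Python. The sketch below is only illustrative: the probability of being dealt a royal flush is exact for a five-card hand, but the four-player game, the equal chance of winning for each player, and the assumption that a royal flush always wins are simplifications introduced here, not part of the original example.

```python
from math import comb

# Exact probability of being dealt a royal flush in a five-card hand:
# 4 royal flushes (one per suit) out of C(52, 5) possible hands.
p_a = 4 / comb(52, 5)

# Illustrative assumptions (not from the text): four players, each equally
# likely to win a given hand, and a royal flush never loses.
p_b = 1 / 4          # P(B): chance of winning the hand
p_b_given_a = 1.0    # P(B|A): chance of winning when holding a royal flush

# Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(f"P(B|A) = {p_b_given_a}")
print(f"P(A|B) = {p_a_given_b:.2e}")
```

Under these assumptions P(B|A) is 1 while P(A|B) is only a few parts in a million, so swapping the two arguments changes the answer by more than five orders of magnitude.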

`There is a second reason why this "fine-tuning"
argument is wrong. It is that for an inference to be valid, it is necessary
to take into account all known information that may be relevant to
the conclusion. In the present case, we happen to know that life
exists in our universe (i.e., that L is true). Therefore, it is invalid
to make inferences about N if we fail to take into account the fact that
L, as well as F, are already known to be true. It follows that any inferences
about N must be conditioned upon both F and L. An example
of this is seen in the next section.`

`The most important consequence of the previous
paragraph is very simple: In inferring the probability that N is true, it
is entirely irrelevant whether P(F|N) is large or small. It is entirely
irrelevant whether the universe is "fine-tuned" or not. Only probabilities
conditioned upon L are relevant to our inquiry.`

`Richard Harter <cri@tiac.net> has suggested
a somewhat different interpretation of the "fine-tuning" argument
in E-mail (reproduced here with permission). He writes:`

This takes care of the WAP; if one argues solely from the WAP the FAQ argument is correct. However the "fine tuning" argument is not (despite what its proponents say) a WAP argument; it is an inverse Bayesian argument. The argument runs thusly:

P(F|~N) >> P(F|N)

ergo

P(~N|F) >> P(N|F)

Considered as a formal inference this is a fallacy. None-the-less it is a normal rule of induction which is (usually) sound. The reason is that for the "conclusion" not to hold we need

P(N) >> P(~N)

[This is not the full condition but it is close enough for government work.]

`There are two fallacies in this form of the
argument. The first is the failure to condition on L, mentioned above. This
in itself would render the argument invalid. The second is that the first
line of the argument, P(F|~N) >> P(F|N), is merely an unsupported
assertion. No one knows what the probability of a supernatural entity creating
a universe that is F is! For example, a dilettante deity might never get
around to creating any universes at all, much less ones capable of supporting
life.`

`[ Note added 010612: Since this was written,
we have proved that if You, knowing as a sentient observer that L is true,
adopt an a priori position that is neutral between N and ~N, i.e.,
that P(~N|L) is of the same order of magnitude as P(N|L), then when You
learn that F is true and that P(F|N)<<1, You will conclude
that P(F&L&~N)<<1. See Appendix 1 (Reply to Kwon) at the end
of this essay for the proof. This observation is problematic for Harter's
argument. For under these assumptions we have`

P(F&L&~N)=P(L|F&~N)P(F|~N)P(~N)<<1.

`Thus under these assumptions it follows that
at least one of P(L|F&~N), P(F|~N) or P(~N) is quite small. A small
P(L|F&~N) says that it is almost certain that the supernatural deity,
having created a "life-friendly" universe, would make it sterile
(lifeless). A small P(F|~N) says that it is highly unlikely that
this deity would even create a universe that is "life-friendly".
Both of these undermine the usual concepts attributed to the deity by "intelligent
design" theorists, although either would be consistent with a deity
that was incompetent, a dilettante, or a "trickster". A small
P(F|~N) is also consistent with a deity who makes many universes, most of
them being ~F, with many of these ~F universes perhaps containing life (that
is, ~F&L universes, as we discuss below). A small P(~N) says that it
is nearly certain that naturalism is true a priori and unconditioned
on L, so that Harter's "escape" condition P(N)>>P(~N) in
fact holds.`

`Please remember that if You are a sentient observer,
You must already know that L is true, even before You learn anything about
F or P(F|N). Thus it is legitimate, appropriate, and indeed required,
for You to elicit Your prior on N versus ~N conditioned on L and use that
as Your starting point. If You then retrodict that P(~N)<<1 as a consequence,
all You are doing is eliciting the prior that You would have had in the
absence of Your knowledge that You existed as a sentient observer. This
is the only legitimate way to infer Your value of P(~N) unconditioned on
L.]`

`Having understood the previous discussion, and
with our notation in hand, it is now easy to prove that the WAP does not
support supernaturalism (which we take to be the negation ~N of N). Recall
that the WAP can be written as P(F|N&L)=1. Then, by Bayes' theorem [see
footnote 2] we have`

P(N|F&L) = P(F|N&L)P(N|L)/P(F|L)
         = P(N|L)/P(F|L)
         >= P(N|L)

`where '>=' means "greater than or equal
to." The second line follows because P(F|N&L)=1, and the inequality
of the third line follows because P(F|L) is a positive quantity less than
or equal to 1. (The above demonstration is inspired by a recent article
on talk.origins by Michael Ikeda <mmikeda@erols.com>; we have simplified
the proof in his article. The message ID for the cited article is <5j6dq8$bvj@winter.erols.com>
for those who wish to search for it on dejanews.)`

`The inequality P(N|F&L)>=P(N|L) shows
that the WAP supports (or at least does not undermine) the hypothesis that
the universe is governed by naturalistic law. This result is, as we have
emphasized, independent of how large or small P(F|N) is. The observation
F cannot decrease the probability that N is true (given the known background
information that life exists in our universe), and may well increase it.`

`Corollary: Since P(~N|F&L)=1-P(N|F&L)
and similarly for P(~N|L), it follows that P(~N|F&L)<=P(~N|L). In
other words, the observation F does not support supernaturalism (~N), and
may well undermine it.`
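As a sanity check on the inequality P(N|F&L)>=P(N|L), one can draw random joint distributions that respect assumption (c) and confirm that the inequality never fails. The following is a minimal sketch; the function name, the number of trials, and the way the random weights are drawn are arbitrary choices made for illustration.

```python
import random


def wap_inequality_holds(trials=1000, seed=0):
    """Check P(N|F&L) >= P(N|L) on random joint distributions obeying assumption (c)."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Random positive weights for the three cells allowed once L is known.
        # The fourth cell, N&~F, gets zero weight: assumption (c) says
        # P(F|N&L) = 1, so a naturalistic universe with life must be F.
        a, b, c = (rng.random() + 1e-9 for _ in range(3))
        total = a + b + c
        p_n_and_f = a / total        # P(N&F|L)
        p_notn_and_f = b / total     # P(~N&F|L)
        # c / total is P(~N&~F|L); it enters only through the normalisation.

        p_n = p_n_and_f                       # P(N|L), since P(N&~F|L) = 0
        p_f = p_n_and_f + p_notn_and_f        # P(F|L)
        p_n_given_f = p_n_and_f / p_f         # P(N|F&L)
        if p_n_given_f < p_n - 1e-12:         # small tolerance for rounding
            return False
    return True


print(wap_inequality_holds())
```

The check succeeds for every draw because P(N|F&L) is P(N|L) divided by a number that cannot exceed 1, which is exactly the step used in the proof.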

`The thrust of practically all "Intelligent
Design" and Creationist arguments (excepting the anthropic argument
and perhaps a few others) is to show ~F, since it is evident, we think,
that if ~F then we cannot have both life and a naturalistic universe. We
evidently do have life, so the success of one of these arguments would clearly
establish ~N. In other words, given our prior opinion P(N&L), where
0<P(N&L)<1 but otherwise unrestricted (thus we neither rule in
nor rule out N initially), arguments like Behe's attempt to support ~F so
as to undermine N:`

P(N|~F&L)<P(N|L).

`But the "anthropic" argument is that
observing F also undermines N:`

P(N|F&L)<P(N|L).

`We assert that the intelligent design folks
want these inequalities to be strict (otherwise there would be no point
in their making the argument!)`

`From these two inequalities we readily derive
a contradiction, as follows. From the definition of conditional probability
[see footnote 1], the two inequalities above yield`

P(N&~F&L) < P(N|L)P(~F&L),
P(N& F&L) < P(N|L)P( F&L)

`Adding,`

P(N&L) = P(N&~F&L)+P(N&F&L)
       < P(N|L)(P(~F&L)+P(F&L))
       = P(N|L)P(L) = P(N&L),

`a contradiction since the inequality is strict.`

`If we remove the restriction that the inequalities
be strict, then the only case where both inequalities can be true is if`

P(N|~F&L)=P(N|L) and P(N|F&L)=P(N|L).

`In other words, the only case where both can
be true is if the information that the universe is "life-friendly"
has no effect on the probability that it is naturalistic (given the
existence of life); and this can only be the case if neither inequality
is strict.`
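The contradiction can also be seen from the law of total probability: P(N|L) is a weighted average of P(N|F&L) and P(N|~F&L), so the two conditional probabilities cannot both lie strictly below it. The sketch below checks this with exact fractions on a made-up joint distribution. The numbers are purely illustrative, and the sketch deliberately does not impose assumption (c), because the contradiction derived in this section does not depend on it.

```python
from fractions import Fraction


def conditionals(p_n_f, p_n_notf, p_notn_f, p_notn_notf):
    """Return (P(N|L), P(N|F&L), P(N|~F&L), P(F|L)) from a joint distribution given L."""
    if p_n_f + p_n_notf + p_notn_f + p_notn_notf != 1:
        raise ValueError("joint probabilities must sum to 1")
    p_f = p_n_f + p_notn_f             # P(F|L)
    p_notf = p_n_notf + p_notn_notf    # P(~F|L)
    p_n = p_n_f + p_n_notf             # P(N|L)
    return p_n, p_n_f / p_f, p_n_notf / p_notf, p_f


# Hypothetical joint probabilities for N&F, N&~F, ~N&F, ~N&~F (all given L).
joint = (Fraction(3, 10), Fraction(1, 10), Fraction(2, 10), Fraction(4, 10))
p_n, p_n_given_f, p_n_given_notf, p_f = conditionals(*joint)

# Law of total probability: P(N|L) is a weighted average of the two conditionals.
weighted = p_n_given_f * p_f + p_n_given_notf * (1 - p_f)
assert weighted == p_n

# Hence at least one of the conditionals is no smaller than P(N|L).
assert max(p_n_given_f, p_n_given_notf) >= p_n

print(p_n_given_notf, p_n, p_n_given_f)
```

In this example observing ~F lowers the probability of N and observing F raises it; no choice of joint distribution can make both observations lower it.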

`In essence, we see that the intelligent design
folks who make the anthropic argument are really trying to have it both
ways: They want observation of F to undermine N, and they also want observation
of ~F to undermine N. That is, they want any observation whatsoever to undermine
N! But the error is that the anthropic argument does not undermine
N, it supports N. They can have one of the prongs of their argument, but
they can't have both.`

`[ Note added 010612: Some people have
objected to us that Behe is not making the argument ~F, but is only making
a statement that it is highly unlikely that certain of his "IC"
structures could arise naturalistically. Our reading of Behe is that he is
making an argument that it is impossible for this to happen (a form
of ~F as we understand it), but even if we are wrong and he is not making
this argument, the point of our comments in this section is that making
the argument that the universe is F or is "fine-tuned" (P(F|N)<<1)
does not support supernaturalism; the argument that should be made is that
the universe is ~F, since this manifestly supports supernaturalism by refuting
naturalism. See Appendix 1 (Reply to Kwon) at the end of this essay.]`

`Ross' argument discusses the case where the
conditions in our universe are not only "life-friendly," but they
are also "fine-tuned," in the sense that only a very small fraction
of possible universes can be "life-friendly." We have shown that
regardless of how "finely-tuned" the laws of physics are, the
observation that the universe is capable of sustaining life cannot undermine
N.`

`As we have pointed out above, others have responded
to the claim of "fine-tuning" in several ways. One way has been
to point out that this claim is not corroborated by any theoretical understanding
about what forms of life might arise in universes with different physical
conditions than our own, or even any theoretical understanding about what
kinds of universes are possible at all; it is basically a claim founded
upon our own ignorance of physics. To those that make this point, the argument
is about whether P(F|N) is really small (as Ross claims), or is in fact
large. The point (against Ross) is essentially that Ross' crucial assumption
is completely without support.`

`A second response is to point out that several
theoretical lines of evidence indicate that many other, and perhaps even
an infinite number of other universes, with varying sets of physical constants
and conditions, might well exist, so that even if the probability that a
given universe would have constants close to those of our own universe is
small, the sheer number of such universes would virtually guarantee that
some of them would possess constants that would allow life to arise.`

`Nevertheless, it is necessary to consider the
implications of Ross' assertion that the universe is "fine-tuned."
Suppose it is true that amongst all naturalistic universes, only a very
small proportion could support life. What would this imply?`

`We have shown that the WAP tends to support
N, and cannot undermine it. This observation is independent of whether
P(F|N) is small or large, since (as we have seen) the only probabilities
that are significant for inference about N are those that are conditioned
upon all relevant data at our disposal, including the fact that L is true.
Therefore, regardless of the size of P(F|N), valid reasoning shows that
observing that F is true cannot decrease the probability that N is true,
and may increase it.`

`We believe that the real import of observing
that P(F|N) is small (if indeed that is true) would be to strengthen Vilenkin/Linde/Smolin-type
hypotheses that multiple universes with varying physical constants may exist.
If indeed the universe is governed by naturalistic laws, and if indeed the
probability that a universe governed by naturalistic laws can support life
is small, then this supports a Vilenkin/Linde/Smolin model of multiple universes
over a model that includes only a single universe with a single set of physical
constants.`

`To see this, let S="there is only a Single
universe," and M="there are Multiple universes." Let E =
"there Exists a universe with life." Clearly, P(E|N)<P(F|N),
since it is possible that a universe that is "life-friendly" could
still be barren. But, since L is true, E is also true, so observing L implies
that we have also observed E.`

`Then, assuming that P(F|N)<1 is the probability
that a single universe is "life-friendly," that this probability
is the same for each "random" multiple universe as it would be
for a single universe, and that the probability that a given universe exists
is independent of the existence of other universes, it follows that`

P(E|S&N) = p = P(E|N) < P(F|N) < 1 (and for Ross, P(F|N)<<1);

P(E|M&N) = 1 - (1-p)^{m}, where m is the number of universes if M is true. This is less than 1 but approaches 1 (for fixed p) as m gets larger and larger. Since all the Multiple-universe proposals we have seen suggest that m is in fact infinite, it follows that P(E|M&N)=1. (If one postulates that m is finite, then the calculation depends explicitly on p and m; this is left as an exercise for the reader.)

`Since`

P(S|E&N) = P(E|S&N)P(S|N)/P(E|N) and
P(M|E&N) = P(E|M&N)P(M|N)/P(E|N),

`with these assumptions it follows by division
that`

P(M|E&N)    1    P(M|N)
-------- = --- x ------,
P(S|E&N)    p    P(S|N)

`which shows that observing E (or L) increases
the evidence for M against S in a naturalistic universe by a factor of at
least 1/p. The smaller P(F|N)=p (that is, the more "finely-tuned"
the universe is), the more likely it is that some form of multiple-universe
hypothesis is true.`
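For a finite number of universes the factor by which E favors M over S is P(E|M&N)/P(E|S&N) = [1-(1-p)^m]/p, which tends to 1/p as m grows. A short Python sketch of that calculation follows. The function names and the sample values of p and m are illustrative assumptions, and the sketch relies on the same independence assumptions stated above.

```python
import math


def p_life_somewhere(p, m):
    """P(E|M&N) = 1 - (1-p)^m for m independent universes, each with chance p (0 < p < 1)."""
    # expm1/log1p avoid the rounding error of evaluating 1 - (1-p)**m directly
    # when p is tiny.
    return -math.expm1(m * math.log1p(-p))


def bayes_factor_m_over_s(p, m):
    """Ratio P(E|M&N) / P(E|S&N), with P(E|S&N) = p."""
    return p_life_somewhere(p, m) / p


p = 1e-6  # illustrative value only
for m in (1, 10**3, 10**6, 10**9):
    print(f"m = {m:>10}: Bayes factor = {bayes_factor_m_over_s(p, m):.4g}")
```

With m = 1 the factor is 1 (the two hypotheses coincide), and once m is much larger than 1/p the factor saturates at about 1/p, the limiting case discussed in the text.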

`The next section is rather more speculative,
depending as it does upon theological notions that are hard to pin down,
and therefore should be taken with large grains of salt. But it is worth
considering what effect various theological hypotheses would have on this
argument. It is interesting to ask the question, "given that observing
that F is true cannot undermine N and may support it, by how much can N
be strengthened (and ~N be undermined) when we observe that F is true?"`

`It is evident from the discussion of the main
theorem that the key is the denominator P(F|L). The smaller that denominator,
the greater the support for N. Explicitly we have`

P(F|L)=P(F|N&L)P(N|L)+P(F|~N&L)P(~N|L)

`But since P(F|N&L)=1 we can simplify this
to`

P(F|L)=P(N|L)+P(F|~N&L)P(~N|L).

`Plugging this into the expression P(N|F&L)=P(N|L)/P(F|L)
we obtain`

P(N|F&L) = P(N|L)/[P(N|L)+P(F|~N&L)P(~N|L)]
         = 1/[1+P(F|~N&L)P(~N|L)/P(N|L)]
         = 1/[1+C P(F|~N&L)],

`where C=P(~N|L)/P(N|L) is the prior odds in
favor of ~N against N. In other words, C is the odds that we would offer
in favor of ~N over N before noting that the universe is "fine-tuned"
for life.`

`A major controversy in statistics has been over
the choice of prior probabilities (or in this case prior odds). However,
for our purposes this is not a significant consideration, as long as we
don't choose C in such a way as to completely rule out either possibility
(N or ~N), i.e., as long as we haven't made up our minds in advance. This
means that any positive, finite value of C is acceptable.`

`One readily sees from this formula that for
acceptable C`

(1) as P(F|~N&L)-->0, P(N|F&L)-->1;
(2) as P(F|~N&L)-->1, P(N|F&L)-->1/[1+P(~N|L)/P(N|L)]=P(N|L),

`where '-->' means "approaches as a limit"
and the last result follows from the fact that P(N|L)+P(~N|L)=1.`
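The behaviour of the formula P(N|F&L)=1/[1+C P(F|~N&L)] between these two limits is easy to tabulate. The following Python sketch does so for even prior odds; the function name and the choice C=1 are illustrative assumptions, and any positive, finite C could be substituted.

```python
def posterior_n(prior_odds_not_n, p_f_given_not_n):
    """P(N|F&L) = 1 / (1 + C * P(F|~N&L)), where C = P(~N|L) / P(N|L)."""
    if prior_odds_not_n <= 0:
        raise ValueError("C must be positive and finite")
    if not 0.0 <= p_f_given_not_n <= 1.0:
        raise ValueError("P(F|~N&L) must lie in [0, 1]")
    return 1.0 / (1.0 + prior_odds_not_n * p_f_given_not_n)


C = 1.0                       # even prior odds, chosen only for illustration
prior_n = 1.0 / (1.0 + C)     # P(N|L) implied by C

for q in (1.0, 0.5, 0.1, 0.001, 0.0):
    print(f"P(F|~N&L) = {q:<6} P(N|F&L) = {posterior_n(C, q):.4f}")
```

At P(F|~N&L)=1 the posterior equals the prior P(N|L), and it rises toward 1 as P(F|~N&L) shrinks toward 0.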

`So, P(N|F&L) is a monotonically decreasing
function of P(F|~N&L) bounded from below by P(N|L). This confirms the
observation made earlier, that noting that F is true can never decrease
the evidential support for N. Furthermore, the only case where the evidential
support is unchanged is when P(F|~N&L) is identically 1. This is interesting,
because it tells us that the only case where observing the truth of F does
not increase the support for N is precisely the case when the likelihood
function P(F|x&L), evaluated at F, and with x ranging over N and ~N,
cannot distinguish between N and ~N. That is, the only way to prevent the
observation F from increasing the support for N is to assert that ~N&L
also requires F to be true. Under these circumstances we cannot distinguish
between N and ~N on the basis of the data F. In a deep sense, the two hypotheses
represent, and in fact are, the same hypothesis. Put another way,
to assume that P(F|~N&L)=1 is to concede that life in the world actually
arose by the operation of an agent that is observationally indistinguishable
from naturalistic law, insofar as the observation F is concerned. In essence,
any such agent is just an extreme version of the "God-of-the-gaps,"
whose existence has been made superfluous as far as the existence of life
is concerned. Such an assumption would completely undermine the proposition
that it is necessary to go outside of naturalistic law in order to
explain the world as it is, although it doesn't undermine any argument for
supernaturalism that doesn't rely on the universe being "life-friendly".`

`So, if supernaturalism is to be distinguished
from naturalism on the basis of the fact that the universe is F, it must
be the case that P(F|~N&L)<1. Otherwise, we are condemned to an unsatisfying
kind of "God-of-the-gaps" theology. But what sort of theologies
can we consider, and how would they affect this crucial probability?`

`To make these ideas more definite, we consider
first a specific interpretation that is intended to imitate, albeit crudely,
how the assumption of a relatively powerful and inscrutable deity (such
as a generic Judeo-Christian-Islamic deity might be) could affect the calculation
of the likelihood function P(F|~N&L).`

`We suggest that any reasonable version of supernaturalism
with such a deity would result in a value of P(F|~N&L) that is, in fact,
very small (assuming that only a small set of possible universes are F).
The reason is that a sufficiently powerful deity could arrange things so
that a universe with laws that are not "life-friendly" can sustain
life. Since we do not know the purposes of such a deity, we must assign
a significant amount of the likelihood function to that possibility. Furthermore,
if such a deity creates universes and if the "fine-tuning" claims
are correct, then most life-containing universes will be of this
type (i.e., containing life despite not being "life-friendly").
Thus, all other things being equal, and if this is the sort of deity we
are dealing with, we would expect to live in a universe that is ~F.`

`To assert that such a deity could only
create universes containing life if the laws are life-friendly is to restrict
the power of such a deity. And to assert that such a deity would
only create universes with life if the laws are life-friendly is to assert
knowledge of that deity's purposes that many religions seem reluctant to
claim. Indeed, any such assertion would tend to undermine the claim, made
by many religions, that their deity can and does perform miracles that are
contrary to naturalistic law, and recognizably so.`

`Our conclusion, therefore, is that not only
does the observation F support N, but it supports it overwhelmingly against
its negation ~N, if ~N means creation by a sufficiently powerful and inscrutable
deity. This latter conclusion is, by the way, a consequence of the Bayesian
Ockham's Razor [Jefferys, W.H. and Berger, J.O., "Ockham's Razor and
Bayesian Analysis," American Scientist 80, 64-72 (1992)].
The point is that N predicts outcomes much more sharply and narrowly than
does ~N; it is, in Popperian language, more easily falsifiable than is ~N.
(We do not wish to get into a discussion of the Demarcation Problem here
since that is out of the scope of this FAQ, though we do not regard it as
a difficulty for our argument. For our purposes, we are simply making a
statement about the consequences of the likelihood function having significant
support on only a relatively small subset of possible outcomes.) Under these
circumstances, the Bayesian Ockham's Razor shows that observing an outcome
allowed by both N and ~N is likely to favor N over ~N. We refer the reader
to the cited paper for a more detailed discussion of this point.`

`Aside from sharply limiting the likely actions
of the deity (either by making it less powerful or asserting more human
knowledge of the deity's intentions), we can think of only one way to avoid
this conclusion. One might assert that any universe with life would appear
to be "life-friendly" from the vantage point of the creatures
living within it, regardless of the physical constants that such a universe
were equipped with. In such a case, observing F cannot change our opinion
about the nature of the universe. This is certainly a possible way out for
the supernaturalist, but this solution is not available to Ross because
it contradicts his assertions that the values of certain physical constants
do allow us to distinguish between universes that are "life-friendly"
and those that are not. And, such an assumption does not come without cost;
whether others would find it satisfactory is problematic. For example, what
about miracles? If every universe with life looks "life-friendly"
from the inside, might this not lead one to wonder if everything that happens
therein would also look to its inhabitants like the result of the simple
operation of naturalistic law? And then there is Ockham's Razor: What would
be the point of postulating a supernatural entity if the predictions we
get are indistinguishable from those of naturalistic law?`

`In the previous section, we have discussed just
one of many sorts of deities that might exist. This one happens to be very
powerful and rather inscrutable (and is intended to be a model of a generic
Judeo-Christian-Islamic sort of deity, though believers are welcome to disagree
and propose--and justify--their own interpretations of their favorite deity).
However, there are many other sorts of deities that might be postulated
as being responsible for the existence of the universe. There are somewhat
more limited deities, such as Zeus/Jupiter, there are deities that share
their existence with antagonistic deities such as the Zoroastrian Ahura-Mazda/Ahriman
pair of deities, there are various Native American deities such as the trickster
deity Coyote, there are Australian, Chinese, African, Japanese and East
Indian deities, and even many other possible deities that no one on Earth
has ever thought of. There could be deities of lifeforms indigenous to planets
around the star Arcturus that we should consider, for example.`

`Now when considering a multiplicity of deities,
say D_{1}, D_{2}, ..., D_{i}, ..., we would have to
specify a value of the likelihood function for each individual deity, specifying
what the implications would be if that deity were the actual deity
that created the universe. In particular, with the "fine-tuning"
argument in mind, we would have to specify P(F|D_{i}&L) for
every i (probably an infinite set of deities). Assuming that we have a mutually
exclusive and exhaustive list of deities, we see the hypothesis ~N revealed
to be composite, that is, it is a combination or union of the individual
hypotheses D_{i} (i=1,2,...). Our character set doesn't have the
usual "wedge" character for "or" (logical disjunction),
so we will use 'v' to represent this operation. We then have`

~N = D_{1}v D_{2}v...v D_{i}v...

`Now, the total prior probability of ~N, P(~N|L),
has to be divvied up amongst all of the individual subhypotheses D_{i}:`

P(~N|L) = P(D_{1}|L) + P(D_{2}|L) + ... + P(D_{i}|L) + ...

`where 0<P(D_{i}|L)<P(~N|L)<1
(assuming that we only consider deities that might exist, and that there
are at least two of them). In general, each of the individual prior probabilities
P(D_{i}|L) would be very small, since there are so many possible
deities. Only if some deities are a priori much more likely than
others would any individual deity have an appreciable amount of prior probability.`

`This means that in general, P(D_{i}|L)<<1
for all i.`

`Now when we originally considered just N and
~N, we calculated the posterior probability of N given L&F from the
prior probabilities of N and ~N given L, and the likelihood functions. Here
it would be simpler to look at prior and posterior odds. These are derived
straightforwardly from probabilities by the relation`

Odds = Probability/(1 - Probability).

`This yields a relationship between the prior
and posterior odds of N against ~N [using P(N|F&L)+P(~N|F&L)=1]:`

Posterior Odds = P(N|F&L) / P(~N|F&L) = [P(F|N&L) / P(F|~N&L)] x [P(N|L) / P(~N|L)] = (Bayes Factor) x (Prior Odds)

`The Bayes Factor and Prior Odds are given straightforwardly
by the two ratios in this formula.`

`Since P(F|N&L)=1 and P(F|~N&L)<=1,
it follows that the posterior odds are greater than or equal to the prior
odds (this is a restatement of our first theorem, in terms of odds). This
means that observing that F is true cannot decrease our confidence that
N is true.`
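The odds form of Bayes' theorem given above is easy to check numerically. What follows is a minimal Python sketch, not part of the original argument: the function names and the sample values chosen for P(N|L) and P(F|~N&L) are illustrative assumptions. It fixes P(F|N&L)=1, as the text does, and shows that the posterior odds of N never fall below the prior odds.

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) into odds p/(1-p)."""
    return p / (1.0 - p)


def posterior_odds_n(prior_n, like_f_not_n, like_f_n=1.0):
    """Posterior odds of N against ~N = (Bayes factor) x (prior odds).

    prior_n      : P(N|L), an assumed prior probability
    like_f_not_n : P(F|~N&L), assumed, must lie in (0, 1]
    like_f_n     : P(F|N&L), equal to 1 in the article's argument
    """
    bayes_factor = like_f_n / like_f_not_n
    return bayes_factor * odds(prior_n)


if __name__ == "__main__":
    prior_n = 0.2  # hypothetical prior, chosen only for illustration
    for like_f_not_n in (1.0, 0.5, 0.01):
        post = posterior_odds_n(prior_n, like_f_not_n)
        print(f"P(F|~N&L)={like_f_not_n}: prior odds={odds(prior_n):.3f}, "
              f"posterior odds={post:.3f}")
```

Whatever prior is chosen, the Bayes factor here is 1/P(F|~N&L), which is at least 1, so the loop can only report posterior odds equal to or larger than the prior odds.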

`But by using odds instead of probabilities,
we can now consider the individual sub-hypotheses that make up ~N. For example,
we can calculate prior and posterior odds of N against any individual D_{i}.
We find that`

Posterior Odds = P(N|F&L) / P(D_{i}|F&L) = [P(F|N&L) / P(F|D_{i}&L)] x [P(N|L) / P(D_{i}|L)]

`This follows because (by footnote 2)`

P(N|F&L) = P(F|N&L)P(N|L)/P(F|L),

P(D_{i}|F&L) = P(F|D_{i}&L)P(D_{i}|L)/P(F|L),

`and the P(F|L)'s cancel out when you take the
ratio.`

`Now, even if P(F|D_{i}&L)=1, which
is the maximum possible, the posterior odds against D_{i} may still
be quite large. The reason for this is that the prior probability of ~N
has to be shared out amongst a large number of hypotheses D_{j},
each one greedily demanding its own share of the limited amount of prior
probability available. On the other hand, the hypothesis N has no others
to share with. In contrast to ~N, which is a compound hypothesis, N is a
simple hypothesis. As a consequence, and again assuming that no particular
deity is a priori much more likely than any other (it would be incumbent
upon the proposer of such a deity to explain why his favorite deity
is so much more likely than the others), it follows that the hypothesis
of naturalism will end up being much more probable than the hypothesis of
any particular deity D_{i}.`
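To see how quickly the sharing of prior probability penalizes any single deity, here is a small Python sketch. It is only an illustration under stated assumptions: the prior P(~N|L) is split equally among a hypothetical number of deities, each deity is given the maximum likelihood P(F|D_{i}&L)=1, and the function name is our own invention.

```python
def odds_n_against_one_deity(prior_n, num_deities, like_f_given_d=1.0):
    """Posterior odds of N against a single deity D_i, given F and L.

    prior_n        : P(N|L), an assumed prior for naturalism
    num_deities    : hypothetical number of equally likely deities
    like_f_given_d : P(F|D_i&L); 1.0 is the most favorable case for D_i
    P(F|N&L) is taken to be 1, as in the article.
    """
    prior_each_deity = (1.0 - prior_n) / num_deities  # equal shares of P(~N|L)
    return (1.0 * prior_n) / (like_f_given_d * prior_each_deity)


if __name__ == "__main__":
    for k in (1, 10, 1000):
        print(k, odds_n_against_one_deity(0.5, k))
```

With a neutral prior P(N|L)=0.5, the odds of N against any one deity grow in proportion to the number of deities competing for the other half of the prior probability.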

`This phenomenon is a second manifestation of
the Bayesian Ockham's Razor discussed in the Jefferys/Berger article (cited
above).`

`In theory it is now straightforward to calculate
the posterior odds of N against ~N if we don't particularly care which
deity is the right one. Since the D_{i} form a mutually exclusive
and exhaustive set of hypotheses whose union is ~N, ordinary probability
theory gives us`

    P(~N|F&L) = P(D_{1}|F&L) + P(D_{2}|F&L) + ...
              = [P(F|D_{1}&L)P(D_{1}|L) + P(F|D_{2}&L)P(D_{2}|L) + ...]/P(F|L)

`Assuming we know these numbers, we can now calculate
the posterior odds of N against ~N by dividing the above expression into
the one we found previously for P(N|F&L). Of course, in practice this
may be difficult! However, as can be seen from this formula, the deities
D_{i} that contribute most to the denominator (that is, to the supernaturalistic
hypothesis) will be the ones that have the largest values of the likelihood
function P(F|D_{i}&L) or the largest prior probability P(D_{i}|L)
or both. In the first case, it will be because the particular deity is closer
to predicting what naturalism predicts (as regards F), and is therefore
closer to being a "God-of-the-gaps" deity; in the second, it will
be because we already favored that particular deity over others a priori.`
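The composite calculation can be sketched in a few lines of Python. This is a hedged illustration only: the three deities, their priors and their likelihoods are made-up numbers, and the function name is hypothetical. The sketch sums the products P(F|D_{i}&L)P(D_{i}|L), adds the naturalistic term with P(F|N&L)=1, and normalizes by P(F|L).

```python
def posteriors_given_f(prior_n, deity_priors, deity_likelihoods):
    """Return (P(N|F&L), [P(D_i|F&L), ...]) for a composite ~N.

    prior_n           : P(N|L)
    deity_priors      : list of P(D_i|L); must sum to 1 - prior_n
    deity_likelihoods : list of P(F|D_i&L), each in [0, 1]
    P(F|N&L) is taken to be 1, as in the article.
    """
    if abs(prior_n + sum(deity_priors) - 1.0) > 1e-9:
        raise ValueError("priors over N and all D_i must sum to 1")
    term_n = 1.0 * prior_n
    terms_d = [lk * pr for lk, pr in zip(deity_likelihoods, deity_priors)]
    p_f = term_n + sum(terms_d)  # P(F|L), the common denominator
    return term_n / p_f, [t / p_f for t in terms_d]


if __name__ == "__main__":
    # Hypothetical numbers: three deities sharing P(~N|L) = 0.5 unequally.
    post_n, post_d = posteriors_given_f(0.5, [0.3, 0.15, 0.05], [0.1, 1.0, 0.5])
    print("P(N|F&L) =", round(post_n, 4))
    print("P(D_i|F&L) =", [round(p, 4) for p in post_d])
```

In this made-up case the deity that gains most is the one whose likelihood equals 1, that is, the one that predicts exactly what naturalism predicts about F, even though it did not start with the largest prior.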

`Some make the mistake of thinking that "fine-tuning"
and the anthropic principle support supernaturalism. This mistake has two
sources.`

`The first and most important of these arises
from confusing entirely different conditional probabilities. If one observes
that P(F|N) is small (since most hypothetical naturalistic universes are
not "fine-tuned" for life), one might be tempted to turn the probability
around and decide, incorrectly, that P(N|F) is also small. But as
we have seen, this is an elementary blunder in probability theory. The faulty
reasoning runs as follows: we find ourselves in a universe that is "fine-tuned"
for life, which would be unlikely to come about by chance (because P(F|N) is
small); therefore P(N|F) must also be small. That conclusion does not follow.
Most actual outcomes are, in fact, highly improbable, but it does
not follow that the hypotheses that they are conditioned upon are themselves
highly improbable. It is therefore fallacious to reason that if we have
observed an improbable outcome, it is necessarily the case that a hypothesis
that generates that outcome is itself improbable. One must compare
the probabilities of obtaining the observed outcome under all hypotheses.
In general, most, if not all of these probabilities will be very small,
but some hypotheses will turn out to be much more favored by the actual
outcome we have observed than others.`
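A toy calculation makes the difference between the two conditional probabilities concrete. The Python sketch below is an illustration under assumptions of our own choosing: every probability in the tables is hypothetical, and the helper names are not from the article. It builds a small joint distribution in which P(F|N) is tiny, and then computes P(N|F&L) directly.

```python
# Toy model: every number below is a hypothetical assumption for illustration.
PRIOR = {"N": 0.5, "~N": 0.5}  # P(H) before conditioning on anything

# P(F-status, L-status | H); keys are (is_F, is_L)
COND = {
    "N": {
        (True, True): 1e-6,    # life-friendly and inhabited
        (True, False): 1e-6,   # life-friendly but lifeless
        (False, True): 0.0,    # under N, life cannot exist without F
        (False, False): 1.0 - 2e-6,
    },
    "~N": {
        (True, True): 1e-6,
        (True, False): 1e-6,
        (False, True): 1e-6,   # life sustained despite ~F
        (False, False): 1.0 - 3e-6,
    },
}


def joint(h, is_f, is_l):
    """P(H & F-status & L-status)."""
    return PRIOR[h] * COND[h][(is_f, is_l)]


def p_f_given_n():
    """P(F|N): the 'fine-tuning' probability."""
    return COND["N"][(True, True)] + COND["N"][(True, False)]


def p_n_given_l():
    """P(N|L): belief in N knowing only that life exists."""
    n_l = sum(joint("N", f, True) for f in (True, False))
    not_n_l = sum(joint("~N", f, True) for f in (True, False))
    return n_l / (n_l + not_n_l)


def p_n_given_f_and_l():
    """P(N|F&L): belief in N after also learning F."""
    n_fl = joint("N", True, True)
    return n_fl / (n_fl + joint("~N", True, True))


if __name__ == "__main__":
    print("P(F|N)   =", p_f_given_n())
    print("P(N|L)   =", p_n_given_l())
    print("P(N|F&L) =", p_n_given_f_and_l())
```

In this model P(F|N) is of order one in a million, yet learning F raises the probability of N from P(N|L) to P(N|F&L). The smallness of P(F|N) by itself says nothing about P(N|F&L).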

`The second source of confusion is that one must
do the calculations taking into account all the information at hand.
In the present case, that includes the fact that life is known to
exist in our universe. The possible existence of hypothetical naturalistic
universes where life does not exist is entirely irrelevant to the question
at hand, which must be based on the data we actually have.`

`In our view, similar fallacious reasoning may
well underlie many other arguments that have been raised against naturalism,
not excluding design and "God-of-the-Gaps" arguments such as Michael
Behe's "Irreducible Complexity" argument (in his book, Darwin's
Black Box), and William Dembski's "Complex Specified Information,"
as described in his dissertation (University of Illinois at Chicago). We
conclude that whatever their rhetorical appeal, such arguments need to be
examined much more carefully than has happened so far to see if they have
any validity. But that discussion is outside the scope of this article.`

`Bottom line: The anthropic argument should be
dropped. It is wrong. "Intelligent design" folks should stick
to trying to undermine N by showing ~F. That's their only hope (though we
believe it to be a forlorn one).`

Michael Ikeda, Statistical Research Division, Bureau of the Census, Washington DC 20233

Bill Jefferys, Department of Astronomy, University of Texas, Austin TX 78712; Department of Statistics, University of Vermont, Burlington VT

`Michael Ikeda's work on this article was done
on his own time and not as part of his official duties. The authors' affiliations
are for identification only. The opinions expressed herein are those of
the authors, and do not necessarily represent the opinions of the authors'
employers.`

`Copyright (C) 1997-2006 by Michael Ikeda and
Bill Jefferys. Portions of this FAQ are Copyright (C) 1997 by Richard Harter.
All Rights Reserved.`

`[1] By definition, P(A|B)=P(A&B)/P(B); it
follows that also P(A|B&C)=P(A&B|C)/P(B|C).`

`[2] We use Bayes' theorem in the form`

P(A|B&K)=P(B|A&K)P(A|K)/P(B|K)

`which follows straightforwardly from the identity`

P(A|B&K)P(B|K)=P(A&B|K)=P(B|A&K)P(A|K)

`(a consequence of footnote 1) assuming that
P(B|K)>0.`

APPENDIX 1: Reply to Kwon (April 30, 2001)

`David Kwon has posted a web
page in which he claims to have
refuted the arguments in our article. However, he has made a simple error,
which we detail below, along with comments on some of his other assertions.`

`[ Note added 040109: Kwon's original article
has disappeared from the web. The above link is to the last version of his
article archived by the Internet
Wayback Machine via Makeashorterlink.com]`

`Kwon's Equation (3) reads as follows:`

P(N|F&L) = P(N&F&L) / {P(~N&F&L) + P(N&F&L)}

`This is an elementary result of probability
theory and we agree with it. Kwon then goes on and assumes what he calls
the "fine-tuning" condition P(F|N)<<1 from which he correctly
derives Equation (8), the important part of which reads`

P(N&F&L) << 1

`From these two results (3 and 8) Kwon derives`

P(N|F&L)<<1 unless P(~N&F&L)<<1

`Unfortunately, nothing in Kwon's "proof"
shows that P(~N&F&L) is not <<1, so he cannot assert unconditionally
that P(N|F&L)<<1 as a consequence of his assumptions. He asserts`

"The only way not to come to this conclusion [that P(N|F&L)<<1] is to start with ana prioriassumption of P(~N&F&L)<<1. In other words, the only way to hold on to naturalism is by assuming that theism is virtually impossible to begin with."

`This, however, is incorrect, and here the "proof"
falls apart. Kwon apparently recognizes that according to his Equation (3),
the value of P(N|F&L) is not governed by the actual size of P(N&F&L),
but instead by the relative sizes of P(N&F&L) and P(~N&F&L).
In particular, if P(N&F&L)<<P(~N&F&L) then P(N|F&L)
will be close to zero; if P(N&F&L) is approximately equal to P(~N&F&L),
then P(N|F&L) will be of order one-half; and if P(N&F&L)>>P(~N&F&L),
then P(N|F&L) will be nearly unity. Therefore, we need to look at the
ratio R = P(N&F&L)/P(~N&F&L) to see what factors govern
its size and what assumptions this entails.`

`We obtain:`

    R  = P(N&F&L) / P(~N&F&L)
       = {P(F|N&L) P(N&L)} / {P(F|~N&L) P(~N&L)}    (A)
       = P(N&L) / {P(F|~N&L) P(~N&L)}               (B)
      >= P(N&L) / P(~N&L)                           (C)
       = {P(N|L) P(L)} / {P(~N|L) P(L)}             (D)
       = P(N|L) / P(~N|L)                           (E)

`Here, (A) and (D) follow from the definition
of conditional probability, (B) by the WAP--which Kwon says he accepts--and
which asserts that P(F|N&L)=1, (C) because the probability P(F|~N&L)
in the denominator is <=1, and (E) by cancellation of P(L) in numerator
and denominator.`

`Thus we see that in fact the ratio R cannot
be small unless P(N|L)/P(~N|L) is also small. Therefore we cannot conclude
that P(N|F&L)<<1 unless P(N|L)/P(~N|L)<<1--regardless of
the size of P(N&F&L). But what is P(N|L)/P(~N|L)? Why, it is just
the prior odds ratio that You assign to describe Your relative belief in
N and ~N before You learn that F is true. Thus, although Kwon is correct
in noting that the only way to keep P(N|F&L) from being very small is
to have P(~N&F&L)<<1, this does not represent a prior commitment
to naturalism as he asserts. Indeed, a prior commitment to naturalism would
be to assume that P(N|L)/P(~N|L)>>1, and as (E) shows, if we assume
P(N|L)/P(~N|L) of order unity, which reflects a neutral prior position between
N and ~N, and not a prior commitment to naturalism, we will end up being
at least neutral between N and ~N after observing that F is true, regardless
of the size of P(N&F&L) and P(F|N).`
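The cancellation of P(L) in steps (D) and (E) can be verified numerically. The following Python sketch is an illustration only, with hypothetical input values and a function name of our own; it computes R from the unconditional joint probabilities for several very different values of P(L) and shows that R does not depend on them.

```python
def ratio_r(p_l, p_n_given_l, p_f_given_not_n_and_l):
    """R = P(N&F&L) / P(~N&F&L), built up as in steps (A)-(B).

    p_l                   : P(L), the unconditional probability of life
    p_n_given_l           : P(N|L), assumed prior for N given L
    p_f_given_not_n_and_l : P(F|~N&L), assumed, in (0, 1]
    The WAP gives P(F|N&L) = 1.
    """
    p_n_and_l = p_n_given_l * p_l
    p_not_n_and_l = (1.0 - p_n_given_l) * p_l
    p_n_f_l = 1.0 * p_n_and_l
    p_not_n_f_l = p_f_given_not_n_and_l * p_not_n_and_l
    return p_n_f_l / p_not_n_f_l


if __name__ == "__main__":
    prior_odds = 0.5 / (1.0 - 0.5)  # neutral prior, odds of 1
    for p_l in (1e-3, 1e-30, 1e-100):
        r = ratio_r(p_l, 0.5, 0.25)
        print(f"P(L)={p_l:g}: P(N&F&L)={0.5 * p_l:g}, R={r:g}, "
              f"R >= prior odds: {r >= prior_odds}")
```

However small P(N&F&L) becomes as P(L) shrinks, the ratio R stays fixed at (prior odds)/P(F|~N&L), which is never less than the prior odds.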

`Indeed, it requires a prior commitment to supernaturalism
to get P(N|F&L)<<1, because You would have to presume a priori
that P(N|L)<<P(~N|L). Kwon has it exactly backwards.`

`So the absolute size of P(N&F&L) and
P(F|N) do not tell us anything about P(N|F&L); this is a confusion between
conditional and unconditional probability. The only thing that counts is
the ratio R. Kwon's calculation in his steps (4-8) is simply irrelevant
to the final result. Indeed, we have the following theorem:`

Theorem: If P(F|N)<<1 and You are exactly neutral between N and ~N before learning F, then P(~N&F&L)<<1.

Proof: By the WAP, P(F&N&L)=P(N&L)=P(N|L)P(L); and since P(F&N&L)<=P(F&N)<=P(F|N)<<1, we have P(N|L)P(L)<<1. But if we are exactly neutral between N and ~N before learning F we have P(N|L)=0.5=O(1), so the unconditional probability P(L)<<1. By standard probability theory P(~N&F&L)<=P(L)<<1. QED.

`Thus, far from reflecting a prior commitment
to naturalism as Kwon claims, the result P(~N&F&L)<<1 is a
consequence of the fine tuning condition together with the adoption of an
at least neutral prior position on N versus ~N. It is due to the fact that
P(N&L&F) and P(~N&L&F) both have P(L)<<1 as a factor
when they are expanded using the definition of conditional probability.`

`Furthermore, it is even possible for P(~N|F&L)
to be very small (and therefore P(N|F&L) close to unity), without making
a prior commitment to naturalism. For example, suppose we adopt the neutral
position P(N|L)=P(~N|L)=0.5; then from (B) we find that R = 1/P(F|~N&L),
and if P(F|~N&L)<<1 then R>>1 and P(N|F&L) is close
to unity. But what does P(F|~N&L)<<1 mean? Is this a "prior
commitment to naturalism?" No, a prior commitment to naturalism would
involve some conditional probability on N, not some conditional probability
on F. The condition P(F|~N&L)<<1 actually means that it is likely
that an inhabitant of a supernaturalistically created universe would find
that it is ~F: a universe where life exists despite the fact that it could
not exist naturalistically, for example as a consequence of the suspension
of natural law by the supernatural creator. We discussed this extensively
in our article. Indeed, without psychoanalyzing the Deity and analysing
its powers and intentions, it is a priori quite likely that the Deity might
create universes that are ~F&L, for such universes are not excluded
unless we know something about this Deity that would prevent it from creating
such universes. An example of such a universe would be Paradise, and it
seems unlikely that enthusiasts of the "fine-tuning" argument
would be willing to say that the Deity would not create anything like Paradise.
But the only way for them to escape from P(F|~N&L)<<1 would be
for them to assert that the Deity would only, or mostly, create universes
that, if they contain life, are F, and we see no justification for such
an assumption.`

`Kwon makes some other incorrect statements later
in his web article. He says that our argument "incorrectly attributes
significance to P(N|L)." Kwon here appears to have missed the fact
that we are talking about Bayesian probabilities. The probability P(N|L)
refers to our universe, and is Your Bayesian prior probability that N is
true, given that You know that L is true (which must be the case since it
is a condition of reasoning that You be alive), but before You learn that
F is true. It is a reflection of Your epistemological condition or state
of knowledge at a particular moment in time. Thus, P(N|L) has a perfectly
definite meaning in our universe, although the value of P(N|L) will differ
from individual to individual because every individual has different background
information (not explicitly called out here but mentioned in our article).`

`Furthermore, Kwon is incorrect when he states
that "P(N|L) is irrelevant to our universe for the same reason that
P(N|F) is irrelevant." We never said that P(N|F) is irrelevant, only
that it is irrelevant for inference. The reason why P(N|F) is irrelevant
for inference is that no sentient being is unaware of L as background information.
Every sentient being knows that he is alive and therefore knows that L is
true; thus every final probability statement that he makes must be conditioned
on L. This is not true of F. There are sentient beings in our universe,
indeed in our world, that do not yet know that F is true. Most schoolchildren
do not know that F is true, although they know that L is true. Probably
most adults do not know that F is true. Thus, Kwon errs in drawing a parallel
between P(N|L) and P(N|F).`

`Kwon started with the perfectly reasonable proposal
that "fine tuning" is best defined by P(F|N)<<1, and attempted
to derive his result. That he was unable to do this comes as no surprise
to us, because one of us [whj] spent the better part of a year trying to
get useful information from propositions such as P(F|N)<<1, without
success. All such attempts were fruitless, and the reason why is seen in
our discussion. For example, suppose we were to assume in addition that
P(F|~N)=1. Even then, no useful result can be derived, for from this we
can only determine the obvious fact that P(F&L&~N)<=1, which
gives no useful information about the crucial ratio R. The inequality goes
in the wrong direction! Thus, "fine tuning"--P(F|N)<<1--tells
us nothing useful, which is why in our article we concentrated instead on
finding out what "life friendliness"--F--and the WAP can tell
us.`

`Kwon says, "We have always known that F
is true for our universe..." This is false. In fact, the suspicion
that F is true is relatively recent, going only back to Brandon Carter's
seminal papers in the mid-1970s. Earlier, physicists such as Dirac had
in fact speculated that the values of some fundamental physical constants
(e.g., the fine structure constant) might have been very different in the
past, which would violate F, and somewhat later other scientists (for example
Fred Hoyle in the early 1950s) used the assumption that F is true in
order to predict certain physical phenomena, which were later found to be
the case. Had those observations NOT been found to be true, F would have
been refuted, and we would seriously have to consider ~N. Even today we
do not know that our universe is F--"life-friendly"--in
the sense that we use the term in our article. We strongly suspect that
it is true, but it is conceivable that someone will make a WAP prediction
that will turn out to be false and which might refute F.`

`Kwon incorrectly asserts that the idea that
there may be other universes is "simply unscientific." Certainly
many highly respected cosmologists and physicists like Andrei Linde (Stanford),
Lee Smolin (Harvard), Alexander Vilenkin (Tufts) and Nobel laureate Steven
Weinberg (Texas) would disagree with this statement. Kwon claims that the
hypothesis of other universes "cannot be tested." While we might
agree that testing the hypothesis of other universes will be difficult,
we do not agree that the hypothesis is untestable, and neither do scientists
that work in this area. Some specific tests have been suggested. For example,
David Deutsch has proposed specific tests of the Everett-Wheeler interpretation
of quantum mechanics commonly known as the "Many-Worlds" hypothesis.
And recently an article that proposed another way that other universes might
be detected was published (Science, Vol. 292, p. 189-190, original
paper archived as http://arXiv.org/abs/hep-th/0103239).
Regardless, our argument is not dependent on the notion that there are many
other universes. It stands on its own.`

`Kwon misunderstands the point of the "god
of the gaps" argument. The problem isn't that the gap is being filled
by a god, the problem is what happens if the gap is filled by physics. Then
the god that filled the gap gets smaller. This is a theological problem,
not an epistemological or scientific problem. We agree with Kwon that there
are gaps in our physical explanation of the universe that may never be filled;
but it is hoping against hope that we will never fill any of the gaps currently
being touted by "intelligent design theorists" as proof of supernaturalism.
Some of them are certain to be filled in time, and each time this happens,
the god of the intelligent designers will be diminished. (In fact, some
of them were in fact filled even before the recent crop of "ID theorists"
made their arguments--this is true of some of Michael Behe's examples, for
which evolutionary pathways had already been proposed even before Behe published
his book).`

`As to Kwon's last point, that we incorrectly
claim that "intelligent design theorists" incoherently assert
both F and ~F: we believe that it is a correct statement that at least some
are arguing ~F. It is our impression, for example, that Michael Behe is
arguing that it is actually impossible, and not just highly unlikely, for
certain "irreducibly complex" (IC) structures to evolve without
supernatural intervention, and that is a form of ~F. Regardless, even if
no one is attempting to argue from ~F to ~N, our point still stands. Attempts
to prove ~N that argue from either F or P(F|N)<<1 or both do not work.
But attempts to prove ~N by showing ~F would work. Thus, people making anthropic
and "fine tuning" arguments have hold of the wrong end of the
stick. They should be trying to show that the universe is not F. It is clear
that showing that the universe is not F would at one stroke prove ~N; it
follows that showing that the universe is F can only undermine ~N and support
N; this is an elementary result of probability theory, since it is not possible
that observations of F as well as ~F would both support ~N. Since it is
trivially true that observing ~F does support ~N, observing F must undermine
it. Put another way, it seems to us that Michael Behe--if we understand
him--is making the right argument from a logical and inferential point of
view, and Hugh Ross is making the wrong argument. If it turns out that Behe
is not making the argument we think he is, then it is still the case that
Hugh Ross is making the wrong argument.`

`Kwon makes some remarks about "nontheists"
that seem to indicate that he thinks that only "nontheists" would
argue as we have. This is not the case. The issue here is whether the "fine
tuning" argument is correct. It is exactly analogous to the centuries
of work done on Fermat's last theorem. It is likely that most mathematicians
thought that the theorem was true for most of that time, yet they continued
to reject proofs that had flaws in them. They rejected them not because
they thought Fermat's last theorem was false, but because the proofs were
wrong. They even rejected Wiles' first attempt at a proof, because it was
(slightly) flawed. In the same way a theist can and should reject a flawed
"proof" of the existence of God. Our argument is that the fine
tuning arguments are wrong, and no one should draw any conclusions about
our personal beliefs from the fact that we say that these arguments are
wrong.`

`Conclusion: Kwon's "proof" is fatally
flawed. He incorrectly asserts that the only way to keep P(N|F&L) from
being very small is to assume naturalism a priori. Quite the contrary, the
only way to make P(N|F&L) small is to assume supernaturalism
a priori. Kwon apparently does not understand the significance of some of
the Bayesian probabilities we use; this is forgivable in a sense since
Bayesian probability theory is still misunderstood by most people, even
those with some training in probability theory...but it means that Kwon
should withdraw these comments until he understands Bayesian probability
theory well enough to criticize it. Kwon's assertion that we have always
known that our universe is F is false; his assertion that the existence
of other universes is untestable is also false, and in any case is not relevant
to our main argument. Finally, he mistakenly thinks that the god-of-the-gaps
argument somehow tells against science. It does not, since it is purely
a theological conundrum, not a scientific one.`

`Nonetheless, we thank David Kwon for his serious
and attentive reading of our article and for his comments. He is the first
to attempt a mathematical rather than a polemical refutation of our argument.
His argument fails because, as we show here, it isn't possible to derive
anything useful from the fine-tuning proposition P(F|N)<<1. When all
factors are taken into account, it is clear that the only way to end up
with a final result that P(N|F&L)<<1 is to assume at the outset
that supernaturalism is almost surely true, thus begging the question.
M. I.
W. J.
April 30, 2001`

`[ Note added 010613: When we posted this
response, we informed Mr. Kwon, so that he could either respond to our criticisms
or withdraw his web page. We regret to say that up to now he has done neither.`

`Note added 040109: Kwon has never responded to our criticisms; his web page
disappeared when he apparently finished his career as a Berkeley graduate
student. It is archived and can be obtained courtesy of the Internet
Wayback Machine via Makeashorterlink.com]`

`Note added 060406: Another version of Kwon's article appears to have migrated
here;
We do not know if this site is his or someone else's.`

APPENDIX 2: Why one must condition on L

`A correspondent who prefers to remain anonymous wrote us as follows
(reproduced with permission):`

`------------------------------Begin Quote--------------------------`

Recently I was led to your article with Michael Ikeda called "The Anthropic Principle Does Not Support Supernaturalism,"

http://quasar.as.utexas.edu/anthropic.html .

That is quite a striking conclusion.

A key step in your argument, on which you insist repeatedly, is that one must conditionalize on L, the claim that "[t]he universe exists and contains life." The only justification given for this claim, as far as I could find, is that we all know L and we should use everything that we know.

However, this bit of advice leads quickly to a paradox well known to philosophers of science, viz., Clark Glymour's "problem of old evidence."

The problem is that conditionalizing using everything that one knows leads, in some cases, to the absurd conclusion that new theories cannot be confirmed by old evidence. Such a conclusion contradicts common sense and scientific practice. A standard example is the confirmation of Einstein's GR by its entailing the anomalous perihelion precession of Mercury. This precession was known long before Einstein's theory, but Einstein and others have taken it to provide evidence for GR. Surely they were correct. But if one must always use all of the evidence on hand, then Einstein should have reasoned like this:

E=anomalous perihelion precession of Mercury

T=GR

P(E)=1 because E is known.

P(E|T)=1 because P(E)=1.

So Bayes's theorem

P(T|E) = P(T) P(E|T)/P(E) gives P(T|E) = P(T)*1/1 =P(T): the probability of GR is not increased by E! Some standard responses to this problem involve not using all of one's evidence in some fashion or other.

In short, the only motivation that I find in your paper cited above for conditionalizing on L is one that is widely known among philosophers of science to give absurd conclusions in certain cases. Glymour discusses this problem in "Why I Am Not a Bayesian" in his _Theory and Evidence_ (Princeton, 1980), which is also reprinted Curd and Cover, _Philosophy of Science: The Central Issues_ (Norton, NY, 1998), with commentary, which is where I am looking at it. A dozen or two responses or counterresponses to the problem can be found in the Philosopher's Index database. Thus a key step in your argument is presently unmotivated in your online paper.

`------------------------------End Quote--------------------------`

`We have quoted our correspondent's letter in full to address several
issues. First, the argument that he attributes to Glymour is wrong. Second,
even if it were right, it is not properly applied to the present situation.
Third, we will show that for any argument to be sound, it must
include all background information which is known to be true and
which affects (changes) the likelihood. In the present situation, L has
this status. This will motivate in a formal way our assertion that we must
condition on L.`

`Since we have not had an opportunity to read Glymour's original essay,
and are therefore not absolutely certain that our correspondent has presented
his argument correctly, in the following we will designate the argument
our correspondent attributes to Glymour as "Argument A".`

`We will first deal with Argument A. The argument contains an obvious,
fatal flaw.`

`It is simply not the case that the fact that we have observed
evidence E entails that P(E)=1. Since everything in Argument A follows from
this mistaken assumption, Argument A is wrong.`

`P(E) is not the probability that E has been observed.
It is the probability of observing E, instead of something else,
averaged over all theories in the set TH = {T1, T2,...} under consideration,
with weights proportional to the prior probabilities of the theories in
TH. [We assume that every theory T in TH has positive prior probability,
i.e., P(T)>0 for all T in TH]. E is a candidate from the set of all
possible outcomes EV = {E1, E2,...} that these theories predict could
be observed. Therefore, P(E) is in general not equal to 1, even
after you have observed E. Indeed, P(E) is the same number before
you observe evidence E, after you observe evidence E, or even if you never
observe evidence E. It is equal to 1 if and only if every theory
in TH predicts that only E could ever be observed.`
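The definition of P(E) as a prior-weighted average can be written out in a few lines of Python. This is a sketch under assumptions: the two theories, their priors and their sampling probabilities are hypothetical numbers chosen for illustration, and the function name is ours. Note that nothing in the calculation refers to whether, or when, E was actually observed.

```python
def marginal_likelihood(priors, sampling, outcome):
    """P(E) = sum over theories T in TH of P(T) * P(E|T).

    priors   : dict mapping theory name -> P(T), all positive, summing to 1
    sampling : dict mapping theory name -> dict of outcome -> P(E|T)
    outcome  : the outcome E whose marginal probability is wanted
    """
    return sum(priors[t] * sampling[t][outcome] for t in priors)


if __name__ == "__main__":
    # Hypothetical theories and numbers, for illustration only.
    priors = {"T1": 0.3, "T2": 0.7}
    sampling = {
        "T1": {"E1": 0.9, "E2": 0.1},
        "T2": {"E1": 0.2, "E2": 0.8},
    }
    print("P(E1) =", marginal_likelihood(priors, sampling, "E1"))
    print("P(E2) =", marginal_likelihood(priors, sampling, "E2"))
```

With these inputs P(E1) is well below 1. It would equal 1 only if every theory in the table assigned probability 1 to E1.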

`As Tom Loredo
pointed out to us when we showed him Argument A, "Time plays the same
role in probability theory as it does in logic, i.e., no role whatsoever."
This means the probability calculus, like the logic calculus, produces sound
results, independently of when you learn the truth or falsity of any of
the premises in the statement. This fact becomes obvious when one learns
that in the limit when propositions are definitely true or false, probability
theory reduces to ordinary logic, as a consequence of a theorem due to Cox (1946). For
a transparent discussion of this relationship, see pp. 12-23 of the following
lecture
by Tom Loredo.`

`P(E) is known technically as the marginal likelihood, and
it is correctly computed using a specific formula involving another
quantity known as the likelihood function. It is never computed
from a naive statement such as "I've observed E, therefore P(E)=1."
In what follows we will define these quantities and show how Argument A
should have calculated P(T|E) from P(T) and knowledge of E. We will also
show precisely where Argument A went wrong.`

`In Bayesian inference, one is interested in learning how the inclusion
of evidence E changes our belief about the plausibility of various
theories, compared to what one believed about those theories without
that evidence. This means that one should start with P(T), unconditioned
on E (i.e., without that evidence), and given E, calculate P(T|E) (with
that evidence). This is what Argument A alleges to do, but does incorrectly.
For clarity, we will restrict ourselves to just two theories, {T1, T2}.`

`Standard Bayesian theory starts with P(E|T). This is generally not
equal to 1, even if we have already observed evidence E. Technically,
when P(E|T) is conditioned on a fixed theory T and considered as a function
of the various E in EV, it is known as the sampling distribution under
T. It tells us, on the assumption that T is true, the probability of
observing each outcome E, where E ranges over all the possible outcomes
in EV. Since it is a probability (when considered as a function of E), its
sum over all the possible values of E is 1:`

P(E1|T)+P(E2|T)+P(E3|T)+...=1

`Because of this equation, P(E|T) can be equal to 1 only when
the theory T predicts that it is impossible to observe any
outcome other than E. This is true regardless of whether E has already been
observed, is yet to be observed, or even if it is never observed.`

`The sampling distribution (that is, the function P(E|T)) doesn't
care what evidence we actually observe. It is constructed independently
of any observed evidence, and has the same numerical value for each
of its arguments after evidence E is observed as it had before. It
is therefore only a tool to describe a particular theory T, and not
a description of evidence that may or may not have been observed. `

`In Bayesian inference, one is interested in comparing several theories.
For each theory T in TH, we construct its sampling distribution P(E|T),
which tells us how likely it is, under each theory, that we would observe
evidence E (ranging over all the alternatives contained in EV). Once we
observe a particular piece of evidence E, we are able to consider
P(E|T) as a function of the second argument T. The function of T that we
get by fixing E at its observed value and allowing T to vary over all theories
in TH is known as the likelihood function. It is not a probability,
and it is not normalized (the sum of P(E|T) over all T doesn't have
to add up to 1). It can even be multiplied by an arbitrary positive constant
C (independent of T) without affecting any inferences. `
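
`The invariance under rescaling can be checked numerically. The following Python sketch is only an illustration: the function name posterior, the likelihood values 3/10 and 9/10, and the constant C=7 are assumptions chosen for the example and do not appear in the argument itself.`

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Return P(T|E) for each theory T, given P(T) and the likelihood P(E|T) at the observed E."""
    # Marginal likelihood: P(E) = sum over T of P(E|T) P(T)
    evidence = sum(likelihood[t] * prior[t] for t in prior)
    return {t: likelihood[t] * prior[t] / evidence for t in prior}

prior = {"T1": Fraction(1, 2), "T2": Fraction(1, 2)}

# Hypothetical likelihood values at the observed E; as a function of T they need not sum to 1.
likelihood = {"T1": Fraction(3, 10), "T2": Fraction(9, 10)}

# Multiply the likelihood by an arbitrary positive constant C = 7 (independent of T).
scaled = {t: 7 * value for t, value in likelihood.items()}

post = posterior(prior, likelihood)
post_scaled = posterior(prior, scaled)
print(post["T1"], post_scaled["T1"])
```

`With these assumed numbers the likelihood values sum to 6/5 rather than 1, and the posterior probability of T1 is 1/4 whether or not the likelihood is multiplied by C.`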

`In the general relativity example, we are interested in comparing
theory T1 (say general relativity) with theory T2 (say Newtonian physics).
The likelihood function is given by the values of P(E|T1) and P(E|T2), evaluated
with the actual evidence E we have observed. Suppose there are only two
possible outcomes of our experiment, E1="observe anomalous perihelion
precession of Mercury" and E2="observe no anomalous perihelion
precession of Mercury". `

`The sampling distribution under the two theories is as follows:`

P(E1|T1)=1, P(E2|T1)=0

P(E1|T2)=0, P(E2|T2)=1

`This is because T1 predicts that we must observe anomalous
perihelion motion, and T2 predicts that we cannot observe anomalous
perihelion motion[1]. It doesn't matter when E1 or E2 is observed; these
probabilities are dictated by the theory alone, and not by any observations
that might or might not have been made. Historically, E1 was observed more than
half a century before general relativity was proposed. But even so, the sampling
distributions under each theory, which are always constructed independently
of any evidence, describe only what the theories say we can observe,
and are as given above.`

`Once we say to ourselves, "We observed E1, not E2", we
can refine the situation. For now we can write down the likelihood function,
which is a function of the second argument, with the first argument
fixed at the observed E1. Consulting the above four equations, we
find that the likelihood is given by`

P(E1|T1)=1, P(E1|T2)=0

`Note: Even though we now know that E1 is true, P(E1|T2) does not
suddenly change its value to 1 as Argument A would seem to say, but (in
this example) remains equal to 0. To repeat what we've said before, this
is because for every theory T, the function P(E|T) describes the
theory T, independently of any evidence E we may have actually observed.
`

`Next, we must assign priors to T1 and T2. As an illustration, set
P(T1)=P(T2)=1/2. With this assignment, we can compute the marginal likelihood,
P(E1). This is always computed by expanding P(E1) as follows:`

P(E1)=P(E1|T1)P(T1)+P(E1|T2)P(T2)=1*1/2+0*1/2=1/2

`Note: Argument A claims that P(E1)=1; this is manifestly false. P(E1)
is just a normalization constant, designed to guarantee that the posterior
probability is a normalized probability on the theories T1, T2, T3,... Thus,
P(T1|E1)+P(T2|E1)+...=1. Routine calculation shows that this requires us
to set`

P(E1)=P(E1|T1)P(T1)+P(E1|T2)P(T2)

`Finally, we calculate the posterior probability of T1, given E1,
this time correctly:`

P(T1|E1)=P(E1|T1)P(T1)/P(E1)=(1*1/2)/(1/2)=1

`Notice that the calculation results in a posterior probability P(T1|E1)
that is different from the prior probability P(T1)! Contrary to Argument
A's assertion, we can learn from old data, and the inclusion of old
evidence E1 does support T1 by showing (in this case) that P(T1|E1)>P(T1).`
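
`The arithmetic above can be reproduced mechanically. This minimal sketch uses the sampling distributions and priors given in the text; the dictionary layout and the variable names are illustrative choices only.`

```python
from fractions import Fraction

# Sampling distributions P(E|T) from the text: one row per theory.
sampling = {
    "T1": {"E1": Fraction(1), "E2": Fraction(0)},  # general relativity
    "T2": {"E1": Fraction(0), "E2": Fraction(1)},  # Newtonian physics
}

# Each sampling distribution sums to 1 over the possible outcomes E.
for theory, row in sampling.items():
    assert sum(row.values()) == 1

prior = {"T1": Fraction(1, 2), "T2": Fraction(1, 2)}

# Fix E at its observed value to obtain the likelihood as a function of T.
observed = "E1"
likelihood = {t: sampling[t][observed] for t in sampling}

# Marginal likelihood P(E1), then the posterior from Bayes' theorem.
marginal = sum(likelihood[t] * prior[t] for t in prior)
posterior = {t: likelihood[t] * prior[t] / marginal for t in prior}

print(marginal, posterior["T1"], posterior["T2"])  # 1/2 1 0
```

`Observing E1 does not alter any entry of the sampling table; it only selects which column is read off as the likelihood.`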

`Evidently, something has gone wrong. A clue as to what is wrong with
Argument A can be gleaned from its (incorrect) claim that P(E)=1. Apparently,
the thinking is: E is old evidence, I know that E is true, therefore P(E)=1.
This reasoning is incorrect, because the only correct way to calculate
P(E) is through the expression we have displayed above. Nonetheless, from
this insight into the thinking, we can infer what's gone wrong. Argument
A is actually conditioning on the fact that E has already been observed,
without displaying that conditioning explicitly. Thus, what Argument A calls
P(E) is actually P(E|E), which is equal to 1. It regards E as already-known
background information.`

`Bayes' theorem, written with background information B, takes the
form`

P(T|E,B)=P(E|T,B)P(T|B)/P(E|B)

`If E is regarded as background information B, simple substitution
yields`

P(T|E)=P(T|E,E)=P(E|T,E)P(T|E)/P(E|E)=P(T|E),

`since trivially P(E|E)=P(E|T,E)=1. This statement correctly demonstrates
that if we start with P(T|E) as the prior on T, then inserting E into Bayes'
theorem as evidence does not change anything. The posterior equals the prior.
Bayes' theorem does not allow you to use the same evidence twice.`
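
`A small numerical sketch of this point follows. It assumes a hypothetical joint distribution over two theories and two outcomes; the numbers and the helper names condition and theory_prob are invented for illustration. Conditioning on E1 the first time changes the probability of T1, while conditioning on E1 a second time changes nothing.`

```python
from fractions import Fraction

# Hypothetical joint distribution P(T, E); the four entries sum to 1.
joint = {
    ("T1", "E1"): Fraction(4, 10), ("T1", "E2"): Fraction(1, 10),
    ("T2", "E1"): Fraction(1, 10), ("T2", "E2"): Fraction(4, 10),
}

def condition(dist, outcome):
    """Condition a distribution over (theory, outcome) pairs on the given outcome."""
    kept = {key: p for key, p in dist.items() if key[1] == outcome}
    total = sum(kept.values())
    return {key: p / total for key, p in kept.items()}

def theory_prob(dist, theory):
    """Marginal probability of a theory under the given distribution."""
    return sum(p for (t, _), p in dist.items() if t == theory)

prior_t1 = theory_prob(joint, "T1")   # P(T1), which has never used E1
once = condition(joint, "E1")         # P(. | E1)
twice = condition(once, "E1")         # P(. | E1, E1)

print(prior_t1, theory_prob(once, "T1"), theory_prob(twice, "T1"))  # 1/2 4/5 4/5
```

`The second update is the harmless identity P(T|E,E)=P(T|E). The first update is the one that Argument A wrongly claims cannot occur.`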

`But the rub is that the real prior P(T) has never used
evidence E, not even once. Argument A is claiming that if evidence is old,
Bayes' theorem shows that P(T|E)=P(T). But that is false. If one substitutes
P(T) for P(T|B) on the right hand side of Bayes' theorem above, one gets
the "equation"`

P(T|E,B)=P(E|T,B)P(T)/P(E|B) (???),

`which is not a theorem and is in general false. If we were
to set B=E in this expression, we would get P(T|E)=P(T), but since the expression
is not a theorem, the argument is invalid.`

`The late E. T. Jaynes, in his book Probability Theory: The Logic
of Science (Cambridge University Press), put his finger on the problem
when he pointed out that failure to condition properly on all known and
relevant background information often leads to apparent paradoxes
in probability theory. These apparent paradoxes disappear when the correct
conditioning is displayed explicitly, as we have done above. `

`The attentive reader will also notice that Jaynes' dictum to condition
on all known and relevant background information is precisely what
we have been saying all along in our discussion of the anthropic principle.
L is known to be true a priori, and affects the likelihood; therefore one
must condition on L in order to avoid apparent "paradoxes" such
as Argument A.`

`Even if Argument A were correct, it would be irrelevant to our discussion.
The reason is simple. Our interest is in what happens when new information
F is presented to someone who already knows that L is true, and who
has evaluated his priors in the light of the fact that L is true.
In other words, we are only interested in what happens when a Bayesian calculating
machine that knows that L is true is given, for the first time, the new
information that F is true. As we point out in our article, every sentient
being knows from the first time that it becomes sentient that L is true.
But F is genuinely new information, only known to some physicists since
c. 1950 at the earliest, and still unknown to the majority of human beings.
The "fine tuning" argument isn't "What do you think about
God, when you learn that you are alive?" but "What do you think
about God, when you learn that the universe is (apparently) fine-tuned or
life-friendly?"`

`This means that the argument about "old evidence" is not
even relevant to our discussion, since we are talking about what happens
when you learn that F is true, not about what happens when you learn that
L is true (which you already knew...). The "old information" is
L, but the "new information" is F.`

`Having dealt with Argument A, we now deal with the objection that
our requirement to condition on L is unmotivated. We motivate the conditioning
on L by appealing to the principle that arguments should be sound. `

`For an argument to be sound, it must be both factually
correct and valid.`

`Factually correct means that all its premises are
true. Valid means that the conclusions follow from the premises.`

`For example, consider the argument:`

All men are immortal

Socrates is a man

Therefore, Socrates is immortal

`This argument is valid, because the conclusion logically follows
from the premises. However, the premise "All men are immortal"
is factually incorrect, therefore the argument is unsound.`

`Conversely, the argument`

All men are mortal

Socrates is mortal

Therefore, Socrates is a man

`is unsound because it is invalid, even though it is factually correct.
The premises are true, but the conclusion does not follow from the premises.
`

`Similarly, a Bayesian calculation is valid if it uses the
probability calculus correctly, and factually correct if all of its
premises (assumptions) are correct. It is sound if it is both valid
and factually correct.`

`We will show that if one attempts to ignore true background information
B in the likelihood function P(E|T,B), and if B actually affects the values
taken on by the likelihood, the argument will not be factually correct,
and therefore the argument will be unsound.`

`Suppose that I claim to draw a conclusion about T from evidence E,
and claim that only P(E|T), unconditioned on B, needs to be considered as
the likelihood.`

`You are skeptical of this. You note that, regardless of what B is,
one can always write`

P(E|T)=P(E|B,T)P(B|T)+P(E|~B,T)P(~B|T)

`You also note that, by Bayes' theorem,`

P(B|T)=P(T|B)P(B)/P(T) and P(~B|T)=P(T|~B)P(~B)/P(T)

`Plugging these expressions into the previous we find`

P(E|T)=[P(E|B,T)P(T|B)P(B)+P(E|~B,T)P(T|~B)P(~B)]/[P(T|B)P(B)+P(T|~B)P(~B)],

`where the denominator P(T) has been expanded over B and ~B, in the
same way that P(E1) was expanded over T1 and T2 above.`

`You challenge me to tell you whether B is true or not. If I know
that B is true, regardless of how I know it, I am obliged to tell you the
truth. If I fib to you, then the argument I am trying to make will automatically
be factually incorrect, since some premises will be false, and hence
my argument will be unsound.`

`Thus, I am obliged to report to you that B is true, so that P(~B)=0.
You will then calculate`

P(E|T)=P(E|B,T) for all values of E and T

`You will conclude that you cannot leave B out of the conditioning
on the likelihood. My attempt to avoid conditioning on B has failed: If
the presence of B affects the likelihood P(E|B,T), we must use a P(E|T)
that reflects that information by being numerically equal to P(E|B,T) for
all values of E and T. Thus, the actual likelihood is P(E|B,T), despite
my attempt to pull the wool over your eyes by not mentioning B when I wrote
down the likelihood. Only if E is independent of B is it justified to use
just P(E|T), because independence means that P(E|B,T)=P(E|T).`
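
`The expansion over B and ~B can be written as a short function. In this sketch the function name and all of the numerical inputs are hypothetical; the point is only that once P(B)=1, the ~B terms drop out and P(E|T) coincides with P(E|B,T).`

```python
from fractions import Fraction

def p_e_given_t(p_e_bt, p_e_nbt, p_t_b, p_t_nb, p_b):
    """P(E|T) expanded over B and ~B, with P(B|T) obtained from Bayes' theorem.

    p_e_bt = P(E|B,T), p_e_nbt = P(E|~B,T),
    p_t_b = P(T|B), p_t_nb = P(T|~B), p_b = P(B).
    """
    p_nb = 1 - p_b
    p_t = p_t_b * p_b + p_t_nb * p_nb                  # denominator P(T)
    return (p_e_bt * p_t_b * p_b + p_e_nbt * p_t_nb * p_nb) / p_t

# Assumed inputs, chosen so that B strongly affects the likelihood.
args = (Fraction(9, 10), Fraction(1, 10), Fraction(1, 3), Fraction(2, 3))

uncertain_b = p_e_given_t(*args, p_b=Fraction(1, 2))   # B not known to be true
known_b = p_e_given_t(*args, p_b=Fraction(1))          # B reported as true

print(uncertain_b, known_b)  # 11/30 9/10
```

`With P(B)=1 the result equals P(E|B,T)=9/10 regardless of the values assumed for P(E|~B,T) and P(T|~B).`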

`Jaynes' dictum, "Condition on everything you knew before the
new evidence," is validated.`

`Specifically, in the example at hand, E=F, T=N and B=L. So, we have
shown that even if I attempt to leave L out of the equation, we find that
numerically, for all values of F and N,`

P(F|N)=P(F|L,N)

`In particular, we know that the probability of F under L and N is 1,
that is, P(F|L,N)=1; for if we have a naturalistic universe
that contains life, this entails that F is true. And we know that the
probability of F under L and ~N is <= 1, since we cannot logically rule
out non-naturalistic universes with life that are ~F.`

`Therefore, we compute that the Bayes factor P(F|L,N)/P(F|L,~N)>=1,
i.e., observing that F is true supports (or at least does not undermine)
our belief in N. This is precisely the same conclusion we obtained before.
Even if I try to pull the wool over your eyes by failing to mention L in
the conditioning, the above argument shows that P(F|N)/P(F|~N)>=1 when
the correct likelihood is used. `
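
`As a numerical illustration, suppose P(F|L,~N)=q for some q with 0<q<=1. The values of q and the prior P(N|L)=1/100 used below are arbitrary assumptions, as is the function name. The sketch only checks that the Bayes factor 1/q is at least 1 and that the posterior probability of N never falls below the prior.`

```python
from fractions import Fraction

def posterior_n(prior_n, q):
    """P(N|F,L), given P(N|L)=prior_n, P(F|L,N)=1 and P(F|L,~N)=q."""
    # Bayes' theorem with L as background information; the likelihood under N is 1.
    return prior_n / (prior_n + q * (1 - prior_n))

prior_n = Fraction(1, 100)  # assumed prior P(N|L)

for q in (Fraction(1, 10), Fraction(1, 2), Fraction(1)):
    bayes_factor = 1 / q                 # P(F|L,N)/P(F|L,~N)
    post = posterior_n(prior_n, q)
    assert bayes_factor >= 1
    assert post >= prior_n
    print(q, bayes_factor, post)
```

`When q=1 the Bayes factor is exactly 1 and the posterior equals the prior; for any smaller q, learning F raises the probability of N.`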

`----`

`[1] This is only a very good approximation. The value of the anomalous
GR precession is about 43"/century, close to the observed value. But
actually, if through extraordinarily bad luck the observational errors just
happened to be horrible, it might be possible to have observed an anomalous
43"/century even if the true value were zero, and vice versa. Strictly
speaking, therefore, the probabilities in the table should be very close
to 1 or 0, but ought to differ from these numbers by a very small quantity.`

`----`

`M.I.
W.J.
April, 2006`

`All materials at this website Copyright (C) 1994-2006 by William
H. Jefferys. This webpage Copyright (C) 1997-2006 by Michael Ikeda and William
H. Jefferys. Portions of this webpage Copyright (C) 1997 by Richard Harter.
All rights reserved.`

`This page was last modified on 060206.
`