PART TWO

Proving History: Bayes's Theorem and the Quest for the Historical Jesus

As one analyst put it, BT actually explains “what are regarded as sound methodological procedures” and reveals “the infirmities of what are acknowledged as unsound procedures” in almost any empirical field.¹ In other words, Bayes's Theorem underlies all other methodologies and thus explains why certain methods are regarded as sound, and others not—even when advocates or detractors of various methods are unaware of BT's capability in this regard. This entails a testable prediction, that all valid empirical methods reduce to BT: any method you propose will either be logically invalid or it will be described by BT. One might challenge how universally that's true, but here I will demonstrate that it at least holds for historical methods. I'll start with the most widely applicable examples, increasing in degrees of generalization, then test a few common methods of narrower scope.

Years ago I described two historical methods as defining the best that historians have deployed: the Argument to the Best Explanation (ABE) and the Argument from Evidence (AFE).² The literature on historical method and the epistemology of history essentially supports this conclusion, all of it being reducible to one or the other.³ Yet I have since discovered that everything I had argued can be better framed in Bayesian terms, especially since neither the ABE nor the AFE solves the problems I now identify as plaguing every historical method: establishing logical validity and epistemic sufficiency; in other words, why should the conclusion of these arguments be deemed logically valid and when is the evidence enough to warrant belief in the conclusion?

THE ARGUMENT FROM EVIDENCE (AFE)

According to the AFE, there are at least five respected categories of historical evidence, and the more a conclusion has support from each category, the more likely it is to be true. Those categories are:

Physical-Historical Necessity: the degree to which history could not have proceeded as it did had the event(s) not occurred; that is, the degree to which the event is required to account for all subsequent history.

Direct Physical Evidence: archaeological evidence, material evidence, evidence physically produced by the event(s) or person(s) in question.

Unbiased or Counterbiased Corroboration: witnesses who have no known motive to lie or exaggerate, or even a motive to lie or exaggerate in the opposite direction.

Credible Critical Accounts: accounts by known scholars of the period that exhibit the use of a critical analysis and evaluation of multiple lines of evidence (as opposed to just repeating a story).

Eyewitness Accounts: accounts by actual eyewitnesses of the event(s) or person(s) in question.

All these amount to saying “these categories of evidence are unlikely to exist unless the proposed event happened,” which in Bayesian terms means a low P(e|~h.b) relative to a high P(e|h.b). That's how you represent the fact in BT that the evidence (e.g., subsequent history, eyewitness testimony, etc.) is very improbable unless h (the hypothesized event[s]) happened. For example, Julius Caesar's conclusive capture of Rome and his unchallenged firsthand account of crossing the Rubicon would both have been improbable unless Caesar actually crossed the Rubicon. Not impossible, but improbable—certainly less probable than if Caesar had indeed done that. Which is simply a colloquial way of saying P(e|~h.b) is low (because any plausible alternative account of things would render that evidence unlikely) and P(e|h.b) is high (as the event having occurred makes all that evidence very likely indeed). The five categories of evidence in the AFE just represent five different ways evidence can be more probable on h than ~h. And that list of five isn't even exhaustive. Though additions to it are likely to have much less weight (these five being typically the strongest types of evidence to have), there still are other kinds of evidence. Hence, BT is more generic and thus more universal, and all the premises of the AFE are logically included in the premises of BT.

This is confirmed by observing the effect of taking evidence away. If the physical-historical necessity of an event is minimal, then given solely that factor P(e|~h.b) and P(e|h.b) are about equal, because then the evidence (of the subsequent course of history) is as likely to have occurred even if our hypothesis is false. We thus need other evidence. Indeed, if we should expect the subsequent course of history to have been different on h, then the value of that evidence is actually reversed and counts against our hypothesis. Then it is P(e|~h.b) that is high, and P(e|h.b) that is low. Likewise, if there is no physical evidence and if that absence of evidence is just as likely on either h or ~h (such as due to the rarity of evidence surviving, as is frequently the case in ancient history), then again the consequents are equal (or near enough—see my discussion of ‘evidence loss’ in chapter 6, page 219). But if that absence of evidence is unusual on h (as is often the case in more modern history, where we actually expect physical and documentary evidence), then its absence argues against h, and P(e|~h.b) is then higher than P(e|h.b), sometimes very much higher (I'll examine the specific logic of an Argument from Silence later in this chapter, page 117). The same reasoning can be followed through for the other three categories of evidence. In fact, you can use BT to fully analyze the consequences of differing degrees of evidence. For example, just how unbiased or counterbiased a source is may not be so black and white. If it is only slightly unbiased, then the degree by which it will lower P(e|~h.b) will be smaller than if it were an ideally neutral source (such as someone who doesn't even understand the significance of what they are attesting to).⁴
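The three situations just described (evidence far more expected on h, evidence equally expected either way, and evidence more expected on ~h) can be illustrated with a short calculation. This is only a minimal sketch: the function name `posterior`, the equal priors of 0.5, and the particular consequent values are illustrative assumptions, not figures from the text.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(h|e.b) by Bayes's Theorem for a hypothesis h and its negation ~h."""
    prior_not_h = 1.0 - prior_h  # P(~h|b) is fixed once P(h|b) is chosen
    numerator = prior_h * p_e_given_h
    denominator = numerator + prior_not_h * p_e_given_not_h
    if denominator == 0:
        raise ValueError("the evidence has zero probability on both h and ~h")
    return numerator / denominator


# Hypothetical consequent pairs (P(e|h.b), P(e|~h.b)); priors assumed equal.
cases = {
    "evidence far more expected on h": (0.9, 0.1),
    "evidence equally expected either way": (0.5, 0.5),
    "evidence more expected on ~h": (0.1, 0.9),
}

results = {
    label: posterior(0.5, p_e_h, p_e_not_h)
    for label, (p_e_h, p_e_not_h) in cases.items()
}

for label, value in results.items():
    print(f"{label}: P(h|e.b) = {value:.2f}")
```

With these assumed inputs the three results come out at roughly 0.9, 0.5, and 0.1: equal consequents leave the prior untouched, and reversing the consequents reverses the verdict.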

In every case, the degree to which the consequents differ from each other will reflect the degree to which the evidence is unexpected on either h or ~h. It should not be difficult to select an a fortiori measure of degree from the canon of probabilities supplied in chapter 3 (and repeated in the appendix, page 286). If you want to be sure whether h is credible, and the evidence does entail that P(e|h.b) is higher than P(e|~h.b), then select a difference between them that is smaller than you are certain it really is. For example, if you are sure the odds must be far lower than one in one hundred that all this evidence would exist if ~h, then select one in one hundred (0.01), or even one in twenty (0.05), as reflecting the degree of difference between the consequents; you might thus assign P(e|h.b) = 1 and P(e|~h.b) = 0.05. Then whatever result you get, if it supports believing h, the true result will support that conclusion even more, since you know any correction of your estimated probabilities can only raise the final epistemic probability that h is true (as explained in chapter 3, page 85).
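The a fortiori reasoning above can be checked numerically. The sketch below assumes equal priors (0.5) and P(e|h.b) = 1, and compares the conservative estimates from the text (0.01 and 0.05) against a hypothetical "true" value of 0.001 that is introduced here purely for illustration.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(h|e.b) by Bayes's Theorem for a hypothesis h and its negation ~h."""
    numerator = prior_h * p_e_given_h
    denominator = numerator + (1.0 - prior_h) * p_e_given_not_h
    return numerator / denominator


PRIOR = 0.5  # assumed equal priors, for illustration only

# Candidate values for P(e|~h.b): a hypothetical true value (0.001),
# then the two increasingly conservative a fortiori estimates from the text.
estimates = [0.001, 0.01, 0.05]
posteriors = [posterior(PRIOR, 1.0, p) for p in estimates]

for p, result in zip(estimates, posteriors):
    print(f"P(e|~h.b) = {p}: P(h|e.b) = {result:.4f}")
```

Because the posterior falls as P(e|~h.b) rises, the most conservative estimate (0.05) yields the lowest of the three results, so a conclusion that survives it would also survive any smaller, more accurate value.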

What about prior probability? The AFE as stated implicitly assumes all competing hypotheses are equally likely prior to considering the evidence for them—which limits its validity. For example, the famous rain miracle of Marcus Aurelius only has extant evidence supporting magic or miracle as its explanation and description.⁵ To what extent are we obliged to conclude that those reports are correct, rather than preferring a hypothesis of what happened that explains these reports and the event without recourse to sorcery or meddling gods? The AFE can't answer that question in any logically valid way. But it can answer the question of whether the evidence we have is very likely or unlikely on any given hypothesis, which corresponds to the consequent probabilities in BT. So when the priors are indeed equal, or close enough that the evidence-produced disparity in consequents would easily overwhelm them, the AFE gives intuitively correct results. Thus, BT formally represents the logic of the AFE, and the AFE is only valid insofar as it can be validly represented with BT.
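The role of the prior can be seen by holding the consequents fixed and changing only P(h|b). In this minimal sketch every number is a hypothetical assumption: the consequents 1.0 and 0.05 stand for evidence strongly favoring h, and the prior of 0.001 stands in for a hypothesis that background knowledge renders very atypical.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(h|e.b) by Bayes's Theorem for a hypothesis h and its negation ~h."""
    numerator = prior_h * p_e_given_h
    denominator = numerator + (1.0 - prior_h) * p_e_given_not_h
    return numerator / denominator


# Same hypothetical consequents in both cases: evidence twenty times
# more expected on h than on ~h.
P_E_H, P_E_NOT_H = 1.0, 0.05

# Case 1: priors equal, the situation the AFE implicitly assumes.
equal_priors = posterior(0.5, P_E_H, P_E_NOT_H)

# Case 2: a very low prior (an assumed value for an extraordinary hypothesis).
low_prior = posterior(0.001, P_E_H, P_E_NOT_H)

print(f"equal priors: {equal_priors:.3f}")
print(f"low prior:    {low_prior:.3f}")
```

Under these assumptions the same evidence makes h very probable when the priors are equal but leaves it improbable when the prior is very low, which is the distinction the AFE by itself cannot register.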

THE ARGUMENT TO THE BEST EXPLANATION (ABE)

According to the ABE, there are five qualities a theory can possess in respect to the evidence, such that the more it fulfills those qualities over any alternative explanation of the same evidence, the more likely it is to be true. Those qualities are (quoting and paraphrasing McCullagh⁶):

Plausibility: the hypothesis must conform to the expectations set by our background knowledge; formally, “it must be implied…by a greater variety of accepted truths than any other, and be implied more strongly than any other; and its probable negation must be implied by fewer beliefs, and implied less strongly than any other.”

Ad Hocness: the hypothesis must rely on the fewest ad hoc assumptions possible to explain the evidence, that is, assumptions for which there is no evidence or established agreement, or things just made up to force the hypothesis to fit; formally, “it must include fewer new suppositions about the past,” and about the nature of man and the world, “which are not already implied to some extent by existing beliefs.”

Explanatory Power: the hypothesis must make the evidence we have very probable; formally, “it must make the observation statements it implies more probable than any other.”

Explanatory Fitness: the hypothesis must not contradict any evidence or well-established beliefs, or at least contradict them much less than any competing theory does (since contradictory evidence can be explained away by various devices, sometimes legitimately; indeed a new result contradicting a prior belief is exactly how we discover a prior belief is false); formally, “when conjoined with accepted truths it must imply fewer observation statements…which are believed to be false.”

Explanatory Scope: the hypothesis must explain more of the evidence we have than any other hypothesis can; formally, “it must imply a greater variety of observation statements” that can be checked against surviving evidence.

This list is essentially just a lay summary of BT. For each criterion, the question hinges on how much h exceeds all the alternatives (which together constitute ~h), which requires a measure of degree, which is by definition mathematical. And such a measure of degree (by which h exceeds ~h) is exactly what BT employs. With the ABE, the end result requires combining all five factors, which can be complicated if competing hypotheses match or exceed each other on different criteria. Yet the ABE provides no means of ascertaining the effect of combining all five criteria. Even in straightforward cases, where h exceeds ~h on every criterion, what degree of belief is warranted by that degree of superiority on each of those five criteria? The ABE alone cannot answer that question. BT can; likewise on complex cases. Hence, BT is superior to the ABE.

In fact, the ABE criteria themselves are just colloquial versions of the premises in BT. Plausibility combined with Ad Hocness is simply a description of prior probability. Hence, prior probability in BT is the combination of ‘Plausibility’ and ‘Ad Hocness’ in the ABE. The more our background evidence renders our theory more typical, the higher its prior. The less typical our background evidence renders our theory, the lower its prior. And requiring new suppositions (about the world and the past) entails an untypical explanation, whereas an explanation fully (and indeed better) supported by background knowledge is thereby, by definition, more typical. BT represents this fact by a difference in priors. And this can only be validly represented when all possible explanations are represented in relative terms to each other, which only a proper application of BT ensures (as in BT the sum of all priors must equal one). Adding ad hoc elements likewise includes the tactic of inventing ‘excuses’ for the evidence not fitting your theory, which also lowers prior probability (as I demonstrated in chapter 3, page 80), which means that not depending on such excuses necessarily raises the prior (by exactly as much as depending on them would have lowered it).

The last three criteria then combine to entail the consequent probabilities in BT. Explanatory Power is almost an exact description of consequent probability. It only lacks reference to the circumstance of a hypothesis entailing a low consequent—which is accomplished by the Criterion of Explanatory Fitness. In fact, those two criteria are obviously two sides of the same measure: one refers to evidence that is very expected on a hypothesis, the other to evidence that is very unexpected on that same hypothesis; the one entailing a high consequent, the other a low one. The only thing missing is the middle possibility: evidence some hypotheses neither predict nor contradict. And that's essentially what Explanatory Scope picks up, by addressing facts a theory makes likely but that another theory makes neither likely nor unlikely. Combine all three, along with the fact that it is stated in each that you are measuring the degree of difference between the tested hypothesis and all its alternatives, and you simply have the difference of consequent probabilities measured in BT.

To see why they're the same, once again we only need examine what happens when evidence is added or taken away. Increasing Explanatory Scope (relative to a competing hypothesis) entails decreasing P(e|~h.b), since not explaining those facts renders them less probable, while explaining them renders them probable, keeping P(e|h.b) high. In contrast, increasing Explanatory Fitness or Power directly entails increasing P(e|h.b), while decreasing either of them entails decreasing it. And insofar as a competing hypothesis itself has a high or low Fitness or Power, P(e|~h.b) is also rendered high or low accordingly; likewise if ~h has the greater Scope, then it's P(e|h.b) that drops instead of rises. Thus, BT describes and in fact legitimizes the ABE. And yet BT is superior to the ABE, by having the precision that guarantees the logical clarity and validity of its results and ensures that ambiguous and unstated assumptions (about measures of relative degree) become clear and stated, and thus open to challenge and thus requiring sounder defense, thereby also ensuring its premises will be more sound than is likely under the vague structure of the ABE.

Thus, BT achieves greater soundness and validity than the ABE. Which reminds us again that any argument against the applicability of BT to history logically entails the same argument against the applicability of the ABE to history, and the AFE as well; and, I predict, all methodologies whatever, insofar as they have any validity to begin with.

THE HYPOTHETICO-DEDUCTIVE METHOD (HDM)

In fact, any valid form of hypothetico-deductive method is described by BT. Even the principle of Ockham's Razor (when validly formulated) follows necessarily from BT.⁷ Hypothetico-Deductive Method (HDM) is the procedure of forming a hypothesis h, deducing observations that would be made if h is true and observations that would be made if h is false, and making many observations until the probability of h either far exceeds all known alternatives, or drops below all credibility. This exactly describes the BT ratio of consequents (or else the iterative use of BT, on which see chapter 5, page 168). But you can always redesign any alternative hypothesis so that ~h also predicts all the same observations as h. So how do you tell the difference? In practice, if your predictions are good ones (i.e., the evidence that results from many diverse observations normally entails a high ratio of consequents favoring h), then redesigning a new hypothesis so as to “just happen” to make all those same predictions will require an enormous Rube Goldbergesque contraption of additional assumptions, whereas the tested hypothesis is simple and requires few new assumptions; at which point, Ockham's Razor is invoked, and h thus prevails over ~h.
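The iterative use of BT mentioned above can be sketched as repeated updating, where each posterior becomes the prior for the next observation. The sketch assumes the observations are conditionally independent given h and given ~h, which is a simplifying assumption of this example and not a claim made in the text; the names `update` and `iterate` and all the numbers are likewise hypothetical.

```python
def update(prior_h, p_e_given_h, p_e_given_not_h):
    """One application of Bayes's Theorem for a single observation."""
    numerator = prior_h * p_e_given_h
    denominator = numerator + (1.0 - prior_h) * p_e_given_not_h
    return numerator / denominator


def iterate(prior_h, observations):
    """Apply BT once per observation, feeding each posterior in as the next prior.

    Assumes the observations are conditionally independent given h and ~h.
    Returns the running probability of h, starting with the initial prior.
    """
    probability = prior_h
    history = [probability]
    for p_e_h, p_e_not_h in observations:
        probability = update(probability, p_e_h, p_e_not_h)
        history.append(probability)
    return history


# Ten hypothetical observations, each only twice as expected on h as on ~h.
observations = [(0.8, 0.4)] * 10
history = iterate(0.5, observations)

for step, probability in enumerate(history):
    print(f"after {step} observations: P(h) = {probability:.6f}")
```

Each individual observation here is weak, yet the ratio of consequents compounds across observations, so the probability of h climbs steadily toward certainty. Reversing the ratio in each pair would drive it toward zero instead.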

This difference between such explanations is described by the logic of prior probability in BT.⁸ Thus, Ockham's Razor is merely a declaration of that fact: the more assumptions you tack onto any h (especially novel assumptions), the lower its prior must be (as I demonstrated in chapter 3, page 80). Because any collection of “coincidences” like that is less typical, and thus less probable, than causes that do not require them (much less so many of them). This follows not merely from the addition of novel assumptions, but also the addition of improbable assumptions. For example, that the CIA might meddle with science experiments is not impossible (in fact such meddling in other affairs is actually attested, as are some of the means and motives necessary to meddle in the same way with scientific research), yet any h that relied on the unverified assumption that the CIA was meddling with your experiment would by virtue of that fact be far less probable than almost any h excluding that assumption, and thus ‘CIA interference’ is usually (and rightly) axed by Ockham's Razor. And that represents the lowering of priors based on our background knowledge regarding what is and isn't typical (the CIA meddling in science experiments isn't typical).
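One way to picture how added assumptions lower a prior is to multiply in the probability of each assumption and then renormalize across the competing hypotheses. This is a rough sketch under stated assumptions: it treats the extra assumptions as independent of one another, gives both hypotheses the same starting weight, and uses invented probabilities (0.1 per assumption); none of these figures comes from the text.

```python
from math import prod


def prior_weight(base_weight, assumption_probabilities):
    """Weight of a hypothesis after conjoining it with extra assumptions.

    Treats the assumptions as independent (a simplification), so each one
    multiplies the weight by its own probability on background knowledge.
    """
    return base_weight * prod(assumption_probabilities)


def normalize(weights):
    """Rescale weights so the priors of all listed hypotheses sum to one."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical comparison: a simple hypothesis needing no new assumptions
# against a rival that needs three unverified assumptions, each at 0.1.
weights = {
    "simple h": prior_weight(1.0, []),
    "contrived rival": prior_weight(1.0, [0.1, 0.1, 0.1]),
}
priors = normalize(weights)

for name, prior in priors.items():
    print(f"{name}: prior = {prior:.6f}")
```

On these invented numbers the contrived rival ends up with a prior of about one in a thousand, which is the quantitative sense in which the Razor "axes" it.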

Thus all scientific methods, which are simply iterations of HDM, are described by BT.⁹ Historical methods are identical to scientific methods in this respect, being just another set of iterations of HDM. In fact, many sciences are historical, for example, geology, cosmology, paleontology, criminal forensics, all of which explore not merely scientific generalizations but historical particulars, such as when the Big Bang occurred, or how the solar system formed, or exactly when or where a large asteroid struck the earth, or when a volcano erupted and what resulted from it, or what happened to a specific species in a specific historical period, or who committed what crime when. Not even the claim that historians must deal with human thoughts and intentions makes a difference, as these are as much a necessary occupation of psychologists, economists, sociologists, and anthropologists. They are also fundamental to the scientific study of game theory and all of cognitive science. Nor is there any demarcation based on the role of controlled experiments. Much of science does not rely on experiments but primarily involves field observations (e.g., astronomy, zoology, ecology, paleontology), an approach to evidence directly analogous to the historian's (most clearly parallel in the science of archaeology, but “field observations” of the artifacts we call “texts” and “documents” are just as analogous). Conversely, experiments sometimes do have a place in historical methodology.¹⁰ And as noted in chapter 3 (page 47), science is actually dependent on history, just as much as history depends on science.

So there is no qualitative difference. History is thus continuous with science. The difference between them is only quantitative: history must work with much less data, of much less reliability. Therefore, its results have less certainty and less precision. BT even explains this: the data available to science are of such scale and quality as to raise the final epistemic probability of its results to incredible heights (so high, in fact, often no one even bothers to calculate them, nor need they). But in history we are almost always dealing with final probabilities that, however high they may be, nevertheless allow a possibility of being mistaken that isn't negligible—to such a degree, in fact, that scientists would reject comparably uncertain results. But their rejecting such results does not mean those results are not believable, only that they do not obtain a scientific degree of certainty. The demarcation between science and nonscience is not the demarcation between believable and unbelievable conclusions. It is merely the demarcation between conclusions that are, for all intents and purposes, decisively certain (albeit still revisable), and conclusions that are not. But many conclusions can be believed with legitimate confidence without being decisively certain. We meet with such beliefs routinely in journalism, economics, and daily life. So as long as we face this fact and accept history is like that, we can proceed scientifically without pretending to the certainty of scientific results.

FORMAL PROOF OF UNIVERSAL APPLICABILITY

Since BT fully describes HDM without remainder, and HDM is a higher-level generalization of all historical methods, including the AFE and ABE, we could simply conclude here and now that Bayes's Theorem models and describes all valid historical methods. No other method is needed, apart from the endless plethora of techniques that will be required to apply BT to specific cases—of which the AFE and ABE represent highly generalized examples, but examples at even lower levels of generalization could be explored as well (such as the methods of textual criticism, demographics, or stylometrics). All become logically valid only insofar as they conform to BT and thus are better informed when carried out with full awareness of their Bayesian underpinning.

This should already be sufficiently clear by now, but there are always naysayers. For them, I shall establish this conclusion by formal logic.

P1.	BT is a logically proven theorem.
P2.	No argument is valid that contradicts a logically proven theorem.
C1.	Therefore, no argument is valid that contradicts BT.

P1 is an established fact (see note 9 for chapter 3, page 300). P2 is true by definition, that is, what it is to be a logically valid form of argument is to be consistent with formal logic, and all logical proofs are consistent with formal logic, ergo, to be inconsistent with a logical proof is to be inconsistent with formal logic, which entails by definition an invalid argument. Formally, if B = ‘we must accept the sound conclusions of formal logic,’ A = ‘BT is true,’ and C = ‘there is some historical method that is logically valid but contradicts BT,’ then:

P3.	If B, then A.
P4.	If C, then ~A.
C2.	Therefore, if B, then ~C.

This is a logically necessary truth.¹¹ Therefore there can be no valid historical method that contradicts BT. This leaves only two other possibilities: either (a) all valid historical methods are fully modeled and described by BT (and are thereby reducible to BT), or (b) there is at least one valid historical method that does not contradict BT but that nevertheless entails a different epistemic probability than BT for at least one historical claim h. The only way that can be logically possible is if there is something that could be said about the epistemic probability of h that is not said about the epistemic probability of h in BT. Because if BT already says that about h (i.e., if it already contains a premise about the effect of that same fact on the epistemic probability of h), then the only way any method can say anything different (about the effect of that fact on the epistemic probability of h) is by contradicting BT, which we just demonstrated is logically impossible. Any method that did that would have to be logically invalid.
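The step from P3 and P4 to C2 can be checked mechanically by enumerating every truth assignment to A, B, and C. The helper names below (`implies`, `is_valid`) are hypothetical, and the sketch tests only the propositional form of the inference, not the truth of the premises themselves.

```python
from itertools import product


def implies(p, q):
    """Material conditional: 'if p, then q'."""
    return (not p) or q


def is_valid(premises, conclusion, variable_count):
    """Return True if no truth assignment makes all premises true and the conclusion false."""
    for values in product([False, True], repeat=variable_count):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False
    return True


# Variables in order: A ('BT is true'), B ('we must accept the sound
# conclusions of formal logic'), C ('some valid method contradicts BT').
premises = [
    lambda a, b, c: implies(b, a),      # P3: if B, then A
    lambda a, b, c: implies(c, not a),  # P4: if C, then ~A
]
conclusion = lambda a, b, c: implies(b, not c)  # C2: if B, then ~C

argument_is_valid = is_valid(premises, conclusion, 3)
print(argument_is_valid)
```

The check finds no assignment on which P3 and P4 hold while C2 fails, since B and C together would require both A and ~A.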

Can that point be proven? Yes. About the epistemic probability of h a method can say all the same things as BT (in which case it must give the same conclusion as BT, or else it is contradicting BT and therefore contradicting logic), or it can say more things than BT, or it can say fewer things than BT. Methods that say less than BT, yet declare an epistemic probability for h, can only be methods that ignore known facts that affect the epistemic probability of h, and such methods are necessarily invalid and must thereby be excluded, which leaves only methods that say more than BT. This does not mean, however, all methods are invalid that only seem to say less than BT but that in fact (implicitly) say all the same things as BT. Just as the AFE is logically valid only if it is allowed to implicitly assume all prior probabilities are equal, so, too, any valid statistical arguments that ignore considerations of prior probability are still implicitly assuming some prior probability, whereas if they are not assuming that, then they are logically invalid.¹² For not assuming the priors are equal entails assuming the priors must be different, and any method that assumes the prior probability of h is different from ~h but does not enter that difference somehow into its calculation of the final probability of h is willfully illogical—at any rate, it thereby directly contradicts BT, which, as proved, is contrary to logic.¹³ So all that's left are methods that say more than BT. But we know of nothing that can be said that would validly affect the epistemic probability of h other than what is already said by the premises in BT. And if a method says nothing different about the probability of a claim being true than is already said by BT, that method can be fully replaced by BT without logical consequence.

Accordingly, I propose the following testable hypothesis:

P5. Anything that can be said about any historical claim h that makes any valid difference to the probability that h is true will either (a) make h more or less likely on considerations of background knowledge alone or (b) make the evidence more or less likely on considerations of the deductive predictions of h given that same background knowledge or (c) make the evidence more or less likely on considerations of the deductive predictions of some other claim (a claim which entails h is false) given that same background knowledge.

Thus to reject my conclusion (that all valid historical methods are reducible to BT) requires providing a counter-example to P5, which I predict no one can do. It's probably impossible, as by definition b and e encompass all data (i.e., the union of those two sets produces the set of all things known), and h and ~h encompass all theories, and BT logically includes every probability entailed by b and e on every theory. For example, P(h|b.e) is identical to P(h|e.b), since the order of the conjuncts makes no difference, and is by definition the probability that h is true given all available knowledge (the union of e and b), so there is by definition no other knowledge that can alter that probability. Yet P(h|e.b) follows by logical necessity from P(h|b), P(e|h.b), and P(e|~h.b); therefore no other probability can have any relevance to determining P(h|e.b).

That means the following is true by definition:

P6. Making h more or less likely on considerations of background knowledge alone is the premise P(h|b) in BT; making the evidence more or less likely on considerations of the deductive predictions of h on that same background knowledge is the premise P(e|h.b) in BT; making the evidence more or less likely on considerations of the deductive predictions of some other claim that entails h is false is the premise P(e|~h.b) in BT; any value for P(h|b) entails the value for the premise P(~h|b) in BT; and these exhaust all the premises in BT.

Formally, if C = ‘a valid historical method that contradicts BT,’ D = ‘a valid historical method fully modeled and described by (and thereby reducible to) BT,’ and E = ‘a valid historical method that is consistent with but only partly modeled and described by BT,’ then:

P8. Either C, D, or E. (proper trichotomy)

P9. ~C. (from C2.)

C3. Therefore, either D or E.

P10. If P5 and P6, then ~E.

P11. P5 and P6.

C4. Therefore, ~E.

P12. If ~C and ~E, then only D. (from P8.)

P13. ~C and ~E. (from C2 and C4.)

C5. Therefore, only D.

Therefore, only a valid historical method fully modeled and described by (and thereby reducible to) BT exists. In other words, no other valid historical methods exist.¹⁴

Indeed, I believe that for any claim h (whether in history or any other subject of knowledge whatever), if h is capable of being true or false, then its probability of being true (given what you happen to know at the time) is exactly and only the probability entailed by BT. Therefore, no one can reject the valid and sound conclusions of BT in any subject of factual inquiry, and no one can claim that BT does not or cannot determine the probability that any h is true (given all present knowledge). The preceding proof already entails this must be true for claims about history, and that is sufficient for my present purpose. But I shall pause to demonstrate the broader thesis as well, as it reinforces the narrower.

The broader thesis follows from an argument I developed in chapter 3 (pages 83–88). Of the four probabilities in BT, one entails another (each prior is always the complement of the other, since the two must sum to one), so there are only three independent statements of probability in BT. For each of those you either know that its value is higher than 0.5 or lower than 0.5, or else so far as you know it is 0.5, because if you don't know whether it's higher or lower, then by definition so far as you know it's as likely as not. If it weren't as likely as not, then by definition that would mean you know it is not 0.5, which entails you know it is either higher or lower. Therefore, for every premise in BT, you always know its probability is either A (0.5), B (higher than 0.5), or C (lower than 0.5), “so far as you know,” and that exhausts all logical possibilities. But no matter what value thereby results for each and every premise in BT (whether A, B, or C), a conclusion necessarily then follows regarding the probability that h is true, given what you are thus claiming to know, even if that probability is flat out 50/50. For example, for any claim h, if you know nothing about what any of these three probabilities for it are, then so far as you know they are all 0.5, which logically entails that the posterior epistemic probability, the probability that h is true (“given what you know so far”), is 0.5. And because BT is logically valid, you must always accept its conclusions when you accept its premises. Thus the only way to deny such a conclusion is to affirm different premises (e.g., that one of those three probabilities is not 0.5), in other words, to affirm that you know what those probabilities are (at least well enough to know they aren't the probabilities just affirmed).
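The claim that some conclusion always follows, whichever of A, B, or C holds for each premise, can be illustrated by enumeration. In this sketch the values 0.25 and 0.75 are arbitrary stand-ins for "lower than 0.5" and "higher than 0.5"; they are assumptions chosen for illustration, and any other values in those ranges would serve.

```python
from itertools import product


def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(h|e.b) by Bayes's Theorem for a hypothesis h and its negation ~h."""
    numerator = prior_h * p_e_given_h
    denominator = numerator + (1.0 - prior_h) * p_e_given_not_h
    return numerator / denominator


# Representative values for the trichotomy: C (lower), A (0.5), B (higher).
LEVELS = (0.25, 0.5, 0.75)

# Every combination of levels for P(h|b), P(e|h.b), and P(e|~h.b).
table = {
    (prior, p_e_h, p_e_not_h): posterior(prior, p_e_h, p_e_not_h)
    for prior, p_e_h, p_e_not_h in product(LEVELS, repeat=3)
}

for premises, result in table.items():
    print(f"P(h|b), P(e|h.b), P(e|~h.b) = {premises}: P(h|e.b) = {result:.3f}")
```

All twenty-seven combinations yield a definite posterior, and the combination in which every premise is 0.5 yields exactly 0.5, matching the example in the text.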

Of course by arguing a fortiori you might say something like it's “0.5 or higher,” but even in that case mathematically you are entering a 0.5 in the equation, so it amounts to affirming that the probability is simply 0.5. What you then do with the conclusion is determined by whether you think the “true” probability could be higher or could be lower (the method of arguing a fortiori, which I discussed in chapter 3, page 85). But mathematically you still have to choose to set the limit of your margin of error at 0.5, or above 0.5, or below 0.5. There is no fourth alternative. That some conclusion is then always entailed I demonstrate with a logical flow chart in the appendix (page 286), where BT conclusions are shown to follow even from the simplest trichotomy here proposed, that is, that each premise has a value of either 0.5 or < 0.5 or > 0.5. But the same analysis follows for any degree of precision your knowledge can honestly claim.

For example, if you must admit P(h|b) is > 0.5, then for any arbitrary number above 0.5, for instance n = 0.75, either you believe (A) P(h|b) is n, or (B) P(h|b) is greater than n, or (C) P(h|b) is less than n, or else (D) you can't claim to know anything more than that n is above 0.5. And so on, for any other n. You cannot deny any of these possibilities without affirming one of them, and once one of them is affirmed, a conclusion always necessarily follows from BT as to what the probability is that h is true given what you know. It might still be that or higher, or that or lower, depending on which limit of your margin of error you are defining, but that's still a probability that cannot be rejected without giving in and accepting BT and just affirming different values for its premises.

To be a little more specific, if you affirm “very high confidence” that h would turn out to be true before looking at the specific evidence for h, then you cannot logically believe P(h|b) is less than 0.75. Because to simultaneously assert “I have a very high confidence that h will turn out to be true” and “I believe h will be true only three out of every four times” is to hold two contradictory beliefs. To believe h will be true only three out of four times is simply not to believe “with very high confidence” that h will turn out to be true. That would be a confidence only somewhat high. Just ask yourself again whether you'd get into a car that had a one in four chance of exploding, and then ask why you would still claim to have a “very high confidence” in the safety of that car. Or if that context is too extreme, ask yourself if you'd bet your career on a result that had a one in four chance of soon being refuted, and then ask why you'd have a “very high confidence” in a result like that. This becomes all the clearer as your certainty increases. If you are certain h will turn out to be false, because you rightly believe it is wildly improbable for h to be true (e.g., as when h = “Caesar rode a winged horse and camped on the moon”), then you cannot believe that P(h|b) is even as high as 0.01 (or 1 percent), much less any higher. This was already evident in chapter 3's opening example of the sun being supernaturally eclipsed for three hours.

It follows that no other method of inference, such as using ordinal or qualitative rankings of confidence absent any reference to probabilities, can supplant BT. It can work alongside it, as a heuristic or simplified way of getting the same result. But you can't get a valid conclusion from any other method and have that conclusion contradict BT. And for every hypothesis, BT entails some conclusion (for whatever state of knowledge you are presently in). Because for any probability in BT (whether the prior or either consequent), you will always have some confident belief that it is at least some value or higher, or some value or lower, and this belief logically requires you to accept the conclusions that follow from that belief, in the very manner BT entails.

Again, for example, if you cannot deny that the prior probability that the sun went out for three hours is “not higher than 0.01,” and you cannot deny that the consequent probability that no one else would notice this is “not higher than 0.01,” and you cannot claim that the consequent probability is anything significantly less than 1 that some sacred writing about Jesus would claim the sun went out if that claim was fabricated, then you simply cannot deny that the epistemic probability that the sun went out is not higher than 0.0001 / (0.0001 + 0.99) = 0.000101, or roughly one hundredth of 1 percent (and is likely much lower). You are logically obligated to agree that this is true, unless and until you can demonstrate any of those underlying probabilities to be different. Resorting to other methods of inference simply cannot extricate you from this obligation. At best they can only confirm the same result, and thus simply corroborate BT.
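
The arithmetic can be checked by multiplying out the three stated limits. The sketch below is an illustration only: it uses Python's `fractions` module for exact results, the variable names are hypothetical, and it assumes the consequent probability on fabrication is taken as exactly 1.

```python
from fractions import Fraction

# The three limits stated in the text, entered as exact fractions.
prior_h = Fraction(1, 100)       # P(h|b): not higher than 0.01
p_e_given_h = Fraction(1, 100)   # P(e|h.b): not higher than 0.01
p_e_given_not_h = Fraction(1)    # P(e|~h.b): assumed here to be exactly 1

numerator = prior_h * p_e_given_h                          # 0.0001
denominator = numerator + (1 - prior_h) * p_e_given_not_h  # 0.0001 + 0.99
upper_bound = numerator / denominator

print(upper_bound)                  # 1/9901
print(f"{float(upper_bound):.6f}")  # 0.000101
```

On these inputs the posterior probability is at most about one chance in ten thousand.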

My conclusion is therefore inescapable. For each of the three premises in BT (the prior and the two consequents), for any claim h, always there is some probability that you will be confident declaring, such that any amount beyond it you would not be confident declaring. That probability is then the only one that will entail a conclusion you can be confident in—because to get a different conclusion out of BT, you must input a different probability, yet if you are inputting a probability you are not confident in, then you cannot be confident in whatever conclusion that that probability entails. Because every weakness in an argument's premises always translates to the conclusion. Thus a conclusion that can only be arrived at by affirming premises you are not confident are true is by definition a conclusion you are not confident is true. But for every probability you are confident in, BT entails a conclusion you are logically compelled to be equally confident in.

There is no way around this. Because no matter how ignorant you claim to be, some value always necessarily follows for P(h|e.b), since by definition that is a probability conditional on what you know, and therefore a probability follows even if what you know is nothing (that probability would always be 0.5). So there is always a Bayesian probability that h is true given everything you know. This is due to the key difference between epistemic and physical probabilities as demarcated in chapter 2 (page 24). With regard to physical probabilities, you can legitimately say you simply have no idea what the probability is (and therefore you are not obligated to pick any one of the only logical possibilities available), but you cannot legitimately say this with regard to epistemic probabilities. To say you have no idea, before looking at evidence e, whether h or ~h is true logically entails that for you (i.e., so far as you presently know), h is as likely as ~h (until you see some evidence). Because the only way you can claim to know that h is not as likely as ~h is to claim to know that h is more likely than ~h or that h is less likely than ~h; but you have already affirmed that you do not know either. Therefore, so far as you know, h is as likely as ~h. Which when translated into mathematical notation simply means P(h|b) = 0.5. So even denying there is a probability always entails there is a probability—for you, and given the information you have at that point in time. It follows that you are always in some state of knowledge or ignorance that entails an epistemic probability for any h according to BT, and no other method of inference can validly contradict it. BT therefore underlies all valid empirical reasoning. Or so I contend.

Even if you disagree with that broader thesis, you still cannot deny that BT underlies all valid historical reasoning, as that at least I formally proved earlier. Applying this knowledge now allows us to test the validity of any methodological principle in the study of history. Two major examples are the ‘Argument from Silence’ and what's amusingly called the ‘Smell Test.’ The following analysis of these can serve as a model by which to evaluate any other methodological principle in history.

BAYESIAN ANALYSIS OF THE ‘SMELL TEST’

The ‘Smell Test’ is a common methodological principle in the study of myth, legend, and hagiography. This test can be most simply stated as “if it sounds unbelievable, it probably is.” When we hear tales of talking dogs and flying wizards, we don't take them seriously, even for a moment. We immediately rule them out as fabrications. We usually don't investigate. We don't wait until we can find evidence against the claim. We know right from the start the tale is bogus. Yet the only basis for this judgment is the Smell Test. Is that test valid?

It is certainly ubiquitously accepted by historians in every field. It is suspiciously only rejected by religious believers, and then only when it's applied to amazing claims they prefer to believe. They ground this rejection in the claim that we shouldn't be biased against the supernatural, and God can do anything. Yet if they honestly believed in those principles they would be compelled to concede the miracle claims of every religion “because you shouldn't be biased against the supernatural, and God can do anything.” This includes all the pagan miracles (incredible apparitions of goddesses, mass resurrections of cooked fish, wondrous healings, and teleportations), Muslim miracles (splitting moons, wailing trees, flights to outer space), Buddhist miracles (bilocation, levitation, creating golden ladders with a mere thought), and indeed every and any amazing claim whatever. Tales “proving” reincarnation? We can't reject them—because God can do anything. Ghosts confirming to the living that heaven is run by a Chinese magnate and his staff? We can't rule it out. That would be bias against the supernatural.

Honestly living that way would be impossible. You would have to believe everything you read or hear unless you can specifically present evidence sufficient to discount it: an impossible task. You would be left with a belief system hopelessly frightening and contradictory—and mired in a thousand false beliefs. Such behavior also goes against all established background knowledge, which contains endless examples of miracle claims refuted by fortuitous inquiry (and no good case of any miracle claim surviving such inquiry).¹⁵ In other words, our bias against the supernatural is warranted, just as our bias against the honesty of politicians is warranted: we've caught them being dishonest so many times it would be foolish to implicitly trust anyone in politics. Likewise, amazing tales: we've caught them being fabricated so many times it would be foolish to implicitly trust any of them.

The Smell Test thus represents an intuitive recognition of: (a) the low prior probability of the events described (i.e., P(h|b) << 0.5); (b) the ease with which the evidence could be fabricated (i.e., P(e|~h.b) is always high, unless we have sufficient evidence to the contrary), in fact often the ease with which such an event if real would produce or entail much better evidence (i.e., P(e|h.b) is often low); (c) how typically miracle claims are deliberately positioned in places and times where a reliable verification is impossible (and when such verification is possible, are refuted), which fact alone makes them all inherently suspicious; and (d) sometimes the similarity of a miracle story to other tales told in the same time and culture is additionally suspect, like the odd frequency with which gods in the ancient West rose from the dead, transformed water into wine, or resurrected dead fish, oddities that curiously never occur anymore, and which are so culturally specific as to suggest more obvious origins in storytelling.¹⁶

Both (c) and (d) can raise the consequent probability of nonmiraculous explanations, and also reduce the consequent probability of the miraculous. But condition (a) is the point just made: such claims are contrary to reality as we know it. This doesn't just mean only miracles, but all wonders, like implausible coincidences and unrealistic social reactions and behaviors. Hence, the issue is not the presumption that miracles never happen, but the documented fact that, if they happen, they happen exceedingly rarely (just as implausible coincidences and unrealistic human behaviors do), whereas false tales of the fantastic happen with exceeding frequency; and likewise, the fact that miracles suspiciously happen all the time only in historical periods (or geographical regions) that are comparatively illiterate, superstitious, or unenlightened, in conditions lacking the means of verifying no shenanigans were involved (in either the event or its telling), whereas in ages and places where we have widespread education and organized skepticism and the tools and opportunity to test wild claims, the phenomena always disappear. Both are established facts in b (our background knowledge) and thus all our estimates of probability must be conditioned on these facts. Even if you are a firm believer in the miraculous, the facts remain the same: most wondrous claims (by far) are bogus. Your priors must reflect that, regardless of your worldview. And like the Tibetan peasant claim in chapter 3 (page 72), when we lack specific evidence to confirm a claim, or lack the means to verify it by reliable tests, the priors must dictate what is reasonable to believe. That reasoning is both logically valid and sound. Thus, BT confirms that the Smell Test is valid, even on point (a) alone.

Conditions (b), (c), and (d) only strengthen this conclusion. Even outside the context of wondrous claims, ancient texts are full of lies and falsehoods, even when generated by eyewitnesses, contemporaries, and critical historians, or anyone who ought to have known better.¹⁷ Our background knowledge also establishes how easy it is to rapidly fabricate and disseminate false stories, even without challenge (like the darkening of the sun with which we began in chapter 3; and more examples I'll explore in the next volume), and how easy it is for a claimed miracle to entail evidence we curiously don't have. The darkening of the sun predicts a vast quantity of evidence that, by not existing, disconfirms the story. Likewise, the frequency of resurrection stories in antiquity entails a phenomenon that should still be observed with the same frequency, yet is not (except in such mundane ways as to refute any miracle claimed to be analogous—such as from the application of CPR and ordinary cases of misdiagnosed death). Thus, the disappearance of this phenomenon is an unexpected piece of evidence on the theory that any resurrection is real, just as the disappearance of angels and gods who used to descend and deliver speeches with surprising frequency in antiquity is unexpected on the theory that these things ever really happened. It's not impossible that “things just changed,” but it is improbable—because we cannot predict from any established theory that such a change would indeed have happened, much less happened conveniently as soon as we had better methods and means to test such claims (and it is precisely that coincidence that is otherwise very improbable). 
Any logically valid argument must take this improbability into account.¹⁸ Thus, incredible claims can only pass the Smell Test if they have correspondingly strong evidence in their support, which means evidence that is even more improbable on the claim's being false than the claim's being true is already improbable on prior considerations ((a) through (d)). For example, if in all past cases a claim's being true is a tenth as likely as its being false, then to believe that claim we need evidence that's over ten times more unlikely on any other explanation. That is, if P(~h|b) = 10 × P(h|b), then P(e|h.b) must exceed 10 × P(e|~h.b) for h to be credible; and if P(~h|b) = 1,000 × P(h|b), then P(e|h.b) must exceed 1,000 × P(e|~h.b) for h to be credible; and so on. In other words, the Smell Test simply reduces to the principle “extraordinary claims require extraordinary evidence” (see chapter 3, page 72; chapter 5, page 177; and chapter 6, page 253). Which means the Smell Test reduces to BT.
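
The threshold just stated follows from the odds form of BT: h becomes more likely than not only when the ratio of the consequents exceeds the prior odds against h. A minimal sketch, assuming the hypothetical function names below and reading "credible" as a posterior above 0.5:

```python
from fractions import Fraction


def required_likelihood_ratio(prior_h):
    """Prior odds against h: the ratio P(e|h.b) / P(e|~h.b) must exceed this."""
    return (1 - prior_h) / prior_h


def is_credible(prior_h, p_e_given_h, p_e_given_not_h):
    """True when P(h|e.b) > 0.5, i.e. the evidence outweighs the prior odds."""
    return prior_h * p_e_given_h > (1 - prior_h) * p_e_given_not_h


# If ~h is 10 times as likely as h on priors, then P(h|b) = 1/11.
print(required_likelihood_ratio(Fraction(1, 11)))    # 10
# If ~h is 1,000 times as likely as h, then P(h|b) = 1/1001.
print(required_likelihood_ratio(Fraction(1, 1001)))  # 1000

# Evidence exactly 10 times likelier on h only brings h up to even odds.
print(is_credible(Fraction(1, 11), Fraction(1), Fraction(1, 10)))  # False
# Evidence 11 times likelier on h is enough.
print(is_credible(Fraction(1, 11), Fraction(1), Fraction(1, 11)))  # True
```

The first `is_credible` call returns False because a ratio of exactly 10 yields a posterior of exactly 0.5, which is why the text says the ratio must exceed the prior odds rather than merely match them.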

BAYESIAN ANALYSIS OF THE ARGUMENT FROM SILENCE

Historians routinely rely on Arguments from Silence: when something isn't said or attested, we conclude it didn't happen. Such reasoning is often challenged with the quip “absence of evidence is not evidence of absence.” But the truth is, absence of evidence is evidence of absence—but only when that evidence is expected. You also sometimes hear the axiom “you can't prove a negative,” but that's also false. Negatives are often quite easy to prove and we prove them all the time. In fact, logically, every positive claim entails a converse negative claim, thus merely in the act of proving a positive we have always proven a negative; often a great number of them.

The question of whether Jesus existed, for example, would be decisively proven in the negative by the recovery of an authenticated letter signed by the Apostle Peter outright saying that Jesus was only a cosmic being whose sojourn on earth was merely a symbolic myth, and who was only known to anyone through mystical perception. And we could have had a great deal more evidence than that—as we do for the ahistoricity of Betty Crocker, for example. Hence, proving a negative in principle is no difficulty. The ahistoricity of Moses and Abraham and all the other patriarchs is now generally accepted by scholars the world over as an established fact, quite rightly, and yet without even need of such a smoking gun as a contemporary epistle declaring them a fiction.¹⁹ But can we validly argue that if Jesus didn't exist we would have such a letter from Peter (or any such evidence), and therefore the fact that we don't argues against the notion? Unfortunately, no, because we have little reason to expect such evidence to have survived for us to now have it. Indeed, there would be no reason for anyone actually to say Jesus didn't walk the earth until someone started saying he did. If that only happened after Peter died, he won't ever have written a letter gainsaying it. Whether we can expect someone to have done so, however, is a question I must ask in the next volume.

For the present, our concern is with when an Argument from Silence is valid and sound—and when it is not. The logical conditions have already been correctly stated:

To be valid, the argument from silence must fulfill two conditions: the writer whose silence is invoked in proof of the non-reality of an alleged fact, would certainly have known about it had it been a fact; [and] knowing it, he would under the circumstances certainly have made mention of it. When these two conditions are fulfilled, the argument from silence proves its point with moral certainty.²⁰

That would be a slam-dunk case. But a relatively weaker deployment is possible, to the extent that either condition is less certain. So it may only be “somewhat certain” that the relevant authors knew the fact and would mention it, in which case this argument can produce only a “somewhat certain” conclusion. Generally speaking, based on the hypothesized fact itself, and in conjunction with everything we know on abundant, reliable evidence, should we expect to have evidence of that fact? If the answer is yes, and yet no such evidence appears, then an Argument from Silence is strong. If the answer is no, then it's weak. Not having more evidence of the sun going out (examined in chapter 3) is a strong Argument from Silence, but not having a letter from the first Apostles explicitly declaring Jesus a fiction is weak. The examples of Caesar shaving or playing dice with a hooker (examined in chapter 2) are weaker still, being exactly what historians have in mind when declaring absence of evidence is not evidence of absence. Yet as the sun case proves, that rule does not always apply.

Once again, BT describes the logic of this argument. If on h we should expect some evidence e₁ given b (all our background knowledge) and yet we don't have e₁, then the consequent probability of h must be reduced—by exactly as much as lacking e₁ is unlikely (because the absence of that evidence is a part of the full e that must be explained by h). This same rule operates on the consequent of ~h as well, if ~h entails evidence we don't have. The tricky bit is the effect b has on this estimate. Our background knowledge establishes a very low expectation for the survival of evidence from antiquity, particularly the kind of evidence we would expect if Jesus didn't exist (I shall discuss the more generic problem of ‘lost evidence’ in chapter 6, page 219). However, that same background knowledge establishes a rather high expectation that the evidence that did survive would possess certain characteristics, and some scholars have argued that the surviving evidence is of a different character entirely. In my next volume I will discuss this oddity and how it might be dealt with. The point to observe here is that the Argument from Silence is a commonly accepted historical tool, and is logically valid precisely and only when it conforms to BT.
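
How much a silence counts can be sketched numerically. The example below is a simplified illustration, not the author's own calculation: it treats the missing evidence e₁ as the only evidence under consideration, and the function name and every number in it are hypothetical. The probability of the silence on each hypothesis is one minus the probability that e₁ would exist and survive on that hypothesis.

```python
def posterior_given_silence(prior_h, p_e1_given_h, p_e1_given_not_h):
    """Return P(h | e1 is absent), where each p_e1 is the probability that
    evidence e1 would exist and survive on the given hypothesis."""
    p_silence_h = 1.0 - p_e1_given_h
    p_silence_not_h = 1.0 - p_e1_given_not_h
    numerator = prior_h * p_silence_h
    return numerator / (numerator + (1.0 - prior_h) * p_silence_not_h)


# Strong case (illustrative): on h we would almost certainly have e1.
strong = posterior_given_silence(0.5, 0.99, 0.0)
# Weak case (illustrative): on h there is little chance e1 would survive.
weak = posterior_given_silence(0.5, 0.05, 0.0)

print(f"{strong:.3f}")  # 0.010
print(f"{weak:.3f}")    # 0.487
```

With a neutral prior, the strongly expected but missing evidence drives h to about 1 percent, while the weakly expected evidence barely moves it from 0.5.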

What else will Bayes's Theorem teach us about the methods particularly used by Jesus scholars? To that we now turn.

In chapter 1, I demonstrated the growing loss of faith in the methodology of Jesus studies, as well as the inevitable consequence of relying on those invalid methodologies; namely, the plethora of contradictory conclusions about Jesus, each as confidently asserted as the next. This overall approach, which has so dismally failed, consists in the development and application of ‘historicity criteria.’ Chief among the cited defects of this approach was a failure to solve the Threshold Problem. When is the evidence enough to warrant believing any conclusion? No discussions of these criteria have made any headway in answering this question. Yet the question is inherently mathematical in nature. Only Bayes's Theorem can answer it. In chapter 3, I explained why the Threshold Problem requires mathematical reasoning to answer and how BT accomplishes that. And in chapter 4, I demonstrated that all valid historical methods represent different applications of Bayesian reasoning and methods that violate BT are not valid. What happens when we apply that same analysis to the popular historicity criteria?

Many dozens of criteria have been proposed in the field of Jesus studies; some overlap or just re-label another criterion. But at least eighteen distinctive criteria can be identified:

Dissimilarity: if dissimilar to Judaism or the early church, it's probably true

Embarrassment: if it was embarrassing, it must be true

Coherence: if it coheres with other confirmed data, it's likely true

Multiple Attestation: if attested in more than one source, it's more likely true

Explanatory Credibility: if its being true better explains later traditions, it's true

Contextual Plausibility: must be plausible in Judeo-Greco-Roman context

Historical Plausibility: must cohere with a plausible historical reconstruction

Natural Probability: must cohere with natural science (etc.)

Oral Preservability: must be capable of surviving oral transmission

Crucifixion: must explain (or make sense of) why Jesus was crucified

Fabricatory Trend: mustn't match trends in fabrication or embellishment

Least Distinctiveness: the simpler version is the more historical

Vividness of Narration: the more vivid, the more historical

Textual Variance: the more invariable a tradition, the more historical

Greek Context: credible, if context suggests parties speaking Greek

Aramaic Context: credible, if context suggests parties speaking Aramaic

Discourse Features: credible, if Jesus’ speeches cohere in a unique style

Characteristic Jesus: credible, if it's both distinctive and characteristic of Jesus

The validity and applicability of each of these criteria can be tested with Bayes's Theorem.

DISSIMILARITY

Formulated in various ways, the Criterion of Dissimilarity has been identified as underlying many of the others, which are just particular applications of this more general principle. One reference defines it this way: “If a tradition is dissimilar to the views of Judaism and to the views of the early church, then it can confidently be ascribed to the historical Jesus.”¹ Another says: “If, therefore, Jesus is presented as saying or doing things that seem almost out of place for both Palestinian Judaism and early Christianity, the likelihood of the presentation being accurate seems great.”² Yet the latter author immediately follows this with an example that is factually false,³ thus illustrating the most common folly in using this or any criterion: the evidence often fails to fulfill the criterion, contrary to a scholar's assertion or misapprehension. But even once you've eliminated all applications that depend on demonstrably false assertions (which represent instances of ignoring evidence or field-comprehensive background knowledge), you still have the two remaining problems I identified before: logical invalidity and inconclusive threshold.

First, we simply don't know much of what went on in second-temple Judaism and the early church. To the contrary, we know early Christianity and Judaism were wildly diverse and that we have scarce to no data about all the many different communities we know were flourishing at the time (I'll come back to that problem on page 129). Thus we cannot establish for alternative hypotheses either a low prior or a low consequent. Second, just because something unusual is attributed to or said about Jesus doesn't make it true. If something that unusual can happen to or be said by Jesus, then it can just as easily have happened to or been said by anyone in the Christian tradition after him. So when does “being unusual” indicate it came from an innovating Jesus, rather than a later innovating missionary or prophet? At what point does meeting this criterion make the former explanation more probable? Everything that would tend to raise the prior or consequent for a hypothesis of “historicity” on this criterion will raise the prior or consequent of the contrary hypothesis just as much—or nearly as much, producing a conclusion so far from certainty as to be of no use.
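
The last point has a simple mathematical basis: if a consideration raises the consequent probability of both hypotheses by the same factor, the posterior does not move at all. A minimal sketch with exact fractions, in which the function name and all values are hypothetical illustrations:

```python
from fractions import Fraction


def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(h|e.b) from the prior and the two consequent probabilities."""
    numerator = prior_h * p_e_given_h
    return numerator / (numerator + (1 - prior_h) * p_e_given_not_h)


prior = Fraction(1, 2)

# Starting point: the evidence is equally expected on h and on ~h.
before = posterior(prior, Fraction(1, 10), Fraction(1, 10))
# "Being unusual" triples the consequent on both hypotheses alike.
after = posterior(prior, Fraction(3, 10), Fraction(3, 10))
# It raises the contrary hypothesis "nearly as much" (to 1/4 rather than 3/10).
nearly = posterior(prior, Fraction(3, 10), Fraction(1, 4))

print(before, after, nearly)  # 1/2 1/2 6/11
```

In the last case the posterior rises only to 6/11, or about 55 percent, which is the kind of result the text describes as too far from certainty to be of use.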

Two more particular problems arise for this criterion. First, if a purported fact was so unusual, why then was it preserved so faithfully? Any answer to the latter question will entail as much a reason to invent it as to preserve it if true. Indeed, combining this problem with the first, the most obvious answer to “why was it preserved?” is that it was entirely in accord with the views of early Judaism or the church—and we just lack the information in surviving evidence to confirm this. As Christopher Tuckett says, “The very existence of the tradition may thus militate against its being regarded as ‘dissimilar’ to the views of ‘the early church.’”⁴ The last problem is that a saying or story that did originate with Jesus or eyewitnesses can easily have become confused or distorted in the retelling until the version we have appears unusual against its Judeo-Christian background, yet not because it is historical, but precisely because it is not. In other words, due to our ignorance of crucial data (in consequence of the extremely poor survival of evidence), the consequent probability of any given oddity on explanations other than “historicity” can often be high—too high for “historicity” to be much more probable.

It's thus generally agreed this is one of the most deficient of all the criteria, of essentially no valid use in determining the truth.⁵ And yet by underlying so many others, it takes them all down with it. I will use the following criterion (Embarrassment) as a comprehensive example of that fact, then treat the remaining criteria more briefly. Subtle variants of the Criterion of Dissimilarity (or any other criterion) are likewise no more valid. For example, the Criterion of Rarity says that if something Jesus said or did is only rarely evinced by Jews or Christians otherwise, then it's probably true, but that's just a non sequitur, for all the same reasons surveyed above.

EMBARRASSMENT

The EC (or Embarrassment Criterion) is based on the folk belief that if an author says something that would embarrass him, it must be true, because he wouldn't embarrass himself with a lie. An EC argument (or Argument from Embarrassment) is an attempt to apply this principle to derive the conclusion that the embarrassing statement is historically true. For example, “the criterion of embarrassment states that material that would have been embarrassing to early Christians is more likely to be historical since it is unlikely that they would have made up material that would have placed them or Jesus in a bad light,”⁶ or in other words, “the point of the criterion is that the early Church would hardly have gone out of its way to create material that only embarrassed its creator or weakened its position in arguments with opponents.”⁷

The EC has been much discussed, but rarely its underlying logic.⁸ This is odd, since the EC is essentially an application to history of the legal principle of ‘statement against interest,’ whose logical soundness has indeed been questioned.⁹ The increasing trend in law now is to require corroborating evidence before granting admission of statements against interest. And even when admitted as evidence, juries are instructed not to assume the testified fact is true, but to critically evaluate such testimony like any other. Historians should do the same.

Adapting various formulations of the juridical ‘statement against interest’ rule into a concise form applicable to the EC, we can define it as “a statement made by someone having sufficient knowledge of the subject, which is so far contrary to their interests that a reasonable person in their position would not have made the statement unless believing it to be true.” This sets two requirements, each of which must be reasonably proved or known for the principle to apply: we must be able to confirm the speaker was actually in a position to know the truth of the matter, and that their statement was so far contrary to their interests that they would not have said it unless they believed it to be true. Only in this formulation is the EC logically valid, which entails the following EC argument: “If a person made a statement so far contrary to their interests that a reasonable person in their position would not have made the statement unless believing it to be true and that person was in a position to reliably know what the truth really was, then their statement is probably true.” So formulated, a valid EC argument is exceedingly difficult to apply in Jesus studies, where typically neither requirement can be reliably met.

In Bayesian terms, when this criterion as thus stated is met, P(e|~h.b) << P(e|h.b). So, as long as prior probability does not render the claim incredible (i.e., as long as P(~h|b) is not >> P(h|b)), it should be believed. This divergence of the consequent probabilities is accomplished only when both requirements of the criterion are met simultaneously. So when one or both are not met, the argument fails to have this effect on the consequents, and thus fails to have any significant effect on the probability that the examined claim is true. The following analysis shows why we probably can never meet both requirements of this EC argument in Jesus studies.

GENERAL INADEQUACY OF THE CRITERION OF EMBARRASSMENT

When all our background knowledge about the nature of man, the world, and the ancient evidence and context is taken into account, we will find that several general defects plague the EC.

(a) Problem of self-contradiction

The first general problem facing successful application of the EC is that most actual uses of the EC in Jesus studies have been based on an inherent contradiction. The assumption is that embarrassing material “would naturally be either suppressed or softened in later stages of the Gospel tradition.”¹⁰ But all extant Gospels are already very late stages of the “Gospel tradition,” the Gospel having already been preached for nearly an entire lifetime across three continents before any Gospel was written. The most widely held consensus in the field is that the Gospels post-date the life of Paul, as he never mentions them or any uniquely identifying information in them. And this is a strong Argument from Silence, as he cannot have been ignorant of them if they then existed, and the information in the Gospels would surely and repeatedly have been used by him or against him, either way ending up in his letters. And there is no other evidence that securely dates the Gospels to that period. Yet Paul's ministry spanned two continents and nearly three decades, long enough even for the authors of the Gospels to have been born into the Christian faith and grown into adulthood believing in no other (not that they did, but they well could have, the tradition by their time was indeed that old), while countless other missionaries (Apollos, for example) also spent decades preaching and wrangling with opponents inside and outside the church, all across Africa, as well as (like Paul) Europe and the Middle East. It's simply inconceivable that no one would ever have noticed “embarrassing” details in the story until Mark wrote them down after all this time. So the fact that we see a redactional tendency in later Gospels to soften or erase embarrassing material in Mark is very nearly conclusive proof that those embarrassing details never existed in the tradition at all before Mark. For had they preceded him, they would have undergone all that redactional treatment well before Mark put pen to paper.

We might think to make it plausible that such embarrassing material preceded Mark by proposing that it was not yet embarrassing when Mark recorded it but only came to be embarrassing to later authors. But if the material was not embarrassing to Mark, the EC does not apply, and you have no EC argument (see problem (b) below). The only remaining option is to somehow prove that Mark was under different pressures, or was innately more honest, than the other Evangelists. But the latter cannot be proved—and is doubtful, especially given the quantity of dubious material in Mark (not least being the false story of the sun going out, examined in chapter 3).¹¹ And the former bears consequences some scholars might fear to allow, such as conceding the other Evangelists were liars, and requiring the device of weighing down your hypothesis with conditions that explain why Mark would be compelled to tell truths that the other Evangelists felt free to suppress. As the sun-darkening story proves (among much else in Mark that's dubious, e.g., Mark 5:11–16, 6:35–52, 8:1–21, 11:13–20, etc.), the claim that eyewitnesses prevented Mark from fabricating doesn't carry much force, either. Innumerable other cases of generational storytelling establish that the commonly assumed theory of an “eyewitness check” against deviations from truth actually has little to no basis in fact. But that's a conclusion I must explore in the next volume. For the present, this one spectacular failure of the theory (Mark's inordinately public yet entirely fabricated erasure of the sun) should suffice to prove that this theory of an “eyewitness check” is in serious trouble. More to the point, such a theory could never explain Mark's failure to omit embarrassing information, as no “eyewitness check” would have prevented that (not even in his sources, much less in Mark).

We should also be warned against too readily assuming an author would automatically know how a statement would be taken by his audience or how embarrassing it could become, especially if it is later exaggerated or taken out of context. Authors often don't realize how critics will exploit the things they say, which opponents will often do in unexpected ways. It's entirely conceivable that an author could write one thing, not even thinking about how his critics will use it, then after it gets thus used, a subsequent author would be keen to try and edit the tradition to counter the new and unexpected controversy it created. The history of doctoring the manuscripts of the New Testament is rife with examples of statements that only later became embarrassing as they came to be exploited or emphasized in novel ways (by “heretics,” which is to say, “competing interpreters of the gospel”).¹² This happened even as the Gospels were composed. Like Matthew's stated excuse for introducing guards into the story of the empty tomb narrative—which reveals a rhetoric that apparently only appeared after the publication of Mark's account of an empty tomb.¹³ For Mark shows no awareness of the problem. It clearly hadn't occurred to Mark when composing the empty tomb story that it would invite accusations the Christians stole the body (much less that any such accusations were already flying). Which should be evidence enough that Matthew invented that story, as otherwise surely that retort would have been a constant drum beat for decades already, powerfully motivating Mark to answer or resolve it (if his sources already hadn't, and they most likely would have). There can therefore have been no such accusation of theft by the time Mark wrote. The full weight of every probability is against it. Mark simply didn't anticipate how his enemies would respond to his story.

(b) Problem of ignorance

The second general problem facing successful application of the EC is the extraordinary degree of our ignorance. As Stanley Porter says, “determining what might have been embarrassing to the early Church” is “very difficult,” especially given “the lack of detailed evidence for the thought of the early Church, apart from that found in the New Testament.”¹⁴ Even conservative scholar Mark Strauss admits that “what seems embarrassing to us may not have seemed so to the early church” and “there also may be reasons we cannot immediately recognize for the creation” of seemingly embarrassing details.¹⁵ Theissen and Winter also call attention to this defect of ignorance: what is perceived to be in conflict with Christian assumptions and beliefs at the time of a Gospel's composition might not have been in earlier decades, and such material could have been fabricated in that earlier period.¹⁶ Thus, perceiving material in a Gospel to be at odds with the assumptions otherwise embraced by that author cannot reliably indicate that that material is historical, any more than that it was invented by an earlier community. As Morna Hooker puts it, “Use of this criterion seems to assume that we are dealing with two known factors (Judaism and early Christianity) and one unknown—Jesus,” but “it would perhaps be a fairer statement of the situation to say that we are dealing with three unknowns, and that our knowledge of the other two is quite as tenuous and indirect as our knowledge of Jesus himself.” As she rightly says, “It could be that if we knew the whole truth about Judaism and the early Church, our small quantity of ‘distinctive’ teaching would wither away altogether,” which could be equally true of our small quantity of “embarrassing” content.¹⁷

The incest and immorality of the gods in Homer was embarrassing to Plutarch and Plato, for example (as they chafe at it constantly), yet no one today uses that fact to argue that Homer's stories of the gods must therefore be true. Different people in different communities react to the same claims differently. And the early Christian Church was nothing if not wildly diverse, already in the time of Paul (as Paul's own letters attest), much more so by the time the Gospels were written. This same error arises when modern readers mistake what is embarrassing to them for what would have been embarrassing to any ancient readers, or simply get wrong what was embarrassing to ancient readers. Or they treat ancient readers monolithically, as if everyone in antiquity shared the same opinions and values. Early Christianity was in many respects rooted in open rejection of elite norms. What was embarrassing to many elites (such as embracing pacifism or placing faith before reason or even worshipping a Jewish god) was not embarrassing to Christians. That's why they were Christians.¹⁸ One example of this fact is how rapidly Christians abandoned the elite Jewish requirement of ritual diet and circumcision, even though we know these cannot have been ideas promulgated by the historical Jesus (as in Galatians Paul reveals that he introduced that innovation himself, years after Jesus is supposed to have died; which is corroborated in Romans, where Paul makes a lengthy defense of the innovation without ever once citing the authority of Jesus).

Likewise, as Hooker observes, “what seems incoherent to us may have seemed coherent in first-century Palestine” or, I would add, in any particular Christian or Jewish community, especially if extra-textual discourse was expected to unravel any apparent contradictions or embarrassments, even turning them to particular uses.¹⁹ The canonical Gospels together are a rhetorical nightmare of glaring problems and contradictions (indeed many of the teachings of Jesus were deliberately written to be paradoxical), yet in the end were all still embraced by a single church. In reality, most religions throughout history have buried themselves in inconsistencies, yet persist unperturbed. Just consider how much easier Unitarianism is to defend than Trinitarianism, yet how fiercely the Catholic Church clung to the latter even though it was not believed of Jesus in Paul's time. So mere contradictions (or any other rhetorical vulnerabilities) will never satisfy the EC. We need to have a firm grasp of the whole nature of a group's thinking, values, concerns, and ideas, and its particular circumstances, before we can start declaring what they would or would not have thought a useful or suitable claim about Jesus. And the fact of the matter is, we simply don't have anywhere near enough of that information for any of the diverse Christian communities before Mark. To the contrary, we know for a fact that we are ignorant of far too much.²⁰

John Meier is peculiarly guilty of covertly assuming (even when overtly denying) there was no diversity or evolution in the views of the early church. Meier often speaks of the Gospels as if they were all products of a single monolithic “Church” that never changed from beginning to end, when in fact they were produced by very diverse church communities in different places and times, and thus whose authors did not share the same values and concerns, nor even the same beliefs. Hence, we cannot speak of what would have been embarrassing to “the church” (as Meier repeatedly does) because there was no such animal. There were many churches, constantly changing and competing over time and place. In consequence, what one author found embarrassing might not have been embarrassing to another, or one author may have had overriding interests that another did not, or one author may have been embarrassed by a statement to a very different degree than another, and we just don't know enough about the issues, concerns, and variations of the myriad church communities across the first century to make any confident statements about what would have been embarrassing to whom—except in a very few cases, and (as we'll see) none of those support Meier's use of the EC.

Even ascertaining what a particular author was thinking is difficult. Craig Evans frames the EC in terms of identifying a “tradition contrary to the evangelists’ editorial tendency.”²¹ But that assumes you actually have a correct read on what that tendency is for any given author; that you actually know what that evangelist's aims were. Given the countless unresolved disagreements over this point among the entire scholarly community today, this often does not appear to be a safe assumption. And even when safely assumed, this still only moves the problem back to that author's sources, not all the way back to historical fact.

John Meier actually produces his own example: Jesus’ cry of dereliction on the cross (“My God, my God, why have you forsaken me!?” in Mark 15:34 and Matthew 27:46), which is a quotation of Psalms 22:1. “At first glance,” Meier says, “this seems a clear case of embarrassment; the unedifying groan is replaced in Luke by Christ's trustful commendation of his spirit to his Father (Luke 23:46) and in John by a cry of triumph” (John 19:30).²² Here Meier admits that Luke and John lived in a different era and thus had “later theological agendas” different from Mark's (and Matthew's). He further admits that this cry originally served a mytholiterary function far outweighing any embarrassment it may have incurred (namely, the assimilation of Jesus in his death to a venerable Jewish tradition of “the suffering just man”), and therefore Meier concludes not only that the EC cannot rescue this statement as historical, but that the statement is probably not historical. “By telling the story of Jesus’ passion in the words of these psalms, the narrative presented Jesus as the one who fulfilled the OT [Old Testament] pattern of the just man afflicted and put to death by evildoers, but vindicated and raised up by God,” which mythic pattern Mark realized by weaving allusions and quotations of the relevant psalms “throughout the Passion Narrative,” including this cry.²³

Meier also argues that due to the mythic tradition this device evoked, Mark's play upon it would not even have been perceived as embarrassing. But here Meier's argument is immediately refuted by his own evidence that later Evangelists found it so embarrassing they changed it. But certainly his general point is still valid: that we should be careful about assuming we know what ancient readers and writers regarded as an embarrassment, particularly as this would vary even then among different communities and over time. But this is not an example of that kind of error, but of a very different one: that of naively assuming we know what an author is doing with a story. Mark's Gospel is actually rife with irony and reversals of expectation.²⁴ Which means Mark appears to have inserted material precisely because it was embarrassing—to outsiders, not privy to the “secret” (as Mark actually says in 4:11–12, 33–34). Insiders would not perceive any embarrassment, because they would be taught the real point behind everything, just as Mark says Jesus’ disciples were (a story that itself establishes the model Christians were to emulate in their own churches). In effect, Mark's literature is designed to exclude people who “don't get it,” thereby increasing the commitment of insiders who are made to feel they are special by the very fact that they understand what others don't. As this is explicitly stated by Paul (1 Corinthians 1:19–27, 2:14, 3:18–20, 8:1–2) and our earliest Gospel (per above), it's one of the few instances where we can be certain of how the earliest Christians thought about what they preached. We should not be “shocked” to find Mark realizing this principle, especially if you agree (as many scholars do) that Mark's Gospel is closest to Paul's in message and inspiration.

When it comes to Jesus’ “cry on the cross,” the fact that it begins his death with a quotation from Psalms 22:1 actually establishes a clue to the rest of the passion-resurrection narrative, which is based on a three-day sequence of Psalms (Psalm 22: crucifixion and death; Psalm 23: burial and sojourn among the dead; and Psalm 24: resurrection “on the first day of the week,” Mark 16:2 quoting the Septuagint edition of Psalms 24:1).²⁵ Once the pattern is evident, the conclusion is inescapable: this whole narrative is a literary creation, structured with well-crafted allusions to the Psalms (so well, I believe, that we can be certain we are not reading an eyewitness record of what happened, but whether that conclusion is valid I will examine in my second volume). In Bayesian terms, the probability of all these coincidences with the Psalms is much lower on the hypothesis that they happened than on the hypothesis that Mark is creating a narrative out of the Psalms. Authors don't just randomly stumble into smartly constructed and remarkably apposite literary structures like this, nor do historical events themselves just accidentally do that for them. Thus, P(e|INVENTED) >>> P(e|HAPPENED), and since we have no evidence the priors diverge anywhere near as strongly the other way, we end up with P(INVENTED|e) >>> P(HAPPENED|e): Mark is almost certainly inventing. Once we realize what Mark as an author is doing here, it no longer becomes plausible to see Mark as reluctantly including the “cry of dereliction” out of “embarrassment,” for he could easily have invented it to suit his literary intent. By contrast, if it were embarrassingly true, what possible reason could Mark have had even to mention it?
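
This inference is just the odds form of Bayes's Theorem, in which the posterior odds equal the prior odds multiplied by the ratio of the consequent probabilities. A minimal sketch in Python follows; the function names and every number in it are hypothetical placeholders chosen only to illustrate the arithmetic, not estimates defended in the text.

```python
from fractions import Fraction

def posterior_odds(prior_odds, p_e_given_invented, p_e_given_happened):
    """Odds form of Bayes's Theorem: posterior odds = prior odds * likelihood ratio.

    All odds are expressed as INVENTED : HAPPENED.
    """
    likelihood_ratio = Fraction(p_e_given_invented) / Fraction(p_e_given_happened)
    return Fraction(prior_odds) * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis into its probability."""
    return odds / (1 + odds)

# Hypothetical inputs: even priors, and the Psalm parallels assumed to be
# 100 times more expected on invention than on their having happened.
odds = posterior_odds(prior_odds=Fraction(1, 1),
                      p_e_given_invented=Fraction(1, 2),
                      p_e_given_happened=Fraction(1, 200))
p_invented = odds_to_probability(odds)

print(odds)               # posterior odds, INVENTED : HAPPENED
print(float(p_invented))  # posterior probability of INVENTED
```

On these assumed inputs the posterior odds are 100 to 1 for INVENTED. Even if the priors were set at 10 to 1 in favor of HAPPENED (prior_odds = 1/10), the same likelihood ratio would still leave the posterior odds at 10 to 1 for INVENTED, which is what it means to say the priors do not diverge "anywhere near as strongly the other way."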

Once our ignorance is dispelled, we discover the EC fails to apply. How many more applications of the EC will fail once we correctly grasp the author's actual intent? Meier fails to realize that there is no difference between this example and those he finds convincing. The evidence is identical (in every case, all Meier has to offer is the fact that later authors expose their embarrassment at the remark by changing it, exactly as they did here), and the reasoning is identical (theologically, a perfect God should not be in despair, much less talking to himself in the third person and expecting a reply, yet all Meier can propose in every other case is some similar conflict between what Christians were supposed to think of Jesus and how he was actually depicted, exactly as should be the case here). Meier thus fails to apply his own principles and background knowledge consistently. He knows full well (from his own example of Jesus’ “cry of dereliction”) that both this evidence and this reasoning fail to entail the conclusion intended (that the cry is historical), and yet he appeals to exactly this kind of evidence and exactly the same reasoning to argue other claims are historical, thus employing a method he already knows to be invalid.

(c) Problem of self-defeat

The third general problem facing successful application of the EC is the fact that (as I've suggested already) the EC most typically ought to entail exactly the opposite conclusion. Scholars advancing EC arguments fail to establish the answer to a key question: If a statement was embarrassing, then why is it in the text at all? You have to explain why this author included something contrary to his supposed editorial tendency. Yet any such explanation will entail as much reason to invent it as record it. The EC thus becomes self-defeating.

Quite simply, it's inherently unlikely that any Christian author would include anything embarrassing in his Gospel account, since he could choose to include or omit whatever he wanted (and as we can plainly see, all the Gospel authors picked and chose and altered whatever suited them—even Mark excluded a vast amount of material found in Matthew, Luke, and John, so unless that was all fabricated after Mark, Mark left out quite a lot that would have been as available to him as it was to them). In contrast, it's inherently likely that anything a Christian author included in his account, he did so for a deliberate reason, to accomplish something he wanted, since that's how all authors behave, especially those with a specific aim of persuasion. This would be all the more true for those who transmitted the tradition to Mark over the span of many decades.

It's worth remarking here, as I'll further discuss later (page 192), that everyone literate enough to compose books in antiquity was educated almost exclusively in the specific skill of persuasion: that is what all writing was believed to be for, and how all literate persons were taught to write. Therefore, already the prior probability that a seemingly embarrassing detail in a Christian text is in there only because it is true is low, whereas the prior probability that it is there for a specific, persuasive reason regardless of its truth is high, the exact opposite of what's assumed by an EC argument (and I can demonstrate this mathematically, see page 162). The mere fact that, as Meier observes, later Gospel authors freely omitted or revised what they received, is proof enough that that's exactly what authors do. So the fact that, for example, Mark shows no signs of being embarrassed by something that later authors found embarrassing, should either tell us that something has changed (which leads to the previous two problems), or that Mark had overriding reasons to include that detail. And if the latter, he (or his sources) may have had that same reason to fabricate it.

Surely if anything was actually embarrassing about Jesus, we can fairly well assume it would not survive in the record at all, since very likely no one would have recorded it, at least not faithfully. We do not have, after all, any neutral or hostile sources about Jesus from anyone actually witness to his life or (so far as we can tell) by any author in contact with anyone who was. Some of the New Testament Epistles could at least claim such contact, yet they contain nothing about the historical Jesus beyond extremely generic and essentially theological statements.²⁶ All the evidence that survives comes solely from Christians, hardly unbiased parties, and from several sources removed, which is not a reliable proximity to the facts. This cannot be overlooked.

For example, Craig Evans says the EC “calls attention to sayings or actions that were potentially embarrassing to the early Church and/or the evangelists,” and “the assumption here is that such material would not likely be invented or, if it was, be preserved. The preservation of such material, therefore, strongly argues for its authenticity.”²⁷ But this is a non sequitur. For if it “would not likely be preserved,” then there must still be a reason to have preserved it, which reason must have been stronger than the impetus to discard or alter it due to its supposed embarrassment, otherwise we cannot explain its preservation at all (because otherwise “it would not likely be preserved” entails a low P(e|h.b); that is, without any reason to preserve or mention it, its preservation is simply unlikely). But if there were such a reason, and it was that compelling, that reason would just as easily overcome the embarrassment of a fiction as the embarrassment of a fact. In other words, that it was preserved at all entails Christians must have also had a reason to invent it that would have overcome any embarrassment it created—the very same reason they would have had to preserve it if it were true. Therefore, its preservation does not argue for its authenticity at all, much less “strongly,” as Evans avers.
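
The logic can be checked numerically. What follows is a minimal sketch, assuming only two competing hypotheses (the detail is authentic, or it was invented) and purely hypothetical numbers. If whatever motive secured the detail's preservation works equally well for a fact and a fiction, then P(e|h.b) is the same on both hypotheses, and the preservation of the detail cannot move the prior at all.

```python
from fractions import Fraction

def posterior_authentic(prior_authentic, p_e_if_authentic, p_e_if_invented):
    """Bayes's Theorem for two exhaustive hypotheses: AUTHENTIC vs. INVENTED.

    e = the embarrassing detail was preserved in the text.
    """
    prior_invented = 1 - prior_authentic
    numerator = prior_authentic * p_e_if_authentic
    return numerator / (numerator + prior_invented * p_e_if_invented)

prior = Fraction(3, 10)  # hypothetical prior that the detail is authentic

# The same overriding motive on both hypotheses gives the same likelihood on both,
# whether that motive is assumed to be strong or weak.
strong_motive = posterior_authentic(prior, Fraction(4, 5), Fraction(4, 5))
weak_motive = posterior_authentic(prior, Fraction(1, 20), Fraction(1, 20))

# Only a motive that favors preserving facts over fictions changes the result.
asymmetric = posterior_authentic(prior, Fraction(4, 5), Fraction(1, 5))

print(strong_motive, weak_motive, asymmetric)
```

In the first two cases the posterior simply equals the prior. The third case shows what an EC argument would actually have to establish: not merely that there was a reason to preserve the detail, but that this reason was specifically more likely to operate if the detail were true.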

I'll present some examples later of exactly this problem. The exceptions are very few and hard to establish in particular cases. This means we must abandon the demonstrably false assumption that authors never write embarrassing things if they are false. Authors, especially religious ones, often have overriding reasons to include embarrassing details, such as to convey a lesson or shame their audience into action. Mark's antiheroic depiction of Peter could have exactly that intent, now modeled by modern preachers who often employ exaggerated claims of their own moral depravity and denial of God prior to “finding Christ.” Such rhetoric serves as a model for redemption and as evidence of sincerity (thus simultaneously proving “you, too, can be redeemed” and “having undergone such a dramatic reversal, surely I must be telling the truth”). But sometimes we cannot even fathom the motivation. The castration of Attis and his priests was widely regarded by the ancient literary elite as disgusting and shameful, and thus was a definite cause of embarrassment for the cult, yet the claim and the practice continued unabated. No one would now argue that the god Attis must therefore have actually been castrated. The humiliation of Inanna, Queen of the Gods, was similarly embarrassing (her story even seems deliberately crafted to be), yet no one would argue from this that Inanna really was stripped naked, killed by a death spell, and nailed up in hell.²⁸ The mythical Romulus murdered his own brother, which was then among the most despicable of crimes, and still he remained a revered god of the Roman people—and yet no one believes that ever happened, either.²⁹

We simply have no clue why these shocking stories were invented, much less became the objects of veneration and symbolic emulation. Religions frequently rally around apparently embarrassing yet entirely false myths, often in defiance of common sense. The Jews were no exception. Contrary to current assumption, the execution of their own messiah was believed to have been predicted by Daniel (Daniel 9:26; even more clearly in the Greek), who was widely recognized as an inspired prophet of God.³⁰ And the Gospels clearly regarded Daniel as an authority: Mark's apocalypse (in chapter 13) and Matthew's nativity and empty tomb stories all incorporate overt allusions to the Book of Daniel.³¹ Hence it would not matter if the execution of their messiah was embarrassing, for it already had the full prior authority of God and his prophets. Which would be just as much a reason to invent the detail, for such an invention would overcome any and all embarrassment at the fact by virtue of having clear scriptural endorsement from God. If you could prove God had said it would happen centuries in advance, then you would get far more traction inventing a confirmation of that prophecy than you would suffer from the fact that what God had ordained was in any sense embarrassing. Whether that's what happened or not (I make no case here either way), the conclusion remains: sometimes embarrassing details are invented for a reason.

(d) Problem of bootstrapping

Some scholars admit that the EC is insufficient to establish a detail as historical, and insist it must be used in conjunction with other criteria. But as Porter observes, “this may well create a vicious circular argument, in which various criteria, each one in itself insufficient to establish the reliability or authenticity of the Jesus tradition, are used to support other criteria,” with the result that “the sum of a number of inconclusive arguments is asserted to be decisive,” which is a non sequitur, especially when those individual arguments are not merely inconclusive, but logically invalid.³² Not even a million logically invalid arguments can establish a conclusion—at all, much less “decisively.” Porter also notes that the EC is basically just a particular application of the Criterion of Double Dissimilarity, whose fatal defects Porter also catalogs, and which are conclusively brought out by Theissen and Winter as well.³³ Indeed, some attempts to bootstrap an EC argument with other criteria can actually make that argument weaker, not stronger. For example, responding to instances when the EC is bolstered with the Criterion of Multiple Attestation, Mark Goodacre remarks, “I can't help thinking that one cancels out the other. If everyone, Q, an independent Thomas, Mark, Matthew, Luke all have this same material, who is embarrassed about it? The multiple attestation is itself an argument against embarrassment.”³⁴ Quite so.

Some will say that nevertheless, in some cases the EC makes a particular claim “more probably” historical than otherwise. But even if that's true (and it cannot be assumed; you have to prove in each specific case that the EC has that effect on that data, overcoming all the problems noted so far), that by itself is an observation of no value. Ten percent is ten times more likely than 1 percent, a huge increase in probability, and yet that would still be only 10 percent likely to be historical, which means it probably isn't. In fact, the probability it isn't is then 90 percent, which makes it fairly certainly not historical. Thus, the mere fact of making something more probable does not make it more believable. Because it doesn't necessarily make it believable at all.
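
The arithmetic is easy to verify. Here is a minimal sketch in Python using the 1 percent and 10 percent figures above; the function names are hypothetical, and the last function adds one step not stated in the text, namely how strong the evidence would have to be to carry a 1 percent prior up to even odds.

```python
from fractions import Fraction

def relative_increase(before, after):
    """How many times more probable a claim has become."""
    return after / before

def is_probable(p):
    """A claim is 'probably true' only if its probability exceeds 1/2."""
    return p > Fraction(1, 2)

def likelihood_ratio_needed(prior, target):
    """Likelihood ratio P(e|h)/P(e|~h) required to move `prior` to `target`."""
    prior_odds = prior / (1 - prior)
    target_odds = target / (1 - target)
    return target_odds / prior_odds

before = Fraction(1, 100)  # 1 percent
after = Fraction(1, 10)    # 10 percent

print(relative_increase(before, after))  # the size of the increase
print(is_probable(after))                # whether the claim is now probable
print(1 - after)                         # the probability it is not historical
print(likelihood_ratio_needed(before, Fraction(1, 2)))
```

A tenfold increase still leaves the claim improbable. On these figures, the evidence would have to be more than 99 times likelier on historicity than on its denial before a claim starting at 1 percent could be called probable at all.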

Thus historians cannot hide behind meaningless assertions like “more probably historical” in order to bootstrap their way to “probably is historical.” That's logically invalid, and therefore not a rational historical argument. You have to confront the hard question: just how probable is it? And BT is the only viable method for answering that question (as I've shown in chapters 3 and 4).

SPECIFIC INADEQUACY OF THE CRITERION OF EMBARRASSMENT

Such are the general problems facing any attempt to apply the EC. Now let's examine some particular cases.

(i) Jesus’ crucifixion by Romans

This is the most common example. As John Meier puts it:

Such an embarrassing event created a major obstacle to converting Jews and Gentiles alike (see, e.g., 1 Corinthians 1:23), an obstacle that the church struggled to overcome with various theological arguments. The last thing the church would have done would have been to create a monumental scandal for which it then had to invent a whole apologetic…Precisely because the undeniable fact of Jesus’ execution was so shocking, precisely because it seemed to make faith in this type of Messiah preposterous, the early church felt a need from the beginning to insist that Jesus’ scandalous death was “according to the Scriptures,” that it had been proclaimed beforehand by the OT prophets, and that individual OT texts even spelled out details of Jesus’ passion.³⁵

We have already seen several reasons why this reasoning is invalid. The example of the castration of Attis alone refutes it. Many religions contain scandalous claims they must work hard to defend, yet few of those claims are true. So why are we obliged to assume this one is? The fact that the OT very clearly did predict the execution of the messiah (Daniel is explicit, and had already convinced the Jews of Qumran) likewise refutes Meier's claim that OT support must have been sought for after the fact.³⁶ Maybe it was. But we cannot conclude it was from the mere fact that more OT support was accumulated over time. Certainly more and more support would be sought (especially as Christians, already convinced it's there, begin to see it everywhere). But Paul does not appear to be speaking of post-hoc rationalizations when he says the core creed of the church was “that according to the Scriptures Christ died for our sins, and that he was buried, and that according to the Scriptures he has been raised on the third day” (1 Corinthians 15:3–4), nor does Hebrews appear to be stumbling over any embarrassing historical fact when it freely assumes Jesus’ death was a fully ordained Levitical sacrifice. In fact, Hebrews 9 makes such complete sense of the messiah's death, we can easily imagine that very same reasoning inspiring a belief in that death.³⁷

Accordingly, the hypothesis that the Christian messiah's trial and execution by “the ruler who would come” (Daniel 9:26) was indeed derived from the OT is initially as good an explanation of the evidence as Meier's, particularly since the Jews at Qumran were already equating this messiah with the servant of Isaiah 52–53, wherein we have almost the whole core Gospel story.³⁸ This fully explains why they expected the messiah to be executed, why they would imagine (or preach) that the Romans did it, and thus why they would conceive that death as a crucifixion (it being the standard mode of Roman execution for those who have humbled themselves completely). It would only be a bonus that being hung on a stake accorded with Scripture (Galatians 3:13) and that the Jewish authorities also crucified the bodies of their convicts.³⁹ It's notable that later, Babylonian Jews knew only of an account in which Jesus was executed by stoning (and then hung up), and solely by the Jewish authorities, all exactly in accordance with Jewish law.⁴⁰ Even the Christian Gospel of Peter (vv. 1–2) only knows of an execution by the Jewish state, under the command and supervision of Herod Antipas, rather than Pontius Pilate. In contrast, the canonical Gospels have to twist their story into a convoluted and implausible sequence of events just to get Jesus executed by Romans and not the actual Jews who were accusing him of violating only Jewish laws. Nearly every scholar acknowledges the glaring inconsistency. Some even conclude that the story can only be a whitewash for what was really a Roman execution of Jesus for attempting a coup.⁴¹ But one can just as easily argue the reverse, that this element is the whitewash, for what was really a Jewish execution for blasphemy, as the Talmud records, some Christian texts confirm, and even the canonical Gospels imply.⁴²

I am not here arguing Jesus wasn't crucified by the Romans, only that we cannot establish this with an argument like Meier's. For if finding OT support post hoc made this “scandalous” message successful, finding that OT support ante hoc would be just as successful. Therefore, Meier's reasoning is unsound. We know too little about the actual thinking of Christians in the time of Paul (such as regarding the origin of their creed of a crucified messiah); there are many plausible ways a crucifixion by Romans could have been preconceived or even served Christian interests; and religions often center their message around embarrassing myths they then have to defend. So we have to go back and ask questions about the relative probabilities of competing hypotheses. The fact that Mark is deliberately casting Jesus as a “suffering just man” and packing that story with deliberate irony would be reason enough to construct the tale as it is. Indeed, the humiliation-execution theme was a trope for Jewish mythic and figurative heroes of the time, and thus fashionable, not embarrassing (it appears in many ancient Jewish texts, including Wisdom of Solomon 2–5, Isaiah 52–53, and 1QIsaᵃ 52.13–53.12).⁴³ If Mark thought depicting Jesus like this would be too embarrassing, he would have colored his depiction exactly as John Meier observes the later Evangelists did. That he didn't entails Mark was not embarrassed by his version of events. It served his purposes. Which means it would have served his purposes whether it was true or false. It must also be noted that, contrary to common assumption, Paul never says Jesus was crucified by Romans. In Galatians 3:13, for example, he only seems aware of a Jewish execution (in which the convict would be killed and then crucified, the two events often conflated even in Jewish sources, like the Talmudic account of Jesus’ death noted earlier). Might that be what actually happened, and Mark simply changed it up to make a literary point about Roman complicity in the corrupt world order? On what basis are we certain the answer is no?

What your analysis comes to on this detail will thus depend on a number of other facts and conclusions you must settle first. The EC argument alone is simply insufficient. The fact of the crucifixion being embarrassing would make no significant difference in the consequent probabilities, and as deifying an actually crucified man has no greater prior probability than imagining a deity crucified, there is no significant difference in priors either. Likewise, whitewashing the Jewish stoning and hanging of a blasphemer as the more heroic Roman crucifixion of a misunderstood insurrectionist has no greater prior probability than whitewashing the Roman crucifixion of an insurrectionist as the Jewish-instigated crucifixion of a wrongly accused blasphemer. Six of one, half a dozen of the other.
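The point about priors and consequents can be checked numerically. What follows is a minimal sketch, not a calculation taken from the text: it assumes just two competing hypotheses (h and its negation), and the function name `posterior` and every number in it are hypothetical placeholders chosen only to illustrate the logic. When neither the priors nor the consequents differ, BT simply returns the prior unchanged.

```python
def posterior(prior_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes's Theorem for a hypothesis h and its negation.

    prior_h          -- prior probability of h (before considering the evidence)
    p_e_given_h      -- consequent probability: how expected the evidence is if h is true
    p_e_given_not_h  -- consequent probability of the same evidence if h is false
    """
    numerator = prior_h * p_e_given_h
    denominator = numerator + (1.0 - prior_h) * p_e_given_not_h
    if denominator == 0.0:
        raise ValueError("the evidence has zero probability on both hypotheses")
    return numerator / denominator


# Hypothetical inputs: equal priors, and an "embarrassing" detail that is
# equally expected whether the event happened or was invented.
equal_case = posterior(prior_h=0.5, p_e_given_h=0.8, p_e_given_not_h=0.8)

# Same priors, but now the evidence is twice as expected on h as on not-h.
tilted_case = posterior(prior_h=0.5, p_e_given_h=0.8, p_e_given_not_h=0.4)

print(f"equal consequents:  {equal_case:.3f}")
print(f"tilted consequents: {tilted_case:.3f}")
```

Only the ratio of the two consequents matters here, which is why an argument that leaves that ratio at one to one cannot move the conclusion in either direction.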

(ii) Jesus’ birth in Nazareth

Some scholars argue that no one would invent a story about a messiah born in a hick Galilee town like Nazareth, so surely that detail must be true. Elsewhere I have refuted one underlying premise of this argument: the assumption that such an origin would be embarrassing.⁴⁴ Others have refuted its other premise by identifying several plausible reasons why Jesus would be falsely contrived as a Nazarene.

Eric Laupot makes a plausible case that the term was originally derived from Isaiah 11:1 as the name of the Christian movement (as followers of a prophesied Davidic messiah), which was retroactively made into Jesus’ hometown (either allusively or in error).⁴⁵ J. S. Kennard makes just as plausible a case that it was a cultic title derived from the Nazirites (“the separated” or “the consecrated”) described in Numbers 6 (and the Mishnah tractate Nazir).⁴⁶ As he points out, a Nazirite vow was most typically of limited duration (a fixed number of days), consecrating oneself to God by certain rituals, most prominently abstaining from wine (which Jesus indeed vows to do: Mark 14:25; Matthew 26:29), although Kennard argues the Christians adopted the term to designate a new notion of separation or holiness reflected in the Baptist cult (similar in function to the title “Essene”). Such an interpretation is all but confirmed by Acts 24:5, where “Nazarenes” cannot mean inhabitants of Nazareth. René Salm notes that the Gospel of Philip seems only to know the word as an epithet of “Truth” and not a geographical moniker (and he's right),⁴⁷ and hence it may have begun as the one and been literarily transformed into the other, quite possibly by Mark.⁴⁸ Salm's suspicion is notably confirmed by Irenaeus, who believed as early as the late second century that “Jesus Nazaria” meant “Savior of Truth” in Hebrew or Aramaic.⁴⁹ The conversion of this into a town (or its association with a town already existing) might then be just another instance of the common mythographic practice of symbolic eponymy.⁵⁰

Since no prior author mentions a connection between Jesus and Nazareth (Paul, for example, makes no mention of it), such developments are more than merely possible. Though I do not agree with all the theories of either Salm, Kennard, or Laupot, their arguments on this point are correct: these are viable possibilities, at least sufficiently probable to require us to rule them out first. Judges 13:5 (in light of Numbers 6) could also have been symbolically interpreted to “derive” a fictional hometown for Jesus (in much the same way that we see wildly unexpected “readings” of Scripture in the early Christian treatise of Hermas). In short, Mark may have invented this detail to serve a literary purpose (as we know he did for other details in his Gospel), or he may have received a tradition that had become garbled over time, in a decades-long telephone game that lost track of the word's original meaning. That this is possible is why we must compare any EC thesis with competing explanations of how any claim came to exist. Which only BT can competently accomplish.

There is a more specific EC argument in defense of Nazareth: later Gospel writers were clearly concerned to have their messiah born in Bethlehem, where some thought the OT predicts he should have been born, but this conflicted with Mark's depiction of Jesus as a native of Nazareth, so they invented convoluted stories to explain how he could come from both places. Yet this is the same evidence even Meier agrees should be suspect: if this had originally been a problem, why didn't Mark address it? Indeed, why hadn't it already been addressed decades before the tradition even reached Mark? Why only after Mark do these convoluted double-origin stories arise? If Mark 1:9 is discounted as an interpolation (via contamination from, or harmonization with, the other Gospels, which we know to have been a frequent occurrence in their transmission), then Mark never actually said Jesus came from Nazareth. In fact Mark seems to imagine him hailing from Capernaum (cf. Mark 2:1 and 9:33 with 6:3–4), which was also in accord with prophecy (Isaiah 8:21–9:2, verified by Matthew 4:12–16).⁵¹ The epithet ‘Nazarene’ might then not be geographical even in Mark, but instead mistaken for such only by later authors.⁵² But either way, we still have no credible case for an EC argument using only Mark. And since, as far as we can tell, later Gospels only get the idea that Jesus came from Nazareth from Mark, we have no EC argument at all.

So although Matthew and Luke's struggle to make Jesus come from both Nazareth (as Mark claimed or implied) and Bethlehem (as the OT predicted) suggests a Nazareth origin was embarrassing to Matthew and Luke, there is no indication it was embarrassing in Mark. The embarrassment seems to have been created by Mark's introduction of a Nazareth origin, combined with a need to have Jesus fulfill a prophecy of origin. It seems only later Christians may have evinced that need. Yet Matthew even claims a Nazareth origin derived from prophecy—in fact, not the town, but the epithet, which could thus have been Mark's source as well (see Matthew 2:23).⁵³ Unless Matthew was lying, we are obliged to agree that Mark (or his source) could have had that same prophecy in mind. Thus we have two different prophecies, with later Christians trying to force Jesus to fit both, but they have exactly the same reasons to invent either. So if Matthew and Luke are inventing a Bethlehem origin to force a fit with prophecy (as this argument for the authenticity of the Nazareth tradition entails), and Nazareth was also prophesied (possibly in a now-lost scripture, like the Hazon Gabriel, or a lost variant reading in an extant OT text), then Mark had as much reason to invent a Nazareth origin to force a fit with prophecy as Luke and Matthew (or their source) had to invent a Bethlehem origin.⁵⁴ Even if this began as an epithet later interpreted as a town the same process could occur. There is an apposite parallel in Matthew's duplication of the donkeys Jesus rode into Jerusalem: trying to force the story to fit what Matthew took to be a double prediction, he invented an extra donkey (implausibly having Jesus straddle both), but this in no way argues that the first donkey must then have been historical.⁵⁵ To the contrary, it was already itself mandated by prophetic Scripture. Thus the fact that Matthew found just the one donkey “embarrassing” does not argue for that one donkey being historical.

So here the EC argument collapses under the weight of the problem of ignorance. It also falls to the problems of self-contradiction and self-defeat, since the argument requires us to imagine an embarrassment that persisted for decades was only resolved after Mark, as if Mark knew nothing of it, which contradicts the assumption that it was ever embarrassing before Mark wrote of it. Why, after all, does Mark even go out of his way to use the words Nazareth or Nazarene in the first place? His story appears to operate quite well without them. So if they were embarrassing, he could have simply omitted them. That he didn't entails he was including them for some specific purpose, and whichever purpose that was, it would serve a fiction as well as a fact.

Here our background knowledge establishes that weird corruptions and attributions like this occur frequently in mythic and legendary traditions (hence many a mythic hero was given an obscure town to hail from). So prior probability doesn't favor either hypothesis unless we can adduce more evidence otherwise, which is why the EC cannot stand alone. And there are several plausible hypotheses other than “historicity” that render the Nazareth detail every bit as likely, if not more so: for example, by better explaining the absence of this detail in any earlier Christian literature, the unusual use of it in Acts and the Gospel of Philip, and the testimony of Irenaeus. That leaves no relevant difference in the consequents. Thus the EC cannot establish that Jesus was born at Nazareth. Again, perhaps some other argument can, but the EC is unable to.

(iii) John's baptism of Jesus

We've dealt with supposed embarrassments regarding Jesus’ birth and death. The next most common example falls in between: his baptism, ironically a symbol of both death and (re)birth. As John Meier puts it, “the baptism of the supposedly superior and sinless Jesus by his supposed inferior, John the Baptist,” who was proclaiming “a baptism of repentance for the forgiveness of sins,” must have been embarrassing, because it contradicts Christian beliefs (that Jesus was “superior” and “sinless”), and because subsequent evangelists scrambled for damage control.⁵⁶

But this is the same double error Meier himself refuted in the case of Jesus’ cry on the cross. First, we might see subsequent evangelists were embarrassed by the story, but Mark is not—had he been, he would already have engaged the same damage control they did. In fact, this would have been done by transmitters of the story decades before it even got to Mark (probably even before Jesus had died). The embarrassment is thus obviously new. So on that point alone the EC fails to apply. Second, Meier simply assumes Mark (and all prior Christians) believed Jesus was “superior” and “sinless” and thus would not countenance anything implying otherwise. Neither is even plausible, much less established for early Christians, or even those of Mark's time. Paul included Christ's voluntary submission and humbling as fundamental to the Gospel (Philippians 2:5–11). Christians did not imagine Jesus as then “superior” until he was exalted by God—at his resurrection (e.g., Romans 1:4, 1 Corinthians 15:20–28). There is nothing in Mark's depiction of Jesus submitting to John that conflicts with this view. Only later Christians had a problem with it. Mark instead portrays what Christians originally thought: that Jesus would be exalted as the superior later on. Hence he has John say exactly this (Mark 1:7–8). Likewise, the notion that Jesus was “sinless” from birth is nowhere to be found in Mark or Paul. It is clearly a later development, and thus not a concern of Mark's. To the contrary, Mark has full reason to invent Jesus’ baptism by John specifically to create his sinless state, so Jesus can be adopted by God, and then live sinlessly unto death. Mark makes a point of saying John's baptism remits all sins (1:4), that Jesus submits to that baptism, and that God adopts Jesus immediately afterward (1:9–11). This is hardly a coincidence. The role of John and his baptism are explicitly stated by Mark: to prepare the way for the Lord (1:2–3). And that's just what he does. 
There is no embarrassment here. Again the EC does not apply.

So when Meier insists “it is highly unlikely that the Church went out of its way to create the cause of its own embarrassment,” we can see in fact such a thing is not unlikely at all: once Christians started amplifying the divinity and sinlessness of Jesus, the story Mark had already popularized started to create a problem for them, so they had to redact that story to suit their changing theology. This proves Mark preceded those redactors and lacked their concerns, but it doesn't prove Mark's story is true. To the contrary, Mark had a clear motive to invent the story, particularly as he needed to cast someone as the predicted Elijah who would precede the messiah and “reconcile father and son” (Mark 1:6 in light of 2 Kings 1:8; and Mark 9:11–14 in light of Malachi 4:5–6) and set up Jesus’ cleansing for adoption. Why not cast in that role his most revered predecessor, John the Baptist? Having John prepare Jesus by cleansing him of sin and establishing his divine parentage, and then endorsing Jesus as his successor, is actually far too convenient for Mark. That is not a statement against interest!⁵⁷

As John Gager explains, “in Mark, the incident appears very briefly and with no sign of embarrassment or editorial ‘improvement’” or “discomfiture” of any sort, and therefore the EC doesn't apply, especially as there are plausible reasons for Christians to have invented the detail.⁵⁸ Gager cites Enslin's theory that the narrative allows Christians to co-opt the authority of the Baptist cult by representing their leader as his designated heir. Another reason for inventing the story is more symbolic (yet perhaps more obvious): this baptism, as for all other Christians, represents for Jesus “rebirth” through adoption by God. Just as Christ's crucifixion and last supper are both models for Christian life, so is his baptism. That it was intended as such cannot be doubted, given that it had been the Christian tradition for decades already that that's what a baptism was: being reborn as an adopted son of God (Romans 6:3–4; 1 Corinthians 12:13; Galatians 3:27–29, 4:5–7). Mark could hardly have told a story about Jesus’ baptism and meant anything else by it (at least without explaining to his readers that he did). This is corroborated by Mark's ending the baptism with God's declaration of fatherhood, as well as the implications of his quoting Psalm 2:7, which evokes rebirth.⁵⁹ If Jesus had to be reborn through baptism as a model for all other Christians, whom could Mark have chosen to baptize him? Was there any more suitable choice than the famous John the Baptist?

With three solid theories of why and how Mark would invent this baptismal tale, each quite probable, the EC fails. It's not even clear that Matthew and Luke were actually embarrassed by Mark's story. The notion that Luke tries to downplay John's baptism of Jesus is only based on subjective assumptions about Luke's ordering of verses. But if we set aside our preconceived suspicions and just read the text, Luke concedes the baptism without blush (Luke 3:7, 12, 21). And though Matthew inserts an apologetic vignette (Matthew 3:13–15), this really only clarifies what Mark already said (Mark 1:2–3, 7–8). Though Matthew's insertion does bear signs of a post-Markan magnification of Jesus, he is still just responding to a criticism that Mark had not foreseen would result from his story. That Matthew thought he'd eliminated the embarrassment by making clear what Mark had already meant suggests the baptism itself was not embarrassing, even to Matthew—merely its interpretation could be. This being an instance of unforeseen embarrassment, once again the EC fails to apply. Only the Gospel of John takes the logical step of actually deleting the baptism (while retaining most of the other connections Mark had established between Jesus and John: e.g., John 1:29–34). So perhaps we can say the authors of John found the baptism embarrassing, but now we are so far removed in time from Mark we have no valid EC argument left to make. For John's theology could not be further removed from Mark's—indeed, in John, Jesus is identical to God (John 1:1–5) not (as had originally been preached) his subordinate (Philippians 2:5–11; 1 Corinthians 15:24), eliminating any rationale for the baptism. 
Again, even at best, prior probability would favor neither the historicity nor invention of this detail, and no difference remains in the consequents; while at worst, we've already seen Mark is not developing a reputation for straight-up, reportorial honesty, tipping the priors toward fabrication, while the wholly suspicious convenience of both the claim and the structure of its narration in Mark, and the absence of this claim in any prior author, tips the consequents well in favor of fabrication. So the historicity of the baptism cannot be established with the EC. To the contrary, in the final analysis it looks quite dubious.

(iv) Jesus’ ignorance of the future

I have shown how the three EC arguments most often regarded as unassailable are in fact unsustainable. Any other example you care to choose will fall to the same analysis. One such is Mark 13:32, where we find, as John Meier puts it, “the affirmation by Jesus that, despite the Gospels’ claim that he is the Son who can predict the events at the end of time, including his own coming on the clouds of heaven, he does not know the exact day or hour of the end.” But Meier gives no explanation why we are supposed to believe Mark expected Jesus to be omniscient.⁶⁰ Mark simply says Jesus knows some details and not others; that God has reserved those for himself and not told his Son. Mark shows no embarrassment at this at all. That would only be embarrassing to later Christians who were increasingly equating Jesus with God, to the point that it became less and less intelligible how God could simultaneously know and not know something. Hence, as Meier shows, later scribes tried meddling with this passage in Mark and Matthew, while Luke and John deleted it altogether. But this concern did not exist in Mark's time, or before (as surely it would have been deleted decades before Mark even got ahold of the story). According to Paul (in the verses cited on page 148), Jesus was an appointed emissary of God, not identical with him, and there is no evidence Mark thought otherwise (in fact quite the contrary: Mark 10:18, 14:36, etc.).⁶¹

This is another instance of Meier simply ignoring the fact that the embarrassment was only created by a development in Christian theology that occurred long after the embarrassing statement had been made. Since it was not embarrassing when made, the EC does not apply, and therefore there is no EC argument for the historicity of Jesus’ ignorance of the exact time of the apocalypse. To the contrary, when this statement is examined with the correct logic, we must consider why Mark would even bother including this remark from Jesus. Why not simply omit it if it's supposed to have been embarrassing? That's what Luke and John did; so why couldn't Mark? A moment's thought should lead us to a far more obvious explanation. Mark is inventing this ignorance as an apologetic to explain what was embarrassing to Mark and his community: the fact that the end had not yet arrived, even though everyone up to then had believed Jesus told them it was nigh. By having Jesus declare the hour was not yet known to him, Mark was rescuing Jesus from the most fatal charge of being a false prophet (or at least rescuing the received tradition about Jesus from this charge, since it is not simply a given that the “end is nigh” is what Jesus actually taught, rather than coming from Christian prophets who, like the author of Revelation, claimed to be communicating with his resurrected spirit).⁶² It's ironic to see what Mark probably invented to counter an embarrassing fact now being used as if it were itself so embarrassing that what Mark invented had to be what Jesus actually said!

In this case, the EC not only fails, but the conclusion should be quite the reverse. There is no reason to favor either hypothesis with a higher prior (at least on the facts given here), but the evidence as a whole is more probable if the statement is fabricated than if it's historical (there being no likelihood of its inclusion otherwise), so the balance of consequents ultimately favors fabrication.
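The reasoning in this paragraph can be written out in the odds form of BT. This is a sketch in standard notation, which is assumed here for illustration rather than taken from this section: h stands for "the saying is historical," ¬h for "the saying is fabricated," e for the evidence as a whole, and b for background knowledge.

```latex
\[
\frac{P(h \mid e, b)}{P(\neg h \mid e, b)}
= \frac{P(h \mid b)}{P(\neg h \mid b)} \times \frac{P(e \mid h, b)}{P(e \mid \neg h, b)}
\]
% Equal priors: P(h | b) = P(\neg h | b), so the prior odds equal 1 and
\[
\frac{P(h \mid e, b)}{P(\neg h \mid e, b)} = \frac{P(e \mid h, b)}{P(e \mid \neg h, b)} < 1
\quad \text{whenever} \quad P(e \mid h, b) < P(e \mid \neg h, b).
\]
```

With the priors even, the posterior odds reduce to the ratio of the consequents, so any evidence that is more expected on fabrication than on historicity leaves fabrication the more probable conclusion.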

(v) Did Jesus not know he was the Son of Man?

In Mark, Jesus speaks of an eschatological Son of Man as if it were a different person. Some argue this proves Jesus really preached that, because later Christians (as we see in subsequent Gospels) believed Jesus and this eschatological Son of Man were one and the same, and thus would never depict Jesus saying the contrary.⁶³ But Mark alone uses the concept of a mysteriously unidentified Son of Man, a mode of speaking that suits Mark's narrative theme of a “messianic secret.” Hence it's just as likely (if not more so) that Mark has Jesus speak of the coming Son of Man as if he were speaking of another, when secretly he was speaking of himself. Since Paul already tells us that Jesus and the coming eschatological Lord were one and the same decades before Mark wrote (Paul speaks of no Son of Man or anyone else returning but Jesus: 1 Corinthians 15:20–28; 1 Thessalonians 4:16–17), it is not believable that Mark would think otherwise, much less say otherwise. And in fact Mark clearly understood them to be the same person (Mark 8:31, 9:9, 9:31, 10:33, 10:45, 14:21, 14:41). If he intended the contrary he would certainly have to be keenly aware of the embarrassment (or indeed confusion) it would cause, as would those transmitting the material to him—so why did neither he nor they seek to resolve the problem, or even acknowledge it? Again, the EC becomes self-defeating: it's so improbable that Mark would mean they were different people (when Paul knew no such thing, and Mark clearly meant them to be the same) that we can be sure he didn't. We should therefore conclude Mark is using a literary device, not recording the true words of Jesus (unless Jesus really did preach in this mode of third-person secrecy, though even then he would still be speaking about himself). 
Of course, it's also possible Mark perceived no contradiction in using a Son of Man source text that didn't even come from Jesus (but derived instead from some other Christian or even pre-Christian prophet) and then attributing it to Jesus, but that would still entail he felt no embarrassment at the result, and in any case the sayings would then not be historical. So no EC argument can gain purchase from that premise, either.

Again, the EC fails here, and only BT gives us a sound argument. If this detail were at all embarrassing, prior probability would favor fabrication, as historicity would make suppression or alteration far more probable than faithful retention (for a mathematical demonstration, see page 162). And if this detail wasn't embarrassing, the priors could go either way, but the consequents would then slightly favor fabrication: because it's more likely Jesus would be depicted saying this in Mark if Mark were employing a literary device well known to him (or Mark's source had done so), than if Jesus really said such things, particularly as the latter does not make probable any of the other evidence (such as in Paul), while the former does. So a valid and sound analysis argues against the historicity of these sayings more than in favor of them (even if not decisively).

(vi) Jesus’ betrayal by Judas Iscariot

In the hypothetical source document Q, Jesus is made to say all twelve of his disciples would receive eternal honors (Luke 22:30; Matthew 19:28). Meier insists:

If one wants to claim that the saying was instead created by the early church, one must face a difficult question: Why would the early church have created a saying (attributed to the earthly Jesus during his public ministry) that in effect promised a heavenly throne and power at the last judgment to the traitor Judas Iscariot?⁶⁴

Even Theissen and Winter declare, “Early Christianity always numbered Judas Iscariot among the twelve disciples and had simply scorned and condemned him as the one who betrayed Jesus,” so “When nevertheless the Jesus tradition preserves a promise that the twelve (and not the ‘eleven’!) disciples will exercise future rule over the restored Israel (Matthew 19:28/Luke 22:28–30), there can be no doubt that this is a saying that has withstood the tendencies of the tradition.”⁶⁵

Already, their first fact isn't true. For at least the twenty or thirty years of Paul's ministry, in other words the entirety of “early Christianity,” we never hear of any of these claims (that a Judas Iscariot was one of the twelve or that he, or any member of the twelve, betrayed Jesus or was “scorned and condemned” for doing so, despite an occasion to mention it in 1 Corinthians 11:23–27; nor even the saying about twelve thrones, despite an occasion to mention it in 1 Corinthians 6:2–4). If anything, we have evidence confuting this betrayal. Unless we admit to an interpolation, Paul says “the twelve” were honored with a vision of Jesus almost immediately after his death (1 Corinthians 15:5), which is hard to reconcile with any notion that one of those twelve was known to have betrayed Jesus. This likewise suggests the later Gospel story of Judas's suicide must be false (since he must then have been still alive to receive—and report—a revelation of the risen Jesus). It's therefore more likely that the story of Judas's betrayal is a literary invention, whose meaning was thought more important than any embarrassment it might cause.

What Meier, Theissen, and Winter might then wish to argue is that Jesus’ statement that “the twelve” would reign in the future world contradicts the literary invention of Judas's betrayal. But those two traditions may have separate origins: the prediction does not exist in Mark, who (so far as we know) invented the Judas story, while the Judas story does not exist in Q, the hypothetical sayings source that alone included any reference to a rule of the twelve—unless Matthew 27:3–10 and Acts 1:18–20, the tales of Judas's suicide, derive from Q (despite being so oddly contradictory), but even if Judas's betrayal found its way into Q, it's still widely believed Q was redacted several times (and the Judas story certainly wasn't part of any collection of sayings from Jesus), leaving still the question of whether Judas's suicide was a later addition to Q or derived from yet another source.⁶⁶ Likewise, the predicted rule of twelve may have originally been communicated in visions of the risen Jesus (and thus might not derive from a historical Jesus) or even been adopted from pre-Jesus messianic tradition, all before the Judas story was conceived. The fact that Jesus does not say “the twelve” will rule but that those who follow him will sit on twelve thrones judging the twelve tribes seems innately disconnected from the number of disciples (since the governing number is the Jewish tradition of twelve tribes, not the number of people who just happened to be following Jesus). Hence it seems rather to derive from typical messianic thinking of a sort that would not require Jesus to have ever said this himself. It would also have been in the interests of “the twelve” (if such a body really was known to Paul) to have invented this saying for Jesus, specifically to legitimize their special status (or even to have taken it from a pre-Christian apocalypse and attributed it to Jesus for the same reason). 
On the other hand, if there was no “twelve,” the Q statement no longer has any connection with the disciples at all—it then is simply another apocalyptic prediction (that twelve men someday elected by the messiah would sit on twelve thrones), which could have derived from any source and been later attributed to Jesus (like so many other sayings may have been).

In any of these scenarios, the myth of Judas's betrayal would have arisen in a tradition separate from the “twelve thrones” saying, and these traditions were combined later—creating a new problem that (like so many others) wasn't noticed right away. And yet it's just as possible the saying was simply invented by Matthew, without realizing the contradiction it thus created—or without believing it did create a contradiction. For if Matthew believed that, despite the fate of Judas, there would nevertheless be an official “twelve” to enthrone when the time came, as both Luke and Paul seem to have believed, then no contradiction results.⁶⁷ After all, Matthew is smart enough to emend the number to eleven when describing the first resurrection appearance to the disciples (Matthew 28:16; likewise Luke 24:9 and 33, and Pseudo-Mark 16:14). He would not suddenly forget about the same conflict remaining with the twelve thrones saying. The odds that at least one of all these scenarios occurred are not small enough to grant historicity a significantly higher probability. They all have nearly the same prior probability, and none far enough outstrips the others in consequent probability.
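When several scenarios compete, BT compares all of them at once. The sketch below is illustrative only. The scenario labels paraphrase the possibilities discussed above, and treating them as mutually exclusive and exhaustive is a simplifying assumption; the function name `posteriors` and all the numbers are hypothetical placeholders, not estimates defended in the text. It shows that when priors and consequents are nearly the same across the board, no single hypothesis, historicity included, ends up significantly more probable than the rest.

```python
def posteriors(priors: dict[str, float], consequents: dict[str, float]) -> dict[str, float]:
    """Apply Bayes's Theorem across mutually exclusive, jointly exhaustive hypotheses.

    priors       -- prior probability of each hypothesis (must sum to 1)
    consequents  -- probability of the evidence on each hypothesis
    """
    if priors.keys() != consequents.keys():
        raise ValueError("priors and consequents must cover the same hypotheses")
    if abs(sum(priors.values()) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Weight each hypothesis by prior times consequent, then normalize.
    weighted = {h: priors[h] * consequents[h] for h in priors}
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("the evidence has zero probability on every hypothesis")
    return {h: w / total for h, w in weighted.items()}


# Hypothetical placeholder values: nearly equal priors and consequents.
priors = {
    "saying goes back to Jesus": 0.25,
    "separate traditions combined later": 0.25,
    "saying invented by Matthew": 0.25,
    "saying from visions or older messianic tradition": 0.25,
}
consequents = {
    "saying goes back to Jesus": 0.50,
    "separate traditions combined later": 0.50,
    "saying invented by Matthew": 0.45,
    "saying from visions or older messianic tradition": 0.45,
}

result = posteriors(priors, consequents)
for hypothesis, probability in result.items():
    print(f"{hypothesis}: {probability:.3f}")
```

Changing any one placeholder shows how much a prior or consequent would have to differ before one scenario clearly outstripped the others.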

When we return to the supposed “embarrassment” of inventing the story that a member of the twelve betrayed Jesus, we find that argument equally weak. As noted already, the evidence makes it unlikely that any member of the twelve, much less named Judas, engaged in any such betrayal as Mark depicts. No one seems to have heard of this before Mark. The betrayal story also makes no historical sense. The authorities did not need Judas (much less have to pay him) to find or identify Jesus (Mark 14:10–11, 14:43–50). Given what Mark has Jesus say in 14:49 (and what Jesus had been doing in Jerusalem only days before), the authorities knew what he looked like, and they could have seized him any time he appeared in public. They were not on a timeline. The idea that Jesus had to be tried and crucified illegally in a rushed overnight trial exactly at Passover is a Christian theological concept that cannot have had any role in the decisions of the Sanhedrin (especially since they had jails to hold him over in). Thus, the story as a whole looks like fiction.⁶⁸ The inclusion of Jesus’ foreknowledge of Judas's betrayal (Mark 14:17–21), directly in parallel to his prediction of Peter's betrayal (14:29–31), both framing the Eucharist and prediction of resurrection and apostolic abandonment, only highlights the mythic character of the entire plot element. Therefore, that it looks (or later became) embarrassing is not necessarily because the story is true. Indeed that must be improbable, as the subsequent trend was to make Judas's character and betrayal even more despicable (and hence more mythically grandiose), rather than apologetically softening or eliding it or explaining it away (or even, in fact, making it any more historically intelligible, which confirms later redactors had no genuine sources). This suggests Mark's invention of Judas's betrayal would not in fact have been embarrassing, because it was something later authors found rhetorically useful, and even amplified. 
And if them, so Mark.

The fact that Jesus’ betrayer's name essentially means “Jew” should already make us suspicious.⁶⁹ Mark may have intended him as a symbol of particular recent poignance. In both name and deed, Judas may be an intentional symbol of the very internecine betrayal that was destroying Jewish society and causing it to fail to realize God's kingdom, even just recently having caused the destruction of Judea, Jerusalem, and God's own Temple (if Mark wrote in the 70s CE, as most scholars now think). Judas was also a name famously associated with the path of violent rebellion (Judas Maccabeus and Judas the Galilean), which is all the more obvious an allusion if “Iscariot” is (as many scholars believe) an Aramaicism for the Latin “Sicarius,” the infamous “Killers” whom Josephus blames for provoking Rome to bring about the destruction of the Jews (which would further mean that Judas's full name meant in Aramaic “The Jew Who Kills [Him],” which one might think would be too coincidental to be historical). The name Judas may also be intended to evoke the divided kingdoms Judah and Israel, a symbol of Jews disunited and at war with each other, the more so if you agree that a number of indicators suggest Jesus is typecast in the Gospels as a symbol of Israel (as Thompson argues in convincing detail in The Messiah Myth), which alone could have inspired the creation of a Judah to oppose him. 
The text of Zechariah from which Matthew borrows many details of his expanded Judas story even contains this very juxtaposition, including the very name of Judas.⁷⁰ In Zechariah, the one who is paid the thirty shekels is to “become shepherd of the flock doomed to slaughter” (an apt description of Judas in respect to Jesus) and then, by abandoning the task (and the sheep to their death) and casting the money aside, to “break the brotherhood between Judas and Israel” (the very point of the Judas story: you can take the money and die, or follow Jesus and live, thus either joining the New Israel or the grave).⁷¹ Matthew thus saw this very symbolic value of the Judas story, which inspired him to exaggerate it with even more scripturally derived detail. Mark may have had the same idea all along. So whether this possibility is at all probable must first be explored before we can rule it out (I'll examine it again in my next volume).⁷²

So even here an EC argument gets little traction. Instead, the consequents fall in favor of fabrication for the Judas story, and even slightly in favor of fabrication for the twelve thrones saying, since the hypothesis of “fabrication” makes all the evidence more probable. If the priors favor neither, the consequents prevail, and we should conclude the Judas story is myth, and the twelve thrones saying only possibly authentic at best. Any attempt to gainsay this conclusion requires presenting evidence that ups the prior or consequent for historicity, or lowers either for fabrication. Thus, even if you disagree with this conclusion, only BT can lead you to the correct one. The EC alone is of no use.

And so on…

Similar arguments eliminate every other attempt to deploy an EC argument on the Gospel materials. That Jesus had enemies who slandered him, that Jesus went to parties with sinners to save them, that Jesus’ family rejected him, and so on, all face the same problems of self-contradiction (had they been a problem, they would have been removed or altered long before Mark even wrote), ignorance (we don't really know whether these stories were embarrassing to the communities who told them at the time they were first told), and self-defeat (any reason to preserve them if true can be just as much reason to fabricate them, and in every case we can easily construct plausible motives for their invention, which often make even more sense than the stories being true).⁷³

When EC arguments are scrutinized, some fall to the analysis that the experiences of Christians themselves in their battles, trials, and evangelizations were being mapped onto Jesus as a model to follow and commiserate with. Indeed, to imagine God suffered the same things you do is the highest form of vanity, hardly a statement against interest. Such constructs, moreover, are always rhetorically useful, and thus always well motivated. For example, Jesus being called crazy aligns too well with the fact that Christians themselves faced this charge—so how apposite to depict their Lord as being unjustly accused of the same, and then to supply him with clever speeches refuting it. That's simply too useful to be a statement against interest.⁷⁴ Christians similarly faced conflict from their families,⁷⁵ which statistically must have involved on occasion the same charge of insanity or demonic possession from them; so depicting their Lord as trading his family in for a new one as a result (Mark 3:21–35, and that in the very same scene) is again too convenient. Similar tactics have been employed by many a cult throughout history, dividing members from their established family, and representing the cult as their new “true” family. So “Jesus did it, too” would not be a statement against interest. To the contrary, it reinforces exactly what Christians wanted to preach (hence Luke 14:26).

Other EC arguments fall to the analysis that the evidence already argues the premise is false, whether the EC applies or not, like Jesus’ betrayal by a member of the twelve. Still others fall to the analysis that an embarrassment cannot even be established. The claim, for example, that Mark would not invent a story about women being the first to discover the empty tomb because the testimony of women would be too embarrassing, is based on claims about the ancient world that are simply not true.⁷⁶ Other EC arguments fall to the analysis that the facts depicted are so unbelievable that no matter how embarrassing they may have been we can still be certain they were never true. As Dennis MacDonald observes of the unbelievable fickleness and stupidity of Jesus’ disciples, an EC argument cannot sustain the belief that they were really like that, because no human beings are. The depiction is so contrary to any plausible reality it can only be fiction. And MacDonald makes a good case for what the literary function of that fiction was.⁷⁷ Paul Danove confirms MacDonald's point with his own demonstration that this is yet another example of Mark's deliberate use of irony to make a point.⁷⁸ Here, it's much more improbable that such stories would be true than that any embarrassment they caused would prevent them being told. In other words, the consequents favor invention, not historicity.

There are also some EC arguments that fall to the analysis that even though an event may be realistic in principle, its depiction is so literarily crafted we should be far more suspicious than the EC would have us be. Peter's threefold denial, for example, reads like a morality play, as if something out of Shakespeare rather than real life, with dramatic features more befitting a novel than a history. That his three denials (a fact odd to focus on in such detail) contrast too appositely with the women's three acts of loyalty (attending the cross, burial, and resurrection) is likewise suspicious. We have to ask why this story is told at such length and with such unusually meticulous detail—indeed, why it is told at all. Mark would not have included this story (much less composed it so carefully) unless he wanted to. So why did he want to? The answer is unlikely to support its historicity. Even assuming the naive view that, as tradition claims, Mark is simply recording what Peter preached, why would Peter be preaching this? Could it be because he found the story useful as a missionary? Yet such utility would attach to a fiction as much as a truth.

That all these EC arguments fail does not entail these claims are unhistorical. They may yet be rescued by other arguments. But that can only be accomplished by applying BT.

WHAT ARE WE TO DO?

So when is an EC argument successful? Certainly it isn't always unsound or invalid. It can work in a court of law, and often finds successful use in every historical field (even if called by other names). I've relied on it myself. But the only way to deploy a successful EC argument is to avoid or overcome all the problems surveyed above and then produce a logically valid and sound argument.

First, you must reliably know if the statement in question very probably did go against its author's interests, that the author actually perceived that it would, and that the statement did not serve other interests the author had which he may have regarded as outweighing any other consequences he perceived to be likely. And that means you must reliably know what an author's interests actually were, and not just in general, but that particular author, in that particular book, in that particular scene (and in that particular community at that particular time), and you must reliably know what the author perceived the consequences of his statement would be—not just what they actually were, because an author might have not foreseen those consequences, or underestimated their severity. And then you cannot simply rest on the expected negative consequences of a statement, because an author may have had overriding interests. So you must reliably know how that author would have weighed the pros and cons he was aware of at the time, especially if he thought the cons could easily be explained away or overcome in extra-textual discourse. And if you can establish all that, you're not done. For you must also reliably know if the author was even in a position to know the statement was actually true. Because that an author believed it was true does not entail it was.

Meeting all these conditions can be difficult, especially in the study of Jesus. You also need a specific theory as to why the questionable statement was included at all. And then you need to test that theory against other theories of why it may have been included—at the very least, you must test it against the most likely of those alternatives. You have to ask: even if it was historically false, what is the most likely reason that that statement could have been included, by that particular author, in that particular book, in that particular context? The answer to that question is the theory you will test against your own, which you must also spell out exactly—this being usually that the most likely reason the statement was included was that it was true but this author couldn't omit it or change it despite having ample reason to on account of its embarrassing nature. But that requires explaining why that author could not omit it or even change it (and why no one else could, in all the decades before). Because some explanations of that odd fact will be far more far-fetched than others (and for any explanation, the more far-fetched, the lower its prior probability must be), and because you are obligated to prove that that explanation was in fact guiding that author's construction of the text (because, again, the less certain you are that it was, the lower its prior probability must be), and because some explanations will provide just as much reason to invent a fact as to have reported it if true (which then renders your theory's consequent probability no higher than the alternative).

In other words, you can't just look at some statement x, check the box “EC applies,” and then conclude “x is probably historical.” You have to show that the EC not only actually does apply, but also that it actually does have this effect. And to prove the latter you first have to show (or be able to show) that the EC increases the probability of x at all, and then you have to show (or be able to show) that it not only increases it, but increases it enough to bring it to some degree of historical certainty. And you must do this in a logically correct way—which requires taking alternative explanations into account, and all available evidence and background knowledge. And even after achieving all of that, you cannot fallaciously tout the false dichotomy “true or false,” but you must honestly acknowledge degrees of certitude, that is, is x only somewhat probable, or very probable, or nearly certain? Because it makes a substantial difference which is the case, particularly if you plan to use x as a premise in another argument. Historians cannot hide from all these obligations. Because they all have consequences, and you cannot responsibly hide from the consequences of your own arguments and assumptions.

Hence to get an EC argument to work, we must first answer the question of when any method of criteria is logically valid and sound. Any method that conforms to BT will be valid. So as long as its premises are then all soundly demonstrated, a BT-structured EC argument will be valid and sound. Applying the logic of BT to the EC, we must first ascertain if our author is just reporting the facts as witnessed by or told to him (or what he purports to be those facts) or constructing a story for some purpose other than making a record of what happened (such as producing myth, fiction, or any other form of storytelling whose aims are more subtle than superficial veracity). For if the latter is the case, the EC no longer applies at all, because then the probability that an author will knowingly include any embarrassing statement is practically zero, as he will only include what he wants to include. Which means if something got included, it cannot have been a statement against interest. Only in the former case (of attempted historical reporting) will someone include anything like statements against interest, either out of sincere neutrality or in the effort to make sense of unexpected facts or because it cannot be avoided—or some equally compelling reason.

But even in a case of genuine reportage, authors still limit what they say to what serves their interests. So it is still unlikely they will include any embarrassing statement unless they need to or want to for some reason. And again, such needs and desires might just as easily motivate a fabrication. And if not that, then such needs and desires will still entail the author will most probably have protected his interests by apologizing for the statement, or defending it, gainsaying it, refuting it, or attempting to spin it, thus entailing predictions about the way the evidence will be presented, which, if that's not what we find, entails a lower consequent probability. So we must attend very carefully to the context in order to ascertain why the author included a seemingly embarrassing statement, in order to ascertain if he really was forced to against his interests, or if that statement instead served his interests, or if we should expect the evidence to have been presented differently (if it didn't serve his interests to report it without comment). And remember we cannot merely “declare” the author had some particular reason to report it—we are obligated to demonstrate that that reason was in fact operating on that author, or probably was. Otherwise, it's no more likely than any alternative motive.

Attending to all that we must address the three premises of BT. The first is prior probability. For that you must answer both principal questions: whether the author was writing myth or, if writing history, doing so with remarkable candor. In other words, you must ascertain how frequently that author, in that document, particularly in analogous literary and narrative contexts, fabricates data rather than reports what he learned from reliable historical sources. If there are many instances of doubtless fabrication or the use of unreliable sources or methods, then the prior probability that this author reports data because it's true (rather than only because it suits his story, or his source's story) is low even if he purports to be recording what happened. Thus, even if you can establish he was writing history and not myth, ascertaining that intent can only get you halfway. You must still ascertain the degree of his honesty (and the degree to which he verified and thus really knew the truth—rather than merely repeated what he was told). And though many scholars wish to avoid the question, the fact of the matter is the Gospels provide considerable evidence that their authors’ honesty was not exemplary (and there is no clear evidence of their taking pains to verify anything, either, or even of their having reliable sources at all).⁷⁹ And if you grant that, then you must further grant that it is very unlikely anything that seems embarrassing was included because it was true, for such an author (or his source) more likely had a reason to include it even if it was false—otherwise (once we've granted their lack of candor) we must confess they would not have included it at all (and if an author shows no effort at checking facts in any reliable way, even if he's being completely honest he may be repeating a fiction, by merely believing it a fact, or having no way to confirm it's not).

If, on the other hand, you wish to deny this conclusion, and insist instead that the Gospel authors spoke with remarkable honesty (and reliably researched every detail), you must prove this first—a daunting task, given that we can demonstrate that they knew of each other's work, and also cannot have been ignorant of the same traditions the others recorded (if so they did), and yet still contradict each other freely, without even acknowledging the contrary accounts. Otherwise, how did some authors know the traditions they record, while other authors never heard of them? To claim that one author had access to a tradition not passed down to the social circles of another author is to admit the Christians were working with isolated, unchecked, and highly deviant traditions, which could not be corrected if false as other Christians didn't even know of them. Which then pushes back the question of an author's honesty to that of their sources, which is even more difficult to evaluate, because these authors never tell us who their sources were, or how they corroborated what they said (or even whether they did)—hence we have no information at all as to the honesty and reliability (or even interests) of those sources.

So unless you have demonstrated otherwise (and I would argue no one sufficiently has in this case), the prior probability that what seems to be an embarrassing statement in the Gospels is true is actually low, not high. In any friendly tradition, most authors simply will not have included embarrassing truths; but all will have readily included embarrassing myths that served a literary purpose, since the motive is reversed: authors want to include such stories, not to suppress them, unlike embarrassing truths, most of which authors will want to suppress. The frequency of embarrassing myths in friendly sources that serve a literary purpose is essentially 100 percent. Because such myths will only be created and repeated to serve a literary purpose, therefore all of them will serve a literary purpose. But the frequency of embarrassing truths that also serve a literary purpose is far from 100 percent. Most such truths will not “coincidentally” be convenient to tell—this is, in fact, the assumption on which EC reasoning is based, so if it weren't true, no EC argument would be valid.

We next must ask, how many embarrassing truths were there, in ratio to embarrassing myths that were found useful to tell? Since we don't know the answer to that question (certainly not a priori), not even whether there were more or fewer of either, the principle of indifference entails we cannot assume there was more of either (again, until we can prove otherwise). From these facts the conclusion follows: the frequency of embarrassing stories preserved that are true (among all embarrassing stories) will be low, not high. Hence, most such stories will be false. It's simple math: Even starting with equal numbers of each—say, ten and ten—if few of the former will be preserved but all of the latter will—let's say, only two in the first case but all ten in the second—then the prior probability that any surviving embarrassing story is true will always be low—in this case, two out of twelve (two plus ten equals twelve embarrassing stories preserved, of which only two are true), which equals one in six, less than 17 percent.

This would not be the case when a neutral or hostile tradition exists that is highly reliable and well known, such that a friendly source cannot easily ignore it or change it. Thus, if we can show an embarrassing claim had already been reliably and widely established by a neutral or hostile source by the time a friendly author wrote, then we can reverse this probability, because the frequency of those claims that will be true will be much greater. This does not follow when a neutral or hostile tradition exists that is not highly reliable (or dates after a friendly source and merely responds to it), because such traditions have their own tendency to fabricate embarrassing stories about a subject (to which a friendly source can respond by a variety of strategies, not just gainsaying or refutation, but even continued fabrication—because, after all, the friendly source might no more know the truth of the matter than their opponent did). But when it comes to the historical Jesus, we have no neutral or hostile sources of any kind (apart from much later critics who had no access to any information about Jesus not provided by the Christians themselves), and of any such traditions that might have existed before the Gospels were written (such as any we might try to infer from the Gospels or Epistles), we can establish none as reliable, early, or widely known. So we're back to the original probability.

Thus, for the Gospels, we're faced with the following logic. If N(T) = the number of true embarrassing stories there actually were in any friendly source, N(~T) = the number of false embarrassing stories that were fabricated by friendly sources, N(T.M) = the number of true embarrassing stories coinciding with a motive for friendly sources to preserve them that was sufficient to cause them to be preserved, N(~T.M) = the number of false embarrassing stories (fabricated by friendly sources) coinciding with a motive for friendly sources to preserve them that was sufficient to cause them to be preserved, and N(P) = the number of embarrassing stories that were preserved (both true and fabricated), then N(P) = N(T.M) + N(~T.M), and P(T|P), the frequency of true stories among all embarrassing stories preserved, = N(T.M) / N(P), which entails P(T|P) = N(T.M) / (N(T.M) + N(~T.M)).⁸⁰ Since all we have are friendly sources that have no independently confirmed reliability, and no confirmed evidence of there ever being any reliable neutral or hostile sources, it further follows that N(T.M) = q × N(T), where q << 1, and N(~T.M) = 1 × N(~T): because all false stories created by friendly sources have motives sufficient to preserve them (since that same motive is what created them in the first place), whereas this is not the case for true stories that are embarrassing, for few such stories so conveniently come with sufficient motives to preserve them (as the entire logic of the EC argument requires). So the frequency of the former must be 1, and the frequency of the latter (i.e., q) must be << 1. Therefore:

P(T|P) = (q × N(T)) / ((q × N(T)) + N(~T))

So even if we substitute 0.5 for q (which is much too high, as it would entail that half of all embarrassing truths brought no sufficient motive to suppress them, directly contradicting the basic assumption of an EC argument), this produces:

P(T|P) = (0.5 × N(T)) / ((0.5 × N(T)) + N(~T))

So if N(T) = N(~T), then:

P(T|P) = (0.5 × N(T)) / ((0.5 × N(T)) + N(T)) = 0.5 / 1.5 = 1/3

Or in other words, the number of embarrassing stories preserved that are true will be no greater than 1 in 3, which means that at least 2 in 3 (in other words, most) will be false. If for q (the proportion of all embarrassing truths that brought no sufficient motive to suppress them) we substitute 0.25 instead of 0.5, then these final fractions become 1 in 5 and 4 in 5, respectively—and 0.25 is already unbelievably high. It is hardly credible that as many as 1 in 4 embarrassing truths are convenient to tell. Surely more than 3 in 4 such truths entail a sufficient motive for a friendly source to suppress or forget them. If we concluded, instead, that at least 9 in 10 would, which is more credible (and advocates of the EC have certainly behaved as if the frequency must have been that high, for them to express such certainty that preserved embarrassments must be true), then the fraction of surviving embarrassing claims that will be true would be 1 in 11 (or only 9 percent). And one might even suspect that's too high. Is it really unreasonable to think that fewer than 1 in 100 true embarrassing stories are going to be “convenient to tell”?
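
The arithmetic in this passage can be checked with a short script. What follows is only a minimal sketch, not part of the original argument: the function name and its parameters are hypothetical, and it assumes (as the text does) that every fabricated embarrassing story is preserved while only a fraction q of the true ones is.

```python
from fractions import Fraction


def fraction_true_among_preserved(q, n_true=1, n_false=1):
    """Return P(T|P) = q*N(T) / (q*N(T) + N(~T)) as an exact fraction.

    q       -- assumed share of true embarrassing stories that get preserved
    n_true  -- N(T), number of true embarrassing stories (illustrative value)
    n_false -- N(~T), number of fabricated ones, all assumed to be preserved
    """
    q = Fraction(q)
    preserved_true = q * n_true  # this is N(T.M) in the text's notation
    return preserved_true / (preserved_true + n_false)


if __name__ == "__main__":
    # Values of q discussed in the text, with N(T) = N(~T).
    for q in ("1/2", "1/4", "1/10", "1/100"):
        p = fraction_true_among_preserved(q)
        print(f"q = {q}: P(T|P) = {p} (about {float(p):.1%})")
```

Passing q as a string such as "1/10" keeps the result exact, so the output can be compared directly with the fractions quoted in the text. Changing n_true and n_false shows how the result moves when N(T) and N(~T) are unequal.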

Thus as N(T.M)→0, so does P(T|P). Only if N(T) >> N(~T) do we get P(T|P) > 0.5. Which means N(T) must be greater than N(~T) by as many times as 1 is greater than q.⁸¹ In other words, the more frequently in general that embarrassing truths have sufficient motives to suppress them, the more embarrassing truths about Jesus there must have been than embarrassing but useful myths about Jesus in order for most of the embarrassing stories preserved to be true. And yet we know, more likely than not (and certainly so far as we know, absent circular logic), N(T) ≈ N(~T). In fact, it's very possible that N(T) << N(~T), since one can fabricate countless embarrassing but useful myths, whereas there are only so many embarrassing things that can actually have happened to someone. That would make the prior probability that embarrassing stories in the Gospels are true extremely small. But no matter what the actual ratio, surely the number of useful embarrassing myths that can be conceived is always far greater than the number of actual embarrassing things that could have happened. So in no case is it ever likely that N(T) > N(~T).
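
The threshold stated here follows in a few lines from the formula for P(T|P) with N(T.M) = q × N(T). As a worked check, using the same symbols and assuming q > 0 and N(~T) > 0:

```latex
\begin{align*}
P(T \mid P) > \tfrac{1}{2}
  &\iff \frac{q\,N(T)}{q\,N(T) + N(\sim T)} > \frac{1}{2} \\
  &\iff 2q\,N(T) > q\,N(T) + N(\sim T) \\
  &\iff q\,N(T) > N(\sim T) \\
  &\iff \frac{N(T)}{N(\sim T)} > \frac{1}{q}
\end{align*}
```

So with q = 0.1, for instance, true embarrassing stories would have to outnumber useful fabricated ones by more than ten to one before a preserved embarrassing story is more likely true than not.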

To make any headway toward reversing this judgment you must advance some theory as to why, despite all this, the originating author was compelled to include that detail anyway (if it were indeed known to be true and embarrassing to them), and then from that theory you must estimate how compelled the author was. From this you can derive a narrower reference class from which to revise the prior (on which see chapter 6, page 231) or generate a difference in consequents that overcomes the prior. But you can't just make up any reason willy nilly. Because if you can't prove that that reason was actually operating on that author, or even likely to be, then your theory's initial probability will be lower, not higher (as the probability space must then be evenly divided among several competing theories, representing all the other motives just as likely operating on that author instead of yours, as explained for all ad hoc reasoning in chapter 3, page 80). That means your prior probability (which, as just argued above, is in this case already low) must be adjusted even further downward to reflect this fact. On the other hand, any reason you can demonstrate was operating on that author, or very likely to have been, will raise the prior only if it makes that author more likely to report that fact if true than to fabricate it if useful. Of course, whatever reason we demonstrate an author had must entail that the strength of his impulse to include a detail despite its embarrassing nature was stronger than any impulse he had to omit it, as that's the only condition in which an embarrassing statement will be included against an author's interest. And that impulse must be strongly correlated with that statement being true (as otherwise an author may feel compelled to include it thinking it's true, when in fact it's not—hence, we still must attend to the second prong of a proper EC test).

So unless we can demonstrate otherwise in any particular case, or unless we can demonstrate an author to be remarkably honest and reliable in making claims about Jesus generally, the prior probability is low that an embarrassing story in the Gospels is true. Most of those stories will be in them because they were useful to create; few will be in them because they were true. Then you must estimate the consequent probabilities of the evidence given your competing theories (which theories are that an author or source had reason to fabricate the statement vs. that an author made that statement only because it was true). How likely is it that that detail would have been included by that author, in that text, in that context, and in that way, given that it's true? The overall evidence might not fit that hypothesis so well. If, for example, we should expect an embarrassing fact to be reported only in conjunction with some sort of apologetic, yet it's not, then P(e|h.b) is low. For on the hypothesis that an author reported something he didn't want to report because he was compelled to for some reason, odds are the author would express this, or in some fashion explicitly defend his interests against the implications of the embarrassment, which means if this didn't happen, then the evidence is not as expected on our hypothesis.

Still, in some cases the evidence can fit our hypothesis fine, being entirely what we should expect (entailing P(e|h.b)→1). The remaining problem is that in all the cases we find in the Gospels, there are also good theories about why Mark (or his sources) would fabricate the allegedly embarrassing detail, which entails P(e|~h.b)→1. So even if we find a version of a story in Mark exactly as we should expect it to appear (with evidence of his embarrassment, let's say), it still might not be credible. But usually such “evidence of embarrassment” won't be very explicable on any alternative explanation, and thus the consequent probabilities will favor an embarrassing story's being true in that case, provided we can establish that Mark probably believed it because it was true, rather than for some other reason. Yet Mark never tells us anything about his methods or sources or the warrants for his beliefs, nor can we securely deduce any of these things. Hence, it seems we are always stymied by one or the other prong of a proper EC criterion: for any given claim, either we can't show it was against Mark's interests to claim it, or we can't show he would have reliably known it was true (whether it embarrassed him or not). And as for Mark, so for his sources.
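
A short worked equation may help show why it matters that both consequents can approach 1. It uses the standard form of BT, with h standing for "the story is true" and ~h for its negation. It is offered as an illustration only, not as a step taken in the text itself.

```latex
\[
P(h \mid e.b)
  = \frac{P(h \mid b)\,P(e \mid h.b)}
         {P(h \mid b)\,P(e \mid h.b) + P(\sim h \mid b)\,P(e \mid \sim h.b)}
\]
\[
P(e \mid h.b) = P(e \mid \sim h.b) = 1
\;\Longrightarrow\;
P(h \mid e.b) = \frac{P(h \mid b)}{P(h \mid b) + P(\sim h \mid b)} = P(h \mid b)
\]
```

When fabrication explains the evidence as well as truth does, the evidence changes nothing, and the posterior simply equals the prior.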

But the main obstacle in Jesus studies is that strong plausibility of alternative explanations. Thus Craig Evans's attempt to reformulate the EC as “authenticity is supported when the tradition cannot easily be explained as the creation of the Church in general” at least correctly acknowledges the importance of establishing a low P(e|~h.b).⁸² This is even a nice example of a coded phrase in English (“cannot easily be explained”) representing what is in fact mathematical reasoning: that the consequent probability for ~h (that the material in e is “the creation of the Church in general”) must be low. But establishing such in the case of the Gospels is a lot harder than most historians casually assume. And the consequent probability for h (that the material in e is “the product of having actually happened”) must still be high (which it often is not, e.g., the Judas story, which makes no sense as actual history—and that, being highly contrary to expectation on any hypothesis that it actually happened, entails a low P(e|h.b)).

Often the prior probabilities must favor h over ~h as well, which requires (among other things) the prior demonstration that the Gospels are honest historical records rather than deliberately constructed myths. Such a prior demonstration must conform to BT and could logically consist of beginning with neutral priors (50/50, favoring neither theory about the nature of the text) and then running case after case until the trend is visible (either mostly history, mostly myth, or equal parts either), the outcome of each case becoming the prior probability in the next. That trend will then equal the prior probability of meeting that same trend again in the next case, the result of which will then become the prior probability for the next case, and so on (this is a mathematically valid procedure—which I'll discuss again in chapter 6, page 239). Absent such a proper demonstration, as previously shown, the priors will favor fabrication in every case of seeming embarrassment. In fact, for embarrassing elements, that genre-based prior would normally be your initial prior in any such analysis (not a blind 50/50 probability), which means this process of iterating the analysis case after case would have to repeatedly and strongly favor truth-telling to get that prior probability up to any respectable value in the case of the Gospels, just as with any other sacred stories for any other religion.
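
The iterative procedure described here, in which each case's outcome becomes the prior for the next, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, h stands for "the text is an honest historical record," and the consequent probabilities in the usage example are invented numbers, not estimates drawn from any actual analysis of the Gospels.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Apply Bayes's Theorem once for a hypothesis h and its negation ~h."""
    numerator = prior * p_e_given_h
    # Raises ZeroDivisionError if both consequents are 0 (evidence impossible
    # on either hypothesis), which this sketch does not try to handle.
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)


def iterate_cases(cases, prior=0.5):
    """Feed each case's posterior forward as the prior for the next case.

    cases -- iterable of (P(e|h.b), P(e|~h.b)) pairs, one pair per case
    prior -- starting prior for h; 0.5 is the neutral 50/50 starting point
    """
    history = [prior]
    for p_e_given_h, p_e_given_not_h in cases:
        prior = update(prior, p_e_given_h, p_e_given_not_h)
        history.append(prior)
    return history


if __name__ == "__main__":
    # Invented consequents for three cases, each modestly favoring ~h.
    hypothetical_cases = [(0.4, 0.8), (0.5, 0.9), (0.3, 0.6)]
    for step, p in enumerate(iterate_cases(hypothetical_cases)):
        print(f"after case {step}: P(h) = {p:.3f}")
```

Starting from a genre-based prior instead of 50/50 only requires passing a different value for prior. The sketch treats the cases as independent pieces of evidence, which is itself an assumption a real analysis would have to defend.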

Either way, what we are still faced with is a question of a balance of three different probabilities. We can then use BT to produce a conclusion. But since, as a result of all the above, the conditions for a successful EC argument are so rarely encountered, and probably never encountered in the Gospels, these probabilities never favor historicity enough, so the Criterion of Embarrassment can have little or no use in reconstructing facts about the historical Jesus.

COHERENCE

All the other criteria suffer the same defects. The Criterion of Coherence assumes that anything that coheres with what has been established with other criteria is also historical. “If a saying or story is consistent with the picture of Jesus that is emerging from well-attested material, then it may be considered likely to have been historical,” because “it is consistent with what has already been discerned” as originating with Jesus.⁸³ But this is illogical.⁸⁴ Coherent material can be fabricated precisely because it coheres with other beliefs about Jesus, or even for the specific purpose of cohering with them.

Liars tend to prefer their lies to be coherent, and when telling new lies, build on old ones. Even more innocent mechanisms of legendary development follow the same principles (“but that's just what Jesus would have done, isn't it?”). Indeed, the usual trend in fabrication over time is toward harmonization—in other words, the creation of coherence (the textual tradition of the Gospels even shows this explicitly occurring). Everyone knows “good fiction is often just as ‘coherent’ as historical fact.”⁸⁵ Indeed it can even be more so—for coherence is easy to create by design, whereas real historical people and events are often evolving, complex, unpredictable, or actually in fact incoherent. It's easy not to understand what people said and did, to see such events as incoherent or inexplicable. It's then much easier to make it all cohere after the fact, imposing structure and consistency on what actually had none (or whose structure and consistency actually escaped you, and thus was replaced by structure and consistency of your own imagining). As a result, although failing to cohere with established facts might (at least in some cases) raise the priors or consequents of “fabrication” hypotheses, cohering with established facts does not raise the priors or consequents of hypotheses of “historicity” by any relevant amount—since coherence is just as common and expected on hypotheses of fabrication.

As Anthony Le Donne aptly puts it, this criterion suffers from a one-two punch: “it presupposes that certain characteristics of Jesus are of little dispute” yet “such characteristics are very few,” and even those are suspect (as is becoming clear throughout this chapter—and as even Le Donne himself warns, “this criterion has a tendency to confirm the presuppositions of the scholar” rather than any actual facts); and even when granting a characteristic as independently established, “it is possible that a characteristic well known to early memory may have bled into other episodes during narration” and thus reflect accidental fabrication, not historicity.⁸⁶ Add to that the occurrence of intentional fabrication and its tendency to cohere, and coherence just isn't a reliable indicator of the truth.

It might be assumed that prior confirmed cases of some property x would increase the prior probability of another case of x being true, but in this case that doesn't follow. A story found in an author whose stories often turn up confirmed will have a higher prior probability of being true; but the mere fact of “a” detail cohering with “some other” detail in the overall “tradition history” of all the stories preserved does not have that effect, precisely because not all authors are reliable, nor are all sources, and unreliable authors and sources will produce (or reproduce) fictions that cohere with tradition more than amply. Thus, the mere fact of cohering with established facts is insufficient to make a story more likely to be true. If, for example, unreliable sources transmit as many false cohering stories as reliable sources do true ones, and we cannot confirm there are more reliable sources behind our accounts than unreliable ones, then so far as we know the set of all cohering stories will contain as many fictions as truths, making no difference to the odds that any particular story will be true. Only if you can demonstrate that these ratios fall out differently can you get coherence to change the odds of some detail being true. And in the study of Jesus, we generally don't have that kind of information.
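The point about ratios can be checked by simple counting. The following sketch assumes a hypothetical pool of stories, half true and half false, and asks what share of the cohering stories is true under two different assumptions about how often fictions cohere. The function name and all counts and rates are invented for illustration.

```python
def share_true_among_cohering(true_stories, false_stories,
                              cohere_rate_true, cohere_rate_false):
    """Fraction of all cohering stories that are true, by simple counting."""
    cohering_true = true_stories * cohere_rate_true
    cohering_false = false_stories * cohere_rate_false
    return cohering_true / (cohering_true + cohering_false)


# Hypothetical pool: 100 true stories and 100 false ones, so the odds
# before considering coherence are even.
base_rate = 100 / (100 + 100)

# Case 1: fictions cohere with tradition just as often as truths do.
same_ratio = share_true_among_cohering(100, 100, 0.9, 0.9)

# Case 2: fictions cohere far less often (something that would have to be demonstrated).
different_ratio = share_true_among_cohering(100, 100, 0.9, 0.3)

print(base_rate, same_ratio, different_ratio)
```

Only in the second case does learning that a story coheres move the odds at all, and that case requires information about the sources that, as noted, is generally unavailable.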

Even at best, in every case you have to attend very carefully to the evidence. For example, it's sometimes assumed that Matthew 10:5–6 and 15:24 (where Jesus says only Jews are to be evangelized) must be authentic because “we know” that Jesus didn't call for any mission to the Gentiles (because according to Galatians that mission was initiated by Paul after, as Paul himself says, Jesus had died). But there might be no reason to assume Jesus ever had cause to tell anyone not to go into Gentile communities to evangelize them (why would that even occur to his disciples and thus have to be explicitly prohibited in the first place? And even if it occurred to them, why would they go do such a thing on their own initiative, rather than await his specific orders?). In contrast, we know (from the way he redacted Mark) that the writer of Matthew is up against a Gentile Christian community he wants to discredit, so he clearly had motive to invent these sayings for Jesus, to further his agenda.⁸⁷ So the fact that they cohere with what may be an established fact about Jesus does not lend any weight to these sayings being historical. To the contrary, the evidence suggests they probably aren't (e.g., we can see Paul never had to answer anyone quoting these sayings of the Lord against him, yet surely they would have, so the consequent probability of all the evidence together is actually higher on the hypothesis of fabrication). 

On the other hand, Matthew's seemingly contradictory endorsement of a mission to the Gentiles (in Matthew 28:19) is no more likely to be unhistorical because it fails to cohere with what we know from Galatians—because Matthew does not mean what Paul was doing (converting Gentiles straightaway, without first converting them to Judaism through circumcision and dietary laws), but what we know Jesus’ disciples were already doing even before Paul came along (allowing Gentiles who become Jews to join the Christian community, as is clearly attested in Galatians they had already done before Paul and were ready to continue doing), which is what many Jews in antiquity did. The idea of evangelistic Jews sounds odd today, but only after thousands of years of anti-Semitic legislation and hostility have chastened Jews into giving up the active pursuit of converts as being bad for their health. But in antiquity they were often welcoming or even pursuing Gentile converts.⁸⁸ It's entirely possible Jesus preached the same practice (it was, after all, God's law [Exodus 12:48], “not one jot or tittle” of which Jesus intended to abolish, according to Matthew 5:17–18). Because if he did, that would in no way fail to cohere with the evidence in Galatians where the Apostles before Paul were doing this very thing already; and certainly Matthew consistently imagined such a mission throughout his Gospel (Matthew 24:14, 25:31–46). Of course, all of that still doesn't mean any of these sayings are or aren't historical, only that we can't rule them in or out using the Criterion of Coherence, as many scholars have attempted to do.

And that's even assuming you actually have established facts to cohere with (or not). The greatest folly in applying this criterion is the same bootstrapping fallacy critiqued earlier: “cohering” with a “fact” established by an invalid (or invalidly applied) criterion cannot legitimate another fact. Yet most “historical Jesuses” are constructed from exactly such a house of cards. Hence, the Criterion of Coherence is the most insidious of them all.

MULTIPLE ATTESTATION

“If a tradition is attested in more than one strand of the tradition” then “it is more likely to be authentic,” as long as these strands are “independent layers of the tradition.”⁸⁹ This is often a sound principle commonly employed throughout the field of history. But it has to be applicable—and applied correctly. And yet in Jesus studies, “relatively few individual units of the tradition are attested in more than one strand,” and even in those few cases establishing independence is hard to do.⁹⁰ It was long thought that the Gospel of John is independent of the Synoptics, but a growing body of evidence argues otherwise, so John's independence can no longer be reliably assumed.⁹¹ Some scholars even argue that Mark knew Q (which is entirely possible: just because he rejected much of it doesn't mean he wasn't aware of it or didn't use it) or that there was no Q, only Matthew's expansion of Mark.⁹² And hardly any of the extrabiblical evidence for Jesus is independent of the NT [New Testament], and most of the evidence that even so much as might be independent of the NT is universally rejected as fabricated (e.g., the Infancy Gospels; 3 Corinthians; the Epistle of Jesus to Abgar), and thus can hardly count as “multiple attestation” (a fact that should already caution against assuming the canonical texts are any more trustworthy than these).⁹³

Those facts put any reliance on the Criterion of Multiple Attestation on very shaky ground, particularly for the reasons just noted for the Criterion of Coherence: we should actually expect multiple attestations to be fabricated. Hence the Infancy Gospels “corroborate” that Jesus was a great miracle worker, yet we know full well this evidence is fictional—and thus doesn't corroborate anything. And even when we have something like a credible instance of independent attestation, that only proves that the corroborated datum originates in an earlier source, not that it originates with a historical fact. Because “multiple attestation…does not exclude the possibility of creation by Christians” at earlier stages of development prior to the documents we have.⁹⁴

A classic example is the fact that we have impressive multiple attestation of the labors of Hercules (and in antiquity this was even more the case, as many texts now lost are known to have recounted them), yet no one believes this makes those labors even “more probable,” much less believable. The story was fabricated long before our written sources (probably long before even their written sources), spawning numerous independent lines of legendary development, which each came to be independently recorded later on, an outcome that in no way makes the story more credible. The same clearly happened to the Christian tradition before the Gospels were even written. For example, it appears two separate and contradictory legends developed about the suicide of Judas (Matthew 27:3–10 vs. Acts 1:18–20), yet this could have resulted just as easily from an originating fiction as from an originating fact (and as I argued earlier, it most likely resulted from fiction: see page 152). Thus its multiple attestation does not establish its historicity (and if Luke's version is a deliberate rewrite of Matthew's, we don't even have multiple attestation). A similar datum can also originate independently because of a common motive rather than a common source, to explain a shared problem in the text or to defend a shared doctrine or goal. For example, many scribal emendations of the NT manuscripts produce the same or similar results even when the scribes were not aware of each other, simply because they saw the same problems and devised similarly obvious solutions. Though this will usually be less likely, it still has to be ruled out.

For all these reasons it's simply not true that, as Marcus Borg claims, “if a saying or story appears at least twice in traditions that are early and independent of each other, that is a very good reason for thinking that the gist of it goes back to Jesus.”⁹⁵ Because it is an equally good reason for thinking that the gist of it goes back to an originating myth (or even a revelatory dream or vision), or an earlier storyteller's innovation. For example, Borg's own example of the general fact of Jesus performing “healings and exorcisms” could just as easily be an invention to idealize and legitimize the fact that early Christian communities were engaging in healings and exorcisms. After all, Paul attests to such activities in the earliest churches, yet never attests to such things ever having been done by Jesus (try as you might, you won't find the latter in his letters, but you'll find many hints of the former). If a myth grew during this time that Jesus did them, too, it could easily have spread to independently inspire different stories in the Gospels, just like the labors of Hercules. And that's not the only possibility. For if Q is actually just the elaborations on Mark produced by Matthew (as Goodacre argues—remember, Q is entirely hypothetical, it is not a document we actually have), then Jesus being a healer and exorcist could have been an outright invention of Mark, which thereby inspired all subsequent iterations of the theme—because then it does not appear even twice in traditions “that are early and independent of each other,” but only in Mark, and in documents long post-dating and even employing Mark. The fact is, these questions are not as settled as Jesus historians often claim.

When does Multiple Attestation argue for historicity? Only when the fact of multiple attestation entails P(e|h.b) is substantially higher than P(e|~h.b). Which only occurs when we can establish (to some degree of probability) that two extant testimonies to the same claim derive (at least ultimately, if not directly) from independent eyewitnesses of the fact attested, or from one eyewitness (or group of eyewitnesses) whose reliability on that claim is demonstrably more likely than their lying or being in error. But rarely can we ascertain even who an author's source is, much less to which eyewitness it can ultimately be traced, and we can rarely assert someone is reliable when we don't even know who they are. Even when we do know, it would be naive to merely presume their reliability; and establishing it is often impossible. Which is why hearsay is almost never even admitted as evidence in a court of law, and why modern historians of antiquity are often skeptical of all but the most public or mundane of claimed “facts.”⁹⁶
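Why independence matters so much here can be shown with a small sketch using the odds form of BT. It assumes a hypothetical prior and a hypothetical ratio P(e|h.b) / P(e|~h.b) for a single attestation. Two genuinely independent attestations multiply that ratio, whereas a second attestation that merely derives from the first adds nothing. All names and values are invented for illustration.

```python
def posterior(prior, likelihood_ratio):
    """Odds form of BT: posterior odds = prior odds * P(e|h.b) / P(e|~h.b)."""
    odds = prior / (1.0 - prior) * likelihood_ratio
    return odds / (1.0 + odds)


prior = 0.2          # hypothetical P(h|b)
single_ratio = 3.0   # hypothetical P(e|h.b) / P(e|~h.b) for one attestation

# Two independent attestations: the ratios multiply.
independent = posterior(prior, single_ratio * single_ratio)

# The second source depends on the first: it contributes no new ratio.
dependent = posterior(prior, single_ratio)

# Attestation that is just as expected on fabrication (ratio of 1) changes
# nothing, no matter how many times it is repeated.
uninformative = posterior(prior, 1.0 ** 5)

print(round(independent, 3), round(dependent, 3), round(uninformative, 3))
```

On these assumed numbers the independent case moves the probability well past even odds, the dependent case does not, and the uninformative case leaves the prior where it started.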

Notably, the most persuasive cases of multiple attestation are when our sources are diverse in type—as exemplified in my analysis of the Argument from Evidence in chapter 4 (which is precisely why that method is so useful: see page 98). But this is exactly what we don't have in the case of Jesus. Rather than items from all five categories of useful evidence, we don't even have one of them. All we have are uncritical pro-Christian devotional or hagiographic texts filled with dubious claims written decades after the fact by authors who never tell us their methods or sources. Multiple Attestation can never gain traction on such a horrid body of evidence.

EXPLANATORY CREDIBILITY

Any claim that “seeks to provide a plausible explanation for the rise of Christianity within its first-century Jewish context” can be considered a candidate for historicity.⁹⁷ Stated thus it can only be exclusionary (claims that fail to meet this criterion are probably false, but claims that meet it are not thereby true), and is only valid insofar as it merely repeats BT, i.e., a “plausible explanation” can mean an explanation with a high prior, and yet explanations with lower priors can actually be more credible if their consequents are sufficiently higher; conversely, a “plausible explanation” can mean an explanation that makes the evidence we have more probable than other explanations do, which is merely an assertion of higher consequent probability, and yet explanations with lower consequents can actually be more credible if their priors are sufficiently higher. So this criterion itself is useless. Only when we replace it with BT do we get something valid to work with. This same analysis follows for all other exclusionary criteria (below).
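Both halves of that trade-off can be illustrated numerically. The sketch below compares two rival explanations, A and B, by the ratio of their priors multiplied by the ratio of their consequents. The function name and all values are hypothetical.

```python
def posterior_odds(prior_a, prior_b, consequent_a, consequent_b):
    """Odds of A over B after the evidence: (prior ratio) * (consequent ratio)."""
    return (prior_a / prior_b) * (consequent_a / consequent_b)


# "Plausible" read as high prior: A has the higher prior (0.7 vs. 0.3),
# but B makes the evidence much more probable (0.6 vs. 0.1).
high_prior_loses = posterior_odds(0.7, 0.3, 0.1, 0.6)

# "Plausible" read as high consequent: A makes the evidence more probable
# (0.6 vs. 0.3), but B has a much higher prior (0.9 vs. 0.1).
high_consequent_loses = posterior_odds(0.1, 0.9, 0.6, 0.3)

# Odds below 1 mean B is the more credible explanation in both scenarios.
print(round(high_prior_loses, 3), round(high_consequent_loses, 3))
```

In both invented scenarios the explanation that looks "plausible" on one measure alone ends up the less credible one, which is why neither measure can be used without the other.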

CONTEXTUAL PLAUSIBILITY

To be historical, Jesus must have been a Jew in early first-century Judaea, which had been gradually Hellenized for centuries (through conquest, immigration, and trade) and was at that time infiltrated, influenced, and governed by Romans. So “anything Jesus said” or did (or that was said or done to him) “must therefore make sense within the religious, social, cultural and linguistic milieu of that context.”⁹⁸ This is again only exclusionary (claims that fail to meet this criterion are probably false, but claims that meet it are not thereby true). And to that extent it merely states that anachronistic claims about Jesus have a low P(e|h.b). Whether that means such claims are false still requires working out the rest of the equation; and whether claims that aren't anachronistic are true requires doing the same. Because as suggested earlier, good fiction will be contextually accurate, especially if it originated early—and sometimes even if originating late, as there were then many books available providing accurate historical detail for a period and place, which a storyteller could employ as references to lend veracity to his tales.⁹⁹ And often many details remained alive in living memory or were actually commonplace and thus not as distinctive of the particular time and place as might be assumed. Thus getting such details right does not automatically reduce P(e|~h.b). Though that's the assumption this criterion is sometimes meant to capture, it only sometimes has that result, requiring more analysis than this criterion alone entails. Conversely, it's not automatically the case that an anachronism is false. We may be wrong about whether a given detail is in fact anachronistic, so of course we can't circularly assume it is, and occasionally unusual behaviors do occur ahead of their time. Thus, demonstrations are still needed, and BT is the only method up to the task.

HISTORICAL PLAUSIBILITY

“Any reconstruction of Jesus must show that it is ‘historically plausible’ in the widest sense of the phrase: it must cohere with, and make sense of, all the evidence we have,” which also means, “his life and teaching must be such that the written accounts which eventually emerged are explicable.”¹⁰⁰ But this is either invalid or merely restates BT and thus should simply be replaced by it.¹⁰¹ The first statement is basically just BT in a nutshell, for only BT validly measures the degree to which any theory “coheres with, and makes sense of, all the evidence we have” (the former referring to prior probability, the latter to consequent probability). The second statement essentially just insists we need a high P(e|h.b). But that alone does not entail P(h|e.b) is high, nor does a low P(e|h.b) entail P(h|e.b) is low; to know one way or the other you have to work out the rest of the equation. Thus BT supersedes this criterion.

NATURAL PROBABILITY

Claims contrary to nature or that are suspiciously improbable are probably false.¹⁰² This is essentially just a restatement of the Smell Test analyzed in chapter 4 (and there demonstrated to be yet another special case of BT: see page 114). It's another criterion that is primarily exclusionary: validly applied, it can only tell you what's probably false, not what's probably true—except sometimes when a naturalistically realistic claim is made that an author could not have imagined with the same frequency as someone actually having seen it. For example, when Pliny the Elder marvels at a tribe of fire walkers, it's unlikely he or anyone would have made this up had it not been true. For that would entail a remarkable coincidence between random fantasy and an actual, venerable, scientifically well-understood magic trick (the details of which his account perfectly corresponds to). Likewise, Pliny the Younger's once scientifically incredible description of the eruption of Mount Vesuvius was scientifically confirmed by the modern eruption of the geologically similar Mount St. Helens, which vindicated every detail. It's quite unlikely the younger Pliny would have just accidentally erred his way into what turned out to be a scientifically correct observation.

Unfortunately, this reasoning does not have much use in the study of Jesus, whose only naturalistically credible miracles (such as psychosomatic healings and exorcisms) were commonly practiced by subsequent Christians and thus could have been fabricated quite easily by simply projecting onto Jesus the practices of Christian communities (thus legitimating them and creating models for them). Even his failure to succeed in his own hometown (according to Mark 6:1–12) has the same ready explanation. This is sometimes explained as evidence his power was indeed only psychosomatic, which by extension proves the historicity of his role as a faith healer. But that doesn't follow. Because it just as easily proves fiction, by again projecting onto Jesus the practices of Christian communities (thus legitimating them and creating models for them), in this case the occasionally inevitable failure of Christian faith healers, similarly explained as a lack of faith among the sick—not only the same excuse used by faith healers even now, but the very excuse used in Mark 6:6, thus establishing another useful model and precedent (leaving Christians the convenient justification “even the Son of God could not heal the unbelieving”). This pericope also addressed the occasional problem of Christian missionaries facing conflict from their families; hence, another obvious point of Mark's story would be to manufacture a context in which to coin the proverb in Mark 6:4. Such useful fiction is actually the better explanation, as it would be naturalistically improbable that a psychosomatic faith healing act would fail like this in only the one town (as if that were the only repository of the unbelieving). But a story created for its utility would not need more than the one event. Thus the consequents favor fiction. So again BT prevails over the criterion alone.

ORAL PRESERVABILITY

“If we cannot imagine a tradition being preserved orally, then we cannot think that it goes back to Jesus,” such as, Marcus Borg argues, “the extended discourses attributed to Jesus in John's gospel,” because “what one can imagine being remembered is the gist of a saying, parable, or story,” whereas it's hard to “imagine one or more disciples memorizing [e.g., the entirety of John 14–17] as it was spoken and then preserving it through the decades” (in fact nearly a century, according to some scholars), especially with no other Gospel author in the meantime ever having heard of it.¹⁰³

Again this criterion has no positive heuristic value (being exclusionary, it cannot tell us what is authentic, only what isn't), but it does reflect a valid historical intuition: such extensive speeches (as well as—and it's important to repeat this—every other author's ignorance of them) are improbable on a hypothesis of “accurate oral memory,” but highly probable on a hypothesis of “fabricated to the occasion to sell the creedal ideas of the authors of John.” Hence, the balance of consequents favors fabrication. The prior probability of such an amazingly accurate oral tradition is also extremely low for the first-century Christian tradition specifically. Our background knowledge establishes that oral tradition can be rapidly distorted and expanded with fictions, while any true content gets simpler and less detailed over time—indeed, writing was specifically invented to combat those facts (and yet even writing suffers alteration and distortion over time, just much more slowly), which facts are confirmed by recent scientific studies of human memory.¹⁰⁴ Only when background knowledge establishes a high prior can the inevitably low consequent be overcome, and we can't do that for early Christian oral tradition. We have no evidence of the presence and operation of the institutions and mechanisms we know are required for securing the reliable memorization of detailed material within the first-century churches (there were no Christian schools, for example, as there were for memorizing the Mishnah, nor was the Gospel put into verse, song, or anything like mnemonics or counting rhyme). Indeed, the wild discordance among the New Testament materials (much less noncanonical materials) attests no such institution or mechanism was in operation. Thus Borg's intuition is sound. And again it's BT that proves it.

CRUCIFIXION

“Any proposed reconstruction of Jesus has to be a Jesus who was so offensive to at least some of his contemporaries that he was crucified.”¹⁰⁵ In other words, he must have committed some outrageous crime or posed some very real threat, otherwise no one would have bothered. But this assumes a priori that he was crucified (which also, of course, assumes he existed). Reformulating the criterion so as not to beg the question gives us “any proposed theory of Jesus has to explain why he was preached crucified.” Analogously, “any proposed theory of Attis has to explain why he was preached castrated,” “any proposed theory of Inanna has to explain why she was preached humiliated,” and so on. All true. In formal terms, this amounts to asserting that any h that doesn't make reports of Jesus’ crucifixion likely will entail a low P(e|h.b), and a high P(e|~h.b) since many other hypotheses make those reports likely, thus the consequents will weigh heavily against your theory. But e must include the Gospel accounts of the crucifixion, which already substantially lower both consequents for any theory that isn't elaborately complicated, which theories often have low priors (because increasing a theory's complexity decreases its prior: see chapter 3, page 80). For the explanation given in the Gospels makes little to no historical sense (as remarked earlier, page 140). The historicity of the crucifixion of Jesus is thus as challenged as any theory positing it as myth. Either way some elaborate ad hoc theory is required to explain all the oddities of the evidence.

Thus this criterion is useless. You have to fall back on BT. In other words, though the premise seems correct (any theory you have must make “Jesus was preached crucified” likely), there are many more ways to meet that premise than normally assumed (the number and diversity of theories of comparable prior probability is great), and even not meeting it does not automatically entail a theory is untrue (since a large enough disparity in priors can overcome any disparity in consequents, i.e., sometimes unlikely things do happen).

FABRICATORY TREND

“Whenever a saying or story reflects a known tendency of the developing tradition, the historian must suspect that saying's historicity,” which is again only exclusionary.¹⁰⁶ For example, the trend in the Gospels toward magnifying Jesus over time casts such magnification into doubt. And yet Paul, our earliest source, already has a rather high Christology, substantially predating the Gospels. So you have to take great care to ensure the trend you claim is there is really there, and not something that's been inherent in the tradition from its inception. But once that requirement is met, this criterion becomes a valid element of determining prior probability. In effect, to say that a claim conforms to an observed trend of legendary development is to say that our background knowledge establishes a high prior probability that more of the same will also be legendary. Of course, that prior being high does not alone entail such a detail is legendary, since a ratio of consequents strongly favoring historicity can still prevail. Hence, this criterion is again subsumed by BT.

LEAST DISTINCTIVENESS

This is the assumption that when we have many versions of a story or saying, the simpler and less elaborated is the earlier and thus most authentic (in accord with the Criterion of Oral Preservability above, page 178). This is actually just a special case of the previous criterion (Fabricatory Trend), yet one that's multiply problematic. For example, “traditions becoming longer and more detailed, the elimination of Semitisms,” or “the use of direct discourse, and the conflation and hence growth of traditions” are different ways a simpler account becomes elaborated over time.¹⁰⁷ And yet, as Stanley Porter observes, sometimes the less elaborate version is an edit or truncation of an earlier, more elaborate version, and sometimes Semitisms are added and thus a result of embellishment, not a sign of being more original. Moreover, an earlier, simpler version is not thereby true. Even the presence of early Semitisms does not make a story more true (as we'll see under the Criterion of Aramaic Context, below, page 185).

So this criterion is at best only exclusionary, and not universally applicable. But it could be helpful. If the evidence in a specific case is such that we can prove (to some degree of probability) that an elaborate version is more probable as an embellishment of the simpler version than as the retention of an earlier, truer account (or the simpler version is less probable as a truncation of the elaborate version than as a retention of an earlier, truer account), then we can say the consequent probability of the simpler version being true exceeds the consequent probability of the elaborate version being true. But that would entirely depend on cases. And again, the priors can reverse the outcome. Or indeed so can the consequents: if the evidence actually supports the contrary conclusion that the simpler version is the more derivative, or verifies one as the earlier but fails to establish that it's thereby true—because even the earliest, simplest version of a story we have is still often nevertheless a fabrication. So BT must still prevail.

More interesting is how this criterion could affect the prior probability. Rather than simply asserting that the criterion is true, historians must actually test it: examine every case where we have simple and elaborate versions of a story or saying, starting with a neutral prior (0.5) and using the outcome of each case as the new prior in the next (the procedure suggested earlier, page 168). If after numerous representative cases the trend shows a higher prior probability confirming the trend (i.e., in most simple-elaborate pairings, the elaborate version is indeed the less credible), then you can apply that prior to further cases (continuing the same procedure). And of course, doing this may get you to the earlier versions of stories, but not all the way to knowing whether those stories are at all true, which may require further analysis (if more even can be known on that point). So once again this criterion is only valid when replaced by BT.
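One simple way to model such a test is Laplace's rule of succession, which starts at a neutral 0.5 and revises the estimate after each examined case. That rule is used here as an assumption for illustration; the text does not specify which formula implements the procedure. The list of outcomes below is invented, with True meaning "the elaborate version proved the less credible" in a given simple-elaborate pairing.

```python
from fractions import Fraction


def updated_prior(confirmations, cases_examined):
    """Laplace's rule of succession: (s + 1) / (n + 2). With no cases it is 1/2."""
    return Fraction(confirmations + 1, cases_examined + 2)


def iterate(outcomes):
    """Return the running prior after each case, starting from the neutral 1/2."""
    confirmations = 0
    priors = [updated_prior(0, 0)]
    for cases_examined, confirmed in enumerate(outcomes, start=1):
        confirmations += int(confirmed)
        priors.append(updated_prior(confirmations, cases_examined))
    return priors


# Hypothetical results for eight simple-elaborate pairings.
outcomes = [True, True, False, True, True, True, False, True]

trend = iterate(outcomes)
print([str(p) for p in trend])
```

The final value is the prior one would carry into the next pairing, and it says nothing about whether the earlier version of any story is itself true.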

VIVIDNESS OF NARRATION

As if defiantly contradicting the previous criterion, it has also been claimed that versions of a story that are more vividly narrated (as if the author were “there” and viscerally responding to what she experienced) are more likely true than versions that use a more distant or cursory mode of narration. But this is a non sequitur. For vivid detail is also an established trend in fictionalization and embellishment. Good storytellers often come up with these details, especially when they are lacking, and thus such elements are as likely as any to accumulate in the retelling over time. Human memory even does this routinely, without anyone being aware of it, especially through memory distortion and contamination through repeated retelling.¹⁰⁸ Conversely, an eyewitness can produce a concise and droll account without vivid narration, as can someone relaying what an eyewitness said.

In fact, in the ancient world especially, our background knowledge establishes that vividness of narration is more often a sign of fiction than history. Schools of the time specifically taught writers to embellish stories and speeches in exactly this way (see note 118, page 323), and we can find numerous cases where battles and speeches are described in vivid detail when we know for a fact the author had no actual sources for them. Conversely, the histories we trust the most are the ones that restrict themselves to the fewest and least embellished details that could be corroborated by multiple lines of evidence, and we especially prefer prosaically analytical discussions of the evidence to unsourced novelizations of it.¹⁰⁹ Thus, vividness of narration by itself actually argues slightly against historicity, not in favor of it, and can only support historicity in cases where you can specifically prove such vividness isn't the result of dramatization. So by itself this criterion is useless, and we're left again with BT.

TEXTUAL VARIANCE

Formally, this is called the “Criterion of Greek Textual Variance.” According to Stanley Porter, “where there are two or more independent traditions with similar wording, the level of variation is greater the further one is removed from the common source,” whereas the presence of “less variation points to stability and probable preservation of the tradition, and hence the possibility that the source is authentic to Jesus.”¹¹⁰ But there is actually no basis in our background knowledge for either view.

The stability of a tradition no more demonstrates its authenticity than its later importance or familiarity, or in some cases even its unimportance (insofar as passages more doctrinally charged received the most meddling). Sometimes variance decreases with fatigue: innovation in adapting a source decreases as an author goes along until that author more lazily just copies his source.¹¹¹ Yet that would not indicate the latter half was any more true than the former. Even in the best of cases one might use this criterion to be more certain of what an author's immediate source said (hence this criterion is a fundamental tool in textual criticism), but that doesn't help us much, since what a source said and what's historically true are not thereby identical. Certainly the transmitters often can't possibly have known whether what they were transmitting was really true or not, so their passing it on more consistently can't be in consequence of it being true. Just as often it will have other causes (such as any of the four just enumerated). In fact, that some traditions will show greater or lesser variation is expected simply as a consequence of completely random fluctuation. To then point to the ones that just by chance show less variation and claim they are more true is logically perverse. Either way this criterion has no validity.

GREEK CONTEXT

Formally, this is called the “Criterion of Greek Language and Its Context.” According to Stanley Porter, “if there are definable and characteristic features of various episodes that point to a Greek-language based unity between the participants, the events depicted, and concepts discussed,” then “the probability would be greater that Greek would have been the language of communication used by Jesus and his conversation partners” and therefore it “might well have originated with Jesus.”¹¹² Unfortunately this is a cascade of non sequiturs. For one can just as easily argue from such a fact (assuming it's even established) that this conversation was entirely constructed by the author, or by another Greek-speaking source, and not derived from Jesus at all, which is even the more probable if you reject Porter's controversial theory that Jesus held conversations in Greek. Porter is also committing the fallacy here of bootstrapping a conclusion from “is more probable” to “probably is”—an invalid procedure (as proved earlier). This criterion has no validity.

ARAMAIC CONTEXT

Formally the “Criterion of Semitic Language Phenomena” (and known by other names), this is the flip-side of the previous criterion: if there is evidence of an “Aramaic-language based unity between the participants, the events depicted, and concepts discussed” underlying the extant Greek text, then this suggests the account goes back to the original Jesus, who most likely conversed in Aramaic.¹¹³

The first difficulty with this criterion is that it isn't easy to discern an “underlying Aramaic origin” from an author or source who simply wrote or spoke in a Semitized Greek. The output of both often looks identical. And yet we know the earliest Christians routinely wrote and spoke in a Semitized Greek, and regularly employed (and were heavily influenced by) the Septuagint, which was written in a Semitized Greek. This is most notably the case for the author of Luke-Acts, and is evident even in Paul. Many early Christians were also bilingual (as Paul outright says he was), and thus often spoke and thought in Aramaic, and thus could easily have composed tales in Aramaic (orally or in lost written form) that were just as fabricated as anything else, which could then have been translated into Greek, either by the Gospel authors themselves or their sources. Indeed, some material may have preceded Jesus in Aramaic form (such as sayings and teachings, as we find collected at Qumran) that was later attributed to him with suitable adaptation. So even if we can distinguish what is merely a Semitic Greek dialect from a Greek translation of an Aramaic source (and we rarely can), that still does not establish that the Aramaic source reported a historical fact.

Consequently, Semitic features in a Gospel pericope do not make its historicity any more likely, other than in very exceptional cases (where we can actually prove an underlying source that we otherwise did not already suspect), and even then it gains very little (since an underlying source is not automatically reliable). Whereas one might have hoped such features would lower P(e|~h.b) relative to P(e|h.b), there is no evidence in b that warrants that conclusion. Even the best cases would lower it but little; and most cases, not at all. As Christopher Tuckett says:

We should not forget that Jesus was not the only person in first-century Palestine; nor was he the only Aramaic speaker of his day. Hence such features in the tradition are not necessarily guaranteed as authentic: they might have originated in an early (or indeed later) Christian milieu within Palestine or in an Aramaic-speaking environment.¹¹⁴

Or as I've noted, they might have originated in a Semitic-Greek-speaking environment (of which there were many across the whole Roman world), or even a pre-Christian milieu. Even a chronological trend is not dispositive, since Stanley Porter finds evidence the tradition could become “both more and less Semitic.”¹¹⁵ Unfortunately there are just too many ways a Semitic flavor could have entered the tradition of any saying or tale, and we have no way to tease out their relative probabilities. So when it comes to Jesus, this criterion effectively has no value for discerning historically authentic material.
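
To see why a consequent probability that is lowered "but little" barely matters, here is a minimal sketch of Bayes's Theorem for h against ~h. All numbers are invented for illustration; nothing in the text assigns these values to any actual saying, and the function name is hypothetical.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes's Theorem for a hypothesis h and its negation ~h.

    prior           -- P(h|b)
    p_e_given_h     -- P(e|h.b)
    p_e_given_not_h -- P(e|~h.b)
    """
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1.0 - prior) * p_e_given_not_h)

prior = 0.5  # illustrative value only

# Semitic features equally expected whether or not the saying is authentic:
# the posterior simply equals the prior.
print(posterior(prior, 0.80, 0.80))

# A "best case" in which P(e|~h.b) is lowered only slightly:
# the posterior moves only slightly above the prior.
print(posterior(prior, 0.80, 0.75))
```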

DISCOURSE FEATURES

Stanley Porter defines the Criterion of Discourse Features as follows:

If the words of Jesus are determined to be significantly different from those of the surrounding Gospel, and especially if these words are consistent from one segment to another, then the presumption is that the author, and by extension any later redactors, of the Gospel have preserved the words of Jesus in an earlier form, ostensibly a form that could well be authentic, rather than redacting them as the Gospel was constructed and transmitted.¹¹⁶

This is unfortunately another non sequitur. Even if the procedure works (and in this case it hasn't been shown to), the most it could establish is that those discourses derive from a different source than the narrative material. That source will not necessarily be Jesus. It could be any storyteller. Just as discussed under the previous criterion, unless the Gospels were contrived from whole cloth, we already should expect a diversity of source materials in different languages and dialects. That does not make what they said true. Moreover, ancient authors were specifically taught to employ a different style in direct discourse than in descriptive narration, even to mimic (or create) distinctive styles for different speakers, and Porter's procedure can really establish no more than that they did that, so it probably can't even establish that a different source was used. So this criterion not only hasn't worked (the required procedure has never been attempted), it probably can't work.

Only if it was unexpectedly successful (e.g., we proved a consistent authorial style within the speeches of Jesus spanning all four Gospels, inconsistent with the style of all the remaining Gospel material) would we have a case for lowering P(e|~h.b), since such a coincidence is unexpected on any theory but “common authorship.” But since the common author need not be Jesus (it could still be a singularly influential missionary between Jesus and the extant Gospels, or even a teacher prior to Jesus whose teachings were later attributed to him), P(e|~h.b) wouldn't necessarily be lowered by much. And this hypothetical outcome is not likely to be realized. For the discourse features of Jesus’ speeches in John vs. the Synoptics are self-evidently not in agreement, while any agreement found between Mark and Q could be explained by Mark having selected his material from the same document (Q), or by Matthew and Luke emulating Mark in their adaptation of Q. With no way as yet to tell the difference, we have no valid use for this criterion.

CHARACTERISTIC JESUS

Finally, the most recent attempt at inventing a new criterion—on the heels of once again proving all the others defective—is the Criterion of the “Characteristic Jesus,” which argues that “any feature that is characteristic within the Jesus tradition, even if only relatively distinctive of the Jesus tradition, is most likely to go back to Jesus.”¹¹⁷ As that is clearly a non sequitur (for all the reasons surveyed so far), this criterion is also invalid.

OTHER CRITERIA

Other criteria that I've seen implicit in various arguments are usually garden-variety fallacies. The notion that a story just sounds so real and moving it must be true I'm tempted to call the Criterion of It Just Feeling True. But logicians already named this years ago. It's called the Affective Fallacy: judging something true because of how it affects you (how real it sounds, how moving it is, etc.). Such reasoning has no objective merit. Another I call the Criterion of Inexplicability, which logicians have long identified as the Argument from Ignorance: the fallacy of assuming that because you can't think of any other reason a claim would exist, then it must be true; or assuming that because you can't find any specific evidence a claim is false, then it must be true. Neither assumption is logically valid. You must attend to questions of prior probability in light of the telltale features of the text in question (e.g., the Smell Test, examined in chapter 4), and whether you should even expect to have evidence against a claim if it is false (e.g., the Argument from Silence examined in chapter 4), and then weigh all this in a sound fashion (which is exactly what BT does), before deciding whether the absence of evidence against a claim actually warrants believing it. But usually any informed expert who takes the consideration of alternative hypotheses seriously will not only always find good contenders to test—she'll be able to think of several herself.

I've also seen what I call the Oral Source Fallacy: assuming that because you can't identify or reconstruct a written source for a story or saying that therefore it must derive from oral tradition. In fact it could have been conveyed by any kind of written intermediary now lost to us, from letters to sermons to memoirs to commentaries or histories or anything, not just prior Gospels; and it could also still derive from a prior lost Gospel, or even a Gospel we have, through a completely original retelling rather than a more direct redaction (as evidence suggests John did with the Synoptics: see earlier note 91, page 320), because ancient schools taught both methods of composition.¹¹⁸ The latter fact especially cautions us: it is a grave mistake to assume a redaction must share the same wording and structure as its source material. And finally, such a text could also be a completely original invention of the author (and thus not derive from any tradition, oral or otherwise). Or it could indeed derive from an oral tradition—whose content is completely fabricated. All are prima facie equally probable. So the task of ruling all these out is not merely daunting, it's often impossible.

I've also seen what I call the Criterion of Repetition: the assumption that just because Jesus is depicted as frequently talking about “The Kingdom,” or performing healings, or speaking in parables, it should be concluded that that's what he really did.¹¹⁹ This is either a non sequitur (such repetition at best only establishes that a particular author wished to depict Jesus as emphasizing these things, which then influenced subsequent redactors) or a careless deployment of the Criterion of Multiple Attestation, which we've already seen is of little valid use in Jesus studies (for the reasons there enumerated: see page 172).

We can see a similar criterion in the work of Anthony Le Donne.¹²⁰ Le Donne says a great deal of value about how memory becomes distorted (and a historian would do well to heed all he says about that), yet he never presents any valid method for determining whether a claim reflects an actual memory or a convenient fabrication. Instead, he relies on the same old criteria proved invalid here (e.g., multiple attestation, coherence, embarrassment) and on classical fallacies like affirming the consequent (e.g., “all apples are red, therefore everything red is an apple”). Le Donne's fundamental thesis, which he employs repeatedly, is that “the more significant a memory is, the more interpreted it will become” and therefore when we find highly interpreted claims in the Gospels he assumes they must reflect a significant memory.¹²¹ For instance he concludes “John was remembered as a type of Elijah” (emphasis mine).¹²² Yet no valid reason is given for how Le Donne can dismiss the alternative possibility that “John was represented as a type of Elijah” for reasons having nothing whatever to do with any actual memory. Instead he just repeats the same fallacies (i.e., either reaffirming the consequent, or appealing to the same invalid “Criteria of Authenticity”).¹²³ Here he applies this ‘Criterion of Heavy Interpretation’ in conjunction with an invalid EC argument to the effect that “Luke strains to represent Jesus as Elijah, so he wouldn't also represent John as Elijah”, which is intrinsically illogical. There is no reason the same type could not be used for both characters, serving different symbolic purposes in each case: in the one instance using Elijah in his role as the harbinger of the messiah, applied to John, and in the other as a type for the messiah, applied to Jesus. But Le Donne's reasoning is also invalid for a more pertinent reason: we know Luke used Mark as a source, so it may be an association of John with Elijah invented by Mark that had become so popular (and evangelically useful) that Luke had to use it against his own literary tendency, and not any actual pre-Markan memory that Luke was compelled by (or, again, it may have been a pre-Markan invention rather than a pre-Markan memory). On how we're to tell the difference Le Donne has nothing to say.

Likewise Le Donne's treatment of Jesus’ conflicts with his family, and every other argument he makes. For instance, he takes the different ways Mark and John included the “temple/body” resurrection metaphor as evidence of a true historical saying about Jesus destroying and rebuilding the Jerusalem temple, ignoring the fact that the metaphor appears to originate with Paul, not Jesus, and thus Mark and John may well be responding to (or deliberately reconstructing) a fabricated saying, not an actual one.¹²⁴

Paul certainly appears to have originated this “temple/body” metaphor,¹²⁵ as well as the “tabernacle made with hands/not made with hands” distinction as a metaphor for death and resurrection (2 Corinthians 5:1–10), wherein the one is torn down and the other “built.” The saying constructed in Mark 14:58 (and 15:29; repeated in Matthew 26:61 and 27:40) clearly alludes to this Pauline teaching because it incorporates “in three days,” which can only be an allusion to the resurrection of Jesus (1 Corinthians 15:4), and it uses the “made with hands/made without hands” resurrection distinction exactly parallel to Paul's (2 Corinthians 5:1), and Paul repeatedly equated the body with the temple (1 Corinthians 3:16–17, 6:19–20; 2 Corinthians 6:16). Accordingly, John 2:19–21 simply makes this connection explicit, yet it's already implicit in Mark. Thus Le Donne's interpretation of what Mark is doing with the saying is wholly incorrect.¹²⁶ There is simply no evidence here that Jesus ever really said these things.

Thus Le Donne repeatedly acts like someone who assumes all red things are apples. But just as all red things aren't apples, memories aren't the only things that become highly interpreted (or multiply attested, or mutually coherent, or repeated by later authors contrary to their own literary tendency, etc.). Myths, inventions, and fabrications do as well (just like the many iterations of the Hercules myth). Typology, after all, was more commonly a device used for communicating ideas, not memories. Typological constructs in Daniel, for example, do not in any way reflect real memories by or about Daniel. That book is wholly a forgery. How are we to conclude that Jesus is being any more “remembered” in the Gospels than the real Daniel is in the Book of Daniel?

At least Le Donne admits his method gives no certainty, and that there may be no fact of the matter accessible to historians at all (thus he does not even affirm anything about John the Baptist as actually known, not even whether he really was associated with Elijah at any time in his life).¹²⁷ But if Le Donne said that of every claim he makes, that would simply be saying we can know nothing about a historical Jesus. Whereas if we agree with his axiom that “a historical argument aims toward the most likely explanation given the historical context and the events that followed by way of impact,” then we must obey Bayes's Theorem (the only valid method known for ascertaining “the most likely explanation” among all contenders).¹²⁸ And Le Donne simply doesn't.

CONCLUSION

Even the conservative Mark Strauss concludes that Jesus historians have yet to produce any valid methodology from all this confusion of criteria. He observes that these “criteria are often used subjectively and in a circular manner to prove whatever the investigator wishes…especially since they can be used to contradict each other,” in fact all “too often the criteria are used selectively and arbitrarily to ‘prove’ whatever the investigator wants to prove.”¹²⁹ I concur. So does everyone else who's examined the issue. See chapter 1. But any method that makes that possible is clearly invalid and should be abandoned. I've shown that BT replaces all the criteria with a valid procedure, and as long as it's used correctly and honestly, it won't let you prove whatever you want, but only what the facts warrant. There is no other contender.

BAYESIAN ANALYSIS OF EMULATION CRITERIA

A completely different set of criteria has been developed by historians of myth and literature that I call “Emulation Criteria” (colloquializing the formal term “mimesis” with the more familiar word “emulation”). The more of these criteria that are met, and the more strongly, the more likely a story in one document is a literary re-crafting of a story in another document. Such literary constructions were not only common in antiquity: the procedure for creating them was specifically taught in schools of the time (again, see note 118, page 323). Can we detect them? Is the procedure of applying Emulation Criteria logically valid?

The best set of Emulation Criteria has been assembled by Dennis MacDonald:

[There are] six criteria for identifying mimesis [i.e., emulation]: (1) the accessibility and popularity of the proposed model, (2) evidence of analogous imitations of the same story or speech, (3) the volume or number of similarities between two works, (4) the order of the similarities, (5) the presence of distinctive or unusual traits that bind the two works together, and (6) the interpretability of the differences between the two works.¹³⁰

These were inspired by Thomas Brodie, who had earlier formulated three criteria for literary dependence: (1) external plausibility, (2) significant similarities, and (3) intelligibility of the differences (between the emulated and emulating text).¹³¹ Brodie's first criterion MacDonald divides into his criteria (1) and (2); Brodie's second criterion MacDonald divides into his criteria (3), (4), and (5); and Brodie's third criterion becomes MacDonald's criterion (6). Brodie had already articulated his second criterion as keying on “significant similarities” of theme, pivotal clues, action/plot, unusual details, order, completeness, and matching words, especially unusual words, a range of possible parallels that MacDonald simplifies into his three categories: number of correspondences (of whatever kind), order of their arrangement, and remarkably distinctive parallels (such as key words or unusual plot elements).

MacDonald's first and second criteria are not entirely apt. They would presumably establish prior probability: if a particular work or opus (e.g., Homer) was commonly known in the ancient world, the prior probability is increased that that work will be emulated, and if a particular story (e.g., the shipwreck of Odysseus) was frequently emulated in the ancient world (and this second criterion could be expanded to include not just specific stories or speeches, but also frequently emulated characters, like Moses or Odysseus), the prior probability is further increased that that story (or speech or character) will be emulated again. However, in determining the probability of a hypothesis of emulation, the prior probability must actually reflect the frequency with which such emulation is confirmed in a given author and book (e.g., the Gospel of Mark) or in a given type of literature of that general period and place (e.g., ancient hagiography). In other words, if we have caught an author doing this a lot, the odds are high he will have done it in other cases as well; or if it happens a lot in ancient hagiography generally, the odds are high it happened in the Gospels generally. So we should replace his first two criteria with the frequency with which our present author or type of literature deploys literary emulation. To ascertain this without prejudging the conclusion, you could begin with a neutral prior (0.5) and then build a new prior, case after case (using the iteration procedure I described before: see page 168). Nevertheless, if the text or story being proposed as the target of emulation is obscure and unlikely to have been known to the author, we might have to reduce the prior to reflect that fact. But just because the proposed target is ubiquitous, popular, certainly known to the author, and frequently emulated by other authors does not increase the odds that our particular author emulated it, beyond the odds we can already determine that he emulated anything.

So MacDonald's first two criteria even when applicable are only exclusionary—and like all exclusionary criteria, they do not automatically exclude, for the consequents can deviate enough to overcome any prior, no matter how low it is. Thus BT must still be applied to ascertain whether the proposed hypothesis of emulation is probably true (or probably not). The prior we've ascertained for an author or genre (from past cases within that author or genre) is then applied to further cases of suspected emulation in the same author or genre, when the proposed target belongs to the category of popular and commonly emulated books, stories, and characters. But if we suspect an uncommon or rarely mimicked text is being emulated, we should reduce that prior by the degree to which such a specific case of emulation would be unusual (whereas emulating popular and commonly emulated books, stories, and characters is not unusual and therefore requires no reduction of the prior probability). Unusual emulations can thus still be confirmed, if the evidence is strong enough.

BT thus teaches us two things MacDonald's criteria do not. First, what matters most is whether we've established that a given author is a frequent emulator or that emulation is a frequent occurrence in a given genre (which admittedly are the very things MacDonald has set out to establish). Second, meeting MacDonald's first two criteria does not increase that frequency (and thus does not increase the odds that the same writer emulated again), but failing to meet those criteria could decrease that frequency (and thus decrease those odds) in specific cases. Yet even that will not entail the emulation did not occur (nor will higher odds entail it did), because we must still evaluate the consequent probabilities. And they can often be overwhelming.
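
The point that consequent probabilities can overcome any prior is easiest to see in the odds form of Bayes's Theorem. The following is a minimal sketch; the prior and the likelihood ratios are invented for illustration and the function name is hypothetical.

```python
def posterior_from_odds(prior, likelihood_ratio):
    """Posterior probability of h, given P(h|b) and the ratio
    P(e|h.b) / P(e|~h.b), using the odds form of Bayes's Theorem."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# A prior reduced because the proposed target text is obscure (invented value).
low_prior = 0.05
for ratio in (1, 10, 100, 1000):
    print(ratio, round(posterior_from_odds(low_prior, ratio), 3))

# The converse: a high prior does not entail emulation if the evidence
# is far more expected on ~h than on h.
print(round(posterior_from_odds(0.9, 0.01), 3))
```

With a ratio of 1 the posterior stays at the prior; as the ratio grows, even a 0.05 prior is overwhelmed.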

So we turn to MacDonald's other four criteria, which establish consequent probability. First, he looks for “the volume or number of similarities between two works,” which (per Brodie) can include similarities of theme, pivotal clues, action/plot, unusual details, completeness, and matching words, especially unusual words, and also “the order” in which these similarities appear (which together constitute MacDonald's third and fourth criteria). Many such similarities will often exist merely by chance, or simply because the two stories are about similar topics (like a shipwreck). Thus, where h = “emulation occurred,” P(e|~h.b) will not be low if the similarities are few and already expected. But when the similarities start to accumulate to the point that chance is no longer a credible explanation for why they are there, the scales tip. Such similarities are entirely expected on h, so P(e|h.b) will be high, but such a scale and scope of similarities would be an unusual coincidence otherwise, and unusual coincidences are by definition improbable (unusual = infrequent = improbable), therefore P(e|~h.b) will be low, and by exactly as much as that coincidence is improbable (which obviously will vary from case to case). This is most powerfully demonstrated when very unusual words appear in both stories, or very unusual sequences or events, or anything bizarre, since the inherent probability of such a chance coincidence is always very low, whereas the probability of an emulator borrowing an unusual keyword (or other feature) from the emulated text is always much higher.

An emulating text does not have to have all of these features. It can have any number of them, of any sort. All that matters is that whatever collection of them there happens to be, the odds of all those elements being there by chance (or inevitability) must be significantly less than the odds of their being there by design (and thus as a result of emulation). Because then, the balance of consequents will favor h (often quite strongly). This is further evident in MacDonald's fifth criterion, which calls attention to “the presence of distinctive or unusual traits that bind the two works together,” which is just a special case of his criteria three and four: parallels that are distinctive or unusual in both stories (and thus not commonly found in any other stories, even of the same type). These are elements that are already improbable in and of themselves, so to see them in both stories alone often swings the balance of consequents in favor of emulation.
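
How accumulating parallels drive down P(e|~h.b) can be sketched numerically. This is an illustration under a strong simplifying assumption, namely that each parallel arises by chance independently of the others; the probabilities are invented, and each is meant as the generic chance of that kind of parallel appearing, not of an exact wording.

```python
import math

def chance_probability(parallel_probs):
    """P(e|~h.b) if every listed parallel arose by chance.

    Assumes the parallels are independent, so their chance
    probabilities multiply. Real parallels are often correlated
    (e.g., by shared topic), which would make this an underestimate.
    """
    return math.prod(parallel_probs)

# Few parallels, each already expected from a shared topic (invented values).
few_expected = [0.9, 0.8]
# Many parallels, several of them unusual (invented values).
many_unusual = [0.5, 0.3, 0.2, 0.1, 0.05]

print(chance_probability(few_expected))   # stays high: chance explains it
print(chance_probability(many_unusual))   # very low: chance is not credible
```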

As with the trustworthy neighbor example in chapter 3 (page 74), the actual consequent probabilities here could both be extremely low (since we usually can't predict from h exactly which words or ideas will be used or that any specifically will), but all that matters is their ratio, and using any particular keyword or concept from an emulated story is always more likely on h than chance (where h = emulation and ~h = chance), so when enough of these parallels are present, the effect can be huge. So we can treat the consequent for h as being effectively 1 and set the consequent for ~h in ratio to that (I discuss this practice of disregarding contingencies in chapters 3 and 6: see pages 77–79 and 214–18).
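
That only the ratio of the consequents matters can be checked directly: rescaling both so that the consequent for h is 1 leaves the posterior unchanged. A minimal sketch, with invented values and hypothetical names:

```python
def posterior(prior, like_h, like_not_h):
    """Bayes's Theorem for h against ~h."""
    numerator = prior * like_h
    return numerator / (numerator + (1.0 - prior) * like_not_h)

prior = 0.5
# Both consequents are tiny, since exact wording is hard to predict
# even on h (invented values).
raw_h, raw_not_h = 1e-6, 1e-9
# Rescale: treat the consequent for h as 1 and keep the same ratio for ~h.
scaled_h, scaled_not_h = 1.0, raw_not_h / raw_h

with_raw = posterior(prior, raw_h, raw_not_h)
with_scaled = posterior(prior, scaled_h, scaled_not_h)
print(with_raw, with_scaled)  # the two agree up to rounding
```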

Nevertheless, when we do this, “chance” does not then mean the bare probability of assembling a particular collection of words and ideas in one place. You must avoid the Lottery Fallacy, the idea that because winning is improbable therefore the player must have cheated—when in fact, odds are, someone was going to win, because there are so many players (this fallacy is discussed in chapter 6: see page 227). Related to this is the Fallacy of Multiple Comparisons, which must also be avoided when looking for emulation: if we allow any comparison to any text, odds are we'll always find some similarities simply by chance—this simply won't be unlikely at all. There must be some constraints on what counts as a parallel and what doesn't (which is the function of MacDonald's criteria), and there must be too many parallels for multiple comparisons to have caused them. Accordingly, because there are so many texts and so many ways to say the same thing, when all the coefficients of contingency cancel out, the consequent probability for ~h can often end up as near to 100 percent as the consequent for h, no matter how unique the text may be or what coincidences appear in it. Because as discussed in chapter 3 (pages 77–78), every configuration of words is extremely improbable, so what matters is their generic likelihood, which can often be quite high.

Thus the coincidences that you propose are emulative must be truly unusual, highly numerous, and/or unarguably apposite in meaning, none of which chance can easily explain. In other words, you must attend to the difference between (a) calculating the odds of finding someone who won a lottery, which can be near 100 percent no matter how unlikely winning that lottery is (hence when someone wins a lottery we usually infer chance, not design), and (b) calculating the odds of accidentally banging out an entire sheet of Chopin's music by randomly hitting piano keys, a case where the specificity is huge in relation to the number of available attempts, no matter how many attempts are made (hence when some stranger rattles off a bit of Chopin we infer design, not chance). Like finding a lottery winner, P(e|~h.b) must reflect the probability of a chance coincidence all else considered. Hence, it can take a lot to lower it. Basically, the question you must ask is, can we reasonably expect to find any set of coincidences like that anywhere in ancient literature just by chance? If the answer is yes, then P(e|~h.b) is high. If no, then it's low.
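
The contrast between the lottery winner and the sheet of Chopin can be put in numbers. The sketch below assumes independent comparisons with an equal chance probability each, which is a simplification; the probabilities and the number of comparisons are invented for illustration.

```python
import math

def chance_of_some_match(p_single, comparisons):
    """Probability that at least one of `comparisons` independent tries
    produces the coincidence by chance: 1 - (1 - p)^n.

    Computed with log1p/expm1 so that very small p is not rounded away.
    Requires 0 <= p_single < 1.
    """
    return -math.expm1(comparisons * math.log1p(-p_single))

# Lottery-like case: each comparison is unlikely to match, but there are
# so many available comparisons that some match is nearly certain.
print(chance_of_some_match(1e-3, 10_000))

# Chopin-like case: the coincidence is so specific that even many
# comparisons leave a chance match wildly improbable.
print(chance_of_some_match(1e-30, 10_000))
```

In the first case P(e|~h.b) is high despite each individual match being unlikely; in the second it stays low however the comparisons are counted.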

Such follows for evaluating the similarities. What about the differences? Contrary to common assumption, the differences between the two stories will not lower P(e|h.b). As MacDonald rightly argues, some differences actually argue for emulation, and few will ever argue against it, because emulation entails the implementation of differences. An author only wants to adapt select elements of an emulated story to develop an otherwise entirely different story, and the emulating author will always have different literary interests than the emulated author had; in fact, as MacDonald aptly explains, often the emulator's interests are to subvert the emulated tale (as when Virgil subverts the stories in Homer in order to demonstrate how Roman beliefs and values are superior to Greek). Hence, sometimes the differences are the whole point of the emulation, allowing an attentive or clued-in reader to extract the intended meaning from precisely what has changed. We can detect this when differences from the first story make unusual elements of the second story more intelligible on h (criterion 6), or when differences that exactly reverse the order or gender or other element (criterion 4) make h more probable than (or at least as probable as) ~h. And when differences are already contextually expected on h (like changing the geographical location or the exact metaphor used to symbolize the same story element), this won't make h less probable; nor will any other differences that serve the author's interests and aims.

Indeed, differences that increase interpretability actually increase P(e|h.b), because that's exactly what we expect on h, and they also decrease P(e|~h.b), because it would be unusual if a purely chance inversion of the first story made the second story more intelligible. Such a coincidence is not impossible, just improbable, and that improbability must still lower P(e|~h.b), and by exactly as much as that correspondence is improbable (at least on any other cause than authorial design). This is MacDonald's sixth criterion (“the interpretability of the differences between the two works”). But that criterion should be expanded to include any increased interpretability, even if it derives from a similarity and not a difference (a fact perhaps meant to be captured by MacDonald's fifth criterion).

Differences that involve an exact reversal of elements also won't usually reduce P(e|h.b) because such reversals frequently occur in emulation. But sometimes exact reversals will lower P(e|~h.b), e.g., using the exact five words in reverse still entails an unexpected coincidence on ~h, just as would using the exact five words in the same order. Random chance can produce many three- and some four-word exact sequence matches, so the amount these lower the consequent on ~h will likely be so small as to be washed out by a fortiori estimates (unless someone can perform a statistical analysis on the Greek corpus showing otherwise). Likewise single-word match-ups will be common by chance, except in the case of very rare words, or words that are very unexpected otherwise, or when there are so many one-word match-ups as to be demonstrably unusual.

All the same goes for plot elements such as the sequence of events, or switching genders or other categories (e.g., from indoors to outdoors or from war to peace). Again, such reversals might not argue for emulation, but neither do they usually argue against it. But if such exact reversals are entirely expected on h (i.e., if h makes exactly that reversal very likely), then they do argue for emulation: they will increase P(e|h.b), because such an observation is expected on h, or else (or also) decrease P(e|~h.b), since a reversal that's expected and intelligible on h is often also an unexpected coincidence on a hypothesis of chance, and thus less probable on ~h.

MacDonald's works are filled with examples of applying these criteria. If we structured his arguments to conform to BT, I think we'd find that most of those examples would produce only a weak conclusion (like a maximum P(h|e.b) of around 60 percent), but many will clearly entail a strong conclusion (like a P(h|e.b) greater than 90 percent or even 99 percent or well above). Doing this would not require any exact knowledge of any of the relevant statistics, since easily adduced a fortiori estimates will usually suffice (and if anyone suspects otherwise, they can always generate the relevant statistical data to find out). MacDonald is also not alone. Thomas Brodie has surveyed many other examples; and Randel Helms has added many more (though without explicit application of criteria). The same BT analysis can test their claims as well. I will demonstrate many of the best examples (my own and theirs, and those discovered by other scholars as well) in my next volume. Here I will provide just one example (following) to illustrate the concept. But as just shown, the method they are all using is valid and sound (when competently employed), and can be verified as such with BT: when the overall features of a story are significantly less probable by chance than by emulation, emulation has probably occurred, especially if we can show that the same author does this a lot. The significance of this for historicity is that a story that was probably produced by emulation is less probably a historical fact; and an author who composes primarily by emulation probably isn't writing a history. But I won't demonstrate those inferences here.
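
The a fortiori reasoning just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a calculation drawn from MacDonald, Brodie, or Helms: the function `posterior` is a hypothetical helper, and every input is invented for the example. The idea is to pick estimates as unfavorable to emulation (h) as one can reasonably believe; if the result still favors h, more realistic estimates would favor it more.

```python
def posterior(prior_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes's Theorem for a hypothesis h and its negation ~h:

    P(h|e.b) = P(h|b)P(e|h.b) / [P(h|b)P(e|h.b) + P(~h|b)P(e|~h.b)]
    """
    for value in (prior_h, p_e_given_h, p_e_given_not_h):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities between 0 and 1")
    numerator = prior_h * p_e_given_h
    denominator = numerator + (1.0 - prior_h) * p_e_given_not_h
    if denominator == 0.0:
        raise ValueError("the evidence is impossible on both hypotheses")
    return numerator / denominator

# Hypothetical inputs, deliberately biased against emulation (h): a low
# prior, only an even chance of seeing this evidence if h is true, and a
# generous allowance for the same parallels arising without emulation.
weak_case = posterior(prior_h=0.25, p_e_given_h=0.5, p_e_given_not_h=0.1)

# Same prior, but parallels that chance could hardly ever produce.
strong_case = posterior(prior_h=0.25, p_e_given_h=0.5, p_e_given_not_h=0.001)

print(f"weak case:   P(h|e.b) = {weak_case:.3f}")
print(f"strong case: P(h|e.b) = {strong_case:.3f}")
```

With these made-up inputs the first case lands a little above 60 percent and the second above 99 percent, which is the kind of spread between weak and strong conclusions described above. Only P(e|~h.b) differs between the two calls, so the whole difference comes from how easily chance could explain the parallels.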

“Daniel in the lion's den” becomes “Jesus in the empty tomb”

To illustrate the above I will draw on an example I published myself.¹³² We know from archaeology that the story of Daniel in the lion's den was a popular symbol of resurrection (and of Jesus) among early Christians.¹³³ The story is told in the Old Testament book of Daniel, which we know was written (and translated into Greek) over a century before the time of Christ, and was very popular. As the story goes, when Daniel was entombed with the lions, and thus facing certain death, the Persian king Darius placed a “seal” on the stone “so that nothing might be changed in regard to Daniel” (Daniel 6:17). The same thing is done in Matthew's story of Jesus’ burial: the Jewish authorities place a “seal” on his tomb, and post a guard, so they could be sure his body stayed put—a whole incident that conspicuously doesn't occur in any other Gospel, not even Matthew's source, Mark, whose account (as also Luke's and John's) thoroughly contradicts Matthew's additions in this regard (thus confirming they are fictional creations of Matthew—see previous discussion in this chapter, page 128).¹³⁴ The absence in every other Gospel, especially Matthew's source Mark, of guards, seal, even prior awareness of a possible plan to steal the body, as well as an ensuing miraculous angelic act, is less likely on the theory that any of this actually happened, than on the theory that Matthew made it all up, intentionally embellishing the story he received from Mark. This fact alone entails P(e|HISTORICAL.b) << P(e|MYTHICAL.b). But we needn't stop there.

In Matthew the placing of the seal (Matthew 27:66) is described with the exact same verb used in the Greek edition of the Daniel story (sphragizô), which is in both stories a rather unusual detail. This evokes a meaningful parallel: Jesus, facing real death, and sealed in the den like Daniel, would, like Daniel, escape death by divine miracle, defying the seals of man. The parallels are too dense to be accidental: like the women who visit the tomb of Jesus at the break of dawn (Matthew 28:1), the king visits the tomb of Daniel at the break of dawn (Daniel 6:19); the escape of Jesus signified eternal life, and Daniel at the same dramatic moment wished the king eternal life (Daniel 6:21; cf. 6:26); in both stories, an angel performs the key miracle (Matthew 28:2, Daniel 6:22); and after this miracle in Matthew, the guards curiously become “like dead men” (Matthew 28:4) just as Daniel's accusers are thrown to the lions and killed (Daniel 6:24). The very unusual choice of phrase “like dead men” in Matthew thus becomes explicable as an allusion to these victims in Daniel. The angel's description is also a clue to the Danielic parallel: in the Septuagint version of Daniel 7:9, an angel is described as “and his garment white as snow”; in Matthew 28:3, the angel is described as “and his garment white as snow,” in the Greek every word identical but one (and that a cognate), and every word but one in the same order. Another angel in Daniel 10:6 is described as “his outward appearance as a vision of lightning” while the angel in Matthew is similarly described as “his appearance as lightning.” The imagery is thus a Danielic marker: Matthew is getting his ideas of what an angel looks like and how to describe one not from eyewitnesses but from Daniel, exactly where he's getting the lion's den story. (Matthew was fond of expanding on Mark by lifting ideas from Daniel, e.g., Matthew 17:6–8 takes material from Daniel 10:7–12 to expand on elements of Mark 9.)

Furthermore, Matthew alone among the Gospels ends his story with a particular commission from Jesus (Matthew 28:18–20) that matches many details of the ending of the Greek version of Daniel's adventure in the den: Jesus says God's power extends “in heaven and on earth,” to “go and make disciples of all nations” and teach them to observe the Lord's commands, for Jesus is with them “always” even “unto the end.” And so King Darius, after the miraculous rescue of Daniel, sends forth a decree “to all nations” commanding reverence for the Jewish God, who lives and reigns “always” even “unto the end,” with power “in heaven and on earth” (Daniel 6:25–28). The latter phrase in Greek is even identical in both cases. The stories thus have nearly identical endings. Indeed, the king's decree in Daniel reads like a model for the very Gospel message itself (see Daniel 6:25–27). And the episodes are framed the same way: in both Matthew and the Greek text of Daniel the stories introduce their parallel structure with the same verb and object, “to seal” (sphragizô) the “stone” (lithon), and conclude it with the same teaching about the Lord reigning until the “end” (telos; sunteleia) of the “eons” (aiôn; aiônas).

Since the placing of a “seal” is essential to creating the Danielic parallel, Matthew has a motive for inventing the entire motif of the guards in order to create the pretext, not only for the sealing, but for the clue of “becoming like dead men” and the angelic “miracle,” all elements unique to his story. There are more telltale signs that this story is fabricated, and that Matthew fabricates other stories like this with some frequency, but what I've summarized here is enough for the present purpose. The evidence e thus includes not only the fact that this entire story is unexpectedly unique to Matthew (despite our having three other versions to compare, including Matthew's own source), but also all those other facts that link his unique changes to the story to the book of Daniel and its tale of the lion's den. Since the latter is entirely expected on the hypothesis of fabrication, but not as expected on the hypothesis of historicity, what was already P(e|HISTORICAL.b) << P(e|MYTHICAL.b) becomes P(e|HISTORICAL.b) <<< P(e|MYTHICAL.b). We must also begin with P(HISTORICAL|b) << P(MYTHICAL|b) owing to the fact that integral to the story is a grandiosely flying, paralyzing angel, which is inherently more likely to be made up than true (since we just don't see such things happening in the real world—thus, if they happen at all, it's but rarely, whereas made-up stories about magical beings are extremely common)—the more so if we can establish a trend of fabrication and mythmaking in Matthew (and we can), but we needn't add that in here. Just with what we have enumerated here, no matter what numbers are entered in, the end result is going to be P(HISTORICAL|e.b) → 0.
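
The way the separate considerations compound can be sketched with the odds form of Bayes's Theorem. This is only an illustration: the helper `posterior_from_odds` is hypothetical, and the three numbers are placeholders chosen to be far more generous to historicity than "<<" and "<<<" would suggest, not estimates defended anywhere in this text. The sketch also assumes each ratio is assessed with the earlier evidence already folded into b, so that the ratios can simply be multiplied.

```python
from math import prod

def posterior_from_odds(prior_odds: float, likelihood_ratios: list[float]) -> float:
    """Odds form of Bayes's Theorem.

    posterior odds = prior odds x product of likelihood ratios, where each
    ratio is P(e_i|HISTORICAL.b) / P(e_i|MYTHICAL.b), assessed with the
    earlier items of evidence already included in b.
    Returns P(HISTORICAL|e.b) = odds / (1 + odds).
    """
    if prior_odds < 0 or any(r < 0 for r in likelihood_ratios):
        raise ValueError("odds and likelihood ratios must be non-negative")
    odds = prior_odds * prod(likelihood_ratios)
    return odds / (1.0 + odds)

# Placeholder values only (all three are assumptions for illustration):
prior_odds = 1 / 10      # story hinges on a flying, paralyzing angel
ratio_silence = 1 / 10   # episode absent from Mark, Luke, and John
ratio_daniel = 1 / 100   # dense verbal and structural links to Daniel 6

p_historical = posterior_from_odds(prior_odds, [ratio_silence, ratio_daniel])
print(f"P(HISTORICAL|e.b) = {p_historical:.6f}")
```

Because every factor is below one, each added item of evidence can only push the product lower, which is why the posterior heads toward zero whatever particular values are entered, so long as they respect the inequalities stated above.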

Focusing on the emulation hypothesis alone, all six of MacDonald's criteria for literary emulation are met: the text being imitated (the Septuagint, which by then included the book of Daniel) was well-known and frequently used this way, and the comparison of Jesus with Daniel was a common one (so there are no deductions from the prior probability on that account); there are several significant parallels; the parallels often appear in the same order; the connection is confirmed by peculiar features (direct borrowing of terms and phrases; the unusual description of the guards); and the whole device reveals an obvious, intelligible meaning. Indeed, the story becomes interpretable, with obscure and seemingly confusing features suddenly making perfect sense (such as why the guards become “like dead men,” why the Jews bother with a seal when they have a guard, why an angel has to intervene even though Jesus is apparently no longer even in the tomb, and why Matthew alone reports any of this). All of these factors are more expected (and thus more likely) on the hypothesis of emulatory fabrication than on any other hypothesis. For all these coincidences with the Daniel story to have actually happened is intrinsically improbable (even if not impossible, it is certainly not highly probable; in fact it's highly improbable, the likes of which are never found by chance anywhere else), but for these coincidences to exist as a product of emulatory fabrication is intrinsically likely (in fact, entirely expected).

Illustrating the principles set forth earlier, the fact that the description of the angel lifts a whole string of words directly from Daniel is very likely if Matthew is getting it from Daniel, but extremely unlikely if he is not. The probability of such a coincidence of words and word order being a product of random chance is extraordinarily small. The fact that one word is switched for a cognate, and one word is in reverse order, makes no significant difference to this calculation. One can propose the alternative theory that Matthew borrowed a description from Daniel to embellish an otherwise true account, but then you have the improbable coincidence that Matthew's story uniquely draws ideas from and makes allusions to a story unique to the book of Daniel, indeed one in which an angel also performs the key miracle in the narrative, the very angel Matthew then endeavors to describe by lifting angelic descriptions from that very same book. This coincidence is expected on the hypothesis of fabrication, but much less expected on the hypothesis that he is only embellishing a true story.

Likewise, all the deviations do not alter this ratio. For example, the fact that there are no lions in Matthew's version of the story does not make borrowing less likely, since removing them is exactly what we would expect (since Matthew is starting with a narrative from Mark that already excludes them, and in which introducing them would make no sense). The fact that “the chief priests and the Pharisees” place the seal, instead of a king (much less King Darius), does not make borrowing less likely, since Matthew's story already begins in a context that makes that particular switch likely: he is setting his own story in a different historical and political context, and has different identifiable literary aims, which both entail that elements of the Daniel story unsuited to his expected context and purpose will be dropped or altered (and no emulation hypothesis entails exactly which elements will be adapted, only that more will be than chance alone can easily explain).¹³⁵

In short, none of the changes make emulation less likely. But many elements do make it more likely; and some changes even make emulation more likely. For example, unlike in the lion's den, the guards are only paralyzed, yet oddly said to “become like dead men,” rather than actually being killed (much less by the ruling parties). But these differences are expected. The motive and occasion for killing Daniel's accusers doesn't exist in Matthew's story, and killing the guards would destroy Matthew's intended plot, which requires the guards to lie about what happened, and his story wouldn't work at all if the Christians were on the hook for murder. But that the guards are only described as “like dead men” is actually itself an improbable coincidence on the theory that Matthew isn't alluding to (and thus borrowing the idea from) the lion's den tale. Thus, though it marks a difference between the two stories, this detail actually reduces P(e|HISTORICAL.b) and increases P(e|MYTHICAL.b). This is because such a description is strange and unexpected, unless it's an allusion to the lion's den tale, in which case it's neither strange nor unexpected (or certainly much less so). Overall, that chance coincidences of history or storytelling would produce all these congruences is very improbable, but that emulating Daniel's tale of the lion's den would produce them all is far more probable. Thus the consequents favor emulation, especially when combined with all the other evidence (such as the contradictory silence of other Gospels).

This is just one example, and indeed it's not even the best. I chose it because of those features allowing us to confirm the story really is fabricated independently of any application of emulation criteria. But even using those criteria alone it's sufficiently strong to be clear (to anyone not dogmatically set against the conclusion) that Matthew made all this up to equate the tomb of Jesus with Daniel's den of lions. MacDonald's criteria pertain, confirming that what is evident is genuinely there. And their capacity to do this is entirely explained and validated by Bayes's Theorem.

BAYESIAN DEMONSTRATIONS OF AHISTORICITY

In the last three chapters I've made a strong case that all valid historical methods are described by BT and that BT proves the methods so far used by Jesus historians are either invalid or invalidly employed. It follows that we must use BT.

Applying BT to the specific question of whether some person, place, or event actually existed or not is merely a matter of ascertaining the prior and consequent probabilities. What is the prior probability that Jesus actually existed? What is the prior probability that he didn't? How likely is the evidence we have if he did exist? How likely is the evidence we have if he didn't? Given the most obvious answers to these questions, at first glance it seems surely “Jesus existed” would win out as the most probable hypothesis on BT. In my next volume (On the Historicity of Jesus Christ) I'll reveal that on second glance, that conclusion is not so obvious, and might even be wrong. But you needn't believe that now. I'm obligated to prove it, and to persuade my expert peers thereby. If I can't, I'm wrong. But I won't undertake that task here. Here I will only close with the required methodology (adding to the relevant remarks on this task already concluding chapter 4).

The two hypotheses to test will be h = “Jesus was a historical person mythicized” and ~h = “Jesus was a mythical person historicized.” The only other logical possibilities are h₀ = “historical person not mythicized” and ~h₀ = “mythical person not historicized,” but our background evidence firmly establishes the prior probability of either of those is vanishingly small (all reasonable Jesus scholars agree Jesus was mythicized to some degree; and even those who would deny he existed agree he was historicized), while the consequent probability of the evidence favors neither of those over h and ~h (i.e., h and ~h make all the evidence just as likely or far more so than either h₀ or ~h₀). So we can disregard h₀ and ~h₀ (their consequents lend them no credence and their priors are so small they won't even be visible in our math). So the prior probabilities of h and ~h must still in effect sum to one. I will present a method for determining their value in On the Historicity of Jesus Christ, after more carefully defining h and ~h. That leaves estimating their consequents, which will then occupy the rest of that volume.
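
Why h₀ and ~h₀ can be dropped without affecting the result can be checked with a short Python sketch. Everything in it is a placeholder: the helper `posteriors` is hypothetical, the equal priors and equal consequents are deliberately neutral values that take no position on the actual question, and `tiny` merely stands in for "vanishingly small."

```python
def posteriors(priors: dict[str, float], consequents: dict[str, float]) -> dict[str, float]:
    """Bayes's Theorem over mutually exclusive, jointly exhaustive hypotheses:
    P(h_i|e.b) = P(h_i|b)P(e|h_i.b) / sum_j P(h_j|b)P(e|h_j.b)
    """
    total = sum(priors[name] * consequents[name] for name in priors)
    if total == 0.0:
        raise ValueError("the evidence is impossible on every hypothesis")
    return {name: priors[name] * consequents[name] / total for name in priors}

tiny = 1e-9  # placeholder for a vanishingly small prior

# All four logical possibilities; the priors sum to one.
full_priors = {"h": 0.5 - tiny, "~h": 0.5 - tiny, "h0": tiny, "~h0": tiny}
# Neutral placeholder consequents: h0 and ~h0 explain the evidence no
# better than h and ~h do.
full_consequents = {"h": 0.3, "~h": 0.3, "h0": 0.3, "~h0": 0.3}
full = posteriors(full_priors, full_consequents)

# The simplified two-hypothesis version, with priors summing to one.
two = posteriors({"h": 0.5, "~h": 0.5}, {"h": 0.3, "~h": 0.3})

print(f"P(h|e.b) with all four hypotheses: {full['h']:.12f}")
print(f"P(h|e.b) with h and ~h only:       {two['h']:.12f}")
```

The two results differ only far beyond the precision of any a fortiori estimate, which is the sense in which the discarded hypotheses are not visible in the math.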

If it's more inherently likely that a savior god like Jesus would be mythical, then the priors will favor ~h, and if proposing this actually explains all the evidence better than any alternative, then the consequents will also favor ~h. And if both the priors and consequents favor ~h, then we must conclude that Jesus probably didn't exist. But all these values are conditional on background knowledge. And that's where the debate often focuses, and contemporary scholars feel certain that deniers of the historical existence of Jesus fail to grasp the extent and significance of that background knowledge, and thus grossly misestimate the probabilities of different theories of the evidence. Deniers, meanwhile, charge historicists with ignoring telltale oddities in the evidence that make little sense unless Jesus never really existed to begin with and was only created after the fact. Who is right deserves another look. But that's for another time. What has been shown here is that it is at least logically possible to prove from existing evidence that Jesus (probably) didn't really exist, and how one can legitimately do that. Whether the evidence goes that way, however, cannot be presumed.
