Early changes in WRR's success measure

Brendan McKay, Australian National University

This is an appendix to our rejoinder to Witztum's response to our analysis of variations.

NOTE: We have corrected and expanded this page in response to some criticism by Doron Witztum of the first edition. In particular, Witztum claims we omitted to present the contents of his 1987 preprint because it supports him. In fact it supports us, as we shall see.

Here are two examples where the success measure presented by WRR in their earliest preprints differs mathematically from the success measure published by them in Statistical Science. The 1986 preprint presented the first list of rabbis, and the 1987 preprint presented the second list. Neither preprint mentioned the permutation test that appeared in Statistical Science.

Which letters are perturbed?

The first example concerns the exact nature of the "perturbations" used to determine the strength of an ELS convergence.

From the 1986 preprint:

From the 1987 preprint:

Note how 3 perturbations starting at the second gap became k-2 perturbations (k being the word length) and then became 3 again. Also note how the text meticulously avoids stating which 3 of the k-2 gaps are to be perturbed. This is interesting because in Statistical Science it is the last 3 gaps, not the 2nd, 3rd and 4th gaps. (We strongly suspect the second preprint was deliberately ambiguous so as to match both what came before and what came after.)

Thus we see that Witztum's claim that "the description in the second preprint accords perfectly with that of the first one" accords perfectly with a bluff.

In this case there is evidence that the last 3 gaps were used in all three computations, contrary to the picture and text in the first preprint. We suspect that the first preprint describes a more ancient variation, or maybe was just a mistake. The second preprint was then written carefully so that nobody would notice the description was changing.
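To make the difference between the two readings concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not WRR's program: the function names, the convention labels and the sample perturbation triple are all hypothetical. A k-letter ELS has k-1 gaps, so "the 2nd, 3rd and 4th gaps" and "the last 3 gaps" coincide only when k = 5.

```python
def perturbed_gap_indices(k, convention):
    """Return the 1-based indices of the three gaps that get perturbed.

    Hypothetical helper: the convention names are labels for this sketch,
    not terminology from the preprints.
    """
    if k < 5:
        raise ValueError("need at least 5 letters so that gaps 2, 3 and 4 exist")
    n_gaps = k - 1  # a k-letter ELS has k-1 gaps between consecutive letters
    if convention == "second_to_fourth":
        return [2, 3, 4]
    if convention == "last_three":
        return [n_gaps - 2, n_gaps - 1, n_gaps]
    raise ValueError("unknown convention: " + convention)


def perturbed_positions(start, skip, k, gap_indices, deltas):
    """Letter positions of an ELS after adding deltas to the chosen gaps."""
    gaps = [skip] * (k - 1)
    for gap_index, delta in zip(gap_indices, deltas):
        gaps[gap_index - 1] += delta
    positions = [start]
    for gap in gaps:
        positions.append(positions[-1] + gap)
    return positions


if __name__ == "__main__":
    deltas = (1, 0, -1)  # an arbitrary illustrative perturbation
    for k in (5, 7):
        for convention in ("second_to_fourth", "last_three"):
            idx = perturbed_gap_indices(k, convention)
            print(k, convention, idx, perturbed_positions(0, 10, k, idx, deltas))
```

For a 5-letter word both conventions perturb the same gaps, so the two descriptions are indistinguishable there. For longer words they move different letters, which is why the wording of the description matters.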

Which skips are used?

The second example concerns the choice of which skips to search for ELSs. We did not notice this example before our paper in Statistical Science was published.

From the 1986 preprint:

From the 1987 preprint:

It can be seen that the first preprint gave an exact formula, while the second gave only an English prescription. That prescription can easily be taken by a casual reader as describing what is given formally in the first preprint, but it does not. The word "approximately" in the first preprint, missing from the second preprint, gives this away. The exact wording of the second preprint in fact matches the mathematical definition given in Statistical Science, which is different from that in the first preprint. As an example of the difference it makes, consider the phrase "Rabbi David" in Genesis. The 1986 version gives a skip limit of 2678, while the Statistical Science version gives a skip limit of 2959.

Thus we see that Witztum's claim "the second preprint corresponded perfectly to the first preprint" corresponds perfectly to a deception.

We have been unable to determine which variation was used in the three computations. However, as before, we suspect that the second preprint was carefully written to avoid questions about why the definitions were changing.

Other changes to the algorithm

There are also reasons to suspect other variations were used, both before the first preprint (see here) and after the second preprint. The latter is the easiest to prove, as we can directly compare the distances given in WRR's second preprint with the program that they later distributed. The following diagram shows the considerable change that occurred.

It is not possible to determine what effect these changes had on the result of the permutation test, since the original distances for mismatched (name, date) pairs were not recorded.

In his reply, Witztum claims that the change of program made no difference, since "the differences are usually small and equally distributed in both directions". He notes that the value of P4 actually became larger. However, the result of the permutation test does not depend only on the value of P4, but also on the many distances that were not recorded in the preprints. Since Witztum can't (or won't) give us his earlier program, we have no way to investigate this further.

Note that WRR have assured us that the program they distribute is the same one that was used to perform the permutation test published in Statistical Science. If that is so, the histogram of distances had by then changed to the red one. However, in Statistical Science the histogram they give is the blue one, which they published in 1987. As in the other examples above, a plausible explanation is that they didn't wish to draw attention to the change.

Creator: Brendan McKay, bdm@cs.anu.edu.au.