The pseudo-appellation ha'Chaim

The "first list" of 34 rabbis appeared in a 1986 preprint of Witztum, Rips and Rosenberg (WRR). The appellations listed there are much the same as those that later appeared in the same authors' Statistical Science paper in 1994.

There are two exceptions. First, the preprint has many appellations shorter than 5 letters or longer than 8 letters. These were removed from the experiment by defining the rules to exclude them. Second, and more interestingly, there is an "appellation" that is not an appellation at all. This is the word ha'Chaim that was listed for Rabbi Chaim Ibn-Attar.

The word ha'Chaim is related to Ibn-Attar by being the second word of the name of his famous book Ohr ha'Chaim. By itself, it has never served Ibn-Attar as an appellation.

WRR acknowledge this problem themselves:

This reveals a number of important things. First, WRR admit that they put the word in after observing it in the text. This would have been a violation of correct experimental procedure even if it had been a valid appellation. The correct procedure was to prepare the data blindly, without looking into the text first. It is therefore reasonable to ask whether the correct procedure was followed for the genuine appellations either.

Except for this one word, the preprint doesn't say what procedure was followed in collecting the data. Only years later was it "revealed" that all the data was prepared independently by Professor S. Z. Havlin, who did no ELS searches of his own. The latter part of the story of course implies that Havlin could not have noticed ha'Chaim as an ELS in the text, so WRR must have added it themselves, in violation of Havlin's wishes.

After they had broken the rules in such a serious manner, they then admitted unease about it, as can be seen in the scan above. Again there is a problem. They didn't display any unease with the fact that they had looked in the text first, or that (by the official history) they had violated Havlin's wishes. The only reason they gave was that it wasn't a genuine appellation, which is precisely the one thing that any knowledgeable person would immediately notice on reading their table.

In my opinion, the real reason for including ha'Chaim can be seen in the first scan above. It performs extremely well, better than any genuine appellation in the entire list (with one possible exception). Why should we believe that none of the genuine appellations were included for the same reason?

Note that the 1986 preprint by itself does not show any clear signs of deceptive practices. Rather, what it shows is naive ignorance of experimental process and basic statistics. The more serious suspicions belong to the official history that appeared years later.

Brendan McKay

September 20, 1998


Notes

The full name of the book, which is correctly listed as an appellation, does not even have an ELS in Genesis.

The preprint presents the results in three forms: as a histogram of distances, as a count of distances less than 0.2, and as a value of the P2 statistic. All three were given with ha'Chaim included, and only the last without it. The value of P2 is 28 times better with ha'Chaim than without it.

The 1986 preprint also contains about a dozen long appellations which had changed by the time they appeared in the 1987 preprint. Because they are so long, they don't affect the results. However, one might still ask where this episode can be found in the official history.