Tolstoy loves Brendan more than he loves Doron
by Brendan McKay

In July 1997, I posted on the internet the first of several "codes experiments" that appear to demonstrate remarkable information hidden in the Hebrew translation of War and Peace. Of course, I don't really believe in such hidden codes, and my page clearly stated that the demonstrated convergences occurred entirely by chance. The purpose of the page was to caution everyone against accepting Doron Witztum's "experiments" in the Book of Genesis.

Now Witztum, the Guru of the Codes, has published a "refutation" of my experiment, proving that all those nice patterns occurred entirely by chance. Since that is exactly what I claimed from the very beginning, it is cause for some hilarity.

Of course, if there is anyone who is qualified to identify problems with my experiment, it is Doron Witztum. After all, every trick that I used was invented by him.

The purpose of the present document is not to defend my experiment, though I might not be able to resist showing some flaws in Witztum's refutation. I am responding because parts of Witztum's article clearly show him to be a fraud. That information is not known as widely as it should be.

We will use the Michigan-Claremont encoding of Hebrew letters. My original paper and Witztum's reply (transliterated) are available if readers wish to see their full text.

Let's go.

On July 31st, 1997, Dr. Brendan McKay of the Australian National University presented an article on the internet, in which he claims that the personal details of his life "amazingly" appear encoded at minimum ELSs along with his name, in Tolstoy's War and Peace. He claims the probability of this occurring by chance to be less than one in five thousand.
In this response, we will show that Dr. McKay achieved his results through a combination of systematic error and very partial reporting of the number of trials conducted.
The two errors were employed by Dr. McKay with the aim of creating a "counterfeit" effect of success. We will demonstrate that the real probability involved in this experiment corresponds exactly with what is expected to take place randomly.

Note how Witztum carefully avoids mentioning that my article itself states that the events displayed occurred by chance. He wants to trick his readers into thinking he is showing something contrary to what I wrote, not something that agrees with what I wrote. Let me quote myself:

The lesson to be drawn from this paper is clear enough. Anyone with the skill and the perseverance can make ELS experiments that seem to show remarkable results. In this paper we found a significance level well below 1/1000 from a single name and a single date. Did it happen by chance? Yes!

This type of sample, in which we measure the proximities between ELSs of a single expression and those of a list of other expressions, we termed a "heading sample". In our paper "Hidden Code in the Book of Genesis", ("CPN XMWY BSPR BR)$YT" preprint 1996; originally presented as a lecture before the Israeli National Academy of Sciences, 19 March, 1996), we analyzed this type of sample, and we explained how to carry out the randomization process for it.

In true cult-leader style, Witztum is laying down "the rules". No matter that he doesn't have any relevant qualifications. No matter that his work is riddled with the most elementary errors.

The Method of Measurement

As we explained in the above-mentioned paper, the proper method for analyzing a heading sample has three stages:
1. We use the function c, ("the Corrected Distance" in the Statistical Science paper), to measure proximity between the heading and each expression w (from the list). However, the function has the following alteration: the heading is only taken as ELSs, while the expression w is taken as ELSs and as PLSs (perturbed letter sequences). In other words, ELSs of w compete with the PLSs of w over the more successful proximities to the ELSs of the heading.
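For readers new to the terminology: an ELS (equidistant letter sequence) of a word is an occurrence of its letters at a constant skip in the text. The following is a minimal Python sketch of an ELS search, not the software used by either side. The function name find_els, the restriction to positive skips, and the toy text are assumptions made for illustration, and the perturbed sequences (PLSs) are not implemented.

```python
def find_els(text, word, max_skip):
    """Return (start, skip) pairs at which `word` occurs as an ELS in `text`.

    Illustrative only: positive skips from 1 to max_skip, no PLSs.
    """
    hits = []
    n, k = len(text), len(word)
    for skip in range(1, max_skip + 1):
        span = (k - 1) * skip  # offset from the first letter to the last
        for start in range(n - span):
            # Letters at start, start+skip, ..., start+span
            if text[start:start + span + 1:skip] == word:
                hits.append((start, skip))
    return hits


# Toy text: "abc" appears with skip 2, starting at index 1.
print(find_els("xaxbxcx", "abc", 3))  # [(1, 2)]
```

An ordinary occurrence of the word in the running text is simply an ELS with skip 1.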

Lest anyone be misled, the "alteration" Witztum mentions here is not in his preprint. So why is he mentioning it? Anyone who follows Witztum's work will know the answer immediately: because it is to his advantage! Since he is going to apply "The Correct Method" to my data and wants a bad result, we can bet our life savings that his "alteration" will make the result worse. Sure enough, it does -- by a factor of 4 or 5.

The pattern is always the same. Witztum's ability to make a priori decisions in his favour is simply mind-boggling. A true prophet in our midst!

Charismatic words

1. A few months ago, (in a posting called "Equidistant Letter Sequences in Genesis - A Report", Feb. 22, 1997), Dr. McKay noted that some of the names in the list of Rabbis included in the experiment published in Statistical Science, had the following property: their ELSs appeared more often than their PLSs. This advantage may allow for easier "successful" proximities with ELSs of other expressions. Suppose that expression x possesses this advantage. If we calculate the proximities between its ELSs and those of the series of expressions, w1, w2, ..., wn, we will receive an overabundance of "successful" results. This will happen whether the other expressions are related to x or not. This is a systematic error because we exploit a certain property of the expression more than once.

Mr Witztum is deliberately distorting what I wrote. It is not some of the names; it is the set of names as a whole that has many more ELSs than PLSs. The difference is very important, as we shall see in a moment.

This possible advantage, connected to the number of ELSs of a word, may be only one of the advantages that an expression may have. Regardless of the origin of the advantage, let us call expressions with these types of advantages "charismatic" expressions.
Naturally, besides charismatic expressions, some expressions may also be "anti-charismatic". For instance, if the ELSs of an expression appear less often than its PLSs. Such expressions will tend to produce "failures" in measuring proximities with other expressions. In the sample of the Rabbis, there were many names and appellations. Some may be charismatic, and others may be anti-charismatic. One would expect that these effects will balance each other out.

Mr Witztum is fully aware that the effects do not cancel out. The set of names is charismatic on average. More on this below.

Aside: The test specified by Persi Diaconis

The permutations test proposed by Professor Persi Diaconis, as described in the Statistical Science paper

Is it really possible that six months or more after both Prof Rips and Prof Aumann admitted that Diaconis' test was not used, Doron Witztum still doesn't know? I don't think it is possible. Is it really possible he thinks I won't catch him in this lie? What sort of mind is this we are dealing with?

Witztum is consistent in this deception. For example, in a Dec 1997 letter to the Israeli magazine Galileo he wrote:

After the great success of the second list's measurement, Prof. Diaconis suggested we use a new method of measurement on the second list. We did so, and the surprising results of our experiment are brought at the beginning of our letter.

Let's set the record straight on this. An exchange of letters between Persi Diaconis (representing the journal to which the paper had been submitted) and Robert Aumann (representing Witztum and Rips) took place in 1990 in order to establish the rules for the experiment. The method finally agreed to was described in a letter from Diaconis to Aumann on September 5. We have made a scan of the letter available with the kind permission of Professor Diaconis.

Prof Diaconis's test required calculation of an aggregate distance from each set of names to each set of dates. This is mathematically incompatible with the test actually done. There is no 32 x 32 table of distances in the WRR method. Instead they performed a different test - one that could easily have been predicted to have a great chance of success given the tests they had already done.

Prof Diaconis's 32 x 32 table can be calculated from the WRR distances in a variety of ways. One, which he suggested himself in an earlier letter, is to take the minimum distance. Another, suggested by him later, is to take the average. Both of these options make the result hundreds of times worse. So much worse that, according to my calculations, they would not have achieved the criterion for success indicated in Diaconis' letter.
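To make the two aggregation options concrete, here is a minimal Python sketch of a test in this style. It is a sketch under stated assumptions, not the procedure from the letter: the function names, the use of the diagonal sum as the statistic, and the toy numbers are all hypothetical. Each cell of the table is assumed to hold a non-empty list of distances between one set of names and one set of dates, with smaller meaning closer.

```python
import random
from statistics import mean


def aggregate(table, how):
    """Collapse each cell's list of distances to one number ("min" or "mean")."""
    reduce = min if how == "min" else mean
    return [[reduce(cell) for cell in row] for row in table]


def permutation_rank(matrix, trials, seed=0):
    """Rank of the diagonal sum among sums for random permutations (1 = best).

    Assumption: the statistic is the sum of matrix[i][perm[i]], smaller is better.
    """
    rng = random.Random(seed)
    n = len(matrix)
    observed = sum(matrix[i][i] for i in range(n))
    rank = 1
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        if sum(matrix[i][perm[i]] for i in range(n)) < observed:
            rank += 1
    return rank


# Toy 2 x 2 table of distance lists (hypothetical numbers).
table = [[[1, 3], [2]], [[4], [5, 7]]]
print(aggregate(table, "min"))   # [[1, 2], [4, 5]]
print(aggregate(table, "mean"))
```

The point of the sketch is only that a test of this kind needs one number per cell of the table, so the choice of aggregation has to be made before any permutation is examined.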

Another little-known aspect of the history is that Diaconis asked them around the same time to prepare a third list of rabbis. He was concerned by the fact that the second list had already been in existence for three years. With the vigorous support of a senior mathematician who should have known better, they adamantly refused. To his later regret, Diaconis gave in. The rest is history.

One of the reasons given for not preparing a third sample is that less famous rabbis are not so likely to be "encoded". This argument totally ignores the fact that the second list of rabbis gave better results than the first list. In fact, statistical analysis shows that the individual performance of each rabbi has very low correlation with the size of the entry in the encyclopedia. If there is a trend at all, it is towards less famous rabbis performing better.

Back to charismatic names

is supposed to eliminate any such residual effects of charisma for symmetry considerations. If we assume that the success of the proximities of name x is due to its charisma, then it should be just as successful with unrelated dates to which it is paired as to related dates. Thus, the permutation test should cancel the "charisma" (advantage) of individual terms.
However, Dr. McKay (in his Report) argued that the permutations test may not have succeeded in canceling this effect.

Dr McKay was right. As mentioned above, the names of the rabbis are charismatic on average. This implies that the more distances that are measured from those names, the better the score will be (on average). It so happens that the permutation that matches each rabbi to his own dates produces more word-pairs and so more distances than over 98% of random permutations. Therefore it has some advantage. This statement has the non-negotiable status of a mathematical theorem.

This property of being charismatic on average is in fact the primary reason why the original (totally invalid) probability P2 gave a result much better than the WRR permutation test gives.
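The word-pair counting argument can be illustrated with a toy Python sketch. It assumes, purely for illustration, that every name form of a rabbi is paired with every date form matched to him, so that a matching contributes (number of names) x (number of dates) pairs per rabbi; the actual rules for forming pairs are more restrictive, and the counts below are invented.

```python
import random


def pair_count(name_counts, date_counts, perm):
    """Number of name-date word pairs when rabbi i is matched to the dates of perm[i]."""
    return sum(name_counts[i] * date_counts[perm[i]] for i in range(len(perm)))


def fraction_with_fewer_pairs(name_counts, date_counts, trials, seed=0):
    """Fraction of random permutations producing fewer pairs than the true matching."""
    rng = random.Random(seed)
    n = len(name_counts)
    identity = pair_count(name_counts, date_counts, list(range(n)))
    fewer = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        if pair_count(name_counts, date_counts, perm) < identity:
            fewer += 1
    return fewer / trials


# Hypothetical counts: rabbis with many name forms also have many date forms.
names = [1, 2, 3]
dates = [1, 2, 3]
print(fraction_with_fewer_pairs(names, dates, 1000))
```

When the two counts rise together, the true matching yields more pairs than most permutations, which is the kind of asymmetry described above.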

We thought Dr. McKay was wrong, and that Professor Diaconis' permutation test was a good one.

So why didn't you use it?

Nonetheless, it occurred to us that to demonstrate to Dr. McKay that he is incorrect, one could apply the test described in B1 (described in the above mentioned paper). In a response sent to him (A Preliminary Analysis and Comments on the Report of New ELS Tests, June 19, 1997. Posted on the internet), Professor Eliyahu Rips proposed the following test (Stage 1 in section B above):
To repeat the experiment described in Statistical Science, with the single variation that the names will be taken only as ELSs, while the dates as ELSs and PLSs. This simple procedure cancels all possible effects due to charismatic names.
When we carried out this experiment, it turned out that the results improved. Whereas the ranking of the most successful statistic P4 was originally four in a million, it now improved to one in a million. In other words, the success of the Statistical Science experiment was clearly not caused by this effect.

Here we see two more deceptions.

The first deception is in Witztum's comparison of the two computations. Has he forgotten that he already claimed that the "real" ranking of P4 is 59 in 100 million? He said so himself in his Hebrew preprint cited above. That's better than 1 in a million, not worse. To make the comparison more accurately, I computed both scores using the same 100 million permutations for each. The original method achieved a rank of 55, and the new method achieved a rank of 510. In other words, the change that Witztum claims as an improvement actually makes the result worse by a factor of 9.

The second deception (though to be fair it might be due to Witztum's ignorance of statistics) is the fallacy that one can measure the probability of an event just by looking more closely at that event. Wrong, one must examine the probability space that the event lies in. [Readers who don't know what I'm talking about can take solace from the knowledge that Mr Witztum probably doesn't either.]

2. When one is conducting an experiment that involves a heading sample, the need for caution cannot be overstated. If the expression being used as a heading happens to be charismatic, then it will "succeed" with many expressions, whether they are related or not.
It appears that Dr. McKay neglected to take into account his own observation: he happens to have used charismatic expressions for the heading in his sample, and seems not to have realized that the "success" of his results is entirely artificial. Moreover, the randomization technique Dr. McKay performed, instead of offsetting the charisma, ignored it altogether. Indeed, as per the discussion in section B above, the use of only ELSs for the headings and the appropriate randomization technique, is sufficient to make the artificial result of 1 in 5,000 into a real one of 1 in 40.

Witztum's test permutes the letters of the words on the right. Mine permutes the letters of the words on the left. My weakest students could see the symmetry in that. They are mathematically the same, and if there is a fallacy in my calculation, there is one in his too. Somehow, this "scientist" failed to notice this obvious fact.

One of the ways of writing my year of birth ($NTHT$YB) is extremely anti-charismatic. So much so that its distance to almost 1/3 of all words is exactly 1 (the largest possible). Obviously, permuting its letters provides him with precisely the same type of advantage that he claims I had from permuting the letters of "Brendan".
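The letter-permutation randomization discussed here can be sketched in Python as follows. This is a minimal illustration, not either party's program: the function name and the score argument are hypothetical, with score standing in for whatever proximity measure is applied to a word (smaller meaning better).

```python
from itertools import permutations


def letter_permutation_rank(word, score):
    """Rank `word` among all distinct rearrangements of its letters.

    Returns (rank, number_of_rearrangements); rank 1 means no rearrangement
    scores strictly better (lower) than the word itself.
    """
    variants = {"".join(p) for p in permutations(word)}
    own = score(word)
    rank = 1 + sum(1 for v in variants if score(v) < own)
    return rank, len(variants)


# Toy score (hypothetical): position of the letter B in the word.
print(letter_permutation_rank("BRNDN", lambda w: w.index("B")))  # (1, 60)
```

The same function applies unchanged whichever side of the comparison has its letters permuted; only the word passed in differs.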

Beyond this argument, there is the question of why it is an error to not allow for charisma. That can only be claimed on the basis of a thorough understanding of what the codes phenomenon is. Witztum thinks he knows the mind of the creator of the codes. [Although, since Witztum himself created the codes, I guess he is right. Damn, foiled again.]

More seriously, even if charisma is the entire explanation for the result of my experiment, Witztum has not given any reason why my name should be more charismatic in War and Peace than any permutation of its letters. Why aren't I entitled to claim that Tolstoy loves me on the basis of that?

"Dr McKay" is not very charismatic anyway.

3. In my opinion, there is an even more basic approach to the prevention of errors of this sort, and it touches on the question of the exact phenomenon being traced. Here is not the place to discuss this subject at length. It will be covered in detail in my forthcoming book.

We can hardly wait.

One more remark: in the summer of 1985, we set out to define a "distance" between two expressions, w and w' (a measure of proximity). In general, each expression will be represented by several ELSs. First, we define a "distance" between a particular ELS of w and a particular ELS of w'. We then have two options: a) To sum all the "distances" between ELSs of w and ELSs of w'. We call this option TOT. b) To take only the best "distance" of all these distances. We call this option BEST.
At the time, Professor Rips suggested that TOT would provide a more stable measure, so we used it for all our experiments, including the Rabbis experiments. This is the function described in our 1994 Statistical Science paper, from which function c is derived.

The mathematical expression "more stable" has no evident meaning in this context. Perhaps what it means is "gives a result 351 times better" (which indeed it does for the overall measure used then). Don't expect Doron Witztum to tell you that.
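The difference between the two options can be shown with a minimal Python sketch. It assumes, for illustration only, that each pair of ELSs has already been given a closeness score in which larger means closer; the real measure is more involved, and the numbers are invented.

```python
def tot(closeness):
    """TOT: add up the scores of all ELS pairs."""
    return sum(closeness)


def best(closeness):
    """BEST: keep only the single best score."""
    return max(closeness)


# Hypothetical scores for the ELS pairs of two expressions.
scores = [2, 5, 1]

# Doubling the number of ELS pairs (same scores again) doubles TOT
# but leaves BEST unchanged.
print(tot(scores), tot(scores * 2))    # 8 16
print(best(scores), best(scores * 2))  # 5 5
```

This is why the number of ELSs an expression happens to have matters far more for TOT than for BEST.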

Last year, I conducted several experiments (also described in detail in my forthcoming book), using the BEST measure. I believe that BEST avoids the problem of charisma, arising from ELSs appearing more often than expected. It must be stressed, however, that BEST should be considered only as an additional or complementary measure of proximity, not as a replacement for TOT.

From this statement we can positively conclude that Witztum has found some examples where BEST does better than TOT. It is quite impossible that he would have used it otherwise. However, from the last sentence we can also infer that he is painfully aware that it doesn't work at all well on some earlier experiments he adjusted for TOT. So he has to declare it "an additional or complementary measure" in an attempt to have it both ways at once. The argument is like that over whether swishing the tea-leaves in the clockwise or anti-clockwise direction makes the fortune-telling more accurate.

It is also impossible that he would have mentioned it here unless it did worse on my data.

Selection of the headings

After all that mucking around, Witztum is now finally getting to the real issue. As he knows very well, having done it many times himself, selection of the data is the most important ingredient in a well-cooked experiment. Rather than bother with a line-by-line response, I will quote my own description verbatim from a private letter I wrote to a fellow codes-skeptic on September 10, 1997.

The honest truth is that most of the 1/5400 is just good luck.

Let me consider what choices I had. I could have used a different text, but War and Peace is the only non-Bible Hebrew text I have and was also the one used by WRR. I could have used different dates, such as only day+month like WRR instead of the year as well. Whatever I chose, I had no real choice in the forms of dates. I could have used a different method of analysis, but as I showed it doesn't seem to matter terribly. Probably the only place where I deliberately made a choice in my favour was the spelling MQYY of "McKay". I could have used MKQY or MQQY, but would not have been able to write an excuse like I could for MQYY. Actually MKQY works quite a lot better than chance too, but not as well as MQYY. I could have used multiple spellings, but decided not to. Certainly the freedom I had was very limited.

I think that the real choice might have been to do the experiment at all. I found that BRNDN was good against my date of birth some time ago, while looking for an example for Drosnin. Only much later did I happen to try DRMQYY, after hearing that Maariv had spelled it that way. If it didn't work I would have forgotten about it and some other day found something different that worked.

To understand this, it is necessary to know that at the time I didn't know of any ways to write "McKay" except the three mentioned. If I had known of others I would have tried them too, but in fact I didn't. I also did not know of the WRR method for "heading samples", as it was only available in Hebrew at that time. It is interesting (but not really surprising to a mathematician) that the method I invented was nearly the same.

My letter also makes reference to other methods of analysis. Readers may well wonder why Witztum is studiously avoiding the fact that I used two completely different methods of analysis in my paper that both gave good results. He sure would like to "disprove" the other method too, but the poor guy can't. You see, the other method was almost identical to one proposed by Prof Rips. Funny but true.

Dr. McKay also provides us with an explanation why he ignored the very encyclopedia he used in his present article and why he is not consistent in his usage of MKQY as before. He states that the only time his name was mentioned in the Israeli press (the Israeli newspaper Maariv), it was spelled MQYY. In truth, we think that the transcription MQYY is a mistake altogether. A newspaper should not be considered an authoritative source for spellings of transliterated words. Nonetheless, since he used it, we will consider its use.
However, Dr. McKay's claim that his name was never otherwise mentioned in the Israeli press is simply not true. In an article in the magazine Mishpacha, published on July 3, 1997, (page 13, Column 1, Line 14), his name is spelled MQQYY. And if Dr. McKay should claim that he did not know of this publication, it would be a strange claim: the dedication at the head of his own article, LA.L. M$T"P )MYC (M HAMT, is based on that same piece in Mishpacha.

I knew of that scurrilous pack of filthy lies, yes, but I never thought of asking how it spelt my name. If I had, all I would have had to do was to write "the first time my name appeared in the Israeli press" instead of "the only time". My story would have been saved. Big deal. (Remember that the story only needs to be Witztum-quality, which is laughably easy.)

For people who can't read it, the dedication expresses my disgust at the author of the article, who hid behind a cowardly fake name. A brave and honest Haredi friend was labelled a "traitor" for daring to question Witztum and his codes. It was not an isolated example, and shows very clearly what the codes cult is like.

Recently I heard that Mishpacha had changed ownership. Let's hope that the new owners have the will to drag it out of the gutter.

The Selection of the Expressions in the List

Once Dr. McKay added QNBRH (Canberra, which is not the place of birth) to the list, he revealed that he had an inestimable space of biographical details to work with. Not only is it prohibited to include QNBRH on the list, it also casts a dark shadow on the remainder of the list.

Witztum is quite correct, though he might have mentioned that I put "Canberra" (and "Australia") not in the same list, but in a separate list. It is invalid, all right, so why does Witztum do it? The fact is that the practice of tuning a list by adding extraneous words is an old Witztum stunt. An example is discussed in my paper on his Auschwitz experiment.

Conclusion

We have shown that Dr. McKay's success was in fact artificial. For the sake of comparison, we will shortly publish an authentic example of a heading sample.

Save us.