Hebrew Data

Here are the five books of the Torah, Koren edition, in Michigan-Claremont transliteration. Note that the files contain spaces and numbers, and that the verse numbers are reversed. To use these files in programs, you might first need to delete everything except the Hebrew letters.
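As a rough illustration of that clean-up step, here is a minimal Python sketch. It assumes the letters are written with the usual Michigan-Claremont consonant symbols; that symbol set, the name keep_letters, and the sample line are illustrative assumptions and are not taken from the files themselves, so check them against the actual data before relying on this.

```python
# Assumed Michigan-Claremont consonant symbols (an assumption, not taken
# from the files): aleph ")", ayin "(", tet "+", sin "&", shin "$",
# unpointed sin/shin "#", and upper-case Latin letters for the rest.
MC_CONSONANTS = frozenset(")BGDHWZX+YKLMNS(PCQR&$#T")


def keep_letters(text, alphabet=MC_CONSONANTS):
    """Return `text` with every character outside `alphabet` removed.

    Spaces, digits and line breaks are dropped because they are not
    members of the alphabet.
    """
    return "".join(ch for ch in text if ch in alphabet)


# Hypothetical input line, made up for illustration only.
sample = "1 1 1 BR)$YT BR) )LHYM"
print(keep_letters(sample))  # prints BR)$YTBR))LHYM
```

If the files turn out to use a different symbol set, pass it as the second argument instead of editing the function.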

We have also included the first 78064 letters of a Hebrew translation of War and Peace, as it is mentioned many times on this page. Unfortunately the words are not separated.

English Data

These English files contain normal punctuation and spacing, but we have attempted to strip out all letters except those that actually belong to the text. To use them in a codes program, you may need to remove every character that is not a letter.
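For the English files, the same kind of filtering might look like the sketch below. The function name letters_only is hypothetical, and folding everything to upper case is an assumption about what a codes program expects; drop the upper() call if case should be preserved.

```python
def letters_only(text):
    """Keep only the ASCII letters of `text`, folded to upper case.

    Punctuation, digits, whitespace and any non-ASCII characters are
    removed. Upper-casing is an assumption, not a requirement.
    """
    return "".join(
        ch.upper() for ch in text if ch.isascii() and ch.isalpha()
    )


print(letters_only("In the beginning, God created..."))
# prints INTHEBEGINNINGGODCREATED
```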


Various versions of WRR-style programs can be found here.
If you want to make your own word clusters in Hebrew, Greek or English, a good choice is CodeFinder.

Creator: Brendan McKay, bdm@cs.anu.edu.au.