Last update: 19 January 2014.
Thanks to everybody who has reported an error.
Please report any further errors you notice to
- Page 80, Figure 4.5: In the top-right sub-figure (Blocks A:
GivenName), the third row should have a Soundex encoding of
`p360' instead of `p630'.
- Page 80, again in Figure 4.5: In the sub-figure Candidate
record pairs from GivenName, the pair of record identifiers
should be (a3,b2) instead of (a3,b3).
- Page 81, Equation (4.2) is wrong. We have n records and
b blocks. In each block, we compare n/b *
(n/b-1)/2 record pairs, so c = b * n/b * (n/b-1)/2 = n/2 *
(n/b-1).
- Section 5.4, page 107: In the example at the bottom of the page,
the numerical values in the simjaccard equation are wrong
(`gail' has 3 bigrams and `gayle' has 4). The equation should read
simjaccard(`gail', `gayle') = 1 / (3 + 4 - 1) = 1 / 6 = 0.167
instead of
simjaccard(`gail', `gayle') = 1 / (3 + 3 - 1) = 1 / 5 = 0.2.
- Section 10.2, page 231: Another freely available data matching
system that was unfortunately missed is
Duke. For details
Duke is a practical data matching system developed in Norway,
with the aim of being easy to embed in other software.
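
The arithmetic in the entries for Equation (4.2) and the simjaccard
example can be checked with a short script. This is only a sketch and
not code from the book: the function names are made up for
illustration, blocks are assumed to be of equal size n/b, and bigrams
are assumed to be unpadded and counted as sets (which makes no
difference for `gail' and `gayle', since neither repeats a bigram).

```python
def comparisons_with_blocking(n, b):
    """Record pairs compared when n records fall evenly into b blocks.

    Assumes every block holds exactly n/b records (hypothetical helper).
    """
    block_size = n / b
    # Each block contributes block_size * (block_size - 1) / 2 pairs.
    return b * block_size * (block_size - 1) / 2


def bigrams(s):
    """Set of unpadded character bigrams of s."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard_bigram_sim(s1, s2):
    """Jaccard similarity: common / (count1 + count2 - common)."""
    q1, q2 = bigrams(s1), bigrams(s2)
    common = len(q1 & q2)
    return common / (len(q1) + len(q2) - common)


if __name__ == "__main__":
    n, b = 1000, 10
    # Both sides of the simplification c = n/2 * (n/b - 1).
    print(comparisons_with_blocking(n, b), n / 2 * (n / b - 1))
    # 'gail' -> ga, ai, il; 'gayle' -> ga, ay, yl, le; one bigram in common.
    print(round(jaccard_bigram_sim("gail", "gayle"), 3))
```

With n = 1000 and b = 10, both expressions for c give the same count,
and the similarity for `gail' and `gayle' comes out as 1 / 6.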