Bibliography

1
G.B. Bell and A. Sethi, Matching Records in a National Medical Patient Index, Communications of the ACM, Vol. 44 No. 9, September 2001.

2
D.P. Bertsekas, Auction Algorithms for Network Flow Problems: A Tutorial Introduction, Computational Optimization and Applications, Vol. 1, pp. 7-66, 1992.

3
V. Borkar, K. Deshmukh and S. Sarawagi, Automatic segmentation of text into structured records, in Proceedings of the 2001 ACM SIGMOD international conference on Management of Data, Santa Barbara, California, 2001.

4
P. Christen, T. Churches and J.X. Zhu, Probabilistic Name and Address Cleaning and Standardisation, Proceedings of the Australasian Data Mining Workshop, Canberra, December 2002.

5
T. Churches, P. Christen, K. Lim and J.X. Zhu, Preparation of name and address data for record linkage using hidden Markov models, BioMed Central Medical Informatics and Decision Making, 2002, 2:9, http://www.biomedcentral.com/1472-6947/2/9/

6
W.W. Cohen, The WHIRL Approach to Integration: An Overview, in Proceedings of the AAAI-98 Workshop on AI and Information Integration. AAAI Press, 1998.

7
M.G. Elfeky, V.S. Verykios and A.K. Elmagarmid, TAILOR: A Record Linkage Toolbox, in Proceedings of ICDE 2002, San Jose, USA, 2002.

8
I. Fellegi and A. Sunter, A Theory for Record Linkage, Journal of the American Statistical Association, 1969.

9
H. Galhardas, D. Florescu, D. Shasha and E. Simon, An Extensible Framework for Data Cleaning, Technical Report 3742, INRIA, 1999.

10
L. Gill, Methods for Automatic Record Matching and Linking and their use in National Statistics, National Statistics Methodology Series No. 25, London, 2001.

11
J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.

12
M.A. Hernandez and S.J. Stolfo, The Merge/Purge Problem for Large Databases, in Proceedings of the SIGMOD Conference, San Jose, 1995.

13
C.W. Kelman, Monitoring Health Care Using National Administrative Data Collections, PhD thesis, Australian National University, Canberra, May 2000.

14
A.J. Lait and B. Randell, An Assessment of Name Matching Algorithms, Technical Report, Department of Computing Science, University of Newcastle upon Tyne, UK, 1993.

15
J.I. Maletic and A. Marcus, Data Cleansing: Beyond Integrity Analysis, in Proceedings of the Conference on Information Quality (IQ2000), Boston, October 2000.

16
AutoStan and AutoMatch, User's Manuals, MatchWare Technologies, Kennebunk, Maine, 1998. See also: www.fcsm.gov/working-papers/software-demos.pdf

17
A. McCallum, K. Nigam and L.H. Ungar, Efficient clustering of high-dimensional data sets with application to reference matching, Knowledge Discovery and Data Mining, pp. 169-178, 2000.

18
U.Y. Nahm, M. Bilenko and R.J. Mooney, Two Approaches to Handling Noisy Variation in Text Mining, in Proceedings of the ICML-2002 Workshop on Text Learning (TextML'2002), pp.18-27, Sydney, Australia, July 2002.

19
H.B. Newcombe and J.M. Kennedy, Record Linkage: Making Maximum Use of the Discriminating Power of Identifying Information, Communications of the ACM, Vol. 5 No. 11, 1962.

20
L. Philips, The Double Metaphone Search Algorithm, C/C++ Users Journal, Vol. 18 No. 6, June 2000.

21
E.H. Porter and W.E. Winkler, Approximate String Comparison and its Effect on an Advanced Record Linkage System, Research Report RR97/02, US Bureau of the Census, 1997.

22
L.R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, in Proceedings of the IEEE, Vol. 77, No. 2, February 1989.

23
E. Rahm and H.H. Do, Data Cleaning: Problems and Current Approaches, IEEE Bulletin of the Technical Committee on Data Engineering, Vol. 23 No. 4, December 2000.

24
K. Seymore, A. McCallum and R. Rosenfeld, Learning Hidden Markov Model Structure for Information Extraction, in Proceedings of the AAAI-99 Workshop on Machine Learning for Information Extraction, 1999.

25
V.S. Verykios, A.K. Elmagarmid and E.N. Houstis, Automating the Approximate Record-Matching Process, Information Sciences, Vol. 126, July 2000.

26
V.S. Verykios, A.K. Elmagarmid, M.G. Elfeky, M. Cochinwala and S. Dalal, On the Completeness and Accuracy of the Record Matching Process, in Proceedings of the MIT Conference on Information Quality, Boston, MA, October 2000.

27
W.E. Winkler and Y. Thibaudeau, An Application of the Fellegi-Sunter Model of Record Linkage to the 1990 U.S. Decennial Census, Research Report RR91/09, US Bureau of the Census, 1991.

28
W.E. Winkler, Quality of Very Large Databases, Research Report RR2001/04, US Bureau of the Census, 2001.

29
W.E. Yancey, Frequency-Dependent Probability Measures for Record Linkage, Research Report RR00/07, Statistical Research Division, US Bureau of the Census, July 2000.

30
W.E. Yancey, BigMatch: A Program for Extracting Probable Matches from a Large File for Record Linkage, Research Report RR 2002-01, Statistical Research Division, US Bureau of the Census, March 2002.