Bibliography

1: G.B. Bell and A. Sethi, Matching Records in a National Medical Patient Index, Communications of the ACM, Vol. 44 No. 9, September 2001.
2: D.P. Bertsekas, Auction Algorithms for Network Flow Problems: A Tutorial Introduction, Computational Optimization and Applications, Vol. 1, pp. 7-66, 1992.
3: V. Borkar, K. Deshmukh and S. Sarawagi, Automatic segmentation of text into structured records, in Proceedings of the 2001 ACM SIGMOD international conference on Management of Data, Santa Barbara, California, 2001.
4: P. Christen, T. Churches and J.X. Zhu, Probabilistic Name and Address Cleaning and Standardisation, Proceedings of the Australasian Data Mining Workshop, Canberra, December 2002.
5: T. Churches, P. Christen, K. Lim and J.X. Zhu, Preparation of name and address data for record linkage using hidden Markov models, BioMed Central Medical Informatics and Decision Making, 2002, 2:9, http://www.biomedcentral.com/1472-6947/2/9/
6: W.W. Cohen, The WHIRL Approach to Integration: An Overview, in Proceedings of the AAAI-98 Workshop on AI and Information Integration. AAAI Press, 1998.
7: M.G. Elfeky, V.S. Verykios and A.K. Elmagarmid, TAILOR: A Record Linkage Toolbox, Proceedings of ICDE 2002, San Jose, USA, 2002.
8: I. Fellegi and A. Sunter, A Theory for Record Linkage, Journal of the American Statistical Association, 1969.
9: H. Galhardas, D. Florescu, D. Shasha and E. Simon, An Extensible Framework for Data Cleaning, Technical Report 3742, INRIA, 1999.
10: L. Gill, Methods for Automatic Record Matching and Linking and their use in National Statistics, National Statistics Methodology Series No. 25, London 2001.
11: J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.
12: M.A. Hernandez and S.J. Stolfo, The Merge/Purge Problem for Large Databases, in Proceedings of the SIGMOD Conference, San Jose, 1995.
13: C.W. Kelman, Monitoring Health Care Using National Administrative Data Collections, PhD thesis, Australian National University, Canberra, May 2000.
14: A.J. Lait and B. Randell, An Assessment of Name Matching Algorithms, Technical Report, Department of Computing Science, University of Newcastle upon Tyne, UK, 1993.
15: J.I. Maletic and A. Marcus, Data Cleansing: Beyond Integrity Analysis, in Proceedings of the Conference on Information Quality (IQ2000), Boston, October 2000.
16: AutoStan and AutoMatch, User's Manuals, MatchWare Technologies, Kennebunk, Maine, 1998. See also: www.fcsm.gov/working-papers/software-demos.pdf
17: A. McCallum, K. Nigam and L.H. Ungar, Efficient clustering of high-dimensional data sets with application to reference matching, Knowledge Discovery and Data Mining, 169-178, 2000.
18: U.Y. Nahm, M. Bilenko and R.J. Mooney, Two Approaches to Handling Noisy Variation in Text Mining, in Proceedings of the ICML-2002 Workshop on Text Learning (TextML'2002), pp.18-27, Sydney, Australia, July 2002.
19: H.B. Newcombe and J.M. Kennedy, Record Linkage: Making Maximum Use of the Discriminating Power of Identifying Information, Communications of the ACM, Vol. 5 No. 11, 1962.
20: L. Philips, The Double-Metaphone Search Algorithm, C/C++ Users Journal, Vol. 18 No. 6, June 2000.
21: E.H. Porter and W.E. Winkler, Approximate String Comparison and its Effect on an Advanced Record Linkage System, Research Report RR97/02, US Bureau of the Census, 1997.
22: L.R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, in Proceedings of the IEEE, Vol. 77, No. 2, February 1989.
23: E. Rahm and H.H. Do, Data Cleaning: Problems and Current Approaches, IEEE Bulletin of the Technical Committee on Data Engineering, Vol. 23 No. 4, December 2000.
24: K. Seymore, A. McCallum and R. Rosenfeld, Learning Hidden Markov Model Structure for Information Extraction, in Proceedings of AAAI-99 Workshop on Machine Learning for Information Extraction, 1999.
25: V.S. Verykios, A.K. Elmagarmid and E.N. Houstis, Automating the Approximate Record-Matching Process, Information Sciences, Vol. 126, July 2000.
26: V.S. Verykios, A.K. Elmagarmid, M.G. Elfeky, M. Cochinwala and S. Dalal, On the Completeness and Accuracy of the Record Matching Process, in Proceedings of the MIT Conference on Information Quality, Boston, MA, October 2000.
27: W.E. Winkler and Y. Thibaudeau, An Application of the Fellegi-Sunter Model of Record Linkage to the 1990 U.S. Decennial Census, Research Report RR91/09, US Bureau of the Census, 1991.
28: W.E. Winkler, Quality of Very Large Databases, Research Report RR2001/04, US Bureau of the Census, 2001.
29: W.E. Yancey, Frequency-Dependent Probability Measures for Record Linkage, Research Report RR00/07, Statistical Research Division, US Bureau of the Census, July 2000.
30: W.E. Yancey, BigMatch: A Program for Extracting Probable Matches from a Large File for Record Linkage, Research Report RR 2000-01, Statistical Research Division, US Bureau of the Census, March 2002.