Publications and selected presentations by Peter Christen




Publications:

(see also my Google Scholar profile).

2024:

  1. Class ratio and its implications for reproducibility and performance in record linkage
    Jeremy Foxcroft, Peter Christen, and Luiza Antonie.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'24), Taipei, May 2024.
    Camera-ready paper (pdf, 493 KB)

  2. (Privately) Estimating Linkage Quality for Record Linkage
    Martin Franke, Victor Christen, Peter Christen, Florens Rohde, and Erhard Rahm.
    Accepted by the International Conference on Extending Database Technology (EDBT), March 2024.
    Accepted paper (pdf, 926 KB)

  3. Pattern Masking for Dictionary Matching: Theory and Practice
    Panagiotis Charalampopoulos, Huiping Chen, Peter Christen, Grigorios Loukides, Nadia Pisanti, Solon P Pissis, and Jakub Radoszewski.
    In the journal Algorithmica, March 2024.
    See here for open online access.

  4. When Data Science Goes Wrong: How Misconceptions About Data Capture and Processing Causes Wrong Conclusions
    Peter Christen and Rainer Schnell.
    In the Harvard Data Science Review (HDSR), February 2024.
    See here for open online access.

  5. Encryption-based sub-string matching for privacy-preserving record linkage
    Sirintra Vaiwsri, Thilina Ranbaduge, and Peter Christen.
    In the Journal of Information Security and Applications (Elsevier), January 2024.
    Camera-ready paper (pdf, 543 KB)

2023:
  1. A Critical Re-evaluation of Benchmark Datasets for (Deep) Learning-Based Matching Algorithms
    George Papadakis, Nishadi Kirielle, Peter Christen, and Themis Palpanas.
    Article available from arXiv.org, July 2023.

  2. A review of the F-measure: Its History, Properties, Criticism, and Alternatives
    Peter Christen, David J. Hand, and Nishadi Kirielle.
    Accepted by the ACM Computing Surveys, June 2023.
    Accepted paper (pdf, 723 KB)

  3. An Analysis of One-to-One Matching Algorithms for Entity Resolution
    George Papadakis, Vasilis Efthymiou, Emmanouil Thanos, Oktie Hassanzadeh, and Peter Christen.
    In the VLDB Journal, April 2023.

  4. A Vulnerability Assessment Framework for Privacy-Preserving Record Linkage
    Anushka Vidanage, Peter Christen, Thilina Ranbaduge, and Rainer Schnell.
    In ACM Transactions on Privacy and Security, April 2023.

  5. Tuning the Utility-Privacy Trade-Off in Trajectory Data
    Maja Schneider, Jonathan Schneider, Lea Löffelmann, Peter Christen, and Erhard Rahm.
    Accepted by the International Conference on Extending Database Technology (EDBT), Ioannina, Greece, March 2023.

  6. Rule-Based Knowledge Discovery via Anomaly Detection in Tabular Data
    Asara Senaratne, Peter Christen, Graham Williams, and Pouya Ghiasnezhad Omran.
    In Proceedings of the AAAI Spring Symposium on Challenges Requiring the Combination of Machine Learning and Knowledge Engineering (AAAI-MAKE 2023), San Francisco, March 2023.

  7. Evolution of Degree Metrics in Large Temporal Graphs
    Christopher Rost, Kevin Gomez, Peter Christen, and Erhard Rahm.
    Accepted by the 20th Conference on Database Systems for Business, Technology and Web (BTW-2023), Dresden, March 2023.

  8. Thirty-three Myths and Misconceptions about Population Data: from Data Capture and Processing to Linkage
    Peter Christen and Rainer Schnell.
    In the International Journal of Population Data Science (IJPDS), vol 8, number 1, January 2023.
    See here for a brief news article: Why misconceptions about population data can lead to bad outcomes.

2022:
  1. Privacy-Preserving Record Linkage using Autoencoders
    Victor Christen, Tim Häntschel, Peter Christen, and Erhard Rahm.
    Accepted by the International Journal of Data Science and Analytics (Springer), November 2022.
    Camera-ready paper (pdf, 466 KB)

  2. Big Data is not the New Oil: Common Misconceptions about Population Data
    Peter Christen and Rainer Schnell.
    Article available from arXiv.org, September 2022 (first published December 2021).

  3. Locality Sensitive Hashing with Temporal and Spatial Constraints for Efficient Population Record Linkage
    Charini Nanayakkara and Peter Christen.
    Proceedings of the International Conference on Information and Knowledge Management (CIKM), Atlanta, October 2022.
    Camera-ready paper (pdf, 709 KB)

  4. D-TOUR: Detour-based point of interest detection in privacy-sensitive trajectories
    Maja Schneider, Lukas Gehrke, Peter Christen, and Erhard Rahm.
    Privacy and Security at Large Workshop, Bonn, September 2022.

  5. A Taxonomy of Attacks on Privacy-Preserving Record Linkage
    Anushka Vidanage, Thilina Ranbaduge, Peter Christen, and Rainer Schnell.
    In the Journal of Privacy and Confidentiality (JPC), July 2022.

  6. Unsupervised Identification of Abnormal Nodes and Edges in Graphs
    Asara Senaratne, Peter Christen, Graham Williams, and Pouya G. Omran.
    In the ACM Journal of Data and Information Quality (JDIQ), July 2022.

  7. Unsupervised Graph-Based Entity Resolution for Complex Entities
    Nishadi Kirielle, Peter Christen, and Thilina Ranbaduge.
    Accepted by the ACM Transactions on Knowledge Discovery from Data (TKDD), May 2022.
    Accepted paper (pdf, 484 KB)

  8. Accurate and Efficient Privacy-Preserving String Matching
    Sirintra Vaiwsri, Thilina Ranbaduge, and Peter Christen.
    In the International Journal of Data Science and Analytics (Springer), March 2022.
    Camera-ready paper (pdf, 1.4 MB)

  9. TransER: Homogeneous Transfer Learning for Entity Resolution
    Nishadi Kirielle, Peter Christen, and Thilina Ranbaduge.
    Accepted by the International Conference on Extending Database Technology (EDBT), March 2022.
    Camera-ready paper (pdf, 910 KB)

  10. Unsupervised Graph-based Entity Resolution for Accurate and Efficient Family Pedigree Search
    Nishadi Kirielle, Charini Nanayakkara, Peter Christen, Chris Dibben, Lee Williamson, Eilidh Garrett, and Claire Manson.
    Accepted by the International Conference on Extending Database Technology (EDBT), March 2022.
    Camera-ready paper (pdf, 864 KB)

  11. Accurate Privacy-preserving Record Linkage for Databases with Missing Values
    Sirintra Vaiwsri, Thilina Ranbaduge, Peter Christen, and Rainer Schnell.
    In Information Systems (Elsevier), January 2022.
    Camera-ready paper (pdf, 668 KB)

2021:
  1. Unsupervised Anomaly Detection in Knowledge Graphs.
    Asara Senaratne, Pouya G. Omran, Graham Williams, and Peter Christen.
    Proceedings of the International Joint Conference on Knowledge Graphs (IJCKG'21), December 2021.
    Camera-ready paper (pdf, 693 KB)

  2. A Critique and Attack on: Blockchain-based Privacy-preserving Record Linkage
    Peter Christen, Rainer Schnell, Thilina Ranbaduge, and Anushka Vidanage.
    In Information Systems (Elsevier), October 2021.

  3. Data Science for Society: Challenges, Developments and Applications
    Pia Hardelid, Peter Christen, Elizabeth Williamson, Katie Harron, and Bianca L De Stavola.
    Journal of the Royal Statistical Society: Series A (Statistics in Society), October 2021.

  4. Active Learning based Similarity Filtering for Efficient and Effective Record Linkage
    Charini Nanayakkara, Peter Christen, and Thilina Ranbaduge.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'21), Delhi, May 2021.
    Camera-ready paper (pdf, 374 KB)

  5. Large Scale Record Linkage in the Presence of Missing Data
    Thilina Ranbaduge, Peter Christen, and Rainer Schnell.
    Article available from arXiv.org, April 2021.

  6. Accurate and Efficient Suffix Tree Based Privacy-Preserving String Matching
    Sirintra Vaiwsri, Thilina Ranbaduge, Peter Christen, and Kee Siong Ng.
    Article available from arXiv.org, April 2021.

  7. F*: An Interpretable Transformation of the F-measure
    David J. Hand, Peter Christen, and Nishadi Kirielle.
    In the journal Machine Learning, March 2021.
    Article available online from SpringerLink.

2020:
  1. Linking Sensitive Data - Methods and Techniques for Practical Privacy-Preserving Information Sharing
    Peter Christen, Thilina Ranbaduge, and Rainer Schnell.
    Springer, November 2020.

  2. Estimating Maternal Mortality Rates during the 1918 Flu using Birth to Death Linkage
    Peter Christen, Eilidh Garrett, Beata Nowok, Alice Reid, Lee Williamson, and Chris Dibben.
    International Population Data Linkage Conference (IPDLN), Adelaide (virtual), November 2020.

  3. Evaluating Binary Encoding Techniques in the Presence of Missing Values in Privacy-Preserving Record Linkage
    Thilina Ranbaduge and Peter Christen.
    International Population Data Linkage Conference (IPDLN), Adelaide (virtual), November 2020.

  4. Linking Sensitive Data
    Peter Christen, Thilina Ranbaduge, and Rainer Schnell.
    International Population Data Linkage Conference (IPDLN), Adelaide (virtual), November 2020.

  5. A Graph Matching Attack on Privacy-Preserving Record Linkage
    Anushka Vidanage, Peter Christen, Thilina Ranbaduge, and Rainer Schnell.
    ACM International Conference on Information and Knowledge Management (CIKM 2020), Galway (virtual), October 2020.
    Camera-ready paper (pdf, 1.5 MB)

  6. An Anonymiser Tool for Sensitive Graph Data
    Charini Nanayakkara, Peter Christen, and Thilina Ranbaduge.
    Workshop on EntitY Retrieval and lEarning (EYRE 2020), held at the ACM International Conference on Information and Knowledge Management (CIKM 2020), Galway (virtual), October 2020.

  7. Quality assessment in data linkage
    James Doidge, Peter Christen, and Katie Harron.
    In Guidance: Joined up data in government: the future of data linking methods, UK Office for National Statistics, August 2020.

  8. A Privacy Attack on Multiple Dynamic Match-key based Privacy-Preserving Record Linkage
    Anushka Vidanage, Thilina Ranbaduge, Peter Christen, and Sean Randall.
    In the International Journal of Population Data Science (IJPDS), vol 5, number 1, August 2020.

  9. Secure and Accurate Two-step Hash Encoding for Privacy-Preserving Record Linkage
    Thilina Ranbaduge, Peter Christen, and Rainer Schnell.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'20), Singapore, May 2020.
    Camera-ready paper (pdf, 1.1 MB)

  10. Secure Multi-party Summation Protocols: Are They Secure Enough Under Collusion?
    Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen.
    Transactions on Data Privacy, April 2020.
    Paper (pdf, 528 KB)

  11. Incremental Clustering Techniques for Multi-Party Privacy-Preserving Record Linkage
    Dinusha Vatsalan, Peter Christen, and Erhard Rahm.
    In the journal Data and Knowledge Engineering, March 2020.
    Camera-ready paper (pdf, 510 KB)

2019:
  1. Transforming Pairwise Duplicates to Entity Clusters for High-quality Duplicate Detection
    Uwe Draisbach, Peter Christen, and Felix Naumann.
    In the ACM Journal of Data and Information Quality (JDIQ), vol. 12, issue 1, 2019.
    Article available online from the ACM Digital Library.

  2. Evaluation Measure for Group-Based Record Linkage
    Charini Nanayakkara, Peter Christen, Thilina Ranbaduge, and Eilidh Garrett.
    In the International Journal of Population Data Science (IJPDS), vol 4, number 1, November 2019.

  3. Outlier Detection Based Accurate Geocoding of Historical Addresses
    Nishadi Kirielle, Peter Christen, and Thilina Ranbaduge.
    Proceedings of the Australasian Data Mining Conference (AusDM), Adelaide, December 2019.
    Camera-ready paper (12 pages, pdf, 600 KB)

  4. Data Linkage: The Big Picture
    Peter Christen.
    Invited Diving into Data article in the Harvard Data Science Review, issue 1.2, November 2019.

  5. Informativeness-Based Active Learning for Entity Resolution
    Victor Christen, Peter Christen, and Erhard Rahm.
    International workshop on Data Integration and Applications, held at the ECML/PKDD Conference, Würzburg, Germany, September 2019.
    Article available online (pdf, 3.1 MB)

  6. A Scalable Privacy-Preserving Framework for Temporal Record Linkage
    Thilina Ranbaduge and Peter Christen.
    In the journal Knowledge and Information Systems, June 2019.
    Camera-ready paper (pdf, 909 KB)

  7. Robust Temporal Graph Clustering for Group Record Linkage
    Charini Nanayakkara, Peter Christen, and Thilina Ranbaduge.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'19), Macau, April 2019.
    Camera-ready paper (pdf, 429 KB)

  8. Efficient Pattern Mining Based Cryptanalysis for Privacy-Preserving Record Linkage
    Anushka Vidanage, Thilina Ranbaduge, Peter Christen, and Rainer Schnell.
    Proceedings of the IEEE International Conference on Data Engineering (ICDE'19), Macau, April 2019.
    Camera-ready paper (pdf, 337 KB)

  9. Linking Scottish Vital Event Records using Family Groups
    Özgür Akgün, Alan Dearle, Graham Kirby, Eilidh Garrett, Tom Dalton, Peter Christen, Chris Dibben, and Lee Williamson.
    In Historical Methods: A Journal of Quantitative and Interdisciplinary History, 2019.

2018:
  1. Reference Values Based Hardening for Bloom Filters Based Privacy-Preserving Record Linkage
    Sirintra Vaiwsri, Thilina Ranbaduge, and Peter Christen.
    Proceedings of the Australasian Data Mining Conference (AusDM), Bathurst, November 2018.
    Camera-ready paper (12 pages, pdf, 290 KB)

  2. Privacy-Preserving Temporal Record Linkage
    Thilina Ranbaduge and Peter Christen.
    Proceedings of the IEEE International Conference on Data Mining (ICDM'18), Singapore, November 2018.
    Camera-ready paper (pdf, 591 KB)

  3. Towards a `Smart' Cost-Benefit Tool: Using Machine Learning to Predict the Costs of Criminal Justice Policy Interventions
    Matthew Manning, Gabriel Wong, Timothy Graham, Thilina Ranbaduge, Peter Christen, Kerry Taylor, Richard Wortley, Toni Makkai, and Pierre Skorich.
    In Crime Science, October 2018. Preprint available at Springer Link

  4. Precise and Fast Cryptanalysis for Bloom Filter Based Privacy-Preserving Record Linkage
    Peter Christen, Thilina Ranbaduge, Dinusha Vatsalan, and Rainer Schnell.
    In the IEEE Transactions on Knowledge and Data Engineering, October 2018.
    Final submitted paper (14 pages, pdf, 562 KB)

  5. Evaluating Hardening Techniques Against Cryptanalysis Attacks on Bloom Filter
    Thilina Ranbaduge, Anushka Vidanage, Sirintra Vaiwsri, Rainer Schnell, and Peter Christen.
    In the International Journal of Population Data Science (IJPDS), vol 3, number 4, August 2018.

  6. Temporal Graph-Based Clustering for Historical Record Linkage
    Charini Nanayakkara, Peter Christen, and Thilina Ranbaduge.
    International workshop on Mining and Learning from Graphs (MLG 2018), held at ACM SIGKDD 2018, London, August 2018.
    Submitted paper is available online from arXiv.org

  7. Developing a Temporal Bibliographic Data Set for Entity Resolution
    Yichen Hu, Qing Wang, and Peter Christen.
    Workshop BigScholar 2018, held at ACM SIGKDD 2018, London, August 2018.
    Submitted paper is available online from arXiv.org

  8. Distributed Privacy-Preserving Record Linkage Using Pivot-Based Filter Techniques
    Marcel Gladbach, Ziad Sehili, Thomas Kudrass, Peter Christen, and Erhard Rahm.
    In IEEE 34th International Conference on Data Engineering Workshops, Paris, July 2018.

  9. Pattern-Mining based Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage
    Peter Christen, Anushka Vidanage, Thilina Ranbaduge, and Rainer Schnell.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'18), Melbourne, Australia, June 2018.
    Camera-ready paper (pdf, 462 KB)

  10. A Scalable and Efficient Subgroup Blocking Scheme for Multidatabase Record Linkage
    Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'18), Melbourne, Australia, June 2018.
    Camera-ready paper (pdf, 695 KB)

  11. Using Metric Space Indexing for Complete and Efficient Record Linkage
    Özgür Akgün, Alan Dearle, Graham Kirby, and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'18), Melbourne, Australia, June 2018.

  12. A Decision Tree Approach to Predicting Recidivism in Domestic Violence (BDASC best paper award)
    Senuri Wijenayake, Timothy Graham, and Peter Christen.
    Proceedings of the Big Data Analytics for Social Computing (BDASC) workshop, held at the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'18), Melbourne, Australia, June 2018.
    Submitted paper is available online from arXiv.org

  13. DLforum - A Multidisciplinary Online Discussion Forum for Data Linkage Researchers and Practitioners
    Peter Christen, Thilina Ranbaduge, and Dinusha Vatsalan.
    In the International Journal of Population Data Science (IJPDS), vol 3, number 1, February 2018.
    See also: https://dmm.anu.edu.au/DLforum/

2017:
  1. Scalable Entity Resolution Using Probabilistic Signatures on Parallel Databases
    Yuhang Zhang, Kee Siong Ng, Michael Walker, Pauline Chou, Tania Churchill, and Peter Christen.
    arXiv.org, December 2017.
    Paper (pdf, 536 KB)

  2. Efficient Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage
    Peter Christen, Rainer Schnell, Dinusha Vatsalan, and Thilina Ranbaduge.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), Jeju Island, South Korea, May 2017.
    Submitted paper is available online from INI DLA Preprints: Paper (pdf, 588 KB)

  3. Improving Temporal Record Linkage using Regression Classification
    Yichen Hu, Qing Wang, Dinusha Vatsalan, and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), Jeju Island, South Korea, May 2017.
    Camera-ready paper (pdf, 360 KB)

  4. Advanced Methods for Linking Complex Historical Birth, Death, Marriage and Census Data
    Peter Christen.
    In the International Journal of Population Data Science (IJPDS), issue 1, vol 1, April 2017.

  5. Evaluation of Advanced Techniques for Multi-Party Privacy-Preserving Record Linkage on Real-World Health Databases
    Thilina Ranbaduge, Dinusha Vatsalan, Sean Randall, and Peter Christen.
    In the International Journal of Population Data Science (IJPDS), issue 1, vol 1, April 2017.

  6. A Note on Using the F-measure for Evaluating Record Linkage Algorithms
    David Hand and Peter Christen.
    In the journal Statistics and Computing, Online first, April 2017.
    Article available online from SpringerLink.

  7. Data Scrubbing
    Peter Christen.
    Chapter in the Encyclopedia of Database Systems, Springer, April 2017.
    Article available online from SpringerLink.

  8. Temporal Group Linkage and Evolution Analysis for Census Data
    Victor Christen, Anika Gross, Jeffrey Fisher, Qing Wang, Peter Christen, and Erhard Rahm.
    Proceedings of the International Conference on Extending Database Technology (EDBT'17), Venice, March 2017.

  9. Privacy-Preserving Record Linkage for Big Data: Current Approaches and Research Challenges
    Dinusha Vatsalan, Ziad Sehili, Peter Christen, and Erhard Rahm.
    Handbook of Big Data Technologies, Springer, 2017.
    Preprint (pdf, 531 KB)

  10. Scalable Multi-Database Privacy-Preserving Record Linkage using Counting Bloom Filters
    Dinusha Vatsalan, Peter Christen, and Erhard Rahm.
    arXiv.org, January 2017.
    Paper (pdf, 422 KB)

2016:
  1. Multi-Party Privacy-Preserving Record Linkage using Bloom Filters
    Dinusha Vatsalan and Peter Christen.
    arXiv.org, December 2016.
    Paper (pdf, 348 KB)

  2. Scalable Block Scheduling for Efficient Multi-Database Record Linkage
    Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen.
    Proceedings of the IEEE International Conference on Data Mining (ICDM'16), Barcelona, December 2016.

  3. Scalable Privacy-Preserving Linking of Multiple Databases Using Counting Bloom filter
    Dinusha Vatsalan, Peter Christen, and Erhard Rahm.
    Proceedings of the workshop Privacy and Discrimination in Data Mining (PDDM), held at the IEEE International Conference on Data Mining (ICDM'16), Barcelona, December 2016.
    Camera-ready paper (pdf, 555 KB)

  4. Application of Advanced Record Linkage Techniques for Complex Population Reconstruction
    Peter Christen.
    arXiv.org, December 2016.
    Paper (pdf, 1.4 MB)

  5. Regression Classifier for Improved Temporal Record Linkage
    Yichen Hu, Qing Wang, Dinusha Vatsalan, and Peter Christen.
    Proceedings of the Fourteenth Australasian Data Mining Conference (AusDM'16), Canberra, December 2016.
    Paper (pdf, 636 KB)

  6. A Note on using the F-measure for Evaluating Data Linkage Algorithms
    David Hand and Peter Christen.
    Preprint, Isaac Newton Institute for Mathematical Sciences (INI), Cambridge, November 2016.
    Article available online from INI DLA Preprints: Paper (pdf, 684 KB)

  7. Record Linkage
    Peter Christen and William Winkler.
    Chapter in the Encyclopedia of Machine Learning and Data Mining.
    Claude Sammut and Geoff Webb (editors), Springer, June 2015.
    Article available online from SpringerLink.

  8. Hashing-based Distributed Multi-party Blocking for Privacy-preserving Record Linkage
    Thilina Ranbaduge, Dinusha Vatsalan, Peter Christen, and Vassilios Verykios.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'16), Auckland, New Zealand, April 2016.
    Article available online from SpringerLink.

  9. A Clustering-Based Framework for Incrementally Repairing Entity Resolution
    Qing Wang, Jingyi Gao, and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'16), Auckland, New Zealand, April 2016.
    Article available online from SpringerLink.

  10. Active Learning Based Entity Resolution Using Markov Logic
    Jeffrey Fisher, Peter Christen, and Qing Wang.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'16), Auckland, New Zealand, April 2016.
    Article available online from SpringerLink.

  11. Efficient Record Linkage Using a Compact Hamming Space
    Dimitrios Karapiperis, Dinusha Vatsalan, Vassilios Verykios, and Peter Christen.
    Proceedings of the International Conference on Extending Database Technology (EDBT'16), Bordeaux, France, March 2016.
    Article available online from OpenProceedings.

  12. Automatic Discovery of Abnormal Values in Large Textual Databases
    Peter Christen, Ross Gayler, Khoi-Nguyen Tran, Jeffrey Fisher and Dinusha Vatsalan.
    In the ACM Journal of Data and Information Quality (JDIQ), vol. 7, issues 1-2, 2016.
    Article available online from the ACM Digital Library.

  13. Privacy-Preserving Matching of Similar Patients
    Dinusha Vatsalan and Peter Christen.
    In Journal of Biomedical Informatics (JBI), vol. 59, pages 285-298, 2016.
    Article available online from Science Direct.

  14. Macro-Level Information Transfer in Social Media: Reflections of Crowd Phenomena
    Minkyoung Kim, David Newth, and Peter Christen.
    In Elsevier Neurocomputing, vol. 172, pages 84-99, 2016.
    Article available online from Science Direct.

2015:
  1. Efficient Entity Resolution with Adaptive and Interactive Training Data Selection
    Peter Christen, Dinusha Vatsalan, and Qing Wang.
    Proceedings of the IEEE International Conference on Data Mining (ICDM'15), Atlantic City, November 2015.
    Article available online from IEEE Xplore.

  2. MERLIN - A Tool for Multi-party Privacy-preserving Record Linkage
    Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen.
    Demo paper. Proceedings of the IEEE International Conference on Data Mining (ICDM'15), Atlantic City, November 2015.
    Article available online from IEEE Xplore.

  3. Context-aware Approximate String Matching for Large-scale Real-time Entity Resolution
    Peter Christen and Ross Gayler.
    Proceedings of the workshop Data Integration and Applications (DINA), held at the IEEE International Conference on Data Mining (ICDM'15), Atlantic City, November 2015.
    Article available online from IEEE Xplore.
    Camera-ready paper (pdf, 383 KB)

  4. Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution
    Banda Ramadan, Peter Christen, Huizhi Liang, and Ross Gayler.
    In ACM Journal of Data and Information Quality (JDIQ), vol. 6, issue 4, 2015.
    Article available online from the ACM Digital Library.

  5. Population Reconstruction
    Gerrit Bloothooft, Peter Christen, Kees Mandemakers, and Marijn Schraagen (editors).
    Springer, August 2015.

  6. Advanced Record Linkage Methods and Privacy Aspects for Population Reconstruction - A Survey and Case Studies.
    Peter Christen, Dinusha Vatsalan, and Zhichun Fu.
    Invited book chapter in Population Reconstruction.
    Gerrit Bloothooft, Peter Christen, Kees Mandemakers, and Marijn Schraagen (editors).
    Springer, August 2015.

  7. Clustering-Based Framework to Control Block Sizes for Entity Resolution
    Jeffrey Fisher, Peter Christen, Qing Wang, and Erhard Rahm.
    Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (KDD'15), Sydney, August 2015.
    Article available online from the ACM Digital Library.

  8. Efficient Interactive Training Selection for Large-scale Entity Resolution
    Qing Wang, Dinusha Vatsalan, and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'15), Ho Chi Minh City, Vietnam, May 2015.
    Article available online from Springer.

  9. Clustering-based Scalable Indexing for Multi-party Privacy-preserving Record Linkage
    Thilina Ranbaduge, Peter Christen, and Dinusha Vatsalan.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'15), Ho Chi Minh City, Vietnam, May 2015.
    Article available online from Springer.
    Camera-ready paper (pdf, 591 KB)

  10. Unsupervised Blocking Key Selection for Real-Time Entity Resolution
    Banda Ramadan and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'15), Ho Chi Minh City, Vietnam, May 2015.
    Article available online from Springer.
    Camera-ready paper (pdf, 393 KB)

  11. Context-Aware Detection of Sneaky Vandalism on Wikipedia across Multiple Languages (Best Student Paper Award)
    Khoi-Nguyen Tran and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'15), Ho Chi Minh City, Vietnam, May 2015.
    Article available online from Springer.

  12. Large-Scale Multi-Party Counting Set Intersection Using a Space Efficient Global Synopsis
    Dimitrios Karapiperis, Dinusha Vatsalan, Vassilios Verykios, and Peter Christen.
    Proceedings of the International Conference on Database Systems for Advanced Applications (DASFAA), Hanoi, Vietnam, April 2015.
    Article available online from Springer.
    Camera-ready paper (pdf, 386 KB)

  13. Cross Language Learning from Bots and Users to detect Vandalism on Wikipedia
    Khoi-Nguyen Tran and Peter Christen.
    In IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 27, no 3, March 2015.
    Article available online from IEEE Xplore.

2014:
  1. Uncovering Diffusion in Academic Publications using Model-Driven and Model-Free Approaches
    Minkyoung Kim, David Newth, and Peter Christen.
    Proceedings of the IEEE Conference on Social Computing and Networking (SocialCom 2014), Sydney, December 2014.

  2. Tree Based Scalable Indexing for Multi-Party Privacy-Preserving Record Linkage
    Thilina Ranbaduge, Peter Christen, and Dinusha Vatsalan.
    Proceedings of the Twelfth Australasian Data Mining Conference (AusDM'14), Brisbane, November 2014.
    Paper (pdf, 754 KB)

  3. Privacy Aspects in Big Data Integration: Challenges and Opportunities
    Peter Christen.
    Invited keynote at the 1st International Workshop on Privacy and Security of Big Data (PSBD 2014),
    held at the ACM International Conference on Information and Knowledge Management (CIKM 2014), Shanghai, November 2014.
    Article available online from the ACM Digital Library

  4. Scalable Privacy-Preserving Record Linkage for Multiple Databases
    Dinusha Vatsalan and Peter Christen.
    Poster paper at the ACM International Conference on Information and Knowledge Management (CIKM 2014), Shanghai, November 2014.
    Article available online from the ACM Digital Library
    Camera-ready paper (pdf, 240 KB)

  5. Forest-Based Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution
    Banda Ramadan and Peter Christen.
    Poster paper at the ACM International Conference on Information and Knowledge Management (CIKM 2014), Shanghai, November 2014.
    Article available online from the ACM Digital Library

  6. Automatic Record Linkage of Individuals and Households in Historical Census Data
    Zhichun Fu, Mac Boot, Peter Christen, and Jun Zhou.
    In International Journal of Humanities and Arts Computing, October 2014.

  7. Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution
    Banda Ramadan, Peter Christen and Huizhi Liang.
    Proceedings of the Australasian Database Conference (ADC'14), Brisbane, July 2014.
    Article available online from Springer Link.

  8. An Evaluation Framework for Privacy-Preserving Record Linkage
    Dinusha Vatsalan, Peter Christen, Christine M. O'Keefe, and Vassilios Verykios.
    Journal of Privacy and Confidentiality (CMU), 2014.
    Article available online from the Journal of Privacy and Confidentiality Web site.

  9. Challenges for Privacy Preservation in Data Integration
    Peter Christen, Dinusha Vatsalan, and Vassilios Verykios.
    In ACM Journal of Data and Information Quality (JDIQ), vol. 5, issue 1-2, 2014.
    Article available online from the ACM Digital Library.
    Camera-ready paper (pdf, 41 KB)

  10. Preparation of a real temporal voter data set for record linkage and duplicate detection research
    Peter Christen.
    Research School of Computer Science, The Australian National University.
    Technical Report, June 2014.
    Paper (pdf, 190 KB)

  11. A Graph Matching Method for Historical Census Household Linkage
    Zhichun Fu, Peter Christen, and Jun Zhou.
    Proceedings of the Eighteenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'14), Tainan, Taiwan, May 2014.
    Article available online from Springer Link.

  12. Noise-Tolerant Approximate Blocking for Dynamic Real-Time Entity Resolution
    Huizhi Liang, Yanzhe Wang, Peter Christen, and Ross Gayler.
    Proceedings of the Eighteenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'14), Tainan, Taiwan, May 2014.
    Article available online from Springer Link.

  13. Trends of news diffusion in social media based on crowd phenomena
    Minkyoung Kim, David Newth, and Peter Christen.
    Workshop on Social News on the Web (SNOW), held at the World Wide Web conference (WWW'14), Seoul, April 2014.
    Article available online from the ACM Digital Library.

  14. Macro-level information transfer across social networks
    Minkyoung Kim, David Newth, and Peter Christen.
    World Wide Web conference (WWW'14), Seoul, April 2014.
    Article available online from the ACM Digital Library.

  15. Sensor discovery and configuration framework for the Internet of Things paradigm.
    Charith Perera, Prem Prakash Jayaraman, Arkady Zaslavsky, Dimitrios Georgakopoulos, and Peter Christen.
    IEEE World Forum on Internet of Things (WF-IoT), Seoul, March 2014.
    Available online

  16. Advanced record linkage methods and privacy aspects for population reconstruction
    Peter Christen
    Keynote paper at the workshop Population Reconstruction, Amsterdam, February 2014.
    Article available online from the workshop programme as PDF document (105 KB).

  17. Context-aware Dynamic Discovery and Configuration of 'Things' in Smart Environments
    Charith Perera, Prem Jayaraman, Arkady Zaslavsky, Peter Christen, and Dimitrios Georgakopoulos.
    Chapter in the book Big Data and Internet of Things: A Roadmap for Smart Environments, Studies in Computational Intelligence.
    Springer Berlin Heidelberg, 2014.

  18. Sensor Search Techniques for Sensing as a Service Architecture for The Internet of Things
    Charith Perera, Arkady Zaslavsky, Chi Harold Liu, Michael Compton, Peter Christen, and Dimitrios Georgakopoulos.
    In IEEE Sensors Journal, 2014.

  19. Sensing as a Service Model for Smart Cities Supported by Internet of Things
    Charith Perera, Arkady Zaslavsky, Peter Christen, and Dimitrios Georgakopoulos.
    In Transactions on Emerging Telecommunications Technologies, 2014.

  20. MOSDEN: An Internet of Things Middleware for Resource Constrained Mobile Devices
    Charith Perera, Prem Prakash Jayaraman, Arkady Zaslavsky, Peter Christen, and Dimitrios Georgakopoulos.
    Proceedings of the 47th Hawaii International Conference on System Sciences (HICSS), Kona, Hawaii, January 2014.

2013:
  1. Data Cleaning and Matching of Institutions in Bibliographic Databases
    Jeffrey Fisher, Qing Wang, Paul Wong and Peter Christen.
    Proceedings of the Eleventh Australasian Data Mining Conference (AusDM'13), Canberra, November 2013.
    Paper (pdf, 171 KB)

  2. Efficient two-party private blocking based on sorted nearest neighborhood clustering
    Dinusha Vatsalan, Peter Christen, and Vassilios Verykios.
    Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM 2013), San Francisco, October 2013.
    Article available online from the ACM Digital Library

  3. GeCo: an online personal data generator and corruptor
    Khoi-Nguyen Tran, Dinusha Vatsalan, and Peter Christen.
    Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM 2013), San Francisco, October 2013.
    Article available online from the ACM Digital Library

  4. Flexible and extensible generation and corruption of personal data
    Peter Christen and Dinusha Vatsalan.
    Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM 2013), San Francisco, October 2013.
    Article available online from the ACM Digital Library
    Camera-ready paper (pdf, 100 KB)

  5. Modeling dynamics of meta-populations with a probabilistic approach: global diffusion in social media
    Minkyoung Kim, David Newth, and Peter Christen.
    Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM 2013), San Francisco, October 2013.
    Article available online from the ACM Digital Library

  6. Identifying multilingual Wikipedia articles based on cross language similarity and activity
    Khoi-Nguyen Tran and Peter Christen.
    Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM 2013), San Francisco, October 2013.
    Article available online from the ACM Digital Library

  7. Social affinity filtering: recommendation through fine-grained analysis of user interactions and activities
    Suvash Sedhain, Scott Sanner, Lexing Xie, Riley Kidd, Khoi-Nguyen Tran, and Peter Christen.
    Proceedings of the Conference on Online Social Networks (COSN 2013), Boston, October 2013.
    Article available online from the ACM Digital Library

  8. Semantic-driven Configuration of Internet of Things Middleware
    Charith Perera, Arkady Zaslavsky, Michael Compton, Peter Christen, and Dimitrios Georgakopoulos.
    Proceedings of the International Conference on Semantics, Knowledge and Grids (SKG 2013), Beijing, October 2013.
    Article available online from arXiv.org

  9. Modeling Dynamics of Diffusion Across Heterogeneous Social Networks: News Diffusion in Social Media
    Minkyoung Kim, David Newth, and Peter Christen.
    In Entropy, volume 15, number 10, October 2013, Pages 4215-4242.
    Article available online at http://dx.doi.org/10.3390/e15104215

  10. Context Aware Sensor Configuration Model for Internet of Things
    Charith Perera, Arkady Zaslavsky, Michael Compton, Peter Christen, and Dimitrios Georgakopoulos.
    Proceedings of the International Semantic Web Conference (ISWC), Posters and Demos, Sydney, Australia, October 2013.
    Article available online at arXiv.org

  11. Privacy-preserving record linkage
    Vassilios Verykios and Peter Christen.
    In Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, volume 3, issue 5, pages 321-332, September/October 2013.
    Article available online at Wiley Online Library

  12. A Taxonomy of Privacy-Preserving Record Linkage Techniques
    Dinusha Vatsalan, Peter Christen, and Vassilios Verykios.
    In Information Systems (Elsevier), volume 38, issue 6, pages 946-969, September 2013.
    Article available online at http://dx.doi.org/10.1016/j.is.2012.11.005.

  13. Modeling direct and indirect influence across heterogeneous social networks
    Minkyoung Kim, David Newth, and Peter Christen.
    Proceedings of the Workshop on Social Network Mining and Analysis (SNAKDD 2013), held at ACM SIGKDD 2013, Chicago, August 2013.
    Article available online from the ACM Digital Library

  14. Context-aware Sensor Search, Selection and Ranking Model for Internet of Things Middleware
    Charith Perera, Arkady Zaslavsky, Peter Christen, Michael Compton, and Dimitrios Georgakopoulos.
    Proceedings of the International Conference on Mobile Data Management (MDM 2013), Milan, Italy, June 2013.
    Article available online from IEEE Xplore

  15. Dynamic Configuration of Sensors Using Mobile Sensor Hub in Internet of Things Paradigm
    Charith Perera, Prem Jayaraman, Arkady Zaslavsky, Peter Christen, and Dimitrios Georgakopoulos.
    Proceedings of the International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP 2013), Melbourne, April 2013.
    Article available online from IEEE Xplore

  16. Adaptive Temporal Entity Resolution on Dynamic Databases
    Peter Christen and Ross Gayler.
    Proceedings of the Seventeenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'13), Gold Coast, Australia, April 2013.
    Article available online from Springer Link.

  17. Sorted Nearest Neighborhood Clustering for Efficient Private Blocking
    Dinusha Vatsalan and Peter Christen.
    Proceedings of the Seventeenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'13), Gold Coast, Australia, April 2013.
    Article available online from Springer Link.

  18. Cross Language Prediction of Vandalism on Wikipedia using Article Views and Revisions
    Khoi-Nguyen Tran and Peter Christen.
    Proceedings of the Seventeenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'13), Gold Coast, Australia, April 2013.
    Article available online from Springer Link.

  19. Dynamic Similarity-Aware Inverted Indexing for Real-Time Entity Resolution
    Banda Ramadan, Peter Christen, Huizhi Liang, Ross Gayler, and David Hawking.
    In proceedings of the International Workshop on Data Mining Applications in Industry and Government (DMApps 2013),
    held at the Seventeenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'13), Gold Coast, Australia, April 2013.
    Paper (pdf, 272 KB)

  20. Predicting High Impact Academic Papers using Citation Network Features
    Daniel McNamara, Paul Wong, Peter Christen and Kee Siong Ng.
    In proceedings of the International Workshop on Data Mining Applications in Industry and Government (DMApps 2013),
    held at the Seventeenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'13), Gold Coast, Australia, April 2013.
    Paper (pdf, 328 KB)

  21. Context Aware Computing for The Internet of Things: A Survey
    Charith Perera, Arkady Zaslavsky, Peter Christen, and Dimitrios Georgakopoulos.
    In IEEE Communications Surveys and Tutorials, 2013.
    Article available online from IEEE Xplore

2012:
  1. A Bag Reconstruction Method for Multiple Instance Classification and Group Record Linkage
    Zhichun Fu, Jun Zhou, Furong Peng, and Peter Christen.
    Proceedings of the Eighth International Conference on Advanced Data Mining and Applications (ADMA'12), Nanjing, China, December 2012.
    Article available online from Springer Link.

  2. An Iterative Two-Party Protocol for Scalable Privacy-Preserving Record Linkage
    Dinusha Vatsalan and Peter Christen.
    Proceedings of the Tenth Australasian Data Mining Conference (AusDM'12), Sydney, December 2012.
    Paper (pdf, 279 KB)

  3. CA4IOT: Context Awareness for Internet of Things
    Charith Perera, Arkady Zaslavsky, Peter Christen, and Dimitrios Georgakopoulos.
    Proceedings of the IEEE International Conference on Green Computing and Communications, Conference on Internet of Things,
    and Conference on Cyber, Physical and Social Computing
    (GreenCom/iThings/CPSCom'12), Besancon, France, November 2012.

  4. Time-aware Topic Recommendation Based on Micro-blogs
    Huizhi Liang, Yue Xu, Dian Tjondronegoro, and Peter Christen.
    Proceedings of the ACM Conference on Information and Knowledge Management (CIKM'12), Hawaii, October 2012.

  5. Capturing Sensor Data from Mobile Phones using Global Sensor Network Middleware
    Charith Perera, Arkady Zaslavsky, Peter Christen, Ali Salehi, and Dimitrios Georgakopoulos.
    Proceedings of the IEEE International Workshop on Internet-of-Things Communications and Networking (IoT-CN),
    held at the 23rd IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Sydney, September 2012.

  6. A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication
    Peter Christen
    In IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 24, no. 9, September 2012.
    Article available online from Computer.org digital library.

  7. Data Matching - Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection
    Peter Christen.
    Springer, Data-Centric Systems and Applications, August 2012.
    Preface, table of contents, and references are available for download.

  8. Event Diffusion Patterns in Social Media
    Minkyoung Kim, Lexing Xie, and Peter Christen.
    International AAAI Conference on Weblogs and Social Media, Dublin, June 2012.
    Paper (pdf, 1.3 MB)

  9. Multiple Instance Learning for Group Record Linkage
    Zhichun Fu, Jun Zhou, Peter Christen and Mac Boot.
    Proceedings of the Sixteenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'12), Kuala Lumpur, May-June 2012.
    Article available online from Springer Link.

  10. Connecting Mobile Things to Global Sensor Network Middleware using System-generated Wrappers
    Charith Perera, Arkady Zaslavsky, Peter Christen, Ali Salehi, and Dimitrios Georgakopoulos.
    Proceedings of the ACM International Workshop on Data Engineering for Wireless and Mobile Access (MobiDE),
    ACM Special Interest Group on Management of Data and Principles of Database Systems (SIGMOD/PODS), Scottsdale, Arizona, May 2012.
    Paper available online from ACM Digital Library.

  11. New Objective Functions for Social Collaborative Filtering.
    Joseph Noel, Scott Sanner, Khoi-Nguyen Tran, Peter Christen, Lexing Xie, Edwin Bonilla, Ehsan Abbasnejad, and Nicolas Della Penna.
    World Wide Web conference (WWW'12), Lyon, April 2012.
    Paper available online from WWW'12 Proceedings.

2011:
  1. Automatic Cleaning and Linking of Historical Census Data using Household Information
    Zhichun Fu, Peter Christen and Mac Boot.
    Proceedings of the Fifth International Workshop on Domain Driven Data Mining (DDDM'11), held at IEEE ICDM, Vancouver, December 2011.

  2. Proceedings of the Ninth Australasian Data Mining Conference (AusDM'11)
    Peter Vamplew, Andrew Stranieri, Kok-Leong Ong, Peter Christen and Paul Kennedy (editors).
    Proceedings of the Ninth Australasian Data Mining Conference, Ballarat, December 2011.
    Conferences in Research and Practice in Information Technology (CRPIT), vol. 121.

  3. An Efficient Two-Party Protocol for Approximate Matching in Private Record Linkage
    Dinusha Vatsalan, Peter Christen and Vassilios Verykios.
    Proceedings of the Ninth Australasian Data Mining Conference (AusDM'11), Ballarat, December 2011.
    Paper (pdf, 880 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 121.

  4. A Supervised Learning and Group Linking Method for Historical Census Household Linkage
    Zhichun Fu, Peter Christen and Mac Boot.
    Proceedings of the Ninth Australasian Data Mining Conference (AusDM'11), Ballarat, December 2011.
    Paper (pdf, 860 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 121.

  5. Fake Injection Strategies for Private Phonetic Matching
    Alexandros Karakasidis, Vassilios Verykios and Peter Christen.
    Proceedings of the International Workshop on Data Privacy Management (DPM2011), Leuven, Belgium, September 2011.

  6. Analysis of Cluster Migrations using Self-Organizing Maps
    Denny, Peter Christen and Graham Williams.
    Proceedings of the International Workshop on Behavior Informatics (BI2011),
    held at the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD2011), Shenzhen, China, May 2011.

  7. Robust Record Linkage Blocking using Suffix Arrays and Bloom Filters
    Timothy de Vries, Hui Ke, Sanjay Chawla and Peter Christen.
    In ACM Transactions on Knowledge Discovery from Data, vol. 5, no. 2, February 2011.
    Available online.

2010:
  1. Visualizing Temporal Cluster Changes using Relative Density Self-Organizing Maps
    Denny, Graham Williams and Peter Christen
    In Knowledge and Information Systems (Springer), vol. 25, no. 2, November 2010.
    Paper available online.

  2. New Frontiers in Applied Data Mining
    T. Theeramunkong, C. Nattee, P.J.L. Adeodato, N. Chawla, Peter Christen, P. Lenca, J. and G. Williams (editors).
    Revised Selected Papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) workshops, Bangkok, Thailand, April 2009.

2009:
  1. Data Mining and Analytics 2009
    Paul Kennedy, Kok-Leong Ong and Peter Christen (editors).
    Proceedings of the Eighth Australasian Data Mining Conference (AusDM 2009), Melbourne, December 2009.
    Conferences in Research and Practice in Information Technology (CRPIT), vol. 101.

  2. Robust Record Linkage Blocking using Suffix Arrays
    Timothy de Vries, Hui Ke, Sanjay Chawla and Peter Christen.
    Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), Hong Kong, November 2009.
    Paper available online.

  3. Similarity-Aware Indexing for Real-Time Entity Resolution
    Peter Christen, Ross Gayler and David Hawking.
    Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), Hong Kong, November 2009.
    Paper available online.
    The full paper (10 pages) is published as an ANU Computer Science technical report.
    Report (pdf, 273 KB)   Report (ps.gz, 285 KB)

  4. Development and User Experiences of an Open Source Data Cleaning, Deduplication and Record Linkage System
    Peter Christen
    In SIGKDD Explorations, Volume 11, Issue 1, July 2009.
    Available online: Paper (pdf, 778 KB)

  5. Accurate Synthetic Generation of Realistic Personal Information
    Peter Christen and Agus Pudjijono
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Bangkok, Thailand, April 2009.
    Paper available online.
    Submitted paper (12 pages, pdf, 645 KB)

  6. Geocode Matching and Privacy Preservation
    Peter Christen
    Invited Presentation at the PinKDD 2008 workshop held at the ACM SIGKDD 2008 conference, Las Vegas, August 2008.
    In Revised, Selected Papers, F. Bonchi, E. Ferrari, W. Jiang and B. Malin (editors).
    Springer Lecture Notes in Computer Science (LNCS), vol. 5456, 2009.
    Paper available online.

2008:
  1. Visualization of Temporal Changes in Cluster Structures using Self-Organizing Maps
    Denny, Graham Williams, and Peter Christen
    In proceedings of the IEEE International Conference on Data Mining (ICDM), Pisa, Italy, December 2008.
    Please contact Denny if you are interested in this paper.

  2. Data Mining and Analytics 2008
    John Roddick, Jiuyong Li, Peter Christen and Paul Kennedy (editors).
    Proceedings of the Seventh Australasian Data Mining Conference (AusDM 2008), Glenelg, Adelaide, November 2008.
    Conferences in Research and Practice in Information Technology (CRPIT), vol. 87.

  3. Towards Scalable Real-Time Entity Resolution using a Similarity-Aware Inverted Index Approach
    Peter Christen and Ross Gayler
    In proceedings of the Seventh Australasian Data Mining Conference (AusDM 2008), Glenelg, Adelaide, November 2008.
    Paper (pdf, 218 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 87.

  4. Automatic Record Linkage using Seeded Nearest Neighbour and Support Vector Machine Classification
    Peter Christen
    Proceedings of the ACM SIGKDD 2008 conference, Las Vegas, August 2008.
    Paper available online.

  5. Febrl - An Open Source Data Cleaning, Deduplication and Record Linkage System with a Graphical User Interface
    Peter Christen
    Proceedings of the demo session at the ACM SIGKDD 2008 conference, Las Vegas, August 2008.
    Paper available online.

  6. Automatic Training Example Selection for Scalable Unsupervised Record Linkage
    Peter Christen
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Osaka, Japan, May 2008.
    Paper available online.
    Submitted paper (12 pages, pdf, 146 KB)   Submitted paper (12 pages, ps.gz, 142 KB)

  7. Exploratory Hot Spot Profile Analysis using Interactive Visual Drill-Down Self-Organizing Maps
    Denny, Graham J. Williams and Peter Christen.
    Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Osaka, Japan, May 2008.
    Paper available online.

  8. Febrl - A Freely Available Record Linkage System with a Graphical User Interface
    Peter Christen
    Proceedings of the Australasian Workshop on Health Data and Knowledge Management (HDKM), Wollongong, January 2008.
    Paper (pdf, 748 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 80.

2007:
  1. Data Mining and Analytics 2007
    Peter Christen, Paul J. Kennedy, Jiuyong Li, Inna Kolyshkina and Graham J. Williams (editors).
    Proceedings of the Sixth Australasian Data Mining Conference (AusDM 2007), Gold Coast, Australia, December 2007.
    Conferences in Research and Practice in Information Technology (CRPIT), vol. 70.

  2. A Two-Step Classification Approach to Unsupervised Record Linkage
    Peter Christen
    In proceedings of the Sixth Australasian Data Mining Conference (AusDM 2007), Gold Coast, December 2007.
    Paper (pdf, 440 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 70.

  3. Exploratory Multilevel Hot Spot Analysis: Australian Taxation Office Case Study
    Denny, Graham J. Williams, and Peter Christen
    In proceedings of the Sixth Australasian Data Mining Conference (AusDM 2007), Gold Coast, December 2007.
    Paper (pdf, 759 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 70.

  4. Evaluation of a Graduate Level Data Mining Course with Industry Participants
    Peter Christen
    In proceedings of the Sixth Australasian Data Mining Conference (AusDM 2007), Gold Coast, December 2007.
    Paper (pdf, 436 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 70.

  5. Towards parameter-free blocking for scalable record linkage
    Peter Christen
    Technical Report TR-CS-07-03
    ANU Joint Computer Science Technical Report Series, August 2007.
    Report (pdf, 201 KB)   Report (ps.gz, 199 KB)

  6. Quality and Complexity Measures for Data Linkage and Deduplication
    Peter Christen and Karl Goiser
    Chapter in the book Quality Measures in Data Mining, vol. 43, Studies in Computational Intelligence.
    F. Guillet and H. Hamilton (eds), Springer, March 2007.
    Available online at SpringerLink.

2006:
  1. Privacy-Preserving Data Linkage and Geocoding: Current Approaches and Research Directions
    Peter Christen
    In proceedings of the Workshop on Privacy Aspects of Data Mining (PADM) held at the IEEE International Conference on Data Mining (ICDM), Hong Kong, December 2006.
    Final 5-page version: Paper (pdf, 53 KB)   Paper (ps.gz, 35 KB)
    Submitted 11-page version: Paper (pdf, 118 KB)   Paper (ps.gz, 74 KB)

  2. A Comparison of Personal Name Matching: Techniques and Practical Issues
    Peter Christen
    In proceedings of the Workshop on Mining Complex Data (MCD) held at the IEEE International Conference on Data Mining (ICDM), Hong Kong, December 2006.
    Final 5-page version: Paper (pdf, 57 KB)   Paper (ps.gz, 40 KB)
    Submitted 12-page version available as:
    Technical Report TR-CS-06-02
    ANU Joint Computer Science Technical Report Series, September 2006.
    Report (pdf, 248 KB)   Report (ps.gz, 236 KB)

  3. Dynamic Algorithm Selection Using Reinforcement Learning
    Warren Armstrong, Peter Christen, Eric McCreath and Alistair Rendell
    Proceedings of the Workshop on Integrating AI and Data Mining, Hobart, Australia, December 2006.
    Paper (pdf, 254 KB)

  4. Data Mining and Analytics 2006
    Peter Christen, Paul J. Kennedy, Jiuyong Li, Simeon J. Simoff and Graham J. Williams (editors).
    Proceedings of the Fifth Australasian Data Mining Conference (AusDM 2006), Sydney, November 2006.
    Conferences in Research and Practice in Information Technology (CRPIT), vol. 61.

  5. Towards Automated Record Linkage
    Karl Goiser and Peter Christen
    In proceedings of the Fifth Australasian Data Mining Conference (AusDM 2006), Sydney, November 2006.
    Paper (pdf, 513 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 61.

  6. Secure Health Data Linkage and Geocoding: Current Approaches and Research Directions
    Peter Christen and Tim Churches
    Proceedings of the National e-Health Privacy and Security Symposium (ehPASS), Brisbane, October 2006.
    Paper (pdf, 139 KB)   Paper (ps.gz, 127 KB)

  7. Automated Geocoding of Routinely Collected Health Data in New South Wales
    Richard Summerhayes, Paul Holder, John Beard, Peter Christen, Alan Willmore and Tim Churches
    The NSW Public Health Bulletin, volume 17, number 3-4, March-April 2006.
    Online version available here.

  8. A Probabilistic Geocoding System Utilising a Parcel Based Address File
    Peter Christen, Alan Willmore and Tim Churches
    In Advances in Data Mining: Theory, Methodology, Techniques, and Applications. Simeon Simoff and Graham Williams (editors). State-of-the-Art Lecture Notes in Artificial Intelligence, Volume 3755, Springer-Verlag, 2006.
    Available online at SpringerLink, LNCS 3755.
    Copyright for this publication is held by the Springer Verlag.

2005:
  1. Automated Probabilistic Address Standardisation and Verification
    Peter Christen and Daniel Belacic
    Proceedings of the Fourth Australasian Data Mining Conference (AusDM 2005), Sydney, December 2005.
    Paper (pdf, 146 KB)   Paper (ps.gz, 204 KB)

  2. Assessing Deduplication and Data Linkage Quality: What to Measure?
    Peter Christen and Karl Goiser
    Proceedings of the Fourth Australasian Data Mining Conference (AusDM 2005), Sydney, December 2005.
    Paper (pdf, 178 KB)   Paper (ps.gz, 163 KB)

  3. Probabilistic Data Generation for Deduplication and Data Linkage
    Peter Christen
    Proceedings of the Sixth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL'05), Brisbane, July 2005.
    Copyright for this publication is held by the Springer Verlag.
    Available online at SpringerLink, LNCS 3578.
    Paper (pdf, 124 KB)   Paper (ps.gz, 135 KB)

  4. Febrl - Freely extensible biomedical record linkage (Manual, release 0.3)
    Peter Christen and Tim Churches
    Available online from SourceForge.Net, April 2005.
    Manual (pdf, 960 KB)   Manual (pdf, 282 KB)

  5. A Probabilistic Deduplication, Record Linkage and Geocoding System
    Peter Christen and Tim Churches
    Proceedings of the ARC Health Data Mining workshop, University of South Australia, April 2005.
    Paper (pdf, 136 KB)   Paper (ps.gz, 134 KB)

2004:
  1. A Probabilistic Geocoding System based on a National Address File
    Peter Christen, Tim Churches and Alan Willmore
    Proceedings of the Australasian Data Mining Conference, Cairns, December 2004.
    Paper (pdf, 120 KB)   Paper (ps.gz, 128 KB)

  2. Some Methods for Blindfolded Record Linkage
    Tim Churches and Peter Christen
    Published online at BioMed Central Medical Informatics and Decision Making, June 2004.
    For abstract and downloadable PDF file see here.

  3. Febrl - A Parallel Open Source Data Linkage System
    Peter Christen, Tim Churches and Markus Hegland
    Proceedings of the 8th PAKDD'04 (Pacific-Asia Conference on Knowledge Discovery and Data Mining), Sydney, May 2004.
    Springer Lecture Notes in Artificial Intelligence, (3056), available online at SpringerLink.
    Copyright for this publication is held by the Springer Verlag.
    Paper (pdf, 202 KB)   Paper (ps.gz, 81 KB)

  4. Blind Data Linkage using n-gram Similarity Comparisons
    Tim Churches and Peter Christen
    Proceedings of the 8th PAKDD'04 (Pacific-Asia Conference on Knowledge Discovery and Data Mining), Sydney, May 2004.
    Springer Lecture Notes in Artificial Intelligence, (3056), available online here.
    Copyright for this publication is held by the Springer Verlag.
    Paper (pdf, 176 KB)   Paper (ps.gz, 68 KB)

2003:
  1. A Comparison of Fast Blocking Methods for Record Linkage
    Rohan Baxter, Peter Christen and Tim Churches
    Proceedings of the Workshop on Data Cleaning, Record Linkage and Object Consolidation at the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington DC, August 2003.
    Paper 3 pages (pdf, 87 KB)   Paper 6 pages (pdf, 138 KB)

2002:
  1. Preparation of name and address data for record linkage using hidden Markov models
    Tim Churches, Peter Christen, Kim Lim and Justin X Zhu
    Published online at BioMed Central Medical Informatics and Decision Making, December 2002.
    For abstract and downloadable PDF file see here. Also available locally: Paper (pdf, 353 KB)

  2. Probabilistic Name and Address Cleaning and Standardisation
    Peter Christen, Tim Churches and Justin Xi Zhu
    Proceedings of the Australasian Data Mining Workshop, Canberra, December 2002.
    Paper (ps.gz, 74 KB)   Paper (pdf, 158 KB)

  3. How Fast is '-fast'? Performance Analysis of KDD Applications using Hardware Performance Counters on UltraSPARC-III
    Adam Czezowski and Peter Christen
    Proceedings of the Australasian Data Mining Workshop, Canberra, December 2002.
    Paper (ps.gz, 82 KB)   Paper (pdf, 174 KB)

  4. High-Performance Computing Techniques for Record Linkage
    Peter Christen, Justin Xi Zhu, Markus Hegland, Stephen Roberts, Ole M. Nielsen, Tim Churches and Kim Lim
    Proceedings of the Australian Health Outcomes Conference (AHOC-2002), Canberra, July 2002.
    Paper (ps.gz, 95 KB)   Paper (pdf, 233 KB)

  5. Parallel Computing Techniques for High-Performance Probabilistic Record Linkage
    Peter Christen, Markus Hegland, Stephen Roberts, Ole M. Nielsen, Tim Churches and Kim Lim
    Proceedings of the Symposium on Health Data Linkage, Sydney, March 2002.
    Paper (ps.gz, 107 KB)   Paper (pdf, 228 KB)

  6. Performance Analysis of KDD Applications using Hardware Event Counters
    Peter Christen and Adam Czezowski
    Technical Report TR-CS-02-01, ANU Joint Computer Science Technical Report Series, February 2002.
    Report (ps.gz, 131 KB)   Report (pdf, 238 KB)

2001:
  1. DMtools - Open Source Software for Database Mining
    Peter Christen, Ole M. Nielsen and Markus Hegland
    In proceedings of the Workshop on Database Support for KDD (at the PKDD'2001 Conference), Freiburg, Germany, September 2001.
    Paper (ps.gz, 81 KB)

  2. Parallel Data Mining on a Beowulf Cluster
    Peter Christen, Ole M. Nielsen, Markus Hegland and Peter E. Strazdins
    Proceedings of the HPC Asia 2001 Conference, Gold Coast, Queensland, Australia, September 2001.
    Paper (ps.gz, 264 KB)   Paper (pdf.gz, 311 KB)

  3. A Scalable Parallel FEM Surface Fitting Algorithm for Data Mining
    Peter Christen, Markus Hegland, Stephen Roberts, Ole M. Nielsen and Irfan Altas
    Proceedings of the International Workshop on Mining Spatial and Temporal Data (at the PAKDD-2001 Conference), Hong Kong, April 2001.
    Paper (ps.gz, 229 KB)

  4. A Toolbox Approach to Flexible and Efficient Data Mining
    Ole M. Nielsen, Peter Christen, Markus Hegland, Tatiana Semenova and Timothy Hancock
    Proceedings of the PAKDD-2001 Conference, Hong Kong, April 2001.
    Published in the Springer Lecture Notes in Computer Science, Artificial Intelligence series, LNAI2035.
    Copyright for this publication is held by the Springer Verlag.
    Paper (ps.gz, 143 KB)   Paper (pdf, 183 KB)

  5. Towards a Parallel Data Mining Toolbox
    Peter Christen, Markus Hegland, Ole M. Nielsen, Stephen Roberts, Peter E. Strazdins, Irfan Altas, Tatiana Semenova and Timothy Hancock
    Proceedings of the 15th International Parallel and Distributed Processing Symposium (IPDPS-2001), San Francisco, April 2001. Workshop Parallel and Distributed Data Mining.
    Copyright 2001 Institute of Electrical and Electronic Engineers (IEEE). Reprinted for the Proceedings of the IPDPS-2001.
    Paper (ps.gz, 139 KB)

  6. Data Mining with Python
    Ole M. Nielsen, Peter Christen, Markus Hegland and Tatiana Semenova
    Proceedings of the 9th International Python Conference, Long Beach, California, March 2001.
    Paper available upon request from: Ole Nielsen.

  7. A Scalable Parallel FEM Surface Fitting Algorithm for Data Mining
    Peter Christen, Markus Hegland, Stephen Roberts and Irfan Altas
    Technical Report TR-CS-01-01, ANU Joint Computer Science Technical Report Series, January 2001.
    Report (ps.gz, 255 KB)   Report (pdf, 377 KB)

2000:
  1. Scalable Parallel Algorithms for Surface Fitting and Data Mining
    Peter Christen, Markus Hegland, Ole M. Nielsen, Stephen Roberts, Peter E. Strazdins and Irfan Altas
    In Elsevier Journal of Parallel Computing, special issue on Aspects of Parallel Computing for Linear Systems and Associated Problems, September 2000.

  2. Data Mining of Administrative Claims Data of Pathology Services
    Simon Hawkins, Graham Williams, Rohan Baxter, Peter Christen, Michael Fett, Markus Hegland, Fuchun Huang, Ole Nielsen, Tatiana Semenova and Andrew Smith
    Proceedings of the Thirty-Fourth Hawaii International Conference on System Sciences (HICSS-34), January 2001.
    Available upon request from: Rohan Baxter, CSIRO CMIS.

  3. Scalable Parallel Algorithms for Predictive Modelling
    Peter Christen, Markus Hegland, Ole Møller Nielsen, Stephen Roberts and Irfan Altas
    Proceedings of the Data Mining 2000 Conference, Cambridge, UK, N. Ebecken and C.A. Brebbia, editors, in Data Mining II, WIT Press, Southampton, Boston, 2000.
    Paper (ps.gz, 606 KB)

1999:
  1. The Integrated Delivery of Large-Scale Data Mining: The ACSYS Data Mining Project
    Graham Williams, Irfan Altas, Sergey Barkin, Peter Christen, Markus Hegland, Alonso Marquez, Peter Milne, Rajehndra Nagappan and Stephen Roberts
    KDD-99 Workshop on Large-Scale Parallel KDD Systems. San Diego, August 1999,
    Springer Lecture Notes in Artificial Intelligence 1759.

  2. Parallelization of a Finite Element Surface Fitting Algorithm for Data Mining
    Peter Christen, Irfan Altas, Markus Hegland, Stephen Roberts, Kevin Burrage and Roger Sidje
    Proceedings of the CTAC-99 Conference. Canberra, 20-24 September 1999.
    Paper (ps.gz, 552 KB)   Slides (ps.gz, 614 KB)

  3. A Parallel Iterative Linear System Solver with Dynamic Load Balancing
    Peter Christen
    Proceedings of the CTAC-99 Conference. Canberra, 20-24 September 1999.
    Paper (ps.gz, 467 KB)   Slides (ps.gz, 218 KB)

  4. A Parallel Finite Element Surface Fitting Algorithm for Data Mining
    Peter Christen, Irfan Altas, Markus Hegland, Stephen Roberts, Kevin Burrage and Roger Sidje
    Proceedings of the ParCo-99 Conference, Delft, 17-20 August 1999.

  5. A Parallel Iterative Linear System Solver with Dynamic Load Balancing
    Peter Christen
    Dissertation (PhD thesis), Institut für Informatik, Universität Basel. February 1999.
    Available upon request.

1998:
  1. PAISS - Design and Implementation of a Parallel Iterative Linear System Solver with Dynamic Load Balancing
    Peter Christen
    Technischer Bericht 98-5, October 1998.
    Report (ps.gz, 193 KB)

  2. A Parallel Iterative Linear System Solver with Dynamic Load Balancing
    Peter Christen
    Proceedings of the ACM International Conference on Supercomputing (ICS) 1998. Melbourne, 13-17 July 1998.

  3. Dynamic Load Balancing within a Parallel Iterative Linear System Solver
    Peter Christen
    Proceedings of the High-Performance Computing and Networking (HPCN) Conference 1998. Amsterdam, 21-23 April 1998, Springer Lecture Notes in Computer Science 1401.

1996:
  1. Speicher-Schemata für spärlich besetzte Matrizen (German)
    Peter Christen
    Institut für Informatik, Universität Basel. Technischer Bericht 96-4, September 1996.
    Report (ps.gz, 203 KB)

1995:
  1. Test- und Diagnosesoftware für Alpha7 (German)
    Peter Christen
    Diplomarbeit (MS thesis), Institut für Elektronik, ETH Zürich. Prof. Dr. A. Gunzinger, July 1995.


Selected Presentations:

2011:
  • Privacy-Preserving Data Matching
    Peter Christen
    Invited presentation to the Data Matching Working Group
    Australian Government Attorney-General's Department, Canberra, July 2011.
    Slides 8up (pdf, 2.0 MB)

  • Scalable Privacy-Preserving Record Linkage using Similarity-Based Indexing
    Peter Christen
    Invited presentation at Fujitsu Laboratories, Kawasaki, Japan, June 2011.
    Slides available upon request.

2000:
  • Application of Parallel Computing in Data Mining
    Peter Christen
    Seminar at the Suranaree University of Technology, December 2000.

  • Parallel Computing and Message Passing
    Peter Christen
    Two-day course at the Suranaree University of Technology, October 2000.

  • Data Mining at the ANU
    Peter Christen
    Presentation at the ADFA/ANU Machine Learning meeting, ANU, Canberra, September 2000.

  • ACSys CRC - Data Mining Tools
    Peter Christen and Ole Nielsen
    Presentation and Demonstration for the ACSys CRC Data Mining research group, ANU, Canberra, August 2000.

  • Parallel Algorithms in Data Mining - The ANU CSL Data Mining Approach
    Peter Christen
    Seminar at the Department of Computer Science, Australian National University, Canberra, July 2000.

  • Parallel Algorithms for Data Mining
    Peter Christen
    Seminar at the School for Information Studies, Charles Sturt University, Wagga Wagga, May 2000.
    Slides (pdf, 1.7 MB)  Slides (ps.gz, 1.5 MB)

1999:
  • Parallelization of a Finite Element Surface Fitting Algorithm for Data Mining
    Peter Christen, Irfan Altas, Markus Hegland, Stephen Roberts, Kevin Burrage and Roger Sidje
    CTAC-99 Conference, Canberra, September 1999.
    Slides (ps.gz, 614 KB)

  • A Parallel Iterative Linear System Solver with Dynamic Load Balancing
    Peter Christen
    CTAC-99 Conference, Canberra, September 1999.
    Slides (ps.gz, 218 KB)


Last modified: 08 November 2023. ANU CRICOS Provider Number: 00120C