Distributed Privacy-Preserving Prediction

Another day, another paper, this time by my postdoc Lingjuan Lyu and a few collaborators. Here’s the abstract: In privacy-preserving machine learning, individual parties are reluctant to share their sensitive training data due to privacy concerns. Even the trained model parameters or prediction can pose serious privacy leakage. To address these problems, we demonstrate a …

Accurate and Efficient Privacy-Preserving String Matching

A few ANU colleagues and I have just completed a paper on a suffix-tree-based algorithm for computing the longest common substring of two strings in a privacy-preserving manner. Here’s the abstract: The task of calculating similarities between strings held by different organizations without revealing these strings is an increasingly important problem in areas such as …

Linking Integer Records: The Simplest Case of PPRL

Privacy-Preserving Record Linkage (PPRL) is one of those problems that still doesn’t have a solid and widely accepted mathematical definition, perhaps because the problem of Record Linkage itself, especially the kind that doesn’t reduce to supervised learning through an abundance of labelled matches, still doesn’t have a solid mathematical definition despite thousands of papers published …

Hardening Bloom Filters using Paillier Encryption

Bloom Filters are a popular technique for privacy-preserving record linkage. However, recent work by Christen et al. [1] and others has shown that Bloom Filters (BF) are susceptible to different forms of frequency attack. There are many ideas on hardening BF to protect against frequency attacks, and one idea we will explore in this blog article …

Scalable Entity Resolution Using Probabilistic Signatures on Parallel Databases

My colleagues and I have just published on arXiv a simple but highly effective Entity Resolution algorithm that can scale to billions of records and handle significant data quality issues. The paper is titled Scalable Entity Resolution Using Probabilistic Signatures on Parallel Databases and it is an extension of our previous paper on linking millions of addresses …

Practical Algorithms for Distributed Privacy-Preserving Risk Modelling

In a previous post on the problem of detecting complex financial crimes, I described the following basic technology framework for financial intelligence units (FIUs) and their partner agencies and reporting entities (REs) to engage in collaborative but privacy-preserving and distributed risk modelling using confidential computing technologies. In this post, I describe a few concrete algorithms that …

How to Quickly and Meaningfully Improve the Financial System’s Collective Ability to Detect Crimes

Complex financial crimes are hard to detect primarily because data related to different pieces of the overall puzzle are usually distributed across a network of financial institutions, regulators, and law-enforcement agencies. The problem is also rapidly increasing in complexity because new platforms are emerging all the time that facilitate the transfer of value across a …