Privacy Preserving Outlier Detection: A Tutorial

Outlier detection is an important tool in risk modelling. When data are distributed across multiple locations and data privacy is a concern, we need privacy-preserving techniques for outlier detection. Linked here is a tutorial introduction to the topic that I recently prepared.

Privacy Preserving Outlier Detection

The presentation builds on a few resources, including

  • Du & Atallah, Privacy-Preserving Cooperative Statistical Analysis, 2001.
  • Vaidya & Clifton, Privacy-Preserving Outlier Detection, 2004.

I guess the key takeaways are these:

  • Privacy-preserving (PP) statistical algorithms can be difficult to appreciate at first because they sit at the intersection of data science and cryptography, both substantial topics in their own right. However, I find that once we have a good handle on a few primitives (oblivious transfer, secure scalar product, secure comparison, etc.), many PP algorithms become relatively easy to understand and implement.
  • There are now practical PP algorithms for a range of problems, so data science practitioners should really start paying attention.
  • The foundational technologies behind PP algorithms, including secure multi-party computation, are increasingly being used in conjunction with blockchain technology to produce potentially disruptive systems like Enigma from MIT.
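
To give a flavour of how simple one of these primitives can be, here is a minimal Python sketch of a secure scalar product between two parties, using the commodity-server idea found in Du and Atallah's line of work. It is an illustration under stated assumptions and not necessarily the exact protocol in the slides: all three roles (Alice, Bob, and a semi-trusted commodity server that colludes with neither party) are simulated in a single process, arithmetic is done modulo a public prime P chosen arbitrarily here, and the function and variable names are made up for this example. Alice holds x, Bob holds y, and each ends up with one random-looking share; the two shares sum to the dot product.

```python
import secrets

# Public prime modulus; an arbitrary illustrative choice, not from the source.
P = 2**61 - 1


def rand_vec(n):
    """Return n uniformly random field elements."""
    return [secrets.randbelow(P) for _ in range(n)]


def dot(u, v):
    """Dot product modulo P."""
    return sum(a * b for a, b in zip(u, v)) % P


def secure_scalar_product(x, y):
    """Simulate the three roles in one process (hypothetical helper).

    Returns (v1, v2) with (v1 + v2) % P == dot(x, y).
    Alice learns only v1, Bob learns only v2.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)

    # Commodity server: correlated randomness with ra + rb == Ra . Rb (mod P).
    # (Ra, ra) goes to Alice, (Rb, rb) goes to Bob.
    Ra, Rb = rand_vec(n), rand_vec(n)
    ra = secrets.randbelow(P)
    rb = (dot(Ra, Rb) - ra) % P

    # Alice -> Bob: her vector masked by Ra.
    x_masked = [(xi + r) % P for xi, r in zip(x, Ra)]
    # Bob -> Alice: his vector masked by Rb.
    y_masked = [(yi + r) % P for yi, r in zip(y, Rb)]

    # Bob picks his own share v2 at random and sends u to Alice.
    v2 = secrets.randbelow(P)
    u = (dot(x_masked, y) + rb - v2) % P

    # Alice removes the masking terms to obtain her share v1.
    v1 = (u - dot(Ra, y_masked) + ra) % P
    return v1, v2


if __name__ == "__main__":
    v1, v2 = secure_scalar_product([1, 2, 3], [4, 5, 6])
    print((v1 + v2) % P)
```

Neither share reveals the dot product on its own; adding them modulo P recovers it, which is 32 for x = [1, 2, 3] and y = [4, 5, 6]. In distance-based outlier schemes of the kind Vaidya & Clifton describe, shares like these would roughly be the input to a secure comparison against a distance threshold, so that the distance itself is never disclosed.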
