Privacy Preserving Outlier Detection: A Tutorial

Outlier detection is an important tool in risk modelling. When data are distributed across multiple locations and data privacy is a concern, we need privacy-preserving techniques for outlier detection. Linked here is a tutorial introduction to this topic that I recently prepared.

Privacy Preserving Outlier Detection

The presentation builds on a few resources, including

  • Du & Atallah, Privacy-Preserving Cooperative Statistical Analysis, 2001
  • Vaidya & Clifton, Privacy-Preserving Outlier Detection, 2004

I guess the key takeaways are these:

  • Privacy-preserving (PP) statistical algorithms can appear difficult to appreciate at first because they sit at the intersection of data science and cryptography, both substantial topics in their own right. However, I find that once we have a good handle on a few primitives (oblivious transfer, secure scalar product, secure comparison, etc.), many PP algorithms become relatively easy to understand and implement.
  • There are now practical PP algorithms for a range of problems so data science practitioners should really start paying attention.
  • The foundational technologies behind PP algorithms, including the important Secure Multi-party Computation problem, are increasingly being combined with blockchain technology to produce potentially disruptive systems like the Enigma system from MIT.
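
To give a feel for how small these primitives can be, here is a toy sketch of a secure scalar product between two parties. It assumes a semi-trusted helper that hands out correlated random values and then drops out, and it assumes inputs are integers reduced modulo a prime. This is my own illustration, not necessarily the protocol used in the slides or in the papers above, and the names (`secure_scalar_product`, `rand_vec`, `P`) are made up for the example. At the end Alice holds v1 and Bob holds v2 with v1 + v2 = x·y (mod P), and neither sees the other's vector in the clear.

```python
import secrets

# Assumption: all arithmetic is done modulo a public prime P.
P = 2**61 - 1


def rand_vec(n):
    """Return n uniformly random values in [0, P)."""
    return [secrets.randbelow(P) for _ in range(n)]


def dot(u, v):
    """Scalar product modulo P."""
    return sum(a * b for a, b in zip(u, v)) % P


def secure_scalar_product(x, y):
    """Toy simulation: Alice holds x, Bob holds y.

    Returns (v1, v2), the additive shares of x.y held by Alice and Bob.
    All three roles run in one process here purely for illustration.
    """
    n = len(x)

    # Helper (hypothetical semi-trusted third party): gives (Ra, ra) to
    # Alice and (Rb, rb) to Bob, where ra + rb = Ra.Rb (mod P).
    Ra, Rb = rand_vec(n), rand_vec(n)
    ra = secrets.randbelow(P)
    rb = (dot(Ra, Rb) - ra) % P

    # Alice -> Bob: her vector masked by Ra.
    x_masked = [(xi + r) % P for xi, r in zip(x, Ra)]
    # Bob -> Alice: his vector masked by Rb.
    y_masked = [(yi + r) % P for yi, r in zip(y, Rb)]

    # Bob picks his share v2 at random and sends t to Alice.
    v2 = secrets.randbelow(P)
    t = (dot(x_masked, y) + rb - v2) % P

    # Alice strips the masks; what remains is x.y - v2 (mod P).
    v1 = (t - dot(Ra, y_masked) + ra) % P
    return v1, v2


v1, v2 = secure_scalar_product([1, 2, 3], [4, 5, 6])
print((v1 + v2) % P)  # the two shares recombine to the scalar product
```

The masks cancel because t − Ra·(y + Rb) + ra = x·y − v2 + (ra + rb − Ra·Rb), and the bracketed term is zero by construction. A real deployment would need actual network messages, an argument for why the helper can be trusted not to collude, and an encoding for real-valued data, none of which this sketch attempts.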
