Financial Intelligence Units (FIUs) around the world collect data like threshold transaction reports, international fund transfer reports, and suspicious matter/activity reports from Reporting Entities (REs), which include banks, money remitters, casinos, law firms, real-estate companies, and financial companies. They may also get data about entities of interest from partner agencies (PAs) like law-enforcement agencies (LEAs) and partner FIUs in other countries.
The following diagram shows the current state of the art in financial-crime detection at the more advanced FIUs.
All the data they receive are pushed through an entity-resolution step to arrive at a table like the one above, where the rows are entities (individuals, companies, or groups of related individuals and companies) and the table shows their behaviour in the financial system as seen through the lens of different reporting entities. Entity resolution is usually a highly non-trivial computational problem because of the sheer number of parties involved in financial transactions (billions for a typical FIU). In some cases, we also have explicit risk indicators for these entities. Given such a table, an FIU equipped with competent data scientists and big-data analytics platforms can use profiling and statistical machine learning techniques, both supervised and unsupervised, at different levels of granularity (individual transactions, single entities, and networks of entities) to uncover risks in the financial system and act on them.
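As a rough illustration of the unsupervised profiling mentioned above, here is a minimal Python sketch that flags entities whose total reported activity is far from the norm, using a robust z-score based on the median and the median absolute deviation. The entity names, the amounts, the three columns, and the 3.5 cut-off are all invented for the example; a real FIU would work with far richer features and models.

```python
from statistics import median

# Hypothetical entity table: amount each entity moved, as reported by three
# made-up reporting entities (one column each). All values are invented.
entities = {
    "entity_1": [12_000, 8_000, 0],
    "entity_2": [9_500, 11_000, 500],
    "entity_3": [10_200, 9_100, 0],
    "entity_4": [11_300, 7_800, 250],
    "entity_5": [310_000, 295_000, 120_000],
}

def robust_scores(values):
    """Return a robust z-score for each value, based on the median and MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [0.6745 * (v - med) / mad for v in values]

names = list(entities)
totals = [sum(entities[name]) for name in names]
scores = robust_scores(totals)

THRESHOLD = 3.5  # illustrative cut-off, an assumption of this sketch
flagged = [name for name, s in zip(names, scores) if abs(s) > THRESHOLD]
print(flagged)  # ['entity_5']
```

The same idea extends to per-column scores or to network-level features, but only for the data an FIU can already see in one place.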
That’s the current state, and it is about as good as it gets. However, it has important limitations, almost all of which stem from two facts:
- Complex financial crimes typically involve transactions that span multiple financial institutions in multiple geographies and go through multiple payment channels, some of which are unregulated.
- Financial intelligence units and reporting entities all work with data silos, and no one has a complete view of activity across the financial system.
To appreciate the extent of the problem, one need only look at the difficulty of answering a seemingly simple question, such as finding all entities that exhibit a certain cross-border transaction pattern like the one shown in the next diagram.
This question, and many more complicated ones we would like to ask, simply cannot be answered today because the data needed to answer them sit in different organisations in different countries, governed by different legislation.
So what can we do?
Here’s a possible way forward using distributed confidential computing technologies. It assumes that financial intelligence units and partner agencies are willing to work together, but it does not require wholesale changes to AML/CTF legislation everywhere.
The obvious obstacle to different FIUs and partner agencies sharing large volumes of data with each other is privacy. In the proposed scheme above, instead of entity resolution, all the participating agencies contribute data that are first matched using privacy-preserving record linkage algorithms. The matched data are then encrypted using a homomorphic encryption scheme, which allows mathematical operations to be performed on encrypted data, to arrive at a table similar to the one shown earlier, with two distinctions:
- the values in the table are all encrypted and
- the columns of the table are stored in a distributed way across different databases in different agencies.
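To make the record-linkage step a little more concrete, here is a minimal Python sketch of one well-known family of privacy-preserving record linkage techniques: each agency encodes a name into a keyed Bloom filter over character bigrams, and only the encodings are compared, using the Dice coefficient. This is an illustration rather than the scheme's prescribed algorithm. The shared key, the filter length, the number of hash functions, and the sample names are all assumptions of the example, and plain Bloom-filter encodings are known to need further hardening before real use.

```python
import hashlib
import hmac

BLOOM_BITS = 256   # filter length (illustrative assumption)
NUM_HASHES = 4     # hash functions per bigram (illustrative assumption)

def bigrams(text):
    """Split a normalised string into its set of padded character bigrams."""
    padded = f"_{text.lower().strip()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def encode(text, secret_key):
    """Return the set of bit positions set in a keyed Bloom filter for text."""
    positions = set()
    for gram in bigrams(text):
        for k in range(NUM_HASHES):
            message = f"{k}:{gram}".encode()
            digest = hmac.new(secret_key, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % BLOOM_BITS)
    return positions

def dice(a, b):
    """Dice coefficient between two encodings (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical key agreed between the agencies out of band.
shared_key = b"example-shared-secret"

# Each agency encodes its own records locally; only encodings are exchanged.
agency_a = encode("Jonathan Smith", shared_key)
agency_b_similar = encode("Jonathon Smith", shared_key)
agency_b_other = encode("Maria Garcia", shared_key)

similar_score = dice(agency_a, agency_b_similar)
other_score = dice(agency_a, agency_b_other)
print(similar_score > other_score)  # True
```

Records whose score exceeds an agreed threshold would be treated as the same entity, which tolerates small spelling differences without either agency revealing the underlying names.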
The point of using a homomorphic encryption scheme is that it is still possible to run a range of risk-modelling algorithms over the encrypted data, and once we get to a small set of entities of interest, decrypting those entities’ data can be done without raising any privacy concerns for the general public.
Having the data distributed in different databases does complicate the running of risk-modelling algorithms somewhat, but the added security and scalability make it a worthwhile trade-off.
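To see what computing on encrypted data can look like, here is a toy Python sketch of the Paillier cryptosystem, a partially (additively) homomorphic scheme. It is used here purely as an illustration; the scheme above does not commit to a particular cryptosystem. Multiplying two ciphertexts yields an encryption of the sum of the plaintexts, and raising a ciphertext to a constant yields an encryption of the scaled plaintext, which is enough for a linear risk score. The primes are far too small for real use, and the amounts and weights are invented.

```python
import math
import secrets

# Toy key material: two known Mersenne primes. Far too small for real use.
P = 2**31 - 1
Q = 2**61 - 1

def keygen(p=P, q=Q):
    """Return (public modulus n, private key) for Paillier with g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public modulus n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n_sq * pow(r, n, n_sq) % n_sq

def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    return (pow(c, lam, n * n) - 1) // n * mu % n

def add(n, c1, c2):
    """Ciphertext of the sum of the two underlying plaintexts."""
    return c1 * c2 % (n * n)

def scale(n, c, k):
    """Ciphertext of k times the underlying plaintext."""
    return pow(c, k, n * n)

n, private_key = keygen()

# Hypothetical amounts for one linked entity, held by two different agencies.
enc_a = encrypt(n, 120_000)
enc_b = encrypt(n, 45_500)

# Computed on ciphertexts only; the weights 3 and 2 are invented.
enc_total = add(n, enc_a, enc_b)
enc_score = add(n, scale(n, enc_a, 3), scale(n, enc_b, 2))

# Only the key holder can read the results.
total = decrypt(n, private_key, enc_total)
score = decrypt(n, private_key, enc_score)
print(total, score)  # 165500 451000
```

An additive scheme like this supports sums and weighted sums but not arbitrary computation such as comparisons on ciphertexts, so in practice the choice of scheme constrains which risk-modelling algorithms can be run.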
The above proposed scheme, I believe, is a way forward, if not the way forward, in our battle against complex financial crimes.
- Using data vs seeing data by Stephen Hardy
- Distributed Machine Learning and Partially Homomorphic Encryption
- Data61’s confidential computing work
- CryptDB – SQL Processing on Encrypted Data
- The Enigma Engine from MIT
- Quick introduction to homomorphic encryption
- Privacy Preserving Outlier Detection