Scalable Entity Resolution Using Probabilistic Signatures on Parallel Databases
January 1, 2018
My colleagues and I have just published on arXiv a simple but highly effective Entity Resolution algorithm that can scale to billions of records and handle significant data quality issues. The paper is titled Scalable Entity Resolution Using Probabilistic Signatures on Parallel DatabasesĀ and it isĀ an extension of our previous paper on linking millions of addresses … More Scalable Entity Resolution Using Probabilistic Signatures on Parallel Databases