In-Database Machine Learning Illustrated

I have just received the excellent news that Apache MADlib, a big data machine learning library for which I was a committer until recently, has graduated to become a top-level Apache project. The basic idea behind MADlib is actually quite interesting and deserves to be more widely known. Massively Parallel Processing (MPP) databases like Greenplum have

Online Support Vector Machines

I have been studying and experimenting with online learning algorithms for support vector machines (SVMs) for a while now, primarily with the intention of understanding how they can be used to learn SVM models on large multi-terabyte datasets. The following technical report describes the NORMA and PEGASOS family of algorithms and gives some observations and relevant