Quality Financial Journalism: A List

The book Rich Dad, Poor Dad by Robert Kiyosaki is probably the worst book in the personal-finance genre I have ever read: it contains a great deal of bad advice and generally encourages hubris instead of prudence in financial decision-making (in the name of entrepreneurism). I did, however, pick up one useful lesson from it: in finance, you need to …

From Words to Concepts: Explicit Semantic Analysis

Everyone with a rudimentary understanding of text analytics knows about term frequency-inverse document frequency (TF-IDF) vectors. What is less well known, but deserves to be more widely appreciated and applied, is a related little trick called Explicit Semantic Analysis (ESA), introduced by E. Gabrilovich and S. Markovitch, for which they were awarded the 2014 IJCAI-JAIR Best Paper Prize (https://www.jair.org/bestpaper.html). ESA …

Unifying Logic and Probability for Learning

Unifying logic and probability is an active and ongoing research topic of great interest to many. There are many proposals of probabilistic logics in the literature, each with a different motivation, either computational or philosophical, and a different system of syntax and semantics. This state of affairs is confusing and not satisfactory, especially in view …

The Competitive Moat of Google, Facebook and Other Data Owners

When I was building the Data Science Centre of Excellence at Reliance Industries, I once interviewed a world-class researcher who has worked at multiple institutions, including Yahoo and Google, and he told me something I hadn’t understood till then. I thought a company like Google has a weak business moat because it is vulnerable to …

A Short Course on Statistical Learning

Here is a short (and somewhat unusual) course on statistical machine learning that I have delivered multiple times over the last few years: Introduction to Statistical Learning Theory; Bayesian Probability Theory; Sequence Prediction and Data Compression; Bayesian Networks. In designing this course, I have deliberately steered away from the usual practice of giving students a (long) …

How to Prove It

A major deficiency in many university-level computer science programs is the neglect of training in fundamental mathematical skills. This deficiency usually rears its head when a CS student first moves into an area like Data Science and quickly realises s/he does not even have the ability to fully understand papers and books in the field, let alone contribute …

Online Support Vector Machines

I have been studying and experimenting with online learning algorithms for support vector machines (SVMs) for a while now, primarily with the intention of understanding how they can be used to learn SVM models on large multi-terabyte datasets. The following technical report describes the NORMA and PEGASOS families of algorithms and gives some observations and relevant …