Confession: As a computer scientist, I have always been comfortable with discrete mathematics. However, continuous maths, especially the type commonly seen in statistical machine learning, has always been a challenge for me. In fact, I lived through the last 15 years of my professional life in a more-or-less constant fog of partial understanding when it comes to machine learning maths. Now put yourself in the shoes of someone who, for nearly ten years, worked in the best machine learning group in Australia and you start to get the general unease (err, low self-esteem) I have been living with…
After years of hard work and self-study, the fog lifted for me about 3 years ago, and I thought I’d share parts of my journey with interested readers. For some reason, I have a strong suspicion I’m not alone in this. 😉
The most useful thing I did to lift the fog was doing a lot of mathematical writing. Specifically, I found the exercise of writing out “obvious” or omitted proofs in textbooks particularly useful. One exercise that really helped me understand machine learning maths was fleshing out the details in Chapter 3: Linear Methods for Regression in The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman, in particular the sections on Shrinkage Methods and Methods Using Derived Input Directions.
The whole thing started when someone asked me whether I knew about Partial Least Squares (PLS), to which I replied “of course” without thinking clearly. I soon discovered that, actually, I didn’t, and proceeded to spend a couple of weeks (or maybe it was months) learning as much about PLS as I could. The outcome of that exercise is this tech report.
The topic is surprisingly rich, especially once you start to venture into closely related areas. It’s not quite ML-complete, but it touches on many useful mathematical techniques in machine learning. My belief is that if you can work through the above report line by line, your confidence in dealing with machine-learning maths will be lifted a lot.
That’s the start.
I have spent a great deal of energy thinking about (my lack of proper) mathematical training, and here you can find a few books that really helped me over the years.
Polya teaches creativity in mathematical problem solving, and Velleman teaches discipline in mathematical writing. Both are short-ish books that are easy to enjoy.
If you get tired of going through the previous set of books, you can relax your mind by switching between those and this next set of popular-science books, which teach the major mental models that I think every data scientist should know.
There is no shortcut to mathematical competency at the postgrad machine learning level, at least none I’m aware of. (There’s a book called All the Mathematics You Missed: But Need to Know for Graduate School that is useful as a summary, but it doesn’t teach you mathematical thinking.) But I’m confident everyone can get to a competent level by going through the above diligently.