Machine Learning: A Broad Church

I am sometimes asked what the difference is between Machine Learning (ML) and X, where X is one of a number of fields such as Statistics, Evolutionary Computing, Control Theory, etc. A variation of the question asks which problem classes can be tackled by both ML and non-ML techniques, and what the pros and cons of each approach are.

Machine learning is a broad church, and the answer to the above questions depends, to a large extent, on how one defines machine learning: by the problem definition (e.g. supervised and unsupervised learning in its various forms, reinforcement learning, online vs batch learning, etc.) or by the techniques and algorithms available to solve the problem classes so defined. The former is a somewhat static definition; the latter is a constantly evolving feast.

These questions raise the simpler question of what ML is and which techniques / algorithms sit inside the ML circle. To unpack this further, let’s look at the “eigenquestions” of Machine Learning. There are four essential dimensions to think about:

  • the optimisation criterion (which is tied to a small number of valid induction principles like Empirical Risk Minimisation, Minimum Description Length, Bayesian Classifier, Minimax principle from Game Theory, AIXI, etc);
  • the class of models to be considered (e.g. neural nets, decision trees, ensembles of trees, all possible Turing machines, etc);
  • the transformations that can be applied to the data available for learning (e.g. explicit feature engineering or implicit feature mapping through kernel methods like SVM); and
  • finally, the optimisation algorithm used to pick a good model from the model class, one that fits the data well according to the optimisation criterion (e.g. convex programming, linear programming, greedy search, dynamic programming, genetic programming, etc).

Basically all machine learning algorithms can be understood in terms of the choices made along those four dimensions. As you can see, it’s a pretty broad class that covers a lot of areas. It’s also worth mentioning that the Reductions in Machine Learning literature shows that many of those choices are reducible to a very small set of core choices.
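To make the four dimensions concrete, here is a minimal sketch of one toy learner in which each dimension is a separate, swappable piece. The particular choices (squared-loss Empirical Risk Minimisation, linear models, hand-built polynomial features, plain gradient descent) and all function names are my own illustrative assumptions, not a canonical decomposition.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Dataset = Sequence[Tuple[float, float]]


# Dimension 3: data transformation (explicit feature engineering).
def poly_features(x: float) -> Vector:
    """Map a scalar input to the hand-built features [1, x, x^2]."""
    return [1.0, x, x * x]


# Dimension 2: model class (linear functions of the features).
def predict(weights: Sequence[float], feats: Sequence[float]) -> float:
    return sum(w * f for w, f in zip(weights, feats))


# Dimension 1: optimisation criterion (empirical risk under squared loss).
def empirical_risk(weights: Sequence[float], data: Dataset,
                   transform: Callable[[float], Vector]) -> float:
    return sum((predict(weights, transform(x)) - y) ** 2
               for x, y in data) / len(data)


# Dimension 4: optimisation algorithm (full-batch gradient descent).
def gradient_descent(data: Dataset, transform: Callable[[float], Vector],
                     steps: int = 5000, lr: float = 0.1) -> Vector:
    dim = len(transform(data[0][0]))
    weights = [0.0] * dim
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * dim
        for x, y in data:
            feats = transform(x)
            err = predict(weights, feats) - y
            for j, f in enumerate(feats):
                grad[j] += 2.0 * err * f / n  # gradient of the mean squared error
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


# Noise-free toy data generated from y = 1 + 2x - 3x^2 on [-1, 1].
data = [(k / 10.0, 1.0 + 2.0 * (k / 10.0) - 3.0 * (k / 10.0) ** 2)
        for k in range(-10, 11)]
weights = gradient_descent(data, poly_features)
```

Swapping any one piece gives a different algorithm from the same template: replacing `gradient_descent` with a greedy or evolutionary search changes only the fourth dimension, while replacing `poly_features` changes only the third.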

The above hopefully provides a framework for answering the question of how Machine Learning differs from X. In most cases, X is either actually a form of Machine Learning, or X is a component of one or more of the four dimensions listed above. There are certainly a lot of alternative techniques one can apply to a specific learning problem, and the ML community goes through different fads every 5-10 years. Deep learning is the current rage, as are all the recent fair and privacy-preserving ML algorithms. In 5 years’ time, I am sure the community will move on to something bigger and better.

On the question of which problem classes are solvable using ML but also solvable using non-ML techniques, part of the answer again depends on understanding what those “non-ML techniques” are in terms of the four dimensions listed above, and whether machine learning is actually better than human learning for the problem at hand. On the latter, I note that in industry and government, most people still stick largely to rule-based or expert systems whenever they can, for the simple reason that the additional accuracy / efficiency achievable using ML is often not enough to justify the complexity of setting up the human and IT infrastructure needed for ML model deployment and management (see also the general discipline of ML Ops).

As to the class of problems that can’t be solved by ML, it’s worth mentioning:

  • the effectiveness of Prediction Markets to tackle really difficult and dynamic prediction problems that ML can’t handle;
  • the still, in practice, unsolved problem of integrating learning and reasoning (or integrating probability and logic); and
  • the problem of lifting machine learning to the level of an organisation / group of communicating agents.

I think the Machine Learning definition is broad enough that, over time, those problems and their solutions will become part of the Machine Learning circle as well.

