Partial Least Squares (PLS) is a widely used technique in chemometrics, especially when
- there is multi-collinearity in the set of variables;
- the number of variables is larger than the number of data points; and
- there are multiple response variables.
There are many articles on PLS, but the mathematical details do not always come out clearly in these treatments. In this technical note, I have attempted to describe PLS in clear and precise mathematical terms.
In particular, I show that, given a design matrix X and a response matrix Y, the PLS algorithm seeks transformed variables of X that have high variance (as in principal component analysis) and high correlation with Y, as stated in Section 3.5 of The Elements of Statistical Learning.
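
To make the variance-and-correlation statement concrete, here is a minimal NumPy sketch of the first PLS direction only. It rests on assumptions that are not part of the note itself: a single response vector y rather than a full response matrix Y, column-centered data, and simulated inputs. The function names `first_pls_direction` and `squared_covariance` are hypothetical, chosen for illustration. Under these assumptions the first weight vector is proportional to X^T y, and the squared covariance it maximizes factors into squared correlation times the variance of the transformed variable (times the variance of y, which does not depend on the weights).

```python
import numpy as np


def first_pls_direction(X, y):
    """Unit-norm weight vector w proportional to Xc^T yc (single-response case)."""
    Xc = X - X.mean(axis=0)  # center each column of the design matrix
    yc = y - y.mean()        # center the response
    w = Xc.T @ yc
    return w / np.linalg.norm(w)


def squared_covariance(X, y, w):
    """Squared sample covariance between the transformed variable X @ w and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    t = Xc @ w
    return (t @ yc / (len(y) - 1)) ** 2


# Simulated data (an assumption for illustration): more variables than data points.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)

w = first_pls_direction(X, y)
t = (X - X.mean(axis=0)) @ w  # first transformed variable (score vector)

# Squared covariance = squared correlation * Var(t) * Var(y)
cov2 = squared_covariance(X, y, w)
corr2 = np.corrcoef(t, y)[0, 1] ** 2
product = corr2 * t.var(ddof=1) * y.var(ddof=1)
```

Here `cov2` and `product` should agree up to floating-point error, and no other unit-norm weight vector should give a larger `squared_covariance` than `w`. Later PLS directions require deflation or orthogonality constraints, which this sketch does not cover.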