DESCRIPTION OF PLS METHOD

History

The idea of PLS (Partial Least Squares) is relatively recent, but the method has been employed in analytical chemistry for several decades. There are several algorithms for fitting this model to an array, but essentially they can be reduced to two types: with orthogonal scores [1,3] or with non-orthogonal scores (Martens & Næs [4]).

The two algorithms yield identical predictions, but the actual scores and weights differ, and the former method introduces a loading matrix P into the computation (necessary to render the scores T orthogonal).

The statistical and geometrical properties of these algorithms have been investigated in many works over the years ([2,3] among others), and almost any recent book with a chapter on multivariate calibration contains a description of this method.

The PLS regression model has been successfully employed in many fields. In particular, in recent years some works on batch process monitoring have appeared in which it is applied to multi-way (namely three-way) arrays that are suitably rearranged into a matrix [5,6], both to predict quality variables and to monitor the process itself.

Aim:
PLS is a bilinear calibration model computed on a matrix X of dimensions n x p (which can be a matricised multi-way array X, in which case p=JK...) to predict one or more variables contained in a matrix Y of dimensions n x r (which can also be a multi-way array).
For PLS1, which is the model implemented in CuBatch, the predicted matrix Y is actually a vector (r=1) and is henceforth denoted y.
For a three-way array X and a predicted variable y, the model is then stated as:

where W are the weights, Ex are the residuals for X, B is the matrix of internal regression coefficients and eY are the residuals for the vector y.

Criterion:
While for PCA and other decomposition models the aim is to maximise the captured variance (that is, to minimise the distance between the data and the model used to describe them), in PLS1 the objective is to maximise the covariance between the scores t for one component (as the components are extracted one at a time) and the unexplained part of the predicted variable y.

The problem to solve is then the following:
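
In common PLS1 notation, one way to write this problem for the f-th component is sketched below. The unit-norm constraint on the weights and the symbol e for the current residual of y are assumptions made for illustration; X is taken to be already matricised.

```latex
\max_{\mathbf{w}_f} \; \operatorname{cov}\left(\mathbf{t}_f, \mathbf{e}\right)
\quad \text{subject to} \quad
\mathbf{t}_f = \mathbf{X}\mathbf{w}_f, \qquad \lVert \mathbf{w}_f \rVert = 1
```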

Algorithm:
The algorithm employed here is the same as for multilinear PLS and thus includes the definition of a core. In the specific case of a two-way X this matrix will be diagonal.

As for multilinear PLS, the components are still extracted one at a time; deflation is applied to y.

For the algorithm to be applicable, the array X first has to be rearranged into a matrix; several options have been proposed for this matricisation step, and the one employed here is the one suggested in [5].
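
As a rough illustration of such a rearrangement, the sketch below unfolds an I x J x K array into an I x JK matrix with NumPy. The function name `unfold_batchwise` and the column ordering (index j running fastest within each slice k) are assumptions made for this example, not necessarily the convention used in CuBatch.

```python
import numpy as np

def unfold_batchwise(X3):
    """Unfold an (I, J, K) array into an (I, J*K) matrix, keeping the first mode as rows.

    Assumed column ordering: column index = j + J*k (j runs fastest).
    """
    I, J, K = X3.shape
    # Fortran-order reshape makes the second index vary fastest along the columns
    return X3.reshape(I, J * K, order="F")

# Small example: 2 samples, 3 variables, 4 time points
X3 = np.arange(2 * 3 * 4).reshape(2, 3, 4)
X2 = unfold_batchwise(X3)
```

Each row of `X2` then holds all measurements belonging to one sample, so that p = JK as stated above.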

i. f = 1, e = y
ii
iv. compute wf (i.e. the f-th column of the weights) as the first left singular vector of Z
v.
vi. compute the regression coefficients
vii. update the residuals
viii. Repeat from ii. until the desired number of components has been extracted
ix. Compute the core:

where + is the Moore-Penrose inverse
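
The steps above can be sketched for a two-way (already matricised and centred) X as follows. This is a minimal illustration under stated assumptions, not the N-way Toolbox code: the function name `pls1_y_deflation`, the choice of Z as X'e, and the form of the core (pseudo-inverse of the scores times X times the pseudo-inverse of the transposed weights) are all assumptions filled in for the example.

```python
import numpy as np

def pls1_y_deflation(X, y, n_components):
    """PLS1 sketch: one component at a time, deflating only y (X is left untouched)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    e = y.copy()                                  # start with the residual equal to y
    W = np.zeros((X.shape[1], n_components))      # weights, one column per component
    T = np.zeros((X.shape[0], n_components))      # scores, one column per component
    for f in range(n_components):
        z = X.T @ e                               # assumed Z: cross-product of X and the y residual
        W[:, f] = z / np.linalg.norm(z)           # for a vector z this is its first left singular vector (up to sign)
        T[:, f] = X @ W[:, f]                     # scores of the current component
        # regression coefficients of y on all scores extracted so far
        b, *_ = np.linalg.lstsq(T[:, :f + 1], y, rcond=None)
        e = y - T[:, :f + 1] @ b                  # update the residuals of y
    # assumed form of the core, using Moore-Penrose inverses
    G = np.linalg.pinv(T) @ X @ np.linalg.pinv(W.T)
    return W, T, b, G

# Synthetic, centred example data (hypothetical)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
X -= X.mean(axis=0)
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=30)
y -= y.mean()

W, T, b, G = pls1_y_deflation(X, y, 2)
y_hat = X @ (W @ b)                               # fitted values for the training samples
```

In this sketch the weights come out mutually orthogonal because each new residual is orthogonal to the scores already extracted, and the core `G` reduces to a diagonal (here identity) matrix, in line with the remark above on the two-way case.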

 

Code:
The code for PLS is taken from the N-way Toolbox [8], version 2.1 of which is included in the software.
Updates can be downloaded at www.models.kvl.dk, but CuBatch is not guaranteed to work with them.

Applications:
Multivariate calibration, batch process monitoring, regression...

Dataset reference:
Fluorescence.mat

References:
[1]  Wold S et al, SIAM Journal on Scientific and Statistical Computing, 5 (1984), 735-743
[2]  Höskuldsson A, Journal of Chemometrics, 2 (1988), 211-228
[3]  Phatak A, de Jong S, Journal of Chemometrics, 11 (1997), 311-338
[4]  Martens H, Næs T, "Methods for calibration" in Multivariate Calibration, John Wiley & Sons, Chichester, 1989
[5]  Nomikos P, MacGregor JF, Chemometrics and Intelligent Laboratory Systems, 30 (1995), 97-108
[6]  Gurden SP et al, Chemometrics and Intelligent Laboratory Systems, 59 (2001), 121-136
[7]  Bro R, PhD dissertation, University of Amsterdam (1998)
[8]  Andersson CA, Bro R, Chemometrics and Intelligent Laboratory Systems, 52 (2000), 1-4

 

 
