DESCRIPTION OF N-PLS METHOD

History

Multilinear PLS (nPLS) is a straightforward extension of the more established PLS model [1] in the variant with non-orthogonal loadings introduced by Martens & Næs [2].

It was first formulated by Bro [3,4]. The computation of the model, and specifically of the regression coefficients, was later improved through contributions by Smilde [5] and de Jong [6]. In particular, de Jong showed that an equivalent model can be computed when the deflation step is removed from the algorithm.

The initial PARAFAC-like structure of the model of X was later changed into a Tucker structure [7]. For the three-way case:

$$X^{(I \times JK)} = T\,G\,(W^K \otimes W^J)^T + E$$

where T holds the scores, W^K and W^J hold the weights in the third (K) and second (J) mode, G is the matricised core and E holds the residuals. The superscript (I x JK) refers to the way the array is matricised [4], namely as a (two-way) matrix with I rows and JK columns. This modification improved the prediction capability on the X array, in the calibration phase as well as for independent sets of data.
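The (I x JK) matricisation can be illustrated with a short NumPy sketch. The column ordering used here (the J index running fastest within each K slice) is an assumption, chosen so that the Kronecker product of a K-mode vector with a J-mode vector lines up with the columns; the variable names are illustrative only.

```python
import numpy as np

# A small three-way array with I = 2, J = 3, K = 4 (arbitrary example values).
I, J, K = 2, 3, 4
X = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# Matricise to I rows and J*K columns; order="F" makes the J index run fastest,
# so element (i, j, k) lands in column j + J*k.
X_unf = X.reshape(I, J * K, order="F")

# With this ordering, multiplying by kron(wK, wJ) contracts modes J and K at once.
wJ = np.array([1.0, 2.0, 3.0])
wK = np.array([0.5, -1.0, 0.0, 2.0])
t = X_unf @ np.kron(wK, wJ)

# Same contraction written directly on the three-way array.
t_direct = np.einsum("ijk,j,k->i", X, wJ, wK)
```

Both `t` and `t_direct` hold one value per row of X, computed as the double sum over j and k of x_ijk wJ_j wK_k.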

The model has been applied successfully in many fields, in particular analytical chemistry and batch process monitoring [8,9].

Aim:
The main purpose is to fit a multivariate (namely multi-way) calibration model to a multi-way array X of dimensions I x J x K x ... in order to predict one or more variables held in an array or matrix Y of dimensions I x L x M x ...
In the nPLS1 case, as for the model implemented in CuBatch, L and M are equal to 1 and Y is therefore a vector (henceforth denoted y).
For a three-way array X and a predicted variable y, the model is then stated as:


where W are the weights (the letters J and K refer to the mode), G is the matricised core, E_X are the residuals for X, B is the matrix with the internal regression coefficients and e_Y are the residuals for the y vector.
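Written out in matricised form, and assuming the notation of [7], the three-way nPLS1 model can be sketched as below. Here T denotes the score matrix, and the regression part is written with a single coefficient vector b, taken to be the column of B for the chosen number of components; that reading of B is an assumption of this sketch.

```latex
\begin{aligned}
X^{(I \times JK)} &= T\,G\,(W^K \otimes W^J)^{T} + E_X \\
y &= T\,b + e_Y
\end{aligned}
```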

Criterion:
While for PARAFAC and other decomposition models the aim is to maximise the captured variance (that is, to minimise the distance between the data and the model used to describe them), in multilinear PLS1 the objective is to maximise the covariance between the scores t of one component (components are extracted one at a time) and the part of y not yet accounted for.

For a three-way X and a predicted variable y the problem to solve is then the following:
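A sketch of this criterion, assuming centred data (so that the sum of products is proportional to the covariance) and unit-length weight vectors w^J and w^K, is given below. The second line shows why the problem reduces to a singular value decomposition: the objective is a bilinear form in the J x K matrix Z with elements z_jk. For later components, y is replaced by its current residual.

```latex
\max_{w^J,\,w^K} \left[ \sum_{i=1}^{I} t_i y_i \;\middle|\;
  t_i = \sum_{j=1}^{J} \sum_{k=1}^{K} x_{ijk}\, w^J_j\, w^K_k,\;
  \lVert w^J \rVert = \lVert w^K \rVert = 1 \right]

\sum_{i=1}^{I} t_i y_i
  = \sum_{j=1}^{J} \sum_{k=1}^{K} w^J_j\, w^K_k\, z_{jk}
  = (w^J)^{T} Z\, w^K,
\qquad z_{jk} = \sum_{i=1}^{I} y_i\, x_{ijk}
```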

Algorithm:
Originally [3,4] the algorithm required deflation of X after each component was fitted. Modifications introduced by de Jong [6] removed this necessity: although the components are still extracted one at a time, deflation is applied only to y.

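For the three-way case the algorithm is the following:

i. f = 1, e = y
ii. compute $z = (X^{(I \times JK)})^T e$
iii. reshape z into Z of dimensions (J x K)
iv. compute $w^J_f$ and $w^K_f$ (i.e. the f-th columns of the weights) as the first left and right singular vectors of Z
v. compute the scores $t_f = X^{(I \times JK)} (w^K_f \otimes w^J_f)$
vi. compute the regression coefficients
vii. update the residuals
viii. set f = f + 1 and repeat from ii. until the desired number of components has been extracted
ix. compute the core:

$$G = T^{+}\, X^{(I \times JK)} \left( (W^K \otimes W^J)^T \right)^{+}$$

where + denotes the Moore-Penrose inverse.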
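The three-way steps i. to ix. can be sketched in NumPy as follows. This is a minimal illustration, not the N-way toolbox code: it assumes X and y have already been preprocessed (e.g. centred), that the J index runs fastest in the matricised X, and that the regression coefficients in step vi. are obtained by least squares of y on all scores extracted so far. The function names `unfold`, `npls1_fit` and `npls1_predict` are hypothetical.

```python
import numpy as np


def unfold(X):
    """Matricise an (I, J, K) array to (I, J*K), with the J index running fastest."""
    return X.reshape(X.shape[0], -1, order="F")


def npls1_fit(X, y, n_components):
    """Sketch of three-way nPLS1 without deflation of X (assumed formulation)."""
    I, J, K = X.shape
    Xm = unfold(X)
    WJ = np.zeros((J, n_components))
    WK = np.zeros((K, n_components))
    T = np.zeros((I, n_components))
    B = np.zeros((n_components, n_components))
    e = np.asarray(y, dtype=float).copy()               # step i: residual starts as y
    for f in range(n_components):
        z = Xm.T @ e                                    # step ii
        Z = z.reshape(J, K, order="F")                  # step iii
        U, _, Vt = np.linalg.svd(Z, full_matrices=False)
        WJ[:, f], WK[:, f] = U[:, 0], Vt[0, :]          # step iv: first singular vectors
        T[:, f] = Xm @ np.kron(WK[:, f], WJ[:, f])      # step v: scores
        Tf = T[:, : f + 1]
        b = np.linalg.lstsq(Tf, y, rcond=None)[0]       # step vi: regress y on scores
        B[: f + 1, f] = b                               # column f: model with f+1 components
        e = y - Tf @ b                                  # step vii: deflate y only
    # step ix: core from Moore-Penrose inverses
    G = np.linalg.pinv(T) @ Xm @ np.linalg.pinv(np.kron(WK, WJ).T)
    return WJ, WK, T, B, G


def npls1_predict(Xnew, WJ, WK, B):
    """Predict y for new samples using all fitted components."""
    Xm = unfold(Xnew)
    n_components = WJ.shape[1]
    Tnew = np.column_stack(
        [Xm @ np.kron(WK[:, f], WJ[:, f]) for f in range(n_components)]
    )
    return Tnew @ B[:, n_components - 1]
```

Because X is never deflated, the scores of new samples are obtained directly from the stored weights, which is what `npls1_predict` relies on. Storing one coefficient vector per number of components in the columns of B is a design choice of this sketch.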

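If X has four or more modes, steps iii., v. and ix. extend straightforwardly to the new case.

E.g. for a four-way X:
iii. reshape into a three-way array Z, with one mode for each of the second, third and fourth modes of X
v. compute the scores using the Kronecker product of the three weight vectors
ix. compute the core using the Kronecker product of the three weight matrices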

The weight vectors are computed (step iv.) as the loading vectors of a one-component PARAFAC model fitted to the Z array, i.e. Z is approximated by the outer product of one vector per mode, and all weights are subsequently normalised to length 1.
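A one-component PARAFAC fit of a three-way Z (the four-way X case) can be sketched with alternating updates, which for a single component reduce to the power-type iterations below. The starting values (leading singular vectors of the unfolded Z), the stopping rule and the function name `rank_one_weights` are assumptions of this sketch, and Z is assumed to be non-zero.

```python
import numpy as np


def rank_one_weights(Z, n_iter=200, tol=1e-12):
    """One-component PARAFAC of a three-way array; returns unit-length weights."""
    n2, n3 = Z.shape[1], Z.shape[2]
    # Start from the leading left singular vector of the mode-2 and mode-3 unfoldings.
    w2 = np.linalg.svd(np.moveaxis(Z, 1, 0).reshape(n2, -1), full_matrices=False)[0][:, 0]
    w3 = np.linalg.svd(np.moveaxis(Z, 2, 0).reshape(n3, -1), full_matrices=False)[0][:, 0]
    lam_old = 0.0
    for _ in range(n_iter):
        # Update each weight vector with the other two held fixed, then normalise.
        w1 = np.einsum("jkl,k,l->j", Z, w2, w3)
        w1 /= np.linalg.norm(w1)
        w2 = np.einsum("jkl,j,l->k", Z, w1, w3)
        w2 /= np.linalg.norm(w2)
        w3 = np.einsum("jkl,j,k->l", Z, w1, w2)
        lam = np.linalg.norm(w3)
        w3 /= lam
        if abs(lam - lam_old) <= tol * lam:
            break
        lam_old = lam
    return w1, w2, w3
```

For a two-way Z the same fit is given in closed form by the first left and right singular vectors, which is why the three-way algorithm can use an SVD in step iv.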
 

Code:
The code for nPLS is taken from the N-way toolbox [10]; version 2.1 of the toolbox is included in the software.
Updates can be downloaded at www.models.kvl.dk, but CuBatch is not guaranteed to work with them.

Applications:
Multivariate calibration, batch process monitoring.

Dataset reference:
Fluorescence.mat

References:
[1]  Wold S et al, SIAM Journal on Scientific and Statistical Computing, 5 (1984), 735-743
[2]  Martens H, Næs T, "Methods for calibration" in Multivariate Calibration, John Wiley & Sons, Chichester, 1989
[3]  Bro R, Journal of Chemometrics, 10 (1996), 47-61
[4]  Bro R, PhD dissertation, University of Amsterdam (1998)
[5]  Smilde AK, Journal of Chemometrics, 11 (1997), 367-377
[6]  de Jong S, Journal of Chemometrics, 12 (1998), 77-81
[7]  Bro R et al, Chemometrics and Intelligent Laboratory Systems, 58 (2001), 3-13
[8]  Bro R, Heimdal H, Chemometrics and Intelligent Laboratory Systems, 34 (1996), 85-102
[9]  Gurden SP et al, Chemometrics and Intelligent Laboratory Systems, 59 (2001), 121-136
[10] Andersson CA, Bro R, Chemometrics and Intelligent Laboratory Systems, 52 (2000), 1-4
 
