6- PCA
The PCA model is a 2-way model. For an overview of the method, click here.
Some specifications:
- It allows missing data.
- N-way data are automatically
unfolded, with the 1st mode against all the others.
- It allows multiple sets.
- "Number of PC"
is the field where the user enters the number of principal components (PCs) to compute.
- "OK" button
runs the PCA model. A default plot is displayed after the run (unless an information
message appears instead). The other plots are available in the "results" menu.
- "bootstrap" button
runs the PCA model on the original data and on bootstrap-resampled data. The user
obtains the same plot as above, but with a bootstrap estimate of the
stability of the model (convex hulls, etc.). The bootstrap can be naive or residual;
see "Preferences/bootstrap". Available only for 3D!
- "Preprocessing" window
allows the user to choose among 4 classical preprocessing options.
- Column-centered: subtracts from
each element of the matrix X (corresponding to the data set) the
mean of its column. Widely used!
- Row-centered: subtracts from
each element of the matrix X the mean of its row. Not frequently used.
- Column-scaled: divides each
element of the matrix X by the Euclidean norm of its column.
- Row-scaled: divides each
element of the matrix X by the Euclidean norm of its row.
- "Close" button
closes the "PCA" window.
- Validation window (available
only for 3D!): 3 methods are available to help the user choose the number
of PCs needed.
- Cross-Validation
- Naive bootstrap
- Residual bootstrap
How to do validation? Check the "Validation" box, choose
your method and push the "OK" button: a plot will be drawn.
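The unfolding and the four preprocessing options described above can be illustrated with a short NumPy sketch. This is not the toolbox's own code: the function names, the order in which the options are applied, and the column ordering of the unfolded matrix are assumptions made for illustration, and missing data and zero-norm rows or columns are not handled.

```python
import numpy as np

def unfold_first_mode(data):
    """Unfold an N-way array into a 2-way matrix, 1st mode against the others.

    Assumption: NumPy's default (row-major) ordering of the unfolded columns;
    the toolbox may order them differently.
    """
    data = np.asarray(data, dtype=float)
    return data.reshape(data.shape[0], -1)

def preprocess(X, column_center=False, row_center=False,
               column_scale=False, row_scale=False):
    """Apply the selected preprocessing options to a 2-way matrix X.

    Assumption: centering is applied before scaling, columns before rows.
    """
    X = np.array(X, dtype=float)
    if column_center:
        # subtract from each element the mean of its column
        X = X - X.mean(axis=0, keepdims=True)
    if row_center:
        # subtract from each element the mean of its row
        X = X - X.mean(axis=1, keepdims=True)
    if column_scale:
        # divide each element by the Euclidean norm of its column
        X = X / np.linalg.norm(X, axis=0, keepdims=True)
    if row_scale:
        # divide each element by the Euclidean norm of its row
        X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X
```

For example, a 2 x 3 x 4 array is unfolded into a 2 x 12 matrix, and after column-centering every column of that matrix has zero mean.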
-------------------------------------------------------------------------
Submenus and sub-windows:
Choice of algorithms:
- Nipals: Nipals is an iterative algorithm
which is sequential, i.e. it calculates singular components one at a time.
It also uses less memory than SVD. It is most efficient when only
a few leading singular components are needed (generally not more than 10).
- SVD: uses the Matlab built-in function
"svd", which computes the singular value decomposition of a matrix using QR
decomposition principles. It can run out of memory on large
data sets; in that case, try the Nipals algorithm.
N.B.: When data are missing, Nipals is
used automatically.
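To make the sequential nature of Nipals concrete, here is a minimal NumPy sketch of a textbook Nipals iteration for complete data. It is not the toolbox's implementation: the function name, the starting vector, the tolerance and the iteration limit are assumptions, and the missing-data variant mentioned in the N.B. is not covered.

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=1000):
    """Extract principal components one at a time (complete data only).

    Returns scores T (rows x components) and loadings P (columns x components)
    such that X is approximated by T @ P.T.
    """
    residual = np.array(X, dtype=float)
    n_rows, n_cols = residual.shape
    scores = np.zeros((n_rows, n_components))
    loadings = np.zeros((n_cols, n_components))
    for k in range(n_components):
        # assumed starting vector: the column with the largest sum of squares
        t = residual[:, np.argmax(np.sum(residual ** 2, axis=0))].copy()
        for _ in range(max_iter):
            p = residual.T @ t            # project the data onto the scores
            p /= np.linalg.norm(p)        # loadings have unit norm
            t_new = residual @ p          # updated scores
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        scores[:, k] = t
        loadings[:, k] = p
        # deflation: remove the component just found before the next one
        residual -= np.outer(t, p)
    return scores, loadings
```

With this convention the Euclidean norm of each score column approximates the corresponding singular value, so the result can be checked against the output of an SVD routine on a small matrix. Only one score vector and one loading vector are updated at a time, which is why the memory footprint stays small when few components are requested.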
Bootstrap options:
- Bootstrap model: naive (resampling
the horizontal slabs of X) or residual (resampling the horizontal slabs of
the residuals of PCA(X)).
- Bootstrap replicates: number of bootstrap
resamplings.
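The difference between the two bootstrap models can be sketched in NumPy for a 2-way matrix, where the horizontal slabs are simply the rows. This is an illustration under stated assumptions, not the toolbox's code: the function names are hypothetical, and adding the resampled residuals back to the PCA fit is the usual residual-bootstrap convention, assumed here because the text only says that the residuals are resampled.

```python
import numpy as np

def pca_fit(X, n_components):
    """Rank-n_components reconstruction of X, computed here with an SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]

def naive_bootstrap(X, n_replicates, rng):
    """Yield replicates made by resampling the rows of X with replacement."""
    n = X.shape[0]
    for _ in range(n_replicates):
        idx = rng.integers(0, n, size=n)
        yield X[idx]

def residual_bootstrap(X, n_components, n_replicates, rng):
    """Yield replicates made by resampling the rows of the PCA residuals.

    Assumption: each replicate is the PCA fit plus the resampled residuals.
    """
    fitted = pca_fit(X, n_components)
    residuals = X - fitted
    n = X.shape[0]
    for _ in range(n_replicates):
        idx = rng.integers(0, n, size=n)
        yield fitted + residuals[idx]
```

A possible use, with a hypothetical data matrix X and 100 replicates: `replicates = list(naive_bootstrap(X, 100, np.random.default_rng(0)))`. The PCA model is then refitted on each replicate to assess how stable the results are.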