ICA2 and classification
ICA2 decomposition
After cleaning the signal at both AR1 and AR2 stages, a number of ICA2 runs are performed. Prior removal of artifacts improves the quality of decomposition at this stage. Moreover, the best possible separation of sources is then identified among multiple ICA2 realizations.
The ICs obtained with the ICA procedure can be divided into two main categories: those representing brain activity and those representing other, non-brain sources, including biological and technical artifacts (for simplicity, we neglect here that some ICs can contain both types of activity as a result of suboptimal decomposition). ICs are classified as brain or non-brain in every ICA2 run to decide which ICs will be retained for further analysis and which will be rejected. Brain ICs will then be localized, and brain activations will be calculated as the sum of the activity contributed by all brain ICs.
Classification of independent components
The purpose of this step is to classify the resulting components into those that reflect brain activity and non-brain components that represent various sources of noise (environmental, physiological, e.g., cardiac, ocular, and residual channel artifacts). Here we implement automatic classification of ICs using two streams of processing: one model designed to extract information from topographies and the other utilizing ICA signal features. Two custom machine learning models thus account for different IC features, and their output probabilities are finally combined via a weighted average.
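The weighted combination of the two model outputs can be sketched as follows. This is only an illustration: the function name, the weight value, and the 0.5 decision threshold are assumptions, not values taken from ASCT.

```python
def combine_probabilities(p_topo, p_signal, w_topo=0.5):
    """Weighted average of the 'brain' probabilities given by two models.

    p_topo   -- probability from the topography model
    p_signal -- probability from the signal-feature model
    w_topo   -- weight of the topography model (hypothetical value)
    """
    if not 0.0 <= w_topo <= 1.0:
        raise ValueError("w_topo must lie in [0, 1]")
    return w_topo * p_topo + (1.0 - w_topo) * p_signal


# Example: the two models disagree moderately about one IC.
p_brain = combine_probabilities(0.9, 0.6, w_topo=0.5)
# The 0.5 threshold is an assumption made for this sketch.
label = "brain" if p_brain >= 0.5 else "non-brain"
```

Changing w_topo shifts how much the final decision relies on the topography model relative to the signal-feature model.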
The first model utilizes a simple, custom-built feed-forward network. It operates on a set of spectral and statistical metrics that parametrize the signals and their complex relationships. The following three variables help to filter out ocular artifacts and those stemming from extra-cerebral electrical activity in the body:
signal,
power,
spectral correlation between electrical channels and ICs.
The next two variables are meant to detect noise components:
1/f spectrum similarity,
spectral flatness.
Two more features account for electrical noise as well as heart-based components:
kurtosis,
frequency of abnormally large peaks.
Moreover, for event-related paradigms, another feature is utilized to account for post-stimulus signal:
variability time-locked to experimental events (post- vs. pre-stimulus variance).
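To make some of these features concrete, here is a minimal sketch of three of them using textbook definitions. The exact formulas used by ASCT are not given here, so the function names, the 1e-12 offset, and the use of a plain variance ratio are all assumptions.

```python
import numpy as np


def spectral_flatness(x):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 1 indicate a flat (noise-like) spectrum; values near 0
    indicate power concentrated in a few frequencies.
    """
    power = np.abs(np.fft.rfft(x)) ** 2
    power = power[1:] + 1e-12  # drop the DC bin and avoid log(0)
    return float(np.exp(np.mean(np.log(power))) / np.mean(power))


def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (about 0 for Gaussian data)."""
    z = (x - np.mean(x)) / np.std(x)
    return float(np.mean(z ** 4) - 3.0)


def post_pre_variance_ratio(epochs, n_pre):
    """Post- vs. pre-stimulus variance of an epoched IC time course.

    epochs -- array of shape (n_trials, n_samples)
    n_pre  -- number of samples preceding the stimulus in each epoch
    """
    return float(epochs[:, n_pre:].var() / epochs[:, :n_pre].var())


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)
tone = np.sin(2 * np.pi * 10 * np.arange(4096) / 256)  # 10 Hz at 256 Hz
spiky = noise.copy()
spiky[::512] += 20.0  # a few abnormally large peaks
epochs = rng.standard_normal((50, 200))
epochs[:, 100:] *= 2.0  # larger post-stimulus amplitude
```

On this synthetic data the white noise has a much higher spectral flatness than the pure tone, the spiky signal has a much higher kurtosis than the plain noise, and the variance ratio for the epochs is close to 4.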
The network itself consists of two fully connected layers, each followed by batch normalization and the Rectified Linear Unit (ReLU) activation function. The network ends in a softmax layer that outputs the probabilities of brain and non-brain origin of a given IC. The number of nodes in the fully connected layers, the optimization algorithm, and the regularization parameter were chosen by grid search cross-validation on the validation dataset for each new instance of a model.
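The forward pass of such a network can be sketched in plain NumPy. The layer sizes, the random untrained weights, the final linear projection to two outputs, and the batch normalization without learned scale and shift are all assumptions made to keep the example short; this is not the ASCT implementation.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch (learned scale/shift omitted).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(x, params):
    """Two blocks of FC -> batch norm -> ReLU, then a softmax output."""
    h = x
    for w, b in params["hidden"]:
        h = np.maximum(batch_norm(h @ w + b), 0.0)
    w_out, b_out = params["out"]
    return softmax(h @ w_out + b_out)


rng = np.random.default_rng(0)
n_features, n_hidden = 8, 16  # hypothetical sizes
params = {
    "hidden": [
        (rng.standard_normal((n_features, n_hidden)) * 0.1, np.zeros(n_hidden)),
        (rng.standard_normal((n_hidden, n_hidden)) * 0.1, np.zeros(n_hidden)),
    ],
    "out": (rng.standard_normal((n_hidden, 2)) * 0.1, np.zeros(2)),
}

features = rng.standard_normal((5, n_features))  # 5 ICs, 8 features each
probs = forward(features, params)  # columns: two class probabilities per IC
```

Each row of probs sums to 1, so the two columns can be read as the probabilities of the two classes for one IC.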
Choosing the proper classification model
Some pre-trained models come with the ASCT. They are saved in the class_models directory of the toolbox. ASCT will try to guess the proper model based on the data type; however, the model can also be set explicitly with the ica.ICA2_classModel parameter.
Currently, the following models are available:
There are separate models for MEG and EEG, as the characteristics of the signal and the IC topographies differ between the two modalities. Also, ER (event-related) and RS (resting state) data use separate classifiers, because the former uses a pre- vs. post-stimulus variance feature to account for typical ERP patterns with low prestimulus activity. For recordings with additional electric leads, the 'elec' version should be used, as the additional channels improve classification accuracy. When no additional electric channels are present, 'noelec' is to be used. We always encourage recording EOG activity, especially for EEG. For recordings where only scalp channels are available, we found it beneficial to create a mock EOG channel during preprocessing and use the 'mockEOG' model. To create such an extra channel, set setchan.addMockEOG = [1, 2], where the numbers are the indices of the frontal electrodes located close to the eyes (typically Fp1 and Fp2). With this directive, a linear combination of these channels will be created, which will serve as a measure of oculographic activity.
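The idea of a mock EOG channel can be sketched as follows. The text only states that a linear combination of the selected channels is created; the equal weights used here, the function name, and the choice to append the new channel as the last row are assumptions.

```python
import numpy as np


def add_mock_eog(data, channel_indices_1based, weights=None):
    """Append a mock EOG channel built from selected frontal channels.

    data                   -- array of shape (n_channels, n_samples)
    channel_indices_1based -- 1-based channel indices, as in [1, 2]
    weights                -- combination weights; equal weights are an
                              assumption, the toolbox may use others
    """
    idx = np.asarray(channel_indices_1based) - 1  # to 0-based indexing
    if weights is None:
        weights = np.full(len(idx), 1.0 / len(idx))
    mock = np.asarray(weights) @ data[idx, :]  # linear combination
    return np.vstack([data, mock])


# Toy recording: 3 channels, 4 samples.
data = np.arange(12.0).reshape(3, 4)
with_eog = add_mock_eog(data, [1, 2])
```

With equal weights, the last row of with_eog is simply the average of the first two channels.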
Choosing the best ICA2 realization
We recommend repeating ICA2 many times (around 20-30) on the same cleaned data. This yields a number of ICA2 realizations, from which the best one has to be chosen, so that the optimal decomposition is used for further analysis.
ICA2 files contain all the ICA realizations, which can be found in the data_ica2.ICA.iter structure. The classification results are stored there in the total_ic_number and brain_ic_number variables, separately for each ICA realization. As can be seen in the figure below, the realizations can differ significantly from one another.
The most important criterion for choosing the best ICA2 realization is the maximal number of components classified as 'brain' (brain_ic_number). If several realizations share the same number, the artifact contamination parameter is taken into account. This parameter (artifact_COOOOO) can be calculated only when electric channels are available in the recording. It estimates how much artifactual activity can still be found in brain components compared to non-brain components. Therefore, among the tied realizations, we choose the one with the minimal value of this parameter.
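The selection rule described above can be sketched as follows. The realizations are represented as plain dictionaries; brain_ic_number and total_ic_number follow the names in the text, while the artifact_contamination key and the function name are hypothetical stand-ins.

```python
def choose_best_realization(realizations):
    """Return the index of the preferred ICA2 realization.

    Primary criterion: largest brain_ic_number.
    Tie-breaker: smallest artifact contamination (hypothetical key
    'artifact_contamination'); a missing value sorts last among ties.
    """
    def sort_key(item):
        _, r = item
        contamination = r.get("artifact_contamination")
        if contamination is None:  # e.g. no electric channels recorded
            contamination = float("inf")
        return (-r["brain_ic_number"], contamination)

    best_index, _ = min(enumerate(realizations), key=sort_key)
    return best_index


# Made-up values, for illustration only.
realizations = [
    {"total_ic_number": 60, "brain_ic_number": 21, "artifact_contamination": 0.30},
    {"total_ic_number": 60, "brain_ic_number": 24, "artifact_contamination": 0.25},
    {"total_ic_number": 60, "brain_ic_number": 24, "artifact_contamination": 0.10},
]
best = choose_best_realization(realizations)
```

Here the second and third realizations tie on brain_ic_number, so the third one is selected because its contamination value is lower.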