Correlating Brainwave Patterns with Mental Tasks Using MATLAB to Train Neural Networks on a Parallel Computer

Charles W. Anderson
Dept. of Computer Science, Colorado State University, Fort Collins, CO, 80523.
anderson@cs.colostate.edu, 970-491-7491, FAX: 970-491-2466
Application categories: Fuzzy Logic/Neural Networks, Biology and Medicine

Training Neural Networks to Recognize Patterns in EEG

Most EEG research seeks to understand the dynamic processes in the brain that are the basis of physical and mental behavior. In addition to serving as tools to probe the mind, EEG signals are being investigated as a new mode of human-computer communication. If a small number of mental states can be reliably detected, then a person could compose sequences of such states to indicate commands to a computer, just as letters are composed to form words. This short paper summarizes the results of analyzing EEG signals recorded from four subjects while they performed five mental tasks, along with the MATLAB procedures used for the analysis.

The subjects were seated in a sound-controlled booth with dim lighting and noiseless fans for ventilation. An Electro-Cap elastic electrode cap was used to record from positions C3, C4, P3, P4, O1, and O2, defined by the 10-20 system of electrode placement. The electrodes were connected through a bank of Grass 7P511 amplifiers, and the signals were bandpass filtered from 0.1 to 100 Hz. Data was recorded at a sampling rate of 250 Hz with a Lab Master 12-bit A/D converter mounted in an IBM-AT computer. Eye blinks were detected by means of a separate channel of data recorded from two electrodes placed above and below the subject's left eye.

The subjects were asked to perform five mental tasks: a baseline task, for which the subjects were asked to relax as much as possible; the letter task, for which the subjects were instructed to mentally compose a letter to a friend or relative without vocalizing; the math task, for which the subjects were given nontrivial multiplication problems, such as 49 times 78, and were asked to solve them without vocalizing or making any other physical movements; the visual counting task, for which the subjects were asked to imagine a blackboard and to visualize numbers being written on the board sequentially; and the geometric figure rotation task, for which the subjects were asked to visualize a particular three-dimensional block figure being rotated about an axis.

Data was recorded for 10 seconds during each task, and each task was repeated five times per session. Figure 1 shows one half-second window of EEG signals from the six channels for each of the five tasks. Before classification, we transformed the raw EEG signals by representing them as the coefficients of autoregressive (AR) models.
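To make the AR representation concrete, the following is a minimal sketch of how one half-second, six-channel window could be reduced to a vector of AR coefficients. The model order (6), the least-squares fitting method, and all variable names are assumptions made for illustration; the text does not state which order or estimator was used, and the window here is synthetic noise rather than recorded EEG.

```matlab
% Sketch: AR coefficients of one half-second, six-channel EEG window.
% ASSUMPTIONS: model order 6 and a least-squares fit are illustrative only.
fs = 250;                 % sampling rate in Hz (from the text)
nSamples = fs / 2;        % 125 samples in a half-second window
nChannels = 6;            % C3, C4, P3, P4, O1, O2
order = 6;                % hypothetical AR model order

window = randn(nSamples, nChannels);   % synthetic stand-in for real EEG

features = zeros(1, order * nChannels);
for ch = 1:nChannels
    x = window(:, ch) - mean(window(:, ch));   % remove the channel mean
    % Lagged regressors: row for time t holds x(t-1), ..., x(t-order).
    X = zeros(nSamples - order, order);
    for k = 1:order
        X(:, k) = x(order - k + 1 : nSamples - k);
    end
    y = x(order + 1 : nSamples);               % values to be predicted
    a = X \ y;                                 % least-squares AR coefficients
    features((ch - 1) * order + (1:order)) = a';
end
```

Under these assumptions each window becomes a row of 36 numbers (6 coefficients for each of 6 channels), which could then serve as one input pattern for a classifier.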

Neural networks were trained to classify half-second segments of six-channel EEG data into one of five classes corresponding to the five cognitive tasks performed by the four subjects. Two- and three-layer feedforward neural networks (see Figure 2) were trained using 10-fold cross-validation and early stopping to control overfitting (a feature not currently included in the MATLAB Neural Network Toolbox). Figure 3 shows a typical training experiment. The training error continues to decrease through 3,000 epochs. Two smaller sets of data are held out from the training data set. The validation set is used to find the epoch at which the network best classifies novel data, in this case Epoch 396. Beyond Epoch 396, the neural network begins to overfit the training data, meaning that it becomes fine-tuned to idiosyncrasies in the training data that will not necessarily appear in new EEG data sets. To predict how well this network will classify entirely new data, the error on the third data set, called the test set, is calculated.
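The early-stopping rule described above amounts to picking the epoch with the lowest validation error and reporting the test error at that epoch. A minimal sketch follows, assuming the three error curves have already been recorded once per epoch; the curves are synthetic placeholders chosen only to have the qualitative shape described in the text, not the data behind Figure 3, and the variable names are hypothetical.

```matlab
% Sketch: choosing the early-stopping epoch from recorded error curves.
% ASSUMPTION: the curves below are synthetic placeholders, not real results.
nEpochs = 3000;
epochs = (1:nEpochs)';
trainErr = 0.5 * exp(-epochs / 800) + 0.05;                 % keeps decreasing
valErr   = 0.3 + 0.2 * exp(-epochs / 150) + 1e-4 * epochs;  % falls, then rises
testErr  = valErr + 0.02;                                   % hypothetical offset

[bestValErr, bestEpoch] = min(valErr);    % epoch that best fits held-out data
generalizationErr = testErr(bestEpoch);   % test error at the saved weights

fprintf('best epoch %d, validation %.3f, test %.3f\n', ...
        bestEpoch, bestValErr, generalizationErr);
```

In an actual training loop the weights would be saved whenever the validation error reaches a new minimum, so that the network from the best epoch is available after training ends.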

Averaging the output of the neural network over consecutive half-second windows yielded a higher classification accuracy. Figure 4 shows that the classification accuracy increased from 54% to 96% for this run by averaging over 20 consecutive windows.
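The averaging step can be sketched as a moving average applied to each output unit, followed by picking the largest averaged output. The outputs below are synthetic (a weak signal on one class buried in uniform noise), and the names, the number of windows, and the use of a causal moving average are assumptions for illustration; real outputs would come from the trained network, for example the testouts value returned by nnTrain.

```matlab
% Sketch: averaging network outputs over consecutive windows before
% choosing a class. ASSUMPTION: outputs are synthetic, not network results.
nWindows = 200; nClasses = 5; nAvg = 20; trueClass = 3;
outputs = 0.1 + 0.3 * rand(nWindows, nClasses);       % noisy background
outputs(:, trueClass) = outputs(:, trueClass) + 0.1;  % weak true-class signal

% Moving average down each column over nAvg consecutive windows;
% drop the first nAvg-1 rows, which average over fewer windows.
avgOutputs = filter(ones(nAvg, 1) / nAvg, 1, outputs);
avgOutputs = avgOutputs(nAvg:end, :);

[~, rawClass] = max(outputs, [], 2);      % class chosen per single window
[~, avgClass] = max(avgOutputs, [], 2);   % class chosen after averaging
rawAccuracy = mean(rawClass == trueClass);
avgAccuracy = mean(avgClass == trueClass);
fprintf('single-window %.2f, averaged %.2f\n', rawAccuracy, avgAccuracy);
```

Averaging helps when the errors in individual windows are roughly independent, since the noise shrinks while the consistent component of the output remains. The cost is latency: a decision needs nAvg windows of data.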

Preparing Data, Initiating the Training, and Analyzing Results in MATLAB

An existing C program, written by the author, was used to train the networks because it can train on a 128-processor SIMD computer, the CNAPS Server II made by Adaptive Solutions, Inc., of Beaverton, Oregon. A suite of MATLAB functions has been developed to prepare data, initiate the training of neural networks with the C program, and analyze the results. Additional functions were written for this application to examine and transform EEG signals and to train networks to classify them. Below is part of the help message for the function nnTrain that was used to train networks to classify EEG signals:

%nnTrain:
% [teste,bestep,testouts,wh,wo,errorCurve] = nnTrain(...
%  matrices,nrows,ninputs,nhiddens1,nhiddens2,noutputs,inthru,...
%  hrates,orates,moms,epochs,options);
%
% Output Arguments:
%  teste     is error on test set using best network.
%  bestep    is epoch at which net produced lowest error on validation set.
%  testouts  is output of network for each test pattern.
%  wh,wo     are weight matrices, for hidden and output layers.
%  errorCurve is train, validation, and test errors per epoch (20 values
%            spaced evenly over entire training run).
%
% Input Arguments:
%  matrices  patterns, by row.  Each row has inputs then targets.
%  nrows     number of rows in each piece appearing sequentially in matrices.
%  ninputs   number of components in each input vector, not counting constant.
%  nhiddens1  number of hidden units.  If a vector, all values will be run.
%            If running hammer, this vector set to a nondecreasing order
%            to work around bnlib bugs.
%  nhiddens2  number of hidden units in second layer. Can be empty.
%            If a vector, all values will be run.
%            If running hammer, this vector set to a nondecreasing order
%            to work around bnlib bugs.
%  noutputs  number of components in target vector.
%  inthru    boolean flags to add inputs straight through to output layer.
%  hrates    learning rate for hidden units. If a vector, all will be run.
%  orates    learning rate for output units.  If a vector, all will be run.
%  moms      momentum rates for all units.  If a vector, all will be run.
%  epochs    number of epochs to train.
%  options   optional string containing any subset of the following arguments:

As an example of how to use nnTrain, a neural network is trained to implement the 2-bit exclusive-or function. Networks of 2, 4, and 10 hidden units in a single hidden layer are tried.

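 % Three pieces of four patterns each: inputs in columns 1-2, target in column 3.
 d1 = [0 0 .1; 0 1 .9; 1 0 .9; 1 1 .1];  % 2-bit exclusive-or
 d2 = d1 + randn(4,3)*.1;                % noisy copy of d1
 d3 = d1 + randn(4,3)*.1;                % second noisy copy of d1
 % Try 2, 4, and 10 hidden units in one hidden layer for 2000 epochs.
 nnTrain([d1;d2;d3],[4 4 4],2,[2 4 10],0,1,0,.5,0.1,0.9,2000,...
 'f=xor.results m=hammer o=long');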

Results are extracted and graphed by calling:

 nnResults('xor.results');

This results in graphs of training, validation, and testing errors, as in Figure 3. Also displayed is a representation of the network's weights, as in Figure 5.

A better understanding of how a neural network solves a problem can be gained by looking at the weight values learned from multiple training experiments. We have done this by collecting all hidden-unit weight vectors from 20 runs and clustering them using a k-means clustering function written in MATLAB. The results for our EEG experiments are shown in Figure 6.
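For readers unfamiliar with k-means, the following is a minimal sketch of the standard algorithm (Lloyd's iteration) applied to a matrix of weight vectors. It is not the author's clustering function. The matrix W is random and stands in for the collected hidden-unit weights, and its dimensions, the iteration cap, and all names are hypothetical; only the choice of 20 clusters follows Figure 6.

```matlab
% Sketch: plain k-means (Lloyd's algorithm) on hidden-unit weight vectors.
% ASSUMPTIONS: W is synthetic; nVectors, nWeights, and nIter are illustrative.
nVectors = 400; nWeights = 36; k = 20; nIter = 50;
W = randn(nVectors, nWeights);        % one row per hidden-unit weight vector

perm = randperm(nVectors);
centers = W(perm(1:k), :);            % initialize centers from k random rows
assign = zeros(nVectors, 1);
for iter = 1:nIter
    % Squared Euclidean distance from every vector to every center.
    d = zeros(nVectors, k);
    for c = 1:k
        delta = W - repmat(centers(c, :), nVectors, 1);
        d(:, c) = sum(delta .^ 2, 2);
    end
    [~, newAssign] = min(d, [], 2);   % nearest center for each vector
    if isequal(newAssign, assign)
        break;                        % assignments stable: converged
    end
    assign = newAssign;
    for c = 1:k
        members = W(assign == c, :);
        if ~isempty(members)
            centers(c, :) = mean(members, 1);   % move center to cluster mean
        end
    end
end
```

Each row of centers is then a prototype weight vector. Displaying these prototypes shows which input components carry consistently large weights across runs.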

The ability to train neural networks on our parallel computer proved very valuable for our EEG experiments. We compared the time required to train neural networks of various sizes on a Sun Sparc 10 and on our 128-processor CNAPS Server. The results are shown in Figure 7. Execution time increases approximately linearly with the number of hidden units on the serial Sparc 10, but on the parallel CNAPS Server it increases only from 3.8 minutes for two hidden units to 4.1 minutes for 80 hidden units. A network of 80 hidden units takes about 240 times longer on the Sparc 10 than on the CNAPS Server. Running 90 repetitions of the training procedure for a network with 40 hidden units took approximately 6 hours on the CNAPS Server; on the Sparc 10 this would require 37 days.
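The quoted timings can be cross-checked with a few lines of arithmetic. This sketch uses only the numbers stated in the paragraph above; the variable names are illustrative.

```matlab
% Sketch: speedups implied by the timings quoted in the text.
cnapsHours40 = 6;     % 90 repetitions, 40 hidden units, on the CNAPS Server
sparcDays40  = 37;    % estimated time for the same job on the Sparc 10
speedup40 = sparcDays40 * 24 / cnapsHours40;     % 888 h / 6 h = 148

cnapsMin80 = 4.1;     % 80 hidden units on the CNAPS Server
ratio80    = 240;     % "about 240 times longer" on the Sparc 10
sparcHours80 = ratio80 * cnapsMin80 / 60;        % 984 min = 16.4 hours

fprintf('40 hidden units: %.0fx; 80 hidden units on Sparc 10: %.1f h\n', ...
        speedup40, sparcHours80);
```

The smaller ratio at 40 hidden units than at 80 is consistent with the serial time growing roughly linearly with network size while the parallel time stays nearly flat.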

The combination of the parallel computer and the MATLAB environment for controlling and analyzing neural network experiments has proved to be useful in many other domains. The user can alter a number of parameters and quickly see the results graphically, gaining an intuition for the effects of parameters such as the size of the network.


Figure 1: First one-half second of data from each task.


Figure 2: Feedforward neural network with one or two hidden layers.


Figure 3: RMS error versus training epochs for training, validation, and test sets. The epoch at which the error on the validation data is lowest is 396. The network's weights at epoch 396 are saved as the best weights. The error on the test set using these best weights is designated as the generalization error of the network.


Figure 4: Network output values and desired values for one test trial for one subject. Averaging the network's output over consecutive windows increases the classification accuracy to 96%.


Figure 5: A 20-0 network. The columns of the upper matrix represent the weights in each hidden unit and the rows of the lower matrix represent the weights in each output unit. Positive weights are filled, negative weights are unfilled.


Figure 6: Results of k-means clustering for 20 clusters with the AR representation. This shows that EEG signals from the O1 and O2 channels are most relevant to this classification problem.


Figure 7: Minutes of execution time for increasing network size for the parallel CNAPS server and a serial Sun Sparc 10. All runs were for 1000 epochs.