Modeling and Classification of EEG using Recurrent Neural Networks
Abstract:
Background and Objective
A number of algorithms for the classification of electroencephalogram (EEG) have been proposed in recent years. However, many of these approaches have not yet reached a performance level that is sufficient for practical applications. We hypothesize that a major drawback of current algorithms is their limited ability to incorporate the rich temporal information contained in EEG. This temporal information is often accounted for by embedding a small number of voltages observed from previous time steps into a single input to the classifier. This technique is limited because the temporal information that can be utilized is bounded by the number of embedded values. Furthermore, enough training samples must be observed to sufficiently represent the temporal changes that can take place within this window. We propose that these problems can be overcome with the use of recurrent artificial neural networks (RNN). Since RNN's contain feedback connections, they have an intrinsic state that allows them to generate outputs based not only on the current input but also on the sequence of previous inputs.
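As a rough illustration of the limitation described above, the following sketch (Python/NumPy, with a hypothetical window length) constructs the embedded inputs; the classifier only ever sees the most recent window of voltages, so any dependency longer than the window is lost.

```python
import numpy as np

def embed(eeg, window=5):
    """Stack the last `window` voltages into a single classifier input.

    eeg    : 1-D array of voltages, one per time step
    window : number of previous samples embedded into each input (hypothetical value)

    Returns an array of shape (len(eeg) - window + 1, window). Any temporal
    structure longer than `window` samples is invisible to the classifier.
    """
    return np.array([eeg[t - window + 1 : t + 1]
                     for t in range(window - 1, len(eeg))])
```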
Methods
EEG is recorded from a subject while they perform a series of imagined mental tasks. A separate RNN is then trained to model the EEG recorded during each mental task by predicting the sequence several steps ahead in time. In this way, we have a set of RNN's that can each be viewed as an expert at predicting EEG similar to that over which it was trained. Classification of previously unencountered EEG can then be performed by applying each RNN and assigning the class associated with the network that was able to model the sequence with the lowest error. We investigate the use of two RNN paradigms: Elman Networks trained using Backpropagation Through Time and Echo State Networks trained using Linear Least Squares.
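The decision rule can be sketched as follows, assuming each trained network is wrapped in a hypothetical predict(sequence) callable that returns its several-step-ahead prediction at every time step of the sequence:

```python
import numpy as np

def classify(sequence, predictors, horizon=3):
    """Assign the class whose expert RNN models the sequence with the lowest error.

    sequence   : recorded EEG, shape (T, channels)
    predictors : dict mapping mental-task label -> predict(sequence) callable,
                 where predict returns the network's horizon-step-ahead
                 prediction at every time step (hypothetical interface)
    horizon    : number of steps ahead each network was trained to predict
    """
    targets = sequence[horizon:]                  # values the networks should predict
    errors = {}
    for label, predict in predictors.items():
        predicted = predict(sequence)[:-horizon]  # align predictions with targets
        errors[label] = np.mean((predicted - targets) ** 2)
    return min(errors, key=errors.get)            # class of the best-fitting expert
```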
Preliminary Results
To date, we have demonstrated that RNN's are capable of predicting an EEG sequence a small number of steps ahead in time with a relatively high degree of accuracy, suggesting that RNN's are able to model EEG well. Preliminary classification results have demonstrated that our approach is able to classify simple artificially generated data with near-perfect accuracy. We have also been able to differentiate between eye blink and jaw motion artifacts in EEG. Classification results for EEG recorded during two imagined mental tasks are currently in the 60% range for validation sets and the 90% range for training sets when labeling individual time steps. We remain confident that these results will improve with better regularization, preprocessing and a more intelligent decision process. Further classification studies are in progress and will be presented at the BCI conference.
Discussion and Conclusions
RNN's may provide a mechanism for effectively modeling and classifying EEG. This approach has the potential to improve upon current algorithms, since RNN's may be better able to exploit the temporal information contained in EEG. It may also offer improved computational performance during classification, since evaluating an RNN consists primarily of a series of matrix multiplications.
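To make the last point concrete, a single evaluation step of an Elman-style network (a generic sketch with hypothetical weight names, not necessarily the exact architecture used here) reduces to a handful of matrix-vector products and an element-wise tanh:

```python
import numpy as np

def elman_step(x, h_prev, W_in, W_rec, W_out):
    """One evaluation step of an Elman-style RNN.

    x      : current input vector (e.g., one EEG sample across channels)
    h_prev : hidden state carried over from the previous time step
    W_in, W_rec, W_out : input, recurrent and output weight matrices
    """
    h = np.tanh(W_in @ x + W_rec @ h_prev)  # hidden layer: two matrix products plus tanh
    y = W_out @ h                           # linear output layer
    return y, h
```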
Molecular, Cellular and Integrative Neurosciences 2010
Modeling and Classification of EEG using Recurrent Neural Networks
Abstract:
Recurrent Artificial Neural Networks (RNN) may prove to be powerful tools for the analysis of electroencephalogram (EEG). Specifically, we are interested in the ability of RNN's to model and ultimately classify EEG. Since RNN's possess an intrinsic state, we hypothesize that they will be able to incorporate the rich temporal information contained in EEG better than approaches that use a small number of embedded time steps.
In the experiments performed here, we use an Elman Recurrent Neural Network trained using Backpropagation Through Time with Scaled Conjugate Gradients. We have demonstrated that this approach is able to predict an EEG sequence a small number of steps ahead in time with a relatively high degree of accuracy, suggesting that RNN's are able to model EEG well. We have also shown that when an RNN is allowed to become an autonomous system by operating on its own predictions as inputs, the resulting outputs behave similarly to the EEG over which the network was trained.
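The closed-loop ("autonomous") mode of operation can be sketched as follows, assuming a hypothetical step(x, h) callable that performs one update of the trained one-step predictor: the network is first driven by recorded EEG to settle its hidden state, and its own predictions are then fed back as inputs.

```python
import numpy as np

def run_autonomous(step, h0, eeg_warmup, n_generate):
    """Run a trained RNN predictor as an autonomous system.

    step        : callable (x, h) -> (y, h_next); one update of the trained
                  network (hypothetical interface)
    h0          : initial hidden state
    eeg_warmup  : recorded EEG used to settle the hidden state
    n_generate  : number of samples to generate in closed loop
    """
    h, y = h0, None
    for x in eeg_warmup:            # drive the network with real EEG first
        y, h = step(x, h)
    generated = []
    for _ in range(n_generate):     # then feed its own prediction back as input
        y, h = step(y, h)
        generated.append(y)
    return np.array(generated)
```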
Classification is performed by training a separate RNN to model EEG recorded while a subject performs each of several imagined mental tasks. In this way, we have a set of RNN's that can each be viewed as an expert at predicting EEG similar to that over which it was trained. Classification of previously unencountered EEG is then performed by applying each RNN and assigning the class associated with the network that was able to model the sequence with the smallest error.
Preliminary results demonstrate that this approach is able to differentiate between two imagined mental tasks with a success rate in the 60% range for test sets and in the 90% range for training sets. We remain confident, however, that these results will improve with better regularization, preprocessing and a decision process that utilizes accumulated error measurements.
Colorado Celebration of Women in Computing 2010
A Comparison of Elman and Echo State Recurrent Neural Networks
Abstract:
Artificial Neural Networks (ANN) are powerful, trainable approximation structures. ANN's are composed of a number of simple computational units, or neurons, with weighted interconnections. Training is typically performed by adjusting the connection weights between these neurons in a way that minimizes the error produced with respect to a given mapping. Recurrent Artificial Neural Networks (RNN) are a special class of ANN that contain delayed feedback connections. These feedback connections give RNN's an intrinsic state and the ability to approximate mappings that require memory. There are numerous RNN architectures and corresponding optimization algorithms, all with different characteristics. Here, we compare two different types of RNN by analyzing their performance on several benchmark problems.
Elman's Simple Recurrent Networks (SRN) consist of two layers. In the first, or hidden, layer, the computation performed by the neurons is a hyperbolic tangent. All neurons in the hidden layer are densely connected to the network inputs and have feedback connections to all other neurons in the hidden layer with a delay of a single timestep. The computation performed by the neurons in the second, or visible, layer is a simple linear combination. Neurons in the visible layer are densely connected to the hidden layer and contain no feedback connections. SRN's are typically trained by unfolding the network a number of steps back through time and then removing all feedback connections. The error gradient of the original network can then be approximated using this acyclic network and a standard gradient descent algorithm can be applied to minimize training error. SRN's were originally proposed by Jeff Elman in 1990 and a theoretical result has since shown that SRN's can approximate any finite state machine with arbitrary precision, given enough hidden units and the proper weight values. However, real-world applications that utilize SRN's are relatively rare due to the difficulty of avoiding local optima during training.
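A minimal sketch of the gradient computation that results from unfolding the network through time, assuming a squared-error loss and hypothetical weight names; any gradient-based optimizer can then use the returned gradients.

```python
import numpy as np

def bptt_gradients(x_seq, targets, W_in, W_rec, W_out, h0):
    """Full-sequence Backpropagation Through Time for a minimal SRN.

    Forward pass:  h_t = tanh(W_in @ x_t + W_rec @ h_{t-1}),  y_t = W_out @ h_t
    Loss:          0.5 * sum_t ||y_t - target_t||^2
    Returns the gradients of the loss with respect to the three weight matrices.
    """
    T, hs, ys = len(x_seq), [h0], []
    for t in range(T):                          # forward pass, storing hidden states
        hs.append(np.tanh(W_in @ x_seq[t] + W_rec @ hs[-1]))
        ys.append(W_out @ hs[-1])
    dW_in, dW_rec, dW_out = map(np.zeros_like, (W_in, W_rec, W_out))
    dh_next = np.zeros_like(h0)
    for t in reversed(range(T)):                # backward pass through the unfolded net
        dy = ys[t] - targets[t]
        dW_out += np.outer(dy, hs[t + 1])
        dh = W_out.T @ dy + dh_next             # error from the output and from the future
        dz = dh * (1.0 - hs[t + 1] ** 2)        # derivative of tanh
        dW_in += np.outer(dz, x_seq[t])
        dW_rec += np.outer(dz, hs[t])
        dh_next = W_rec.T @ dz                  # propagate error one step further back
    return dW_in, dW_rec, dW_out
```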
Echo State Networks (ESN) have a similar layout to SRN's in that they have a single hidden layer with a hyperbolic tangent transfer function and a visible layer with a linear transfer function. The hidden units in an ESN also have feedback connections with a delay of a single timestep. Unlike SRN's, however, the hidden layer of an ESN is relatively large, sparsely connected and remains untrained. Instead of adjusting the weights of the hidden layer, the weights are chosen to have the Echo State Property. That is, the output of the hidden layer will asymptotically converge to the same state given the same input sequence, regardless of initial conditions. Training the visible layer then becomes a simple linear regression problem. ESN's were originally proposed by Herbert Jaeger in 2002 and are one of several types of approximation structures that depend on large untrained "reservoirs". ESN's have received significant attention in recent neural network literature.
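A minimal ESN sketch along the lines described above, with hypothetical sizes and scaling constants: the reservoir weights are generated randomly, sparsified, and rescaled to a spectral radius below one, and only the linear readout is fit.

```python
import numpy as np

def train_esn(inputs, targets, n_reservoir=200, spectral_radius=0.9,
              sparsity=0.1, seed=0):
    """Minimal Echo State Network: fixed random reservoir + least-squares readout.

    inputs, targets : arrays of shape (T, d_in) and (T, d_out)
    n_reservoir     : number of untrained reservoir units (hypothetical size)
    spectral_radius : recurrent weights are rescaled below 1.0 so the reservoir
                      forgets its initial condition (echo state property)
    sparsity        : fraction of nonzero recurrent connections
    """
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, (n_reservoir, inputs.shape[1]))
    W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
    W *= rng.random((n_reservoir, n_reservoir)) < sparsity    # sparse connectivity
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))     # rescale spectral radius

    # Run the fixed reservoir over the inputs and collect its states.
    states = np.zeros((len(inputs), n_reservoir))
    h = np.zeros(n_reservoir)
    for t, x in enumerate(inputs):
        h = np.tanh(W_in @ x + W @ h)
        states[t] = h

    # Training the visible (readout) layer is an ordinary linear regression.
    W_out, *_ = np.linalg.lstsq(states, targets, rcond=None)
    return W_in, W, W_out     # predict new outputs with: new_states @ W_out
```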
We compare SRN's and ESN's by applying both approaches to several benchmark problems. The first is known as the temporal XOR problem. In the temporal XOR problem, a sequence of bits is fed into the network, which is then trained to output the exclusive OR of the bits seen T and T-1 steps back in time. The memory requirement of the network can be increased in the temporal XOR problem by increasing the value of T. The temporal XOR problem is an important benchmark because it is non-linear and contains a temporal dependency. The second benchmark is a simple time-series forecasting problem in which each RNN is trained to forecast a sinusoidal function a number of steps ahead in time. The difficulty of this problem can also be increased by requiring the network to forecast the signal further ahead in time. The final benchmark is a more difficult time-series forecasting problem using a laser-generated dataset borrowed from the Santa Fe Time Series Competition.
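For reference, the temporal XOR data can be generated as follows (hypothetical sequence length and delay T):

```python
import numpy as np

def temporal_xor(n_steps=1000, T=3, seed=0):
    """Generate the temporal XOR benchmark.

    The target at time t is the XOR of the input bits seen T and T-1 steps
    earlier, so the network needs at least T steps of memory to solve the task.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, n_steps)
    y = np.zeros(n_steps, dtype=int)
    y[T:] = x[:n_steps - T] ^ x[1:n_steps - T + 1]   # bits T and T-1 steps back
    return x, y
```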
Preliminary results suggest that ESN's are able to achieve significantly longer-term memory in the temporal XOR problem and typically outperform SRN's in the forecasting problems. However, ESN's can exhibit numerical instability problems with some choices of initial weights. Additionally, it may be more difficult to control overfitting when using ESN's. Although further investigation involving a wider range of problems will undoubtedly be required, it appears that the recent enthusiasm surrounding ESN's may be well founded.