.. _getting_started:


*****************************
Getting Started with EEG Data
*****************************

In 2011-2012, the brain-computer interface (BCI) research group at Colorado
State University recorded EEG signals from subjects in our lab and in
their homes, using three different EEG systems.  One goal of this work
is to determine if inexpensive EEG systems (about $7,000) are as
effective as more expensive ones (about $40,000) for conducting BCI
experiments in the home.

On this page, we summarize the steps you can follow to download some
of the data, load it into an ``ipython`` environment, and visualize
it.  We also show examples of looking at P300 ERPs.

Downloading EEG Data
====================

EEG data from multiple subjects can be downloaded from our `Public BCI
Data <http://www.cs.colostate.edu/eeg/main/data>`_ site.
Let's select the data files for the first subject in each device
column, for subjects recorded in our lab.

.. figure:: _images/eegDownload.png
   :width: 7.5in

The zip file should contain six zipped data files.

.. code-block:: bash

   > cd ~/Downloads

   > unzip eeg.zip
   Archive:  eeg.zip
    extracting: s20-activetwo-gifford-unimpaired.json.zip  
    extracting: s21-activetwo-gifford-unimpaired.json.zip  
    extracting: s20-gammasys-gifford-unimpaired.json.zip  
    extracting: s21-gammasys-gifford-unimpaired.json.zip  
    extracting: s20-mindset-gifford-unimpaired.json.zip  
    extracting: s21-mindset-gifford-unimpaired.json.zip  

   > rm eeg.zip

   > ls -l --block-size=M *json*
   -rw-r--r-- 1 ... 84M Mar 12 10:50 s20-activetwo-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ...  5M Mar 12 10:50 s20-gammasys-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 29M Mar 12 10:50 s20-mindset-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 80M Mar 12 10:51 s21-activetwo-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ...  5M Mar 12 10:51 s21-gammasys-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 28M Mar 12 10:52 s21-mindset-gifford-unimpaired.json.zip

   > unzip s20-gammasys-gifford-unimpaired.json.zip
   Archive:  s20-gammasys-gifford-unimpaired.json.zip
     inflating: s20-gammasys-gifford-unimpaired.json  

   > unzip s20-mindset-gifford-unimpaired.json.zip 
   Archive:  s20-mindset-gifford-unimpaired.json.zip
     inflating: s20-mindset-gifford-unimpaired.json  

   > unzip s20-activetwo-gifford-unimpaired.json.zip 
   Archive:  s20-activetwo-gifford-unimpaired.json.zip
     inflating: s20-activetwo-gifford-unimpaired.json  

   > rm s20*zip
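
The same extraction can be done without leaving Python.  The following
is a minimal sketch using only the standard library ``zipfile`` module.
The function name ``extract_nested`` and its arguments are hypothetical
and are not part of the data site or of any code used on this page.

.. code-block:: python

   import zipfile
   from pathlib import Path


   def extract_nested(outer_zip, dest="."):
       """Extract outer_zip into dest, then extract every .zip found inside it.

       Returns the sorted names of the files produced by the inner archives.
       """
       dest = Path(dest)
       with zipfile.ZipFile(outer_zip) as outer:
           outer.extractall(dest)
           inner_names = [n for n in outer.namelist() if n.endswith(".zip")]

       extracted = []
       for name in inner_names:
           # Each inner archive holds one recording as a .json file.
           with zipfile.ZipFile(dest / name) as inner:
               inner.extractall(dest)
               extracted.extend(inner.namelist())
       return sorted(extracted)

Unlike the shell session above, this sketch unpacks every inner
archive, not only the ``s20`` files, and it does not delete any zip
files afterwards.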

Loading g.GAMMAsys EEG Data into IPython
========================================

Let's start with the smallest file, the one recorded with the `g.tec
g.GAMMAsys system
<http://www.gtec.at/Products/Electrodes-and-Sensors/g.GAMMAsys-Specs-Features>`_.
It was already unzipped in the previous section.

The unzipped data can be loaded into an ``ipython`` environment.

.. ipython::
   :suppress:

   In [0]: import os

   In [3]: if not os.path.isfile('s20-gammasys-gifford-unimpaired.json'):
      ...:    !wget http://www.cs.colostate.edu/eeg/data/json/zips/s20-gammasys-gifford-unimpaired.json.zip
      ...:    !unzip s20-gamma*zip
      ...:     

   In [3]: if not os.path.isfile('s20-mindset-gifford-unimpaired.json'):
      ...:    !wget http://www.cs.colostate.edu/eeg/data/json/zips/s20-mindset-gifford-unimpaired.json.zip
      ...:    !unzip s20-mindset*zip
      ...:     

   In [3]: if not os.path.isfile('s20-activetwo-gifford-unimpaired.json'):
      ...:    !wget http://www.cs.colostate.edu/eeg/data/json/zips/s20-activetwo-gifford-unimpaired.json.zip
      ...:    !unzip s20-activetwo*zip
      ...:     

.. ipython::

   In [10]: import json

   In [11]: data = json.load(open('s20-gammasys-gifford-unimpaired.json','r'))

The variable ``data`` is a list of dictionaries, each with the same
keys.

.. ipython::

   In [14]: len(data)

   In [19]: data[0].keys()

Here is a handy function to show keys and their values in each data
element.

.. literalinclude:: summarize.py

.. ipython::
   :suppress:

   In [1]: run summarize.py

.. ipython::

   In [65]: summarize(data)
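
Reading the summary by eye works for a handful of records.  To pick
out records programmatically, a small helper can filter on one of the
keys.  The following is only a sketch: the names ``load_records`` and
``indices_for_protocol`` are hypothetical, and it assumes each record
is a dictionary with a ``protocol`` key, such as the ``3minutes``
value discussed in the next section.

.. code-block:: python

   import json


   def load_records(path):
       """Load a list of recording dictionaries from a JSON file."""
       with open(path, "r") as f:  # the file is closed when the block exits
           return json.load(f)


   def indices_for_protocol(records, protocol):
       """Return the list indices of all records whose 'protocol' matches."""
       return [i for i, rec in enumerate(records)
               if rec.get("protocol") == protocol]

For example, ``indices_for_protocol(data, '3minutes')`` would return
the positions in ``data`` of the records for that protocol, or an empty
list if there are none.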

   
Plotting some EEG
=================

The first element of the data list has key-value pair ``protocol:
3minutes``, meaning that this element contains 3 minutes of EEG
recorded while the subject was asked to relax and look at the computer
screen.  Let's take a look at 2 seconds of this data.  

The EEG consists of one matrix with 9 rows and 46,352 columns.  The 9
rows correspond to the channels ``channels:  ['F3', 'F4', 'C3', 'C4',
'P3', 'P4', 'O1', 'O2']`` plus one more channel that is used to mark
stimulus onset and offset, which is not used for the 3-minute
protocol.  The number of samples (in columns) in one second depends on
the sample rate, which for this device, ``device: GAMMAsys``, is 256
samples per second, ``sample rate: 256``.  Two seconds of data is
therefore 512 columns.  Let's plot data from all 9 channels for the
512 columns starting at column 4,000.

.. ipython::

   In [76]: import numpy as np

   In [77]: import matplotlib.pyplot as plt

   In [78]: first = data[0]

   In [79]: eeg = np.array(first['eeg']['trial 1'])

   In [80]: eeg.shape
   Out[80]: (9, 46352)

   # Using ending semicolon to suppress output of plotting functions.
   In [81]: plt.figure(1);

   In [82]: plt.plot(eeg[:,4000:4512].T);

   @savefig eegplot1.png width=6in align=center
   In [83]: plt.axis('tight');

Kind of hard to see each channel.  Let's spread them out and not plot
the constant, unused, 9th channel.  Also, we can add a legend with the
channel names.  If we reverse the vertical order of the channel plots,
they will correspond with the vertical order of the channel names.

.. ipython::

   In [90]: plt.figure(2);

   In [91]: plt.plot(eeg[:8,4000:4512].T + 80*np.arange(7,-1,-1));

   In [91]: plt.plot(np.zeros((512,8)) + 80*np.arange(7,-1,-1),'--',color='gray');

   In [91]: plt.yticks([]);

   In [92]: plt.legend(first['channels']);

   @savefig eegplot2.png width=6in align=center
   In [93]: plt.axis('tight');
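
The column range ``4000:4512`` used in these plots is 2 seconds of
data at 256 samples per second.  The arithmetic can be sketched as a
small helper.  The name ``seconds_to_columns`` is hypothetical, and the
sketch assumes the sample rate is given in samples per second, as in
``sample rate: 256``.

.. code-block:: python

   def seconds_to_columns(start_s, duration_s, sample_rate):
       """Return (first, last_exclusive) column indices for a time window."""
       first = int(round(start_s * sample_rate))
       last = first + int(round(duration_s * sample_rate))
       return first, last


   # Column 4000 is 4000 / 256 = 15.625 seconds into the recording.
   first, last = seconds_to_columns(4000 / 256, 2, 256)
   print(first, last)  # 4000 4512

The pair can then be used directly as a slice, for example
``eeg[:, first:last]``.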

Again, for EEG from ActiveTwo and Mindset Systems
=================================================

Now let's summarize the data from the other two systems.  First,
rename ``data`` to ``dataGammasys``.

.. ipython::

   In [1]: dataGammasys = data

   In [11]: dataActivetwo = json.load(open('s20-activetwo-gifford-unimpaired.json','r'))

   In [11]: dataMindset = json.load(open('s20-mindset-gifford-unimpaired.json','r'))

   In [65]: summarize(dataMindset[0:2])

This shows that the Mindset has 19 channels of EEG, but the EEG matrix
has 24 rows.  The first 19 rows are the EEG channels. Let's plot them.

.. ipython::

   In [90]: eegMindset = np.array(dataMindset[0]['eeg']['trial 1'])

   In [90]: plt.figure();

   In [91]: plt.plot(eegMindset[:19,4000:4512].T + 30*np.arange(18,-1,-1));

   In [91]: plt.plot(np.zeros((512,19)) + 30*np.arange(18,-1,-1),'--',color='gray');

   In [91]: plt.yticks([]);

   In [92]: plt.legend(dataMindset[0]['channels'], prop={'size':10});

   @savefig eegplot3.png width=6in align=center
   In [93]: plt.axis('tight');
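
Both stacked plots so far use the same pattern: add a decreasing
offset to each channel so that the first channel is drawn at the top.
A sketch that factors this into a helper is shown here.  The name
``stack_channels`` and its ``spacing`` parameter are hypothetical; the
helper uses only NumPy and leaves the plotting to the caller.

.. code-block:: python

   import numpy as np


   def stack_channels(eeg, spacing):
       """Offset each row of eeg so channels can be drawn without overlapping.

       eeg has shape (n_channels, n_samples).  Returns (stacked, offsets),
       where stacked has shape (n_samples, n_channels), ready for plt.plot,
       and offsets[i] is the baseline of channel i.  Channel 0 gets the
       largest offset, so it appears at the top of the plot.
       """
       eeg = np.asarray(eeg, dtype=float)
       n_channels = eeg.shape[0]
       offsets = spacing * np.arange(n_channels - 1, -1, -1)
       return eeg.T + offsets, offsets

With this helper, the Mindset traces above could be computed as
``stacked, offsets = stack_channels(eegMindset[:19, 4000:4512], 30)``
and drawn with ``plt.plot(stacked)``, with the baselines given by
``plt.plot(np.zeros_like(stacked) + offsets, '--', color='gray')``.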

Now for the data from the ActiveTwo system.  First, let's see which
element in the list is for the ``3minutes`` protocol.

.. ipython::

   In [65]: summarize(dataActivetwo[0:2])

   In [65]: eegActivetwo = np.array(dataActivetwo[1]['eeg']['trial 1'])

   In [65]: eegActivetwo.shape

This data matrix contains 41 rows.  The list of channels gives their 41 names:

.. ipython::

   In [65]: dataActivetwo[1]['channels']

The channels named ``EXG1`` through ``EXG6`` contain non-EEG data as
follows:

=======  ===== =========
Channel  Index Electrode
=======  ===== =========
EXG1     32    EOG vertical left
EXG2     33    EOG vertical right
EXG3     34    EOG horizontal left
EXG4     35    EOG horizontal right
EXG5     36    earlobe left
EXG6     37    earlobe right
=======  ===== =========

Typically, the EEG channels (indices 0 through 31) are referenced to
the mean of the two earlobe channels (indices 36 and 37), after
removing the linear trend from every channel.  That's easy.

.. ipython::

   In [91]: import scipy.signal as sig   

   In [90]: eegActivetwo = sig.detrend(eegActivetwo, axis=1)

   In [90]: ref = np.mean(eegActivetwo[36:38,:],axis=0).reshape((1,-1))

   In [90]: eeg = eegActivetwo[:32,:] - ref
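
The referencing step subtracts, at every sample, the mean of the two
earlobe channels from each EEG channel.  A self-contained sketch with
synthetic data illustrates the effect.  The function ``rereference``
and its parameters are hypothetical, the row indices follow the table
above, and the sketch skips the detrending step.

.. code-block:: python

   import numpy as np


   def rereference(data, eeg_rows, ref_rows):
       """Subtract the mean of the reference rows from each EEG row.

       data has shape (n_channels, n_samples).
       """
       data = np.asarray(data, dtype=float)
       ref = data[ref_rows, :].mean(axis=0, keepdims=True)  # (1, n_samples)
       return data[eeg_rows, :] - ref


   # Synthetic check: 38 rows, with a drift shared by the EEG and earlobe rows.
   rng = np.random.default_rng(0)
   signal = rng.normal(size=(32, 1000))
   drift = np.linspace(0.0, 5.0, 1000)
   data = np.zeros((38, 1000))
   data[:32] = signal + drift   # EEG rows pick up the shared drift
   data[36:38] = drift          # earlobe rows record only the drift

   eeg = rereference(data, slice(0, 32), slice(36, 38))
   print(np.allclose(eeg, signal))  # True

Anything common to the EEG electrodes and the earlobes cancels, which
is why the choice of reference matters when comparing devices.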

Now we can plot all 32 EEG channels.

.. ipython::

   In [91]: plt.figure();

   In [91]: plt.plot(eeg[:,4000:4512].T + 150*np.arange(31,-1,-1));

   In [91]: plt.plot(np.zeros((512,32)) + 150*np.arange(31,-1,-1),'--',color='gray');

   In [91]: plt.yticks([]);

   In [92]: plt.legend(dataActivetwo[1]['channels'][:32], prop={'size':8});

   @savefig eegplot4.png width=7in align=center
   In [93]: plt.axis('tight');