.. _getting_started:

*****************************
Getting Started with EEG Data
*****************************

In 2011-2012, the brain-computer interface (BCI) research group at Colorado State University recorded EEG signals from subjects in our lab and in their homes, using three different EEG systems. One goal of this work is to determine whether inexpensive EEG systems (about $7,000) are as effective as more expensive ones (about $40,000) for conducting BCI experiments in the home.

On this page, we summarize the steps you can follow to download some of the data, load it into an ``ipython`` environment, and visualize it. We also show examples of looking at P300 ERPs.

Downloading EEG Data
====================

EEG data from multiple subjects can be downloaded from our `Public BCI Data `_ site. Let's select the data files for the first subject in each device column, for subjects recorded in our lab.

.. figure:: _images/eegDownload.png
   :width: 7.5in

The zip file should contain six zipped data files.

.. code-block:: bash

   > cd ~/Download
   > unzip eeg.zip
   Archive: eeg.zip
   extracting: s20-activetwo-gifford-unimpaired.json.zip
   extracting: s21-activetwo-gifford-unimpaired.json.zip
   extracting: s20-gammasys-gifford-unimpaired.json.zip
   extracting: s21-gammasys-gifford-unimpaired.json.zip
   extracting: s20-mindset-gifford-unimpaired.json.zip
   extracting: s21-mindset-gifford-unimpaired.json.zip
   > rm eeg.zip
   > ls -l --block-size=M *json*
   -rw-r--r-- 1 ... 84M Mar 12 10:50 s20-activetwo-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 5M Mar 12 10:50 s20-gammasys-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 29M Mar 12 10:50 s20-mindset-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 80M Mar 12 10:51 s21-activetwo-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 5M Mar 12 10:51 s21-gammasys-gifford-unimpaired.json.zip
   -rw-r--r-- 1 ... 28M Mar 12 10:52 s21-mindset-gifford-unimpaired.json.zip
   > unzip s20-gammasys-gifford-unimpaired.json.zip
   Archive: s20-gammasys-gifford-unimpaired.json.zip
   inflating: s20-gammasys-gifford-unimpaired.json
   > unzip s20-mindset-gifford-unimpaired.json.zip
   Archive: s20-mindset-gifford-unimpaired.json.zip
   inflating: s20-mindset-gifford-unimpaired.json
   > unzip s20-activetwo-gifford-unimpaired.json.zip
   Archive: s20-activetwo-gifford-unimpaired.json.zip
   inflating: s20-activetwo-gifford-unimpaired.json
   > rm s20*zip

Loading g.GAMMAsys EEG Data into IPython
========================================

Let's start with the smallest file, the one recorded with the `g.tec g.GAMMAsys system `_. It was unzipped in the previous step, so it can be loaded directly into an ``ipython`` environment.

.. ipython::
   :suppress:

   In [0]: import os

   In [3]: if not os.path.isfile('s20-gammasys-gifford-unimpaired.json'):
      ...:     !wget http://www.cs.colostate.edu/eeg/data/json/zips/s20-gammasys-gifford-unimpaired.json.zip
      ...:     !unzip s20-gamma*zip
      ...:

   In [3]: if not os.path.isfile('s20-mindset-gifford-unimpaired.json'):
      ...:     !wget http://www.cs.colostate.edu/eeg/data/json/zips/s20-mindset-gifford-unimpaired.json.zip
      ...:     !unzip s20-mindset*zip
      ...:

   In [3]: if not os.path.isfile('s20-activetwo-gifford-unimpaired.json'):
      ...:     !wget http://www.cs.colostate.edu/eeg/data/json/zips/s20-activetwo-gifford-unimpaired.json.zip
      ...:     !unzip s20-activetwo*zip
      ...:

.. ipython::

   In [10]: import json

   In [11]: data = json.load(open('s20-gammasys-gifford-unimpaired.json','r'))

The variable ``data`` is a list of dictionaries, each with the same keys.

.. ipython::

   In [14]: len(data)

   In [19]: data[0].keys()

Here is a handy function to show keys and their values in each data element.

.. literalinclude:: summarize.py

.. ipython::
   :suppress:

   In [1]: run summarize.py

.. ipython::

   In [65]: summarize(data)

Plotting some EEG
=================

The first element of the data list has the key-value pair ``protocol: 3minutes``, meaning that this element contains 3 minutes of EEG recorded while the subject was asked to relax and look at the computer screen. Let's take a look at 2 seconds of this data.

The EEG consists of one matrix with 9 rows and 46,352 columns. The 9 rows correspond to the channels ``channels: ['F3', 'F4', 'C3', 'C4', 'P3', 'P4', 'O1', 'O2']`` plus one more channel that marks stimulus onset and offset, which is not used for the 3-minute protocol. The number of samples (columns) in one second depends on the sample rate, which for this device, ``device: GAMMAsys``, is 256 samples per second, ``sample rate: 256``. Two seconds of data is therefore 512 columns. Let's plot data from all 9 channels for columns 4,000 through 4,511.

.. ipython::

   In [76]: import numpy as np

   In [77]: import matplotlib.pyplot as plt

   In [78]: first = data[0]

   In [79]: eeg = np.array(first['eeg']['trial 1'])

   In [80]: eeg.shape
   Out[80]: (9, 46352)

   # Using ending semicolon to suppress output of plotting functions.
   In [81]: plt.figure(1);

   In [82]: plt.plot(eeg[:,4000:4512].T);

   @savefig eegplot1.png width=6in align=center
   In [83]: plt.axis('tight');

It is hard to see each channel in this plot. Let's spread them out vertically and leave out the constant, unused 9th channel. We can also add a legend with the channel names. If we reverse the vertical order of the channel plots, they will match the vertical order of the channel names in the legend.

.. ipython::

   In [90]: plt.figure(2);

   In [91]: plt.plot(eeg[:8,4000:4512].T + 80*np.arange(7,-1,-1));

   In [91]: plt.plot(np.zeros((512,8)) + 80*np.arange(7,-1,-1),'--',color='gray');

   In [91]: plt.yticks([]);

   In [92]: plt.legend(first['channels']);

   @savefig eegplot2.png width=6in align=center
   In [93]: plt.axis('tight');

Again, for EEG from ActiveTwo and Mindset Systems
=================================================

Now let's summarize the data from the other two systems. First, rename ``data`` to ``dataGammasys``.

.. ipython::

   In [1]: dataGammasys = data

   In [11]: dataActivetwo = json.load(open('s20-activetwo-gifford-unimpaired.json','r'))

   In [11]: dataMindset = json.load(open('s20-mindset-gifford-unimpaired.json','r'))

   In [65]: summarize(dataMindset[0:2])

This shows that the Mindset has 19 channels of EEG, but the EEG matrix has 24 rows. The first 19 rows are the EEG channels. Let's plot them.

.. ipython::

   In [90]: eegMindset = np.array(dataMindset[0]['eeg']['trial 1'])

   In [90]: plt.figure();

   In [91]: plt.plot(eegMindset[:19,4000:4512].T + 30*np.arange(18,-1,-1));

   In [91]: plt.plot(np.zeros((512,19)) + 30*np.arange(18,-1,-1),'--',color='gray');

   In [91]: plt.yticks([]);

   In [92]: plt.legend(dataMindset[0]['channels'], prop={'size':10});

   @savefig eegplot3.png width=6in align=center
   In [93]: plt.axis('tight');

Now for the data from the ActiveTwo system. First, let's see which element in the list is for the ``3minutes`` protocol.

.. ipython::

   In [65]: summarize(dataActivetwo[0:2])

   In [65]: eegActivetwo = np.array(dataActivetwo[1]['eeg']['trial 1'])

   In [65]: eegActivetwo.shape

This data matrix contains 41 rows. The list of channels contains the 41 corresponding names.

.. ipython::

   In [65]: dataActivetwo[1]['channels']

The channels named ``EXG1`` through ``EXG6`` contain non-EEG data as follows:

=======  =====  ====================
Channel  Index  Electrode
=======  =====  ====================
EXG1     32     EOG vertical left
EXG2     33     EOG vertical right
EXG3     34     EOG horizontal left
EXG4     35     EOG horizontal right
EXG5     36     earlobe left
EXG6     37     earlobe right
=======  =====  ====================

Typically, the EEG channels (indices 0 through 31) are referenced to the mean of the two earlobe channels, after removing the linear trend from every channel. That's easy.

.. ipython::

   In [91]: import scipy.signal as sig

   In [90]: eegActivetwo = sig.detrend(eegActivetwo,1)

   In [90]: ref = np.mean(eegActivetwo[36:38,:],axis=0).reshape((1,-1))

   In [90]: eeg = eegActivetwo[:32,:] - ref

Now we can plot all 32 EEG channels.

.. ipython::

   In [91]: plt.figure();

   In [91]: plt.plot(eeg[:,4000:4512].T + 150*np.arange(31,-1,-1));

   In [91]: plt.plot(np.zeros((512,32)) + 150*np.arange(31,-1,-1),'--',color='gray');

   In [91]: plt.yticks([]);

   In [92]: plt.legend(dataActivetwo[1]['channels'][:32], prop={'size':8});

   @savefig eegplot4.png width=7in align=center
   In [93]: plt.axis('tight');
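
The detrending, earlobe referencing, and vertical-offset steps used above can be collected into two small helper functions. The following is only a minimal sketch, not part of the original analysis: the names ``rereference`` and ``stack_offsets`` are hypothetical, and the example runs on synthetic data shaped like the ActiveTwo matrix (41 rows), so it does not need the downloaded files.

.. code-block:: python

   import numpy as np
   import scipy.signal as sig


   def rereference(eeg, eeg_rows, ref_rows):
       """Detrend every row, then subtract the mean of the reference rows.

       eeg      : 2-D array with channels in rows and samples in columns
       eeg_rows : slice selecting the EEG channels to keep
       ref_rows : slice selecting the reference channels (e.g. earlobes)
       """
       detrended = sig.detrend(eeg, axis=1)  # remove linear trend per channel
       ref = detrended[ref_rows, :].mean(axis=0, keepdims=True)
       return detrended[eeg_rows, :] - ref


   def stack_offsets(n_channels, spacing):
       """Vertical offsets that place channel 0 at the top of the plot."""
       return spacing * np.arange(n_channels - 1, -1, -1)


   # Synthetic stand-in for a recording: 41 channels, 2,000 samples,
   # with an artificial linear drift added to every channel.
   rng = np.random.default_rng(0)
   fake = rng.normal(size=(41, 2000)) + np.linspace(0, 50, 2000)

   # Rows 0-31 are EEG; rows 36 and 37 are the earlobe channels.
   eeg = rereference(fake, slice(0, 32), slice(36, 38))
   offsets = stack_offsets(32, 150)

   # The result can be plotted as before, for example:
   # plt.plot(eeg[:, 1000:1512].T + offsets)

With the real data, ``fake`` would be replaced by the array built from ``dataActivetwo[1]['eeg']['trial 1']``. Keeping the offsets in one function helps the data traces and the dashed baselines stay aligned, since both use the same values.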