Projects

Projects this semester will focus on tools for spliced alignment of short-read data. Each student will choose one of the following programs to study:

During the course of the project you will:

  • Present the method in class.
  • Apply the program to simulated short-read data that we will provide.
  • Write a report that describes your experience with the program, and present your findings to the class. Both the report and the presentation are due during the last week of classes.

Data

We ask you to apply your chosen program to the following two datasets:

  • Simulated short-read data from Arabidopsis thaliana. Here's a readme.
  • Short-read data generated by our collaborator from the biology department. The data is available from the NCBI short-read archive as GEO accession GSE32318. Note that this dataset consists of two replicates, which you need to align separately. The download link is at the bottom of the page, labeled as the supplementary file download.

New data

To align the datasets you will need the sequence of the Arabidopsis genome, which you can download from the TAIR website. You will need the sequences for chromosomes 1-5.
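
Some aligners expect the reference as a single FASTA file rather than one file per chromosome. If yours does, the following Python sketch shows one way to combine the five chromosome files and check the record names and lengths along the way. The file names chr1.fas through chr5.fas and genome.fa are placeholders, not the actual TAIR file names, and combine_fasta is a helper written for this illustration; check your program's documentation for the reference format it actually needs.

```python
from pathlib import Path


def combine_fasta(chromosome_paths, output_path):
    """Concatenate FASTA files into one file; return {record name: length}."""
    lengths = {}
    with open(output_path, "w") as out:
        for path in chromosome_paths:
            name = None  # no header seen yet in this file
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        # Header line: record name is the first word after '>'.
                        fields = line[1:].split()
                        name = fields[0] if fields else ""
                        if name in lengths:
                            raise ValueError(f"duplicate record name: {name!r}")
                        lengths[name] = 0
                    elif name is None:
                        raise ValueError(f"{path}: sequence before any header")
                    else:
                        lengths[name] += len(line)
                    out.write(line + "\n")
    return lengths


if __name__ == "__main__":
    # Placeholder file names (an assumption); substitute the names of the
    # files you downloaded for chromosomes 1-5.
    paths = [Path(f"chr{i}.fas") for i in range(1, 6)]
    if all(p.exists() for p in paths):
        for record, length in combine_fasta(paths, "genome.fa").items():
            print(record, length)
```

Comparing the printed record names with the chromosome names used in the alignment output is a quick way to catch a mismatched or incomplete reference before a long alignment run.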

Presentation schedule:

  • Tuesday 10/11 Jeremy
  • Thursday 10/13 Fayyaz and Mo
  • Tuesday 10/18 Nathan and Arpita
  • Thursday 10/20 Zhisheng and Indika