Assignment 2

Due date: December 7, 10am
NOTE: This is listed on Canvas as “Exercise 2”.

Which genes are expressed in a specific tissue type? In this assignment you will analyze an RNA-seq project that aims to identify genes enriched in the intestines of C. elegans worms relative to the whole animal.


  • Navigate to your home directory and download the dataset by copying the following folder…
$ cd 
$ cp -R /home/erin/03_htseq_homework/ .

Align and tabulate the sequencing files for this dataset. To do this, you may need to…

  • Create a metadata file.
  • Hack the script by amending its “MODIFY THIS SECTION” block. Use the C. elegans bowtie2 indices and .gtf file from Assignment 1.
  • Run the script.

To illustrate your success, please turn in the following two items on a standard, letter-sized .pdf document…

  • A table showing the number of mapped reads and the overall read mapping rate (%) for each sample. Include a table legend.
  • A plot depicting the raw read counts tabulated for each sample for the intestine-specific transcript ges-1 that has the identifier R12A1.4 and for the putative “housekeeping gene” tba-1 that has the identifier F26E4.8. Include a figure legend.
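
The numbers for both items can be pulled out of the pipeline output with a few lines of scripting. The following is a minimal Python sketch, not part of the course materials. It assumes single-end reads, that bowtie2's alignment summary (normally written to stderr) was saved as text, and that the tabulated files are two-column htseq-count tables. The example summary and counts are made up, and the function names are hypothetical.

```python
import re

# Made-up single-end bowtie2 summary, in the layout bowtie2 writes to stderr.
EXAMPLE_SUMMARY = """\
10000 reads; of these:
  10000 (100.00%) were unpaired; of these:
    596 (5.96%) aligned 0 times
    9404 (94.04%) aligned exactly 1 time
    0 (0.00%) aligned >1 times
94.04% overall alignment rate
"""

# Made-up fragment of an htseq-count table: gene ID, tab, raw count.
EXAMPLE_COUNTS = "F26E4.8\t120\nR12A1.4\t45\n__no_feature\t300\n"

# Gene identifiers named in the assignment, mapped to their gene names.
GENES_OF_INTEREST = {"R12A1.4": "ges-1", "F26E4.8": "tba-1"}


def parse_bowtie2_summary(text):
    """Return (total_reads, mapped_reads, overall_rate_percent).

    Assumes a single-end summary, where mapped reads are simply the
    total minus the reads that aligned 0 times.
    """
    total = int(re.search(r"^(\d+) reads; of these:", text, re.M).group(1))
    unaligned = int(
        re.search(r"^\s*(\d+) \([\d.]+%\) aligned 0 times", text, re.M).group(1)
    )
    rate = float(re.search(r"([\d.]+)% overall alignment rate", text).group(1))
    return total, total - unaligned, rate


def gene_counts(htseq_text, gene_ids):
    """Return {gene_id: raw_count} for the requested IDs in one count table."""
    counts = {}
    for line in htseq_text.splitlines():
        # Skip blank lines and htseq-count's summary rows (e.g. __no_feature).
        if not line.strip() or line.startswith("__"):
            continue
        gene, value = line.split("\t")[:2]
        if gene in gene_ids:
            counts[gene] = int(value)
    return counts


if __name__ == "__main__":
    total, mapped, rate = parse_bowtie2_summary(EXAMPLE_SUMMARY)
    print(f"total={total} mapped={mapped} rate={rate}%")
    for gene, count in sorted(gene_counts(EXAMPLE_COUNTS, GENES_OF_INTEREST).items()):
        print(f"{GENES_OF_INTEREST[gene]} ({gene}): {count}")
```

Running the two functions once per sample gives one table row and one pair of plot values per sample. Paired-end summaries have a different layout and would need a different parser.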


In this exercise, you will use DESeq2 to identify differentially expressed genes.

  • The data from Exercise 1 was heavily downsampled. For the next section, we'll need the full dataset. Download the complete HTSeq _counts.txt files by copying /home/erin/05_deseq_homework to your own home directory.
  • Write an R script to analyze the data from these _counts.txt files. There is a template in the file you can hack.
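
As background for the analysis: DESeq2 does not compare raw counts directly. By default it first rescales each sample by a size factor estimated with the median-of-ratios method. The Python sketch below only illustrates that idea on made-up counts. It is not the template R script and does not replace DESeq2. The sample names, gene names, and the `size_factors` function are hypothetical.

```python
import math
from statistics import median

# Made-up raw counts: sample name -> {gene ID -> count}.
EXAMPLE = {
    "whole_worm": {"geneA": 10, "geneB": 20, "geneC": 30, "geneD": 0},
    "intestine": {"geneA": 20, "geneB": 40, "geneC": 60, "geneD": 5},
}


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    For each gene, take the geometric mean of its counts across samples.
    A sample's size factor is the median, over genes, of that sample's
    count divided by the gene's geometric mean. Genes with a zero count
    in any sample are skipped because their geometric mean is zero.
    """
    samples = list(counts)
    shared = set.intersection(*(set(counts[s]) for s in samples))
    usable = [g for g in sorted(shared) if all(counts[s][g] > 0 for s in samples)]
    if not usable:
        raise ValueError("no gene has nonzero counts in every sample")

    # Work in log space: the mean of logs is the log of the geometric mean.
    log_geo_mean = {
        g: sum(math.log(counts[s][g]) for s in samples) / len(samples)
        for g in usable
    }
    return {
        s: math.exp(median(math.log(counts[s][g]) - log_geo_mean[g] for g in usable))
        for s in samples
    }


if __name__ == "__main__":
    for sample, factor in size_factors(EXAMPLE).items():
        print(f"{sample}: {factor:.4f}")
```

In this toy input every usable gene has exactly twice as many reads in the second sample, so its size factor comes out twice as large as the first sample's, and dividing by the size factors makes the two samples comparable.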

To illustrate your work…

  • Import one plot generated from R into PowerPoint (ok), Illustrator (better), or whatever comparable illustration software you plan to use for publishing your own figures. Turn the pdf generated from R into a publication-quality figure with a legend. This may require modifying things like the axis text size or font. You may need to change the axis text itself (you can cover over the old text if you're using PowerPoint) or overlay more text for clarity. You may need to add units to the legends. Be sure to include a formal figure legend with a figure title, a description of the figure, and a brief account of how it was generated. Save your figure and legend together on the same page and upload it to Canvas as a standard letter-sized .pdf file.
  • Bonus thought experiment… Do you have concerns about this dataset? Is it violating any of the assumptions we talked about in the RNA-seq Design slides? How could it be improved? You don't need to turn in anything on this thought experiment, but do think about it.


Find a paper in your field that has performed RNA-seq. Read their methods section to determine how they performed and analyzed their RNA-seq experiment.

  • Provide a link to the paper you read.
  • List the software they used for analysis or state where they wrote custom scripts or algorithms.
  • Critique whether you think it would be relatively easy or hard to reproduce the analysis performed in the paper. Do the authors list software versions, the genome version, or the annotation file (.gff) version? Do the authors reference software used or provide a link to the software? Do the authors make their custom software or scripts available?
assignments/2017assignment2.txt · Last modified: 2017/12/01 10:25 by erin