Main.Projects History

Changed lines 2-3 from:

548 projects this semester will focus on tools for spliced alignment of short read data.

to:

Projects this semester will focus on tools for spliced alignment of short read data.

Changed line 3 from:

580 projects this semester will focus on tools for spliced alignment of short read data.

to:

548 projects this semester will focus on tools for spliced alignment of short read data.

Changed line 45 from:
  • We have more simulated data - 14 million paired-end reads. There are two files: set1 and set2. Start by aligning them individually, then see whether using them as paired-end data improves accuracy. Here's a file that provides the CIGAR strings for the reads.
to:
  • We have more simulated data - 14 million paired-end reads. There are two files: set1 and set2. Start by aligning them individually, then see whether using them as paired-end data improves accuracy. Here's a file that provides the CIGAR strings for the reads.
Changed lines 46-48 from:
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ annotated junctions ].
to:
Changed line 46 from:
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ annotated junctions coming soon ].
to:
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ annotated junctions ].
Changed line 46 from:
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ coming soon]
to:
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ annotated junctions coming soon ].
Changed line 46 from:
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ gene model junctions ]
to:
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ coming soon]
Changed lines 43-48 from:

To align both datasets you will need the sequence of the Arabidopsis genome, which you can download from the TAIR website. You will need the sequences for chromosomes 1-5.

to:

New data

  • We have more simulated data - 14 million paired-end reads. There are two files: set1 and set2. Start by aligning them individually, then see whether using them as paired-end data improves accuracy. Here's a file that provides the CIGAR strings for the reads.
  • For testing purposes, here are two files that provide a list of splice junctions generated from EST alignments and from curated gene models. [ EST junctions ], [ gene model junctions ]

To align the datasets you will need the sequence of the Arabidopsis genome, which you can download from the TAIR website. You will need the sequences for chromosomes 1-5.
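
As a rough illustration of how CIGAR strings and a list of known splice junctions could be combined to score an aligner, here is a minimal Python sketch. It assumes SAM-style CIGAR strings, 1-based leftmost alignment positions, and introns encoded as N operations. The function names and the (chromosome, intron start, intron end) tuple layout are hypothetical and are not taken from the files provided for the course.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance the position on the reference sequence.
REFERENCE_CONSUMING = set("MDN=X")


def junctions_from_cigar(chrom, start, cigar):
    """Return (chrom, intron_start, intron_end) for every N operation.

    `start` is the 1-based leftmost reference position of the alignment.
    Intron coordinates are 1-based and inclusive (an assumption).
    """
    junctions = []
    pos = start
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "N":
            junctions.append((chrom, pos, pos + length - 1))
        if op in REFERENCE_CONSUMING:
            pos += length
    return junctions


def precision_recall(predicted, reference):
    """Compare two collections of junction tuples by exact match."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    found = junctions_from_cigar("Chr1", 100, "20M500N30M")
    known = [("Chr1", 120, 619), ("Chr1", 900, 1099)]
    print(found)
    print(precision_recall(found, known))
```

Exact matching of junction coordinates is the strictest possible criterion; a tolerance of a few bases around each boundary may be more appropriate, depending on how the reference junctions were derived.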

Changed lines 38-39 from:

Short read simulated data from Arabidopsis thaliana. Here's a readme.
You can download the genome sequences from the TAIR website. You will need the sequences for chromosomes 1-5.

to:

We ask you to apply your chosen program to the following two datasets:

  • Short read simulated data from Arabidopsis thaliana. Here's a readme.
  • Short read data generated by our collaborator from the biology department. The data is available from the NCBI short-read archive as GEO accession GSE32318. Note that this data is composed of two replicates that you need to align separately. The link for downloading the data is at the bottom of the page, labeled as supplementary file download.

To align both datasets you will need the sequence of the Arabidopsis genome, which you can download from the TAIR website. You will need the sequences for chromosomes 1-5.

Changed lines 38-39 from:

Simulated data from Arabidopsis thaliana. Here's a readme.

to:

Short read simulated data from Arabidopsis thaliana. Here's a readme.
You can download the genome sequences from the TAIR website. You will need the sequences for chromosomes 1-5.

Changed lines 36-41 from:

Presentation schedule:

to:

Data

Simulated data from Arabidopsis thaliana. Here's a readme.

Presentation schedule:

Added lines 35-40:

Presentation schedule:

  • Tuesday 10/11 Jeremy
  • Thursday 10/13 Fayyaz and Mo
  • Tuesday 10/18 Nathan and Arpita
  • Thursday 10/20 Zhisheng and Indika
Changed line 9 from:
  • Gsnap
to:
  • Gsnap (Zhisheng)
Changed line 25 from:
  • BWA
to:
  • BWA (Nathan)
Changed line 28 from:
  • RUM
to:
  • RUM (Indika)
Changed line 22 from:
  • SOAPsplice
to:
  • SOAPsplice (Arpita)
Changed line 12 from:
  • palmapper
to:
  • palmapper (Fayyaz)
Changed line 19 from:
  • SpliceMap
to:
  • SpliceMap (Mo)
Changed line 6 from:
  • MapSplice
to:
  • MapSplice (Jeremy)
Changed line 16 from:
  • TopHat
to:
  • TopHat (Jeremy)
Changed lines 29-30 from:

Gregory R. Grant, Michael H. Farkas, Angel D. Pizarro, Nicholas F. Lahens, Jonathan Schug, Brian P. Brunk, Christian J. Stoeckert, John B. Hogenesch, and Eric A. Pierce. Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics (2011) 27(18): 2518-2528.

to:

Gregory R. Grant, Michael H. Farkas, Angel D. Pizarro, Nicholas F. Lahens, Jonathan Schug, Brian P. Brunk, Christian J. Stoeckert, John B. Hogenesch, and Eric A. Pierce. Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics (2011) 27(18): 2518-2528.

Added lines 28-30:
  • RUM
    Gregory R. Grant, Michael H. Farkas, Angel D. Pizarro, Nicholas F. Lahens, Jonathan Schug, Brian P. Brunk, Christian J. Stoeckert, John B. Hogenesch, and Eric A. Pierce.

Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics (2011) 27(18): 2518-2528.

Changed lines 7-8 from:
  Paper:  K. Wang et al. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucl. Acids Res. (2010) 38(18): e178.
to:

K. Wang et al. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucl. Acids Res. (2010) 38(18): e178.

Changed lines 10-11 from:

Paper: T.D. Wu and S. Nacu. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics (2010) 26: 873-881.

to:

T.D. Wu and S. Nacu. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics (2010) 26: 873-881.

Changed line 20 from:

Kin Fai Au, Hui Jiang, Lan Lin, Yi Xing, and Wing Hung Wong. Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Research (2010) doi: 10.1093/nar/gkq211.

to:

Kin Fai Au, Hui Jiang, Lan Lin, Yi Xing, and Wing Hung Wong. Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Research (2010).

Changed line 1 from:

Projects

to:

Projects

Added lines 24-26:
  • BWA
    Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010;26(5):589-95.
Changed line 23 from:

Huang S, Zhang J, Li R, Zhang W, He Z, Lam T-W, Peng Z and Yiu S-M. SOAPsplice: genome-wide ab initio detection of splice junctions from RNA-Seq data. Frontiers in Genomic Assay Technology. (2010) 2:46.

to:

Huang S, Zhang J, Li R, Zhang W, He Z, Lam T-W, Peng Z and Yiu S-M. SOAPsplice: genome-wide ab initio detection of splice junctions from RNA-Seq data. Frontiers in Genomic Assay Technology (2010) 2:46.

Added lines 21-23:
  • SOAPsplice
    Huang S, Zhang J, Li R, Zhang W, He Z, Lam T-W, Peng Z and Yiu S-M. SOAPsplice: genome-wide ab initio detection of splice junctions from RNA-Seq data. Frontiers in Genomic Assay Technology. (2010) 2:46.
Changed line 20 from:

Kin Fai Au, Hui Jiang, Lan Lin, Yi Xing, and Wing Hung Wong. . Nucleic Acids Research (2010) doi: 10.1093/nar/gkq211.

to:

Kin Fai Au, Hui Jiang, Lan Lin, Yi Xing, and Wing Hung Wong. Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Research (2010) doi: 10.1093/nar/gkq211.

Changed line 16 from:
  • tophat
to:
  • TopHat
Added lines 18-20:
  • SpliceMap
    Kin Fai Au, Hui Jiang, Lan Lin, Yi Xing, and Wing Hung Wong. . Nucleic Acids Research (2010) doi: 10.1093/nar/gkq211.
Changed lines 16-18 from:
  • tophat
to:
  • tophat
    Trapnell C, Pachter L, and Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics (2009) 25 (9): 1105-1111.
Changed lines 12-14 from:
  • palmapper.
to:
  • palmapper
    De Bona, F. et al., Optimal spliced alignments of short sequence reads. ECCB08/Bioinformatics, 24 (16):i174, 2008.
Changed lines 7-8 from:
  Paper:  Kai Wang, Darshan Singh, Zheng Zeng, Stephen J. Coleman, Yan Huang, Gleb L. Savich, Xiaping He, Piotr Mieczkowski, Sara A. Grimm, Charles M. Perou, James N. MacLeod, Derek Y. Chiang, Jan F. Prins, and Jinze Liu. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucl. Acids Res. (2010) 38(18): e178.
to:
  Paper:  K. Wang et al. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucl. Acids Res. (2010) 38(18): e178.
Changed lines 10-11 from:

Paper: Thomas D. Wu and Serban Nacu. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics (2010) 26: 873-881.

to:

Paper: T.D. Wu and S. Nacu. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics (2010) 26: 873-881.

Changed lines 7-10 from:
  Paper:  Kai Wang, Darshan Singh, Zheng Zeng, Stephen J. Coleman, Yan Huang, Gleb L. Savich, Xiaping He, Piotr Mieczkowski, Sara A. Grimm, Charles M. Perou, James N. MacLeod, Derek Y. Chiang, Jan F. Prins, and Jinze Liu. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery.

Nucl. Acids Res. (2010) 38(18): e178

  • Gsnap
to:
  Paper:  Kai Wang, Darshan Singh, Zheng Zeng, Stephen J. Coleman, Yan Huang, Gleb L. Savich, Xiaping He, Piotr Mieczkowski, Sara A. Grimm, Charles M. Perou, James N. MacLeod, Derek Y. Chiang, Jan F. Prins, and Jinze Liu. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucl. Acids Res. (2010) 38(18): e178.
  • Gsnap
    Paper: Thomas D. Wu and Serban Nacu. Fast and SNP-tolerant detection of complex variants and splicing in short reads.

Bioinformatics (2010) 26: 873-881.

Changed lines 7-8 from:
  Paper:  Kai Wang, Darshan Singh, Zheng Zeng, Stephen J. Coleman, Yan Huang, Gleb L. Savich, Xiaping He, Piotr Mieczkowski, Sara A. Grimm, Charles M. Perou, James N. MacLeod, Derek Y. Chiang, Jan F. Prins, and Jinze Liu

MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery.

to:
  Paper:  Kai Wang, Darshan Singh, Zheng Zeng, Stephen J. Coleman, Yan Huang, Gleb L. Savich, Xiaping He, Piotr Mieczkowski, Sara A. Grimm, Charles M. Perou, James N. MacLeod, Derek Y. Chiang, Jan F. Prins, and Jinze Liu. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery.
Changed lines 6-9 from:
  • MapSplice
to:
  • MapSplice
    Paper: Kai Wang, Darshan Singh, Zheng Zeng, Stephen J. Coleman, Yan Huang, Gleb L. Savich, Xiaping He, Piotr Mieczkowski, Sara A. Grimm, Charles M. Perou, James N. MacLeod, Derek Y. Chiang, Jan F. Prins, and Jinze Liu

MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucl. Acids Res. (2010) 38(18): e178

Changed lines 3-23 from:

580 projects this semester will focus on tools for spliced alignment / alignment of short read data. Each student should choose one of the following programs to look at:

  • Palma and Qpalma (Alex)
  • The tophat / bowtie suite of programs (Kate)
  • SSAHA2 (Nissa)
  • BLAT (Michael)
  • Exonerate (Alan)
  • Pass (Zach)
  • Gmap

Talk to the instructor to choose a program. You will apply the program to simulated short-read data that we will provide. Your final report will describe your experience with the program in comparison to some of the other programs. You will give a presentation during the last week of classes and submit a final report describing your analysis.

to:

580 projects this semester will focus on tools for spliced alignment of short read data. Each student will choose one of the following programs to look at:

  • MapSplice
  • Gsnap
  • palmapper.
  • tophat

During the course of the project you will:

  • Present the method in class.
  • Apply the program to simulated short-read data that we will provide.
  • Write a report that describes your experience with the program and present your findings to the class. These will be due during the last week of classes.
Changed lines 12-13 from:
  • BLAT
to:
  • BLAT (Michael)
Added lines 15-16:
  • Pass (Zach)
Changed line 14 from:
  • Exonerate
to:
  • Exonerate (Alan)
Changed line 8 from:
  • The tophat / bowtie suite of programs
to:
  • The tophat / bowtie suite of programs (Kate)
Changed lines 6-7 from:
  • Palma and Qpalma
to:
  • Palma and Qpalma (Alex)
Changed line 10 from:
  • SSAHA2
to:
  • SSAHA2 (Nissa)
Added lines 15-16:
  • Gmap
Changed lines 6-10 from:
  • Palma and

Qpalma

  • The tophat /

bowtie suite of programs

to:
  • Palma and Qpalma
  • The tophat / bowtie suite of programs
Changed lines 3-4 from:

Each student will choose a project topic after talking with the instructor. Projects will involve analyzing sequence data using methods studied in class. As part of the project each student will give a presentation during the last week of classes and submit a final report describing the analysis.

to:

580 projects this semester will focus on tools for spliced alignment / alignment of short read data. Each student should choose one of the following programs to look at:

  • Palma and

Qpalma

  • The tophat /

bowtie suite of programs

  • SSAHA2
  • BLAT
  • Exonerate

Talk to the instructor to choose a program. You will apply the program to simulated short-read data that we will provide. Your final report will describe your experience with the program in comparison to some of the other programs. You will give a presentation during the last week of classes and submit a final report describing your analysis.

Deleted line 0:
Changed lines 3-4 from:

Projects will be done in groups of two. Each group will choose a project topic after talking with the instructor. Projects will involve analyzing sequence data using methods studied in class. As part of the project each group will give a presentation during the last week of classes and submit a final report describing what they did.

to:

Each student will choose a project topic after talking with the instructor. Projects will involve analyzing sequence data using methods studied in class. As part of the project each student will give a presentation during the last week of classes and submit a final report describing the analysis.

Added lines 1-5:

Projects

Projects will be done in groups of two. Each group will choose a project topic after talking with the instructor. Projects will involve analyzing sequence data using methods studied in class. As part of the project each group will give a presentation during the last week of classes and submit a final report describing what they did.