Assignment 3

Due date: 3/1/13

In this assignment you will write a Python module called assignment3.py that performs the following tasks.

GC content

Write a function called gcContent(sequence) that computes the GC content of a DNA sequence, i.e. the fraction of G or C nucleotides in the sequence. Your function should work regardless of whether the sequence is provided in upper or lower case. Note that a DNA sequence can contain ambiguous positions, which are represented by ambiguity codes. For example, 'N' denotes that any nucleotide is possible in that position. In computing GC content, ignore positions that contain ambiguous nucleotides, both when counting G or C and when counting the total number of positions. For example, for the sequence

TTACTNGGAGNT

your function should return 0.4 (4 of the 10 unambiguous positions are G or C).
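
One possible way to approach this is sketched below. It is a minimal illustration, not the required solution. It assumes that only A, C, G, and T count as unambiguous nucleotides, and that a sequence with no unambiguous positions should yield 0.0 (the assignment does not say what to return in that case).

```python
def gcContent(sequence):
    """
    Return the fraction of G or C among the unambiguous positions.

    sequence -- a DNA sequence (string), in upper or lower case
    """
    sequence = sequence.upper()
    # Count G/C positions and all unambiguous (A, C, G, T) positions.
    gc = sum(1 for base in sequence if base in "GC")
    total = sum(1 for base in sequence if base in "ACGT")
    # Assumption: return 0.0 when there is nothing unambiguous to count.
    if total == 0:
        return 0.0
    # float() keeps the division fractional on older Python versions too.
    return float(gc) / total


print(gcContent("TTACTNGGAGNT"))  # the example above: 4 / 10
```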

String matching with mismatches

Write a function called find_with_mismatches(s, substr, num_mismatches) that determines whether the string s contains substr when up to num_mismatches mismatches are allowed. Your function should return the first position (counting from 0) where a match occurs, and -1 if there is no match.

For example, the string CGCT occurs in AGGTCACTAG at index 4 when you allow for a single mismatch (CACT differs from CGCT in one position). In the context of motif finding this is very useful, since patterns (motifs) in DNA or protein sequences do not always occur in exactly the same way. Searching with mismatches allows us to capture this variability. Further assume that your function receives a DNA sequence as input, and that positions in the string you are searching in (s) that are not A, C, G, or T do not constitute matches. If you find it hard to deal with an arbitrary number of mismatches, then solve the problem for num_mismatches=1.
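
The search can be sketched as a sliding window that counts mismatches at each start position. This is only one possible approach under stated assumptions: both strings are converted to upper case before comparing (the assignment does not specify case handling for this function), and any character in s other than A, C, G, or T is counted as a mismatch.

```python
def find_with_mismatches(s, substr, num_mismatches):
    """
    Return the first index in s where substr occurs with at most
    num_mismatches mismatches, or -1 if there is no such position.

    s              -- the DNA sequence to search in
    substr         -- the pattern to look for
    num_mismatches -- the maximum number of mismatches allowed
    """
    # Assumption: the comparison is case-insensitive.
    s = s.upper()
    substr = substr.upper()
    for start in range(len(s) - len(substr) + 1):
        mismatches = 0
        for offset, expected in enumerate(substr):
            actual = s[start + offset]
            # A non-ACGT character in s never counts as a match.
            if actual not in "ACGT" or actual != expected:
                mismatches += 1
                if mismatches > num_mismatches:
                    break  # too many mismatches, try the next start
        else:
            # The inner loop finished without break: this window matches.
            return start
    return -1


print(find_with_mismatches("AGGTCACTAG", "CGCT", 1))  # the example above: 4
```

Stopping the inner loop as soon as the mismatch budget is exceeded avoids comparing the rest of a window that can no longer match.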

Reverse complement

Write a function called reverse_complement(sequence) that receives as input a DNA sequence and computes its reverse complement. For example, the reverse complement of AGTCATG is CATGACT. In computing the reverse complement, assume that any character that is not A, C, G, or T is its own complement.
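
A dictionary lookup is one straightforward way to sketch this. The version below is an illustration rather than the required solution. It assumes that the case of each character should be preserved in the output, which the assignment does not specify.

```python
def reverse_complement(sequence):
    """
    Return the reverse complement of a DNA sequence.

    sequence -- a DNA sequence (string)
    """
    # Assumption: lower-case input produces lower-case output.
    complement = {"A": "T", "C": "G", "G": "C", "T": "A",
                  "a": "t", "c": "g", "g": "c", "t": "a"}
    # Characters missing from the dictionary are their own complement.
    return "".join(complement.get(base, base) for base in reversed(sequence))


print(reverse_complement("AGTCATG"))  # the example above: CATGACT
```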

Put the three functions in a module called assignment3.py, and include a "main" that allows a user to test them similarly to assignment 2. At the top of the file put a comment that identifies you and the program (use a multi-line comment using triple quotes):

"""
Assignment 3
Submitted by Your_Name
A short description of your program
"""

Also, just below each function definition, include a short description of the method and its parameters in triple quotes.

Submit the program via ramct by the due date.