Main.Assignment3 History

February 21, 2013, at 02:51 PM MST by 129.82.44.223 -
Added line 26:

If you find it hard to deal with an arbitrary number of mismatches, then solve the problem for num_mismatches=1.

February 18, 2013, at 10:22 AM MST by 129.82.44.223 -
Added line 39:

"""

Added line 43:

"""

February 18, 2013, at 10:22 AM MST by 129.82.44.223 -
Changed lines 5-11 from:

Nucleotide composition

Write a program that asks the user for a nucleotide sequence and then prints out the fraction of each nucleotide out of the total number of nucleotides in the sequence. Assume the user provides the input in capital letters.

Suppose the input sequence provided by the user is

to:

In this assignment you will write a python module called assignment3.py, which performs the following tasks.

GC content

Write a function called gcContent(sequence) that computes the GC content of a DNA sequence, i.e. the fraction of G or C nucleotides in the sequence. Your function should work whether the sequence is provided in upper or lower case. Note that a DNA sequence can contain ambiguous positions, which are represented by ambiguity codes. For example, 'N' denotes that any nucleotide is possible in that position. In computing GC content, ignore positions that contain ambiguous nucleotides. For example, for the sequence

Changed lines 16-32 from:

Then the output should look like:

The nucleotide composition is:
A - 0.2
C - 0.1
G - 0.3
T - 0.4
Note that non-nucleotide symbols are not counted (the N usually denotes a position where the sequence is unknown).

Position i in a string a can be accessed as a[i], so you can use a while or for loop to iterate through the letters of a string. Call your program nucleotide_composition.py, and have a function in it that receives the sequence the user has provided as a parameter.

Submit the program via ramct. At the top of each file put a comment that identifies you and the program (use a multi-line comment using triple quotes):

to:

Your function should return 0.4
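
A minimal sketch of gcContent is given here as one possible approach, not as the reference solution. It assumes that every character other than A, C, G, or T counts as ambiguous, and that a sequence with no unambiguous positions yields 0.0; neither choice is specified in the assignment.

```python
def gcContent(sequence):
    """Return the fraction of G or C among the unambiguous positions.

    sequence -- a DNA sequence in upper or lower case.
    Characters other than A, C, G, T are skipped (assumption).
    """
    gc = 0
    counted = 0
    for letter in sequence.upper():
        if letter in "ACGT":
            counted += 1
            if letter in "GC":
                gc += 1
    if counted == 0:
        # Assumption: no unambiguous positions means a GC content of 0.0.
        return 0.0
    return gc / counted
```

With this sketch, gcContent("TTACTNGGAGNT") counts 10 unambiguous positions, 4 of which are G or C, and so evaluates to 0.4. That sequence is taken from the first revision of this page.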

String matching with mismatches

Write a function called find_with_mismatches(s, substr, num_mismatches) that determines whether the string s contains substr when up to num_mismatches mismatches are allowed. Your function should return the first position where a match occurs, and -1 if there is no match.

For example, the string CGCT occurs in AGGTCACTAG at index 4 when you allow for a single mismatch (the window CACT differs from CGCT in one position). In the context of motif finding this is very useful, since patterns (motifs) in DNA or protein sequences do not always occur in exactly the same way. Searching with mismatches allows us to capture this variability. Further assume that your function receives a DNA sequence as input, and that positions in the string you are searching in (s) that are not A, C, G, or T do not constitute matches.
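
The search described above can be sketched as a sliding window that counts mismatches at each start position. This is only an illustration: it assumes both strings are given in upper case and that an empty substr matches at position 0, which the assignment does not specify.

```python
def find_with_mismatches(s, substr, num_mismatches):
    """Return the first index where substr occurs in s with at most
    num_mismatches mismatches, or -1 if there is no such position.

    A position of s that is not A, C, G, or T never counts as a match.
    """
    for start in range(len(s) - len(substr) + 1):
        mismatches = 0
        for offset in range(len(substr)):
            letter = s[start + offset]
            if letter not in "ACGT" or letter != substr[offset]:
                mismatches += 1
                if mismatches > num_mismatches:
                    break  # too many mismatches in this window
        else:
            # The inner loop finished without exceeding the limit.
            return start
    return -1
```

For the example above, find_with_mismatches("AGGTCACTAG", "CGCT", 1) returns 4, and the same call with num_mismatches=0 returns -1.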

Reverse complement

Write a function reverse_complement(sequence) that receives a DNA sequence as input and computes its reverse complement. For example, the reverse complement of AGTCATG is CATGACT. In computing the reverse complement, assume that any character that is not A, C, G, or T is its own complement.
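
One way to sketch reverse_complement is to walk the sequence backwards and look up each letter in a complement table. The table name COMPLEMENT is made up for this illustration, and the sketch assumes upper-case input, so lower-case letters are treated like any other character and left unchanged.

```python
# Hypothetical helper table: complement of each unambiguous nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence.

    sequence -- a DNA sequence in upper case (assumption).
    Any character that is not A, C, G, or T is its own complement.
    """
    result = []
    for letter in reversed(sequence):
        result.append(COMPLEMENT.get(letter, letter))
    return "".join(result)
```

With this sketch, reverse_complement("AGTCATG") returns CATGACT, as in the example above.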

Put the three functions in a module called assignment3.py, and include a "main" that allows a user to test them similarly to assignment 2. At the top of the file put a comment that identifies you and the program (use a multi-line comment using triple quotes):

Changed lines 42-46 from:

(:sourceend:)

to:

(:sourceend:)

Also, just below each function definition, include a short description of the method and its parameters in triple quotes.

Submit the program via ramct by the due date.

February 18, 2013, at 09:59 AM MST by 129.82.44.223 -
Changed lines 29-38 from:

Prime number detection

A prime number is an integer greater than 1 that is divisible only by 1 and itself. Write a method called check_prime that receives an integer as input and returns True if it's prime, and False otherwise. The name of the python module should be prime.py.

Submit the programs by email to your instructor. At the top of each file put a comment that identifies you and the program (use a multi-line comment using triple quotes):

Assignment 2
Submitted by Your_Name
A short description of your program

to:

Submit the program via ramct. At the top of each file put a comment that identifies you and the program (use a multi-line comment using triple quotes):

(:source lang=python:)
Assignment 3
Submitted by Your_Name
A short description of your program
(:sourceend:)

February 18, 2013, at 09:56 AM MST by 129.82.44.223 -
Changed line 3 from:

Due date: 2/15/10

to:

Due date: 3/1/13

February 08, 2010, at 02:39 PM MST by 129.82.18.169 -
Added lines 3-4:

Due date: 2/15/10

Changed lines 26-28 from:
to:

Call your program nucleotide_composition.py, and have a function in it that receives the sequence the user has provided as a parameter.

Changed lines 32-38 from:

Write a method called check_prime that receives an integer as input and returns True if it's prime, and False otherwise.

to:

Write a method called check_prime that receives an integer as input and returns True if it's prime, and False otherwise. The name of the python module should be prime.py.

Submit the programs by email to your instructor. At the top of each file put a comment that identifies you and the program (use a multi-line comment using triple quotes):

Assignment 2
Submitted by Your_Name
A short description of your program

February 08, 2010, at 11:50 AM MST by 129.82.18.166 -
Changed lines 25-26 from:
to:

Prime number detection

A prime number is an integer greater than 1 that is divisible only by 1 and itself. Write a method called check_prime that receives an integer as input and returns True if it's prime, and False otherwise.
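
A minimal sketch of check_prime using trial division follows. The choice of trial division up to the square root is an assumption of this illustration; the assignment text does not prescribe a method.

```python
def check_prime(n):
    """Return True if the integer n is prime, and False otherwise."""
    if n < 2:
        return False  # 0, 1, and negative numbers are not prime
    if n < 4:
        return True   # 2 and 3 are prime
    if n % 2 == 0:
        return False
    divisor = 3
    # Only odd divisors up to the square root of n need to be tested.
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True
```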

February 07, 2010, at 09:03 PM MST by 71.196.160.210 -
Added lines 21-26:

Position i in a string can be accessed as a[i], so you can use a while or for loop to iterate through the letters of a string.

February 07, 2010, at 09:02 PM MST by 71.196.160.210 -
Changed line 20 from:

Note that non-nucleotide symbols are not counted (the @N@ usually denotes a position where the sequence is unknown).

to:

Note that non-nucleotide symbols are not counted (the N usually denotes a position where the sequence is unknown).

February 07, 2010, at 09:01 PM MST by 71.196.160.210 -
Changed lines 14-18 from:

The nucleotide composition is: A - 0.2 C - 0.1 G - 0.3 T - 0.4

to:

The nucleotide composition is:
A - 0.2
C - 0.1
G - 0.3
T - 0.4

February 07, 2010, at 09:01 PM MST by 71.196.160.210 -
Changed lines 14-18 from:

@@The nucleotide composition is: A - 0.2 C - 0.1 G - 0.3 T - 0.4@@

to:

The nucleotide composition is: A - 0.2 C - 0.1 G - 0.3 T - 0.4

February 07, 2010, at 09:00 PM MST by 71.196.160.210 -
Added lines 1-20:

Assignment 3

Nucleotide composition

Write a program that asks the user for a nucleotide sequence and then prints out the fraction of each nucleotide out of the total number of nucleotides in the sequence. Assume the user provides the input in capital letters.

Suppose the input sequence provided by the user is

TTACTNGGAGNT

Then the output should look like:

@@The nucleotide composition is: A - 0.2 C - 0.1 G - 0.3 T - 0.4@@

Note that non-nucleotide symbols are not counted (the @N@ usually denotes a position where the sequence is unknown).
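
The computation behind this first version of the assignment can be sketched as below. The function name nucleotide_composition is hypothetical (the page names only the file), and returning zeros when the input has no nucleotides is an assumption.

```python
def nucleotide_composition(sequence):
    """Return a dict mapping A, C, G, T to their fraction of the
    nucleotides in sequence (upper case assumed).

    Non-nucleotide symbols such as N are not counted.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    for letter in sequence:
        if letter in counts:
            counts[letter] += 1
    total = sum(counts.values())
    if total == 0:
        # Assumption: no nucleotides at all means every fraction is 0.0.
        return {base: 0.0 for base in counts}
    return {base: counts[base] / total for base in counts}
```

For the input TTACTNGGAGNT, the 10 counted nucleotides give fractions of 0.2 for A, 0.1 for C, 0.3 for G, and 0.4 for T, matching the expected output above.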