Assignment 5
Due date: 3/1/10
Nucleotide composition - again
Write a function called nucleotide_composition(file_name) that computes the nucleotide composition of the sequences contained in the given file, which is in Fasta format (a description of the format is found here).
The function should return the nucleotide composition as a list of length 4 where the first position is the fraction of As, the second is the fraction of Cs, the third is the fraction of Gs and the fourth is the fraction of Ts.
Assume the file has sequences in capital letters.
Suppose the file contains the sequence
TTACTNGGAGNT
Then your function should return the list [0.2,0.1,0.3,0.4].
Note that non-nucleotide symbols are not counted (the N usually denotes a position where the sequence is unknown).
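One possible way to approach the counting is sketched below. It is only a sketch under a few assumptions: Python 3 is used, Fasta header lines are recognized by a leading ">", counts are pooled over all sequences in the file, and a file with no nucleotides yields a list of zeros (the assignment does not say what should happen in that case).

```python
def nucleotide_composition(file_name):
    """Return [fraction of A, fraction of C, fraction of G, fraction of T]."""
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    with open(file_name) as fasta_file:
        for line in fasta_file:
            if line.startswith(">"):
                continue  # header line, carries no sequence data
            for symbol in line.strip():
                if symbol in counts:  # symbols such as N are skipped
                    counts[symbol] += 1
    total = sum(counts.values())
    if total == 0:
        # Assumption: an empty file gives all zeros rather than an error.
        return [0.0, 0.0, 0.0, 0.0]
    return [counts[base] / total for base in "ACGT"]
```

For the example sequence the counts are A=2, C=1, G=3 and T=4 out of 10 counted symbols, since the two Ns are left out of the total.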
Call your program nucleotide_composition.py. When run as a script, your program should ask the user for a file name. It should verify that the file exists, and keep asking until the user provides the name of an existing file.
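The "keep asking" behavior can be written as a loop that only returns once the file is found. The sketch below assumes Python 3; the function name ask_for_existing_file and its read_input parameter are hypothetical, introduced here only so the loop can be tried without typing at the keyboard.

```python
import os


def ask_for_existing_file(prompt, read_input=input):
    """Prompt repeatedly until the user names a file that exists."""
    while True:
        file_name = read_input(prompt).strip()
        if os.path.isfile(file_name):
            return file_name
        print("Cannot find a file named '%s'; please try again." % file_name)


# Typical use when run as a script:
# file_name = ask_for_existing_file("Enter the name of a Fasta file: ")
```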
Processing comma delimited files
Write a program that asks the user to enter the name of a comma-delimited input file and the name of an output file. It then computes the average of each column in the input file and writes the averages as a single comma-delimited line to the output file. The input file contains only numbers, and every row is assumed to have the same number of columns. An example input file might look like:
3,5,2
1,6,10
For this file the output file will contain the line 2,5.5,6.
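The averaging step can be sketched on its own, before any file handling is added. The sketch assumes Python 3 and that the example input consists of the two rows 3,5,2 and 1,6,10, which is what the expected output 2,5.5,6 implies. The helper name column_averages is hypothetical; it accepts any sequence of text lines, so an open file could be passed in the same way as the list used here.

```python
def column_averages(lines):
    """Average each column of comma-delimited numeric lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        rows.append([float(field) for field in line.split(",")])
    if not rows:
        return []
    # zip(*rows) regroups the values column by column.
    return [sum(column) / len(column) for column in zip(*rows)]


print(column_averages(["3,5,2", "1,6,10"]))  # prints [2.0, 5.5, 6.0]
```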
Structuring your program
Your program should be structured as follows:
# all the function definitions go here

if __name__ == '__main__':
    input_file = ask_for_input_file()
    output_file = ask_for_output_file()
    average = csv_average(input_file)
    write_to_file(average, output_file)
The function ask_for_input_file() should prompt the user for the file that contains the csv-formatted input. It should verify that the file exists, and keep asking until the user provides the name of an existing file.
The function ask_for_output_file() should ask for the name of the output file.
The function csv_average(input_file) computes the column averages and returns them as a list.
The function write_to_file(average, output_file) writes the list of averages to the output file as a single comma-delimited line. Your program should be called csv_average.py.
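A minimal sketch of write_to_file follows, assuming Python 3. Using the "g" format is a choice made here, not a requirement of the assignment: it drops the trailing ".0" so that the averages 2.0, 5.5 and 6.0 are written as 2,5.5,6, but it also keeps only six significant digits by default.

```python
def write_to_file(average, output_file):
    """Write the averages as one comma-delimited line."""
    # "g" formatting turns 2.0 into "2" and leaves 5.5 as "5.5".
    line = ",".join(format(value, "g") for value in average)
    with open(output_file, "w") as out:
        out.write(line + "\n")
```

If full precision matters more than matching the short form in the example, str(value) could be used in place of format(value, "g").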
Include an identifying comment as usual, plus comments that explain how your code works.
