NSCI 580A5 fall 2017

Instructors
Tai Montgomery
Asa Ben-Hur

Assignment 6

Due date: 11/3 at 11:59pm.

Analyzing Protein-protein interactions

In this assignment you will write Python code that loads and analyzes protein-protein interaction data.

Your first task is to load interaction data stored in a file. The data is stored in comma-delimited format, i.e. as a CSV file. Here are the first few lines of our example file:

YPL094C,YPR086W
YPL043W,YPR072W
YPL070W,YPR193C

Each line has the format:

protein_a,protein_b

This indicates that protein_a interacts with protein_b. In writing your code, use the following interaction dataset. This file contains 10,517 interactions in yeast extracted from the BIND database.
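As a rough illustration of the file format, the sketch below parses the three example lines with Python's standard csv module. It reads from an in-memory string rather than the real data file, and the helper name parse_pairs is hypothetical, not part of the required interface. Skipping lines that do not have exactly two fields is an assumption about how to treat malformed input.

```python
import csv
import io

# The three example lines shown above, held in memory instead of a file.
SAMPLE = "YPL094C,YPR086W\nYPL043W,YPR072W\nYPL070W,YPR193C\n"

def parse_pairs(handle):
    """Return a list of (protein_a, protein_b) tuples read from an open CSV handle."""
    pairs = []
    for row in csv.reader(handle):
        if len(row) != 2:
            continue  # assumption: silently skip blank or malformed lines
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs

pairs = parse_pairs(io.StringIO(SAMPLE))
print(pairs)
```

The same helper would accept a handle returned by open(file_name), since csv.reader works on any iterable of lines.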

Write a module called ppi.py with the following functions:

• load_interactions(file_name): This function returns a list of tuples, where each element of the list is an interaction and each element of a tuple is the identifier of a protein. For the above example, the return value should be a list of length three:
[('YPL094C','YPR086W'), ('YPL043W','YPR072W'), ('YPL070W','YPR193C')]
• interact(interactions, id1, id2): This function receives the IDs of two proteins and returns True if the two proteins are listed as interacting in the given interaction dataset, and False otherwise. Make sure that your function returns the same value regardless of the order in which the proteins are provided.
• get_interactions(interactions, id): This function returns the IDs of all the proteins with which the protein with the given ID interacts in the given interaction dataset.
• average_interactions(interactions): This function returns the average number of interactions per protein in the given set of interactions.
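The order-independence requirement for interact can be met in several ways. One possible approach, sketched here under the assumption that interactions are stored as two-element tuples, is to compare unordered sets instead of ordered tuples. The helper name same_pair is hypothetical and is not one of the required functions.

```python
def same_pair(pair, id1, id2):
    """Return True if the tuple `pair` holds id1 and id2 in either order."""
    return frozenset(pair) == frozenset((id1, id2))

example = [('YPL094C', 'YPR086W'), ('YPL043W', 'YPR072W')]

# The same query in both orders gives the same answer.
forward = any(same_pair(p, 'YPL094C', 'YPR086W') for p in example)
reverse = any(same_pair(p, 'YPR086W', 'YPL094C') for p in example)
print(forward, reverse)
```

Scanning the whole list for every query is simple but slow on a large dataset, which is one motivation for a different data structure.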

The following is a bare-bones example of how the code can be used:

interactions = load_interactions('yeast_interactions.data')

protein1 = 'YPL094C'
protein2 = 'YPR086W'
print("do " + protein1 + " and " + protein2 + " interact? " + str(interact(interactions, protein1, protein2)))

print("the number of interactions of " + protein1 + ": " + str(len(get_interactions(interactions, protein1))))

print("the average number of interactions per protein: " + str(average_interactions(interactions)))

Dictionaries are a really good way of representing the set of interactions, and have the potential to make your code much faster.
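To make the dictionary idea concrete, here is a minimal sketch that maps each protein ID to the set of its interaction partners. The function name build_partner_map is hypothetical, and recording each interaction in both directions is an assumption based on the order-independence requirement above.

```python
from collections import defaultdict

def build_partner_map(interactions):
    """Map each protein ID to the set of protein IDs it interacts with."""
    partners = defaultdict(set)
    for a, b in interactions:
        # Record the interaction in both directions so lookups work for either protein.
        partners[a].add(b)
        partners[b].add(a)
    return dict(partners)

example = [('YPL094C', 'YPR086W'), ('YPL043W', 'YPR072W'), ('YPL094C', 'YPR072W')]
partner_map = build_partner_map(example)

# Checking an interaction or listing partners is now a dictionary lookup
# rather than a scan over the whole list.
print('YPR086W' in partner_map.get('YPL094C', set()))
print(sorted(partner_map['YPR072W']))
```

With this structure, the number of partners of a protein is the size of its set, so each query no longer depends on the total number of interactions.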

Submission

In writing your code, use the template shown in class. The “main” segment of the module should be used to test each of the functions. Submit the file ppi.py.
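The template shown in class is not reproduced on this page. The layout below is only a common Python convention, assumed here for illustration: function definitions first, followed by a "main" segment that runs tests when the file is executed directly. The function count_pairs is a hypothetical placeholder standing in for the four required functions.

```python
"""ppi.py: tools for analyzing protein-protein interaction data (layout sketch)."""

def count_pairs(interactions):
    """Hypothetical placeholder; the real module defines the required functions here."""
    return len(interactions)

def main():
    # A small hand-made dataset lets the tests run without the data file.
    sample = [('YPL094C', 'YPR086W'), ('YPL043W', 'YPR072W')]
    assert count_pairs(sample) == 2
    print("all tests passed")

if __name__ == "__main__":
    # This segment runs only when the file is executed directly,
    # not when the module is imported by other code.
    main()
```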