Projects

The main focus of research in my lab is the development of machine learning methods for problems in bioinformatics. Our specialty is the creative design of kernel methods for problems ranging from prediction of protein function and interactions to prediction of alternative splicing.

Protein function prediction

_images/go5.png

Although protein function prediction has been studied for over twenty years, the standard method remains annotation transfer. The difficulty in applying state-of-the-art machine learning methods is that proteins can have multiple functions, and that the system of keywords used to describe protein function, the Gene Ontology (GO), has a complex hierarchical structure. This provides genome annotators with a rich vocabulary with which to describe protein function, but makes standard classification approaches sub-optimal. Therefore, there are significant opportunities to develop new classification methods that treat function prediction as a hierarchical classification problem.
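
To make the hierarchical, multi-label nature of the problem concrete, here is a minimal Python sketch. It assumes a toy child-to-parents map with made-up term names (these are not real GO identifiers, and this is not the lab's method). It shows why a set of predicted labels has to be consistent with the hierarchy: annotating a protein with a term implies all of that term's ancestors.

```python
from collections import deque

# Toy hierarchy: child -> list of parents. Term names are hypothetical
# placeholders, not real GO terms. A term may have more than one parent.
PARENTS = {
    "function": [],
    "binding": ["function"],
    "catalysis": ["function"],
    "protein_binding": ["binding"],
    "kinase": ["catalysis"],
    "kinase_binding": ["protein_binding"],
    "binding_kinase": ["kinase", "protein_binding"],
}


def ancestor_closure(terms, parents=PARENTS):
    """Return the given terms together with all of their ancestors."""
    seen = set(terms)
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        for parent in parents[term]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def is_consistent(terms, parents=PARENTS):
    """A label set is consistent if it already contains all its ancestors."""
    return ancestor_closure(terms, parents) == set(terms)


if __name__ == "__main__":
    print(sorted(ancestor_closure({"binding_kinase"})))
    print(is_consistent({"kinase"}))  # missing ancestors, so not consistent
```

A classifier that predicts each term independently can produce inconsistent label sets such as `{"kinase"}` without `"catalysis"`; treating the whole consistent set as the output is what makes this a structured prediction problem.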

We are using a recent development in machine learning, kernel methods for structured output spaces, to address this problem. Our recent work in this area, the GOstruct method, is showing great promise.

This project has recently been funded by NSF grant ABI 0965768/0965616.

Alternative splicing in plants

Alternative splicing is not as well studied in plants as it is in animals. The extent of alternative splicing in plants was not appreciated until recent studies showed alternative splicing rates approaching those observed in animals. The differences in genome architecture between plants and animals lead to differences in alternative splicing: for example, the majority of alternative splicing events in plants are intron retention, as opposed to exon skipping in animals. In collaboration with A.S.N. Reddy of the Biology Department, my lab is working on obtaining a better understanding of alternative splicing in plants. Our approach is to computationally search for sequence features that are predictive of alternative splicing (elements that serve as splicing enhancers and suppressors) and then test their biological relevance to the process.
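
As an illustration of what "sequence features" can look like in a kernel setting, the sketch below counts k-mers in a nucleotide sequence and compares two sequences with the spectrum kernel (the inner product of their k-mer count vectors). This is an assumed, generic representation chosen for illustration, not a description of the features used in this project; the function names and the example sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count every overlapping substring of length k in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(a, k)
    counts_b = kmer_counts(b, k)
    shared = counts_a.keys() & counts_b.keys()
    return sum(counts_a[kmer] * counts_b[kmer] for kmer in shared)


if __name__ == "__main__":
    # Made-up sequences, used only to exercise the functions.
    print(kmer_counts("GTAAGT", k=3))
    print(spectrum_kernel("GTAAGT", "GTAAGA", k=3))
```

K-mers that receive large weights in a classifier trained on such features are natural candidates for the kind of follow-up experimental testing described above.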

_images/chlamy.CL010575.png

A second avenue we are pursuing is to leverage next-generation sequencing data for prediction of alternative splicing events. The noisy nature of these data makes this a challenging task. Our most recent results illustrate the widespread existence of alternative splicing in the single-cell alga Chlamydomonas reinhardtii.

This project is funded by NSF grant DBI 0743097.

Protein-protein interactions

Proteins perform their function by interacting with other proteins. Therefore, understanding the complex network of interactions between an organism's proteins is important for understanding their role. Even with the advent of high-throughput experimental methods for elucidating interactions, the interaction networks of even well-studied model organisms are only sparsely known. My work in this area includes genome-wide prediction of interaction networks in yeast and human; more recently, my lab has been focusing on interactions of specific proteins such as calmodulin, which is highly conserved in all eukaryotes and interacts with a large number of proteins in each organism. This targeted approach allows us to tailor our predictors to the known properties of the protein in question. In the case of calmodulin, interaction is typically via an amphiphilic helix in a contiguous region of the target protein.
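
One standard way to quantify how amphiphilic a candidate helix is, shown here only as a sketch, is the helical hydrophobic moment: each residue's hydrophobicity is treated as a vector rotated by 100 degrees per position (the turn of an ideal alpha helix), and the length of the vector sum is reported per residue. The choice of the Kyte-Doolittle hydropathy scale and the function name are assumptions made for this illustration; this is not the predictor used in the project.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment of a peptide, assuming an ideal helix.

    Residue i contributes its hydrophobicity along direction i * angle_deg;
    the result is the length of the summed vector divided by the length
    of the sequence.
    """
    angle = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * angle) for i, aa in enumerate(seq))
    cos_sum = sum(scale[aa] * math.cos(i * angle) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


if __name__ == "__main__":
    # Made-up peptides: a uniform stretch versus a mixed one.
    print(hydrophobic_moment("L" * 18))
    print(hydrophobic_moment("LKKLLKLLKKLLKLLKKL"))
```

A uniform peptide has a moment near zero because its contributions cancel around the helix, while a peptide whose hydrophobic residues fall on one face gives a larger value. A sliding window of such scores is one plausible feature for locating candidate binding regions.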

This research has been carried out by Michael Hamilton, in collaboration with A.S.N. Reddy’s lab in the Biology Department at CSU.

Some of my older work in the area: