Evidence Integration in Bioinformatics
Biologists frequently use databases: for example, a biologist who encounters some unfamiliar proteins will consult databases to get a preliminary idea of what is known about them. These databases can often be interpreted as lists of assertions. An example is a protein-protein interaction database, in which each entry is a pair of proteins that are asserted to interact, along with the supporting evidence. Often a candidate for inclusion in such a database can be supported in a variety of fundamentally different ways. A methodological challenge is how to combine these different sources of evidence effectively to make accurate aggregate predictions. Ideas from machine learning are useful for this. I will describe some of the special properties of problems like this, and relevant methods from machine learning, including algorithms based on Bayesian networks, boosting, and SVMs.
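As a rough illustration of what combining evidence sources can look like, the sketch below uses a naive Bayes model, arguably the simplest Bayesian network, to merge several likelihood ratios into one posterior probability that a protein pair interacts. It is not taken from the talk: the function name, the prior, and the likelihood ratios are all hypothetical, and the model assumes the evidence sources are conditionally independent given the true interaction status, which real data may violate.

```python
import math


def combine_evidence(prior, likelihood_ratios):
    """Naive Bayes combination of evidence (illustrative sketch only).

    prior: assumed prior probability that the pair interacts (0 < prior < 1).
    likelihood_ratios: one value per evidence source, each equal to
        P(evidence | interacting) / P(evidence | not interacting).
    Assumes the sources are conditionally independent given the truth.
    """
    # Work in log-odds so that independent sources simply add up.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    # Convert the combined log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical numbers: a 1% prior and three evidence sources, two that
# support the interaction and one that weakly argues against it.
posterior = combine_evidence(0.01, [20.0, 5.0, 0.5])
print(f"posterior probability of interaction: {posterior:.3f}")
```

Working in log-odds keeps each source's contribution additive, so a ratio above 1 raises the estimate and a ratio below 1 lowers it. Methods such as boosting or SVMs would instead learn how to weight the sources from labeled examples.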
Author: Phil Long, Columbia University