DATE: | Thursday, Oct 6th, 2011 |
TIME: | 3:30 pm |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Automating the extraction of knowledge from biological sequences |
PRESENTER: | Oksana Korol, University of Ottawa |
ABSTRACT: In the past decade, rapid advances have been made in the sequencing, digitization and collection of biological data. This has created a need for comprehensive automated data analysis techniques. To at least partially address this need, we have developed a machine learning method based on inductive logic programming (ILP) to analyse and extract knowledge from a set of related DNA sequences. Our method induces a descriptive, human-readable theory that describes the data in terms of relationships between commonly found biological markers (motifs). The application that we have developed is an end-to-end analysis service. It consists of a simple-to-use Web interface and back-end modules that locate motifs, format the data and execute an ILP engine, which induces the theory from positive and negative examples and background knowledge. The model assumed in our background knowledge is a set of basic interactions between biological markers in any DNA sequence. This makes our approach applicable to a wide variety of biological problems. We have successfully applied our method to two real biological datasets, and some of the discovered trends were judged to be of high interest by the biology expert. |