DATE: Thursday, Mar. 11, 2004
TIME: 11:30 am
PLACE: Council Room (SITE 5-084)
TITLE: XplorMed and Genes2Diseases: Two applications of data mining in molecular biology
PRESENTER: Miguel Andrade
Ottawa Health Research Institute
ABSTRACT:

We have applied a fuzzy relation formalism to the definition of relations of dependency between elements of a set. This formalism is well suited to describing networks of relations of variable strength. We applied it both to the detection of keywords and their relations of dependency in sets of abstracts selected from a search in MEDLINE (XplorMed, http://www.bork.embl.de/xplormed/) and to the detection of genes associated with inherited diseases (Genes2Diseases, http://www.bork.embl.de/g2d/).

The XplorMed server allows you to explore a set of abstracts derived from a MEDLINE search. The system gives you the main associations between the words in groups of abstracts. Then, you can select a subset of your abstracts based on selected groups of related words and iterate your analysis on them.
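
As a rough illustration of the idea, the sketch below computes a graded dependency between two words and then narrows a set of abstracts to those containing a chosen group of words. It is only a minimal sketch under stated assumptions: the dependency measure (the fraction of abstracts containing word a that also contain word b), the function names, and the toy abstracts are all hypothetical and are not taken from the XplorMed implementation.

```python
def word_sets(abstracts):
    """Reduce each abstract to its set of distinct lower-case words."""
    return [
        {w.strip(".,;:()").lower() for w in text.split()} - {""}
        for text in abstracts
    ]


def dependency(docs, a, b):
    """Assumed measure: degree in [0, 1] to which word a depends on word b,
    taken as the fraction of abstracts containing a that also contain b."""
    with_a = [d for d in docs if a in d]
    if not with_a:
        return 0.0
    return sum(1 for d in with_a if b in d) / len(with_a)


def select_subset(abstracts, words):
    """Keep only the abstracts that contain every word in the chosen group."""
    wanted = {w.lower() for w in words}
    return [
        text
        for text, doc in zip(abstracts, word_sets(abstracts))
        if wanted <= doc
    ]


# Toy data, invented for this example only.
abstracts = [
    "p53 mutation in colon cancer",
    "p53 and apoptosis in cancer",
    "apoptosis signalling in cancer cells",
]
docs = word_sets(abstracts)

# The relation is directional: every abstract mentioning "p53" also mentions
# "cancer", but only two of the three "cancer" abstracts mention "p53".
p53_on_cancer = dependency(docs, "p53", "cancer")
cancer_on_p53 = dependency(docs, "cancer", "p53")

# Narrow the set to abstracts containing a selected group of related words;
# the analysis could then be repeated on this smaller set.
subset = select_subset(abstracts, {"p53", "apoptosis"})
```

Because the measure is asymmetric and takes values between 0 and 1, it gives one simple example of a relation of variable strength of the kind described above.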

XplorMed is recommended for cases in which you do not know exactly what you are expecting to find. Your interests may be modified by the results obtained, or you may want to ask new questions as the analysis develops. The results may also suggest additional words that should be used to expand your query in MEDLINE (e.g., unexpected abbreviations of a protein name, or synonyms of a disease).

The public availability of the draft sequence of the human genome enables new strategies to map molecular functional features of gene products to complex phenotypic descriptions such as those of genetically inherited diseases. We have developed a scoring system for the possible functional relations of human genes to genetically inherited diseases that have been mapped onto chromosomal regions without assignment of a particular gene. Our methodology can be divided into two parts: the association of genes with phenotypic features, and the identification of candidate genes on a chromosomal region by homology. The results of applying this methodology to a set of 455 human diseases, along with additional information, can be accessed through the G2D web server.