Apr. 20, 2004

DATE:	Tuesday, Apr. 20, 2004
TIME:	11:30 am
PLACE:	Council Room (SITE 5-084)
TITLE:	Using Information Retrieval for Large Scale Gene Analysis
PRESENTER:	Hagit Shatkay, Queen's University
ABSTRACT: Current genomic research has generated an immense volume of data and a tremendous increase in the number of gene-related publications. This wealth of information presents a major data analysis challenge. The ultimate goal is to understand the complex biological interrelationships among all discovered genes and proteins. Meeting this goal requires scanning the abundant literature about each gene and drawing on considerable human expertise. The past few years have seen an increase in research into biomedical literature mining, as automated systems for extracting relevant information from the literature are needed to complement existing techniques, speed up analysis, and enhance our understanding of biological processes. We present an approach, based on probabilistic information retrieval, that uses the literature to establish functional relationships among genes on a genome-wide scale. Experiments on documents discussing yeast genes, and a comparison of the results with well-established gene functions, demonstrate the effectiveness of our approach.