Mar 30, 2010

DATE:	Tuesday, Mar 30, 2010
TIME:	3:30 pm
PLACE:	Council Room (SITE 5-084)
TITLE:	Dimensionality Reduction Techniques and the Search for Transcriptional Regulatory Modules
PRESENTER:	Alain Tchagang, Institute for Information Technology, NRC
ABSTRACT: One of the major goals in computational biology is to identify transcriptional regulatory modules from gene expression datasets. Transcriptional regulatory modules are groups of transcription factors and their target genes, such that genes in the same group tend to be similarly expressed and regulated by the same transcriptional regulatory machinery across a number of experimental conditions or time points. To analyze gene expression datasets containing large numbers of observed variables, many researchers have applied dimensionality reduction techniques such as principal component analysis (PCA), singular value decomposition (SVD), independent component analysis (ICA), network component analysis (NCA), partial least squares (PLS) regression, factor analysis (FA), dynamic factor analysis (DFA), state space (SS) modeling, and others. In these techniques, a measure of similarity between the observed variables is defined and a low-dimensional graphical representation of these similarities is presented. Although some of these techniques have been shown to reveal significant patterns in gene expression data, there is still room for improvement. In this talk, I will first review the above-mentioned techniques, with their advantages and disadvantages, as applied to gene expression data analysis. Next, I will present extensions of the DFA and/or SS approaches and evaluate their performance on two criteria: gene clustering and reconstruction of transcription factor activity profiles. The first tests their capability to uncover biologically meaningful groups of genes, whereas the second assesses the reproducibility of transcription factor activity profiles from time-series gene expression data. These extensions are tested on synthetic data as well as on well-defined Saccharomyces cerevisiae and Escherichia coli microarray experimental datasets.