DATE: Wednesday, Oct. 31, 2007
TIME: 4:00 pm
PLACE: Council Room (SITE 5-084)
TITLE: Pruning Relations for Substructure Discovery of Multi-relational Databases
PRESENTERS: Hongyu Guo
University of Ottawa
ABSTRACT:

Multirelational data mining methods discover patterns across multiple interlinked tables (relations) in a relational database. In many real-world applications, such a database involves a large number of relations. Unfortunately, mining such a database is often expensive in terms of overheads such as the cost of acquiring the data, preprocessing, building the model, computation, and human effort. Another important issue is preserving the privacy of the data when the relational data comes from multiple sources, e.g., from different relations with different levels of access privileges.

This talk presents our approach, which pre-prunes uninteresting relations for multirelational classification tasks. Our strategy creates a pruned structure while minimizing the loss in predictive performance of the final classification model. We also present our experiments on both real-world and synthetic databases. Our research suggests that one can build an accurate relational classification model using only a small subset of the original schema.