DATE: Thursday, Nov 24, 2011
TIME: 3:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: Privacy Leakage in Multi-relational Learning via Unwanted Classification Models
PRESENTER: Hongyu Guo, NRC
ABSTRACT:

Relational database mining, where data are mined across multiple interlinked relations, is increasingly commonplace. However, in a complex database schema, it becomes difficult to identify all possible relationships between attributes from the different relations. A set of seemingly harmless attributes across multiple tables may therefore be used to learn unwanted classification models that accurately determine confidential information, leading to data leaks when the results of a data mining exercise are published. Furthermore, eliminating or distorting confidential attributes may be insufficient to prevent such data disclosure.

In this talk, we will demonstrate this potential for privacy leakage. We will also present our approach to identifying "dangerous" attribute sets. For data publishing, our method generates a ranked list of subschemas that maintain predictive performance on the class attribute while limiting the disclosure risk for, and the predictive accuracy on, confidential attributes. We demonstrate the effectiveness of our method on several databases.

This is joint work with Herna L. Viktor (University of Ottawa) and Eric Paquet (Institute for Information Technology, NRC).