DATE: Thursday, Mar 17, 2011
TIME: 3:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: Handling concept drift in relational databases using ensemble learning
PRESENTER: Mohammed Alshammeri
University of Ottawa
ABSTRACT:

In classification, concept drift refers to the scenario where the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. Common examples of such drifting concepts are weather patterns, customer preferences, temperature shifts and behavioral changes, amongst others. When the underlying data distribution changes, models built on older data may become inconsistent with data from the new concept. Concept drift therefore complicates the task of learning a model, since the data mining algorithm needs to both detect and handle such changes. Concept drift is very common in temporal data repositories such as data warehouses, which contain vast amounts of data collected over time. In these repositories, typical seasonal and behavioral patterns cause concept drift that needs to be detected and handled.

We discuss our research, which involves the design of an algorithm for handling concept drift within the database mining domain. Our proposed approach first builds an ensemble of classifiers from separate chunks of the data. Next, we aim to address concept drift by identifying and eliminating the worst-performing classifiers during ensemble learning. Our objective is to dynamically remove the unhelpful classifiers and replace them with better ones as concept drift occurs.
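
To illustrate the general idea of chunk-based ensemble learning with removal of the worst classifier, here is a minimal sketch in Python. It is not the algorithm presented in the talk: the nearest-centroid base learner, the names CentroidClassifier and ChunkEnsemble, the max_size parameter, and the choice to score members on the newest chunk are all assumptions made for illustration.

```python
from collections import Counter


class CentroidClassifier:
    """Nearest-centroid classifier, used here as a stand-in base learner (assumption)."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        # One centroid (mean feature vector) per class label.
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def accuracy(model, X, y):
    """Fraction of examples in (X, y) that the model labels correctly."""
    return sum(model.predict_one(xi) == yi for xi, yi in zip(X, y)) / len(y)


class ChunkEnsemble:
    """Keeps at most max_size classifiers, one trained per data chunk (hypothetical design)."""

    def __init__(self, max_size=3):
        self.max_size = max_size
        self.members = []

    def update(self, X, y):
        # Train a new member on the newest chunk.
        self.members.append(CentroidClassifier().fit(X, y))
        # If over capacity, drop the member that scores worst on the newest chunk.
        # Assumption: the newest chunk best reflects the current concept.
        if len(self.members) > self.max_size:
            worst = min(self.members, key=lambda m: accuracy(m, X, y))
            self.members.remove(worst)

    def predict_one(self, x):
        # Unweighted majority vote over the current members.
        votes = Counter(m.predict_one(x) for m in self.members)
        return votes.most_common(1)[0][0]
```

In this sketch the newest member is scored on its own training chunk, which favors it. A more careful design might hold out part of each chunk for evaluation or weight the votes by recent accuracy.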