DATE: Wed, Nov 20, 2013
TIME: 12:00
PLACE: Council Room (SITE 5-084)
TITLE: Cost-Sensitive Boosting Algorithms for Imbalanced Multi-instance Datasets
PRESENTER: Benjamin Wang
University of Ottawa
ABSTRACT:

Multi-instance learning differs from standard propositional classification in that its input is a set of bags, each containing many instances. The instances in a bag are not labeled individually; only the bags themselves are labeled positive or negative. Our research shows that imbalanced class distributions significantly degrade the performance normally achievable by most multi-instance algorithms, just as they do for most standard single-instance classifier learning algorithms. In this paper, we present and analyze this multi-instance class imbalance problem and propose a novel solution framework. We focus on how to use extended AdaBoost techniques that are applicable to most multi-instance classifier learning algorithms. Cost-sensitive boosting algorithms are developed by introducing cost items into the AdaBoost learning framework, enabling classification of imbalanced multi-instance datasets.
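
As a rough illustration of what "introducing cost items into the learning framework of AdaBoost" can look like, the sketch below applies one boosting round at the bag level. It is only one possible variant and is not necessarily the formulation presented in the talk. The function name, the per-bag cost values, the label encoding in {-1, +1}, and the choice to place the cost outside the exponent are all assumptions made for this example.

```python
import math

def cost_sensitive_update(weights, labels, predictions, costs):
    """One illustrative boosting round over bags (hypothetical helper).

    weights     : current bag weights, summing to 1
    labels      : true bag labels, each -1 or +1
    predictions : weak-learner bag predictions, each -1 or +1
    costs       : assumed misclassification cost per bag (higher for the rare class)
    Returns (alpha, new_weights).
    """
    rows = list(zip(weights, labels, predictions, costs))
    # Cost-weighted mass of correctly and incorrectly classified bags.
    right = sum(c * w for w, y, h, c in rows if y == h)
    wrong = sum(c * w for w, y, h, c in rows if y != h)
    if right <= 0 or wrong <= 0:
        raise ValueError("weak learner is perfect or always wrong; alpha is undefined")
    # Weak-learner weight, computed from the cost-weighted masses.
    alpha = 0.5 * math.log(right / wrong)
    # Assumed update rule: the cost multiplies the usual exponential reweighting.
    raw = [c * w * math.exp(-alpha * y * h) for w, y, h, c in rows]
    total = sum(raw)
    return alpha, [r / total for r in raw]

# One positive bag among three negative bags; the weak learner predicts
# "negative" for every bag, so only the positive bag is misclassified.
alpha, new_weights = cost_sensitive_update(
    weights=[0.25, 0.25, 0.25, 0.25],
    labels=[+1, -1, -1, -1],
    predictions=[-1, -1, -1, -1],
    costs=[2.0, 1.0, 1.0, 1.0],
)
```

In this toy run the misclassified positive bag ends the round with more weight than any negative bag, so the next weak learner is pushed toward the minority class.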