October 16, 2008

DATE:	Thursday, Oct 16, 2008
TIME:	2:45 pm
PLACE:	Council Room (SITE 5-084)
TITLE:	A Simple Sampling Method to Improve Co-training
PRESENTER:	Jin Huang, University of Ottawa
ABSTRACT: Co-training is an effective semi-supervised learning method in which two view classifiers label unlabeled instances for each other. In the co-training process, random sampling is used to select unlabeled instances from a large pool of unlabeled instances. Since research in areas such as active learning has shown that random sampling usually does not achieve the best performance, we explore whether other sampling methods can improve co-training. In this paper, a novel sampling method is proposed. Its basic idea is to use the two view classifiers to select the unlabeled instances on which they agree most when assigning labels. We perform experiments to compare co-training with our new sampling method, the original co-training, and co-training that uses co-testing sampling. The experimental results show that co-training with our new sampling method performs significantly better than the original co-training algorithm, while co-training with co-testing sampling usually performs the worst.
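
ILLUSTRATION: The abstract's selection idea (pick the unlabeled instances on which the two view classifiers agree most) might be sketched as follows. This is a minimal sketch, not the presenter's implementation. The function names, the probability-vector inputs, and the scoring rule (the product of the two classifiers' confidences when they predict the same label, zero otherwise) are all assumptions made for illustration; the abstract does not define how agreement is measured.

```python
from typing import List, Sequence, Tuple


def agreement_score(p1: Sequence[float], p2: Sequence[float]) -> Tuple[float, int]:
    """Score one unlabeled instance from two class-probability vectors.

    Assumed rule (hypothetical, not taken from the abstract): if both view
    classifiers predict the same label, the score is the product of their
    probabilities for that label; otherwise the score is 0.
    """
    label1 = max(range(len(p1)), key=lambda c: p1[c])
    label2 = max(range(len(p2)), key=lambda c: p2[c])
    if label1 != label2:
        return 0.0, label1
    return p1[label1] * p2[label2], label1


def select_most_agreed(
    probs1: Sequence[Sequence[float]],
    probs2: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, int]]:
    """Return up to k (instance index, agreed label) pairs, best agreement first."""
    scored = []
    for i, (p1, p2) in enumerate(zip(probs1, probs2)):
        score, label = agreement_score(p1, p2)
        if score > 0.0:  # skip instances on which the two views disagree
            scored.append((score, i, label))
    # Highest score first; ties are broken by the lower instance index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(i, label) for _, i, label in scored[:k]]


# Toy pool of four unlabeled instances, two classes, one row per instance.
view1_probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
view2_probs = [[0.8, 0.2], [0.3, 0.7], [0.1, 0.9], [0.6, 0.4]]
picked = select_most_agreed(view1_probs, view2_probs, k=2)
```

In this toy pool the views disagree on instance 1, so it is never selected; the remaining instances are ranked by joint confidence. In a full co-training loop the selected instances would be added to the labeled set with their agreed labels, in place of a random draw from the pool.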