DATE: Wed, Sept 28, 2016
TIME: 1:30 pm
TITLE: Semisupervised Text Classification Using Unsupervised Topic Information
PRESENTER: Sylvie Ratte and Ruben Dorado
Ecole de technologie superieure, Montreal

Labeling corpora is a time-consuming and recurring problem in the development of practical NLP applications. This study proposes a method to speed up the development of categorized corpora for text classification. We report advances on this task by presenting a semi-supervised method that builds a text classifier using unsupervised topic information. The objective is to use as little labeled data as possible, in order to accelerate the creation of classification corpora in specific domains. We show that it is possible to obtain performance similar to that of state-of-the-art methods despite the limited quantity of labeled data. Finally, we discuss the overall objective of the research and directions for future work.