DATE: Thu, Feb 4, 2016
TIME: 1:30 pm
TITLE: Tweets classification using complex features based on distributed word representations
PRESENTER: Vera Sazonova
Dalhousie University

Distributed word vector representations have been shown to exhibit interesting semantic and syntactic properties. Recently they have been used successfully in NLP tasks such as sentiment recognition and named entity extraction. Text classification is a classic text mining problem that requires a reliable representation of the text. We have performed a series of classification experiments using a novel text representation model for short texts (tweets) based on distributed word representations. The proposed model produces a dense vector and can take advantage of additional unlabelled data when such data is available. We present an extensive study of how classification quality depends on various model and dataset parameters, and we compare the performance of our model with the standard bag-of-words (BOW) text representation.