DATE: | Wed, Oct 30, 2013 |
TIME: | 11:45 am |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Feature Space Selection and Combination for Native Language Identification |
PRESENTER: | Cyril Goutte, NRC |
ABSTRACT:
We describe the National Research Council Canada's submission to a shared task on Native Language Identification,
and provide an analysis of the results. Our systems rely on the use of SVM statistical classifiers, trained
on various combinations of feature spaces describing lexical and syntactic characteristics of documents
written by ESL learners. Somewhat surprisingly, classifiers using only surface form information performed
very well, yielding an error rate of around 20% over the 11 classes. However, the best performance is obtained
by a combination of different models using majority voting.
|
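
The majority-voting combination mentioned in the abstract can be illustrated with a minimal sketch. This is not the NRC system: the function name `majority_vote`, the tie-breaking rule, the three models, and the language labels are all assumptions made for illustration. Each model predicts one label per document, and the label with the most votes is kept.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine one document's per-model labels by plurality vote.

    `predictions` holds one label per model. Ties go to the label that
    appears first in the list (an arbitrary choice for this sketch).
    An empty list raises ValueError.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical predictions from three models on two documents.
model_outputs = [
    ["FRE", "GER"],  # model A (illustrative)
    ["FRE", "SPA"],  # model B (illustrative)
    ["ITA", "SPA"],  # model C (illustrative)
]

# zip(*...) regroups the predictions by document instead of by model.
combined = [majority_vote(list(votes)) for votes in zip(*model_outputs)]
```

With an odd number of models and many classes, ties are still possible (for example, three models voting for three different labels), so any real combination scheme needs an explicit tie-breaking rule such as the one assumed here.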