DATE: | Thursday, Oct. 30, 2003 |
TIME: | 11:30 am |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Discrimination of texts following discourse categories and using multi-classification techniques |
PRESENTER: | Matthieu Hermet, University of Ottawa |
ABSTRACT:
This presentation covers work in progress on classifying French texts according to rhetorical type. The work is part of a larger project to develop tools that assist second-language learning. We start from the premise that document classification calls for research in both classification techniques and document representation. Our task differs from the standard tasks presented in the literature: rather than topic spotting, we need to assign each text to one of four unusual categories representing rhetorical types (Cause-Effect, Description, Comparison and Problem-Solution). Initially, we work with a small sample of 48 news-like texts, extracted from various magazines and supposedly divisible into the four categories above. After initial experiments with standard techniques, it became clear that the problem calls for a different approach to attribute engineering and information extraction. Among other approaches, we have experimented with the ECOC (error-correcting output codes) classification framework. We will present early results on two kinds of classification tasks, characterization and discrimination, and discuss the potential research implications of these preliminary results. This work builds on previous and parallel research by A. Agbago and C. Barriere. |