DATE: | Thursday, Jan. 20, 2005 |
TIME: | 1:30 pm |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Hierarchical Text Categorization |
PRESENTER: | Svetlana Kiritchenko, University of Ottawa |
ABSTRACT:
Hierarchical text categorization addresses tasks in which the category set has a hierarchical structure or, more formally, is partially ordered. In general, a category hierarchy is given as a directed acyclic graph (DAG) or, as a special case, a tree. Many hierarchical classification tasks arise in today's computerized world: web page classification in web directories, email sorting into personal folders, article storage in digital libraries, etc. In this work, we propose two novel hierarchical learning algorithms. The first is a generalized version of the conventional local top-down approach, extended to cover the case where the category hierarchy is a general DAG, not just a tree. The second is a hierarchical version of AdaBoost, a very successful boosting approach. Furthermore, we introduce a new hierarchical evaluation measure that is based solely on the given class hierarchy and is (statistically) consistent with, but more discriminating than, standard measures.
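
To make the notion of a partially ordered category set concrete, the following minimal Python sketch represents a hierarchy as a DAG and derives the "is below" relation from it. It is an illustration only, not the presenter's algorithms or evaluation measure: the toy category names, the `PARENTS` mapping, and the helper functions `ancestors`, `is_below`, and `close_upward` are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy: each category maps to its direct parents.
# "machine_learning" has two parents, so this is a DAG rather than a tree.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "science": ["root"],
    "computing": ["root"],
    "machine_learning": ["science", "computing"],
    "text_mining": ["computing"],
}


def ancestors(category: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every category reachable by following parent links upward."""
    seen: Set[str] = set()
    stack = list(parents[category])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def is_below(a: str, b: str, parents: Dict[str, List[str]]) -> bool:
    """Partial order induced by the DAG: True if b is a proper ancestor of a."""
    return b in ancestors(a, parents)


def close_upward(labels: Set[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Extend a label set with all ancestors of its members."""
    closed = set(labels)
    for label in labels:
        closed |= ancestors(label, parents)
    return closed


if __name__ == "__main__":
    print(sorted(ancestors("machine_learning", PARENTS)))
    print(is_below("text_mining", "computing", PARENTS))
    print(is_below("science", "computing", PARENTS))
    print(sorted(close_upward({"text_mining"}, PARENTS)))
```

In this sketch, "science" and "computing" are incomparable (neither is an ancestor of the other), which is what makes the order partial rather than total. A category with several parents, such as "machine_learning" here, is the situation that a tree cannot express and a general DAG can.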