DATE: | Monday, Nov 24, 2008 |
TIME: | 2:45 pm |
PLACE: | Room A707 (CBY building) |
TITLE: | Automatic Text Categorization with Collaboratively Tagged Data |
PRESENTER: | Marcel Wirantono, University of Ottawa |
ABSTRACT:
Collaborative tagging has recently become a popular alternative tool for web retrieval. Similar to text categorization, albeit in a less centralized fashion, collaborative tagging relies on humans to annotate documents with metadata descriptions, i.e., tags. For that reason, our research attempts to extend the tagging process to include automatic text categorization. In applying automatic text categorization to collaboratively tagged data, we have found that the lack of structure in tagging has limited our learning approach to simple flat categorization techniques. Inspired by the application of hierarchical categorization in web directories [Dumais and Chen, 2000], we proposed a categorization approach that automatically builds a hierarchy from the tag space and incorporates it into the training and classification process. Unlike previous hierarchical categorization approaches, which rely on a human-generated hierarchy, our approach relies on an artificial hierarchy created from tag usage analysis. Even with this artificially generated hierarchy, we observed improvements not only in the quality of predictions but also in the efficiency (total training and classification time) of our automatic text categorization system. |