DATE: Thursday, Oct. 14, 2004
TIME: 1:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: The Anatomy of a Named Entity Recognition System
PRESENTER: David Nadeau, NRC
ABSTRACT:
The named entity recognition (NER) task consists of identifying phrases in text that refer to labeled things. These can be individuals, societies, organizations, cities, objects, movies, and so on. In the Message Understanding Conference (MUC), one of the tasks deals with (1) organizations, (2) persons and (3) locations. In this talk, we'll study the components required to build an NER system that identifies those three types of entities. We'll specifically look at a minimal grammar required to delimit phrases. We'll also look at the internal and external evidence required to classify entities into one of the three types. We'll illustrate the limitations of such a system with examples and a demo.
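
As a rough illustration of the two components named in the abstract, the sketch below delimits candidate phrases with a very small "grammar" (runs of capitalized tokens) and then classifies each phrase using internal evidence (words inside the phrase) and external evidence (the word just before it). It is not the presenter's system: the cue lists, the function names (delimit, classify, recognize) and the capitalization rule are all assumptions made up for this example.

```python
import re

# Hypothetical cue lists, invented for illustration only.
PERSON_TITLES = {"Mr.", "Mrs.", "Ms.", "Dr."}              # internal evidence
ORG_SUFFIXES = {"Inc.", "Corp.", "Ltd.", "University"}     # internal evidence
LOCATION_PREPOSITIONS = {"in", "at", "from", "near"}       # external evidence

# Assumed minimal "grammar": a phrase is a run of capitalized tokens,
# each optionally ending with a period (e.g. "Dr.", "Corp.").
CAP_TOKEN = re.compile(r"^[A-Z][A-Za-z]*\.?$")


def delimit(tokens):
    """Return (start, end) index pairs for runs of capitalized tokens."""
    spans, start = [], None
    for i, tok in enumerate(tokens):
        if CAP_TOKEN.match(tok):
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


def classify(tokens, start, end):
    """Assign one of the three MUC types, or UNKNOWN if no cue fires."""
    phrase = tokens[start:end]
    # Internal evidence: cues found inside the phrase itself.
    if phrase[0] in PERSON_TITLES:
        return "PERSON"
    if phrase[-1] in ORG_SUFFIXES:
        return "ORGANIZATION"
    # External evidence: the token immediately before the phrase.
    if start > 0 and tokens[start - 1].lower() in LOCATION_PREPOSITIONS:
        return "LOCATION"
    return "UNKNOWN"


def recognize(text):
    """Whitespace-tokenize, delimit phrases, and classify each one."""
    tokens = text.split()
    return [(" ".join(tokens[s:e]), classify(tokens, s, e))
            for s, e in delimit(tokens)]


if __name__ == "__main__":
    print(recognize("the board said Dr. Jane Smith joined Acme Corp. in Ottawa"))
```

Even this toy version hints at the kind of limitation the talk mentions: a capitalized word at the start of a sentence is swallowed into the following phrase, and a name with no cue inside or around it stays UNKNOWN.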