DATE: | Tuesday, Mar 23, 2010 |
TIME: | 3:30 pm |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Finding salient segments in noisy text: some studies on speech transcripts and semi-structured Web text |
PRESENTER: | Xiaodan Zhu, Institute for Information Technology, NRC |
ABSTRACT:
Automatically identifying salient content in written or spoken documents plays an important role in helping people acquire information efficiently. In this talk, I will focus mainly on my PhD thesis work, in which I studied the role that noisy text (automatic transcripts) plays in identifying salient content in spoken documents. I will also briefly discuss an information extraction effort that aims to identify, across the whole Web, related entity names embedded in noisy HTML/XML tags. |