Oct 2, 2013

DATE:	Wed, Oct 2, 2013
TIME:	11:45 am
PLACE:	Council Room (SITE 5-084)
TITLE:	Distributional Semantics for Probabilistic Domain Modelling
PRESENTER:	Jackie Cheung, University of Toronto
ABSTRACT: In many textual applications, detailed domain knowledge is needed in order to understand and address a user's information needs. For example, a news aggregator trying to summarize news articles about a court ruling should ideally know the salient participants and events, such as the plaintiffs, the issue at hand, and the outcome of the ruling. Unsupervised probabilistic models are a popular approach to domain modelling because they scale to multiple domains without requiring annotation effort, but they can be difficult to train on a limited amount of in-domain data. In this talk, I examine distributional semantics as a potential solution to this problem. Distributional semantics is based on the idea that words or phrases with similar meanings should have similar distributions in a large training corpus. I first present an evaluation framework for distributional semantic models that is indicative of their performance in downstream applications involving semantic inference. Then, I show how distributional semantic representations can be effectively integrated into a probabilistic domain model. I conclude by discussing ongoing work on applying domain modelling to automatic summarization.
BIO: Jackie CK Cheung is a PhD candidate at the University of Toronto under the supervision of Prof. Gerald Penn. His research interests span several areas of natural language processing, including computational semantics, automatic summarization, and various topics in natural language generation. He also received his M.Sc. from the University of Toronto, and his B.Sc. from the University of British Columbia.