DATE: Thu, Dec 8, 2016
TIME: 1 pm
PLACE: SITE 5084
TITLE: Local-Global Vectors to Improve Unigram Terminology Extraction
PRESENTER: Ehsan Amjadian
Carleton University
ABSTRACT:

We explore a novel method that integrates efficient distributed representations with terminology extraction. We show that information from a small number of observed instances can be combined with local and global word embeddings to remarkably improve term extraction results for unigram terms. To do so, we pass the term candidates extracted by other tools to a filter made of the local-global embeddings and a classifier, which decides whether each candidate is a term. The filter can also serve as a hub that merges different term extraction tools into a single higher-performing system. We compare filters built on the skip-gram architecture with filters built on the CBOW architecture for this task.
This is joint work with Diana Inkpen, T. Sima Paribakht, and Farahnaz Faez, published in the proceedings of the CompuTerm Workshop 2016.
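
The filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the presenters' implementation: the embedding values, the words, the concatenation of local and global vectors, the logistic-regression classifier, and all function names below are assumptions made only for illustration. In practice the local vectors would be trained on the domain corpus and the global vectors on a general corpus (with skip-gram or CBOW); here they are hard-coded toy numbers.

```python
import math

# Hypothetical toy embeddings (made-up values, 2 dimensions each).
# LOCAL stands in for vectors trained on the domain corpus,
# GLOBAL for vectors trained on a large general corpus.
LOCAL = {
    "enzyme": [0.9, 0.8], "protein": [0.8, 0.9], "genome": [0.85, 0.7],
    "table": [0.1, 0.2], "result": [0.2, 0.1], "paper": [0.15, 0.25],
}
GLOBAL = {
    "enzyme": [0.7, 0.1], "protein": [0.8, 0.2], "genome": [0.75, 0.2],
    "table": [0.2, 0.8], "result": [0.1, 0.9], "paper": [0.2, 0.7],
}


def local_global_vector(word):
    """Concatenate the local and global vectors of a word (an assumed way to combine them)."""
    return LOCAL[word] + GLOBAL[word]


def train_logistic(examples, epochs=200, lr=0.5):
    """Fit a plain logistic-regression classifier with stochastic gradient steps."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            z = bias + sum(w * x for w, x in zip(weights, features))
            error = label - 1.0 / (1.0 + math.exp(-z))
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


def is_term(word, weights, bias):
    """Return True when the classifier scores the word as a term."""
    features = local_global_vector(word)
    return bias + sum(w * x for w, x in zip(weights, features)) > 0


def filter_candidates(candidate_lists, weights, bias):
    """Merge candidate lists from several extraction tools and keep accepted terms."""
    merged = sorted(set().union(*candidate_lists))
    return [w for w in merged if w in LOCAL and w in GLOBAL and is_term(w, weights, bias)]


# A small number of observed (labelled) instances: 1 = term, 0 = non-term.
seeds = [("enzyme", 1), ("protein", 1), ("table", 0), ("result", 0)]
weights, bias = train_logistic([(local_global_vector(w), y) for w, y in seeds])

# Candidate lists as two hypothetical extraction tools might return them.
tool_a = ["genome", "paper"]
tool_b = ["genome", "result"]
accepted = filter_candidates([tool_a, tool_b], weights, bias)
```

The final call shows the hub role mentioned in the abstract: the candidate lists of several tools are merged into one set, and a single classifier decides which candidates survive.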