ABSTRACT:
The words immediately surrounding a potential opinion "target" are an
indicator of the presence of an opinion and of its polarity. But not all
the words near the target are likely to be relevant to finding an
opinion.
This talk describes work on retrieving opinion words and their links to
within-sentence targets in an information technology (IT) business
corpus through crowdsourcing. The opinion words for which we are
looking are ones that apply to specific IT business concepts relevant to
a larger research project in the social diffusion of innovations.
Existing resources do not fully cover this domain.
I will present a data collection pipeline and a user interface that
avoids some of the pitfalls of asking untrained individuals about
word-target links. Our user interface evades some of the problems caused by
the inherent subjectivity of the task. I will also briefly describe one
of the "downstream" uses for the data in a machine learning technique to
acquire syntactic features for sentiment classification.