Distraction-Based Neural Networks for Modeling Documents
PRESENTER:
Xiaodan Zhu
NRC
ABSTRACT:
Distributed representations learned with neural networks have recently been
shown to be effective in modeling natural language at fine granularities
such as words, phrases, and even sentences. Whether and how such an
approach can be extended to model larger spans of text, e.g., documents,
is an intriguing question that merits further investigation. This work
aims to enhance neural network models for this purpose.
A typical document-level modeling problem is automatic summarization,
which requires modeling documents in order to generate summaries. In this
work, we propose neural models that train computers not only to pay
attention to specific regions and content of input documents with
attention models, but also to distract them to traverse between different
content of a document, so as to better grasp its overall meaning for
summarization. Without engineering any features, we train the models on
two large datasets. The models achieve state-of-the-art performance and
benefit significantly from the distraction modeling, particularly when
input documents are long.
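
As a rough illustration of the idea (a minimal sketch, not the authors'
code): one way to add a distraction signal to a standard attention model
is to penalize positions that have already accumulated attention in
earlier decoding steps, pushing the model toward content it has not yet
covered. The function names and the penalty weight b below are
illustrative assumptions, not the method as published.

    # Sketch only: distraction as a penalty on previously attended
    # positions, subtracted from raw alignment scores before softmax.
    import numpy as np

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    def distracted_attention(scores, history, b=1.0):
        """Attention weights with a distraction penalty.

        scores  : (n,) raw alignment scores for the current step
        history : (n,) sum of attention weights from previous steps
        b       : penalty strength (assumed hyperparameter)
        """
        return softmax(scores - b * history)

    # Toy decoding loop over a 5-position "document"
    rng = np.random.default_rng(0)
    n = 5
    history = np.zeros(n)
    for t in range(3):
        scores = rng.normal(size=n)   # stand-in for alignment scores
        alpha = distracted_attention(scores, history)
        history += alpha              # remember where we attended
        print(f"step {t}: attention = {np.round(alpha, 3)}")

Under these assumptions, positions attended heavily at earlier steps
receive lower weight later, which is one plausible reading of how
distraction helps cover long documents more evenly.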