DATE: Wed, Aug 19, 2020
TIME: 1 pm
PLACE: Online on Zoom
TITLE: Enhancing Text Readability using Deep Learning Techniques
PRESENTER: Wejdan Alkaldi
University of Ottawa
ABSTRACT:

In the information era, reading is increasingly important for keeping up with the growing body of knowledge. The ability to read a document varies from person to person depending on skills and knowledge. It also depends on whether the readability level of the text matches the reader's level. We propose a model that uses state-of-the-art machine learning and deep learning techniques to classify and simplify a text, taking the reader's reading level into consideration. The model first classifies the text into its appropriate readability level. If the text's readability level is higher than the reader's level, i.e., the text is too difficult to read, the model simplifies the text to the desired level. The classification model is trained on the readability levels found in the Newsela corpus (https://newsela.com/data/). Once trained, the classification model will be used to label additional corpora for text simplification. The simplification model will then be trained to simplify a given document to match a specific readability level. The model will be able to generate several simplified versions of a given document based on the readability level provided. Simplification will be done at the paragraph level rather than the sentence level, and will include sentence splitting when appropriate. The model will help people with low literacy read and understand the documents they need. It will also benefit educators who assist readers at different reading levels.
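
The classify-then-simplify flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the presenter's implementation: the function names (`adapt_text`, `toy_classify`, `toy_simplify`) are hypothetical, readability levels are assumed to be integers where a higher number means harder text, and the two toy functions are simple heuristics standing in for the trained deep learning models.

```python
from typing import Callable


def adapt_text(
    paragraph: str,
    reader_level: int,
    classify: Callable[[str], int],
    simplify: Callable[[str, int], str],
) -> str:
    """Return the paragraph unchanged if it is readable at reader_level,
    otherwise return a version simplified toward reader_level."""
    text_level = classify(paragraph)
    if text_level <= reader_level:
        return paragraph  # already at or below the reader's level
    return simplify(paragraph, reader_level)


def toy_classify(paragraph: str) -> int:
    # Placeholder heuristic (NOT the trained classifier):
    # long average sentence length is treated as a harder level.
    sentences = [s for s in paragraph.split(".") if s.strip()]
    avg_words = len(paragraph.split()) / max(len(sentences), 1)
    return 1 if avg_words <= 10 else 2


def toy_simplify(paragraph: str, target_level: int) -> str:
    # Placeholder heuristic (NOT the trained simplifier): mimic sentence
    # splitting by breaking at semicolons. target_level is ignored here.
    parts = [p for p in paragraph.split("; ") if p]
    return ". ".join(p[0].upper() + p[1:] for p in parts)


if __name__ == "__main__":
    hard = (
        "The committee deliberated at length over the proposed amendments "
        "to the regulation; it ultimately deferred the decision."
    )
    print(adapt_text(hard, 1, toy_classify, toy_simplify))
```

Passing the classifier and simplifier in as functions keeps the control flow separate from the models, so trained models could replace the toy heuristics without changing `adapt_text`.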