DATE: Tuesday, Mar. 28, 2006
TIME: 2:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: Factor: More precise fact finding in full text articles
PRESENTER: Joel Martin, NRC
ABSTRACT:
Text mining studies in biology are often limited to abstracts rather than full text. They are also usually either limited to thousands rather than millions of Medline records or very slow. With a modified search engine, many common text mining tasks can be done rapidly, even on full text. In fact, some information extraction and text categorization tasks can be completed in seconds or minutes, even across tens of gigabytes of (previously indexed) text. I will present Factor, an efficient implementation of a text analysis engine that uses a highly expressive query language. With this language, users can write queries that quickly accomplish what previously required several different custom-built systems.