DATE: | Wednesday, May 25, 2005 |
TIME: | 3:30 pm |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Authorship Attribution as a Text Categorization Problem |
PRESENTER: | Moshe Koppel, Bar-Ilan University, Israel |
ABSTRACT:
In the standard authorship attribution problem, we are told that the author of an anonymous document is one of a given set of suspects and are asked to choose the likeliest candidate among them based on their respective known writings. Posed this way, the problem is a reasonably straightforward text categorization problem and most of the usual tricks (which I will very briefly review) apply. In the real world, though, we usually face one of two harder variations. In the first variation, we have an anonymous text but no suspect authors at all, and we are asked to profile the author. That is, we wish to determine, for example, the author's age, gender, linguistic background, etc. I will show the extent to which each of these problems is solvable. In the second variation, the problem of authorship verification, we are given the known writing of a single author and are asked to determine if this author is also the author of a given anonymous text. I will introduce a new meta-learning technique for solving this problem, demonstrate its effectiveness on various classic books, and use it to solve at least one real-life literary mystery. |