The Text Analysis and Machine Learning Group


 


Fall 2001


 

SPEAKER:             Guy Lapalme
                                RALI - Département d'informatique et de recherche opérationnelle
                                Université de Montréal

DATE:                    Thursday 20th September, 2001  

TITLE:                    Mercure: An automatic follow-up system for e-mail  

ABSTRACT:  We will present the current state of our research project, which deals with the automatic follow-up of e-mail for an enterprise web site. We will first present different approaches to the automatic processing of e-mail and then describe the context of our industrial project. We will discuss different criteria for selecting an approach to e-mail follow-up and apply them to our corpus. This project is funded by Bell University Laboratories and a Collaborative Research and Development (CRD) grant from NSERC.

BIO:        Guy Lapalme has been active in Natural Language Processing for more than 15 years, especially in Natural Language Generation. His recent work on information extraction and statistical translation is done at the Recherche Appliquée en Linguistique Informatique (RALI) laboratory.

SPEAKER:             Marina Sokolova
                                University of Ottawa

DATE:                    Thursday 27th September, 2001  

TITLE:                    The Decision List Machine  

ABSTRACT:  We learn decision lists over a space of features that are constructed from the data. The result is a practical machine which we call the Decision List Machine (DLM). We construct a DLM that uses generalized balls as data-dependent features. We have compared its practical performance on “real life” data sets with the performance of other learning algorithms such as the Set Covering Machine and the Support Vector Machine. The performance is evaluated for both symmetric and asymmetric loss coefficients. The comparison shows that learners can benefit from using the DLM. We also provide a theoretical assessment of the DLM's performance by computing upper bounds on its generalization error.
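
As an illustration of the kind of learner described above, here is a minimal sketch of a decision list built from ball-shaped, data-dependent features (a training point as centre, a radius, and an output label). The greedy rule selection, radius choice and stopping criterion are our own illustrative assumptions, not the construction or the bounds presented in the talk.

    import numpy as np

    def greedy_ball_decision_list(X, y, max_rules=10):
        """Toy decision list whose rules are balls around training points.
        Greedily pick the ball that covers the most still-unexplained points
        while staying pure; uncovered points fall through to a default label.
        Labels are assumed to be small non-negative integers (e.g. 0/1)."""
        rules, remaining = [], np.ones(len(y), dtype=bool)
        for _ in range(max_rules):
            best = None
            for i in np.where(remaining)[0]:
                d = np.linalg.norm(X[remaining] - X[i], axis=1)
                other = d[y[remaining] != y[i]]
                # largest radius that keeps the ball pure on the remaining data
                radius = other.min() if other.size else d.max() + 1.0
                covered = int(np.sum(d < radius))
                if best is None or covered > best[0]:
                    best = (covered, X[i].copy(), radius, y[i])
            if best is None:
                break
            _, centre, radius, label = best
            rules.append((centre, radius, label))
            remaining &= np.linalg.norm(X - centre, axis=1) >= radius
            if not remaining.any():
                break
        default = int(np.bincount(y).argmax())
        return rules, default

    def predict(rules, default, x):
        for centre, radius, label in rules:      # first matching ball wins
            if np.linalg.norm(x - centre) < radius:
                return label
        return default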

 

SPEAKER:             Peter Turney, National Research Council Ottawa  

DATE:                    Thursday 4th October, 2001
 
PLACE:                  Room 318, third floor, MacDonald Hall

TITLE:                    Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL

ABSTRACT:         This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise Mutual Information (PMI) and Information Retrieval (IR) to measure the similarity of pairs of words. PMI-IR is empirically evaluated using 80 synonym test questions from the Test of English as a Foreign Language (TOEFL) and 50 synonym test questions from a collection of tests for students of English as a Second Language (ESL). On both tests, the algorithm obtains a score of 74%. PMI-IR is contrasted with Latent Semantic Analysis (LSA), which achieves a score of 64% on the same 80 TOEFL questions. The paper discusses potential applications of the new unsupervised learning algorithm and some implications of the results for LSA and LSI (Latent Semantic Indexing).
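
For readers unfamiliar with PMI, here is a minimal sketch of the scoring idea, assuming a hypothetical hits() function standing in for a Web search engine's hit counts; the actual query operators (e.g. NEAR) and the score variants evaluated in the paper are not reproduced here.

    from math import log

    def hits(query):
        """Hypothetical stand-in for the IR step: the number of documents a
        Web search engine reports for `query`."""
        raise NotImplementedError

    def pmi_score(problem_word, choice_word, corpus_size=1e9):
        """Pointwise Mutual Information estimated from hit counts:
        log [ p(problem & choice) / (p(problem) * p(choice)) ].
        corpus_size is an assumed document count; it only shifts every score
        by the same constant, so the ranking of the choices is unaffected."""
        joint = hits(f'"{problem_word}" AND "{choice_word}"')
        return log((joint * corpus_size + 1.0)
                   / (hits(problem_word) * hits(choice_word) + 1.0))

    def best_synonym(problem_word, choices):
        """Answer a synonym test question by picking the highest-scoring choice."""
        return max(choices, key=lambda c: pmi_score(problem_word, c))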

 
BIO:        Dr. Turney is a Senior Research Officer in the Interactive Information Group of the National Research Council. In 1988, he obtained his PhD from the University of Toronto, where he then accepted a Postdoctoral Fellowship. He joined the NRC in 1989, and he has since worked on a variety of projects, all involving the application of machine learning to industrial problems. His recent work focuses on machine learning applied to information retrieval and summarization. He is the author or co-author of more than fifty publications, a past editor of Canadian Artificial Intelligence magazine, and an Associate Editor of the Journal of Artificial Intelligence Research.

   

SPEAKER:             Doina Precup, School of Computer Science, McGill University

DATE:                    Monday 15th October, 2001

PLACE:                  Room 318, third floor, MacDonald Hall

TITLE:                    Off-Policy Temporal Difference Learning with Function Approximation

ABSTRACT:         In reinforcement learning, an agent generally learns from experience, that is, from the sequence of states, actions, and rewards generated by the agent's interaction with its environment. This data is affected by the decision-making policy used by the agent to select its actions, and thus the result of learning is often a function of the agent's policy. For example, the common subproblem of policy evaluation is to learn the expected future reward available to the agent from each state when the agent is following its policy. In general, however, it can be useful to learn about policies that are different from the current behavior. This process is called off-policy learning.

Off-policy learning can speed up learning significantly, because the agent can use its experience to learn about many different policies in parallel, even though it can only follow one policy at a time. However, off-policy methods have been known to diverge when the value function is represented using a function approximator.

In this talk, I will introduce the first algorithm for off-policy reinforcement learning that is stable with linear function approximation. The main idea is to use importance sampling corrections in order to account for the difference between the agent's behavior policy and the policy about which it is trying to learn. Our convergence result is currently limited to episodic tasks. [This is joint work with Dr. Richard S. Sutton from AT&T Research Labs.]
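
As a rough illustration of the importance sampling idea, here is a minimal sketch of an importance-weighted TD(0) update with linear function approximation; it is only the basic corrected update under our own simplifying assumptions, not the full episodic algorithm or the convergence construction presented in the talk.

    import numpy as np

    def off_policy_td0_episode(episode, w, phi, pi, b, alpha=0.05, gamma=0.99):
        """One pass of importance-sampled TD(0) with a linear value estimate
        v(s) ~= w . phi(s).  `episode` is a list of (s, a, r, s_next) tuples
        gathered under the behaviour policy b; pi(a, s) and b(a, s) give the
        two policies' action probabilities.  Terminal states are assumed to
        have phi(s_next) = 0."""
        for s, a, r, s_next in episode:
            rho = pi(a, s) / b(a, s)                              # importance sampling correction
            delta = r + gamma * (w @ phi(s_next)) - (w @ phi(s))  # TD error
            w = w + alpha * rho * delta * phi(s)                  # corrected update
        return w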

BIO:        Doina Precup received her M.S. and PhD degrees from the Department of Computer Science, University of Massachusetts, Amherst, in 1997 and 2000 respectively. Her research focuses on reinforcement learning. In her PhD thesis, she developed the options framework, a new way of representing knowledge about temporally extended actions in reinforcement learning.

 

SPEAKER:             Stan Szpakowicz
                                (joint work with Terry Copeck, Nathalie Japkowicz)
                                School of Information Technology and Engineering,
                                University of Ottawa

DATE:                    Thursday 25th Oct, 2001

PLACE:                  Room 318, third floor, MacDonald Hall

TITLE:                    Text Summarization as Controlled Search

ABSTRACT: We present a framework for text summarization based on the generate-and-test model. A large set of summaries is generated for all plausible values of six parameters that control a three-stage process that includes segmentation and keyphrase extraction, and a number of features that characterize the document. Quality is assessed by measuring the summaries against the abstract of the summarized document. The large number of summaries produced for our corpus dictates automated validation and fine-tuning of the summary generator. We use supervised machine learning to detect good and bad parameters. In particular, we identify parameters and ranges of their values within which the summary generator might be used with high reliability on documents for which no author's abstract exists.
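
A minimal sketch of the generate-and-test loop described above, with a made-up parameter grid and hypothetical summarize() and score_against_abstract() callables; the six real parameters, the three-stage generator and the supervised learner that maps parameter values to summary quality are not reproduced here.

    from itertools import product

    # Illustrative stand-in for the six controlling parameters.
    PARAM_GRID = {
        "segmenter": ["blocks", "topical"],
        "keyphrases_per_segment": [3, 5, 8],
        "summary_length": [0.1, 0.2, 0.3],   # fraction of the source document
    }

    def generate_and_test(document, author_abstract, summarize, score_against_abstract):
        """Generate one summary per parameter combination and score it against
        the author's abstract; the scored runs are the raw material for learning
        which parameter ranges are reliable."""
        runs = []
        for values in product(*PARAM_GRID.values()):
            params = dict(zip(PARAM_GRID, values))
            summary = summarize(document, **params)
            quality = score_against_abstract(summary, author_abstract)
            runs.append((params, quality))
        return sorted(runs, key=lambda run: run[1], reverse=True)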

 

SPEAKER:             Nathalie Japkowicz
                                School of Information Technology and Engineering,
                                University of Ottawa

DATE:                    Thursday 1st November, 2001

PLACE:                  Room 318, third floor, MacDonald Hall

TITLE:                    The Class Imbalance Problem: A Systematic Study
 

ABSTRACT: In machine learning problems, differences in prior class probabilities--or class imbalances--have been reported to hinder the performance of some standard classifiers, such as decision trees. This paper presents a systematic study aimed at answering three different questions. First, we attempt to understand what the class imbalance problem is by establishing a relationship between concept complexity, size of the training set, and class imbalance level. Second, we discuss several basic re-sampling or cost-modifying methods previously proposed to deal with class imbalances and compare their effectiveness. Finally, we investigate the assumption that the class imbalance problem affects not only decision tree systems but also other classification systems such as Neural Networks and Support Vector Machines.

This work was done in collaboration with Shaju Stephen.
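
As a concrete example of the basic re-sampling methods referred to above, here is a minimal sketch of random over-sampling of the minority class (random under-sampling of the majority class is analogous); the specific re-sampling and cost-modifying methods compared in the study are not reproduced here.

    import random

    def oversample_minority(examples, labels):
        """Randomly duplicate minority-class examples until both classes are the
        same size (binary classification assumed)."""
        by_class = {}
        for x, y in zip(examples, labels):
            by_class.setdefault(y, []).append(x)
        minority, majority = sorted(by_class, key=lambda c: len(by_class[c]))
        shortfall = len(by_class[majority]) - len(by_class[minority])
        extra = [random.choice(by_class[minority]) for _ in range(shortfall)]
        xs = by_class[majority] + by_class[minority] + extra
        ys = ([majority] * len(by_class[majority])
              + [minority] * (len(by_class[minority]) + shortfall))
        return xs, ys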

   

SPEAKER:             E. Milios
                                Faculty of Computer Science,
                                Dalhousie University

DATE:                    Friday 9th Nov, 2001

PLACE:                  Room 318, third floor, MacDonald Hall

TITLE:                    Similarity and Structure in Networked Information Spaces

ABSTRACT:

Large networked information spaces, such as the World Wide Web and the Citation Graph, are becoming amenable to representation and analysis with desktop computing resources. Search engines have been indexing the WWW for some time now, an exercise that requires substantial computing resources, which are growing exponentially, just as the WWW itself is. We are interested in problems involving networked information spaces that can be solved with constant computing resources, independent of the size of the space, either with or without the help of search engines (indexing and back-link tracing). We will discuss approaches to three problems related to networked information spaces.

1. Characterization of the Citation Graph for Computer Science. We use various techniques from graph theory to study the properties of the citation graph, constructed from ResearchIndex (Citeseer). Results so far have shown that the citation graph is highly connected, and that such connectivity does not depend on particular hub or authority papers.

2. Similarity measure for computer science papers based on the similarity of their neighbourhoods in the citation graph, defined either in terms of the size of a minimum cut separating the two papers or in terms of the overlap in authority papers between the two neighbourhoods. The method has been seen to uncover unexpected relations (a minimal sketch of the min-cut variant follows this list).

3. Construction of a highly connected context graph for focused crawling on the WWW.
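
A minimal sketch of the min-cut variant of the neighbourhood similarity in item 2, assuming the citation graph is available as a networkx graph; the neighbourhood construction and the authority-overlap variant studied in this work are not reproduced here.

    import networkx as nx

    def mincut_similarity(citation_graph, paper_a, paper_b, radius=2):
        """Similarity of two papers as the size of a minimum edge cut separating
        them within the subgraph induced by their citation neighbourhoods: the
        more edge-disjoint paths connect the two papers, the larger the cut and
        the more strongly related they are taken to be."""
        undirected = citation_graph.to_undirected()
        # Union of the two papers' neighbourhoods within `radius` hops.
        nodes = set(nx.ego_graph(undirected, paper_a, radius=radius)) \
              | set(nx.ego_graph(undirected, paper_b, radius=radius))
        neighbourhood = undirected.subgraph(nodes)
        return len(nx.minimum_edge_cut(neighbourhood, paper_a, paper_b))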

BIO:        Evangelos Milios received a diploma in Electrical Engineering from the National Technical University of Athens, Greece, and Master's and Ph.D. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology, Cambridge, Massachusetts. While at M.I.T., he was a member of the Digital Signal Processing group and worked on acoustic signal interpretation problems at the M.I.T. Lincoln Laboratory. After spending 5 years doing research on shape analysis, sensor-based robotics, and signal processing in the Department of Computer Science, University of Toronto, he joined York University as an Associate Professor. Since July 1998 he has been with the Faculty of Computer Science, Dalhousie University, where he is Professor and Director of the Graduate Program. He is a Senior Member of the IEEE. He served as a member of the ACM Dissertation Award committee (1990-1992) and has been a member of the AAAI/SIGART Doctoral Consortium Committee since 1997. He has published on the processing, interpretation and use of visual and range signals for landmark-based navigation and map construction in single- and multi-agent robotics, and on computational auditory scene analysis. His current research activity is centered on software agents for Web information retrieval.

   

SPEAKER:             Yoshua Bengio
                                (joint work with Pascal Vincent)
                                Département d'Informatique et Recherche Opérationnelle
                                Université de Montréal

DATE:                    Thursday 15th Nov, 2001
 
PLACE:                  Room 318, third floor, MacDonald Hall

TITLE:                    Non-Parametric Models for High-Dimensional Data
 
ABSTRACT:

Classical non-parametric models for regression, classification or density estimation may yield poor results when the number of dimensions of the data becomes large, due to the so-called 'curse of dimensionality'. We present some of our recent results in several directions to deal with this problem. We first try to understand why Support Vector Machines, which are essentially non-parametric, can sometimes perform much better than more classical non-parametric models. This suggests new algorithms, also inspired by prior work on Tangent Distance, in which one tries to infer the local structure of the data in order to tackle a weakness of Nearest-Neighbor algorithms in high dimensions with finite samples. Experiments suggest that the proposed solutions bridge the gap between Nearest-Neighbors and Support Vector Machines. The talk will then move to the problem of estimating joint probability functions for high-dimensional discrete data, and in particular language data. What local regularities could be exploited there? Artificial neural networks can be used to discover a similarity function between complex objects, yielding much better perplexity results than non-parametric models based on n-grams. This suggests ways to improve on n-gram models. A smoothing principle for discrete data is thus discovered and yields a new non-parametric approach to modeling high-dimensional discrete data.
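
Since the comparison above is stated in terms of perplexity, here is a minimal sketch of that measure, assuming a hypothetical model_prob(word, context) returning a smoothed model's predicted probability of the next word; nothing in it is specific to the neural-network or n-gram models discussed in the talk.

    from math import exp, log

    def perplexity(model_prob, tokens, order=3):
        """Perplexity of a language model on a token sequence: the exponential of
        the average negative log-probability of each word given its preceding
        context (lower is better).  Assumes model_prob is smoothed (never 0)."""
        total = 0.0
        for i in range(1, len(tokens)):
            context = tuple(tokens[max(0, i - order + 1):i])
            total -= log(model_prob(tokens[i], context))
        return exp(total / (len(tokens) - 1))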

   

SPEAKER:             Marvin Zaluski
                                National Research Council of Canada

DATE:                    Thursday 29th Nov, 2001
 
PLACE:                  Room 318, third floor, MacDonald Hall

TITLE:                    Knowledge from Structured Documents

ABSTRACT:

Large vehicles, such as aircraft and automobiles, are complex devices with integrated computers and sensors. These internal computers record sensor data and assist in diagnostics. When a problem occurs, an alert is generated and can be used to isolate the problem. These alerts, sensor data, and observed symptoms direct the mechanic to procedures listed in repair manuals. Unfortunately, there may be many manuals and many procedures from which the mechanic must choose, and choose quickly. The volume of information increases the difficulty of maintaining these vehicles.

Generally, knowledge systems are difficult to build and maintain, partly because domain expertise is not readily available. Yet, much knowledge is available from sources such as the repair manuals, if only we could capture it. Case Based Reasoning might be a useful technique. Our goal is to develop a system that assists the mechanic in selecting the best procedure for the current problem.

In this talk, we propose an automated method of acquiring Cases from the equipment's repair manuals. These Cases are then used within our Case Based Reasoning system. Another essential element is the current sensor data. The repair manual Cases and sensor data can assist in making equipment maintenance decisions. We will show how this can be applied to the real world in the aerospace domain.
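
As a rough illustration of how such Cases might be used at repair time, here is a minimal sketch of similarity-based case retrieval; the case representation, alert codes, manual references and the role of sensor data in the actual system are all hypothetical placeholders.

    def retrieve_procedure(cases, observed):
        """Rank repair-manual cases by the overlap (Jaccard similarity) between
        the observed alerts/symptoms and each case's stored conditions, and
        return the procedure attached to the best match."""
        observed = set(observed)
        def jaccard(case):
            conditions = set(case["conditions"])
            union = conditions | observed
            return len(conditions & observed) / len(union) if union else 0.0
        best = max(cases, key=jaccard)
        return best["procedure"], jaccard(best)

    # Hypothetical cases extracted from a manual (codes and references made up).
    cases = [
        {"conditions": {"ALT-FAIL", "BUS-VOLT-LOW"}, "procedure": "AMM 24-30-11"},
        {"conditions": {"ALT-FAIL", "OIL-PRESS-LOW"}, "procedure": "AMM 79-21-05"},
    ]
    print(retrieve_procedure(cases, {"ALT-FAIL", "BUS-VOLT-LOW", "GEN-OFF"}))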

 
