The Text Analysis and Machine Learning Group
 
 
SPEAKER:   Messaouda Ouerd
TOPIC:     Learning in Belief Networks and its Application in Distributed Databases
DATE:
PLACE:     Room 318, MacDonald Hall
ABSTRACT:
In this talk we present the problem of learning in belief networks and its
application to caching data with repeated read-only accesses in distributed
databases. Our goal is to build a probabilistic network from the distribution
of the data that adequately represents the data. We describe two classes of
techniques for inducing Bayesian networks from data: methods based on
probabilistic-graph models and methods using a Bayesian learning approach.
The probabilistic methods for learning Bayesian Belief Networks (BBNs) focus
on finding the most likely structure, implicitly assuming that there is a
true underlying structure. The Bayesian methods search the space of
network-structure hypotheses for the network that maximizes the relative
a posteriori probability. Once constructed, such a network can provide
insight into the probabilistic dependencies that exist among the variables.
We consider representations using Chow's dependence trees and polytrees
(singly connected networks) as structures for inferring causal knowledge.
We apply this approach to learn patterns or sequences in query accesses to
distributed databases.
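The dependence trees mentioned in the abstract can be built with the classic Chow-Liu construction: estimate mutual information between every pair of variables from the data, then keep a maximum-weight spanning tree. The sketch below is illustrative, assuming binary variables and a toy data set; it is not code from the talk.

```python
# Sketch of Chow's dependence-tree construction: pairwise mutual
# information from data, then a maximum spanning tree (Kruskal).
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(samples, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), c in pij.items():
        mi += (c / n) * log((c * n) / (pi[a] * pj[b]))
    return mi

def chow_liu_tree(samples):
    """Edges of a maximum spanning tree over mutual-information weights."""
    k = len(samples[0])
    edges = sorted(((mutual_information(samples, i, j), i, j)
                    for i, j in combinations(range(k), 2)), reverse=True)
    parent = list(range(k))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for w, i, j in edges:              # Kruskal's algorithm
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Toy binary data: column 2 is an exact copy of column 0, so the
# tree should connect 0 and 2 directly.
data = [(0, 0, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
print(chow_liu_tree(data))
```

The resulting tree can then be oriented and its conditional probability tables estimated from the same data, which is the starting point for the polytree representations the talk considers.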

SPEAKER:   Dr. Ken Barker
TOPIC:     Identifying Semantic Relationships in Complex Noun Phrases
DATE:
PLACE:     Room 318, MacDonald Hall
ABSTRACT:
    Complex noun phrases carry much of the information in English texts.
    Unfortunately for systems that want to get at that information, there are
    few surface indicators of the underlying meaning of a noun phrase. Such
    systems must compensate for the lack of clues with other information. One
    way is to load the system with lexical semantics for nouns and adjectives.
    This merely shifts the problem elsewhere: how do we define the lexical
    semantics and build large semantic lexicons? Another way is to find
    constructions similar to a given noun phrase, for which the semantic
    relationships among components are already known. In this talk I will
    present a semi-automatic system that identifies semantic relationships in
    noun phrases without using precoded noun or adjective semantics. Instead,
    partial matching on similar, previously analyzed noun phrases leads to a
    tentative interpretation of a new input, which is accepted or corrected by a
    cooperative user. I will break the bad news: similarity is not easily
    assessed, similar analyzed constructions may not exist, and if they do
    exist, their analyses may not be appropriate for the current phrase. I will
    also share the good news: processing can start with no prior analyses, and
    as more noun phrases are analyzed, the system learns to find better
    interpretations and reduces its reliance on the user. This talk contains no
sliding boxes.
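The core idea of interpretation by partial matching can be sketched as a nearest-neighbour lookup over previously analyzed modifier-head pairs. The relation labels, examples, and word-overlap similarity below are illustrative assumptions, not the talk's actual system, which would also let a cooperative user accept or correct the tentative answer.

```python
# Toy partial matching: a new (modifier, head) pair receives the
# relation of the most similar previously analyzed pair.
analyses = {
    ("wood", "stove"): "material",    # illustrative prior analyses
    ("gas", "stove"): "instrument",
    ("wood", "house"): "material",
    ("summer", "house"): "time",
}

def similarity(pair_a, pair_b):
    """Crude similarity: number of words the two pairs share."""
    return len(set(pair_a) & set(pair_b))

def interpret(pair, analyses):
    """Relation of the best-matching prior analysis, or None if no overlap."""
    best = max(analyses, key=lambda known: similarity(pair, known))
    return analyses[best] if similarity(pair, best) > 0 else None

print(interpret(("wood", "table"), analyses))   # matches on "wood"
```

The sketch also makes the talk's "bad news" concrete: a pair like ("quantum", "flux") shares nothing with the stored analyses and yields no interpretation, and a shared word does not guarantee the borrowed relation is appropriate.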

SPEAKER:   Johanne Morin
TOPIC:     Learning Relational Clichés with Contextual Generalization
DATE:
PLACE:     Room 318, MacDonald Hall
ABSTRACT:
Concept learners learn the definition of a concept from positive and negative
examples of the concept. The definitions learned describe as many of the
positives and as few of the negatives as possible, and are then used to
classify unknown examples as positive or negative. Many existing systems
learn concepts one feature at a time; such systems have trouble learning
definitions with interdependent features. The FOCL system (Pazzani et al.,
1991) addressed this problem by giving the concept learner hand-made
"clichés", combinations of features supplied in advance. The problem is that
these clichés are hard to derive. I developed CLUse (Clichés Learned and
Used) to learn clichés automatically. Empirical testing shows that CLUse can
provide concept learners with useful clichés learned across domains.
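The difficulty with interdependent features can be shown in a few lines: on a concept where the class depends on two features jointly, every single-feature test is uninformative, while a two-feature cliché separates the examples perfectly. The example below is a toy illustration of that point, not code from CLUse or FOCL.

```python
# The concept is "x1 == x2": pure feature interdependence.
positives = [(0, 0), (1, 1)]
negatives = [(0, 1), (1, 0)]

def gain(test, pos, neg):
    """How well a boolean test separates positives from negatives."""
    return sum(test(e) for e in pos) - sum(test(e) for e in neg)

# Every single-feature test covers half the positives and half the
# negatives, so a one-feature-at-a-time learner sees zero gain.
single = [lambda e, i=i, v=v: e[i] == v for i in (0, 1) for v in (0, 1)]
print([gain(t, positives, negatives) for t in single])

# A cliché combining both features separates the classes perfectly.
cliche = lambda e: e[0] == e[1]
print(gain(cliche, positives, negatives))
```

A greedy learner that can add the cliché as a single unit escapes the plateau that blocks it when features are considered one at a time.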

SPEAKER:   Mauricio de Almeida
TOPIC:     Learning (Tree/Rule)-like boolean C++ methods
DATE:
PLACE:     Room 318, MacDonald Hall
ABSTRACT:
Decision trees and rule sets are commonly used languages for describing
learned concepts. Although these representations are easy to read, and the
learners that generate them can often evaluate their performance on a testing
set, the resulting rule sets and trees are not directly implementable in the
systems that are meant to use them.
This problem, among others, suggested an approach in which C++ classes
equivalent to a decision tree, a rule set, or a mixture of the two are
learned directly from a set of examples represented as attribute-value
vectors. In this seminar we present the main ideas behind the Knowledge
Embedding Learning system, which we are now implementing.
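The gap between a learned tree and implementable code can be bridged mechanically: a decision tree maps directly onto nested if/else statements. The generator below is a minimal sketch of that idea with a toy tree; the emitted C++ is illustrative of the approach, not the Knowledge Embedding Learning system's actual output.

```python
# A toy decision tree: internal nodes name the attribute to test,
# leaves are boolean classifications.
tree = ("outlook", {
    "sunny": ("humidity_high", {"yes": False, "no": True}),
    "rain": True,
})

def emit(node, indent="    "):
    """Recursively render a tree as nested C++ if statements."""
    if isinstance(node, bool):
        return indent + f"return {str(node).lower()};\n"
    attr, branches = node
    out = ""
    for value, child in branches.items():
        out += indent + f'if ({attr} == "{value}") {{\n'
        out += emit(child, indent + "    ")
        out += indent + "}\n"
    return out + indent + "return false;  // default\n"

cpp = ('bool classify(std::string outlook, std::string humidity_high) {\n'
       + emit(tree) + "}\n")
print(cpp)
```

A full system would also emit the surrounding class declaration and handle rule sets and numeric attribute thresholds, but the translation step itself stays this direct.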

SPEAKER:   Mario Jarmasz
TOPIC:     Corpus Linguistics: a paradigm for solving NLP problems
DATE:
PLACE:     Room 318, MacDonald Hall
ABSTRACT:
The development of large electronic corpora for use in Computational
Linguistics started in the late 1970s. Advances in software and NLP
technologies have facilitated the transformation of text archives into
electronic corpora. Many researchers have turned to Corpus Linguistics in
the past decade to develop large-scale linguistic applications. The use of
large corpora is not a new concept in Linguistics; the richness of the
corpora, the increase in their size, and the fact that many are easily
accessible are some of the reasons that make Corpus Linguistics attractive
today. In this talk I will present the different aspects of Corpus
Linguistics. I will introduce a definition of the corpus along with the
various types of corpora currently available, and give an overview of the
fields interested in corpora and of possible applications such as the
construction of an electronic thesaurus, information retrieval systems, and
machine translation systems. I will also present some statistical methods
for the empirical investigation of corpora, as well as the steps involved in
creating an electronic corpus. This presentation is based on the book Les
linguistiques de corpus (Habert, Nazarenko, Salem, 1997).
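The statistical methods the abstract mentions start from very simple primitives: token counts, vocabulary size, and frequency lists. The snippet below runs those primitives over a toy "corpus" to make the idea concrete; real corpus studies apply the same operations at million-word scale.

```python
# Basic corpus statistics over a toy text sample.
from collections import Counter
import re

corpus = ("The cat sat on the mat. The mat was flat. "
          "A corpus is a body of text collected for linguistic study.")

tokens = re.findall(r"[a-z]+", corpus.lower())   # crude tokenization
freq = Counter(tokens)

print("tokens:", len(tokens))        # corpus size in running words
print("types:", len(freq))           # vocabulary size
print("top 3:", freq.most_common(3)) # most frequent words
```

From these counts follow the standard empirical measures: type/token ratio, rank-frequency (Zipf) curves, and collocation scores such as mutual information between co-occurring words.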