The Text Analysis and Machine Learning Group
SPEAKER: Igor Jurisica, Ontario Cancer Institute
DATE:
TITLE: Systematic Knowledge Management:
ABSTRACT:
Biomedical domains are characterized by substantial amounts of complex data, many unknowns, a lack of complete theories, and rapid evolution. In such domains, reasoning is often based on experience rather than on general knowledge. Experts remember positive experiences (cases) for possible reuse of their solutions; negative experiences are used to avoid potentially unsuccessful outcomes. In general, knowledge management systems support the representation, organization, acquisition, creation, usage, and evolution of knowledge in its many forms. Knowledge-based decision-support systems can successfully support problem solving by aiding classification, diagnosis, planning, or prediction. The competence and performance of these systems depend on the knowledge, its representation schema, and the reasoning mechanism used. It is usually desirable to have an expressive knowledge representation and a scalable reasoning mechanism, where both are also understandable to human users. In addition, the system must be flexible in order to support domain knowledge evolution.

Existing decision-support systems have mostly been designed and implemented by knowledge engineers, using domain expert knowledge compiled into appropriate representation schemes. However, manually evolving the system to meet changing user needs and domain knowledge is neither efficient nor even feasible in many situations. Preferably, we would like the system to assist in knowledge evolution automatically. We show how a case-based reasoning (CBR) system can be used for knowledge management in biology and medicine. CBR is an important paradigm for knowledge management because: (a) it is similar to human problem solving and thus can complement the user; (b) it supports evolving domain models and thus can help to increase domain understanding; and (c) it diminishes the problem of exceptions and over-generalizations.

We discuss extensions of the CBR paradigm with automated image-analysis and knowledge-discovery tools. These are used to support systematic knowledge management, and thus help to ensure that important information is not missing or incorrect, that the representation formalism is not unduly limited, and that reasoning is not restricted to only a subset of the available experience.
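As a minimal sketch of the retrieve-and-reuse cycle at the heart of CBR (the case features, weights, and similarity measure below are invented for illustration; they are not details of the system described above):

```python
# Minimal case-based reasoning retrieval: find the stored case most
# similar to a new problem and reuse its solution.
# All feature names and weights here are invented for illustration.

def similarity(problem, case_features, weights):
    """Weighted overlap between two feature dictionaries (0..1)."""
    total = sum(weights.values())
    score = sum(w for f, w in weights.items()
                if problem.get(f) == case_features.get(f))
    return score / total

case_base = [
    {"features": {"tissue": "breast", "stain": "high", "grade": 2},
     "solution": "protocol A"},
    {"features": {"tissue": "lung", "stain": "low", "grade": 3},
     "solution": "protocol B"},
]
weights = {"tissue": 2.0, "stain": 1.0, "grade": 1.0}

def retrieve(problem):
    """Return the nearest case; its solution is the candidate for reuse."""
    return max(case_base,
               key=lambda c: similarity(problem, c["features"], weights))

new_problem = {"tissue": "breast", "stain": "low", "grade": 2}
print(retrieve(new_problem)["solution"])  # reuse: "protocol A"
```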
SPEAKER: George Foster, Département d'informatique et recherche opérationnelle
DATE:
TITLE: A Maximum Entropy Translation Model
ABSTRACT:
Statistical translation (from one human language to another) can be cast as a classification problem which requires that diverse sources of predictive information be combined in an effective way. Two statistical machine learning techniques which provide flexible frameworks for combining different information sources are linear interpolation and Maximum Entropy / Minimum Divergence (MEMD) modelling. In this talk, I will describe a simple linear translation model and an equivalent MEMD model, and compare their performance using a theoretical measure. I will also illustrate the practical differences between these models by showing how they perform in a prototype text-prediction tool for translators.
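To make the contrast concrete, here is a hedged sketch of the two combination schemes over two invented information sources (the scores and weights are made up; this is not Foster's actual model):

```python
import math

# Two invented information sources scoring candidate target words w
# for some source context: a translation score and a language-model score.
trans = {"maison": 0.7, "domicile": 0.2, "foyer": 0.1}
lm    = {"maison": 0.4, "domicile": 0.5, "foyer": 0.1}

# Linear interpolation: a weighted average of the component distributions.
lam = 0.6
linear = {w: lam * trans[w] + (1 - lam) * lm[w] for w in trans}

# Log-linear (maximum-entropy style) combination: a weighted product of
# the components, renormalized to sum to one.
alpha, beta = 1.0, 0.8
unnorm = {w: math.exp(alpha * math.log(trans[w]) + beta * math.log(lm[w]))
          for w in trans}
z = sum(unnorm.values())
memd = {w: v / z for w, v in unnorm.items()}

print(linear)
print(memd)
```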
SPEAKER: Tony White, Systems and Computer Engineering
DATE:
TITLE: SynthECA: A Synthetic Ecology of Chemical Agents
ABSTRACT:
Emergence, or self-organization, is increasingly associated with distributed or decentralized problem solving. Naturally occurring systems demonstrate that complex problem solving with simple agents is possible, and that such systems are extremely robust to changes in their environment and to the loss of individual agents. Furthermore, such systems are self-managing, requiring no "blind watchmaker" to ensure that the system operates efficiently. Systems that depend upon the interaction of a large number of simple agents for their problem-solving properties stand in stark contrast to the monolithic agent systems that have been the target of considerable research. The frailty of symbolic systems in the face of unforeseen situations and the failure of individual agents provides motivation for the research documented in this talk. Social insect behaviour, and swarm systems generally, provide the inspiration for the research described here, as does the increasing importance of activation-oriented computing. This talk proposes an agent architecture that relies on local communication only, with no algorithm having knowledge of the global scene. The agent architecture is analysed in terms of the requirements necessary to support emergent problem solving. Solutions to problems in the communications domain, specifically routing and fault localization, demonstrate the utility of the proposed architecture, along with algorithms that demonstrate the self-management of agent swarms.
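A toy sketch of the kind of local, pheromone-style mechanism such agents rely on (the network, update rule, and constants are assumptions for illustration, not SynthECA's actual chemical-agent design):

```python
import random

# Agents deposit "chemical" markers on links; each agent sees only the
# marker levels on links leaving its current node (local information only).
links = {("A", "B"): 1.0, ("A", "C"): 1.0}  # initial marker concentration
EVAPORATION, DEPOSIT = 0.05, 1.0

def choose_next(node):
    """Pick an outgoing link with probability proportional to its marker."""
    out = [(l, p) for l, p in links.items() if l[0] == node]
    r = random.uniform(0, sum(p for _, p in out))
    for link, p in out:
        r -= p
        if r <= 0:
            return link
    return out[-1][0]  # guard against floating-point residue

def step(node, good_link):
    for l in links:                 # evaporation keeps the system adaptive
        links[l] *= (1 - EVAPORATION)
    link = choose_next(node)
    if link == good_link:           # reinforce links that worked
        links[link] += DEPOSIT

for _ in range(100):
    step("A", ("A", "B"))
print(links)  # marker concentrates on the successful route A -> B
```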
SPEAKER: Robert C. Holte
DATE:
TITLE: Combinatorial Auctions and Knapsack Problems
ABSTRACT:
This talk will largely be tutorial in nature. I will introduce "combinatorial auctions", a topic of much recent interest in AI which naturally generalizes to the multi-dimensional knapsack problem. I will discuss applications (including weighted constraint satisfaction problems) and review the main algorithms and results for these problems. I will also present new results showing that simple hill-climbing, with an appropriate objective function, does well on these problems.
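A minimal sketch of hill-climbing on the winner-determination problem of a combinatorial auction (the bids, the flip move set, and the plain revenue objective are invented; the objective function used in the talk may differ):

```python
# Winner determination: choose a set of non-conflicting bids
# (no item sold twice) maximizing total revenue.
bids = [({"a", "b"}, 10), ({"b", "c"}, 8), ({"c"}, 5), ({"a"}, 4)]

def feasible(chosen):
    """True if no item appears in two accepted bids."""
    items = set()
    for i in chosen:
        if bids[i][0] & items:
            return False
        items |= bids[i][0]
    return True

def value(chosen):
    return sum(bids[i][1] for i in chosen)

def hill_climb():
    """Flip single bids in/out while any flip improves revenue."""
    current, improved = set(), True
    while improved:
        improved = False
        for i in range(len(bids)):
            cand = current ^ {i}
            if feasible(cand) and value(cand) > value(current):
                current, improved = cand, True
    return current

best = hill_climb()
print(best, value(best))  # accepts bids {a,b} and {c}: revenue 15
```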
SPEAKER:
DATE:
TITLE: Support Data Machines and Semantic Kernels for Text
ABSTRACT:
The learning principle of Support Vector Machines can be applied to any data (not only vectors) that can be described as proximity data by means of a kernel. In many pattern recognition applications, this leads one to consider using more expressive, possibly structured, representations of the data, combined with well-founded statistical learning tools. Text categorization, and more generally information retrieval tasks, offer a very promising application domain for testing these ideas. We explore and compare different ways of representing documents, relying on kernels that encode semantic relationships between words. Results on some text categorization tasks are presented and discussed.
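One way to read "kernels that encode semantic relationships" is the following sketch, where a word-relatedness matrix S smooths the bag-of-words comparison (S here is a made-up stand-in for a real semantic resource):

```python
import numpy as np

# Bag-of-words vectors over the vocabulary [car, automobile, banana].
d1 = np.array([1.0, 0.0, 0.0])   # "car"
d2 = np.array([0.0, 1.0, 0.0])   # "automobile"

# Plain linear kernel: synonyms look completely dissimilar.
print(d1 @ d2)                   # 0.0

# Semantic smoothing matrix: S[i, j] encodes relatedness of words i and j.
S = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

def semantic_kernel(x, y):
    """k(x, y) = x^T S S^T y: documents are compared after being
    mapped through the word-relatedness matrix."""
    return x @ S @ S.T @ y

print(semantic_kernel(d1, d2))   # > 0: the synonymy is now visible
```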
SPEAKER: Mario Marchand
DATE:
TITLE: Learning with the Set Covering Machine
ABSTRACT:
We generalize the classical algorithms of Valiant and Haussler for learning conjunctions and disjunctions of Boolean attributes to the problem of learning these functions over arbitrary sets of features. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the Set Covering Machine. We present a version of the Set Covering Machine that uses generalized balls for its set of features, and compare its performance to that of the famous Support Vector Machine.
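A hedged sketch of the greedy set-covering step that gives the machine its name, using simple Boolean threshold features rather than the generalized balls of the talk (all data and features are invented):

```python
# Set Covering Machine idea (greedy sketch): build a conjunction of
# Boolean features, each chosen to "cover" (reject) as many negative
# examples as possible while keeping every positive example accepted.

positives = [(1, 1), (1, 0)]
negatives = [(0, 0), (0, 1), (1, 2)]

features = {
    "x0 == 1": lambda x: x[0] == 1,
    "x1 <= 1": lambda x: x[1] <= 1,
}

def greedy_scm():
    chosen, uncovered = [], list(negatives)
    while uncovered:
        # Keep only features consistent with every positive example.
        usable = {n: f for n, f in features.items()
                  if all(f(p) for p in positives) and n not in chosen}
        # Pick the feature rejecting the most still-uncovered negatives.
        name, f = max(usable.items(),
                      key=lambda nf: sum(not nf[1](x) for x in uncovered))
        chosen.append(name)
        uncovered = [x for x in uncovered if f(x)]
    return chosen

print(greedy_scm())  # the conjunction of the selected features fits the data
```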
SPEAKER: Fabrizio Sebastiani (co-authors: Alessandro Sperduti and Nicola Valdambrini)
DATE: Friday 10th November
TITLE: An improved boosting algorithm and its application to automated text categorization
ABSTRACT:
We describe AdaBoost.MH^{KR}, an improved boosting algorithm, and its application to text categorization. Boosting is a method for supervised learning which has successfully been applied to many different domains and has proven to be one of the best performers in text categorization exercises so far. Boosting is based on the idea of relying on the collective judgment of a committee of classifiers that are trained sequentially. In training the i-th classifier, special emphasis is placed on the correct categorization of the training documents which have proven harder for the previously trained classifiers. AdaBoost.MH^{KR} is based on the idea of building, at every iteration of the learning phase, not a single classifier but a sub-committee of the K classifiers which, at that iteration, look the most promising. We report the results of systematic experimentation with this method on the standard Reuters-21578 benchmark. These experiments show that AdaBoost.MH^{KR} is both more efficient to train and more effective than the original AdaBoost.MH^{R} algorithm.
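A compact sketch of the reweighting idea that boosting rests on, using binary labels and decision stumps (this is generic AdaBoost, not the AdaBoost.MH^{KR} variant itself):

```python
import math

# Toy 1-D data: (x, label) with labels in {-1, +1}.
data = [(0.1, 1), (0.3, 1), (0.5, -1), (0.8, -1), (0.9, 1)]
weights = [1 / len(data)] * len(data)

def stump(theta):
    return lambda x: 1 if x < theta else -1

def best_stump():
    """Pick the threshold with the lowest *weighted* error."""
    cands = [stump(t) for t in (0.2, 0.4, 0.6, 1.0)]
    errs = [sum(w for (x, y), w in zip(data, weights) if h(x) != y)
            for h in cands]
    e = min(errs)
    return cands[errs.index(e)], e

ensemble = []
for _ in range(3):
    h, err = best_stump()
    alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
    ensemble.append((alpha, h))
    # Increase the weight of examples this classifier got wrong, so the
    # next round focuses on the "hard" documents.
    weights = [w * math.exp(-alpha * y * h(x))
               for (x, y), w in zip(data, weights)]
    z = sum(weights)
    weights = [w / z for w in weights]

def predict(x):
    """Committee vote, weighted by each member's reliability alpha."""
    return 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1

print([predict(x) for x, _ in data])
```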
SPEAKER: Evangelos Milios (joint work with Kori MacCara and Nathalie Japkowicz)
DATE:
TITLE: Text Similarity Based on a Lexical Ontology
ABSTRACT:
Measuring the similarity of documents is a key task in information retrieval. The classical vector representation of documents based on TF-IDF information suffers from the potential of classifying as dissimilar documents that have the same semantic content but use different words. The problem is more serious for short documents (for example, abstracts). Our research is motivated by the desire to build focused Web crawlers that rely on potentially very short text fragments, such as anchor text, to decide on the relevance of links. In this talk we describe a novel domain-specific text similarity measure that relies on a hierarchical lexical ontology to assess the similarity of documents at various levels of semantic detail. The approach is inspired by scale-space techniques for measuring shape similarity at various levels of geometric or visual detail.
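A toy sketch of the coarse-to-fine comparison (the miniature hierarchy is hand-made; the actual measure uses a full lexical ontology and a more refined matching):

```python
# Each word maps to its chain of ancestors in a tiny hand-made hierarchy;
# truncating the chains at depth k compares documents at a coarser level
# of semantic detail, echoing scale-space matching.
ontology = {
    "dog":    ["entity", "animal", "canine", "dog"],
    "wolf":   ["entity", "animal", "canine", "wolf"],
    "banana": ["entity", "plant", "fruit", "banana"],
}

def concepts(words, depth):
    return {ontology[w][min(depth, len(ontology[w]) - 1)] for w in words}

def similarity(doc_a, doc_b, depth):
    """Jaccard overlap of the concepts both documents map to at `depth`."""
    a, b = concepts(doc_a, depth), concepts(doc_b, depth)
    return len(a & b) / len(a | b)

d1, d2 = ["dog"], ["wolf"]
for depth in range(4):
    print(depth, similarity(d1, d2, depth))
# Identical at coarse levels (animal, canine); distinct only at depth 3.
```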
SPEAKER: Joel Martin, Institute for Information Technology, National Research Council of Canada
DATE:
TITLE: Surface Features for Text Classification: Beyond Words and NGrams
ABSTRACT:
What features should be used to classify text? Words, phrases, ngrams, and their frequencies are often used because they are efficient to calculate and work quite well. Unlike other recent talks, I'm going to ignore background knowledge and rely solely on surface-level features. Moreover, I am going to take an extreme perspective and consider a text feature to be a partially ordered set of substrings. For example, this abstract has the text feature '1:"classif", 2:"phrase", before(1,2)' occurring five times. In addition to words, phrases, and ngrams, this definition of a text feature includes all substrings and all subsequences.

Partly because of this power, these text features have three problems. First, it is not practical to enumerate all such features; I address this problem using simpler feature types and kernels for support vector machines. Second, most of these features are nuisance features and will lead to spurious correlations. Third, the complex features may carry no more information than is already available from single words. I try to address these last two problems empirically, but will only present preliminary evidence with some artificial collections and one real-world document collection.
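The feature from the example can be evaluated in a few lines; the sketch below counts occurrences of an ordered pair of substrings (the convention of pairing every occurrence of the first string with every later occurrence of the second is one assumption among several possible):

```python
# A text feature as a partially ordered set of substrings:
# here, the pair (s1, s2) with the constraint before(1, 2).

def occurrences(text, sub):
    """Start offsets of every (possibly overlapping) occurrence of sub."""
    out, i = [], text.find(sub)
    while i != -1:
        out.append(i)
        i = text.find(sub, i + 1)
    return out

def count_before(text, s1, s2):
    """Count pairs of occurrences where s1 starts before s2."""
    pos1, pos2 = occurrences(text, s1), occurrences(text, s2)
    return sum(1 for i in pos1 for j in pos2 if i < j)

text = "to classify text, a classifier maps each phrase to a class"
print(count_before(text, "classif", "phrase"))  # 2: both "classif" precede "phrase"
```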
SPEAKER: Nathalie Japkowicz
DATE:
TITLE: A Combination Scheme for Learning from Imbalanced Data Sets
ABSTRACT:
When learning algorithms designed to maximize accuracy are presented with class-imbalanced data sets, they typically produce biased classifiers which have high predictive accuracy over the over-represented class but low predictive accuracy over the under-represented class. Several methods have previously been proposed to address this problem, including methods that compensate for the asymmetry in the training set and methods that ignore this asymmetry. This talk is concerned with methods that compensate for the class imbalance; in particular, it addresses the problem of how to tune these methods optimally.

In the first part, I will present an experimental study designed to derive optimal parameters in domains of increasing complexity and increasing imbalance. I will show how different tunings of our compensation approach yield different solutions to our classification problems, suggesting that a combination approach may be more effective than the use of a single classifier with optimal tuning. The second part of the talk will present the combination scheme that we propose, as well as the results it yielded on a text classification task.
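A hedged sketch of the compensate-then-combine idea, using random oversampling at several rates and a majority vote (the trivial threshold classifier and the data are stand-ins; the scheme presented in the talk may differ):

```python
import random

# Imbalanced toy data: 1-D points, minority class +1 (5 of 100 examples).
majority = [(random.uniform(0.0, 1.0), -1) for _ in range(95)]
minority = [(random.uniform(0.8, 1.2), 1) for _ in range(5)]

def train_threshold(data):
    """A trivial 'classifier': the threshold minimizing training error."""
    best_t, best_err = 0.0, float("inf")
    for t in [x for x, _ in data]:
        err = sum((1 if x >= t else -1) != y for x, y in data)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Compensate at several oversampling rates, then combine by majority vote.
thresholds = []
for rate in (1, 5, 19):  # 19x roughly rebalances the 95:5 split
    sample = majority + minority * rate
    thresholds.append(train_threshold(sample))

def predict(x):
    votes = sum(1 if x >= t else -1 for t in thresholds)
    return 1 if votes > 0 else -1

print(predict(0.9), predict(0.3))
```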
SPEAKER: Petko Valtchev (Rokia Missaoui & Pierre Lebrun), Université du Québec à
DATE:
TITLE: A Partition-based Approach Towards Building Galois (concept) Lattices
ABSTRACT:
Galois lattices and formal concept analysis of binary relations have attracted significant attention, as they have proven useful in the resolution of many problems of theoretical or practical interest. Recent applications, especially in data mining and software engineering, put the emphasis on the need for algorithms that build the lattice both efficiently and flexibly.

In this talk we present a novel approach to lattice building based on the apposition of binary relation fragments, previously used for visualization purposes. We extend the existing theory into a complete characterization of the global Galois (concept) lattice as a substructure of the direct product of the partial lattices related to the relation fragments. The structural properties underlie a procedure that extracts the global lattice from the direct product, and which is the basis for a full-scale divide-and-conquer lattice-building algorithm. Our solution represents a unified framework for both incremental and batch approaches, and therefore enables the design of novel algorithms implementing hybrid batch-incremental strategies in a flexible, data-driven way. We then provide a complexity analysis together with some results on the practical performance of the algorithm. We also indicate a class of binary relations for which our algorithm outperforms some known lattice-building methods, and discuss the manifold impact of our study on the current state of knowledge in the domain.
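As a baseline for what such algorithms compute, here is a naive sketch that enumerates the formal concepts of a toy binary context by brute force (the divide-and-conquer method of the talk is, of course, far more refined):

```python
from itertools import combinations

# A small formal context: objects x attributes (a binary relation).
context = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
}
attributes = {"a", "b", "c"}

def common_attrs(objs):
    """Attributes shared by every object in objs (all attributes if empty)."""
    return set.intersection(*(context[o] for o in objs)) if objs else set(attributes)

def common_objs(attrs):
    """Objects possessing every attribute in attrs."""
    return {o for o, a in context.items() if attrs <= a}

# A formal concept is a pair (extent, intent) closed under both maps.
concepts = set()
for r in range(len(context) + 1):
    for objs in combinations(context, r):
        intent = common_attrs(set(objs))
        extent = common_objs(intent)
        concepts.add((frozenset(extent), frozenset(intent)))

for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```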