TAMALE SEMINAR

The Text Analysis and Machine Learning Group

TAMALE is a world-class center of excellence in Data Mining, Text Mining, and Language Engineering.

What is Knowledge Management?

Knowledge management is considered here as a research field that combines Data Mining, Text Mining and Language Engineering, and builds on the technologies of Databases, Data Warehousing and Knowledge Bases. Specifically:

Data Mining is a field of research that extracts implicit, actionable knowledge from large, often unstructured data repositories. Data Warehousing addresses the organization of this data. For instance, data mining methods analyze gene expression data collected on DNA chips and build a temporal model of how the genes interact.

Text Mining, also known as Information Extraction from Text, identifies specific information of interest to the user in large repositories of texts (e.g. news wire, Internet, virtual libraries). For instance, a text mining system will provide geneticists with knowledge of all interactions of two specific proteins mentioned in the bibliographic database MedLine within the last three years.

Language Engineering combines techniques, representations, and algorithms that are collectively capable of advanced information processing involving natural language, e.g. answering questions about the contents of a book, summarizing and compressing texts, etc. For instance, a system reading texts on proteomics and summarizing them for a given research group, and ranking them with respect to their relevance for the group's interests, is an example of a Language Engineering application.

The Team

Full-time faculty:

Dr. Diana Inkpen; Natural Language Processing, Machine Learning for Text Classification
Dr. Nathalie Japkowicz; Data Mining/Machine Learning, Natural Language Processing (now at American University in Washington, DC)
Dr. Stan Matwin; Integrated Data Mining and applications (now retired, Canada Research Chair at Dalhousie University)
Dr. Stan Szpakowicz; Natural Language Processing (now retired, professor emeritus)
Dr. Thomas Tran; E-commerce applications
Dr. Marcel Turcotte; Data Mining, Bioinformatics
Dr. Herna Viktor; Machine Learning Foundations and Algorithms, Online Learning and Rare Events, Data-driven discovery and Big Data Applications
Dr. Yongyi Mao; Theoretical Machine Learning
Dr. Paula Branco; Machine Learning, Imbalanced Data
Dr. Andre Vellino; Information Systems (cross-appointed from the School of Information Studies)
Dr. Marina Sokolova; Machine Learning for E-health (cross-appointed with Faculty of Medicine)

Research Associate:

Dr. Jelber Sayyad Shirabad; Machine Learning and Application to Software Engineering

Below we refer to the group members by their initials.

Brief History

Research in machine learning started at the University of Ottawa in 1988, when S. Matwin created the Ottawa Machine Learning Group. Initially it focused on analytical machine learning and on fundamental issues of inductive learning. R. C. Holte joined in 1989. Over time, strong links were built with the Natural Language Lab led by S. Szpakowicz, and in 1995 a joint group was launched under the name Text Analysis and Machine Learning (TAMALE). At different times, M. Marchand, K. Barker, C. Barriere, N. Japkowicz, L. Peyton, H. Viktor, I. Kiringa, D. Inkpen, Y. Mao, and P. Branco joined the group. Some of them have since left.

TAMALE quickly gained international status in a number of areas of data mining and language engineering, e.g. Inductive Logic Programming, Performance Evaluation, Learning and NL, and Syntactic Processing of NL. In 2000, L. Duquette from the Second Language Institute, with the active participation of C. Barriere, S. Matwin, S. Szpakowicz, and D. Inkpen, created the Groupe de recherche en ingénierie de la langue (GRIL), a vehicle for collaboration among researchers from computer science, linguistics, and psychology working on language engineering.

Current application areas in TAMALE span bioinformatics, computational linguistics, e-commerce and e-health.
