DATE: Wednesday, Nov. 14, 2007
TIME: 4:00 pm
PLACE: Council Room (SITE 5-084)
TITLE: Decision Tree Instability and Active Learning
PRESENTERS: Ken Dwyer, University of Alberta
ABSTRACT:
Decision tree learning algorithms produce accurate
models that can be interpreted by domain experts.
However, these algorithms are known to be unstable --
they can produce drastically different hypotheses from
training sets that differ just slightly. This
instability undermines the objective of extracting
knowledge from the trees. In this paper, we study the
instability of the C4.5 decision tree learner in the
context of active learning. We introduce a new measure
of decision tree stability, and define three aspects of
active learning stability. Several existing active
learning methods that use C4.5 as a component are
compared empirically; it is determined that
query-by-bagging yields trees that are more stable and
accurate than those produced by competing methods.
Also, an alternative splitting criterion, DKM, is found
to improve the stability and accuracy of C4.5 in the
active learning setting.
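As a rough illustration of the alternative splitting criterion mentioned in the abstract, the sketch below compares the DKM impurity with binary entropy for a two-class problem. It assumes the usual two-class form of DKM, 2 * sqrt(p * (1 - p)), and uses plain entropy-based gain in place of C4.5's full gain-ratio machinery. The function names and the (positives, negatives) count format are hypothetical choices made for this example, not taken from the paper.

```python
import math


def entropy(p):
    """Binary entropy in bits for a positive-class proportion p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def dkm(p):
    """Two-class DKM impurity: 2 * sqrt(p * (1 - p))."""
    return 2.0 * math.sqrt(p * (1.0 - p))


def split_gain(impurity, parent, children):
    """Impurity reduction achieved by a candidate split.

    parent and each entry of children are (positives, negatives)
    counts; this input format is an assumption made for the sketch.
    """
    def node_impurity(pos, neg):
        total = pos + neg
        return impurity(pos / total) if total else 0.0

    n = sum(parent)
    weighted = sum(
        (pos + neg) / n * node_impurity(pos, neg) for pos, neg in children
    )
    return node_impurity(*parent) - weighted


if __name__ == "__main__":
    parent = (8, 8)
    candidate = [(7, 1), (1, 7)]  # made-up counts for illustration
    print("entropy gain:", split_gain(entropy, parent, candidate))
    print("DKM gain:    ", split_gain(dkm, parent, candidate))
```

Both impurities are 0 for a pure node and 1 for an evenly mixed one; they differ in between, so the two criteria can rank candidate splits differently.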