DATE: Tuesday, Mar. 6, 2007
TIME: 2:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: Scaling up Decision-Tree Learning.
PRESENTER: Jiang Su
University of New Brunswick
ABSTRACT:

There is growing interest in scaling up the widely used decision-tree learning algorithms to large-scale data mining. Although numerous diverse techniques have been proposed, a fast tree-growing algorithm that avoids a substantial decrease in accuracy or a substantial increase in space complexity is still needed. In this talk, we present a novel, fast decision-tree learning algorithm based on a conditional independence assumption. The new algorithm runs in time linear in the number of attributes, compared with the quadratic time complexity of the standard decision-tree learning algorithm C4.5, and requires only a linear increase in space. Experiments show that our algorithm is competitive with C4.5 in accuracy on a large number of UCI benchmark data sets, and is both more accurate and significantly faster than C4.5 on a large number of text classification data sets.
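
To give a flavor of why a conditional-independence-style treatment can keep tree growing linear in the number of attributes, the sketch below scores every candidate split from class-conditional frequency counts gathered in a single pass over the data, instead of repeatedly re-partitioning the data for each attribute. This is only an illustrative Python sketch under that assumption; the function names, dictionary-based data layout, and the information-gain criterion shown here are not taken from the talk and need not match the presenter's actual algorithm.

    from collections import defaultdict
    from math import log2

    def entropy(counts):
        """Entropy of a class-count dictionary."""
        total = sum(counts.values())
        return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)

    def information_gain_per_attribute(examples, labels):
        """Score every attribute in one pass over the data.

        Builds class-conditional frequency tables joint[attr][value][class]
        once, then computes information gain for each attribute from those
        counts -- cost linear in (#examples x #attributes), with no repeated
        partitioning of the data set.
        """
        n = len(examples)
        class_counts = defaultdict(int)
        joint = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

        for x, y in zip(examples, labels):      # single pass over the data
            class_counts[y] += 1
            for attr, value in x.items():
                joint[attr][value][y] += 1

        base = entropy(class_counts)
        gains = {}
        for attr, value_table in joint.items():
            remainder = sum(
                (sum(cls.values()) / n) * entropy(cls)
                for cls in value_table.values()
            )
            gains[attr] = base - remainder
        return gains

    # Tiny usage example (hypothetical attribute-value records).
    examples = [
        {"outlook": "sunny", "windy": "no"},
        {"outlook": "sunny", "windy": "yes"},
        {"outlook": "rain",  "windy": "no"},
        {"outlook": "rain",  "windy": "yes"},
    ]
    labels = ["play", "stay", "play", "stay"]
    print(information_gain_per_attribute(examples, labels))

Because all per-attribute statistics come from the same pass, the scoring step scales linearly with the number of attributes; the extra storage is just the frequency tables, a linear space overhead in the spirit of the abstract's claim.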