DATE: | Tuesday, Mar. 6, 2007 |
TIME: | 2:30 pm |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Scaling up Decision-Tree Learning. |
PRESENTER: | Jiang Su, University of New Brunswick |
ABSTRACT:
There is growing interest in scaling up widely used decision-tree learning algorithms to large-scale data mining. Although numerous diverse techniques have been proposed, what is still needed is a fast tree-growing algorithm that neither substantially decreases accuracy nor substantially increases space complexity. In this talk, we present a novel, fast decision-tree learning algorithm based on a conditional independence assumption. The new algorithm has time complexity linear in the number of attributes, lower than the quadratic time complexity of the standard decision-tree learner C4.5, at the cost of only a linear increase in space. Experiments show that our algorithm is competitive with C4.5 in accuracy on a large number of UCI benchmark data sets, and is both more accurate and significantly faster than C4.5 on a large number of text classification data sets. |
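The abstract does not spell out the algorithm, but the cost being discussed can be illustrated with a generic sketch: evaluating a candidate split requires computing a score (here, information gain) for each attribute, and each score takes one pass over the data, so the per-node cost grows linearly with the number of attributes. The function names, toy dataset, and use of information gain below are illustrative assumptions, not the presenter's actual method.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Reduction in label entropy from partitioning rows on one attribute."""
    n = len(labels)
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[row[attr_index]].append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy dataset: each row is (outlook, windy); label = play tennis?
rows = [("sunny", "no"), ("sunny", "no"), ("rain", "yes"), ("rain", "no")]
labels = ["yes", "yes", "no", "no"]

# One pass per attribute: total cost is linear in the number of attributes.
gains = [information_gain(rows, labels, i) for i in range(2)]
best = max(range(2), key=lambda i: gains[i])  # attribute 0 (outlook) separates the classes perfectly
```

In a full tree learner this selection step is repeated at every node, which is where C4.5's higher overall complexity comes from; the speedup described in the talk comes from cheapening this evaluation via the conditional independence assumption.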