ABSTRACT:
Performance measures play important roles in Machine Learning. They
are used not only as criteria to evaluate learning algorithms, but
also as heuristics to construct learning models. However, little work
has been done to thoroughly explore the characteristics of
performance measures.
We first formally propose criteria for comparing performance
measures. We then compare, both theoretically and empirically, the
two most popular measures: accuracy and AUC (Area Under the ROC
Curve). We show that AUC is statistically consistent and more
discriminant than accuracy, which indicates that AUC should be
preferred over accuracy for evaluating learning algorithms. We also
compare ranking measures and give a preference order for using these
measures when comparing ranking performance.
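To illustrate what "more discriminant" means, here is a minimal
sketch (not from the paper; the two classifiers and their scores are
hypothetical, and scikit-learn is assumed to be available): two
classifiers can tie on accuracy at a fixed threshold yet differ in
AUC, because AUC is sensitive to the full ranking of examples.

    # Minimal sketch: equal accuracy, different AUC.
    # The scores below are hypothetical, chosen so both classifiers
    # make exactly two threshold errors but rank examples differently.
    from sklearn.metrics import accuracy_score, roc_auc_score

    y_true   = [0, 0, 0, 1, 1, 1]
    scores_a = [0.1, 0.2, 0.6, 0.4, 0.8, 0.9]    # classifier A
    scores_b = [0.1, 0.2, 0.6, 0.4, 0.7, 0.55]   # classifier B

    for name, scores in [("A", scores_a), ("B", scores_b)]:
        preds = [int(s >= 0.5) for s in scores]  # threshold at 0.5
        acc = accuracy_score(y_true, preds)
        auc = roc_auc_score(y_true, scores)
        print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")

Both classifiers reach accuracy 0.667, but A has AUC 0.889 while B
has AUC 0.778: accuracy cannot distinguish the two, while AUC can.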
Based on the comparison criteria, we propose a general approach to
constructing new measures from existing ones. We also compare
artificial neural network models trained with the newly constructed
measures against models trained with the existing measures. The
experiments show that the model trained with the newly constructed
measure outperforms the models trained with the existing measures.