DATE: Wed, Feb 12, 2014
TIME: 2:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: Denoising AutoEncoder-based Generative Minority Oversampling
PRESENTER: Colin Bellinger
University of Ottawa
ABSTRACT:

We present a novel method of synthetically oversampling the minority class based on the learning and reconstructive capabilities of denoising autoencoder (DAE) neural networks. DAE-based Generative Oversampling (AEGO) facilitates the generation of synthetic points in the data space occupied by the minority training set. Unlike SMOTE, however, the diversity of these points can be high, extending beyond the convex hull of the minority training set whilst remaining reasonably close to the training points. The classification of rare categories of gamma-ray spectra has environmental, health and security implications. Naturally, the distribution resulting from the national monitoring network is extremely imbalanced. Our five-fold cross-validated results demonstrate that AEGO leads to significantly better mean AUC results than bagged random undersampling and SMOTE. On 144 UCI-based test domains, AEGO leads to the best classification results, and has a notable advantage when the minority class is very rare.
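
The general idea of oversampling with a denoising autoencoder can be illustrated with a minimal sketch. This is not the presenter's implementation: the abstract does not describe the network architecture or the sampling procedure, so the sketch assumes a single tanh hidden layer, a linear output, Gaussian input corruption, and synthetic points obtained by corrupting minority training points and reconstructing them with the trained DAE. The names train_dae and generate_minority, the hyperparameter values, and the toy data are all hypothetical.

```python
import numpy as np

def train_dae(X, n_hidden=8, noise_std=0.1, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer denoising autoencoder by full-batch gradient descent.

    Assumed setup (not from the abstract): tanh hidden layer, linear output,
    Gaussian input corruption, squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        H = np.tanh(X_noisy @ W1 + b1)                     # encode
        X_hat = H @ W2 + b2                                # decode
        err = (X_hat - X) / n                              # target is the clean input
        dH = (err @ W2.T) * (1.0 - H ** 2)                 # backprop through tanh
        W2 -= lr * (H.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X_noisy.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def generate_minority(X_min, params, n_new, noise_std=0.1, seed=1):
    """Create synthetic points: corrupt random minority points, then reconstruct.

    Assumed sampling scheme; the abstract does not specify how AEGO samples.
    """
    W1, b1, W2, b2 = params
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X_min), size=n_new)          # pick seed points
    corrupted = X_min[idx] + rng.normal(0.0, noise_std, (n_new, X_min.shape[1]))
    return np.tanh(corrupted @ W1 + b1) @ W2 + b2

# Toy minority class: 30 two-dimensional points (illustrative data only).
rng = np.random.default_rng(42)
X_min = rng.normal(loc=[2.0, -1.0], scale=0.3, size=(30, 2))

params = train_dae(X_min)                  # the DAE sees minority points only
synth = generate_minority(X_min, params, n_new=20)
print(synth.shape)
```

Because the noise is added to the input rather than produced by interpolating between neighbouring points as in SMOTE, the reconstructed points in this sketch are not constrained to the convex hull of the minority training set. How far they spread depends on the assumed noise_std and on how well the network has been trained.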