Random projections as regularizers: learning a linear discriminant from fewer observations than dimensions

Authors / Contributors:
[Robert Durrant, Ata Kabán]
Place, publisher, year:
2015
Contained in:
Machine Learning, 99/2(2015-05-01), 257-286
Format:
Article (online)
ID: 605478538
LEADER caa a22 4500
001 605478538
003 CHVBK
005 20210128100406.0
007 cr unu---uuuuu
008 210128e20150501xx s 000 0 eng
024 7 0 |a 10.1007/s10994-014-5466-8  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s10994-014-5466-8 
245 0 0 |a Random projections as regularizers: learning a linear discriminant from fewer observations than dimensions  |h [Electronic data]  |c [Robert Durrant, Ata Kabán] 
520 3 |a We prove theoretical guarantees for an averaging-ensemble of randomly projected Fisher linear discriminant classifiers, focusing on the case when there are fewer training observations than data dimensions. The specific form and simplicity of this ensemble permits a direct and much more detailed analysis than existing generic tools in previous works. In particular, we are able to derive the exact form of the generalization error of our ensemble, conditional on the training set, and based on this we give theoretical guarantees which directly link the performance of the ensemble to that of the corresponding linear discriminant learned in the full data space. To the best of our knowledge these are the first theoretical results to prove such an explicit link for any classifier and classifier ensemble pair. Furthermore we show that the randomly projected ensemble is equivalent to implementing a sophisticated regularization scheme to the linear discriminant learned in the original data space and this prevents overfitting in conditions of small sample size where pseudo-inverse FLD learned in the data space is provably poor. Our ensemble is learned from a set of randomly projected representations of the original high dimensional data and therefore for this approach data can be collected, stored and processed in such a compressed form. We confirm our theoretical findings with experiments, and demonstrate the utility of our approach on several datasets from the bioinformatics domain and one very high dimensional dataset from the drug discovery domain, both settings in which fewer observations than dimensions are the norm. 
540 |a The Author(s), 2014 
690 7 |a Random projections  |2 nationallicence 
690 7 |a Ensemble learning  |2 nationallicence 
690 7 |a Linear discriminant analysis  |2 nationallicence 
690 7 |a Compressed learning  |2 nationallicence 
690 7 |a Learning theory  |2 nationallicence 
700 1 |a Durrant  |D Robert  |u Department of Statistics, University of Waikato, 3240, Hamilton, New Zealand  |4 aut 
700 1 |a Kabán  |D Ata  |u School of Computer Science, University of Birmingham, B15 2TT, Edgbaston, UK  |4 aut 
773 0 |t Machine Learning  |d Springer US; http://www.springer-ny.com  |g 99/2(2015-05-01), 257-286  |x 0885-6125  |q 99:2<257  |1 2015  |2 99  |o 10994 
856 4 0 |u https://doi.org/10.1007/s10994-014-5466-8  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s10994-014-5466-8  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Durrant  |D Robert  |u Department of Statistics, University of Waikato, 3240, Hamilton, New Zealand  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Kabán  |D Ata  |u School of Computer Science, University of Birmingham, B15 2TT, Edgbaston, UK  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Machine Learning  |d Springer US; http://www.springer-ny.com  |g 99/2(2015-05-01), 257-286  |x 0885-6125  |q 99:2<257  |1 2015  |2 99  |o 10994
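
Illustrative note on the abstract (field 520): the abstract describes an averaging ensemble of Fisher linear discriminant (FLD) classifiers, each learned on a randomly projected copy of the data. The following is a minimal sketch of one reading of that idea, not the authors' implementation. The function names, the Gaussian projection matrices, the equal class priors, the pooled covariance estimate, and the parameters `k` and `n_members` are all assumptions made for illustration.

```python
import numpy as np


def fit_rp_fld_ensemble(X, y, k, n_members, rng):
    """Average the discriminants of FLDs fitted in random k-dim projections.

    Assumes two classes labelled 0 and 1, equal priors, and k <= N - 2 so
    that each projected covariance matrix is invertible (hypothetical setup).
    """
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

    # Pooled covariance estimate; singular when N < d.
    centered = np.vstack([X0 - m0, X1 - m1])
    S = centered.T @ centered / (len(X) - 2)

    d = X.shape[1]
    delta = m1 - m0
    w = np.zeros(d)
    for _ in range(n_members):
        # Gaussian random projection from d down to k dimensions.
        R = rng.standard_normal((k, d))
        # FLD direction in the projected space, mapped back to data space:
        # R^T (R S R^T)^{-1} R (m1 - m0).
        w += R.T @ np.linalg.solve(R @ S @ R.T, R @ delta)
    w /= n_members

    # Threshold at the midpoint of the class means (equal priors).
    b = -w @ (m0 + m1) / 2
    return w, b


def predict(X, w, b):
    """Return label 1 where the averaged discriminant is positive, else 0."""
    return (X @ w + b > 0).astype(int)
```

Because every ensemble member is linear, averaging the members' decision functions collapses into a single weight vector `w` in the original data space. In this sketch, that is the sense in which the ensemble acts like a regularized linear discriminant even when the full covariance estimate `S` is singular.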