Asymptotic analysis of estimators on multi-label data

Authors / Contributors:
[Andreas Streich, Joachim Buhmann]
Place, Publisher, Year:
2015
Contained in:
Machine Learning, 99/3(2015-06-01), 373-409
Format:
Article (online)
ID: 605478503
LEADER caa a22 4500
001 605478503
003 CHVBK
005 20210128100406.0
007 cr unu---uuuuu
008 210128e20150601xx s 000 0 eng
024 7 0 |a 10.1007/s10994-014-5457-9  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s10994-014-5457-9 
245 0 0 |a Asymptotic analysis of estimators on multi-label data  |h [Electronic data]  |c [Andreas Streich, Joachim Buhmann] 
520 3 |a Multi-label classification extends the standard multi-class classification paradigm by dropping the assumption that classes have to be mutually exclusive, i.e., the same data item might belong to more than one class. Multi-label classification has many important applications in, e.g., signal processing, medicine, biology and information security, but the analysis and understanding of the inference methods based on data with multiple labels are still underdeveloped. In this paper, we formulate a general generative process for multi-label data, i.e., we associate each label (or class) with a source. To generate multi-label data items, the emissions of all sources in the label set are combined. In the training phase, only the probability distributions of these (single-label) sources need to be learned. Inference on multi-label data requires solving an inverse problem; models of the data generation process therefore require additional assumptions to guarantee well-posedness of the inference procedure. Similarly, in the prediction (test) phase, the distributions of all single-label sources in the label set are combined using the combination function to determine the probability of a label set. We formally describe several previously presented inference methods and introduce a novel, general-purpose approach, where the combination function is determined based on the data and/or on a priori knowledge of the data generation mechanism. This framework includes cross-training and new source training (also named label power set method) as special cases. We derive an asymptotic theory for estimators based on multi-label data and investigate the consistency and efficiency of estimators obtained by several state-of-the-art inference techniques. Several experiments confirm these findings and emphasize the importance of a sufficiently complex generative model for real-world applications. 
540 |a The Author(s), 2014 
690 7 |a Generative model  |2 nationallicence 
690 7 |a Asymptotic analysis  |2 nationallicence 
690 7 |a Multi-label classification  |2 nationallicence 
690 7 |a Consistency  |2 nationallicence 
700 1 |a Streich  |D Andreas  |u Science and Technology Group, Phonak AG, Laubisrütistrasse 28, 8712, Stäfa, Switzerland  |4 aut 
700 1 |a Buhmann  |D Joachim  |u Department of Computer Science, ETH Zurich, Universitätstrasse 6, 8092, Zurich, Switzerland  |4 aut 
773 0 |t Machine Learning  |d Springer US; http://www.springer-ny.com  |g 99/3(2015-06-01), 373-409  |x 0885-6125  |q 99:3<373  |1 2015  |2 99  |o 10994 
856 4 0 |u https://doi.org/10.1007/s10994-014-5457-9  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s10994-014-5457-9  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Streich  |D Andreas  |u Science and Technology Group, Phonak AG, Laubisrütistrasse 28, 8712, Stäfa, Switzerland  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Buhmann  |D Joachim  |u Department of Computer Science, ETH Zurich, Universitätstrasse 6, 8092, Zurich, Switzerland  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Machine Learning  |d Springer US; http://www.springer-ny.com  |g 99/3(2015-06-01), 373-409  |x 0885-6125  |q 99:3<373  |1 2015  |2 99  |o 10994
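
Illustrative note on the abstract: the generative process it describes (one source per label, with the emissions of all sources in the label set merged by a combination function) might be sketched as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian sources, their parameters, the additive combination function, and all names (SOURCES, combine_sum, sample_item, sample_dataset) are hypothetical choices made for illustration.

```python
import random

# Hypothetical single-label sources: one emission distribution per label.
# Gaussian sources and these parameter values are assumptions for illustration.
SOURCES = {
    "a": (0.0, 1.0),  # (mean, standard deviation)
    "b": (5.0, 1.0),
}


def combine_sum(emissions):
    """One possible combination function: add the emissions of all active sources."""
    return sum(emissions)


def sample_item(label_set, rng, combine=combine_sum):
    """Draw one emission from each source in the label set, then combine them."""
    emissions = [rng.gauss(*SOURCES[label]) for label in sorted(label_set)]
    return combine(emissions)


def sample_dataset(label_sets, n_per_set, seed=0):
    """Generate (label set, observation) pairs for each requested label set."""
    rng = random.Random(seed)
    return [
        (frozenset(labels), sample_item(labels, rng))
        for labels in label_sets
        for _ in range(n_per_set)
    ]


def mean_for(data, label_set):
    """Empirical mean of the observations carrying exactly this label set."""
    values = [x for labels, x in data if labels == frozenset(label_set)]
    return sum(values) / len(values)


data = sample_dataset([{"a"}, {"b"}, {"a", "b"}], n_per_set=1000)
```

Under the additive assumption, the empirical mean of items labelled {"a", "b"} should lie close to the sum of the two single-source means, which is the kind of structure an estimator could exploit when only the single-label source distributions are learned.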