Optimised probabilistic active learning (OPAL)
For fast, non-myopic, cost-sensitive active classification
Author / Contributors:
[Georg Krempl, Daniel Kottke, Vincent Lemaire]
Place, publisher, year:
2015
Contained in:
Machine Learning, 100/2-3(2015-09-01), 449-476
Format:
Article (online)
Online access:
https://doi.org/10.1007/s10994-015-5504-1
| LEADER | caa a22 4500 | ||
|---|---|---|---|
| 001 | 605478341 | ||
| 003 | CHVBK | ||
| 005 | 20210128100405.0 | ||
| 007 | cr unu---uuuuu | ||
| 008 | 210128e20150901xx s 000 0 eng | ||
| 024 | 7 | 0 | |a 10.1007/s10994-015-5504-1 |2 doi |
| 035 | |a (NATIONALLICENCE)springer-10.1007/s10994-015-5504-1 | ||
| 245 | 0 | 0 | |a Optimised probabilistic active learning (OPAL) |h [Electronic data] |b For fast, non-myopic, cost-sensitive active classification |c [Georg Krempl, Daniel Kottke, Vincent Lemaire] |
| 520 | 3 | |a In contrast to ever increasing volumes of automatically generated data, human annotation capacities remain limited. Thus, fast active learning approaches that allow the efficient allocation of annotation efforts gain in importance. Furthermore, cost-sensitive applications such as fraud detection pose the additional challenge of differing misclassification costs between classes. Unfortunately, the few existing cost-sensitive active learning approaches rely on time-consuming steps, such as performing self-labelling or tedious evaluations over samples. We propose a fast, non-myopic, and cost-sensitive probabilistic active learning approach for binary classification. Our approach computes the expected reduction in misclassification loss in a labelling candidate's neighbourhood. We derive and use a closed-form solution for this expectation, which considers the possible values of the true posterior of the positive class at the candidate's position, its possible label realisations, and the given labelling budget. The resulting myopic algorithm runs in the same linear asymptotic time as uncertainty sampling, while its non-myopic counterpart requires an additional factor of $O(m \cdot \log m)$ in the budget size. The experimental evaluation on several synthetic and real-world data sets shows competitive or better classification performance and runtime, compared to several uncertainty sampling- and error-reduction-based active learning strategies, both in cost-sensitive and cost-insensitive settings. | |
| 540 | |a The Author(s), 2015 | ||
| 690 | 7 | |a Active learning |2 nationallicence | |
| 690 | 7 | |a Non-myopic |2 nationallicence | |
| 690 | 7 | |a Cost-sensitive |2 nationallicence | |
| 690 | 7 | |a Unequal misclassification costs |2 nationallicence | |
| 690 | 7 | |a Misclassification loss |2 nationallicence | |
| 690 | 7 | |a Imbalanced data |2 nationallicence | |
| 690 | 7 | |a Uncertainty sampling |2 nationallicence | |
| 690 | 7 | |a Error reduction |2 nationallicence | |
| 700 | 1 | |a Krempl |D Georg |u KMD Lab, University Magdeburg, Magdeburg, Germany |4 aut | |
| 700 | 1 | |a Kottke |D Daniel |u KMD Lab, University Magdeburg, Magdeburg, Germany |4 aut | |
| 700 | 1 | |a Lemaire |D Vincent |u Orange Labs, Lannion, France |4 aut | |
| 773 | 0 | |t Machine Learning |d Springer US; http://www.springer-ny.com |g 100/2-3(2015-09-01), 449-476 |x 0885-6125 |q 100:2-3<449 |1 2015 |2 100 |o 10994 | |
| 856 | 4 | 0 | |u https://doi.org/10.1007/s10994-015-5504-1 |q text/html |z Online access via DOI |
| 898 | |a BK010053 |b XK010053 |c XK010000 | ||
| 900 | 7 | |a Metadata rights reserved |b Springer special CC-BY-NC licence |2 nationallicence | |
| 908 | |D 1 |a research-article |2 jats | ||
| 949 | |B NATIONALLICENCE |F NATIONALLICENCE |b NL-springer | ||
| 950 | |B NATIONALLICENCE |P 856 |E 40 |u https://doi.org/10.1007/s10994-015-5504-1 |q text/html |z Online access via DOI | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Krempl |D Georg |u KMD Lab, University Magdeburg, Magdeburg, Germany |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Kottke |D Daniel |u KMD Lab, University Magdeburg, Magdeburg, Germany |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Lemaire |D Vincent |u Orange Labs, Lannion, France |4 aut | ||
| 950 | |B NATIONALLICENCE |P 773 |E 0- |t Machine Learning |d Springer US; http://www.springer-ny.com |g 100/2-3(2015-09-01), 449-476 |x 0885-6125 |q 100:2-3<449 |1 2015 |2 100 |o 10994 | ||