Direct conditional probability density estimation with sparse feature selection
Author / Contributors:
[Motoki Shiga, Voot Tangkaratt, Masashi Sugiyama]
Place, publisher, year:
2015
Contained in:
Machine Learning, 100/2-3 (2015-09-01), 161-182
Format:
Article (online)
Online access:
| LEADER | caa a22 4500 | ||
|---|---|---|---|
| 001 | 605478368 | ||
| 003 | CHVBK | ||
| 005 | 20210128100405.0 | ||
| 007 | cr unu---uuuuu | ||
| 008 | 210128e20150901xx s 000 0 eng | ||
| 024 | 7 | 0 | |a 10.1007/s10994-014-5472-x |2 doi |
| 035 | |a (NATIONALLICENCE)springer-10.1007/s10994-014-5472-x | ||
| 245 | 0 | 0 | |a Direct conditional probability density estimation with sparse feature selection |h [Elektronische Daten] |c [Motoki Shiga, Voot Tangkaratt, Masashi Sugiyama] |
| 520 | 3 | |a Regression is a fundamental problem in statistical data analysis, which aims at estimating the conditional mean of output given input. However, regression is not informative enough if the conditional probability density is multi-modal, asymmetric, and heteroscedastic. To overcome this limitation, various estimators of conditional densities themselves have been developed, and a kernel-based approach called least-squares conditional density estimation (LS-CDE) was demonstrated to be promising. However, LS-CDE still suffers from large estimation error if input contains many irrelevant features. In this paper, we therefore propose an extension of LS-CDE called sparse additive CDE (SA-CDE), which allows automatic feature selection in CDE. SA-CDE applies kernel LS-CDE to each input feature in an additive manner and penalizes the whole solution by a group-sparse regularizer. We also give a subgradient-based optimization method for SA-CDE training that scales well to high-dimensional large data sets. Through experiments with benchmark and humanoid robot transition datasets, we demonstrate the usefulness of SA-CDE in noisy CDE problems. | |
| 540 | |a The Author(s), 2015 | ||
| 690 | 7 | |a Conditional density estimation |2 nationallicence | |
| 690 | 7 | |a Feature selection |2 nationallicence | |
| 690 | 7 | |a Sparse structured norm |2 nationallicence | |
| 700 | 1 | |a Shiga |D Motoki |u Gifu University, 1-1 Yanagido, Gifu-city, 501-1193, Gifu, Japan |4 aut | |
| 700 | 1 | |a Tangkaratt |D Voot |u Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, 152-8552, Tokyo, Japan |4 aut | |
| 700 | 1 | |a Sugiyama |D Masashi |u Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, 152-8552, Tokyo, Japan |4 aut | |
| 773 | 0 | |t Machine Learning |d Springer US; http://www.springer-ny.com |g 100/2-3(2015-09-01), 161-182 |x 0885-6125 |q 100:2-3<161 |1 2015 |2 100 |o 10994 | |
| 856 | 4 | 0 | |u https://doi.org/10.1007/s10994-014-5472-x |q text/html |z Online access via DOI |
| 898 | |a BK010053 |b XK010053 |c XK010000 | ||
| 900 | 7 | |a Metadata rights reserved |b Springer special CC-BY-NC licence |2 nationallicence | |
| 908 | |D 1 |a research-article |2 jats | ||
| 949 | |B NATIONALLICENCE |F NATIONALLICENCE |b NL-springer | ||
| 950 | |B NATIONALLICENCE |P 856 |E 40 |u https://doi.org/10.1007/s10994-014-5472-x |q text/html |z Online access via DOI | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Shiga |D Motoki |u Gifu University, 1-1 Yanagido, Gifu-city, 501-1193, Gifu, Japan |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Tangkaratt |D Voot |u Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, 152-8552, Tokyo, Japan |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Sugiyama |D Masashi |u Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, 152-8552, Tokyo, Japan |4 aut | ||
| 950 | |B NATIONALLICENCE |P 773 |E 0- |t Machine Learning |d Springer US; http://www.springer-ny.com |g 100/2-3(2015-09-01), 161-182 |x 0885-6125 |q 100:2-3<161 |1 2015 |2 100 |o 10994 | ||
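The abstract (field 520) describes least-squares conditional density estimation (LS-CDE), the kernel method that the paper's SA-CDE extends with per-feature additive kernels and a group-sparse penalty. As a rough illustration of the base technique only, here is a minimal, hypothetical NumPy sketch of plain LS-CDE with Gaussian kernel basis functions; the function names, kernel width, and regularization values are illustrative assumptions, not the authors' implementation, and the sparse additive extension is not shown.

```python
import numpy as np

def lscde_fit(X, y, n_centers=50, sigma=0.5, lam=0.1, seed=0):
    """Fit LS-CDE: model p(y|x) as a nonnegative combination of Gaussian
    kernels phi_l(x, y) = exp(-||x-u_l||^2/(2s^2)) * exp(-(y-v_l)^2/(2s^2)),
    with coefficients obtained by regularized least squares.
    (Sketch only; hyperparameters are illustrative.)"""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(y), size=min(n_centers, len(y)), replace=False)
    U, v = X[idx], y[idx]                                  # kernel centers
    dx2 = ((X[:, None, :] - U[None, :, :]) ** 2).sum(-1)   # (n, b) input distances
    dy2 = (y[:, None] - v[None, :]) ** 2                   # (n, b) output distances
    phi = np.exp(-(dx2 + dy2) / (2 * sigma**2))
    h = phi.mean(axis=0)                                   # h_l = (1/n) sum_i phi_l(x_i, y_i)
    # H_{l,l'} = (1/n) sum_i  int phi_l(x_i, y) phi_l'(x_i, y) dy,
    # which has a closed form for Gaussian kernels in y.
    ex = np.exp(-dx2 / (2 * sigma**2))
    H = (ex.T @ ex / len(y)) * np.sqrt(np.pi) * sigma \
        * np.exp(-(v[:, None] - v[None, :]) ** 2 / (4 * sigma**2))
    alpha = np.linalg.solve(H + lam * np.eye(len(idx)), h)
    return np.maximum(alpha, 0.0), U, v                    # clip negative coefficients

def lscde_predict(alpha, U, v, x, ys, sigma=0.5):
    """Estimated conditional density p(y|x) on the grid `ys`, normalized numerically."""
    dx2 = ((x[None, :] - U) ** 2).sum(-1)
    dens = (alpha[None, :] * np.exp(-(dx2[None, :] + (ys[:, None] - v[None, :]) ** 2)
                                    / (2 * sigma**2))).sum(axis=1)
    z = dens.sum() * (ys[1] - ys[0])                       # Riemann-sum normalization
    return dens / max(z, 1e-12)
```

On bimodal data such as y drawn near ±1, the conditional mean estimated by ordinary regression sits near 0, while `lscde_predict` recovers both modes of p(y|x), which is exactly the limitation of regression that the abstract motivates.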