Regularized feature selection in reinforcement learning
Author / Contributors:
[Dean Wookey, George Konidaris]
Place, publisher, year:
2015
Contained in:
Machine Learning, 100/2-3(2015-09-01), 655-676
Format:
Article (online)
Online access: https://doi.org/10.1007/s10994-015-5518-8
| LEADER | caa a22 4500 | ||
|---|---|---|---|
| 001 | 605478236 | ||
| 003 | CHVBK | ||
| 005 | 20210128100404.0 | ||
| 007 | cr unu---uuuuu | ||
| 008 | 210128e20150901xx s 000 0 eng | ||
| 024 | 7 | 0 | |a 10.1007/s10994-015-5518-8 |2 doi |
| 035 | |a (NATIONALLICENCE)springer-10.1007/s10994-015-5518-8 | ||
| 245 | 0 | 0 | |a Regularized feature selection in reinforcement learning |h [Electronic data] |c [Dean Wookey, George Konidaris] |
| 520 | 3 | |a We introduce feature regularization during feature selection for value function approximation. Feature regularization introduces a prior into the selection process, improving function approximation accuracy and reducing overfitting. We show that the smoothness prior is effective in the incremental feature selection setting and present closed-form smoothness regularizers for the Fourier and RBF bases. We present two methods for feature regularization which extend the temporal difference orthogonal matching pursuit (OMP-TD) algorithm and demonstrate the effectiveness of the smoothness prior: smooth Tikhonov OMP-TD and smoothness scaled OMP-TD. We compare these methods against OMP-TD, regularized OMP-TD and least squares TD with random projections, across six benchmark domains using two different types of basis functions. | |
| 540 | |a The Author(s), 2015 | ||
| 690 | 7 | |a Feature selection |2 nationallicence | |
| 690 | 7 | |a Reinforcement learning |2 nationallicence | |
| 690 | 7 | |a Function approximation |2 nationallicence | |
| 690 | 7 | |a Regularization |2 nationallicence | |
| 690 | 7 | |a Linear function approximation |2 nationallicence | |
| 690 | 7 | |a OMP-TD |2 nationallicence | |
| 700 | 1 | |a Wookey |D Dean |u School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, South Africa |4 aut | |
| 700 | 1 | |a Konidaris |D George |u Department of Computer Science, Duke University, 27708, Durham, NC, USA |4 aut | |
| 773 | 0 | |t Machine Learning |d Springer US; http://www.springer-ny.com |g 100/2-3(2015-09-01), 655-676 |x 0885-6125 |q 100:2-3<655 |1 2015 |2 100 |o 10994 | |
| 856 | 4 | 0 | |u https://doi.org/10.1007/s10994-015-5518-8 |q text/html |z Online access via DOI |
| 898 | |a BK010053 |b XK010053 |c XK010000 | ||
| 900 | 7 | |a Metadata rights reserved |b Springer special CC-BY-NC licence |2 nationallicence | |
| 908 | |D 1 |a research-article |2 jats | ||
| 949 | |B NATIONALLICENCE |F NATIONALLICENCE |b NL-springer | ||
| 950 | |B NATIONALLICENCE |P 856 |E 40 |u https://doi.org/10.1007/s10994-015-5518-8 |q text/html |z Online access via DOI | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Wookey |D Dean |u School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, South Africa |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Konidaris |D George |u Department of Computer Science, Duke University, 27708, Durham, NC, USA |4 aut | ||
| 950 | |B NATIONALLICENCE |P 773 |E 0- |t Machine Learning |d Springer US; http://www.springer-ny.com |g 100/2-3(2015-09-01), 655-676 |x 0885-6125 |q 100:2-3<655 |1 2015 |2 100 |o 10994 | ||