Asymptotic Optimality of Likelihood-Based Cross-Validation

Authors / Contributors:
[Mark J. van der Laan, Sandrine Dudoit, Sunduz Keles]
Place, Publisher, Year:
2004
Contained in:
Statistical Applications in Genetics and Molecular Biology, 3/1(2004-03-22), 1-23
Format:
Article (online)
ID: 37892592X
LEADER caa a22 4500
001 37892592X
003 CHVBK
005 20180305123617.0
007 cr unu---uuuuu
008 161128e20040322xx s 000 0 eng
024 7 0 |a 10.2202/1544-6115.1036  |2 doi 
035 |a (NATIONALLICENCE)gruyter-10.2202/1544-6115.1036 
245 0 0 |a Asymptotic Optimality of Likelihood-Based Cross-Validation  |h [Electronic data]  |c [Mark J. van der Laan, Sandrine Dudoit, Sunduz Keles] 
520 3 |a Likelihood-based cross-validation is a statistical tool for selecting a density estimate based on n i.i.d. observations from the true density among a collection of candidate density estimators. General examples are the selection of a model indexing a maximum likelihood estimator, and the selection of a bandwidth indexing a nonparametric (e.g. kernel) density estimator. In this article, we establish a finite sample result for a general class of likelihood-based cross-validation procedures (as indexed by the type of sample splitting used, e.g. V-fold cross-validation). This result implies that the cross-validation selector performs asymptotically as well (with respect to the Kullback-Leibler distance to the true density) as a benchmark model selector which is optimal for each given dataset and depends on the true density. Crucial conditions of our theorem are that the size of the validation sample converges to infinity, which excludes leave-one-out cross-validation, and that the candidate density estimates are bounded away from zero and infinity. We illustrate these asymptotic results and the practical performance of likelihood-based cross-validation for the purpose of bandwidth selection with a simulation study. Moreover, we use likelihood-based cross-validation in the context of regulatory motif detection in DNA sequences. 
540 |a ©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston 
690 7 |a Statistical Theory and Methods  |2 nationallicence 
690 7 |a Likelihood cross-validation  |2 nationallicence 
690 7 |a maximum likelihood estimation  |2 nationallicence 
690 7 |a Kullback-Leibler divergence  |2 nationallicence 
690 7 |a density estimation  |2 nationallicence 
690 7 |a bandwidth selection  |2 nationallicence 
690 7 |a model selection  |2 nationallicence 
690 7 |a variable selection  |2 nationallicence 
700 1 |a van der Laan  |D Mark J.  |u Division of Biostatistics, School of Public Health, University of California, Berkeley  |4 aut 
700 1 |a Dudoit  |D Sandrine  |u Division of Biostatistics, School of Public Health, University of California, Berkeley  |4 aut 
700 1 |a Keles  |D Sunduz  |u Division of Biostatistics, School of Public Health, University of California, Berkeley  |4 aut 
773 0 |t Statistical Applications in Genetics and Molecular Biology  |d De Gruyter  |g 3/1(2004-03-22), 1-23  |q 3:1<1  |1 2004  |2 3  |o sagmb 
856 4 0 |u https://doi.org/10.2202/1544-6115.1036  |q text/html  |z Online access via DOI 
908 |D 1  |a research article  |2 jats 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.2202/1544-6115.1036  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a van der Laan  |D Mark J.  |u Division of Biostatistics, School of Public Health, University of California, Berkeley  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Dudoit  |D Sandrine  |u Division of Biostatistics, School of Public Health, University of California, Berkeley  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Keles  |D Sunduz  |u Division of Biostatistics, School of Public Health, University of California, Berkeley  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Statistical Applications in Genetics and Molecular Biology  |d De Gruyter  |g 3/1(2004-03-22), 1-23  |q 3:1<1  |1 2004  |2 3  |o sagmb 
900 7 |b CC0  |u http://creativecommons.org/publicdomain/zero/1.0  |2 nationallicence 
898 |a BK010053  |b XK010053  |c XK010000 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-gruyter
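
Illustration (not part of the catalog record): the abstract describes selecting a kernel bandwidth by V-fold likelihood-based cross-validation. The sketch below shows one way that idea could look in practice. It is not the authors' implementation; the function names (`gaussian_kde`, `cv_log_likelihood`, `select_bandwidth`), the Gaussian kernel, the candidate grid, the simulated sample, and the `floor` guard against log(0) are all assumptions made for illustration only.

```python
import math
import random


def gaussian_kde(train, h):
    """Return a Gaussian kernel density estimate built on `train` with bandwidth h."""
    n = len(train)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - t) / h) ** 2) for t in train)

    return density


def cv_log_likelihood(data, h, v=5, floor=1e-300):
    """Average validation log-likelihood of bandwidth h over v folds.

    `floor` is a hypothetical numerical guard so that log(0) never occurs;
    it is not the boundedness condition stated in the abstract.
    """
    folds = [data[i::v] for i in range(v)]  # interleaved split of i.i.d. data
    total = 0.0
    for i, valid in enumerate(folds):
        # Fit on every fold except the i-th, then score the held-out fold.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        f = gaussian_kde(train, h)
        total += sum(math.log(max(f(x), floor)) for x in valid)
    return total / len(data)


def select_bandwidth(data, candidates, v=5):
    """Pick the candidate bandwidth with the highest cross-validated log-likelihood."""
    return max(candidates, key=lambda h: cv_log_likelihood(data, h, v))


# Simulated example: 200 draws from a standard normal density (assumed setup).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
candidates = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
best_h = select_bandwidth(sample, candidates)
print("selected bandwidth:", best_h)
```

With V = 5 and 200 observations, each validation fold holds 40 points, so the validation sample grows with n, in line with the condition in the abstract that excludes leave-one-out cross-validation.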