Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases
ESAIM: Probability and Statistics, Volume 21 (2017), pp. 412-451.

We investigate the optimality for model selection of the so-called slope heuristics, V-fold cross-validation and V-fold penalization in a heteroscedastic regression context with random design. We consider a new class of linear models that we call strongly localized bases and that generalize histograms, piecewise polynomials and compactly supported wavelets. We derive sharp oracle inequalities that prove the asymptotic optimality of the slope heuristics (when the optimal penalty shape is known) and of V-fold penalization. Furthermore, V-fold cross-validation seems to be suboptimal for a fixed value of V, since it asymptotically recovers the oracle learned from a sample whose size is a fraction $1 - V^{-1}$ of the original amount of data. Our results are based on genuine concentration inequalities for the true and empirical excess risks that are of independent interest. Our experiments show the good behavior of the slope heuristics for the selection of linear wavelet models. They also show that V-fold cross-validation and V-fold penalization have comparable efficiency.
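To make the role of the fraction $1 - V^{-1}$ concrete, here is a minimal sketch of V-fold cross-validation used to choose the dimension of a regular histogram model, histograms being one of the model families the abstract lists. It is an illustration under stated assumptions, not the authors' procedure or experimental code: the function names (`training_fraction`, `fit_histogram`, `predict_histogram`, `vfold_select`), the design on [0, 1], and the convention for empty cells are all hypothetical choices made for this example.

```python
import numpy as np


def training_fraction(V):
    """Fraction of the sample used for training in each of the V folds."""
    return 1.0 - 1.0 / V


def fit_histogram(x, y, n_bins):
    """Least-squares fit on the regular histogram model with n_bins cells on [0, 1]."""
    cells = np.minimum((x * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(cells, weights=y, minlength=n_bins)
    counts = np.bincount(cells, minlength=n_bins)
    # Empty cells get the value 0 (an arbitrary convention for this sketch).
    return np.divide(sums, counts, out=np.zeros(n_bins), where=counts > 0)


def predict_histogram(means, x):
    """Evaluate a fitted piecewise-constant estimator at the points x."""
    n_bins = len(means)
    cells = np.minimum((x * n_bins).astype(int), n_bins - 1)
    return means[cells]


def vfold_select(x, y, dims, V=5, seed=0):
    """Return the dimension in dims with the smallest V-fold CV risk estimate."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), V)
    scores = {}
    for d in dims:
        errs = []
        for k in range(V):
            test = folds[k]
            # Each estimator is trained on about n * (1 - 1/V) observations.
            train = np.concatenate([folds[j] for j in range(V) if j != k])
            means = fit_histogram(x[train], y[train], d)
            resid = y[test] - predict_histogram(means, x[test])
            errs.append(np.mean(resid ** 2))
        scores[d] = float(np.mean(errs))
    best = min(scores, key=scores.get)
    return best, scores
```

Calling `vfold_select(x, y, [1, 2, 4, 8, 16, 32], V=5)` on a sample with heteroscedastic noise returns the selected number of cells and the estimated risk for each candidate. Since every estimator is fitted on only `training_fraction(V)` of the data, the criterion targets the best model for that reduced sample size, which is the source of the suboptimality for fixed V described in the abstract.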

DOI: 10.1051/ps/2017005
Classification: 62G08, 62G09
Keywords: Nonparametric regression, heteroscedastic noise, random design, model selection, cross-validation, wavelets
Navarro, Fabien 1 ; Saumard, Adrien 1

1 CREST, ENSAI, Campus de Ker-Lann, rue Blaise Pascal, BP 37203, 35172 Bruz Cedex, France.
@article{PS_2017__21__412_0,
     author = {Navarro, Fabien and Saumard, Adrien},
     title = {Slope heuristics and {V-Fold} model selection in heteroscedastic regression using strongly localized bases},
     journal = {ESAIM: Probability and Statistics},
     pages = {412--451},
     publisher = {EDP-Sciences},
     volume = {21},
     year = {2017},
     doi = {10.1051/ps/2017005},
     mrnumber = {3743921},
     zbl = {1395.62093},
     language = {en},
     url = {http://www.numdam.org/articles/10.1051/ps/2017005/}
}
Navarro, Fabien; Saumard, Adrien. Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases. ESAIM: Probability and Statistics, Volume 21 (2017), pp. 412-451. doi: 10.1051/ps/2017005. http://www.numdam.org/articles/10.1051/ps/2017005/
