Penalization versus Goldenshluger-Lepski strategies in warped bases regression

Chagny, Gaëlle

doi:10.1051/ps/2011165

ESAIM: Probability and Statistics, Volume 17 (2013), pp. 328-358.

Abstract

This paper deals with the problem of estimating a regression function $f$ in a random design framework. We build and study two adaptive estimators based on model selection, applied with warped bases. We start with a collection of finite-dimensional linear spaces spanned by orthonormal bases. Instead of expanding the target function $f$ directly on these bases, we consider the expansion of $h = f \circ G^{-1}$, where $G$ is the cumulative distribution function of the design, following Kerkyacharian and Picard [Bernoulli 10 (2004) 1053-1105]. The data-driven selection of the (best) space is done with two strategies: a penalized version of a “warped contrast”, and a model selection device in the spirit of Goldenshluger and Lepski [Ann. Stat. 39 (2011) 1608-1632]. These methods yield two functions, $\hat{h}_l$ ($l = 1, 2$), that are easier to compute than least-squares estimators. We establish nonasymptotic mean-squared integrated risk bounds for the resulting estimators, $\hat{f}_l = \hat{h}_l \circ G$ if $G$ is known, or $\hat{f}_l = \hat{h}_l \circ \hat{G}$ ($l = 1, 2$) otherwise, where $\hat{G}$ is the empirical distribution function. We also study adaptive properties when the regression function belongs to a Besov or Sobolev space, and compare the theoretical and practical performances of the two selection rules.
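The penalized warped-bases strategy summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and everything beyond the abstract is an assumption made for illustration: the trigonometric basis, the function names (`trig_basis`, `empirical_cdf`, `warped_fit`), the penalty shape `kappa * mean(Y^2) * m / n`, the default `kappa`, and the reuse of the same design sample to build the empirical distribution function. The coefficients of $h = f \circ G^{-1}$ are estimated by empirical means of $Y_i \varphi_j(\hat{G}(X_i))$, so no least-squares system has to be solved.

```python
import numpy as np


def trig_basis(u, dim):
    """First `dim` functions of the orthonormal trigonometric basis of L2([0, 1])."""
    u = np.asarray(u, dtype=float)
    cols = [np.ones_like(u)]
    k = 1
    while len(cols) < dim:
        cols.append(np.sqrt(2.0) * np.cos(2.0 * np.pi * k * u))
        if len(cols) < dim:
            cols.append(np.sqrt(2.0) * np.sin(2.0 * np.pi * k * u))
        k += 1
    return np.column_stack(cols)  # shape (len(u), dim)


def empirical_cdf(x_sample):
    """Return the empirical distribution function of the design sample."""
    xs = np.sort(np.asarray(x_sample, dtype=float))

    def G_hat(x):
        return np.searchsorted(xs, np.asarray(x, dtype=float), side="right") / xs.size

    return G_hat


def warped_fit(x, y, max_dim=15, kappa=2.0):
    """Warped-bases projection estimator with a penalized choice of dimension.

    Assumed (hypothetical) penalty: kappa * mean(y**2) * m / n.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    G_hat = empirical_cdf(x)

    # Empirical coefficients a_j = (1/n) * sum_i Y_i * phi_j(G_hat(X_i)).
    coef_full = trig_basis(G_hat(x), max_dim).T @ y / n

    # The models are nested, so the contrast at dimension m is minus the
    # sum of the first m squared coefficients; add the penalty and minimize.
    dims = np.arange(1, max_dim + 1)
    crit = -np.cumsum(coef_full ** 2) + kappa * np.mean(y ** 2) * dims / n
    best = int(dims[np.argmin(crit)])
    coef = coef_full[:best]

    def f_hat(t):
        # f_hat = h_hat o G_hat
        return trig_basis(G_hat(t), best) @ coef

    return f_hat, best
```

Calling `warped_fit(x, y)` on design points `x` and responses `y` returns the fitted function and the selected dimension. The Goldenshluger-Lepski variant would replace the penalized criterion by a comparison between estimators at different dimensions, which this sketch does not attempt.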

DOI : 10.1051/ps/2011165

Classification: 62G05, 62G08
Keywords: adaptive estimator, model selection, nonparametric regression estimation, warped bases

@article{PS_2013__17__328_0,
     author = {Chagny, Ga\"elle},
     title = {Penalization \protect\emph{versus} {Goldenshluger-Lepski} strategies in warped bases regression},
     journal = {ESAIM: Probability and Statistics},
     pages = {328--358},
     publisher = {EDP-Sciences},
     volume = {17},
     year = {2013},
     doi = {10.1051/ps/2011165},
     language = {en},
     url = {http://www.numdam.org/articles/10.1051/ps/2011165/}
}

TY  - JOUR
AU  - Chagny, Gaëlle
TI  - Penalization versus Goldenshluger-Lepski strategies in warped bases regression
JO  - ESAIM: Probability and Statistics
PY  - 2013
SP  - 328
EP  - 358
VL  - 17
PB  - EDP-Sciences
UR  - http://www.numdam.org/articles/10.1051/ps/2011165/
DO  - 10.1051/ps/2011165
LA  - en
ID  - PS_2013__17__328_0
ER  -

%0 Journal Article
%A Chagny, Gaëlle
%T Penalization versus Goldenshluger-Lepski strategies in warped bases regression
%J ESAIM: Probability and Statistics
%D 2013
%P 328-358
%V 17
%I EDP-Sciences
%U http://www.numdam.org/articles/10.1051/ps/2011165/
%R 10.1051/ps/2011165
%G en
%F PS_2013__17__328_0

Chagny, Gaëlle. Penalization versus Goldenshluger-Lepski strategies in warped bases regression. ESAIM: Probability and Statistics, Volume 17 (2013), pp. 328-358. doi: 10.1051/ps/2011165. http://www.numdam.org/articles/10.1051/ps/2011165/

References

[1] A. Antoniadis, G. Grégoire and P. Vial, Random design wavelet curve smoothing. Statist. Probab. Lett. 35 (1997) 225-232.

[2] J.Y. Audibert and O. Catoni, Robust linear least squares regression. Ann. Stat. (2011) (to appear), arXiv:1010.0074.

[3] J.Y. Audibert and O. Catoni, Robust linear regression through PAC-Bayesian truncation. Preprint, arXiv:1010.0072.

[4] Y. Baraud, Model selection for regression on a random design. ESAIM: PS 6 (2002) 127-146.

[5] A. Barron, L. Birgé and P. Massart, Risk bounds for model selection via penalization. Probab. Theory Relat. Fields 113 (1999) 301-413.

[6] J.P. Baudry, C. Maugis and B. Michel, Slope heuristics: overview and implementation. Stat. Comput. 22-2 (2011) 455-470.

[7] L. Birgé, Model selection for Gaussian regression with random design. Bernoulli 10 (2004) 1039-1051.

[8] L. Birgé and P. Massart, Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4 (1998) 329-375.

[9] L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. Probab. Theory Relat. Fields 138 (2006) 33-73.

[10] E. Brunel and F. Comte, Penalized contrast estimation of density and hazard rate with censored data. Sankhya 67 (2005) 441-475.

[11] E. Brunel, F. Comte and A. Guilloux, Nonparametric density estimation in presence of bias and censoring. Test 18 (2009) 166-194.

[12] T.T. Cai and L.D. Brown, Wavelet shrinkage for nonequispaced samples. Ann. Stat. 26 (1998) 1783-1799.

[13] G. Chagny, Régression: bases déformées et sélection de modèles par pénalisation et méthode de Lepski. Preprint, hal-00519556 v2.

[14] F. Comte and Y. Rozenholc, A new algorithm for fixed design regression and denoising. Ann. Inst. Stat. Math. 56 (2004) 449-473.

[15] R.A. DeVore and G. Lorentz, Constructive approximation, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 303. Springer-Verlag, Berlin (1993).

[16] D.L. Donoho, I.M. Johnstone, G. Kerkyacharian and D. Picard, Wavelet shrinkage: asymptopia? With discussion and a reply by the authors. J. Roy. Stat. Soc., Ser. B 57 (1995) 301-369.

[17] A. Dvoretzky, J. Kiefer and J. Wolfowitz, Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Stat. 27 (1956) 642-669.

[18] S. Efromovich, Nonparametric curve estimation: Methods, theory, and applications. Springer Series in Statistics, Springer-Verlag, New York (1999) xiv+411.

[19] J. Fan and I. Gijbels, Variable bandwidth and local linear regression smoothers. Ann. Stat. 20 (1992) 2008-2036.

[20] S. Gaïffas, On pointwise adaptive curve estimation based on inhomogeneous data. ESAIM: PS 11 (2007) 344-364.

[21] A. Goldenshluger and O. Lepski, Bandwidth selection in kernel density estimation: oracle inequalities and adaptive minimax optimality. Ann. Stat. 39 (2011) 1608-1632.

[22] G.K. Golubev and M. Nussbaum, Adaptive spline estimates in a nonparametric regression model. Teor. Veroyatnost. i Primenen. (Russian) 37 (1992) 554-561; translation in Theor. Probab. Appl. 37 (1992) 521-529.

[23] W. Härdle and A. Tsybakov, Local polynomial estimators of the volatility function in nonparametric autoregression. J. Econ. 81 (1997) 223-242.

[24] G. Kerkyacharian and D. Picard, Regression in random design and warped wavelets. Bernoulli 10 (2004) 1053-1105.

[25] T. Klein and E. Rio, Concentration around the mean for maxima of empirical processes. Ann. Probab. 33 (2005) 1060-1077.

[26] M. Köhler and A. Krzyzak, Nonparametric regression estimation using penalized least squares. IEEE Trans. Inf. Theory 47 (2001) 3054-3058.

[27] C. Lacour, Adaptive estimation of the transition density of a particular hidden Markov chain. J. Multivar. Anal. 99 (2008) 787-814.

[28] E. Nadaraya, On estimating regression. Theory Probab. Appl. 9 (1964) 141-142.

[29] T.-M. Pham Ngoc, Regression in random design and Bayesian warped wavelets estimators. Electron. J. Stat. 3 (2009) 1084-1112.

[30] A.B. Tsybakov, Introduction à l'estimation non-paramétrique, Mathématiques & Applications (Berlin), vol. 41. Springer-Verlag, Berlin (2004).

[31] G.S. Watson, Smooth regression analysis. Sankhya A 26 (1964) 359-372.

[32] M. Wegkamp, Model selection in nonparametric regression. Ann. Stat. 31 (2003) 252-273.
