We address the problem of choosing a piecewise constant estimator of a regression function. We consider a non-Gaussian regression framework with deterministic design points, and we adopt the non-asymptotic approach to model selection via penalization developed by Birgé and Massart. Given a collection of partitions, with possibly exponential complexity, and the corresponding collection of piecewise constant estimators, we propose a penalized least squares criterion that selects a partition whose associated estimator performs approximately as well as the best one, in the sense that its quadratic risk is close to the infimum of the risks. The risk bound we provide is non-asymptotic.
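As a rough illustration of the selection principle described in the abstract, the following minimal sketch picks a partition by minimizing a penalized least squares criterion. It rests on several assumptions that are not taken from the article: the design lies in [0, 1], the collection contains only regular partitions with 1 to `max_pieces` cells, and the penalty is proportional to the number of cells divided by the sample size. The names `fit_histogram`, `select_partition`, `penalty_constant` and `noise_variance` are hypothetical; the article itself handles richer collections, for which the penalty would have to be chosen differently.

```python
import numpy as np


def fit_histogram(x, y, edges):
    """Least squares piecewise constant fit: the mean of y within each cell."""
    # Index of the cell containing each design point (last cell is closed on the right).
    cells = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, len(edges) - 2)
    fitted = np.empty(len(y), dtype=float)
    for k in np.unique(cells):
        mask = cells == k
        fitted[mask] = y[mask].mean()
    return fitted


def select_partition(x, y, max_pieces, penalty_constant, noise_variance):
    """Return (number of cells, fitted values) minimizing the penalized criterion.

    Assumed criterion: mean squared residual + penalty_constant * noise_variance * D / n,
    where D is the number of cells. This penalty shape is an illustrative assumption.
    """
    n = len(y)
    best = None
    for d in range(1, max_pieces + 1):
        edges = np.linspace(0.0, 1.0, d + 1)  # regular partition of [0, 1] into d cells
        fitted = fit_histogram(x, y, edges)
        criterion = np.mean((y - fitted) ** 2) + penalty_constant * noise_variance * d / n
        if best is None or criterion < best[0]:
            best = (criterion, d, fitted)
    return best[1], best[2]
```

With a deterministic design such as `x = (np.arange(n) + 0.5) / n`, calling `select_partition(x, y, max_pieces, penalty_constant, noise_variance)` returns the selected number of cells and the corresponding piecewise constant fit.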
Keywords: CART, change-point detection, deviation inequalities, model selection, oracle inequalities, regression
@article{PS_2009__13__70_0, author = {Sauv\'e, Marie}, title = {Histogram selection in non gaussian regression}, journal = {ESAIM: Probability and Statistics}, pages = {70--86}, publisher = {EDP-Sciences}, volume = {13}, year = {2009}, doi = {10.1051/ps:2008002}, mrnumber = {2502024}, language = {en}, url = {http://www.numdam.org/articles/10.1051/ps:2008002/} }
Sauvé, Marie. Histogram selection in non gaussian regression. ESAIM: Probability and Statistics, Volume 13 (2009), pp. 70-86. doi: 10.1051/ps:2008002. http://www.numdam.org/articles/10.1051/ps:2008002/
[1] Y. Baraud, Model selection for regression on a fixed design. Probab. Theory Related Fields 117 (2000) 467-493.
[2] Y. Baraud, F. Comte and G. Viennet, Model selection for (auto-)regression with dependent data. ESAIM: PS 5 (2001) 33-49.
[3] L. Birgé and P. Massart, Gaussian model selection. J. Eur. Math. Soc. 3 (2001) 203-268.
[4] L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. To be published in Probab. Theory Related Fields (2005).
[5] L. Birgé and Y. Rozenholc, How many bins should be put in a regular histogram. ESAIM: PS 10 (2006) 24-45.
[6] O. Bousquet, Concentration inequalities for sub-additive functions using the entropy method. Stochastic Inequalities and Applications 56 (2003) 213-247.
[7] L. Breiman, J. Friedman, R. Olshen and C. Stone, Classification and Regression Trees. Chapman & Hall (1984).
[8] G. Castellan, Modified Akaike's criterion for histogram density estimation. C.R. Acad. Sci. Paris Sér. I Math. 330 (2000) 729-732.
[9] O. Catoni, Universal aggregation rules with sharp oracle inequalities. Ann. Stat. (1999) 1-37.
[10] E. Lebarbier, Quelques approches pour la détection de ruptures à horizon fini. Ph.D. thesis, Université Paris XI Orsay (2002).
[11] G. Lugosi and A. Nobel, Consistency of data-driven histogram methods for density estimation and classification. Ann. Stat. 24 (1996) 687-706.
[12] C.L. Mallows, Some comments on $C_p$. Technometrics 15 (1973) 661-675.
[13] P. Massart, Notes de Saint-Flour. Lecture Notes to be published (2003).
[14] A. Nobel, Histogram regression estimation using data-dependent partitions. Ann. Stat. 24 (1996) 1084-1105.
[15] M. Sauvé, Sélection de modèles en régression non gaussienne. Applications à la sélection de variables et aux tests de survie accélérés. Ph.D. thesis, Université Paris XI Orsay (2006).
[16] M. Sauvé and C. Tuleau, Variable selection through CART. Research Report 5912, INRIA (2006).