Random forests were introduced by Breiman in 2001. We study theoretical aspects of both Breiman’s original random forests and a simplified version, the centred random forests. Under the independent and identically distributed (i.i.d.) hypothesis, Scornet, Biau and Vert proved the consistency of Breiman’s random forests, while Biau studied the simplified version and obtained a rate of convergence in the sparse case. However, the i.i.d. hypothesis is generally not satisfied, for example when dealing with time series. We extend the previous results to the case where observations are weakly dependent, more precisely when the sequences are stationary β-mixing.
DOI: 10.1051/ps/2020015
Keywords: Statistics, random forests, time-dependent processes
@article{PS_2020__24_1_801_0,
  author = {Goehry, Benjamin},
  title = {Random forests for time-dependent processes},
  journal = {ESAIM: Probability and Statistics},
  pages = {801--826},
  publisher = {EDP-Sciences},
  volume = {24},
  year = {2020},
  doi = {10.1051/ps/2020015},
  mrnumber = {4178366},
  zbl = {1455.62172},
  language = {en},
  url = {http://www.numdam.org/articles/10.1051/ps/2020015/}
}
Goehry, Benjamin. Random forests for time-dependent processes. ESAIM: Probability and Statistics, Volume 24 (2020), pp. 801-826. doi: 10.1051/ps/2020015. http://www.numdam.org/articles/10.1051/ps/2020015/
[1] Random walks with stationary increments and renewal theory. MC Tracts 112 (1979) 1–223.
[2] Analysis of a random forests model. J. Mach. Learn. Res. 13 (2012) 1063–1095.
[3] A random forest guided tour. TEST 25 (2016) 197–227.
[4] Basic properties of strong mixing conditions. A survey and some open questions. Probab. Surv. 2 (2005) 107–144.
[5] Bagging predictors. Mach. Learn. 24 (1996) 123–140.
[6] Random forests. Mach. Learn. 45 (2001) 5–32.
[7] Consistency for a simple model of random forests. Technical report (2004).
[8] Classification and Regression Trees. The Wadsworth and Brooks-Cole statistics-probability series. Taylor & Francis, Oxford (1984).
[9] Random forests for classification in ecology. Ecology 88 (2007) 2783–2792.
[10] Weak dependence, in Weak Dependence: With Examples and Applications. Springer, Berlin (2007) 9–20.
[11] Short-term load forecasting using random forests, in Intelligent Systems’2014. Springer International Publishing, Cham (2015) 821–828.
[12] Statistical learning for wind power: A modeling and stability study towards forecasting. Wind Energy 20 (2017) 2037–2047.
[13] A distribution-free theory of nonparametric regression. Springer Science & Business Media, Berlin (2006).
[14] Comparison of ARIMA and random forest time series models for prediction of avian influenza H5N1 outbreaks. BMC Bioinform. 15 (2014) 276.
[15] Random forests model for one day ahead load forecasting, in IREC2015 The Sixth International Renewable Energy Congress (2015) 1–6.
[16] Convergence and consistency of regularized boosting with weakly dependent observations. IEEE Trans. Inf. Theory 60 (2014) 651–660.
[17] Nonparametric time series prediction through adaptive model selection. Mach. Learn. 39 (2000) 5–34.
[18] Quantifying uncertainty in random forests via confidence intervals and hypothesis tests. J. Mach. Learn. Res. 17 (2016) 1–41.
[19] Newer classification and regression tree techniques: bagging and random forests for ecological prediction. Ecosystems 9 (2006) 181–199.
[20] Inequalities and limit theorems for weakly dependent sequences. Lecture (2013).
[21] On the asymptotics of random forests. J. Multivar. Anal. 146 (2016) 72–83.
[22] Consistency of random forests. Ann. Stat. 43 (2015) 1716–1741.
[23] Real-time human pose recognition in parts from single depth images. Commun. ACM 56 (2013) 116–124.
[24] Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43 (2003) 1947–1958.
[25] Estimation and inference of heterogeneous treatment effects using random forests. J. Am. Stat. Assoc. 113 (2018) 1228–1242.
[26] Rates of convergence for empirical processes of stationary mixing sequences. Ann. Prob. 22 (1994) 94–116.