Nearest neighbor classification in infinite dimension

Cérou, Frédéric; Guyader, Arnaud

doi:10.1051/ps:2006014

ESAIM: Probability and Statistics, Volume 10 (2006), pp. 340-355.

Abstract

Let $X$ be a random element in a metric space $(\mathcal{F}, d)$, and let $Y$ be a random variable taking the value $0$ or $1$. $Y$ is called the class, or the label, of $X$. Let $(X_{i}, Y_{i})_{1 \leq i \leq n}$ be an observed i.i.d. sample having the same law as $(X, Y)$. The classification problem is to predict the label of a new random element $X$. The $k$-nearest neighbor classifier is the following simple rule: look at the $k$ nearest neighbors of $X$ in the trial sample and choose $0$ or $1$ for its label according to the majority vote. When $(\mathcal{F}, d) = (\mathbb{R}^{d}, \|\cdot\|)$, Stone (1977) proved the universal consistency of this classifier: its probability of error converges to the Bayes error, whatever the distribution of $(X, Y)$. We show in this paper that this result is no longer valid in general metric spaces. However, if $(\mathcal{F}, d)$ is separable and if some regularity condition is assumed, then the $k$-nearest neighbor classifier is weakly consistent.
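The majority-vote rule described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `knn_classify` and `sup_dist` are hypothetical, curves are represented by their values on a common grid, the sup-norm distance is only one possible choice of metric $d$, and breaking vote ties in favor of label $0$ is an assumption.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def knn_classify(
    x: T,
    sample: Sequence[T],
    labels: Sequence[int],
    k: int,
    dist: Callable[[T, T], float],
) -> int:
    """Predict the 0/1 label of x by majority vote among its k nearest
    neighbors in the sample, for an arbitrary user-supplied metric."""
    if len(sample) != len(labels):
        raise ValueError("sample and labels must have the same length")
    if not 1 <= k <= len(sample):
        raise ValueError("k must satisfy 1 <= k <= n")
    # Rank sample indices by distance to x. sorted() is stable, so
    # distance ties are broken by index (an assumption of this sketch).
    order = sorted(range(len(sample)), key=lambda i: dist(x, sample[i]))
    votes_for_one = sum(labels[i] for i in order[:k])
    # Label 1 needs a strict majority; vote ties go to 0 (assumption).
    return 1 if 2 * votes_for_one > k else 0


def sup_dist(f: Sequence[float], g: Sequence[float]) -> float:
    """Sup-norm distance between two curves sampled on the same grid."""
    return max(abs(a - b) for a, b in zip(f, g))


if __name__ == "__main__":
    # Toy data: curves near 0 carry label 0, curves near 1 carry label 1.
    curves = [
        (0.0, 0.1, 0.0),
        (0.1, 0.0, 0.1),
        (1.0, 0.9, 1.0),
        (0.9, 1.0, 1.1),
        (1.1, 1.0, 0.9),
    ]
    classes = [0, 0, 1, 1, 1]
    print(knn_classify((0.2, 0.1, 0.2), curves, classes, k=3, dist=sup_dist))
```

Because the metric is passed in as a function, the same rule applies unchanged to any metric space; only `dist` needs to be replaced. The sketch says nothing about consistency, which depends on the regularity condition discussed in the paper.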

Classification: 62H30
Keywords: classification, consistency, nonparametric statistics

@article{PS_2006__10__340_0,
     author = {C\'erou, Fr\'ed\'eric and Guyader, Arnaud},
     title = {Nearest neighbor classification in infinite dimension},
     journal = {ESAIM: Probability and Statistics},
     pages = {340--355},
     publisher = {EDP-Sciences},
     volume = {10},
     year = {2006},
     doi = {10.1051/ps:2006014},
     mrnumber = {2247925},
     language = {en},
     url = {http://www.numdam.org/articles/10.1051/ps:2006014/}
}

TY  - JOUR
AU  - Cérou, Frédéric
AU  - Guyader, Arnaud
TI  - Nearest neighbor classification in infinite dimension
JO  - ESAIM: Probability and Statistics
PY  - 2006
SP  - 340
EP  - 355
VL  - 10
PB  - EDP-Sciences
UR  - http://www.numdam.org/articles/10.1051/ps:2006014/
DO  - 10.1051/ps:2006014
LA  - en
ID  - PS_2006__10__340_0
ER  -

%0 Journal Article
%A Cérou, Frédéric
%A Guyader, Arnaud
%T Nearest neighbor classification in infinite dimension
%J ESAIM: Probability and Statistics
%D 2006
%P 340-355
%V 10
%I EDP-Sciences
%U http://www.numdam.org/articles/10.1051/ps:2006014/
%R 10.1051/ps:2006014
%G en
%F PS_2006__10__340_0

Cérou, Frédéric; Guyader, Arnaud. Nearest neighbor classification in infinite dimension. ESAIM: Probability and Statistics, Volume 10 (2006), pp. 340-355. doi: 10.1051/ps:2006014. http://www.numdam.org/articles/10.1051/ps:2006014/

References

[1] C. Abraham, G. Biau and B. Cadre, On the kernel rule for function classification. Submitted (2003).

[2] G. Biau, F. Bunea and M.H. Wegkamp, On the kernel rule for function classification. IEEE Trans. Inform. Theory, to appear (2005).

[3] T.M. Cover and P.E. Hart, Nearest neighbor pattern classification. IEEE Trans. Inform. Theory IT-13 (1967) 21-27.

[4] S. Dabo-Niang and N. Rhomari, Nonparametric regression estimation when the regressor takes its values in a metric space. Submitted (2001).

[5] L. Devroye, On the almost everywhere convergence of nonparametric regression function estimates. Ann. Statist. 9 (1981) 1310-1319.

[6] L. Devroye, L. Györfi, A. Krzyżak and G. Lugosi, On the strong universal consistency of nearest neighbor regression function estimates. Ann. Statist. 22 (1994) 1371-1385.

[7] L. Devroye, L. Györfi and G. Lugosi, A probabilistic theory of pattern recognition. Applications of Mathematics (New York) 31. Springer-Verlag, New York (1996).

[8] L.C. Evans and R.F. Gariepy, Measure theory and fine properties of functions. Studies in Advanced Mathematics. CRC Press, Boca Raton, FL (1992).

[9] H. Federer, Geometric measure theory. Die Grundlehren der mathematischen Wissenschaften, Band 153. Springer-Verlag New York Inc., New York (1969).

[10] D. Preiss, Gaussian measures and the density theorem. Comment. Math. Univ. Carolin. 22 (1981) 181-193.

[11] D. Preiss, Dimension of metrics and differentiation of measures, in General topology and its relations to modern analysis and algebra, V (Prague, 1981), Sigma Ser. Pure Math. 3, Heldermann, Berlin (1983) 565-568.

[12] D. Preiss and J. Tišer, Differentiation of measures on Hilbert spaces, in Measure theory, Oberwolfach 1981 (Oberwolfach, 1981), Lect. Notes Math. 945, Springer, Berlin (1982) 194-207.

[13] C.J. Stone, Consistent nonparametric regression. Ann. Statist. 5 (1977) 595-645. With discussion and a reply by the author.
