Universal consistency of the k-NN rule in metric spaces and Nagata dimension
ESAIM: Probability and Statistics, Tome 24 (2020), pp. 914-934.

The k nearest neighbour learning rule (under the uniform distance tie breaking) is universally consistent in every metric space X that is sigma-finite dimensional in the sense of Nagata. This was pointed out by Cérou and Guyader (2006) as a consequence of the main result by those authors, combined with a theorem in real analysis sketched by D. Preiss (1971) (and elaborated in detail by Assouad and Quentin de Gromard (2006)). We show that it is possible to give a direct proof along the same lines as the original theorem of Charles J. Stone (1977) about the universal consistency of the k-NN classifier in the finite dimensional Euclidean space. The generalization is non-trivial because of the distance ties being more prevalent in the non-Euclidean setting, and on the way we investigate the relevant geometric properties of the metrics and the limitations of the Stone argument, by constructing various examples.

Reçu le :
Accepté le :
Première publication :
Publié le :
DOI : 10.1051/ps/2020018
Classification : 62H30, 54F45
Mots-clés : $$-NN classifier, universal consistency, geometric Stone lemma, distance ties, Nagata dimension, sigma-finite dimensional metric spaces
@article{PS_2020__24_1_914_0,
     author = {Collins, Beno{\^\i}t and Kumari, Sushma and Pestov, Vladimir G.},
     title = {Universal consistency of the {\protect\emph{k}-NN} rule in metric spaces and {Nagata} dimension},
     journal = {ESAIM: Probability and Statistics},
     pages = {914--934},
     publisher = {EDP-Sciences},
     volume = {24},
     year = {2020},
     doi = {10.1051/ps/2020018},
     mrnumber = {4178790},
     zbl = {1455.62123},
     language = {en},
     url = {http://www.numdam.org/articles/10.1051/ps/2020018/}
}
TY  - JOUR
AU  - Collins, Benoît
AU  - Kumari, Sushma
AU  - Pestov, Vladimir G.
TI  - Universal consistency of the k-NN rule in metric spaces and Nagata dimension
JO  - ESAIM: Probability and Statistics
PY  - 2020
SP  - 914
EP  - 934
VL  - 24
PB  - EDP-Sciences
UR  - http://www.numdam.org/articles/10.1051/ps/2020018/
DO  - 10.1051/ps/2020018
LA  - en
ID  - PS_2020__24_1_914_0
ER  - 
%0 Journal Article
%A Collins, Benoît
%A Kumari, Sushma
%A Pestov, Vladimir G.
%T Universal consistency of the k-NN rule in metric spaces and Nagata dimension
%J ESAIM: Probability and Statistics
%D 2020
%P 914-934
%V 24
%I EDP-Sciences
%U http://www.numdam.org/articles/10.1051/ps/2020018/
%R 10.1051/ps/2020018
%G en
%F PS_2020__24_1_914_0
Collins, Benoît; Kumari, Sushma; Pestov, Vladimir G. Universal consistency of the k-NN rule in metric spaces and Nagata dimension. ESAIM: Probability and Statistics, Tome 24 (2020), pp. 914-934. doi : 10.1051/ps/2020018. http://www.numdam.org/articles/10.1051/ps/2020018/

[1] P. Assouad and T. Quentin De Gromard, Recouvrements, derivation des mesures et dimensions. Rev. Mat. Iberoam. 22 (2006) 893–953. | DOI | MR | Zbl

[2] F. Cérou and A. Guyader, Nearest neighbor classification in infinite dimension. ESAIM: PS 10 (2006) 340–355. | DOI | Numdam | MR | Zbl

[3] T.M. Cover and P.E. Hart, Nearest neighbour pattern classification. IEEE Trans. Info. Theory 13 (1967) 21–27. | DOI | Zbl

[4] L. Devroye, On the almost everywhere convergence of nonparametric regression function estimates. Ann. Statist. 9 (1981) 1310–1319. | DOI | MR | Zbl

[5] Luc Devroye, László Györfi and Gábor Lugosi, A Probabilistic Theory of Pattern Recognition, Springer–Verlag, New York (1996). | DOI | MR | Zbl

[6] Hubert Haoyang Duan, Applying Supervised Learning Algorithms and a New Feature Selection Method to Predict Coronary Artery Disease. M.Sc. thesis, University of Ottawa. Preprint (2014). | arXiv

[7] R. Engelking, General Topology, Sigma Series in Pure Mathematics, 2nd edn. Heldermann Verlag, Berlin (1989) 6. | MR | Zbl

[8] L. Forzani, R. Fraiman and P. Llop, Consistent nonparametric regression for functional data under the Stone–Besicovitch conditions. IEEE Transac. Inf. Theory 58 (2012) 6697–6708. | DOI | MR | Zbl

[9] Stan Hatko, k-Nearest Neighbour Classification of Datasets with a Family of Distances. M.Sc. thesis, University of Ottawa. Preprint (2015). | arXiv

[10] Sushma Kumari, Topics in Random Matrices and Statistical Machine Learning. Ph.D. thesis, Kyoto University. Preprint (2018). | arXiv

[11] P.A. Loeb and E. Talvila, Lusin’s theorem and Bochner integration. Sci. Math. Japan 60 (2004) 113–120. | MR | Zbl

[12] J. Nagata, On a special metric characterizing a metric space of dim ≤ n. Proc. Japan Acad. 39 (1963) 278–282. | MR | Zbl

[13] J.I. Nagata, On a special metric and dimension. Fund. Math. 55 (1964) 181–194. | DOI | MR | Zbl

[14] J.-I. Nagata, Modern dimension theory. Bibliotheca Mathematica, Interscience Publishers, North–Holland, Amsterdam (1965) 6. | MR | Zbl

[15] J.-I. Nagata, Open problems left in my wake of research. Topol. Appl. 146 (2005) 5–13. | DOI | MR | Zbl

[16] A.P. Ostrand, A conjecture of J. Nagata on dimension and metrization. Bull. Amer. Math. Soc. 71 (1965) 623–625. | DOI | MR | Zbl

[17] D. Preiss, Invalid Vitali theorems, in: Abstracta. 7th Winter School on Abstract Analysis, Czechoslovak Academy of Sciences, Prague, Czechia (1979) 58–60.

[18] D. Preiss, Dimension of metrics and differentiation of measures, General topology and its relations to modern analysis and algebra, V (Prague, 1981) Sigma Series of Pure Mathematics, Heldermann, Berlin (1983) 565–568. | MR | Zbl

[19] M. Radovanović, A. Nanopoulos and M. Ivanović, Hubs in space: popular nearest neighbors in high-dimensional data. J. Mach. Learn. Res. 11 (2010) 2487–2531. | MR | Zbl

[20] C. Stone, Consistent nonparametric regression. Ann. Stat. 5 (1977) 595–645. | DOI | MR | Zbl

[21] K.Q. Weinberger and L.K. Saul, Distance metric learning for large margin classification. J. Mach. Learn. Res. 10 (2009) 207–244. | Zbl

Cité par Sources :

S.K. would like to thank JICA-FRIENDSHIP scholarship (fellowship D-1590283/ J-1593019) for supporting her doctoral study at Kyoto University and Department of Mathematics, Kyoto University for supporting the Brazil research trip under the KTGU project.

V.G.P. wants to acknowledge support from CNPq (bolsa Pesquisador Visitante, processo 310012/2016) and CAPES (bolsa Professor Visitante Estrangeiro Sênior, processo 88881.117018/2016-01).