Linguistic matrix theory
Annales de l’Institut Henri Poincaré D, Volume 6 (2019) no. 3, pp. 385-426.

Recent research in computational linguistics has developed algorithms which associate matrices with adjectives and verbs, based on the distribution of words in a corpus of text. These matrices are linear operators on a vector space of context words. They are used to construct meaning representations for composite expressions from those of their elementary constituents, forming part of a compositional distributional approach to semantics. We propose a matrix theory approach to this data, based on permutation symmetry along with Gaussian weights and their perturbations. A simple Gaussian model is tested against word matrices created from a large corpus of text. We characterize the cubic and quartic departures from the model, which we propose, alongside the Gaussian parameters, as signatures for comparison of linguistic corpora. We propose that perturbed Gaussian models with permutation symmetry provide a promising framework for characterizing the nature of universality in the statistical properties of word matrices. The matrix theory framework developed here exploits the view of statistics as zero-dimensional perturbative quantum field theory. It perceives language as a physical system realizing a universality class of matrix statistics characterized by permutation symmetry.
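
As a rough illustration of what permutation symmetry means for word-matrix statistics, the following Python sketch computes a few low-order polynomial functions of a square matrix that do not change when rows and columns are relabelled by the same permutation, and averages them over an ensemble. It is not taken from the paper: the function names, the choice of invariants, and the synthetic Gaussian ensemble standing in for real word matrices are all assumptions made for this example.

```python
import numpy as np


def invariant_moments(m):
    """A few polynomial functions of a square matrix that are unchanged
    under a simultaneous permutation of its rows and columns.
    (Illustrative selection only, not the paper's full list of observables.)"""
    diag = np.diag(m)            # diagonal entries M_ii
    off = m - np.diag(diag)      # matrix with the diagonal zeroed out
    return {
        "sum_diag": diag.sum(),                      # sum_i M_ii
        "sum_offdiag": off.sum(),                    # sum_{i != j} M_ij
        "sum_diag_sq": (diag ** 2).sum(),            # sum_i M_ii^2
        "sum_offdiag_sq": (off ** 2).sum(),          # sum_{i != j} M_ij^2
        "sum_offdiag_transpose": (off * off.T).sum(),  # sum_{i != j} M_ij M_ji
        "sum_diag_cubed": (diag ** 3).sum(),         # a cubic example
    }


def permute(m, perm):
    """Relabel the context-word basis: M_ij -> M_{perm(i) perm(j)}."""
    return m[np.ix_(perm, perm)]


# Hypothetical data: 50 random 6x6 matrices standing in for word matrices.
rng = np.random.default_rng(0)
D = 6
ensemble = rng.normal(size=(50, D, D))
perm = rng.permutation(D)

# Check that every observable is unchanged by the relabelling.
before = invariant_moments(ensemble[0])
after = invariant_moments(permute(ensemble[0], perm))
invariant_ok = all(np.isclose(before[k], after[k]) for k in before)

# Ensemble averages of the observables: the kind of quantity one could
# compare with the predictions of a fitted Gaussian model.
averages = {
    k: float(np.mean([invariant_moments(x)[k] for x in ensemble]))
    for k in before
}
```

In a setting like the one the abstract describes, the linear and quadratic averages would be used to fix the parameters of a Gaussian model, and higher-order averages such as the cubic one would then be compared with that model's predictions. The sketch stops short of that fitting step.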

DOI: 10.4171/aihpd/75
Classification: 52-XX, 05-XX, 60-XX, 81-XX
Keywords: Distributional semantics, matrix models, natural language processing, permutation invariant distributions, random maps, random matrix theory, tensor models, topological gravity
@article{AIHPD_2019__6_3_385_0,
     author = {Kartsaklis, Dimitrios and Ramgoolam, Sanjaye and Sadrzadeh, Mehrnoosh},
     title = {Linguistic matrix theory},
     journal = {Annales de l{\textquoteright}Institut Henri Poincar\'e D},
     pages = {385--426},
     volume = {6},
     number = {3},
     year = {2019},
     doi = {10.4171/aihpd/75},
     mrnumber = {4002671},
     zbl = {1447.91125},
     language = {en},
     url = {http://www.numdam.org/articles/10.4171/aihpd/75/}
}
Kartsaklis, Dimitrios; Ramgoolam, Sanjaye; Sadrzadeh, Mehrnoosh. Linguistic matrix theory. Annales de l’Institut Henri Poincaré D, Volume 6 (2019) no. 3, pp. 385-426. doi: 10.4171/aihpd/75. http://www.numdam.org/articles/10.4171/aihpd/75/
