Environmental conditions in semi-enclosed basins: A dynamic latent class approach for mixed-type multivariate variables
The identification of typical environmental conditions from multiple time series of linear and circular observations requires classification methods that account for the dependence across variables and in time. Motivated by a case study of sea conditions, we take a latent-class approach to classification, relying on a multivariate hidden Markov model. The model integrates multivariate von Mises and log-normal densities to describe the distribution that wind speed and wave height as well as wind and wave direction take under different latent regimes, with parameters that depend on the evolution of an unobserved Markov chain. The estimation of the model is facilitated by a hybrid algorithm that combines an EM algorithm with direct maximization of the log-likelihood.
Our analysis of marine data from two locations in the Mediterranean shows that a hidden Markov approach to classification can be successfully employed for identifying interpretable marine conditions in complex orographic settings.
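As a rough illustration of the model class described above, the sketch below evaluates the log-likelihood of a small hidden Markov model with a scaled forward recursion. It is not the authors' implementation and it simplifies the model: each latent regime is assumed to emit one linear value (e.g. wave height) from a univariate log-normal density and one circular value (e.g. wave direction, in radians) from a univariate von Mises density, taken as conditionally independent given the regime, whereas the abstract refers to multivariate densities. All function names, parameter layouts and numerical values are hypothetical.

```python
import math


def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2  # ratio of consecutive series terms
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle: mean direction mu, concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def lognormal_pdf(x, m, s):
    """Log-normal density: log(x) is normal with mean m and standard deviation s."""
    z = (math.log(x) - m) / s
    return math.exp(-0.5 * z * z) / (x * s * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(obs, init, trans, params):
    """Scaled forward recursion.

    obs    : list of (linear, circular) pairs, one per time step
    init   : initial regime probabilities
    trans  : transition matrix, trans[j][k] = P(regime k at t | regime j at t-1)
    params : per-regime dicts {"lin": (m, s), "circ": (mu, kappa)}  (assumed layout)
    Returns the log-likelihood and the filtered regime probabilities at the last step.
    """
    n_states = len(init)
    log_lik, alpha = 0.0, None
    for t, (x, theta) in enumerate(obs):
        # Regime-specific density under the conditional-independence assumption.
        dens = [
            lognormal_pdf(x, *params[k]["lin"]) * von_mises_pdf(theta, *params[k]["circ"])
            for k in range(n_states)
        ]
        if t == 0:
            alpha = [init[k] * dens[k] for k in range(n_states)]
        else:
            alpha = [
                sum(alpha[j] * trans[j][k] for j in range(n_states)) * dens[k]
                for k in range(n_states)
            ]
        scale = sum(alpha)  # rescaling avoids numerical underflow on long series
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik, alpha


# Hypothetical two-regime example: "calm" versus "stormy" conditions.
example_obs = [(0.6, 0.2), (0.7, 0.4), (2.8, 3.0), (3.1, 2.9)]
example_init = [0.5, 0.5]
example_trans = [[0.9, 0.1], [0.2, 0.8]]
example_params = [
    {"lin": (-0.4, 0.3), "circ": (0.3, 4.0)},
    {"lin": (1.0, 0.3), "circ": (3.0, 4.0)},
]
example_ll, example_filtered = hmm_log_likelihood(
    example_obs, example_init, example_trans, example_params
)
```

A function of this kind could serve as the objective for direct numerical maximization, or be paired with a backward recursion for EM updates; the hybrid scheme mentioned in the abstract is not reproduced here.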
@article{JSFS_2015__156_1_114_0,
  author = {Bulla, Jan and Lagona, Francesco and Maruotti, Antonello and Picone, Marco},
  title = {Environmental conditions in semi-enclosed basins: {A} dynamic latent class approach for mixed-type multivariate variables},
  journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
  pages = {114--137},
  publisher = {Soci\'et\'e fran\c{c}aise de statistique},
  volume = {156},
  number = {1},
  year = {2015},
  zbl = {1316.62164},
  language = {en},
  url = {http://www.numdam.org/item/JSFS_2015__156_1_114_0/}
}
Bulla, Jan; Lagona, Francesco; Maruotti, Antonello; Picone, Marco. Environmental conditions in semi-enclosed basins: A dynamic latent class approach for mixed-type multivariate variables. Journal de la société française de statistique, Tome 156 (2015) no. 1, pp. 114-137. http://www.numdam.org/item/JSFS_2015__156_1_114_0/