Modèles contextuels et alphabets infinis en théorie de l'information
Thèses d'Orsay, no. 708 (2006), 150 p.

Ce travail de thèse explore quelques aspects contemporains de la théorie de l'information, allant de la théorie du codage à certains problèmes de choix de modèles. Nous y considérons d'abord le problème du codage de sources sans mémoire émettant dans un alphabet infini dénombrable. Comme il est impossible d'y apporter une solution à la fois générale et efficace, deux approches sont utilisées : dans un premier temps, nous établissons des conditions dans lesquelles le taux entropique peut être approché, en nous restreignant à des classes pour lesquelles les queues de probabilités sont contrôlées. Dans un second temps, aucune restriction n'est posée sur la source ; il est alors possible de fournir une solution partielle en codant seulement une partie de l'information - le motif - qui capture les répétitions contenues dans le message.

Pour arriver à l'étude de processus plus complexes, nous revenons sur le cas des sources à mémoire finie sur un alphabet fini, qui a donné lieu à de nombreux travaux ainsi qu'à des algorithmes efficaces comme la méthode Context Tree Weighting (CTW). Nous prouvons ici que cet algorithme est également efficace sur une classe non paramétrique de sources à mémoire infinie : les sources de renouvellement.

Nous montrons ensuite que les idées sous-jacentes à la méthode CTW permettent de construire un estimateur consistant de la structure de mémoire d'un processus quand celle-ci est finie : nous complétons l'étude de l'estimateur BIC pour les chaînes de Markov à longueur variable. Dans une dernière partie, il est montré qu'une telle approche se généralise à un cadre plus large de sources émettant dans un alphabet infini, dénombrable ou non. On obtient ainsi des estimateurs consistants de l'ordre de chaînes de Markov cachées à émissions poissoniennes et gaussiennes.

This thesis explores some contemporary aspects of information theory, from source coding to issues of model selection. We first consider the problem of coding memoryless sources over a countably infinite alphabet. As it is impossible to provide a solution which is both efficient and general, two approaches are considered: we first establish conditions under which the entropy rate can be achieved, restricting attention to classes for which the tail probabilities are controlled. The second approach places no restriction on the source but provides a partial solution by coding only part of the information - the pattern - which captures the repetitions in the message.
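
To make the notion of pattern concrete, here is a minimal illustration in Python (our own sketch, not code from the thesis): the pattern of a message replaces each symbol by the rank of its first occurrence, so that only the repetition structure is kept while the identity of the underlying (possibly infinite) alphabet is set aside.

    def pattern(message):
        """Return the pattern of a message: each symbol is replaced by the
        index (starting at 1) of its first appearance, so only the repetition
        structure is kept and the underlying alphabet is forgotten."""
        first_seen = {}
        out = []
        for symbol in message:
            if symbol not in first_seen:
                first_seen[symbol] = len(first_seen) + 1
            out.append(first_seen[symbol])
        return out

    # Example: the pattern of "abracadabra".
    print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]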

In order to study more complex processes, we return to the case of finite-memory sources on a finite alphabet, which has given rise to many works and to efficient algorithms such as the Context Tree Weighting (CTW) method. We show here that this method is also efficient on a non-parametric class of infinite-memory sources: renewal processes.
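
To give a rough idea of the CTW mechanism, here is a minimal sketch in Python (our own illustration under simplifying assumptions, not the thesis's analysis): for a binary sequence and a context tree of depth D, each node carries a Krichevsky-Trofimov (KT) estimator; leaves use it directly, and internal nodes mix it, with weights 1/2 and 1/2, with the product of their two children. Minus log2 of the root's weighted probability is the ideal CTW codelength. The function names and the tuple encoding of contexts are our own conventions, and the renewal-specific arguments are not reproduced here.

    from collections import defaultdict
    from math import log2

    def kt_log2(a, b):
        # log2 of the Krichevsky-Trofimov block probability of a zeros and b ones:
        # Pe(a, b) = prod_{i<a}(i + 1/2) * prod_{j<b}(j + 1/2) / prod_{k<a+b}(k + 1).
        return (sum(log2(i + 0.5) for i in range(a))
                + sum(log2(j + 0.5) for j in range(b))
                - sum(log2(k + 1.0) for k in range(a + b)))

    def context_counts(x, D):
        # counts[s][c] = number of times bit c follows context s in x, where s is
        # the tuple of the len(s) <= D bits preceding the current position
        # (oldest bit first, most recent bit last).
        counts = defaultdict(lambda: [0, 0])
        for t in range(D, len(x)):
            for d in range(D + 1):
                counts[tuple(x[t - d:t])][x[t]] += 1
        return counts

    def ctw_log2(s, counts, D):
        # Weighted log2-probability at the node with context s: a leaf returns its
        # KT estimate, an internal node mixes its KT estimate with the product of
        # its two children (contexts extended one bit further into the past).
        a, b = counts.get(s, (0, 0))
        le = kt_log2(a, b)
        if len(s) == D:
            return le
        lc = ctw_log2((0,) + s, counts, D) + ctw_log2((1,) + s, counts, D)
        m = max(le, lc)
        return m - 1.0 + log2(2.0 ** (le - m) + 2.0 ** (lc - m))

    x = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1]
    D = 3
    counts = context_counts(x, D)
    print(-ctw_log2((), counts, D))  # ideal CTW codelength (bits) for x[D:] given x[:D]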

We then show that the ideas underlying CTW lead to a consistent estimator of the memory structure of a process when this structure is finite: we complete the study of the BIC context tree estimator for Variable Length Markov Chains (VLMC). In the last part, we show how similar ideas can be generalized to more complex sources on an infinite alphabet, countable or not. We obtain consistent estimators of the order of hidden Markov models with Poisson and Gaussian emissions.
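
For reference, the criterion behind these estimators is of penalized maximum likelihood (BIC) type; in generic notation (ours, not the thesis's), for observations x_1^n and a candidate order k whose model has d_k free parameters, the selected order is

    \hat{k}_n \, = \, \arg\min_{k} \Big\{ -\log \hat{p}_k(x_1^n) \, + \, \frac{d_k}{2} \, \log n \Big\},

where \hat{p}_k denotes the maximum likelihood within the order-k model; consistency means that \hat{k}_n eventually coincides with the true order.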

Classification : 62M05, 62P30, 62L12, 68P30, 68Q17, 94A15, 94A24, 94A29, 94A45
Mots-clés : BIC, Context trees, CTW, HMM, infinite alphabet, MDL, minimax, pattern coding, renewal processes, source coding, VLMC
@phdthesis{BJHTUP11_2006__0708__P0_0,
     author = {Garivier, Aur\'elien},
     title = {Mod\`eles contextuels et alphabets infinis en th\'eorie de l'information},
     series = {Th\`eses d'Orsay},
     publisher = {Universit\'e Paris-Sud, Facult\'e des Sciences d'Orsay},
     number = {708},
     year = {2006},
     language = {fr},
     url = {http://www.numdam.org/item/BJHTUP11_2006__0708__P0_0/}
}
Garivier, Aurélien. Modèles contextuels et alphabets infinis en théorie de l'information. Thèses d'Orsay, no. 708 (2006), 150 p. http://numdam.org/item/BJHTUP11_2006__0708__P0_0/
