Model Selection via Information Criteria for Tree Models and Markov Random Fields

Author:  Zsolt Talata
Degree:  Ph.D. in Applied Mathematics, 2005
Advisor:  Prof. Imre Csiszár


The dissertation consists of three chapters. The first surveys the information criterion approach to model selection problems and introduces the new results of the following chapters.

In the second chapter, the concept of a context tree, usually defined for finite memory processes, is extended to arbitrary stationary ergodic processes with a finite alphabet. These context trees are not necessarily complete and may be of infinite depth. The Bayesian Information Criterion (BIC) and Minimum Description Length (MDL) principles are shown to provide strongly consistent estimators of the context tree, via optimization of a criterion over hypothetical context trees of finite depth, where the depth is allowed to grow with the sample size n as o(log n). Algorithms are provided to compute these estimators in O(n) time, and to compute them on-line for all i <= n in o(n log n) time.
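
As a rough illustration of how a BIC-type criterion selects a model, the Python sketch below estimates the order of a Markov chain. This corresponds to the special case in which the candidate context trees are complete trees of depth k. It is not the dissertation's algorithm: it uses a brute-force search rather than the O(n) method, and the function name bic_markov_order and its parameters are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def bic_markov_order(x, alphabet_size, max_order):
    """Return the order k in 0..max_order that minimizes a BIC score.

    Illustrative sketch only: candidate models are full Markov chains of
    order k (complete context trees of depth k), searched by brute force.
    """
    n = len(x)
    best_k, best_score = 0, float("inf")
    for k in range(max_order + 1):
        context_counts = Counter()     # occurrences of each length-k context
        transition_counts = Counter()  # occurrences of (context, next symbol)
        # Start at max_order for every k so that all candidate orders are
        # scored on the same set of predicted symbols.
        for i in range(max_order, n):
            context = tuple(x[i - k:i])
            context_counts[context] += 1
            transition_counts[(context, x[i])] += 1
        # Negative maximized log-likelihood from empirical transition
        # frequencies.
        neg_log_ml = -sum(
            count * math.log(count / context_counts[ctx])
            for (ctx, _symbol), count in transition_counts.items()
        )
        # Standard BIC penalty: half the number of free parameters times
        # log n. The full sample length n is used here (an assumption).
        n_params = (alphabet_size ** k) * (alphabet_size - 1)
        score = neg_log_ml + 0.5 * n_params * math.log(n)
        if score < best_score:  # strict inequality: ties go to smaller k
            best_k, best_score = k, score
    return best_k


# A strictly alternating binary sequence is determined by one past symbol.
estimated_order = bic_markov_order([0, 1] * 200, alphabet_size=2, max_order=3)
```

For the alternating sequence, orders 1, 2 and 3 all fit the data perfectly, so the penalty term alone decides among them and the smallest such order is selected.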

In the third chapter, for Markov random fields on Z^d with finite state space, we address the statistical estimation of the basic neighborhood, the smallest region that determines the conditional distribution at a site given the values at all other sites. A modification of the Bayesian Information Criterion, replacing likelihood by pseudo-likelihood, is proved to provide strongly consistent estimation from observing a realization of the field on increasing finite regions: the estimated basic neighborhood equals the true one eventually almost surely, without assuming any prior bound on the size of the latter. Stationarity of the Markov field is not required, and phase transition does not affect the results.
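
The idea of replacing likelihood by pseudo-likelihood can be sketched in Python for a two-dimensional field. Everything specific here is an assumption made for illustration, not taken from the dissertation: the periodic boundary, the small fixed list of candidate neighborhoods, the penalty of the form (number of neighborhood configurations) times log(number of sites), and the hypothetical function names.

```python
import math
from collections import Counter


def pseudo_likelihood_criterion(field, offsets, alphabet_size):
    """Negative log maximum pseudo-likelihood plus a penalty term.

    field   : list of equal-length lists of symbols
    offsets : list of (di, dj) offsets forming the candidate neighborhood
    Periodic boundary conditions are assumed to keep the sketch short.
    """
    rows, cols = len(field), len(field[0])
    n_sites = rows * cols
    block_counts = Counter()   # (neighborhood configuration, center symbol)
    config_counts = Counter()  # neighborhood configuration alone
    for i in range(rows):
        for j in range(cols):
            config = tuple(
                field[(i + di) % rows][(j + dj) % cols] for di, dj in offsets
            )
            block_counts[(config, field[i][j])] += 1
            config_counts[config] += 1
    # The pseudo-likelihood is a product over sites of the conditional
    # probability of the center symbol given its neighborhood; it is
    # maximized by the empirical conditional frequencies.
    neg_log_mpl = -sum(
        count * math.log(count / config_counts[cfg])
        for (cfg, _symbol), count in block_counts.items()
    )
    # Assumed penalty: number of possible neighborhood configurations
    # times the log of the number of observed sites.
    penalty = (alphabet_size ** len(offsets)) * math.log(n_sites)
    return neg_log_mpl + penalty


def estimate_neighborhood(field, candidates, alphabet_size):
    """Return the name of the candidate neighborhood with the lowest score."""
    return min(
        candidates,
        key=lambda name: pseudo_likelihood_criterion(
            field, candidates[name], alphabet_size
        ),
    )


# Hypothetical candidates: no neighbors, 4 nearest neighbors, 8 neighbors.
nearest = [(0, 1), (0, -1), (1, 0), (-1, 0)]
candidates = {
    "empty": [],
    "nearest": nearest,
    "nearest_plus_diagonal": nearest + [(1, 1), (1, -1), (-1, 1), (-1, -1)],
}
checkerboard = [[(i + j) % 2 for j in range(20)] for i in range(20)]
chosen = estimate_neighborhood(checkerboard, candidates, alphabet_size=2)
```

On the checkerboard, each site is determined by its four nearest neighbors, so that candidate is preferred over the empty neighborhood (poor fit) and the larger one (same fit, heavier penalty).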

The dissertation can be downloaded at

Contact information:

Zsolt Talata

School of Mathematics
Georgia Institute of Technology

686 Cherry Street, N.W.
Atlanta, GA 30332-0160

Office:  +1 (404) 894 4569
Fax:  +1 (404) 894 4409