   Next: PredictabilityCompressibility, Complexity Up: Reflections on ``Universal Prediction Previous: Working Out the

# Prediction and the Lempel-Ziv Algorithm

Despite our failure to solve the prediction problem directly by forcing an analogy to the coding problem, the strong feeling that a predictor based on universal data compression algorithm will work, is indeed justified. Following the work of Rissanen and Langdon on universal modeling, one realizes that the crucial part in coding is to come up with a conditional probability for the next outcome given the past. This can be done, for example, by calculating sequentially empirical probabilities in each ``context'' of the data. A prediction algorithm that uses these conditional probabilities and makes predictions according to (1) will work, due to the statistical analysis performed by the universal data compression algorithm.

The best demonstration of a universal predictor based on a coding algorithm is, of course, the predictor based on the Lempel-Ziv (LZ) incremental parsing algorithm. Langdon  and Rissanen  interpreted the LZ incremental parsing algorithm as a universal modeling scheme where the ``context'' in which the counts are gathered is defined by the symbols from the beginning of the parsed string to the current symbol. We were able to use this observation and come up with a proof that the LZ codelength of any individual sequence attains the Markovian empirical entropy for any finite Markov order. Analogously, a predictor which uses the conditional probabilities induced by the LZ scheme and predicts according to (1) attains the Markovian predictability and hence, by (2), the FS predictability for any individual sequence.

We note that the result on LZ coding was known and is sometimes referred to as Ziv's inequality (see  ch. 12). However, unlike the previous proof which was tailored to the coding problem, our proof is simpler, and can be applied to other loss functions. Moreover, since it is based on the observation that the LZ algorithm essentially estimates the probabilities of a ``growing order'' Markovian model, our proof may provide additional insight as to why the LZ compression algorithm works.   Next: PredictabilityCompressibility, Complexity Up: Reflections on ``Universal Prediction Previous: Working Out the

Ramesh Rao
Tue Aug 8 11:36:42 PDT 1995