[IEEE Trans. on Information Theory, November 1994, pp. 1728-1740]

Rates of Convergence in the Source Coding Theorem, in Empirical Quantizer Design, and in Universal Lossy Source Coding

Tamás Linder, Gábor Lugosi, and Kenneth Zeger

Abstract

Rate of convergence results are established for vector quantization. Convergence rates are given for an increasing vector dimension and/or an increasing training set size. In particular, the following results are shown for memoryless real-valued sources with bounded support at transmission rate R: (1) If a vector quantizer with fixed dimension k is designed to minimize the empirical mean-squared error (MSE) with respect to m training vectors, then its MSE for the true source converges in expectation and almost surely to the minimum possible MSE as $O(\sqrt{\log m/m})$; (2) The MSE of an optimal k-dimensional vector quantizer for the true source converges, as the dimension grows, to the distortion-rate function D(R) as $O(\sqrt{\log k/k})$; (3) There exists a fixed-rate universal lossy source coding scheme whose per-letter MSE on n real-valued source samples converges in expectation and almost surely to the distortion-rate function D(R) as $O(\sqrt{\log\log n/\log n})$; and (4) Consider a training set of n real-valued source samples blocked into vectors of dimension k, and a k-dimensional vector quantizer designed to minimize the empirical MSE with respect to the $m = \lfloor n/k \rfloor$ training vectors. Then the per-letter MSE of this quantizer for the true source converges in expectation and almost surely to the distortion-rate function D(R) as $O(\sqrt{\log\log n/\log n})$, provided one chooses $k = \lfloor \frac{1}{R}(1-\epsilon)\log n \rfloor$ for any $\epsilon \in (0,1)$.
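To make the setup of results (1) and (4) concrete, the following is a minimal numerical sketch of empirical quantizer design: n samples from a bounded-support source are blocked into $m = \lfloor n/k \rfloor$ training vectors with $k = \lfloor \frac{1}{R}(1-\epsilon)\log n \rfloor$, and a rate-R codebook is fit to them. The sketch uses a Lloyd-style (k-means) iteration as a practical stand-in for the exact empirical-MSE minimizer assumed by the theorems, and the function names, parameter values, and initialization are illustrative assumptions, not part of the paper.

```python
import numpy as np

def empirical_quantizer(train, R, iters=50, seed=0):
    """Fit a codebook of rate R bits/sample to the m x k training matrix `train`
    by approximately minimizing the empirical MSE (Lloyd-style alternation;
    a heuristic surrogate for the exact empirical minimizer in the theorems)."""
    m, k = train.shape
    N = 2 ** int(round(R * k))                     # codebook size 2^{Rk}
    rng = np.random.default_rng(seed)
    codebook = train[rng.choice(m, size=N, replace=False)].copy()  # assumes m >= N
    for _ in range(iters):
        # Nearest-neighbor assignment of each training vector to a codeword.
        d = ((train[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        idx = d.argmin(axis=1)
        # Centroid update: each codeword becomes the mean of its cell.
        for j in range(N):
            cell = train[idx == j]
            if len(cell):
                codebook[j] = cell.mean(axis=0)
    return codebook

def per_letter_mse(x, codebook):
    """Per-letter MSE of quantizing the rows of x with the given codebook."""
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean() / x.shape[1]

# Blocking scheme of result (4): k = floor((1/R)(1 - eps) log2 n), m = floor(n/k).
R, eps = 1.0, 0.5
n = 2 ** 14
k = int((1.0 / R) * (1 - eps) * np.log2(n))
m = n // k
samples = np.random.default_rng(1).uniform(-1, 1, size=n)   # bounded-support source
train = samples[: m * k].reshape(m, k)
cb = empirical_quantizer(train, R)
print(f"k={k}, m={m}, codebook size={len(cb)}, per-letter MSE={per_letter_mse(train, cb):.4f}")
```

Evaluating the trained codebook on fresh samples from the same source, rather than on the training vectors themselves, is what the "MSE for the true source" in the abstract refers to; the sketch reports the training-set figure only for brevity.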