Compression of data streams down to their information content.

According to Kolmogorov complexity, every finite binary string is compressible to a shortest code — its information content — from which it is effectively recoverable. In this paper my coauthor Barmpalias and I investigate the extent to which this holds for infinite binary sequences (streams). We devise a new coding method which uniformly codes every stream X into an algorithmically random stream Y, in such a way that the first n bits of X are recoverable from the first I(X_n) bits of Y, where I is any partial computable information content measure which is defined on all prefixes of X, and where X_n is the initial segment of X of length n. As a consequence, if g is any computable upper bound on the initial segment prefix-free complexity of X, then X is computable from an algorithmically random Y with oracle-use at most g. Alternatively (making no use of such a computable bound g) one can achieve an oracle-use bounded above by K(X_n)+log n. This provides a strong analogue of Shannon’s source coding theorem for algorithmic information theory.

With Barmpalias, to appear in IEEE Transactions on Information Theory: pdf.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s