
Re: Entropy, Randomness, etc.



John Kreznar writes:

> > Given a particular set of data used to generate a random key, such as
> > a unix box's /dev/mem, how can one measure the number of bits of
> > entropy?
> 
> Actually, it can't be done.  The consistent measure of entropy for
> finite objects like a string or a (finite) series of random numbers is
> the so-called ``program length complexity''.  This is defined as the
> length of the shortest program for some given universal Turing machine
> which computes the string.  It's consistent in the sense that it has the
> familiar properties of ``ordinary'' (Shannon) entropy.  Unfortunately,
> it's uncomputable: there's no algorithm which, given an arbitrary finite
> string S, computes the program-length complexity of S.

The intuitive idea is similar to there being no "maximum compression"
of a string: one may strongly suspect a given compression is pretty
good, and it may in fact be the best there is, but one can never rule
out finding an even better one. Like the "pi" example Scott Collins
used.

Still, one can make estimates of the entropy of a string.
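(A minimal sketch of one computable approach, in Python--my choice of
language here, nothing from the thread, and "sample.bin" is a
hypothetical input file. Both numbers are only upper bounds on the
true program-length complexity: the empirical Shannon entropy of the
byte distribution, and the size the string compresses to.)

    import math
    import zlib

    def shannon_entropy_per_byte(data: bytes) -> float:
        """Empirical Shannon entropy in bits per byte (order-0 model)."""
        counts = [0] * 256
        for b in data:
            counts[b] += 1
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in counts if c)

    def compressed_size_bits(data: bytes) -> int:
        """Crude upper bound on program-length complexity: zlib output size."""
        return 8 * len(zlib.compress(data, 9))

    data = open("sample.bin", "rb").read()
    print(shannon_entropy_per_byte(data))          # bits/byte, order-0 estimate
    print(compressed_size_bits(data) / len(data))  # bits/byte, compression bound

Note that such estimates can be badly fooled: no compressor will ever
get the digits of pi anywhere near their true (tiny) program-length
complexity.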

> Program-length complexity is well-studied in the literature.  A good
> introductory paper is ``A Theory of Program Size Formally Identical to
> Information Theory'' by  G. J. Chaitin, _Journal of the ACM_, 22 (1975)
> reprinted in Chaitin's book _Information Randomness & Incompleteness_,
> World Scientific Publishing Co., 1990.

And an especially good place to read all about this is in the new book
by Ming Li and Paul Vitanyi, "An Introduction to Kolmogorov Complexity
and Its Applications," Springer-Verlag, 1993. $60.

Lots of good chapters on entropy, program length measures, algorithmic
information theory, etc. Ironically, no mention of cryptology at all.
(But Charles Bennett, one of the pioneers--especially in the area of
"logical depth"--has written about the deep links between the two
areas.) Basically, ciphertext messages are "cryptoregular" in that they
_appear_ to be of high entropy (random) but turn out to have low
entropy once the right transformation (the key) is applied.
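(A toy illustration of this, a minimal sketch in Python with a
hash-chain XOR keystream standing in for a real cipher--my invention
for the example, not a cipher anyone should use. The plaintext is
highly regular, the ciphertext compresses not at all, and the
regularity reappears the instant the key is applied.)

    import hashlib
    import zlib

    def keystream_xor(key: bytes, data: bytes) -> bytes:
        """XOR data against a SHA-256 hash chain seeded by the key."""
        out, block = bytearray(), key
        for i in range(0, len(data), 32):
            block = hashlib.sha256(block).digest()
            out.extend(c ^ k for c, k in zip(data[i:i + 32], block))
        return bytes(out)

    plaintext = b"attack at dawn. " * 64       # very low entropy
    ciphertext = keystream_xor(b"secret", plaintext)

    print(len(zlib.compress(plaintext)))       # small: regularity found
    print(len(zlib.compress(ciphertext)))      # ~1024: looks like random junk
    assert keystream_xor(b"secret", ciphertext) == plaintext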

You clever folks will by now have seen the link to the opening
discussion: how does one know if a given text is "cryptoregular" and
actually carries a message or is just random junk? The answer in
general is that no mechanistic/algorithmic method exists!

(Hardly surprising, if you think about it. A one-time pad is
information-theoretically secure. Every English (or Russian, etc.)
sentence of length L can be "found" in a ciphertext of length L by
trying the "right" pad. A thousand monkeys and all that.)
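(The same point in code, a minimal sketch in Python; the messages are
made-up 16-byte examples. XORing a ciphertext with any desired
"plaintext" of the same length yields a pad that "decrypts" to that
plaintext, so the ciphertext alone pins down nothing.)

    import os

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    real_msg = b"meet me in Aptos"             # 16 bytes
    pad      = os.urandom(len(real_msg))       # the genuine one-time pad
    cipher   = xor(real_msg, pad)

    fake_msg = b"nothing going on"             # also 16 bytes
    fake_pad = xor(cipher, fake_msg)           # a pad that "finds" it

    assert xor(cipher, pad) == real_msg
    assert xor(cipher, fake_pad) == fake_msg   # any sentence can be "found"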

For messages that are not encrypted with one-time pads, this is
not the case, and various bits of information can sometimes be
extracted: cryptanalysis sometimes works. Last I heard, though, it
doesn't help with breaking RSA (chosen-plaintext attacks on RSA don't
help with the factoring problem at all...consult the textbooks for the
exact situation, if you're interested in such subtleties).

Kolmogorov-Chaitin measures of complexity are very exciting.

--Tim May


-- 
..........................................................................
Timothy C. May         | Crypto Anarchy: encryption, digital money,  
[email protected]       | anonymous networks, digital pseudonyms, zero
408-688-5409           | knowledge, reputations, information markets, 
W.A.S.T.E.: Aptos, CA  | black markets, collapse of governments.
Higher Power: 2^756839 | Public Key: PGP and MailSafe available.
Note: I put time and money into writing this posting. I hope you enjoy it.