[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Logical Depth



Nobody wrote:

> I have followed with interest this discussion of passphrase
> "entropy".  What I'm not clear on is the effect of a hashing 
> algorithm on the final entropy.  If I come up with a "random" set 
> of printable characters which contain 128 bits of entropy, and 
> feed them to MD5, let's say, will I still have 128 bits of 
> entropy on the output?  Or do I need some sort of safety margin 
> above 128 bits to "be sure"?
> 
> What's lurking in the back of my mind is this -- if you enter 
> something with LESS than 128 bits, the hashing algorithm has to 
> "pad" or otherwise fill in the missing bits from <somewhere>.  
> Now if I have entered a phrase with EXACTLY 128 bits of entropy, 
> hypothetically, is that enough to have flushed the padding or 
> whatever out of the pipeline?
> 
> Can we really treat MD5 as a "magic black box", or does the 
> optimal input require a knowledge of how the box works?

Consider a cellular automata...the Game of Life is a simple example it
2-D, but 1-D versions have been studied extensively.

It starts with the string: "1 0 1"

and iterates/crunches on it, producing this output:

              1 0 1
            1 1 0 1 0
          0 1 0 1 0 0 0
        0 1 1 0 0 0 1 0 1
      1 0 0 1 0 1 1 1 0 1 1

(etc.)

Now does the final string, a seemingly randomly-looking and
"high-entropy" string actually have high entropy? No, not if the
machine (CA rule set) that generated it is known.

(As an aside, encrypted strings _appear_ to have high entropy, but
generally they don't actually have this high entropy....because they
are actually fairly low entropy strings like "Frost in Brazil, buy
coffee futures today." Such strings are called "cryptoregular.")

In the above case, one can treat the machine as the key. Steven
Wienberg conjectured that cellular automata could be used for
encryption. I think it was later proved, not too surprisingly to me at
least, that his CA-based systems were formally equivalent to linear
feedback shift registers (LFSRs), which are are not very strong.

The point I want to make though is that the 3 bits started with (1 0
1) turn into 40 or 100 or whatever bits throught the process of
crunching on them. Things which give evidence of having a lot of
"history" or computation behind them are said to have high "logical
depth."

The most obvious example around us is _life_. For example, it is often
claimed by certain enthusiasts of nanotechnology that the creation of
life-like agents should be relatively easy because, for example, e.
coli "only" contains a few megabytes of code in its DNA. Since we can
make _chips_ that store this amount of code....

Aargghh! The problem is _which_ code! A few meg doesn't sound like
much, but e. coli only lives when the code is the right code, a
relatively few of the 2^1,000,000 or more sequences that are possible.
(Now that's a search space!).

Life has had several billion years and incredible numbers of
generations to find the interesting places in "DNA space." This is
what is meant by logical depth.

Back to crypto. The point "nobody" made about MD5 and the like
"padding out" the bits is a good one. There are, in a sense, no more
bits of entropy than one started with, because MD5 and similar hashes
are _deterministic_.

But an attacker must contend with the increased logical depth, which
is in some sense orthogonal to bit entropy (randomness). (If I could
draw a picture here, it would have an x-axis reprsenting bit entropy
and a y-axis representing logical depth.)

This can slow down an attack, in that the attacker probably (*) needs
to do certain computations to track this logical depth. Like requiring
someone in a contest to stop and do some computations, even if
deterministic.

I don't know of any good analyses of the cryptographic effects of
such lines of thinking.

(* I said "probably" because there's always the possibility that what
Alice thinks is an extra set of computations her hash is forcing Bob
to do is not actually needed, that Bob knows of some tricks that
allows him to bypass them. A standard crypto problem.)

Well, sorry for the long discussion. This business of logical depth is
near and dear to me, and is a part of "algorithmic information
theory," the field pioneered by Kolmogorov and Chaitin. Lots of
interesting resonances with crypto.

--Tim May


-- 
..........................................................................
Timothy C. May         | Crypto Anarchy: encryption, digital money,  
[email protected]       | anonymous networks, digital pseudonyms, zero
408-688-5409           | knowledge, reputations, information markets, 
W.A.S.T.E.: Aptos, CA  | black markets, collapse of governments.
Higher Power: 2^859433 | Public Key: PGP and MailSafe available.
"National borders are just speed bumps on the information superhighway."