# Re: Cracking MD5 for \$10M

Hal discusses using the Distinguished Points method to find hash
collisions presented by Michael Wiener with Paul van Oorschot at Rump
Crypto '94, and lists two benefits:

(1) saves space in searching for loops on a single processor;
(2) allows parallel searches for collisions over multiple processors.

I claim it's useful only for (2), because another algorithm dominates it
for single processor loop detection... at least in storage space.  It
works as follows: get a sequence of values v(i+1) = MD5(v(i));
simultaneously get another sequence w(i+1) = MD5(MD5(w(i))), and start them
at the same place, v(0) = w(0).  That is, you're running one of them twice
as fast as the other.  At each iteration you compare v(i) with w(i), and
if they're equal, you've looped.  Drawing a few rho-shaped trajectories on
paper and following them around with two pencils should be enough to
complete a proof by hand-waving that it always catches a cycle; but
perhaps not as soon as the distinguished points would.

The distinguished points across machines is a great idea for (2), though,
and doesn't depend on anything looping...  cool stuff!

Do you (Hal?) or anybody else know whether Wiener and van Oorschot were
taking into account the contraction of the range each time you iterate
MD5?  I think the size of the set of all numbers that are the result of
MD5ing a 128-bit number is considerably smaller than 2^128... is it 1/e of
that?  Anybody know about random mappings?  Subsequent iterations reduce
it further, though of course not by 1/e each time, so that the set of
numbers that are the result of iteratively MD5ing a number N times should
be an appreciably smaller set to be groping around in.  For example, I
iterated the right-most 14 bits of SHA 26,539 times from one seed before
the range shrank to a single point.  Note that it need not shrink that
far in general, since some of the survivors would typically map into each
other.

Jim Gillogly
18 Halimath S.R. 1994, 16:12