[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*To*: [email protected]*Subject*: Re: Cracking MD5 for $10M*From*: Hal <[email protected]>*Date*: Fri, 9 Sep 1994 11:25:11 -0700*References*: <[email protected]>*Sender*: [email protected]

Jim Gillogly <[email protected]> writes: >Hal discusses using the Distinguished Points method to find hash >collisions presented by Michael Wiener with Paul van Oorschot at Rump >Crypto '94, and lists two benefits: >(1) saves space in searching for loops on a single processor; >(2) allows parallel searches for collisions over multiple processors. >I claim it's useful only for (2), because another algorithm dominates it >for single processor loop detection... at least in storage space. >["rho" method elided] Yes, this is a good point, the main advantage of the DP algorithm is that it parallelizes. Rho does have the problem that you have to run 3 MD5's for each step, but OTOH it does not have the overhead of saving and checking the distinguished points, so which one would be best on a single processor would depend on the relative costs. >Do you (Hal?) or anybody else know whether Wiener and van Oorschot were >taking into account the contraction of the range each time you iterate >MD5? I think the size of the set of all numbers that are the result of >MD5ing a 128-bit number is considerably smaller than 2^128... is it 1/e of >that? Anybody know about random mappings? They didn't mention anything about this, and I would think they would have if they had considered it. My intuition was that x=MD5(x) would cover a large fraction of the 128 bit output space, but on further thought Jim appears to be right: with n input values into a random function (n would be 2^128 in this case), the chance of a particular output being missed for any one input would be 1-1/n, and the chance of it being missed for all n inputs would be (1-1/n)^n. Taking the limit as n approaches infinity gives 1/e as the fraction of values which would be missed. This means that the fraction of hits would be 1 - 1/e, much lower than I had guessed. >Subsequent iterations reduce >it further, though of course not by 1/e each time, so that the set of >numbers that are the result of iteratively MD5ing a number N times should >be an appreciably smaller set to be groping around in. The way I figure it, if the fraction of the original n is f (which would be 1 before the first iteration, and 1 - 1/e before the 2nd iteration based on the above), the chance of a point being missed is (1-1/n)^(nf), which is 1/e^f. So f would be found by f = 1 - 1/e^f, iterating once per MD5 iteration and starting f at 1. I just did an experiment of iterating this. After 100 times f was about .02; after 1000 times f was about .002, suggesting f = 2/iterations. If this is right, you might be able to get a birthday match after only the cube root of n tries rather than the square root of n, or about 2^44 iterations or so rather than 2^64, because at that point you are only looking at 2^85 possible output values. This result is only really valid for serial machines; parallel ones search more per iteration so this would move you back towards the 2^64 number. It does imply that you don't really get k-fold speedup with k machines if you take this effect into consideration. > Jim Gillogly > 18 Halimath S.R. 1994, 16:12 Gee, my calendar must be off! Hal

**References**:**Re: Cracking MD5 for $10M***From:*Jim Gillogly <[email protected]>

- Prev by Date:
**Re: CONTROL FREAKS (nee, AIDs testing and privacy)** - Next by Date:
**Re: pgp key servers** - Prev by thread:
**Re: Cracking MD5 for $10M** - Next by thread:
**Re: Cracking MD5 for $10M** - Index(es):