[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Question re: MD5/other key-crunching methods
On the plane back home, I had the pleasure of being treated to a
screening of "Sgt. Bilko". Not a bad movie overall, but had a nice
throwaway crypto line. It got me to thinking, though...
Is it possible to make generalizations about the MD5 hashes of classes of
input values? That is, can one say that "no input values of length
greater than 512 bits will..." or 'all input values starting with the
value 3 have a tendency to..." with any degree of probability? I know
hash functions strive to evenly distribute values over their range, but I
wonder if it might sometimes be possible to predict the hash of a value
without computing it.
Why? Well, it's mainly in regards to the way MD5 and other hash functions
are used in mapping pass phrases to actual key values for a cipher.
Suppose I have a situation in which I feel comfortable in making
certain generalizations about the passphrase. Perhaps it's all lowercase,
perhaps all alphanumeric, has five hyphens, whatever. Information which
may allow one to restrict the passphrase to a certain range.
In a system where the passphrase is the encryption key, that range of key
values can be doled out and searched sequentially. Since they are likely
to be one or several contiguous blocks, one may simply distribute the
task of searching each one to willing machines everywhere. The efforts
with respect to RC4-40 in the previous year prove that much. If I can
rule out even 10% of all possible keyvalues, I've saved a good deal of
time.
What if one is dealing with a passphrase key-crunched w/MD5, though? The
obvious way to go about it is to compute the MD5 hash for each and every
value in the given range, then test that set of keys. This is an extra
step, and adds a measure of extra time to the whole operation. Sure, one
may abstract it away by claiming it's trivial compared to the problem of
searching an exponetially large keyspace, but that seems something of a
cop-out.
Perhaps it's a silly question, but is it possible to identify a set of
hashes which correspond to a set of domain values w/o performing the hash
itself? I'm aware that it's not possible to reverse a one-way hash like
MD5 (wish we could...what a compression ratio!), and I know "good" hash
functions strive for properties which would make this exceedingly
difficult. However, has anyone looked at the question? Is it worth
considering?
Thanks.
-David Molnar