
Re: well



i wrote-

> ps.  MD5 of a file with a random string appended to the *end*
>      *can* be computed after having discarded the file.

Matt Thomlinson asked-

> hmmm. why is this? can you find a smaller file that will hash to the same
> number if you get to play with the pad bits appended before the 4 logic
> applications? it would seem reasonably strong either way..
>
> (I know I'm wrong on this, I'm just wondering what I'm missing.)

MD5 and similar hash functions work from the beginning of a file to the
end, in fixed-size blocks (64 bytes for MD5).  For each block, you take
the output of the calculation on the previous block (or the
initialization constants, if it's the first block), combine it with the
current block, and get the output for this block.

So, you can calculate the output as of the second-to-last block,
store that along with the last block, and throw away the rest of the
file.  Then you can append anything you want to the last block (doing
it right, see next paragraph) and calculate the MD5 of the whole file
plus the appendage, even though you don't have the whole file any more.
This trick doesn't work for adding stuff at the beginning, because every
later block's output depends on the first one.  (*This* trick
doesn't...)
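
Here is a minimal sketch of the shortcut in Python, assuming the
standard hashlib module.  The hash object keeps only MD5's small
running state (the chaining value plus any partial last block), so
copying it stands in for "store the output and the last block".  The
function names and sample data are made up for illustration.

```python
import hashlib

def start_digest(file_bytes: bytes):
    """Absorb the file once and return the hash object.

    The object holds only MD5's running state, not the file itself.
    """
    h = hashlib.md5()
    h.update(file_bytes)
    return h

def digest_with_suffix(saved_state, suffix: bytes) -> str:
    """Compute MD5(file + suffix) from the saved state alone."""
    h = saved_state.copy()   # leave the saved state reusable
    h.update(suffix)         # padding is applied only at hexdigest()
    return h.hexdigest()

file_bytes = b"example file contents " * 1000
state = start_digest(file_bytes)
del file_bytes               # the file is gone; only the state remains

print(digest_with_suffix(state, b"random challenge string"))
```

Because hashlib applies the final padding only when the digest is
requested, appending before that point needs no manual adjustment here.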

(About appending "right"--MD5 and its sisters append padding at the end
of the last block: a single 1 bit, then zero bits, then the total
message length in bits as a 64-bit value.  You'd have to insert your
appendage before that padding and redo the padding for the new length.)
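
For the curious, this is roughly what that special stuff looks like
for MD5, as I understand RFC 1321.  It is a sketch, and the function
name md5_padding is my own: it returns the bytes MD5 would tack on to
a message of a given length, so you can see why the appendage has to
go in before them and why the length field changes.

```python
import struct

def md5_padding(message_len_bytes: int) -> bytes:
    """Return the padding MD5 appends to a message of this many bytes."""
    pad = b"\x80"                                     # a 1 bit, then seven 0 bits
    pad += b"\x00" * ((55 - message_len_bytes) % 64)  # zeros up to 56 mod 64
    # 64-bit little-endian count of message *bits*, modulo 2**64
    pad += struct.pack("<Q", (message_len_bytes * 8) % 2**64)
    return pad

# The padded message always fills a whole number of 64-byte blocks.
for n in (0, 55, 56, 100):
    print(n, len(md5_padding(n)), (n + len(md5_padding(n))) % 64)
```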

But the shortcut for appended-to files wasn't obvious to me at first
either.

I agree with Perry that MD5 isn't necessarily the one to use, and
certainly won't always be.

A couple of people agree that my trick *sounds* safe.
Somebody (sorry!) suggested some other methods:
 - Hash of ( file xor'd with repetitions of the same random string)
   --sounds a little safer to me.
 - Xor of specific bits in the file.  Sounds okay if you do a 128-
   bit-wide xor.  Except it doesn't test for bit-decay in the bits
   you didn't ask about.  A hash of the whole file does.
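
A rough sketch of the first suggestion, assuming Python's hashlib and
using MD5 only because it is the hash under discussion.  The name
xor_then_hash and the sample inputs are hypothetical.  Since the random
string is mixed into every byte, nothing can be precomputed before the
string is known.

```python
import hashlib
from itertools import cycle

def xor_then_hash(data: bytes, challenge: bytes) -> str:
    """Hash the data after xoring it with the challenge repeated end to end."""
    if not challenge:
        raise ValueError("challenge must not be empty")
    masked = bytes(b ^ k for b, k in zip(data, cycle(challenge)))
    return hashlib.md5(masked).hexdigest()

print(xor_then_hash(b"contents of the stored file", b"random string"))
```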

Anyway, I get the feeling cryptographers haven't studied this problem
long and hard.  Meanwhile, a method that's about as cheap to compute
and as simple to explain, but seems less likely to be weak, is:
 - hash( IDEA( file, random password ) )

-fnerd

- - - - - - - - - - - - - - -
To auditors without the code, calls seem
indistinguishable from noise.  --George Gilder
-----BEGIN PGP SIGNATURE-----
Version: 2.3a

aKxB8nktcBAeQHabQP/d7yhWgpGZBIoIqII8cY9nG55HYHgvt3niQCVAgUBLMs3K
ui6XaCZmKH68fOWYYySKAzPkXyfYKnOlzsIjp2tPEot1Q5A3/n54PBKrUDN9tHVz
3Ch466q9EKUuDulTU6OLsilzmRvQJn0EJhzd4pht6hSnC1R3seYNhUYhoJViCcCG
sRjLQs4iVVM=
=9wqs
-----END PGP SIGNATURE-----