
Re: Weak steganography



Hal Finney said:

> Another problem is that encrypted files look different from executable
> files.  Encrypted files have a uniform histogram (that is, all 256 different
> possible byte values are equally frequent), but exe files do not.  ...

I am building a "steganosaurus" and eventually will need to solve a
similar problem.  (A "steganosaurus" applies a primitive steganographic
technique to English text by using a thesaurus to generate enough word
variation to encode a hidden message.)  One of the weaknesses of this
"steganosaurus" is that the resulting output has statistical differences
from normal English text.  For example, word frequency will be skewed.
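
A minimal sketch of the thesaurus idea, not the actual "steganosaurus":
it assumes a tiny, hypothetical table of two-word synonym sets, so each
matching word in the cover text carries one hidden bit.  The names
SYNSETS, steganize and desteganize are invented for illustration.

```python
# Hypothetical synonym sets; position 0 or 1 within a set encodes one bit.
SYNSETS = [("big", "large"), ("quick", "fast"),
           ("begin", "start"), ("buy", "purchase")]
LOOKUP = {word: pair for pair in SYNSETS for word in pair}


def steganize(cover, bits):
    """Replace each synonym-set word in `cover` with the variant chosen by the next bit."""
    remaining = list(bits)
    out = []
    for word in cover.split():
        pair = LOOKUP.get(word)
        if pair is not None and remaining:
            word = pair[remaining.pop(0)]
        out.append(word)
    if remaining:
        raise ValueError("cover text has too few synonym slots for the message")
    return " ".join(out)


def desteganize(text):
    """Read one bit back from every synonym-set word, in order of appearance."""
    # Caveat: slots left over after the message ends also yield bits, so a
    # real scheme would need a length prefix or terminator.
    return [LOOKUP[word].index(word) for word in text.split() if word in LOOKUP]


public = steganize("a big dog is quick to begin to buy food", [1, 0, 1, 1])
print(public)
print(desteganize(public))
```

Because the bits pick the synonyms, the choice of words follows the
hidden message rather than ordinary usage, which is where the skewed
word frequencies come from.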

Worse, I have to assume that the eavesdropper knows my steganization
algorithm and can "desteganize" any innocuous-looking text I produce.
That "desteganized" output will clearly reveal the existence of a hidden,
encrypted message because, as Hal pointed out, it has a uniform histogram.
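
One way an eavesdropper might check for a uniform histogram is a
chi-square statistic over the 256 byte values.  This is only a sketch of
that test, with a hypothetical function name; a real detector would
compare the statistic against a threshold chosen for the sample size.

```python
from collections import Counter


def chi_square_uniform(data: bytes) -> float:
    """Chi-square distance between the byte histogram of `data` and a flat one."""
    if not data:
        raise ValueError("need at least one byte")
    expected = len(data) / 256          # count per byte value if perfectly uniform
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


flat = bytes(range(256)) * 16           # stand-in for a perfectly flat histogram
english = b"the quick brown fox jumps over the lazy dog " * 100
print(chi_square_uniform(flat))         # exactly flat input gives 0.0
print(chi_square_uniform(english))      # plain text gives a far larger value
```

A low statistic means the data looks like ciphertext; ordinary text,
which uses only a small part of the byte range, scores far higher.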

What I want is a program that will transform an encrypted file to a
(slightly larger) file that mimics the distribution achieved by
applying the "desteganization" algorithm to normal English text
that does *not* contain any hidden message.  The steganization
algorithm then gets applied to this stealthy, mimic file, not
directly to the encrypted hidden message.  By the way, since we
must assume that the eavesdropper knows all our algorithms but
not our secret keys, this algorithm will require a *second* secret
key in addition to the secret key used in the original encryption.
I'm not ready to tackle that yet.  Unless I hear otherwise, I'll
assume that if anyone knows how to achieve this, they're not telling...
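
For what it is worth, one candidate shape for such a transform is to run
the encrypted bits through a Huffman-style *decoder*: uniformly random
bits then come out as symbols whose frequencies follow the code lengths,
and the ordinary encoder reverses it.  The sketch below is an assumption
about how this could look, not a solution to the problem above: the
four-symbol CODE table is made up, it would have to be fitted to real
"desteganized" English, and the second secret key is left out entirely.

```python
# Hypothetical complete prefix code.  With uniformly random input bits,
# "A" appears with probability 1/2, "B" 1/4, and "C" and "D" 1/8 each.
CODE = {"0": "A", "10": "B", "110": "C", "111": "D"}
ENCODE = {symbol: bits for bits, symbol in CODE.items()}


def mimic(bits: str) -> str:
    """Decode a bit string through the prefix code to get skewed symbols."""
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in CODE:
            out.append(CODE[buf])
            buf = ""
    if buf:
        raise ValueError("bit string ends mid-codeword; pad it first")
    return "".join(out)


def unmimic(symbols: str) -> str:
    """Invert mimic() by encoding each symbol back to its codeword."""
    return "".join(ENCODE[s] for s in symbols)


hidden = "0101100111"
print(mimic(hidden))
print(unmimic(mimic(hidden)) == hidden)
```

The output distribution can only take probabilities that are powers of
one half, so this matches a target distribution approximately at best.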

                              Kevin Q. Brown
                              INTERNET    [email protected]
                                 or       [email protected]

PS: I found that a simple, semi-automatic algorithm can generate a
    public message only 5 to 10 times as long as the hidden message.
    Unfortunately, the public message from my simple algorithm is
    almost always a bizarre, disconnected sequence of rants, which,
    for most people, is not normal.  That is why I am building my
    "steganosaurus".  After that I will see if combining a natural
    language parser with transformational grammars can produce a
    less primitive, more efficient "trans-steganosaurus".