
Generating random numbers from English text



Hello,
	With the current random number discussion going on, I thought
I would point out one convenient way to generate random numbers in
the situation where you do not need to generate them frequently.  I believe
that Claude Shannon (Dr. Information Theory) proposed this:

1. Ask a person to speak about any topic for a paragraph or two.
   Instruct them to generate original sentences, not just repeats of
   some passage of a written work.  Write down what they say.

2. The English language has an estimated entropy of 1.0 to 1.5 bits
   per letter, so it is safe to extract one bit per character.  If you
   read Shannon's work and its modern interpretation, you will understand
   that this bit per character is truly random: it is uncorrelated with
   all the other bits.  The reason for avoiding a pre-existing written
   work (poem, article, story, etc.) is to avoid a brute-force search
   of the source language space.
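
	As a rough illustration of the one-bit-per-character budget, here is
a small Python sketch that counts the letters in a transcript and reports
how many bits you could conservatively claim from it.  The function name
and the default of 1.0 bit per letter are my own assumptions for the
example; it is a bookkeeping aid, not an entropy measurement.

```python
def conservative_bits(text: str, bits_per_letter: float = 1.0) -> int:
    """Return a conservative bit budget for a piece of spoken text.

    Only alphabetic characters are counted; spaces, digits and
    punctuation are ignored.  The 1.0 bits-per-letter default is the
    low end of the 1.0 to 1.5 range quoted above (an assumption here,
    not something this code measures).
    """
    letters = sum(1 for ch in text if ch.isalpha())
    return int(letters * bits_per_letter)


if __name__ == "__main__":
    sample = "I went down to the harbour this morning and the fog was thick."
    print(conservative_bits(sample), "bits available from the sample")
```

Counting letters only is the cautious choice, since the entropy figure is
quoted per letter rather than per keystroke.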

	The modern version of this algorithm is to compute the
MD5 digest of a long text string.  For a 128-bit digest, you need
at least 128 characters of source language.  For a larger random number,
you can concatenate multiple MD5 values from multiple pieces of source text.
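
	The digest-and-concatenate idea can be sketched in Python with the
standard hashlib module.  This is only a minimal sketch: the function
name, the UTF-8 encoding, and the hard 128-character check are
assumptions I added for illustration, and MD5 is used simply because it
is the digest named above.

```python
import hashlib

# One source character per output bit: a 128-bit MD5 digest needs at
# least 128 characters of original text (assumed rule from the message).
MIN_CHARS_PER_DIGEST = 128


def random_bytes_from_text(pieces):
    """Concatenate the MD5 digests of several independent pieces of text.

    Each piece contributes 16 bytes (128 bits).  A piece shorter than
    MIN_CHARS_PER_DIGEST is rejected rather than silently stretched.
    """
    out = b""
    for piece in pieces:
        if len(piece) < MIN_CHARS_PER_DIGEST:
            raise ValueError(
                "need at least %d characters per piece, got %d"
                % (MIN_CHARS_PER_DIGEST, len(piece))
            )
        out += hashlib.md5(piece.encode("utf-8")).digest()
    return out
```

Passing two transcribed paragraphs of at least 128 characters each gives
back 32 bytes, that is, a 256-bit value.  Each piece must be different
original text; hashing the same paragraph twice adds no new randomness.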

	If the random numbers form the basis of crypto keys, then it
is important to make sure no one can uncover the original source text.

		--Bob