Randomness of a bit string
Continuing the discussion on whether there may exist a few random strings
that can be proven random...
Tim May writes:
> To see this another way, suppose an algorithm existed to always know
> if a given number is "random" or not. [Paradoxes follow]
But that's not what I was talking about; I specifically acknowledged
that there was no such algorithm that ALWAYS gives you the
answer. But even in the absence of a general algorithm to decide a
problem, it may be possible to decide some specific instances. For
example, a basic result of computability theory is that there is
no algorithm that will, for any program P and input x, tell you if
P eventually halts on input x. Yet there are many SPECIFIC instances
of programs P and inputs x for which it has been proven that P halts on
input x; this is what the whole business of formal proofs of program
correctness is about.
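(To make that concrete, here is a toy example of my own -- not anything
from the original discussion -- of a specific program whose halting on
every input is provable by an elementary argument: its loop counter
strictly decreases and is bounded below by zero.)

    def countdown(n):
        # Halts on every natural-number input n: the loop variable
        # strictly decreases and is bounded below by zero, so the loop
        # runs at most n times.  A provable SPECIFIC instance, even
        # though no algorithm decides halting in general.
        steps = 0
        while n > 0:
            n -= 1
            steps += 1
        return steps

    assert countdown(1000) == 1000   # halts, and we can prove it will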
> If someone claims they can "prove" the sequence "0
> 1101100110111100010" is really random, ask them _how_. Ask them if the
> compression "Chaitin 27," meaning the example number given on page 27
> of Chaitin's book is not that same number, making it hardly random.
This argument is invalid. To see why, let's review the definition of
a random string. Randomness is defined in terms of Kolmogorov
complexity, which is defined relative to any universal function U. (A
universal function U takes as input an encoding of a Turing machine T,
together with its input z; its output is undefined if T does not halt
on input z, otherwise its output is the value T outputs on input z.
Each different effective encoding of program-input pairs defines a
different universal function.) The Kolmogorov complexity C_U(x) of a
string x (relative to U) is defined to be the length of the shortest
string y such that U(y) is defined and U(y) = x. In a sense, it
doesn't matter which universal function you use, since it turns out
that for any two universal functions U and V there exist constants c1
and c2 such that
C_U(x) <= C_V(x) + c1 for all x, and
C_V(x) <= C_U(x) + c2 for all x.
A string x is defined to be random (w.r.t. U) if C_U(x) >= |x|, where
|x| denotes the length of x.
Trivially, then, the empty string is a random string. Also, Tim's
example is meaningless, since "Chaitin 27" is not a description
relative to any specified universal function; no algorithm is given for
recovering the number from that phrase.
(Caveat: you COULD construct a universal function U that has
Chaitin's book built in to it, but it is certainly NOT the case that
every universal function has this property.)
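(As a toy illustration of these definitions -- my own sketch, with made-up
names, and with the universal function replaced by a small finite lookup
table, which is of course not actually universal -- this shows how C_U(x)
is read off as a minimization over descriptions and how the test
C_U(x) >= |x| is applied.)

    # Toy stand-in for U: a finite table from description strings y to
    # outputs U(y).  A real universal function takes an encoded
    # Turing-machine/input pair; the table only serves to illustrate
    # the minimization in the definition of C_U.
    TOY_U = {
        '0':  '1010101010',     # a short description of a regular string
        '10': '0000000000',
        '1101100110111100010': '1101100110111100010',   # described only
                                                        # by itself here
    }

    def C(x, U=TOY_U):
        # Kolmogorov complexity of x relative to this toy U: length of
        # the shortest y with U(y) = x, or None if the table contains
        # no description of x at all.
        lengths = [len(y) for y, out in U.items() if out == x]
        return min(lengths) if lengths else None

    def is_random(x, U=TOY_U):
        # x is random w.r.t. U when C_U(x) >= |x|.
        c = C(x, U)
        return c is None or c >= len(x)

    assert not is_random('1010101010')        # compressible: C = 1 < 10
    assert is_random('1101100110111100010')   # no shorter description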
To prove that a nonempty string x is random, it suffices to prove
that for all strings y shorter than x, either U(y) is undefined
or U(y) != x. This amounts to proving the output (and halting
behavior) of a finite number of program-input pairs. For some strings
x and universal functions U this task may be absolutely trivial. Consider
a Turing machine T that always halts and always outputs the empty
string, regardless of its input. Let z_1,...,z_m be m arbitrary
strings, where m exceeds the number of strings shorter than x. It is
straightforward to construct an effective encoding of program-input
pairs for which (T,z_i) is encoded as the i-th bit-string in
lexicographic order. Suppose that U is the corresponding universal
function, and let y_i be the encoding of the program-input pair
(T,z_i). Then U(y_i) is the empty string, for all 1 <= i <= m. Since
the set { y_i : 1 <= i <= m } includes every string shorter than x,
and x is nonempty, we see that U(y) != x for every string y shorter
than x; hence x is random (relative to U).
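(Here is a sketch of that construction, purely for concreteness; the
encoding is parameterized by the target string x, only the code words
shorter than x -- the only ones the argument needs -- are spelled out,
and all the names are my own.)

    def bitstrings_shorter_than(x):
        # Every bit-string strictly shorter than x, in lexicographic
        # order within each length.  There are 2^len(x) - 1 of them.
        for n in range(len(x)):
            if n == 0:
                yield ''
            else:
                for i in range(2 ** n):
                    yield format(i, 'b').zfill(n)

    def make_U(x):
        # The universal function of the argument: the i-th bit-string
        # (one for every string shorter than x) encodes the pair
        # (T, z_i), where T ignores its input and outputs the empty
        # string.  Longer code words, which the argument never needs,
        # are left unspecified here.
        reserved = set(bitstrings_shorter_than(x))
        def U(y):
            if y in reserved:
                return ''       # U(y) = T(z_i) = empty string
            raise NotImplementedError('longer program-input pairs')
        return U

    def provably_random(x):
        # The finite check from the argument: U(y) != x for every y
        # shorter than x, hence C_U(x) >= len(x).
        U = make_U(x)
        return x != '' and all(U(y) != x
                               for y in bitstrings_shorter_than(x))

    assert provably_random('1101100110111100010')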
-----------------------------------------------------------------------------
Kevin S. Van Horn | It is the means that determine the ends.
[email protected] |