The purpose of this article is to explain the underlying principles
of cryptography by examples, and to show some criteria that should be
met by cryptographic algorithms before they are seriously considered
for adoption.
Cryptography is the art or science of scrambling plaintext into ciphertext with a key
so that it cannot be read by anyone who does not possess the key.
Digital information is stored in the form of 1s and 0s, called BINARY.
Binary Numbers
Let�s count from DECIMAL 0 to 15 in BINARY by adding 1 to the previous number.
decimal: |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
binary: |
0 |
1 |
10 |
11 |
100 |
101 |
110 |
111 |
1000 |
1001 |
1010 |
1011 |
1100 |
1101 |
1110 |
1111 |
Notice that first we start with a single number position, which can be 0 or 1 in BINARY. This
number position is a bit. Call this first bit b0.
Notice that b0 is either 0 or 1. That is, b0 = 0 or b0 = 1.
To get to DECIMAL 2, we have to introduce a second BINARY bit--call it b1.
We have b1b0 = 10. Next, for DECIMAL 3, we have BINARY b1b0 = 11.
binary: |
0 |
1 |
10 |
11 |
100 |
101 |
110 |
111 |
1000 |
1001 |
1010 |
1011 |
1100 |
1101 |
1110 |
1111 |
numbered bits: |
b0 |
b0 |
b1b0 |
b1b0 |
b2b1b0 |
b2b1b0 |
b2b1b0 |
b2b1b0 |
b3b2b1b0 |
b3b2b1b0 |
b3b2b1b0 |
b3b2b1b0 |
b3b2b1b0 |
b3b2b1b0 |
b3b2b1b0 |
b3b2b1b0 |
Notice that the bit subscript represents a power of 2. That is, b0 really
means
b0*2^0,
where * is multiplication, and ^ exponentiation (for example, 2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 8).
The subscript on b is the same as the power on 2.
If we had b26, we would know its meaning was b26*2^26.
If b26 = 0, then this value is 0. If b26 = 1, then
this value is 2^26.
Now look at "1111" (which in BINARY is equal to DECIMAL 15). In this
case, b3b2b1b0 = 1111. The
right-most BIT (b0) is the least-significant bit, because
it corresponds to the lowest power of 2.
Converting Binary Numbers to Decimal Numbers
To convert a BINARY number ...b3b2b1b0 to
a DECIMAL number Y, we simply write
Y = b0 + b1 * 2 + b2 * 2^2 + b3 * 2^3 + ...
The bits b0, b1, b2, b3 are limited to the values 0 and 1 ONLY.
Performing the exponentiation of powers of 2 and reversing the bits gives
Y = . . . + b3 * 8 + b2 * 4 + b1 * 2 + b0 .
Most of us were brought-up to understand that the most significant digits are to the LEFT of the previous
digit.
For sake of economy of writing and easy conversion, binary numbers are frequently represented
in base 16, or HEXADECIMAL, abbreviated HEX.
Hexadecimal Numbers
Binary weights | HEX | DECIMAL |
8 4 2 1 | | |
0 | 0 | 0 |
1 | 1 | 1 |
1 0 | 2 | 2 |
1 1 | 3 | 3 |
1 0 0 | 4 | 4 |
1 0 1 | 5 | 5 |
1 1 0 | 6 | 6 |
1 1 1 | 7 | 7 |
1 0 0 0 | 8 | 8 |
1 0 0 1 | 9 | 9 |
1 0 1 0 | A | 10 |
1 0 1 1 | B | 11 |
1 1 0 0 | C | 12 |
1 1 0 1 | D | 13 |
1 1 1 0 | E | 14 |
1 1 1 1 | F | 15 |
Conversion from binary to hexadecimal or hexadecimal to binary is easy if you remember
1010 is A
1100 is C
1110 is E
B is one greater than A, 1011. D is one greater than C, 1101. And F is one greater than E, 1111.
Computer Memory
Computer memory is frequently organized as BYTEs which are eight bits.
Since one hexadecimal digit represents 4 bits, it takes two hexadecimal digits to represent one byte.
There are 2^8 = 256 different binary values that can be represented in
a byte. These 256 values (written in HEX for brevity) are:
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F
60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F
70 71 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
99 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F
A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF
B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF
C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF
D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF
E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF
FF F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF
Representing Language on a Computer
The problem of how to represent characters in a computer has been solved several ways. One
way is the American Standard Code of Information Interchange (ASCII).
ASCII represents characters as 7 bits.
Here is table modified from a web site
(
http://members.tripod.com/~plangford/index.html).
Hex Char Description Hex Char Hex Char Hex Char
---------------------- --------- ---------- ----------
00 NUL (null) 20 40 @ 60 `
01 SOH (start of heading) 21 ! 41 A 61 a
02 STX (start of text) 22 " 42 B 62 b
03 ETX (end of text) 23 # 43 C 63 c
04 EOT (end of transmission) 24 $ 44 D 64 d
05 ENQ (enquiry) 25 % 45 E 65 e
06 ACK (acknowledge) 26 & 46 F 66 f
07 BEL (bell) 27 ' 47 G 67 g
08 BS (backspace) 28 ( 48 H 68 h
09 TAB (horizontal tab) 29 ) 49 I 69 i
0A LF (line feed, new line) 2A * 4A J 6A j
0B VT (vertical tab) 2B + 4B K 6B k
0C FF (form feed, new page) 2C , 4C L 6C l
0D CR (carriage return) 2D - 4D M 6D m
0E SO (shift out) 2E . 4E N 6E n
0F SI (shift in) 2F / 4F O 6F o
10 DLE (data link escape) 30 0 50 P 70 p
11 DC1 (device control 1) 31 1 51 Q 71 q
12 DC2 (device control 2) 32 2 52 R 72 r
13 DC3 (device control 3) 33 3 53 S 73 s
14 DC4 (device control 4) 34 4 54 T 74 t
15 NAK (negative acknowledge) 35 5 55 U 75 u
16 SYN (synchronous idle) 36 6 56 V 76 v
17 ETB (end of trans. block) 37 7 57 W 77 w
18 CAN (cancel) 38 8 58 X 78 x
19 EM (end of medium) 39 9 59 Y 79 y
1A SUB (substitute) 3A : 5A Z 7A z
1B ESC (escape) 3B ; 5B [ 7B {
1C FS (file separator) 3C < 5C \ 7C |
1D GS (group separator) 3D = 5D ] 7D }
1E RS (record separator) 3E > 5E ^ 7E ~
1F US (unit separator) 3F ? 5F _ 7F DEL
Now let us take two different one-word messages we might wish to cipher: "black" and "white". We
can use the preceding table to find the ASCII codes for the characters in "black" and "white".
message | character | ASCII (Hex) | Binary |
Message 1 | black | 62 6C 61 63 6B | 0110 0010 0110 1100 0110 0001 0110 0011 0110 1011 |
Message 2 | white | 77 68 69 74 65 | 0111 0111 0110 1000 0110 1001 0111 0100 0110 0101 |
But before doing this, we must understand the general "cipher problem."
The Cipher Problem
We have three elements in the encryption process:
1. The plaintext message
2. The key
3. The ciphertext
Let�s start REAL SIMPLE. Let's consider a situation where the plaintext
message, the key, and the ciphertext are all the same length. To
make it even simpler, let's make each one only one bit long. So the key
can be one of two possibilities (0 or 1), and so can the plaintext and
the ciphertext. So, in all, there are 2*2*2 = 8 total possible encipherment
circumstances.
Let�s enumerate ALL 8 POSSIBILITIES.
Possibilities Table |
Possibility | Key | Plaintext | Ciphertext |
1 | 0 | 0 | 0 |
2 | 0 | 0 | 1 |
3 | 0 | 1 | 0 |
4 | 0 | 1 | 1 |
5 | 1 | 0 | 0 |
6 | 1 | 0 | 1 |
7 | 1 | 1 | 0 |
8 | 1 | 1 | 1 |
That�s it! There are no more possibilities than these 8. What does this
mean for the encryption process--the "algorithm"?
An ALGORITHM is a deterministic processes that accepts inputs and transforms them into outputs.
"Deterministic" is important in that the same inputs ALWAYS produce the same output.
Consider ANY algorithm which takes as its inputs the key values of 0 or 1 and the plaintext message
values of 0 or 1.
ANY algorithm can only produce one of the ciphertext outputs seen above. Image the following
hypothetical but REAL SIMPLE algorithm:
Hypothetical Algorithm: Find the value of the Key in column 2 of the Possibilities Table
and the Plaintext message in column 3; then chose as the Ciphertext output whatever number is found
in column 4 of that row.
But there, of course, is a catch with a valid CRYPTOGRAPHIC ALGORITHM.
Given the Key and the Ciphertext, one must be able to get back the Plaintext!
A cryptographic algorithmic should have an INVERSE.
So a cryptographic algorithm could not produce ALL of the eight combinations seen above for the
reason that it is impossible to invert some of the possibilities. For example, some of the mappings
are incompatible from an inverse standpoint, because same values of the key and ciphertext
can lead to two different values of the plaintext. Notice how the following pairs of possibilities
conflict: 1 and 3; 2 and 4; 5 and 7; 6 and 8.
Possibility | Key | Plaintext | Ciphertext |
1 | 0 | 0 | 0 |
3 | 0 | 1 | 0 |
| | | |
2 | 0 | 0 | 1 |
4 | 0 | 1 | 1 |
| | | |
5 | 1 | 0 | 0 |
7 | 1 | 1 | 0 |
| | | |
6 | 1 | 0 | 1 |
8 | 1 | 1 | 1 |
But there two different cryptographic algorithms that could be made from the
Possibilities Table, both of which have inverses:
Cryptographic Algorithm 1 |
Possibility | Key | Plaintext | Ciphertext |
1 | 0 | 0 | 0 |
4 | 0 | 1 | 1 |
6 | 1 | 0 | 1 |
7 | 1 | 1 | 0 |
Cryptographic Algorithm 2 |
Possibility | Key | Plaintext | Ciphertext |
2 | 0 | 0 | 1 |
3 | 0 | 1 | 0 |
5 | 1 | 0 | 0 |
8 | 1 | 1 | 1 |
Of course, the output of Algorithm 2 is merely the same as the output of Algorithm 1,
with 0s and 1s switched. (This is called a logical NOT operation.)
Logic and Its Electronic Representation
Logic, sometimes called
Boolean logic when it is dealing with 0s and 1s, has several elementary rules.
In computers, TRUE is usually represented by a 1. FALSE is represented by a 0.
Electrically a 1 is usually, but not always, represented by a HIGH VOLTAGE. A zero by a
LOW VOLTAGE.
The three basic operations in logic are NOT, AND, and OR:
Logical Operation | Input(s) | Output |
NOT | 0 | | 1 |
| 1 | | 0 |
| | | |
AND | 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
| | | |
OR | 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
A derivative operation called an EXCLUSIVE-OR, abbreviated XOR, is defined as follows:
Logical Operation | Input(s) | Output |
XOR | 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
In XOR, if the two input bits have the the same value, they sum to 0. If they have
different values, they sum to 1.
Now look back at Cryptographic Algorithm 1. It is, in fact,
the exclusive-or (XOR) of the key and plaintext.
Algorithm 1: Ciphertext Output = Key XOR Plaintext.
Cryptographic Algorithm 2, meanwhile, is just the NOT of Algorithm 1.
Algorithm 2: Ciphertext Output = NOT (Key XOR Plaintext).
The REALLY IMPORTANT property of the XOR is THAT IT HAS AN INVERSE.
By contrast, logical AND does not have an inverse for the reason that if the
Key and (Key AND Plaintext) are both 0, then the Plaintext itself is ambiguously
either 0 or 1.
Logical Operation | Key Input | Plaintext Input | Key AND Plaintext | |
AND | 0 | 0 | 0 | oops |
| 0 | 1 | 0 | oops |
| 1 | 0 | 0 | |
| 1 | 1 | 1 | |
Likewise, logical OR does not have an inverse for the reason that if the
Key and (Key OR Plaintext) are both 1, then the Plaintext itself is ambiguously
either 0 or 1.
Logical Operation | Key Input | Plaintext Input | Key OR Plaintext | |
OR | 0 | 0 | 0 | |
| 0 | 1 | 1 | |
| 1 | 0 | 1 | oops |
| 1 | 1 | 1 | oops |
So logical AND and OR don�t work well for a crypto algorithm, but the XOR does because it has
an inverse.
How to Create Two Keys for Deniability
The XOR works even better from a legal standpoint. Imagine the following
conversation:
Ciphercop: We have the ciphertext 0 and we CAUGHT you with the key with a bit value of 1, so you sent a plaintext 1.
Citizen: No I did not! You PLANTED the key with bit value 1. The real key bit is a 0, and I sent 0 as the plaintext.
Let�s generate a key for the REAL WORLD crypto messages "black" and "white",
message | character | ASCII (Hex) | Binary |
Message 1 | black | 62 6C 61 63 6B | 0110 0010 0110 1100 0110 0001 0110 0011 0110 1011 |
Message 2 | white | 77 68 69 74 65 | 0111 0111 0110 1000 0110 1001 0111 0100 0110 0101 |
and see if we can produce a REAL EXAMPLE of a SECOND KEY.
Here�s a key, which we will call key 1:
key 1: 1010 0101 1100 0011 1110 0111 1111 0000 0110 1001
Key 1 doesn�t look too random. Each group of four bits is followed by its
logical NOT (e.g. NOT(1010) = 0101, etc.). Which leads to another lesson.
To claim that a sequence of 0s and 1s is random requires statistical testing.
Otherwise, the state of the sequence is UNKNOWN.
Here's another key, which we will call key 2:
key 2: 1011 0000 1100 0100 1110 1111 1110 0111 0110 0111
These two keys produce the same ciphertext for the two different
messages "black" and "white".
black | 0110 0010 0110 1100 0110 0001 0110 0011 0110 1011 |
key 1 | 1010 0101 1100 0011 1110 0111 1111 0000 0110 1001 |
ciphertext (XOR) | 1100 0111 1010 1100 1000 0110 1001 0011 0000 0010 |
white | 0111 0111 0110 1000 0110 1001 0111 0100 0110 0101 |
key 2 | 1011 0000 1100 0100 1110 1111 1110 0111 0110 0111 |
ciphertext (XOR) | 1100 0111 1010 1100 1000 0110 1001 0011 0000 0010 |
So when the ciphercops FALSELY accuse you of encrypting "black", you SCREAM
"Bull pucky!", and produce key 2 to show that you, IN FACT, encrypted "white".
Then sue the government--pro se, of course. (See http://jya.com/whpfiles.htm.)
The recipe for producing the second key in this example is simple. Take two plaintext messages of the
same length. Encrypt one of them with an arbitrary key that yields a ciphertext of the
same length. XOR the ciphertext with the second plaintext message. The result is the
second key. Store this one for plausible deniability.
So from the standpoint of plausible deniability it is BEST to have TWO KEYS
for any given encryption:
1. The REAL KEY
2. The key you can CLAIM was the REAL KEY. Just in case you get caught.
"Have we gone beyond the bounds of reasonable dishonesty?"
--CIA memo |
(The CIA quote is from
Weird History 101, by John Richard Stephens, page 55.)
None of us want to get caught going beyond the bounds of reasonable dishonesty.
Thus far two criteria of a worth candidate for cryptographic algorithm have been established.
Criterion 1: The ciphertext is invertible with the help of a key back into the plaintext.
Criterion 2: There is ALWAYS a second key. Just in case you get caught.
|
Plaintext and Ciphertext Sizes
The plaintext and ciphertext should be the same size.
First, note that if the plaintext is longer than the ciphertext, then the
ciphertext is not invertible. For example, let�s suppose that the plaintext
is two bits long and the ciphertext is one bit long.
Plaintext | Ciphertext |
0 0 | 0 or 1 |
0 1 | 1 or 0 |
1 0 | oops |
1 1 | oops |
After the first two ciphertext bits have been assigned to plaintext pairs,
the next two plaintext pairs (10,11) must conflict with this assignment. The
ciphertext thus correspondents to more than one plaintext possibility.
We run into problems for the reason that we cannot establish a one-to-one
correspondence between the plaintext and cipher text and, therefore, can�t
possibly have an inverse.
Second, if the plaintext is shorter than the ciphertext, then the
ciphertext can't be trusted. It may include too much information. For
example, let�s suppose that the plaintext is one bit long, the key is one
bit long, and the cipher text is two bits long.
Iran Cryptographic Algorithm |
Key | Plaintext | Ciphertext |
0 | 0 | 00 |
0 | 1 | 10 |
1 | 0 | 11 |
1 | 1 | 01 |
Not only is the above algorithm invertible, but now the crypto key has been sent
along with the ciphertext in the second bit position!
That is, the first bit in the ciphertext is is the value of (key XOR plaintext).
The second bit is the key itself. So if you XOR the two ciphertext bits
with each other, you recover the plaintext bit.
You might ask who would be audacious enough to pull such stunt. The Great Satan,
of course.
For the story of how the National Security Agency (NSA) bugged the encryption
equipment that was sold by a Swiss company to 140 nations around the world, see
the following links:
http://www.aci.net/kalliste/speccoll.htm
http://caq.com/cryptogate
http://www.qainfo.se/~lb/crypto_ag.htm
http://jya.com/whpfiles.htm
And the Great Satan got caught. No plan B. Or in crypto parlance, no second key.
So we have a third criterion for a cryptographic algorithm we might wish to adopt.
Criterion 3: The length of the plaintext equals the length of the ciphertext.
|
In simple terms, if more bits come out of a crypto algorithm than go in, WATCH OUT!
Otis Mukinfuss and the Advanced Encryption Standard
Bruce Hawkinson ([email protected])
WAS Sandia National Laboratories Lab News editor some years ago.
In one editorial, Hawkinson wrote that while we was traveling for Sandia, he
spent his motel time looking up strange names in the phone book. One name
I recall mentioned was Steely Gray who was a government office furniture
salesman.
Hawkinson concluded his article by writing his all-time favorite name was
Otis Mukinfuss.
Hawkinson was no longer editor of Sandia�s Lab News shortly thereafter.
J. Orlin Grabbe has done an excellent job writing about cryptographic algorithms in
Cryptography and Number Theory for Digital Cash.
One inescapable conclusion from Grabbe�s internet article is that from a layman�s
standpoint public key cryptography is an incomprehensible mess. A Muckinfuss.
The National Institute of Standards and Technology (NIST) is holding a CONTEST to select
an Advanced Encryption Standard to replace the
current Data Encryption Standard (DES).
Click through the candidates to view some additional examples of Mukinfusses.
So another criterion has been established for a cryptographic algorithm to be considered
for adoption.
Criterion 4: The crypto algorithm must be simple and CONVINCE EVERYONE that it is
ONLY PERFORMING ITS SIMPLE INTENDED FUNCTION.
|
While we are at the NIST web site, the NIST Advanced Encryption Standard contest reminds
me of a the plot of a recent movie, The Game, starring Michael Douglas and Sean Penn:
The film is a thriller directed by David Fincher (Se7en). "The Game"
is what begins when a high-powered businessman named Nicholas Van
Orton (Douglas) receives the birthday gift of a lifetime from his
brother he alienated years ago (Penn). What Nicholas gets is entry
into a mysterious new form of entertainment provided by C.R.S.
(Consumer Recreational Services) simply called "The Game." It
proves to be an all-consuming contest with only one rule: there are
no rules. By the time Van Orton realizes he is in, it is too late to get
out. ...
(See
http://www.movietunes.com/soundtracks/1997/game/.)
NIST does not appear to publish any criteria for winning the AES contest!
Look at
http://www.nist.gov/public_affairs/confpage/980820.htm and decide for
yourself.
Perfect Cryptography
Here we have described a process of encrypting a plaintext by XORing it
with a key of the same length. This encryption technique is called a
"one-time pad", or Vernam cipher. Just as long as each key is only used once,
the encryption technique is perfectly secure.
The one-time pad described here satisfies all criteria mentioned so far:
1. The ciphertext is invertible with the help of a key back into the plaintext.
2. There is ALWAYS a second key. Just in case you get caught.
3. The length of the plaintext equals the length of the ciphertext.
4. The crypto algorithm must be simple and CONVINCE EVERYONE that it is
ONLY PERFORMING ITS SIMPLE INTENDED FUNCTION.
I add
Criterion 5: The length of the key must equal the length of the plaintext.
|
Extensive mathematics or complication fails Criterion 4.
Public key cryptograpy that uses the RSA algorithm MAY fail Criterion 1 if the
message divides the product of the two prime numbers, p and q, used in the modulus.
Most crypto algorithms are designed so that the key cannot be recovered from a
plaintext-ciphertext pair. Therefore, they fail Criterion 2.
Criterion 3 is much more difficult to ensure against.
Black and White Hats
American western movie producers used to aid their audiences in identification
of the heroes and villains. The heroes wore white hats. The villains, black hats.
US government agencies adopted the term �black hatter� to describe an employee whose
job it is to break into THINGS. Or screw them up:
http://www.jya.com/whp1.htm
A �white hatter� is one who analyzes THINGS to make sure they cannot be broken into.
And they can�t be screwed up.
But the empirical fact is that the �black hatters� can figure-out methods to transmit
the key on a covert channel, tell the �white hatters� they did this. And the �white
hatters� can�t find out how they did it.
Algorithmic Processes
Suppose the key is five bits:
1 0 1 0 1
Suppose the plaintext is six bits:
1 1 1 1 1 1
And the ciphertext is also six bits:
1 0 1 1 0 1
Ask the cryptographer give you a key which changes ONLY the sixth bit of the
ciphertext, as in the following:
1 0 1 1 0 0
You like the other 5 bits just fine.
If the cryptographer can�t, then you might look for another algorithm to adopt.
Conclusion
We have five criteria to judge the outcome of the NIST Advanced Encryption Standard contest.
If none of the algorithms pass the five tests, we will not be discouraged.
We know that Gilbert S. Vernam and Joseph O. Mauborgne solved the crytptography problem in 1918,
when they created the one-time pad. (See
"What is a One-Time Pad?".)
William H. Payne
P.O. Box 14838
Albuquerque, New Mexico 87191
505-292-7037
Embedded Controller Forth.
Generalized Feedback Shift Registers
Here are some links to some of my students:
Ted Lewis (Friction-Free Economy)
John Sobolewski (HPCERC)
-30-
|