Autokey cipher

From Wikipedia, the free encyclopedia

A tabula recta for use with an autokey cipher

An autokey cipher (also known as the autoclave cipher^[1]) is a cipher that incorporates the message (the plaintext) into the key. There are two forms of autokey cipher: key-autokey and text-autokey ciphers. A key-autokey cipher uses previous members of the keystream to determine the next element in the keystream. A text-autokey cipher uses the previous message text to determine the next element in the keystream.

In modern cryptography, self-synchronizing stream ciphers are autokey ciphers.

History

The first autokey cipher was invented by Girolamo Cardano, but it contained a weakness that made it easy to break. Other cryptographers made a number of attempts to produce an autokey system that was not trivial to break, and eventually one was invented by Blaise de Vigenère. Vigenère started with a tabula recta: a square containing 26 copies of the alphabet, the first row starting with 'A', the next row starting with 'B', and so on, like the one above.

To encrypt a plaintext, one locates the row for the first letter to be encrypted and the column for the first letter of the key. The letter where the row and column cross is the ciphertext letter. The process is repeated for each remaining letter of the message, pairing it with the next letter of the key. So far, this is identical to an earlier cipher scheme that, confusingly, was erroneously attributed to Vigenère: the Vigenère cipher.
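The table lookup described above is equivalent to adding the letter positions modulo 26. The following Python sketch is only an illustration: it assumes the 26-letter uppercase Latin alphabet, and the names tabula_lookup and add_mod_26 are hypothetical helpers, not part of any historical description.

```python
import string

ALPHABET = string.ascii_uppercase

# Build the tabula recta: row i is the alphabet rotated left by i places,
# so row 0 starts with 'A', row 1 with 'B', and so on.
TABULA_RECTA = [ALPHABET[i:] + ALPHABET[:i] for i in range(26)]


def tabula_lookup(plain_letter, key_letter):
    """Return the letter where the plaintext row and the key column cross."""
    row = ALPHABET.index(plain_letter)
    col = ALPHABET.index(key_letter)
    return TABULA_RECTA[row][col]


def add_mod_26(plain_letter, key_letter):
    """Compute the same letter arithmetically as (p + k) mod 26."""
    total = ALPHABET.index(plain_letter) + ALPHABET.index(key_letter)
    return ALPHABET[total % 26]


print(tabula_lookup("T", "U"))  # row 'T', column 'U'
print(add_mod_26("T", "U"))     # same letter, without the table
```

Because the two functions agree for every pair of letters, the table can be replaced by modular arithmetic wherever that is more convenient.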

Vigenère's innovation, however, was in the way the key was generated. He started with a relatively short keyword and appended the message to it. So if the keyword was "QUEENLY" and the message was "ATTACK AT DAWN", the key would be "QUEENLYATTACKATDAWN".

Plaintext:  ATTACK AT DAWN...
Key:        QUEENL YA TTACK AT DAWN....
Ciphertext: QNXEPV YT WTWP...

The ciphertext message would therefore be "QNXEPVYTWTWP".
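The key construction and the encryption above can be sketched in Python as follows. This is a minimal sketch, assuming uppercase Latin letters only (anything else is dropped) and a non-empty keyword; the function names autokey_encrypt and autokey_decrypt are hypothetical.

```python
A = ord("A")


def _letters(text):
    """Keep only the letters A-Z, after converting to uppercase."""
    return [c for c in text.upper() if "A" <= c <= "Z"]


def autokey_encrypt(plaintext, keyword):
    """Encrypt with the key formed by the keyword followed by the message."""
    message = _letters(plaintext)
    key = _letters(keyword) + message
    # zip stops at the end of the message, so surplus key letters are ignored.
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(message, key)
    )


def autokey_decrypt(ciphertext, keyword):
    """Decrypt letter by letter, extending the key with recovered plaintext."""
    key = _letters(keyword)
    plain = []
    for i, c in enumerate(_letters(ciphertext)):
        p = chr((ord(c) - ord(key[i])) % 26 + A)
        plain.append(p)
        key.append(p)  # each recovered letter becomes later key material
    return "".join(plain)


print(autokey_encrypt("ATTACK AT DAWN", "QUEENLY"))
print(autokey_decrypt("QNXEPVYTWTWP", "QUEENLY"))
```

Decryption has to proceed from left to right, because each key letter beyond the keyword is only known once the corresponding earlier plaintext letter has been recovered.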

This text-autokey cipher was hailed as "le chiffre indéchiffrable", and was indeed undeciphered for over 200 years, until Charles Babbage discovered a means of breaking the cipher.

Cryptanalysis

Consider an example message, "meet at the fountain", encrypted with the keyword "KILT":

plaintext:  MEETATTHEFOUNTAIN (unknown)
key:        KILTMEETATTHEFOUN (unknown)
ciphertext: WMPMMXXAEYHBRYOCA (known)

We try common words, bigrams, trigrams, etc. in all possible positions in the key. For example, trying "THE" in consecutive three-letter blocks at each of the three possible alignments:

ciphertext: WMP MMX XAE YHB RYO CA 
key:        THE THE THE THE THE ..
plaintext:  DFL TFT ETA FAX YRK ..

ciphertext: W MPM MXX AEY HBR YOC A
key:        . THE THE THE THE THE .
plaintext:  . TII TQT HXU OUN FHY .

ciphertext: WM PMM XXA EYH BRY OCA
key:        .. THE THE THE THE THE
plaintext:  .. WFI EQW LRD IKU VVW
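The three tables above amount to subtracting the guessed key fragment from the ciphertext at every possible offset. A minimal Python sketch of that step follows; the function name drag_crib and the 0-based offsets are assumptions made for this illustration.

```python
def drag_crib(ciphertext, crib):
    """Treat the crib as key material at each offset and return the
    plaintext fragment that it would imply there."""
    results = []
    for offset in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[offset:offset + len(crib)]
        fragment = "".join(
            chr((ord(c) - ord(k)) % 26 + ord("A")) for c, k in zip(window, crib)
        )
        results.append((offset, fragment))
    return results


# Print every candidate fragment for the guess "THE".
for offset, fragment in drag_crib("WMPMMXXAEYHBRYOCA", "THE"):
    print(offset, fragment)
```

The fragments in the first table are those at offsets 0, 3, 6, 9 and 12; the second and third tables correspond to the offsets in between.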

We sort the plaintext fragments in order of likelihood:

unlikely <------------------> promising
EQW DFL TFT ... ... ... ... ETA OUN FAX

We know that a correct plaintext fragment will also appear in the key, shifted right by the length of the keyword. Similarly, our guessed key fragment ("THE") will also appear in the plaintext, shifted left by the same amount. So by guessing keyword lengths (probably between 3 and 12) we can reveal more plaintext and key.

Trying this with "OUN" (possibly after wasting some time with the others):

shift by 4:
ciphertext: WMPMMXXAEYHBRYOCA
key:        ......ETA.THE.OUN
plaintext:  ......THE.OUN.AIN

shift by 5:
ciphertext: WMPMMXXAEYHBRYOCA
key:        .....EQW..THE..OU
plaintext:  .....THE..OUN..OG

shift by 6:
ciphertext: WMPMMXXAEYHBRYOCA
key:        ....TQT...THE...O
plaintext:  ....THE...OUN...M

We see that a shift of 4 looks good (both of the others have unlikely Qs), so we shift the revealed "ETA" back by 4 into the plaintext:

ciphertext: WMPMMXXAEYHBRYOCA
key:        ..LTM.ETA.THE.OUN
plaintext:  ..ETA.THE.OUN.AIN

We have a lot to work with now. The keyword is probably 4 characters long ("..LT"). The fifth key letter, "M", must also be the first letter of the plaintext, so we have some of the message:

M.ETA.THE.OUN.AIN
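The shifting procedure used above can be automated. The sketch below assumes a guessed key fragment at a known 0-based position and a guessed keyword length, and repeatedly applies the rule that the plaintext letter at position i is the key letter at position i + shift. The names subtract and propagate are hypothetical, and unknown letters are shown as dots, as in the tables above.

```python
def subtract(a, b):
    """Return the letter (a - b) mod 26."""
    return chr((ord(a) - ord(b)) % 26 + ord("A"))


def propagate(ciphertext, start, key_fragment, shift):
    """Spread a guessed key fragment through the key and the plaintext,
    assuming the keyword is `shift` letters long."""
    n = len(ciphertext)
    key = ["."] * n
    plain = ["."] * n
    pending = []
    # Place the guessed key fragment and the plaintext it implies.
    for i, k in enumerate(key_fragment, start):
        key[i] = k
        plain[i] = subtract(ciphertext[i], k)
        pending.append(i)
    while pending:
        i = pending.pop()
        # Known plaintext reappears in the key, `shift` places to the right.
        right = i + shift
        if right < n and key[right] == ".":
            key[right] = plain[i]
            plain[right] = subtract(ciphertext[right], key[right])
            pending.append(right)
        # Known key material beyond the keyword is plaintext from
        # `shift` places to the left.
        left = i - shift
        if left >= 0 and plain[left] == ".":
            plain[left] = key[i]
            key[left] = subtract(ciphertext[left], plain[left])
            pending.append(left)
    return "".join(key), "".join(plain)


for shift in (4, 5, 6):
    key, plain = propagate("WMPMMXXAEYHBRYOCA", 10, "THE", shift)
    print(shift, key, plain)
```

Unlike the step-by-step tables, this sketch keeps propagating until nothing new can be derived, so for a shift of 4 it also recovers the first plaintext letter and the first letter of the keyword.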

Because each plaintext guess also determines the key 4 characters to the right, we get immediate feedback on whether a guess is correct, so we can quickly fill in the gaps:

MEETATTHEFOUNTAIN

This cryptanalysis is easy because of the feedback between plaintext and key. A 3-character guess reveals 6 more characters, which in turn reveal further characters; this cascade lets us rule out incorrect guesses quickly.

Autokey in modern ciphers

Modern autokey ciphers use very different encryption methods, but they follow the same approach of using either key bytes or plaintext bytes to generate more key bytes. Most modern stream ciphers are based on pseudorandom number generators: the key is used to initialize the generator, and either key bytes or plaintext bytes are fed back into the generator to produce more bytes.

Some stream ciphers are said to be "self-synchronizing" because the next key byte usually depends only on the previous N bytes of the message as transmitted (the ciphertext). If a byte in the message is lost or corrupted, the keystream will therefore also be corrupted, but only until N bytes have been processed. At that point the keystream goes back to normal, and the rest of the message will decrypt correctly.
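The recovery behaviour can be illustrated with a toy construction. The sketch below is not a real or secure cipher: it assumes that each keystream byte is the first byte of a SHA-256 hash of the key and the previous N ciphertext bytes, with N = 4 and an all-zero starting window chosen arbitrarily. All names in it are hypothetical.

```python
import hashlib

N = 4  # number of previous ciphertext bytes feeding the keystream (assumed)


def keystream_byte(key, window):
    """Derive one keystream byte from the key and the last N ciphertext bytes."""
    return hashlib.sha256(key + window).digest()[0]


def encrypt(key, plaintext, iv=bytes(N)):
    window = iv
    out = bytearray()
    for p in plaintext:
        c = p ^ keystream_byte(key, window)
        out.append(c)
        window = window[1:] + bytes([c])  # slide the ciphertext window
    return bytes(out)


def decrypt(key, ciphertext, iv=bytes(N)):
    window = iv
    out = bytearray()
    for c in ciphertext:
        out.append(c ^ keystream_byte(key, window))
        window = window[1:] + bytes([c])  # uses received bytes, even if corrupted
    return bytes(out)


key = b"example key"
message = b"MEET AT THE FOUNTAIN AT NOON"

ciphertext = bytearray(encrypt(key, message))
ciphertext[5] ^= 0xFF  # corrupt one byte in transit
garbled = decrypt(key, bytes(ciphertext))

print(message)
print(garbled)
```

In this sketch the corrupted byte itself and up to N following bytes may decrypt incorrectly; once the damaged byte has left the window, the output matches the original message again.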

References

[1] Bletchley Park Cryptographic Dictionary
