Autokey Cipher
The Vigenère Cipher was the first of many Polyalphabetic Ciphers which worked in much the same way. The only difference between these related ciphers is the way in which the keystream is generated.
The Autokey Cipher is one such example. In general, the term autokey refers to any cipher where the key is based on the original plaintext. In its simplest form, it was first described by Girolamo Cardano, and consisted of using the plaintext itself as the keystream. However, since there was no key involved in this system, it suffered the same major flaw as the Atbash and the Trithemius Ciphers: if you knew it had been used, it was trivial to decode.
The most famous version of the Autokey Cipher, however, was described by Blaise de Vigenère in 1586 (the man to whom the Vigenère Cipher was later misattributed). This cipher incorporates a keyword in the creation of the keystream, as well as the original plaintext.
Encryption
Encryption using the Autokey Cipher is very similar to the Vigenère Cipher, except in the creation of the keystream.
The keystream is made by starting with the keyword or keyphrase, and then appending to the end of this the plaintext itself.
We then use a Tabula Recta to find the keystream letter across the top, and the plaintext letter down the left, and use the crossover letter as the ciphertext letter.
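The Tabula Recta lookup described above can be sketched in a few lines of Python. This is only an illustration: the names ALPHABET, TABULA_RECTA and lookup are our own, and we assume the standard 26-letter alphabet with the table rows shifted one place at a time.

```python
import string

ALPHABET = string.ascii_uppercase

# Row r of the Tabula Recta is the alphabet shifted left by r places.
TABULA_RECTA = [ALPHABET[r:] + ALPHABET[:r] for r in range(26)]

def lookup(key_letter: str, plain_letter: str) -> str:
    """Return the letter where the keystream column meets the plaintext row."""
    col = ALPHABET.index(key_letter.upper())    # keystream letter across the top
    row = ALPHABET.index(plain_letter.upper())  # plaintext letter down the left
    return TABULA_RECTA[row][col]

print(lookup("K", "M"))  # W
```

Because each row is a shifted alphabet, the lookup is the same as adding the two letter positions (with A as 0) and wrapping round after Z.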
As an example we shall encode the plaintext "meet me at the corner" using the keyword king. First we must generate the keystream, which starts with the keyword, and then continues with the plaintext itself, giving "kingmeetme...".
With the keystream generated, we use the Tabula Recta, just like for the Vigenère Cipher. We find K across the top, and M down the left side. The ciphertext letter is "W".
For the second letter, "e", we go to I across the top, and E down the left to get the ciphertext letter "M".
Continuing in this way we get the ciphertext "WMRZYIEMFLEVHYRGF".
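The whole encryption process can be sketched as follows. This is a minimal sketch rather than a definitive implementation: the function name autokey_encrypt is our own, and we assume that spaces and punctuation are dropped and that the keyword contains only the letters a to z.

```python
import string

def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt with the Autokey Cipher (keyword followed by the plaintext as keystream)."""
    # Assumption: only the letters a-z are enciphered; everything else is discarded.
    letters = [c for c in plaintext.lower() if c in string.ascii_lowercase]
    # The keystream is the keyword with the plaintext appended, cut to message length.
    keystream = (list(keyword.lower()) + letters)[:len(letters)]
    ciphertext = []
    for p, k in zip(letters, keystream):
        # Adding the letter positions modulo 26 is equivalent to the Tabula Recta lookup.
        total = (ord(p) - ord("a") + ord(k) - ord("a")) % 26
        ciphertext.append(chr(total + ord("A")))
    return "".join(ciphertext)

print(autokey_encrypt("meet me at the corner", "king"))  # WMRZYIEMFLEVHYRGF
```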
Decryption
To decrypt a ciphertext using the Autokey Cipher, we start just as we did for the Vigenère Cipher, and find the first letter of the key across the top, find the ciphertext letter down that column, and take the plaintext letter at the far left of this row. As well as being the plaintext letter, we now need to add this letter to the end of the keystream as we shall need it later. Continuing to decode each letter, we add them to the end of the keystream each time.
We shall decrypt the ciphertext "QNXEPKMAEGKLAAELDTPDLHN" which has been encrypted using the keyword queen. We start with the information shown in the table below.
We look along the top row to find the letter from the keystream, Q. We look down this column (in yellow) and find the ciphertext letter "Q" (in green). We then go along this row (in blue) to the left hand edge, and the letter here (in purple) is the plaintext letter. In this case it is "a".
We now add this to the end of the keystream, as well as to the plaintext row.
We then continue in the same way to retrieve the plaintext "attack the east wall at dawn".
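The decryption process, where each recovered plaintext letter is added to the end of the keystream, can be sketched like this. The function name autokey_decrypt is our own choice, and we assume the ciphertext uses only the letters A to Z and the keyword only letters.

```python
import string

def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Decrypt an Autokey ciphertext, growing the keystream as letters are recovered."""
    keystream = list(keyword.lower())  # we start knowing only the keyword
    plaintext = []
    letters = [c for c in ciphertext.upper() if c in string.ascii_uppercase]
    for i, c in enumerate(letters):
        k = keystream[i]
        # Subtracting the keystream letter reverses the Tabula Recta lookup.
        p = chr((ord(c) - ord("A") - (ord(k) - ord("a"))) % 26 + ord("a"))
        plaintext.append(p)
        keystream.append(p)  # the recovered letter will be needed later
    return "".join(plaintext)

print(autokey_decrypt("QNXEPKMAEGKLAAELDTPDLHN", "queen"))  # attacktheeastwallatdawn
```

The output has no spaces, since these were removed before encryption; we put the word breaks back in by eye.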
Discussion
The Autokey Cipher generates its keystream in a much more secure way than the Vigenère Cipher, which is remarkable given that the Vigenère Cipher was believed to be unbreakable for over 200 years. The weakness of the Vigenère Cipher was the repeating nature of the keystream, which allowed us to work out the length of the keyword and thus perform frequency analysis on the different parts.
The Autokey Cipher does not suffer from this weakness, as its keystream does not repeat. However, even though it is more secure, it is still not impossible to break the Autokey Cipher. The weakness here is that it is likely that some common words will have been used in the plaintext, and thus also in the keystream. For example "the" is likely to appear in the keystream somewhere, and so by trying this everywhere we can identify other bits of likely plaintext, and put these back in the keystream, and so on.
As an example, we have intercepted the message "PKBNEOAMMHGLRXTRSGUEWX", and we know an Autokey Cipher has been used. We are going to have a look to see if the word "the" produces any leads. If the word appears in the plaintext, then it is also likely to appear in the keystream. We start by putting "the" in every possible position in the keystream, to see if we get any fragments that make sense.
With this done, we identify the most likely plaintext fragments. For example, "bxs" and "zzq" are very unlikely plaintext, but "tac" and "ako" are more likely possibilities. We shall start with "tac". We know that, since it is an Autokey Cipher, if "tac" is plaintext it will also appear in the keystream. Also, if "THE" is in the keystream it appears in the plaintext.
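Trying "the" in every position of the keystream is tedious by hand but simple to automate. The sketch below is one way to do it; the name drag_crib is our own, and positions are counted from 0.

```python
def drag_crib(ciphertext: str, crib: str) -> list[tuple[int, str]]:
    """Place the crib at every keystream position and return the implied plaintext."""
    ct = ciphertext.upper()
    crib = crib.upper()
    fragments = []
    for start in range(len(ct) - len(crib) + 1):
        window = ct[start:start + len(crib)]
        # Ciphertext minus keystream (mod 26) gives the plaintext fragment.
        fragment = "".join(
            chr((ord(c) - ord(k)) % 26 + ord("a")) for c, k in zip(window, crib)
        )
        fragments.append((start, fragment))
    return fragments

for start, fragment in drag_crib("PKBNEOAMMHGLRXTRSGUEWX", "THE"):
    print(start, fragment)
```

In this listing "tac" appears at position 8 and "ako" at position 14, while "zzq" and "bxs" appear at positions 16 and 18.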
If the keyword had length 4, then the "t" of "the" in the plaintext will be 4 places to the left of the "T" in "THE" in the keystream, and similarly for "tac". Putting this information in the grid we get the following table. The red letters are the information we can then work out using the Tabula Recta.
From this we would have "yxr" as some plaintext, which seems unlikely. So we try a different length of keyword. The keyword is likely to be somewhere between 3 and 12 letters long. We shall look at the next couple.
We can continue down this route, but it does not get us anywhere. The hopeful "IGA" in the keystream (and in the keyword, if it is of length 6) seems less likely given the "arq" it produces in the plaintext.
The plaintext "tac" has not helped us, so let's go back and try "ako". We do the same thing, but this time with the position of "THE" that produced "ako".
With this last one, we get "TAC" which is a possible piece of plaintext, and "wn" finishing the message, which could also work. With this, we decide to investigate a little bit more along this line of inquiry. Just as we did before, if "TAC" is in the keystream, it must be in the plaintext, so we can add it to the grid, and use it to work out some more keystream.
The revealed letters "INC" are the third, fourth and fifth letters of the keystream, and as we are working with a keyword of length 6, they would be in the keyword, not the plaintext. We can then think about words of length 6 with these letters (or use a crossword solver), and we find the most plausible is probably prince or flinch. We try the former of these.
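The keyword-length test used above can be sketched as follows. The helper names subtract and try_length are hypothetical, positions are counted from 0, and we assume the crib lies beyond the keyword (its position is at least the keyword length being tested).

```python
def subtract(ciphertext: str, start: int, known: str) -> str:
    """Subtract known letters from the ciphertext (mod 26), beginning at start."""
    ct = ciphertext.upper()
    out = []
    for offset, k in enumerate(known.upper()):
        i = start + offset
        if 0 <= i < len(ct):  # letters falling outside the message are skipped
            out.append(chr((ord(ct[i]) - ord(k)) % 26 + ord("A")))
    return "".join(out)

def try_length(ciphertext: str, pos: int, crib: str, fragment: str, length: int):
    """Test one keyword length for a crib placed in the keystream at pos."""
    # The crib in the keystream was plaintext `length` places earlier,
    # which reveals the keystream letters at that earlier position.
    earlier_keystream = subtract(ciphertext, pos - length, crib)
    # The fragment is plaintext at pos, so it is keystream `length` places later,
    # which reveals the plaintext letters at that later position.
    later_plaintext = subtract(ciphertext, pos + length, fragment).lower()
    return earlier_keystream, later_plaintext

CIPHERTEXT = "PKBNEOAMMHGLRXTRSGUEWX"
print(try_length(CIPHERTEXT, 8, "THE", "tac", 4))   # ('LHW', 'yxr')
print(try_length(CIPHERTEXT, 14, "THE", "ako", 6))  # ('TAC', 'wn')
print(subtract(CIPHERTEXT, 2, "TAC"))               # INC
```

The last line repeats the same step with "tac" as plaintext in the third, fourth and fifth positions, which is where the letters "INC" come from.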
As this has produced a word that makes sense, it is very likely that we have found the keyword. We can now continue to decode the message by putting in the rest of the known plaintext to the keystream, or we can decrypt it now that we know the keyword.
Finally, we retrieve the plaintext "attack at the break of dawn".
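Once we have a shortlist of possible keywords, checking each one is quick. The sketch below simply decrypts the message with both candidates so that we can judge the results by eye; the function name autokey_decrypt is our own, and we assume the ciphertext and keywords contain only letters.

```python
def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Decrypt an Autokey ciphertext, extending the keystream with recovered letters."""
    keystream = list(keyword.upper())
    plaintext = []
    for i, c in enumerate(ciphertext.upper()):
        p = chr((ord(c) - ord(keystream[i])) % 26 + ord("A"))
        plaintext.append(p.lower())
        keystream.append(p)  # recovered plaintext becomes later keystream
    return "".join(plaintext)

CIPHERTEXT = "PKBNEOAMMHGLRXTRSGUEWX"
for candidate in ("prince", "flinch"):
    print(candidate, autokey_decrypt(CIPHERTEXT, candidate))
```

With "prince" this prints "attackatthebreakofdawn"; a wrong candidate gives text that does not read as English from the very first letters.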
There are several parts to this system that worked well in this example. The first word we chose to check, "THE", was indeed in the plaintext. In reality, it may take a few goes to find a word that does appear. We also found a sensible plaintext segment on our second go with "ako". We could have tried many other possibilities before getting to this one. The final guess of the keyword relied on it being a word. To make the encryption more secure, they might have used a nonsensical 'word', which would have slowed us down as well.
Although there are difficulties in using this method, and it is quite long-winded doing it by hand, with the help of a computer we can identify the possibilities very quickly.