## Homophonic Substitution

Homophonic Substitution was an early attempt to make Frequency Analysis a less powerful method of cryptanalysis. The basic idea behind homophonic substitution is to allocate more than one letter or symbol to the higher frequency letters. For example, you might use 6 different symbols each for "e" and "t", 2 symbols for "m" and 1 symbol for "z".

Clearly, this cipher will require a ciphertext alphabet of more than 26 symbols, as each plaintext letter needs at least one ciphertext symbol, and many need more than one. The standard way to do this is to include the digits in the ciphertext alphabet, but you can also use a mixture of uppercase, lowercase and upside-down letters. Some people even design artistic symbols to use.

We need a key of some form to order the symbols of the ciphertext alphabet, and we shall use a keyword, as for the Mixed Alphabet Cipher. In a similar way, we use the symbols from the keyword first, without repeats, then the rest of the ciphertext alphabet. However, we assign multiple spaces to some plaintext letters. The examples below use the keyphrase "18 fresh tomatoes and 29 cucumbers".
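The keyed ordering described above can be sketched in Python. This is a minimal illustration, not the original table: the function names `keyed_symbols` and `build_table` are hypothetical, the ciphertext alphabet is assumed to be the 26 uppercase letters followed by the 10 digits, and the slot counts in `SLOTS` are an assumption chosen to agree with the worked examples in this section.

```python
from string import ascii_lowercase, ascii_uppercase, digits

def keyed_symbols(keyphrase, symbols=ascii_uppercase + digits):
    """Keyphrase symbols first (no repeats), then the unused symbols in order."""
    seen = []
    for ch in keyphrase.upper() + symbols:
        if ch in symbols and ch not in seen:
            seen.append(ch)
    return "".join(seen)

# ASSUMED allocation: how many ciphertext symbols each plaintext letter gets.
# Letters not listed get one symbol. The counts must sum to 36 here.
SLOTS = {"a": 2, "e": 4, "h": 2, "n": 2, "o": 2, "s": 2, "t": 3}

def build_table(keyphrase):
    """Hand out the keyed symbols to the plaintext letters a..z in turn."""
    order = iter(keyed_symbols(keyphrase))
    return {p: [next(order) for _ in range(SLOTS.get(p, 1))]
            for p in ascii_lowercase}

table = build_table("18 fresh tomatoes and 29 cucumbers")
print(keyed_symbols("18 fresh tomatoes and 29 cucumbers"))
print(table["e"], table["n"])
```

With these assumptions the keyed order comes out as `18FRESHTOMAND29CUBGIJKLPQVWXYZ034567`, and "n" receives the two symbols "G" and "I".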

**Encryption**

We have to generate the ciphertext alphabet using our keyword or keyphrase as above. We then replace each letter in the plaintext with one of the options in the ciphertext alphabet for that letter. We can either alternate between options, or choose randomly each time.

For example, say we want to encrypt the message "run away, the enemy are coming" using the keyphrase above. We start as if it were a normal Mixed Alphabet Cipher, getting "Q" for "r" and "0" for "u", but then we get to "n", where we could choose either "G" or "I". Continuing like this, and choosing randomly which symbol to use, we could get the ciphertext "Q0I 1486, YNH OGSB6 1QH RKB2GA".

Obviously, by making a different choice at each of the letters where we had a choice, we could get a different ciphertext.
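As a rough sketch of the random-choice variant, the Python below encrypts with a table of homophones. The grouping in `GROUPS` is an assumption: it is one allocation that agrees with the examples in this section, not necessarily the original table. The names `GROUPS`, `HOMOPHONES` and `encrypt` are illustrative.

```python
import random
from string import ascii_lowercase

# ASSUMED table: one space-separated group of ciphertext symbols per letter a..z.
GROUPS = "18 F R E SHTO M A ND 2 9 C U B GI JK L P Q VW XYZ 0 3 4 5 6 7".split()
HOMOPHONES = dict(zip(ascii_lowercase, GROUPS))

def encrypt(plaintext, rng=random):
    """Replace each letter with a randomly chosen homophone.

    Characters outside a..z (spaces, punctuation) are copied unchanged.
    """
    out = []
    for ch in plaintext.lower():
        options = HOMOPHONES.get(ch)
        out.append(rng.choice(options) if options else ch)
    return "".join(out)

# Two runs will usually differ, because each choice is random.
print(encrypt("run away, the enemy are coming"))
print(encrypt("run away, the enemy are coming"))
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which is convenient for testing but should not be mistaken for a secure source of randomness.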

**Decryption**

To decrypt, we generate the ciphertext alphabet, then look for each ciphertext symbol along the bottom row and replace it with the plaintext letter above it. If the space above is blank, we take the nearest plaintext letter to its left.

The message "4O 8QH E2WRJ3SQTE" decrypts to "we are discovered".
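The two-row lookup can be sketched as follows. The bottom row is the keyed ciphertext alphabet for the keyphrase used in this section; the number of blank spaces after each plaintext letter (`EXTRA`) is an assumption that agrees with the worked examples, and the names `BOTTOM`, `TOP`, `EXTRA` and `decrypt` are illustrative.

```python
from string import ascii_lowercase

# Bottom row: keyed ciphertext alphabet (keyphrase symbols, then the rest).
BOTTOM = "18FRESHTOMAND29CUBGIJKLPQVWXYZ034567"

# ASSUMED number of extra (blank) spaces following each plaintext letter.
EXTRA = {"a": 1, "e": 3, "h": 1, "n": 1, "o": 1, "s": 1, "t": 2}

# Top row: each plaintext letter followed by its blanks, same length as BOTTOM.
TOP = "".join(p + " " * EXTRA.get(p, 0) for p in ascii_lowercase)

def decrypt(ciphertext):
    """Find each symbol in the bottom row, then read the letter above it.

    If the space above is blank, step left to the nearest plaintext letter.
    Characters not in the bottom row (spaces, punctuation) are copied unchanged.
    """
    out = []
    for ch in ciphertext.upper():
        i = BOTTOM.find(ch)
        if i < 0:
            out.append(ch)
            continue
        while TOP[i] == " ":
            i -= 1
        out.append(TOP[i])
    return "".join(out)

print(decrypt("4O 8QH E2WRJ3SQTE"))  # we are discovered
```

Because every ciphertext symbol sits under exactly one plaintext letter, decryption is unambiguous even though encryption involves a choice.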

**Discussion**

Homophonic Substitution is a simple way to make monoalphabetic substitution more secure, by levelling out the frequencies with which the ciphertext symbols appear. There are many approaches to the homophonic substitution cipher, and it can be adapted in many ways.
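The levelling effect can be checked with a small experiment. The sketch below encrypts a made-up, deliberately "e"-heavy sample sentence by alternating between the options for each letter, then compares the share of the most common symbol before and after. The table in `GROUPS` is an assumed allocation that agrees with the examples in this section, and the helper names are illustrative.

```python
from collections import Counter
from itertools import cycle
from string import ascii_lowercase

# ASSUMED table: one space-separated group of ciphertext symbols per letter a..z.
GROUPS = "18 F R E SHTO M A ND 2 9 C U B GI JK L P Q VW XYZ 0 3 4 5 6 7".split()

def encrypt_alternating(plaintext):
    """Encrypt by cycling through each letter's options in turn."""
    wheels = {p: cycle(g) for p, g in zip(ascii_lowercase, GROUPS)}
    return "".join(next(wheels[ch]) if ch in wheels else ch
                   for ch in plaintext.lower())

def top_share(text):
    """Fraction of the letters and digits taken up by the most common one."""
    counts = Counter(ch for ch in text if ch.isalnum())
    return max(counts.values()) / sum(counts.values())

# Made-up sample text, chosen only because it contains many "e"s.
sample = "the enemy fleet meets the tide between these three eastern reefs"
cipher = encrypt_alternating(sample)

print(cipher)
print(round(top_share(sample), 3), round(top_share(cipher), 3))
```

In the plaintext a single letter dominates, while in the ciphertext its occurrences are spread across four symbols, so the highest peak is much lower. A real comparison would need a far longer text than this one sentence.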

Using the text we decrypted in Frequency Analysis, with the same keyword *manuscript*, we get the frequency distributions below. We notice much more clumping of the letters in the homophonic example, and also the extra symbols needed to represent the 26 letters.

One special type of homophonic substitution cipher is a *nomenclator*. This combines a codebook with a large homophonic substitution cipher. Originally used in France, it is named after the people who announced the arrival of dignitaries, and started with a small codebook consisting of the names of dignitaries. This, however, expanded rapidly to include many common words, phrases and places. When written, the code and cipher parts are not distinguished. Nomenclators were hugely successful ciphers, and many remained unbroken for hundreds of years. In fact, there are still some documents in archives that have not been broken. Such messages can provide interesting insights into historical accounts.

In particular, one encrypted message between Louis XIV and one of his generals contains a possible solution to the mystery of who The Man in the Iron Mask was. The letter read:

His Majesty knows better than any other person the consequences of this act, and he is also aware of how deeply our failure to take the place will prejudice our cause, a failure which must be repaired during the winter. His Majesty desires that you immediately arrest General Bulonde and cause him to be conducted to the fortress of Pignerole, where he will be locked in a cell under guard at night, and permitted to walk the battlement during the day with a 330 309.

The "330" and "309" at the end are the only appearances of these codewords in the whole text, and as such it is impossible to know for certain what they stood for. However, "masque" is thought to be a good guess.