Cracking Caesar crypto with dictionary attack

TrustyTony

Here is a Caesar-encrypted message in capital letters. We can simply try all possible shifts on the first few words (ignoring punctuation, which is left as is). If the first two words both decode to dictionary words, we assume we have cracked it. You could also use Vigenère encryption with a one-letter key, but keeping the non-letters is less simple there: our Vigenère drops all non-letters (including numbers).
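The remark about non-letters can be illustrated with a small sketch. The helper name `vigenere_keep` is hypothetical and not part of the snippet below; it assumes the key consists of letters A-Z only and consumes a key letter only when a message letter is processed, so punctuation and digits pass through unchanged. With a one-letter key it behaves like a Caesar shift.

```python
from itertools import cycle

def vigenere_keep(message, key, encode=True):
    """Vigenere on A-Z that copies non-letters through unchanged (hypothetical helper)."""
    keys = cycle(key.upper())
    sign = 1 if encode else -1
    out = []
    for ch in message.upper():
        if 'A' <= ch <= 'Z':
            # consume a key letter only when a message letter is processed
            k = ord(next(keys)) - ord('A')
            out.append(chr((ord(ch) - ord('A') + sign * k) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)

secret = vigenere_keep("WE THE PEOPLE, 1787", "F")   # key F is a Caesar shift of 5
print(secret)
print(vigenere_keep(secret, "F", encode=False))
```

The price of keeping the punctuation is the explicit loop: the key can no longer be paired with the message by a plain zip.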

You need an English word list saved as dict.txt in the same directory as this code.

# -*- coding: cp1252 -*-
import string
from itertools import cycle

def caesar(m, shift):
    """ shift capital letters by shift (abs(shift) <= 26), keep other characters """
    return ''.join(chr(x) if x < ord('A') or x > ord('Z')
                   else chr(x + shift if ord('A') <= x + shift <= ord('Z')
                            else x - ord('Z') + ord('A') + shift - 1
                            if shift > 0 else x + ord('Z') - ord('A') + shift + 1)
                   for x in map(ord, m.upper()))

def vigenere(message, key, encode=True):
    """ vigenere ignoring non-letters """
    return "".join(chr((((ord(k) if encode else -ord(k)) + ord(c)) % 26) + ord('A'))
                    for c,k in zip((m.upper() for m in message if m.isalpha()), cycle(key)))

def stripped(word):
    """ keep only letters from word """
    return ''.join(c for c in word if c.isalpha())

def crack(message, words):
    test, control, rest = message.split(None,2)
    test, control = stripped(test), stripped(control)

    # demonstrating that caesar is a one-letter vigenere (except that our vigenere drops the non-letters)
    for c in string.ascii_uppercase:
        if vigenere(test, c) in words and vigenere(control, c) in words:
            print('Shift to decode by Vigenère is %c' % c)
            print(vigenere(message, c))
        
    for i in range(28):
        #print(caesar(test, i), caesar(control, i))
        if caesar(test, i) in words and caesar(control, i) in words:
            print('Shift to decode by Caesar is %i (encode %i).' % (i, 26-i))
            return caesar(message, i)

message = "BJ YMJ UJTUQJ TK YMJ ZSNYJI XYFYJX, NS TWIJW YT KTWR F RTWJ UJWKJHY ZSNTS, JXYFGQNXM OZXYNHJ, NSXZWJ ITRJXYNH YWFSVZNQNYD, UWTANIJ KTW YMJ HTRRTS IJKJSXJ, UWTRTYJ YMJ LJSJWFQ BJQKFWJ, FSI XJHZWJ YMJ GQJXXNSLX TK QNGJWYD YT TZWXJQAJX FSI TZW UTXYJWNYD, IT TWIFNS FSI JXYFGQNXM YMNX HTSXYNYZYNTS KTW YMJ ZSNYJI XYFYJX TK FRJWNHF"
# load the word list once, upper-cased to match the message
with open('dict.txt') as word_file:
    english_words = set(word.strip().upper() for word in word_file)
print(crack(message, english_words))
TrustyTony

There should be a break after the words match in the Vigenère loop, at line 30. The function should actually contain only the Caesar check, but I wanted to demonstrate the connection between the two methods. The range in the Caesar loop should be range(1, 26): shifts of 26 or more only repeat earlier ones, and the caesar function given here does not reduce the shift modulo 26, so it is not prepared for that wrap-around.
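A Caesar function that is prepared for wrap-around could look like the sketch below. The names `caesar_mod`, `crack_mod` and `tiny_words` are hypothetical and not part of the snippet above; the tiny word set only stands in for dict.txt so the example runs on its own, and the message is assumed to contain at least two words.

```python
def caesar_mod(message, shift):
    """Shift A-Z by any integer, wrapping with modulo 26; other characters are kept."""
    return ''.join(
        chr((ord(ch) - ord('A') + shift) % 26 + ord('A')) if 'A' <= ch <= 'Z' else ch
        for ch in message.upper())

def crack_mod(message, words):
    """Return (shift, plaintext) for the first shift whose first two words are in words."""
    first, second = message.split(None, 2)[:2]
    first = ''.join(c for c in first if c.isalpha())
    second = ''.join(c for c in second if c.isalpha())
    for shift in range(1, 26):          # 25 non-trivial shifts, as suggested above
        if caesar_mod(first, shift) in words and caesar_mod(second, shift) in words:
            return shift, caesar_mod(message, shift)
    return None                         # no shift produced two dictionary words

tiny_words = {"WE", "THE", "PEOPLE"}    # stand-in for dict.txt, for illustration only
print(crack_mod("BJ YMJ UJTUQJ TK YMJ ZSNYJI XYFYJX", tiny_words))
```

Because Python's % operator returns a non-negative result for a positive modulus, the same function also handles negative shifts and shifts larger than 26.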
