Categories
Uncategorized

Puzzlehunt 2015: Dossier


Puzzlehunt

Since 2012, CUCaTS have been running a puzzlehunt annually during Cambridge’s May Week. Usually lasting 24 hours, teams of up to three compete to solve as many puzzles as possible in the allotted time. The puzzles tend to cover a wide variety of fields, including steganography, number theory, encryption, signal processing, geometry and even linguistics. Some of them are pen-and-paper problems, some require URL hacking and the like, while some require physically going to a place to look for clues. I’ve taken part twice and helped organise it three times. And either side of the admin interface, it’s always been an amazing thing to be involved in.

The puzzle

This post is a writeup for one of the puzzles I set in 2015. This one’s called dossier and you can find it on the puzzlehunt 2015 page.

Uncharacteristically, the puzzle description isn’t that cryptic: you’re told that you need to decipher the two in-depth ciphertexts.

In other words, there are two plaintexts, let’s call them p_0 and p_1. These are both encrypted with the same key, say k for example. This means we have two ciphertexts c_0=p_0 \oplus k and c_1=p_1 \oplus k

Solution

Bitwise XORing the two ciphertexts together gives us the two plaintexts XORed together:

c_0 \oplus c_1 = ( p_0 \oplus k ) \oplus ( p_1 \oplus k ) = p_0 \oplus p_1

So if we know part of one of the plaintexts, we can work out the corresponding part of the key and, helpfully, the corresponding part of the other plaintext.

The body of the puzzle gives a few “crossword-style” clues, along with corresponding locations in the texts at which these words occur. Take the first clue, “0x014 Maniac Hat Time (13)” – this is an anagram of “mathematician”, which occurs starting at the 14th byte of the plaintext.

At this point, enter a little program I wrote to help with solving this: xorsolver. This is just a convenience thing really, and partly just a little exercise in tkinter for me. It does nothing which can’t be done with a few lines of c and a text editor. It’s also a terrible piece of code, with copy and pasted code lines all over the place.

Essentially it allows you to make guesses at parts of the plaintext and immediately see the result the other plaintext.

To start with, it initialises with a guess of the first plaintext as a long run of null bytes. The other plaintext is correspondingly nonsensical.

Screenshot_2017-02-12_18-01-53

As soon as we enter our first piece of guessed plaintext (specifying a memory location and the text at the bottom), a piece of the other plaintext reveals itself:

Screenshot_2017-02-12_18-04-03

” am writing t”. In fact we could guess that the word “mathematician” is surrounded by spaces: guessing ” mathematician ” at 0x13 yields: “I am writing th”. Now we can make a guess at the context for this. Let’s try ” I am writing this” at position 0x12.

Screenshot_2017-02-12_18-18-25

” a mathematician fr”. Guessing a bit further ahead: ” a mathematician from ” gives ” I am writing this le”. Another bit of leapfrogging ahead, and we end up with “, I am writing this letter ” and ” a mathematician from Meche”. Guessing placenames beginning with “Meche” or a word to follow “letter” might be a bit tedious, so let’s leave this section for now and pick up another clue.

Screenshot_2017-02-12_18-25-02

The clue “0x071 Not just a guess (6)” corresponds to “theory”. Guessing this at position 0x71 gives ” week.” in the other text. A decent guess at what comes before this might be ” last”. So we guess ” last week.” at 0x6C. This gives “ring theory”. A bit more leapfrogging and the furthest we can easily get with this section is ” in the fields of string theory and knot theory. He ” and “er I bought from you last week. Unfortunately, the a”.

Picking up some more clues, and doing a bit more leapfrogging, we might end up here with our two texts:

����������������� a mathematician from Meche����������������������������������������#548 in the fields of string theory and knot theory. He %,sm$6/��������������������������������������������������������������������������������������������������������������eI v probably ))-m)j9'?8l.3v��������������������������������������������������������������������tthe first reticulated pantograph ��������������������, this day to be one of the most important inventions of the 19th century in ?? ? ?  ?  theory�����������

and:

��Is$RN>A�, I am writing this letter 
NOO/��MOB'TPE�
��N�V????er I bought from you last week. Unfortunately, the a       D�EdA
�?TELFMI�
FT[@[VONHSI
IALGI
A
TCUKDtSL
	GN>O I&? however, ??????????????�HRFRM
HXVQI
NA"OD=
OHEs
�O
OJ� �' is interfering greatly with my pALNI
YP	DR ial purification and I would be grateful if you could rectify this at once. >+|#$  ",oS*v~vxE#A\�

So it looks like the two texts are a biography of some mathematician/inventor, and a complaint letter. Maybe not the juicy enemy communications we were after but let’s stick with it anyway. Incidentally, there’s no way to know which piece of plaintext corresponds to which text. You could end up with them randomly jumbled up together and have to untangle them – I’ve cheated a bit here since I know both of the texts anyway – although I did still make one mistake along the way and have to correct it. The only way to deal with this is through the context in the surrounding text. If the texts were formed of a load of independent characters, de-interlacing them would prove futile. But generally the structure of words and sentences, and the general style of the piece helps with decoding.

Anyway, using this higher level context, we might guess that the complaint letter would begin with a greeting. With a bit of guesswork we find that “Dear Sir or Madam,” fits perfectly. We end up with a good chunk of plaintext of the mathematician’s biography. “Henri Martens was a mathematician from Meche”.

The rest of the solution proceeds in the same way, using the clues given and guessing pieces of the text. Eventually you end up with the two plaintexts:
“Henri Martens was a mathematician from Mechelen, Belgium. He pioneered numerous advances in the fields of string theory and knot theory. He was awarded the Donald Knuth prize for computing in 1842 for the invention of the impersonal computer. This amazing feat, however, was surpassed shortly after in 1843 when Lord Timpson of Skegness demonstrated the the first reticulated pantograph — widely regarded to this day to be one of the most important inventions of the 19th century in the field of applied maths.”

and:
“Dear Sir or Madam, I am writing this letter to complain about the four dimensional printer I bought from you last week. Unfortunately, the arm meant to move it in the temporal dimension is defective and is stuck travelling forward at the speed of light. You are probably aware that this renders my machine into an ordinary three dimensional printer. This is interfering greatly with my plans for peaceful racial purification and I would be grateful if you could rectify this at once. Sincerely, Abratma Gandhler”

The theme of the latter was intended to change to something a little more lighthearted, along the lines of “…my plans for the enslavement of all of lifekind on earth…” but the wrong files got uploaded in the end.

Writing the puzzle

If I recall correctly, this puzzle was written at around 5AM during the 2014 puzzlehunt. Once I had the idea, two of us separately sat down to write 512 bytes of nonsense. Afterwards, I checked to see how they aligned, in terms of difficult sections matching up. The “leapfrogging” technique can reach a frustrating stall if both ciphertexts reach a difficult-to-guess bit at the same time. It’s much harder to figure that “shortly after in” is followed by “1843 when Lord Timpson of Skegness”, but if you see “I’m writing this letter”, then “to complain about the” isn’t a particularly outlandish guess. So by adding/removing filler in one text, I was able to adjust the difficulty of the puzzle. I also decided to give plenty of clues to give multiple angles of a attack for difficult sections.

Here I’ve coloured the two texts roughly by predictable filler text (green) and stupid high-entropy stuff (red). The clues are also marked in cyan.
plaintexts

You can see that the three sections where red occurs on both lines simultaneously have clues relatively nearby.

In the end, nobody reached this puzzle in 2014 and it was rolled over to the 2015 puzzlehunt. It proved quite popular, with several teams eventually submitting correct answers. If this seems like your kind of thing, sign up to the CUCaTS mailing list to get more information closer to the time.