How to Break XOR Encryption with just 20 Million USD of Quantum Computer!

or

A Worked Example of Grover's Algorithm, unnecessarily attacking XOR encryption

This is an introduction to utilising a quantum computer to break encryption, but using XOR as our example. XOR is a very simple algorithm, but a relevant one still today (even if it is more from a theoretical than practical point of view, for the most part). As such, it forms avery very nice, easily accessible example for demonstrating how to break XOR encryption

NB - An interactive version!

For those who would prefer an interactive version where you can run the code in this post, please see this Google Colab Notebook! Although we aren't using Google's Cirq library, the IBM Qiskit code still works!

So who is this for?

This introduction to Grover's Search Algorithm is designed for those who know a little about encryption, aren't phased by a few bits of fairly straightforward mathematics (there really isn't much - we've kept it to a minimum!) and is curious to get an insight on how to use this algorithm, its limitations, and how it works in as practical a way as possible!

The ingredients you need to accomplish this attack from a cryptographic point of view are as follows:

  1. A known byte of plaintext
  2. The corresponding byte of ciphertext

The astute amongst readers will know that, because of how the XOR gate works, this is totally unnecessary!! Here's a brief breakdown of why... (for those who don't care, skip the next section...)

Brief Analysis of XOR Encryption

We write, for two bits or bytes or bit streams or whatever, $A$ XOR $B$ as $A \oplus B$. The truth table is:

$A$ $B$ $A$ $\oplus$ $B$
0 0 0
0 1 1
1 0 1
1 1 0

With this, we can derive a few rules about XOR encryption:

  • $A \oplus (B \oplus C) = (A \oplus B) \oplus C$ (we can move brackets around, known as associative law)
  • $A \oplus 0 = A$ (anything XOR zero is just itself - zero is identity, we say)
  • $A \oplus A = 0$ (anything XOR itself is zero)

With these tools, we can show the following. For a plaintext $P$ and a ciphertext $C$ and a key $K$, we can define XOR encryptions as $$ C = P \oplus K $$ and using this and the rules above:

$$\begin{eqnarray} P \oplus C &=& P \oplus (P \oplus K) \nonumber \\ &=& (P \oplus P) \oplus K \nonumber \\ &=& 0 \oplus K \nonumber \\ &=& K \end{eqnarray}$$

Let's look at some code...

# plaintext hex...
plaintext = 0xAB
# Key hex..
key = 0xCD
print("Plaintext is: {}".format(hex(plaintext)))
print("The key is: {}".format(hex(key)))
# and here's the ciphertext, which we get from P XOR K
print("Ciphertext hex is: {}".format(hex(plaintext ^ key)))
# set the ciphertext
ciphertext = 0x66
# ...and show that P XOR C = K
print("This is P XOR C, which should be the key: {}".format(hex(plaintext ^ ciphertext)))
Plaintext is: 0xab
The key is: 0xcd
Ciphertext hex is: 0x66
This is P XOR C, which should be the key: 0xcd

As is clear, when evaluated, this works trivially easily. This is why when we use XOR in cryptography, we use cipherstreams or other cryptographic techniques to generate long keys, ideally the length of the plaintext, so that there is (ideally) no way to compromise XOR in this way.

TL;DR - if you have a byte of plaintext and a byte of ciphertext, you can recover a byte of key with minimal work.

TL;DR - YOU DON'T NEED TO DO ANY OF THE STUFF IN THIS NOTEBOOK - THERE'S NO QUANTUM ADVANTAGE HERE AT ALL!!

This means that all we are doing here is learning to understand something with a minimal example. What are we here for? Well, it's time to have...

A Whirlwind Tour of Quantum Computing

So, there's no way that we can give you a comprehensive overview of quantum computing in one blog post/jupyter notebook, but what we can do is show you just the bits we think you need to know to get rolling!

For a better overview we recommend "Quantum Computing a Practical Approach" by Jack Hidary [1]

NB - if you're not aware of a lot of the below, then I'd recommend reading an introduction to quantum computing first, like Hidary's book or Qiskit's introduction, as we're going to move quickly!

What is 'quantum computing' anyway?

In a nutshell, quantum computing is a different way of computing that embraces several ideas from quantum mechanics. Principally, these are superposition and entanglement. We won't cover entanglement, but superposition is relatively easy to give a view on.

Quantum bits, or qubits are just like regular bits in one sense - they have a '0' and a '1' state. But they also have the ability to be set to be exactly between the two... But how? Well, picture this (it's called the Bloch Sphere):

Here, you can see the '0' state represented as $| 0 \rangle$ and the '1' as $| 1 \rangle$. Whenever you 'measure' a qubit, we collapse whatever state it is into one of these states. However, you'll notice that the fact the state can move around the surface of this sphere might mean something - and you'd be right! Though this isn't mathematically totally kosher, you can think of the 'distance from the state to the outputs $| 0 \rangle$ and $| 1 \rangle$' as the 'proability a measurement will come out as 0 or 1'.

So, the idea in quantum computing is to use the fact that we can mix these states - which we call superposition to be fancy, but really, it's no different that mixing two sound waves/notes on a piano, mathematically.

A Little Mathematics

Now, quantum computing and quantum information theory are still a way off from having universally accepted, nice and easily digestable heuristic for its programs (but you might argue that Bob Coecke's ZX-Calculus is pretty damn close, and with its recent completeness result it's probably one of the best contenders for the curious). So here are some mathematical details for those that are comfortable - but if you don't want this, then just skip to the next section! :)

So, essentially, the vector in the middle of the Bloch Sphere is given by a two-place column vector of complex numbers, written usually as $$ \begin{pmatrix} a \\ b\end{pmatrix} \text{ for } a, b \in \mathbb{C} \text{ with } \lvert a \rvert^2 + \lvert b \rvert^2 = 1 $$ We define the $|0 \rangle$ and $|1 \rangle$ states above as $$ |0 \rangle = \begin{pmatrix} 1 \\ 0\end{pmatrix}, |1 \rangle = \begin{pmatrix} 0 \\ 1\end{pmatrix} $$

These are the only real preliminaries we need to proceed learning about ...

Quantum Gates

We construct quantum algorithms as 'quantum circuits' (see examples below), which are sequences of 2x2 matrices that are applied to qubits initially set to the $|0\rangle$.

For example, the gate to flip $| 0 \rangle$ to $|1\rangle$ is called the $X$-gate or the $NOT$-gate, and is as you might expect $$ X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} $$

We talked about the mythical 'superposition' and we should show how that is done. Essentially, for a qubit to be placed into superposition we use a gate called the Hadamard gate, or $H$-gate, defined as $$ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

We can also invert the phase of our qubit - not a concept we are used to dealing with with regular bits! But remember the Bloch Sphere? Flipping the phase is equivalent to a rotation around the z-axis on the Bloch Sphere image above. We represent this by the following matrix: $$ Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$ If you play with the $|0\rangle$ and $|1\rangle$ vectors you will find that $|0\rangle$ is not affected by the $Z$ gate, but $|1\rangle$ becomes $-|1\rangle$ - it will still measure to '1', but the phase has been shifted about the z-axis.

We also have the concept of a 'controlled' gate - so if you want to have a 'controlled $NOT$' (shortened to '$CNOT$') then the idea is that if the control qubit is $|0\rangle$ then the other qubit is unaffected, but if the control qubit is $|1\rangle$ then the other qubit has the $NOT$-gate applied.

For the mathematically curious, here is the 2 qubit (so 4x4) matrix for this gate: $$ CNOT = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} $$

Think of each column representing, from left to right, the inputs $|00\rangle, |01\rangle, |10\rangle$, and $|11\rangle$. Then consider the rows as outputs of the same from top to bottom. Something that will be important later is the fact that if you look at the outputs of the second qubit (the one that isn't the control qubit), then for input $|A,B\rangle$ the output is equivalent to $|A, A \oplus B \rangle$. Familiar from the XOR stuff above? You betcha'...

Lastly, the measurement operator... This will take the state of our circuit (as represented by the input vector multiplied by the product of all the interim matrices that make up our circuit) and 'measure' the qubit. Formally, this means that across all the probabilities of all the outcomes, one comes out at random according to that distribution.

For now, we will omit other gates, as we don't need them for our purposes here, but if you're curious you can find them on the Quantum Logic Gates page on Wikipedia and in many other places. If you want to play with quantum circuits, you can use IBM Quantum's drag'n'drop circuit builder, or for small circuits and in-browser simulation, try Quantum Javascript.

If the maths was something you're not familiar with but would like to be, have a look at Qiskit's Linear Algebra Course as this is a course designed to serve an introduction to Quantum Computing. If you prefer dead log editions, then Hidary's book [1] has an excellent section on the maths for quantum computing!

!python3 -m pip install -q qiskit numpy matplotlib
# we're gonna need to import some stuff now... 
import matplotlib.pyplot as plt
import numpy as np
from math import floor, ceil

# importing Qiskit
from qiskit import IBMQ, Aer, assemble, transpile
from qiskit import QuantumCircuit, ClassicalRegister, QuantumRegister
from qiskit.providers.ibmq import least_busy
from qiskit.circuit.library import MCMTVChain

# import basic plot tools
from qiskit.visualization import plot_histogram

#provider = IBMQ.load_account()
#provider = IBMQ.get_provider("ibm-q")

Building a Quantum circuit

To have a look at what these 'gates' we talked about look like, here are a few: an $H$ gate, an $X$-gate, a $CNOT$ with the control on the upper qubit, a $Z$-gate, and a measurement. We'll use this as an excuse to show how to add gates sequentially to a quantum circuit, as well as how to draw pretty circuit diagrams that are very report friendly.

test_qc = QuantumCircuit(2,2)
test_qc.h(0)
test_qc.x(0)
test_qc.cx(0,1)
test_qc.z(0)
test_qc.measure(0,0)
test_qc.draw(output='mpl') # the 'mpl' bit means it'll use matplotlib 
                           # and make it 'very pretty'...

Grover's Algorithm - A Guided Tour

So, we're going to base this walking tour on a paper by Song, Jang, Kim, Lee, and Seo [2] where they implemented Grover's Algorithm for Caesar and Vigenere Ciphers. Our example is a simplification that is not particularly academically interesting (see the above discussion on XOR) but is a nice 'toy example' for how to implement Grover's Algorithm to do something useful that we can very easily verify.

Such examples are useful educationally, so we hope this fits the bill. Later we will give a brief discussion about where Grover's algorithm fails, where it succeeds,

Grover's Algorithm - the working parts

Grover's Algorithm is quite simple - it has two principal parts:

  1. The Oracle - this provides the superposed search items (in our case, a cryptographic key), provided in such a way that our item of interest (the key) has its phase inverted.
  2. The Diffuser - this performs an operation that takes our inverted-phase key and amplifies its probability so that it can be easily extracted through the output of a quantum computer.

And that's really it. I mean, ofc, there's lots of details, but that's the shape of it!

Now, we're gonna go through an describe each part in detail, and build up our final circuit. 🙌

For reference, here are the variables:

plaintext_bits = [1,1,0,1]
ciphertext_bits = [0,1,0,0]
key_bits = [1,0,0,1]

So, let's get to work...

The Oracle

When most people think of 'the Oracle' they think of a character from a popular sci-fi movie who chainsmoked whilst handing out cookies, broken spoons, and vague-to-useless advice to people who thought they were in a simulation...

Rest assured, we don't need to have a cutlery supplier on speed-dial for what we're going to do! The 'oracle' in computing terms actually dates back in usage to Alan Turing (who showed that with an Oracle you could effecively solve the halting problem, giving rise to the hierarchy of Turing Degrees, but I digress on computability theory...). Tl;Dr - the Oracle sets things up for us to utilise what we have to find the term that we want.

As such, we're going to setup the oracle to process our plaintext and ciphertext and set the superposition of our key to have something we can locate at the position we want to know. To do this, we need to first decide how we're going to structure our plaintext and ciphertext.

Firstly, the key and plaintext - we're going to allocate the first four qubits to our key and put these into superposition using Hadamard gates. Then we'll use the next four qubits to code the four bits of plaintext we have - for each one, we'll use an $X$-gate to 'flip' the qubit from $|0\rangle$ to $|1\rangle$.

Here's the code to do that:

# let's put our P and C as arrays of bits
plaintext_bits = [1,1,0,1]
ciphertext_bits = [0,1,0,0]

# setup the circuit - 8 qubits, and 4 'classical bits' for our output later.
grover = QuantumCircuit(8,4)
for i in range(4):
    # put the first four qubits into superposition
    grover.h(i)
for i in range(4):
    if plaintext_bits[i] == 1:
        # set the next four qubits according to the array
        grover.x(i+4)
# we'll use barriers to make things clean lookin'... 
grover.barrier()
grover.draw(output='mpl')

Next, we want to 'encrypt' the plaintext with the superposition'd key qubits - and as we saw earlier, the CNOT is an XOR in disguise 😉 - so we set it up that the control bits are the plaintext bits, and the superposition'd keys are input with the XOR algorithm (coded as quantum gates):

for i in range(4):
    # CNOT is a 'controlled NOT' or 'controlled X', so in qiskit you use .cx(a,b) to do CNOT(a,b)
    grover.cx(i+4,i)
grover.barrier()
grover.draw(output='mpl')

Next comes the 'flip' - remember, we want to line up the oracle that it outputs the solution (the key) that we don't yet know (at least, supposedly). So here we take the plaintext and pass this to a $CCCZ$-gate. That is, a gate that takes 4 qubits input, and if the first 3 are $|1\rangle$, then it flips the phase of the last qubit. However, because we are in superposition, this inverts the phase of the entire setup thus far - so if we code the bitwise inverse of the ciphertext, then when we proceed to decrypt the state (by setting the using XOR again) the state that will be flipped is exactly the state we want, i.e. the state in the superposition of key bits that matched the ciphertext and plaintext in encryption and decryption - otherwise known as the key!

So let's do this slowly - firstly, let's code the ciphertext bits, but inverted...

for i in range(4):
    if ciphertext_bits[i] == 0:
        grover.x(i)
grover.draw(output='mpl')

With that done, we apply the phase flip for the state that would have matched up with the ciphertext and plaintext after/before encryption with XOR... To do this, we use a multi-controlled $X$-gate with two $H$-gates either side of the final qubit - this is equivalent to a multi-controlled $Z$-gate, and if you want to see how, fill out the matrices :P

grover.h(3)
# The multi-controlled $X$ gate is also called a 'Toffoli' gate, after the 
# prof. of electrical engineering Tommaso Toffoli, who invented it!
# To use this, we give a list of the control bits, and the bit that has the X gate on it as second parameter.
grover.mct([0,1,2],3)
# and another Hadamard to finish the CCCZ
grover.h(3)
grover.draw(output='mpl')

See the paper [2] for more details, but now we have to 'reset the state' to restore the superposition of keys, but with the one we are interested in inverted in phase - note that, despite the $CCCZ$-gate only being applied to the bottom qubit, because the qubits are in superposition, the whole state gets inverted.

The reason that we use the bitwise inverse of the ciphertext is that if our key is [1,0,1,0] and we set gates at [0,1,0,1] then the state that meets the $CCCZ$-gate will be $|1111\rangle$ and so the 'flip' will be activated. This sets us up to reset through the decryption process - this technique is taken directly from [2].

Here's what the inverted decryption process looks like:

# apply the cihpertext bits
for i in range(4):
    if ciphertext_bits[i] == 0:
        grover.x(i)
grover.barrier()
# and now apply the decryption procedure... there's no reason to do this backwards except that it looks pretty. :-P
for i in reversed(range(4)):
    grover.cx(i+4, i)
grover.barrier()
grover.draw(output='mpl')

But I still don't get why you have to decrypt, Mark?

Well, it comes down to two reasons. The algorithm is designed to be iterated, so you have to 'restore' the state. Secondly, you have to restore the balance between all of the core states. This diagram from [3] should help explain what the final state looks like:

The most intuitive (but not necessarily totally technically accurate) way to think about what is going on here is that we have encrypted using all possible key states, then we have used the phase flip to 'mark' the particular matching state that we know was the output (the ciphertext). We then have to 'repack' all of our states to have the same amplitude, hence we need to do the decryption routine in our oracle to attain the correct state of our computation, detailed in the diagram.

The $|W\rangle$ term in the diagram is equivalent to our key. So, once we've restored the superposition of all the states with a single phase-flip on the particular state of interest. With this done we can move onto the second part of the algorithm. That's right... with the Oracle done it's time to make the Diffuser...

The Diffuser - Phase Flippin' Magic

So, the diffuser is a circuit that makes Grover's idea work. If you look at the above diagram, you'll see the dotted line? That's the 'Average' probability of each term in our sequence of all possible terms. But what would happen if we could 'flip' all the bars along that line? Well, this (again, from [3]):

Notice that the flip is about the dotted line - and the result is that the other possibilities are reduced in probability, whilst the possibility of interest is significantly boosted - and that's just what we want! If we did our phase-flip of the correct key bits properly in the Oracle (don't worry, we did ;-P ) then the diffuser will perform this flip and let us find the key! 🤯🤯🤯🥳🥳🥳

(If you would like a more in-depth explaination of how this diffuser works, have a look at [3] and [1])

So, let's implement the diffuser in code...

for qubit in range(4):
    # superposition on all qubits
    grover.h(qubit)
for qubit in range(4):
    # flip all the qubits
    grover.x(qubit)
# do another CCCZ
grover.h(3)
grover.mcx(list(range(3)), 3)
grover.h(3)
for qubit in range(4):
    # flip the qubits again
    grover.x(qubit)
for qubit in range(4):
    grover.h(qubit)
grover.barrier()
grover.draw(output='mpl')

Measure for Measure

Now we've setup the circuit, done our calcultion, and (all being well) lifted out the key based on the coding of our input - let's measure the key! If we're right, then we should get a massive spike when we simulate this. So here's the full circuit, from key superpostion, oracle, diffuser, and finally measurement!

grover.measure(grover.qubits[:4], range(4)) # measure the first four qubits into the first four output classical bits
grover.draw(output='mpl')

The Proof is in the Pudding-it-into-action

So, we've made it this far - we've skipped a load of details, but hopefully you've got a good handle on roughly what's going on. So let's run it! First we call the AER simulator from qiskit, compile the circuit for it, and then pass it the circuit and get back the results! Here we go!

aer_sim = Aer.get_backend('aer_simulator')
transpiled_grover = transpile(grover, aer_sim)
qobj = assemble(transpiled_grover)
results = aer_sim.run(qobj).result()
counts = results.get_counts()
plot_histogram(counts)

H4CK4D!! Pwning with Quantum Computers

So, hopefully you've realised that the biggest spike (usually with around 45% of the output) is for the four output bits of 1001 which, if you recall, is precisely our key!! We did it! 😱😱😱

Ok, real talk - it's totally useless, as we pointed out at the start. But the illusration of how to do things in 'quantum' is hopefully useful to those who have to understand precisely where, how, and why Grover's search works.

But... 'wen quantum advantage'?

So, the magic in Grover's Algorithm is that for each iteration of 'Oracle -> Diffuser' (and you can just apply them sequentially to get the desired output to 'poke out' more readily on larger search spaces. Each Oracle-Diffuser instance is an 'iteration' in the search, and to get the statistical significance you would want to go "Ah, here's the answer!! :D " you should apply several iterations - but the number of iterations is at most $\sqrt(N)$ for searching over $N$-many items.

Compare this with a regular 'key search/bruteforce' - where you would have to make at most $N$-many attempts to guarantee finding the key. So this is where the 'quantum advantage' really lies - the run time of the circuit is bounded linearly by the length of the input.

In Closing...

But what does this mean? Well, hopefully this means two things:

  1. You've got a better understanding of how quantum computers really work.
  2. A better understanding of how Grover's algorithm works.
  3. A better understanding of where Grover's algorithm might not work...

...that last point. There's a lot of hype, and one thing that gets mentioned a lot is that "Grover's Algorithm can search a phone book in a fraction of the time" <-- well, that's a load of BS.

Why? Because in a phone book there's no algorithmic link between the inputs and the desired output. Sure, you could probably make it happen using coding magic (maybe a sprinkling of the Chinese Remainder Theorem, like Goedel used to do a lot), but that is likely computationally more expensive than actually just doing a regular search :-P

Happy Quantum Hacking! (And keep an eye out for the Quantum Village!)

References

AKA - all the stuff we couldn't make into links :P

[1] Jack Hidary, "Quantum Computing: An Applied Approach", 2019, Springer

[2] Song, Lang, Kim, Lee, and Seo - "Grover on Caesar and Vigenere Ciphers", 2021, IACR, link: https://eprint.iacr.org/2021/554.pdf

[3] - Grover's Algorithm, Qiskit documentation, link: https://qiskit.org/textbook/ch-algorithms/grover.html

Comments

Popular posts from this blog

Decrypting config.bin files for TP-Link WR841N, WA855RE, and probably more…

Aperiodic Tilings