Introducting Computer Cryptography, Part1: RSA Encryption Primer.


Here's a problem: you want to create a chat program which will allow you to talk to your friend in a secure way. Think about a few possible ways this could be done (and, historically, there have been many, many ways that this has been accomplished!).

In general, it seems reasonable that we should want (at least) two properties for an encryption scheme; this was best summarized by Auguste Kerckhoffs in his six principles (the first two of which are considered the most important):

  1. The system should be unbreakable in practice.
  2. The design of the system should not require secrecy. If the system should be compromised by an enemy, it should not inconvenience the correspondents.
  3. The key should be memorable without notes and should be easily changeable.
  4. The cryptograms should be transmittable by telegraph.
  5. The system should be portable and able to be operated by a single person.
  6. The system should be easy to use.

For some of these, I've paraphrased. The first and second (sometimes called Kerckhoffs' Principle) are the most important; the others are important with minor changes: memorization is no longer necessary with the use of computers (most of the time), and telegraph should perhaps be replaced with "computer network" or somesuch. The last two are interesting as well, noting that the system should not be intensely complicated and require the labor of multiple individuals — generally, computers have made this possible for most cases, and programming has allowed for complicated schemes to be constructed in such a way that the user need only input a few details to be able to use it.

The first principle here is important because of the vague nature of in practice; for us, this will mean that, even with an immense amount of computing power, it will still take a significant amount of time (say, longer than one life-span) to crack the code. This is important to note since, in general, there are few systems which are absolutely, provably secure which also follow the principles above — but, many which are "breakable" are only breakable if one is to constantly work at it for decades. This is good enough for us in practice.

The second point is perhaps the most mystifying one to those starting out in cryptography: how could the enemy know the keys and, yet, not be able to translate the message? This feels strange to us since what does the enemy not know that our recipient knows?

The main point of this post will be to show that such a system exists by detailing the RSA Algorithm (named after the founders: Ron Rivest, Adi Shamir, and Leonard Adleman). First, we'll detail what the algorithm is. Next, we'll do a (very) simple example showing how the algorithm should work.

For prereqs, you should know the basics of modular arithmetic and the basics of the $\phi$-function, though this last part is not required since we will explicitly compute the values below (so don't worry if you don't know much about it).

The RSA Algorithm.

There's five steps to the RSA algorithm, but most of them are fairly easy. The first one is the most important, and we'll talk about this a bit later. Let's call our message reciever Player 1 and the message sender Player 2.

  1. Player 1 chooses two primes, $p$, $q$.
    • Usually, these primes will be 'big primes', but any two primes will work; smaller ones will just be easier for an outsider to crack the system.
  2. Player 1 finds $n = pq$.
    • This will be used as a modulus; we'll see what this means in the following steps.
  3. Player 1 computes $\phi(n) = (p-1)(q-1)$.
    • This value will always be $(p-1)(q-1)$, so ignore the $\phi$ part if you don't like it.
  4. Player 1 chooses some integer $e$ where $1 \lt e \lt \phi(n)$ and $e$ is relatively prime to $\phi(n)$ (it has no common prime factors).
    • This $e$ will be the base for the exponent. There are some ways to maximize security by choosing a small (but not too small) $e$, but we won't worry about this.
  5. Player 1 finds $d$ such that $d\cdot e \equiv 1 \mod(\phi(n))$.
    • This is, perhaps, the hardest step to do by hand. Luckily, computers make this step nearly trivial. We'll explore one method to find this value later.

At this point Player 1 has the values $p, q, n, \phi(n), e$, and $d$. This is quite a lot. We're going to define two things now, using these values:

The Public Key is $[n,e]$. That is, the public key is a list of two values: the value for $n$ and the value for $e$. The point of the public key is that it will be made public and available to everyone (not just Player 1 and Player 2).

The Private Key is $[n,d]$. This list is what Player 1 must keep to themselves, and must not allow anyone (even Player 2) to know. Moreover, $p,q,\phi(n)$ must be kept secret, or else someone could calculate $d$.

To sum up everything so far, here's what the players and everyone else knows:

\[\begin{array}{|l|c|}\hline \mbox{Player 1} & n,d,e\\ \mbox{Player 2} & n,e\\ \mbox{Everyone} & n,e\\\hline \end{array}\]

Encrypting the Message (if the message is an integer...).

At this point, it's easy to finish the algorithm; but, we require the message to be sent to be an integer. This turns out not to be a big deal, since we can, in general, convert words into integers fairly easily. We'll cover the word-to-integer algorithms a bit later, but for now let's say Player 2 simply wanted to transmit the message $M$ which was some integer (maybe it's something like my social security number).

First, the public key is revealed; everyone knows $n$ and $e$ now.

The method of encryption (by Player 2) is the following calculation: \[M^{e} \mod n\] and let's call the number that we get $M_{E}$ (for "encrypted message").

The method of decryption (by Player 1) is the following calculation: \[(M_{E})^{d} \mod n\] this value will, in fact, be $M$, as we will show.

A (Very) Easy Example.

First, let's do a simple example so that you'll at least believe that such a thing could work.

Let's let \[p = 11, q = 17\] so that \[n = pq = 187.\] Hence, \[\phi(n) = (11-1)(17-1) = (10)(16) = 160.\] We now need to pick some $e$ with $1\lt e\lt 160$ which is relatively prime to $\phi(n)$; let's pick $e = 7$. From this, we need to solve for $d$ with \[de = 7d \equiv 1\mod 160\] which, in this case, we can do by hand: it turns out that $d = 23$ gives $ed = 161$ which is equivalent to $1\mod 160$. Neat.

So far, we have the following: \[\begin{array}{|c|c|}\hline p & 11\\ q & 17\\ n & 187\\ \phi(n) & 160\\ e & 7\\ d & 23\\ \mbox{Public Key} & [n = 187, e = 7]\\ \mbox{Private Key} & [n = 187, d = 23]\\\hline \end{array}\]

Now, let's say Player 2 has the message $M = 19$; we want to choose the message such that the integer value is less than $n$ (we'll show why after this example). Player 2 first encrypts the message: \[M_{E} = 19^{7}\mod 160 = 59.\] Player 2 now tells Player 1 (publicly) that $M_{E} = 59$. At this point we stop to ask: could anyone decode this? There seems no "reasonable" way (if this was encoding real words instead of an integer, one could test a lot of primes and see if anything reasonable is spit out). On the other hand, Player 1 has the value $d$ (kept a secret) and decrypts the value by: \[(M_{E})^{d} = 59^{23}\mod 160 = 19\] which was the message. Neato. Note that this last step, for me, took about a minute using Wolfram Alpha, and a little less than 30 seconds using Wolfram Mathematica — but these were relatively small primes. One could see how using larger primes would make this scheme much more difficult to crack.

How does this work?

This section will detail a little bit as to why this algorithm works. If you're not comfortable with the idea of relatively prime things, you may want to skip this section.

It should be somewhat clear that if someone did not know $d$, it would be difficult to decrypt the message. Note that it is not impossible, but computationally quite expensive.

Let's suppose that we have a message $M$, and $p,q,e,d, \phi(n)$ as above. When we encrypt $M$, this gives us \[M_{E} = M^{e}\mod(pq)\] We have that $0\leq M_{E} \lt n$ in this case. Recall that $e$ is relatively prime to $\phi(n)$; in other words, there exists $a,b$ integers such that \[ae + b(\phi(n)) = 1\] or, in other words, \[ae = 1 - b(p-1)(q-1)\] Note here that \[ae \equiv 1 \mod((p-1)(q-1)) = 1\mod(\phi(n))\] and, hence, we must have that $a = d$, as we've defined $d$ in exactly this way (is $d$ unique? is $d$ unique modulo $\phi(n)$?). Hence, we note that \[\begin{align*} (M_{E})^{d}\mod(n) &= (M^{e})^{d}\mod(n)\\ &= M^{ed}\mod(n)\\ &= M^{1 - b(\phi(n))}\mod(n)\\ &= M\cdot M^{-b(\phi(n))}\mod(n)\\ &= M\cdot(M^{-b(\phi(n))}\mod(n))\mod(n)\\ &= M(1^{-b}\mod(n))\mod(n)\\ &= M\mod(n) \end{align*}\] which is equal to $M$, so long as $M \lt n$ (which we'll talk about in a second. Note that we've used some properties of modular multiplication on the third line here, and we've made use of Euler's Theorem in the next-to-last line. Hence, decrypting in this way will give us back $M$.

Note that we've needed to do two things for this to work: first, we required $M$ to be an integer; second, we required $M \lt n$. We'll talk about the first assumption in the next part. For the second assumption, it generally is not a huge problem to make $M\lt n$ since the primes $p,q$ that we work with are quite large. So long as we do not encode our message in a silly way that purposely makes it a large number, things will be fine. This will segue us into the next topic...

Messages with Letters.

There are a number of ways to take a message and create an integer which represents this string. We'll talk about a popular (though, perhaps not totally efficient) scheme here; note, though, that there are other ways to do this!

The idea behind the method we'll talk about is as follows. A message is input, the message is broken up into same-size pieces (called blocks), some things are done which alters the text and turns the block into an integer, and then it returns that integer. At this point, the message is ready to be sent through RSA.

For example, we can make up a silly example like this:

  1. Break text into blocks containing 4 letters.
  2. Starting with $a = 10, b = 11, \dots$, change the letters to numbers and simply concatinate the numbers together. We can let spaces be equal to the number $99$, and if we need to pad anything at the end to complete a block (so that we don't have blocks that aren't less than four letters) we will let this be equal to $88$ (see below).
  3. Send the blocks through RSA in order.

This method would work (ignoring that we have not encoded numbers, non-alphabetic symbols, etc.). For example, the message \[\mbox{hi mom im home}\] would be broken up into: \[\begin{array}{l} \mbox{hi m}\\ \mbox{om i}\\ \mbox{m ho}\\ \mbox{me} \end{array}\] We can then translate the first three lines into numbers, but the last line doesn't have four letters so we need to pad it; that is, we will put "fake" characters on the end so that the blocks will be of the same length; above, we noted that this character would be equal to 88. The whole translation would look like this: \[\begin{array}{c} \mbox{17189922}\\ \mbox{24229918}\\ \mbox{22991724}\\ \mbox{22148888} \end{array}\] Since each letter corresponds to two digits, this message would be easy to translate back into text. Note that, here, the numbers are around eight digits long so as long as one used $p,q$ such that $pq \gt 100,000,000$ the message would be preserved.

Note again that this is just one possible way to create an integer from text. We'll investigate other methods in future posts which will be more efficient but, consequently, more complicated.

As practice, you may want to create your own method, then attempt to encrypt and decrypt your message. Note that in many programming languages, modular exponentation is built in and is extremely useful in constructing an RSA algorithm that doesn't take years to run.