The Math of Cryptography: RSA
Achyut BharadwajSeptember 2021
A presentation on the mathematics of cryptography, with an emphasis on RSA. A link to the slides is here.
Basic Cryptography #
Sending Messages #
Say, there are 3 people, Aditi, Bhaskar and Diti.
Aditi wants to send the message “HELLO” to Bhaskar, using her computer.
Computers, however, can only store numbers. How will she send the
message “HELLO” to Bhaskar?
To store characters, there is a code called the ASCII code. In ASCII,
each character is given a value in binary.
“HELLO” is coded as: 1001000
1000101
1001100
1001100
1001111
So, Aditi has to send the above code to Bhaskar.
The problem #
Aditi can send the message to Bhaskar easily. But, she
realises that Diti has been tapping into their conversation and can see
any messages that Aditi sends to Bhaskar or Bhaskar sends to Aditi.
This is not really a big problem in this case, but it is a big problem
when it comes to sharing military data and other secrets that cannot be
allowed to read by anyone else.
How do we solve this? We will see it in the next slide.
Encryption to Solve the Problem #
To solve the problem, Aditi and Bhaskar
can decide on a code. For example, every letter has to be written as the
next letter of the alphabet. So, “Hello” will be written as: “IFMMP”.
But, to say “HELLO” to Bhaskar, Aditi will actually have to type
“IFMMP”. That is inconvenient. The computer should be able to encrypt
the message to make it more convenient.
So, Aditi and Bhaskar decide on a “key”, say “DAVID” to make the
computer encrypt the message.
DAVID in ASCII will be: 1000100
1000001
1010110
1001001
1000100
The computer takes the ASCII code of the
message, “HELLO” (10010001000101100110010011001001111
) and performs the
“XOR” function on it with “DAVID” (It will put the key and the message
under one another and compare their individual digits. If a digit of
both the message and key is 1 or 0, it will be coded as 0. If a digit of
either David or the message is 0 and the other is one, it will be coded
as 1).
Bhaskar will receive the message 00011000000100001101000001010001011
and will again perform the xor function with the same key, “DAVID” on
the message he received, to recover the original message “HELLO”. This
way, Diti, who doesn’t know the key will not be able to decrypt the
message.
Description | Value |
---|---|
Message | HELLO |
Message in ASCII | 10010001000101100110010011001001111 |
Key (DAVID) | 10001001000001101011010010011000100 |
Encrypted Message | 00011000000100001101000001010001011 |
Another Problem #
The encryption method we saw was good. But, how will
they decide the key?
Either Aditi, or Bhaskar will have to send the key to the other person.
Diti will easily be able to see what the key is. Using the key, she can
decrypt the message. She can, again see what Aditi has sent to Bhaskar.
Is there a way to encrypt the message without exchanging a secret key?
This question will be answered in the next few slides.
One Way Functions #
Introduction to One Way Functions #
One way functions are functions that
are easy to compute but near impossible to reverse engineer.
In other words, if you give an input $x$, you can easily get the output,
$y$. But, given $y$, it’s almost impossible to get back $x$.
Product of two large primes is an example of a one way function. You can
multiply two large primes to get a large composite number. But, you
cannot, however, factorize a number that is a product of two large
primes with even the best of computers.
Example #
957135426615121804132268787531553843999746069720600225934835423219041
240458026369875973016253707964684602267800596205055638062793447182617
158905205152654580688125527658627703098418849708978400492980182745478
129984316595629417697663031045210596692776008178130663712333425775453
55387379048302677039785957132761
$$\times$$
120298622332284217920263119439218647995514427637318665999352631030708
675793376241885023787816151411152784569205251004620123401534895787679
835140103851233118061966130691593837228054872472834285075912353179064
626729066494090496091253648727437708994716038533787234568217532351802
713575782530013217095459609838209
$$=$$
11514207320722227407584898334199433951260663029356386611138421864505
62639447812519054944650062859844918134802110404265909224997432186496
79918779852095032169972009537317441162019098123608799297935923890024
04614768098348910941336785516287176956043488373513461909870898764923
26191097022823797710614428056766421773603616140713428277891859038876
58656035556025019009074058067579480457807682510162468160522583630162
76657940720520428359446258214400295746688681923924657477166968353910
31158431462110173687034653546658352210581304240609075498811182235458
29309414340345106712283658708268799065794030737232563073492743465049
This is easy to compute (using a calculator).
However, given
98419057684350161066776239027660152546794766334972656089987368
73193792902255516812870091756964768906248291997956410813089663
43347922796098783981289962167812549386924310001546712984926552
57281812139063849462224048130363233588210938124839879267816492
74330469942311669162079564565437945448039317147963818495958699
74758510805423513063659072931521415701222944340906205286544113
81075572593242915449589268144775989927299224695934233881736052
93015613419189065693495693617441291014788877156062526753186609
51815872431902888984449015918546153376338724920726118505676810
7615921198531051537813607718202134208938771569328329222193
you will spend the rest of your life, travelling at the speed of light, with a (not so super) computer on earth trying to factorize the number!
Some Math #
Modular Arithmetic #
If you have an equation:
$x\equiv y \pmod n$ is just another way of writing
Also note that while $x\mod n$ is a function, $x\equiv y \pmod n$ is an equivalence relation.
Problem: #
Find the inverse of $3$ mod $7$.
Sol.
Say, $x$ is the inverse of $3$ under mod $7$. This means that $3x \equiv 1 \pmod 7$. $x=5$ satisfies this.
Kid-RSA #
RSA is a cryptography system most widely used across the web for secure data transmission, and kid-RSA is a similar precursor which is far easier to break. Let’s say Aditi wants to send Bhaskar a message $m$, and she wants to encrypt it with kid-RSA. Then,
She multiplies it with an encryption code, $e$, which is known by everyone.
Then, she divides it with $n$, also made public by Bhaskar, and finds the remainder, and calls it, say $r$.
When Bhaskar receives $r$, he multiplies it with the decryption code, $d$, which only he knows, to decrypt the message.
Note that $m$ is smaller than $n$ and is relatively prime with $n$. $e$ is also relatively prime with $n$.
So, Aditi finds the value of $r<n$ such that
But, what exactly are $e$, $n$ and $d$?
Choosing $d$ #
Bhaskar chooses $d$ and $e$ in such a way that
Substituting $de$ from Eq-6 into Eq-9, we get,
You might realise that to crack this code, you need to find $d$.
Say, Diti also knows that $de \equiv 1 \pmod n$. Since e is made public by Bhaskar, she knows $e$ too. She now needs to find $d$.
So, how can Diti find $d$?
One way to do it is to use guesswork, but that will need high computational power, as $n$ is very large. Turns out that it gets much easier to find d with some math tricks.
More Math #
Euclid’s Division Lemma #
Lemma #
For every pair of integers $a,b$ $\exists$ $q,r\in\mathbb{Z}$ such that
Motivation #
Take an example of $a=57, b=13$. We know
This motivates us to examine the set of remainders $r$ that are non-negative, say $S$.
Is it always true that the smallest element of $S$ is less than $13$? Does $S$ always have a smallest element? Why? Suppose the smallest element of $S$ is greater than $13$. So what?
Proof #
Consider the set
Now, since
Now, since $S$ is bounded below ($-1$ is less than all elements of $S$) and $S$ is non-empty, $S$ has a smallest element (can be proven using the Well Ordering Principle), say $s$.
So, we can write,
We now prove that $s$ must be less than $b$. We assume for the sake of contradiction that
So, we have: $0\geq s’<s$ and $s$’ can be written as $a-bq$ for some integer $q$ (here $q = q_0+1$). So, $s’\in S$. We assumed that $s$ was the smallest element of $S$. But, we found another element $s’$, which is less than $s$. This gives us a contradiction.
Hence, $0\geq s<b$. So, there exists some $r$, which is the smallest element of $S$ such that $a-bq = r$ and $0\geq r<b$. Hence, for every $a$ and $b$, there exists integers $q$ and $r$ such that
Euclid’s Division Algorithm #
#
From Euclid’s Lemma, we can write
Proof #
Let $d$ be a common divisor of $a$ and $b$. Hence, $d$ divides $a - bq = r$. Hence, $d$ divides $r$. In other words, $d$ is a divisor of $r$ too. Now, if $d$ is a common divisor of $b$ and $r$, then $d$ divides $bq+r = a$. Hence, $d$ is a divisor of $a$ too.
In conclusion, every number that divides $a$ and $b$, divides $r$, and every number that divides $b$ and $r$, divides $a$. So, if we let $d$ be the gcd of $a$ and $b$, then $d$ divides $r$. Which implies that $\gcd(b,r) = d’(\text{ say })\geq d$. Also, if we let $d’$ be the gcd of $b$ and $r$, then $d$ divides $a$. Which implies that $\gcd(a,b) = d \geq d’$. We have $d’\geq d$ and $d’\leq d$. Hence, $d’ = d$, or gcd of $a$ and $b$ equals the gcd of $b$ and $r$.
Iterating the Division Algorithm #
Let $a,b,q,r$ be integers such that
Examples #
#
Find the gcd of $537$ and $103$.
#
Find the gcd of $490$ and $110$.
Tabulation of the gcd Algorithm #
We want to find the gcd of $537$ and $103$. So, let us write $103$ (the smaller number) bellow $537$ (the larger number).
$537$ | |
---|---|
$103$ | $\times5$ |
$22$ | $\times4$ |
$15$ | $\times1$ |
$7$ | $\times2$ |
$1$ | $\times7$ |
$0$ |
We now have,
We repeat this process until we get 0.
The process can be seen in Table-2. The gcd of the two original numbers will be the first number above 0 in the table.
Another Tabulation #
$537$ | $103$ | |||
---|---|---|---|---|
$537$ | $1$ | $0$ | $537 = 537\times1 + 103\times0$ | |
$103$ | $0$ | $1$ | $\times5$ | $103 = 537\times0+103\times1$ |
$22$ | $1$ | $-5$ | $\times4$ | $537\times(1-0\times5)+103\times(0-1\times5)$ |
$=537\times1+103\times(-5)=22$ | ||||
$15$ | $-4$ | $21$ | $\times1$ | |
$7$ | $5$ | $-26$ | $\times2$ | |
$1$ | $-14$ | $73$ | $537\times(-14)+103\times73 = 1$ |
Bezout’s Identity #
In the previous section, we found two integers such that $537$ times an integer plus $103$ times another integer gives $1$. But, how do we know that such integers exist?
#
For natural numbers $a,b$ where $(a,b) = d$, $\exists x,y\in\mathbb{Z}$ such that $ax+by = d$.
Proof #
Consider the set
Now, by the division algorithm, we can write
Say $d’$ is a common divisor of $a$ and $b$. That is, $d’|a, d’|b$. So, $d’|ax_0, d’|by_0$, which implies that $d’|ax_0+by_0 = s$. So, all common divisors of $a$ and $b$ divide $s$. Hence, $s$ is the gcd of $a$ and $b$, by the definition gcd.
Breaking Kid-RSA #
Now that we have sufficient ammunition in our backpack, we can proceed to break the kid-RSA.
Say, Bhaskar chooses his encryption key, $e$, as $103$, $n$ as $537$, and decryption key as $73$. Also, assume that Aditi wants to send Bhaskar the message $20$.
So, Aditi multiplies the message with the encryption key and finds the remainder: Remainder of $20\times\frac{103}{573}$ is $449$.
Aditi then sends the message $449$ to Bhaskar.
It is hard for Diti to decode this. But, Bhaskar has his own decryption key, that no one else knows. Bhaskar’s decryption key is $73$, as we have assumed earlier. So, to decrypt the message, Bhaskar multiplies what he got by $73$, and finds the remainder when it is divided by $537$. The remainder when $449\times73$ is divided by $537$ is $20$, which was the message! Now, in order to break the code, Diti needs to use the techniques we saw in the last few slides.
Diti Breaks it! Recall that $d$ is chosen in such a way that $de\equiv 1\pmod n$ and $d<n$. Diti’s goal is to find $d$ given $n$ and $e$.
$537$ | $103$ | ||
---|---|---|---|
$537$ | $1$ | $0$ | |
$103$ | $0$ | $1$ | $\times5$ |
$22$ | $1$ | $-5$ | $\times4$ |
$15$ | $-4$ | $21$ | $\times1$ |
$7$ | $5$ | $-26$ | $\times2$ |
$1$ | $-14$ | $73$ |
Diti now applies Bezout’s identity with $(a,b) = (e,n) = (537, 103)$ and computes $x,y$ such that $ax+by=1$ using the process shown in Table-4.
She finds that $537\times(-14)+103\times73 = 1$, or $103\times73 = 537\times14+1$. This implies that $103\times17 \equiv 1 \pmod{537}$, and that’s it! We had $e=103, n=537$, and she found an integer, d, less than $537$ such that $ed\equiv1\pmod{n}$.
In general, Diti can use the gcd algorithm to find $x,y$ such that $ex+ny = 1$. We can rearrange this to get $ex=n(-y)+1$, which in turn implies that $ex \equiv 1 \pmod n$. Since $e$ is coprime with $n$, $e$ has a distinct inverse modulo $n$.
Even More Math #
Euler’s Totient Function #
Euler’s Totient function is represented by $\phi(n)$, where $n$ is the input to the function.
The function counts the number of positive integers less than and coprime with $n$.
For example, take $\phi(18)$. The positive integers less than and coprime with $18$ are: $1,5,7,11,13,17$. These are 6 in number. Hence, $\phi(18)=6$.
If n can be written as
For example,
Euler’s Totient Theorem #
#
Let $a$ and $m$ be relatively prime positive integers. Let the set of positive integers relatively prime to $m$ and less than $m$ be
Motivation #
Let us take an example. Let $m = 18$. Then, $R = {1,5,7,11,13,17}$. Let us take $a = 25$. Then, $S = {25\times1,25\times5,25\times7,25\times11,25\times13,25\times17} = {25,125,175,275,325,425}$. We now take each element of $S$ under mod $m$ or mod $18$. Doing so, we get: ${7,17,13,5,1,11} = {1,5,7,11,13,17}$, which is the same as $R$ !
Observe that all elements of $S$ are co-prime to $18$. Why? So, if we take each element of $S$ mod $18$, they will all be elements of $R$. In other words, $S\subseteq R$. The length of $S$ is the same as the length of $R$. What can you do with this? What do you have to do to prove that $S\equiv R \pmod{18}$ from here?
We have
Each element of $S$ is distinct. Also, the product of two numbers, both coprime to $m$ is also coprime to $m$. Hence, all elements of $R$ when taken modulo $m$ are elements of $S$. Also, the length of the sets $R$ and $S$ are equal. So, if we can prove that all elements of $R$ are distinct mod $m$, we can prove that $R\equiv S\pmod m$
Assume that
Since $a$ is coprime with $n$, $a \neq 0 \pmod n$. Hence, $a_x - a_y = 0 \pmod n$, or $a_x = a_y \pmod n$. This only happens when $a_x= a_y$.
Since all elements of $R$ are distinct, and $a_x , a_y$ are both part of $R$, this can only be true when $x = y$.
With that, we are done.
For $a$ coprime to $m$, we have
Define $S,R$ as the same as in the previous theorem.
We know that $R\equiv S\pmod{m}$. Hence, the product of the elements of $R$ will be equal to the product of the elements of $S$ mod $m$. In other words:
Rearranging the above, we get:
Now, the product of all elements of $S$ is coprime to $m$ since all it’s elements are coprime to $m$. So, we must have
#
Evaluate $65^6$ mod $18$.
By Euler’s Totient Theorem, since $65$ is relatively prime to $18$,
$65^{\phi(18)} \equiv 1 \pmod{18}$.
$\phi(18) = 6$. Hence, $65^6 \equiv 1\pmod{18}$ (Check this using a
calculator yourself!)
Fermat’s Little Theorem #
#
By multiplying both sides of Eq-58 by $a$, we get
Fermat preceeded Euler, so he did not prove his theorem using Euler’s Totient Theorem as we did. He instead used induction. We leave it as an exercise to the reader to prove it using induction.
RSA #
Let us now extend the concept of Kid-RSA (which is easy to break, as we just saw), to a full blown encryption scheme that can’t be broken so easily.
The following steps are followed:
Bhaskar chooses public key, $e,n$ such that $e$ is less than $n$ and coprime with $\phi(n)$ and private key, $d,n$ such that $d$ less than $n$ and coprime with $\phi(n)$ and $de\equiv 1\pmod{\phi(n)}$. He then publishes his public key pair to everyone, including Aditi and Diti.
Aditi encodes the message she wants to send, say $m$, by finding the value of $m^e \mod n$ and sends the result, say $r$, to Bhaskar.
Bhaskar then finds the value of $r^d \mod n$, *Drumroll* and that is the message!
How does this work? Why does Bhaskar get back the same number that Aditi sent?
Why RSA Works #
Bhaskar chooses his private key such that :
Then,
We can rewrite Eq-60 as:
Hence,
Choosing $n$ #
Notice, that what we did in the previous slide holds true only if $m$ is coprime with $n$. So, how do we make sure that it always holds.
We have seen before that the message $m$ is smaller than $n$. So, if we can always choose $n$ such that it is the product of primes, all greater than $m$, then $m$ and $n$ are coprime.
So, we let $n$ be the product of two large primes, $p$ and $q$.
Also note that for $de \equiv 1 \pmod{\phi(n)}$ to exist, $e$ must be coprime with $\phi(n)$.
Generating $d$ #
There’s another problem. Sure, Diti can’t find d, but how will Bhaskar find it himself, when he is creating his private key?
He uses that fact that $n$ is the product of two large primes, $p$ and $q$. While Bhaskar knows both $p$ and $q$, Diti only knows $n$. As we have seen before in one way functions, it is very hard to factorize a large number, but it is easy to find the number given its factors.
Bhaskar needs to choose $d$ such that it satisfies Eq-60. $\phi(n)$ in Eq-60, where $n = pq$ is given by
On the other hand, Diti doesn’t know $p$ and $q$. So, it is impossible for her to find $d$ such that it satisfies Eq-67.
Let us take an example to understand this better. We shall use small primes, $p$ and $q$ so that it is easy to follow.
An Example to Explain RSA #
First, Bhaskar needs to create a private key, for him to decrypt Aditi’s message, and a public key for Aditi to Encrypt her message. Bhaskar’s private key includes $d,n$ and his public key includes $e,n$. He first chooses $n$ such that it is a product of two primes, say $13$ and $37$ (In reality , the primes chosen are very large). Hence, $n = 13\times37 = 481$.
Now, Bhaskar chooses $e$ such that $e$ is coprime with $\phi(481) = (13-1)(37-1) = 12\times36 = 432$. So, $e$ must not have $2$ or $3$ as prime factors. Say, he chooses $e = 65$.
With that, Bhaskar has created his public key. He now needs to find $d$ to create his private key.
Generating a Private Key #
Recall that $de \equiv 1 \pmod{\phi(n)}$. $\phi(n) = 432$, as we have seen earlier. Also, $e = 65$.
Hence, he chooses $d$ such that $65d \equiv 1 \pmod{432}$. In other words, $65d = 432k+1$, for some integer $k$. Or, $65d + 432(-k) = 1$. By Bezout’s identity such integers, $d$ and $k$ exist, since $gcd(432, 65) = 1$. He now applies Euclid’s gcd algorithm on $432$ and $65$.
$432$ | $65$ | ||
---|---|---|---|
$432$ | $1$ | $0$ | |
$65$ | $0$ | $1$ | $\times6$ |
$42$ | $1$ | $-6$ | $\times1$ |
$23$ | $-1$ | $7$ | $\times1$ |
$19$ | $2$ | $-13$ | $\times1$ |
$4$ | $-3$ | $20$ | $\times1$ |
$3$ | $14$ | $-93$ | $\times4$ |
$1$ | $-17$ | $113$ |
After applying the Euclidean gcd algorithm on $65$ and $432$ (Table-5), Bhaskar got $65\times113 + 432(-17) = 1$. What Bhaskar wanted was: $65d + 432(-k) = 1$.
Equating the LHS of the two equations, we get $65d + 432(-k) = 65\times113 + 432(-17)$ Which implies that $d = 113$.
Hence, Bhaskar has arrived at his public key, $(65, 481)$ and private key, $(113, 481)$.
Encrypting the Message #
Bhaskar has created both a private key and a public key,
He shares his public key with everyone, including Aditi, and even Eve.
Say, Aditi wants to send Bhaskar the message $10$ ($10$ is less than both $13$ and $37$). She raises $10$ to the $e$th power. Here $e$ is $65$ (a part of Bhaskar’s public key). She then divides the result by $n = 481$ and finds the remainder. In other words, she finds $10^{65} \mod 481$. The result is $433$ (Check it!)
So, she sends $433$ to Bhaskar.
Decrypting the Message #
Now, Bhaskar receives $433$. He raises $433$ to the $d$th power, and finds the remainder. He had chosen $d = 113$ earlier. So, he finds the remainder when $433^{113}$ is divided by $481$. Doing so, he gets back $10$ (Check!)
Now, Diti intercepts the message. She sees $433$. She also knows Bhaskar’s public key $(65, 481)$. What Eve needs to do in order to decode the message is that, she needs to find $d$. She also knows that $de \equiv 1 \pmod{\phi(n)}$, or $65d \equiv 1 \pmod{\phi(481)}$.
She can easily find $d$ using Euclid’s algorithm if she knows $\phi(481)$. She also knows that $481$ is the product of two primes, and that $\phi(pq) = (p-1)(q-1)$.
So, her aim now becomes to factorize $481$. If she does so, she can decrypt Aditi’s message.
In this example, Bhaskar chose two small primes to multiply to get $n$, just to give us a good understanding of RSA. However, in reality, the two primes are very large. It is very very hard for Eve to factorize a number that is the product of two large primes, in a practical amount of time.
This makes RSA practicaly unbreakable
Primality Testing #
Miller Rabin Primality Test #
This is a probabilistic test, where the a prime is always correctly identified, but there is a space for false positives with the occurrence of Fermat’s Pseudoprimes.
It relies on Fermat’s little theorem. Which states that for a given prime number $p$, for any natural number $a$, where $1 < a < p-1$,
Miller Rabin utilizes the following extension of Fermat’s little theorem
If $p$ is an odd prime number, such that