Cryptography solves a number of problems related to security. Two of the problems involve the security of a message you send to some other party. You don’t want anyone else to be able to read the message, and you don’t want anyone else to forge a message and pretend it was from you. The second problem is especially important in things like electronic signatures.

Conventional Cryptography

Two basic forms of cryptography have been invented: conventional cryptography and public-key cryptography. Conventional cryptography describes all of the different ways of securing a message where knowledge of how to encrypt a message tells you how to decrypt it. For instance, we could decide that we will substitute each letter in a message with the next letter in the alphabet. "Hello World" would become "Ifmmp Xpsme". But if someone knew how the letters were originally substituted, they would know how to reverse the process.

Conventional cryptography can be very secure. One type of conventional cryptography that is absolutely secure uses a side effect of the boolean xor operation. In boolean logic, the xor operator returns the value true if only one of its two arguments is true. If true and false are represented by 1 and 0, then we can represent the results of xor in the following table.

first argument second argument xor result
0 0 0
0 1 1
1 0 1
1 1 0

If we represent our message as a sequence of ones and zeros (as they are in a computer), then we can use the xor operator to convert those ones and zeros into an encrypted version. In the ASCII character format, the letter A is represented by the number 65. In binary, that is 01000001.

Let’s make up a new sequence of ones and zeros that will be our key: 01101001. Here’s what we get when we xor the two sequence of numbers together:

     01000001
xor 01101001
     —————
     00101000

The interesting property of xor that makes it good for cryptography is that we can use our same key a second time, and xor will recover our original message:

     00101000
xor 01101001
     —————
     01000001

The same key will both encrypt and decrypt the message, and as long as the key is known only to ourselves and the party we are communicating with, we should be absolutely secure.

Xor-based encryption is very, very fast for a computer to process, but it suffers from the same problem that afflicts all conventional cryptography: how do we make sure that this single key is known only to the sender and recipient of the message? This problem wasn’t solved until 1975.

Public-key Cryptography

In 1975, Whitfield Diffie and Martin Hellman decided to tackle this problem. The problem with conventional cryptography is that someone could intercept the encryption/decryption key while the two people who want to use it were agreeing on it. Was there a way for the two people to exchange enough information about the key (rather than the key itself) so that they know what key they agreed on, but no one else does? They published a paper called New Directions in Cryptography that outlines such an approach, and public key cryptography was born. The public information does not give away enough information for others to decrypt messages.

Let’s say we have two people named Alice and Bob. In the Diffie-Hellman algorithm, Alice makes up a number a, and Bob makes up a number b. Then, they publicly exchange two values: p, which is large prime number, and g which is some number less than p.

  1. Alice uses the values a, g, and p to calculate ga mod p. At the same time, Bob calculates gb mod p.

  2. Alice and Bob exchange the results of their calculations.

  3. Alice then uses gb mod p (that she received from Bob) to calculate (gb)amod p, and Bob calculates (ga)b mod p.

The values that Alice and Bob calculated in step three are equal, and we can call them both k. Alice and Bob have agreed on a key k without ever exchanging more information about it than g, p, ga mod p, and gb mod p. From a math point of view, it is not feasible to reverse engineer either a, b, or k, from just that information in any reasonable amount of time. At this point, Alice and Bob can communicate with each other securely using k to encrypt and decrypt messages through some means like the previously mentioned xor operation.

Diffie-Hellman does suffer from one problem though. Our example people Alice and Bob can’t know for sure that they are exchanging information with each other. If a third person, Carol, sits in the middle and pretends to Alice to be Bob, and pretends to Bob to be Alice, then Carol can produce a key k1 that she will use with Alice and k2 that she will use with Bob. If Alice tries to send a message to Bob, Carol can intercept it and decrypt it with k1. After reading the message, she can encrypt it with k2 and send it to Bob.

Neither Alice nor Bob will realize there is a person in the middle. This is known as the "middle-man" problem.

To solve this problem, we need a new algorithm. In 1977, Ron Rivest, Adi Shamir and Len Adleman invented a new public-key cryptography algorithm called RSA (after the first letters of each of their last names) that avoids the middle-man problem. RSA has the following properties:

  1. Instead of a single key for encryption and decryption, we have two keys.
  2. If either key is used to encrypt a message, the other key can be used to decrypt it.
  3. There is no way of figuring out one key based on the other.

Alice would generates the two keys, and published only one of them. The published key is known as the public key. She keeps the other key as the private key. If Bob wishes to send her a message securely, he encrypts it with the public key. Since Alice is the only one with the private key, only she can read the message. If Bob wishes to receive a message, all he has to do is generate a pair of keys and publish one of them as the public key for Alice to use.

To generate the two keys, one performs some mathematical operations on a pair of large prime numbers. Each key contains the product of the two prime numbers, and if it were feasible to factor the product into each of the prime numbers, then the private key would be revealed. So far, no one has been able to crack RSA keys, as long as the initial prime numbers are large enough.

RSA solves the two problems we initially discussed — keeping others from reading your messages and keeping others from forging messages in your name.

Keeping Others From Reading Your Messages

With RSA, no one can read your messages as long as you keep your private key secure. If everyone keeps their own private key secure, then everyone can securely communicate to everyone else using the public keys, and know that no unauthorized person is reading the messages.

One problem with RSA is that it is actually pretty hard for a computer to encrypt and decrypt a message using the two keys. It’s still pretty fast, but it’s not practical for widespread use in things like SSL, which is used for securing your communication with a web site. If there are a large number of messages to be sent back and forth, it helps to use a hybrid of RSA and xor encryption. The very first message can say what xor key you would like to use for the rest of the communication. Only the very first message needs to use the slow RSA algorithm for encrypting and decrypting. After that, you can use the very, very fast xor operation to process the rest of the messages.

Keeping Others From Forging Messages

The second property of the RSA keys that we listed (that either of the two keys can be used for encryption with the other used for decryption) is useful in ensuring that a message was written by the right person. All that needs to be done is to encrypt the message with your private key. Because of this second property, the message can be decrypted using the public key. Since you are the only one with the private key, a message that can be decrypted with the public key must have come from you.

In electronic signatures, you are not typically asked to encrypt the entire message with your private key. Instead, you run the message through a well-known calculation to generate a much smaller number called a digest. A digest is a number that is very, very likely to be unique, depending on the contents of the original message. Since the digest is much smaller than the original message, the encryption operation is much faster.

A small change in the message contents will result in a very different digest, and therefore the digest can also be used to validate that the received message matches what was originally signed. If someone changed the original message, then the decrypted digest would not match with one calculated with the new message.