Information Security Engineer
Mozilla Corporation
Cryptography is the practice and study of secure communications in the presence of third parties. It does this by providing three primary security guarantees:
Most importantly, cryptography is a story about a couple of truly lovely people by the name of Alice and Bob.
They just want to have a private, secure
conversation over an insecure network.
So let's make their wish come true!
The way way back machine!
… dates back to approximately 1500 BCE
These ciphers comprised the entirety of cryptography up until the beginning of modern cryptography with World War II and the creation of the Enigma machine.
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W |
Example:
That said, rotating 26 characters means that effectively
no encryption is done. 26 is a weak key!
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
G | J | Z | X | C | K | W | U | F | E | S | Q | D | L | I | R | P | A | O | M | Y | T | V | H | B | N |
Example:
Let's take a look at a simple example...
Canonicalize:
MOZIL LAISM YDINO SAUR
Example:
It turns out that not every letter in the
English language appears equally as often
Ideally, each letter should appear at the same frequency
Two consecutive "L"s encrypt to different outputs
What if the key was exactly as long as the message?
(and also totally random)
Congratulations! You've just invented the only truly unbreakable form of cryptography.
XOR, sometimes written ⊕, is an operator that outputs true (1) if the inputs differ and outputs false (0) if they are the same.
Message: | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Key: | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 |
Output: | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 |
XOR has the special property that its output is as random as the most random input
If your key is a one-time pad, then your message can never be broken!
So why don't we use one-time pads for all of our encryption, if it's so unbreakable?
The impracticality of having extremely long keys combined with the key reuse requirements makes the OTP infeasible for much real-world use.
Anyone selling them is likely a snake oil salesperson.
Symmetric-key cryptography, like historic cryptography, uses the same key for both encryption and decryption:
Here, plaintext is being encrypted by a key into ciphertext, and then decrypted with that same key into the original plaintext.
Remember that in a proper cryptosystem, the key is the only thing that keeps your data confidential. As such, keys should be:
When we talk about keys, we often say "128-bit keys" or "256-bit keys." What does that mean?
How big is 2128?
If a single key was the size of a grain of sand, the key space would be 31x the size of Jupiter! 310 bits gets you up to the size of the known universe.
Due to its speed, symmetric-key cryptography is the primary form of encryption for bulk encryption.
Cipher | Key size | Type |
---|---|---|
AES | 128, 192, or 256 bits | Block |
ChaCha20 | 256 bits | Stream |
Camellia | 128, 192, or 256 bits | Block |
3DES | 192 bits | Block |
RC4 | 128 bits | Stream |
Twofish | 128, 192, or 256 bits | Block |
Mozilla recommends AES with 128 or 256-bit keys or ChaCha20.
When in doubt, contact us!
The key and nonce are combined to generate a keystream, which is then XORed with the plaintext to generate the ciphertext
Why is it a problem that the same input
always encrypts to the same output?
To solve this, all block ciphers have
modes of operation for their use:
† More later!
This is meant to scare you! Even with good
ciphers, you can still screw up cryptography!
The various block cipher modes each provide their own benefits besides simply eliminating patterns in ciphertext
This is the counter (CTR) block mode of operation
This is the counter (CTR) block mode of operation
This is the counter (CTR) block mode of operation
This is the counter (CTR) block mode of operation
This is the counter (CTR) block mode of operation
The nonce must always be unique, and the counter should never loop!
Remember Alice and Bob?
They still want to have that conversation.
No problem, they physically
meet up and exchange a symmetric key.
After that, they encrypt all their messages
using that key, a symmetric cipher, and an
appropriate block mode of operation.
And what about our good friend, Eve the Eavesdropper?
And what about our good friend, Eve the Eavesdropper?
Well, she's disappointed by this
whole symmetric-key cryptography business.
See, when Alice and Bob use symmetric-key cryptography correctly, it provides them confidentiality.
Eavesdroppers like Eve can't decrypt and read their messages.
But what if our attacker wasn't just
passively listening like Eve was?
A round of applause for Mallory, everyone!
Mallory just loves to sit in the middle
of conversations and meddle with them.
What if Mallory was intercepting
Alice and Bob's conversation?
Mallory couldn't decrypt the ciphertext,
but she could still modify it.
Alice and Bob have achieved confidentiality, but they haven't achieved integrity or authentication.
What Alice and Bob need is a
message authentication code.
A message authentication code provides integrity and authenticity assurances about a message.
† Generally the ciphertext, known as Encrypt-then-MAC
Mallory can modify the ciphertext and
MAC to her heart's content.
But because she doesn't know the symmetric
key, she can't generate the correct MAC.
So when Bob plugs the message and MAC key into the MAC function, he'll quickly see that the MAC he received doesn't match the MAC he generated.
You're probably thinking this whole MAC thing might
as well be called a Magic Authentication Code.
How might a function like this work?
One such method is a keyed-hash message authentication code, or HMAC.
So, err… what's a hash?
Cryptographic hash functions take any
size input and convert it to a fixed size
output called a hash or message digest.
What might this look like?
This presentation can read your mind†
It knows you're thinking, “Ah ha! I've totally got this!”
†Bwahahahahaha! 😆
You run a photo sharing site with a public API:
Everything look great?
Turns out, that piece of code is horribly insecure, due to length-extension attacks on many hash functions.
I did warn you not to write your own crypto!
Just kidding. I know you'd never do something like that.
You'd go to your carefully-sourced cryptographic
library and use HMAC as instructed.
After all, the way they are constructed†
makes them immune to these attacks.
†HMAC = H((K ⊕ opad) || H((K ⊕ ipad) || m))
So when Alice sends a message to Bob,
she can protect it with HMAC.
The symmetric-key crypto provides confidentiality, and HMAC provides integrity and authentication.
Just when you thought things couldn't get any better…
…they get better! 😍
Cryptographers recognized that combining a cipher, block mode, and MAC was really hard and error prone.
She then sends the ciphertext, associated data, and authentication tag to Bob
Unfortunately, symmetric-key cryptography doesn't solve the most persistent of cryptography problems.
Symmetric-key cryptography becomes impossible
to scale as the number of parties grows.
Remember Alice and Bob?
They still want to have a conversation.
No problem, they met up and exchanged a key.
One key for two parties.
But what happens when we add more parties?
With just five parties, we're already up to 10 keys!
Number of parties | Number of key pairs |
---|---|
2 | 1 |
3 | 3 |
5 | 10 |
10 | 45 |
100 | 4950 |
1000 | 499,500 |
US population | 52 quadrillion |
World population | 2.54 x 1019 |
So how do we fix this?
Or asymmetric-key cryptography,
if you prefer to think of it that way.
Public-key cryptography attempts to solve the thorniest issues in symmetric-key cryptography:
In 1976, this guy struck upon a pair of brilliant ideas.
What if, instead of needing to meet ahead of time to exchange keys, two people could exchange keys remotely?
And our good friend Eve couldn't
figure out what their key was?
(Let's all give a hearty welcome to our eavesdropper, Eve!)
Diffie-Hellman key exchange† (DHE) allows two parties who have had no prior communications to jointly establish a key remotely, even when their method of communication is insecure.
† This class demonstrates classic DHE, instead of
the recommended Elliptic Curve DHE, which will be covered later
Known | Unknown | Known | Unknown | Known | Unknown |
---|---|---|---|---|---|
p = 23, g = 5 | p = 23, g = 5 | p = 23, g = 5 | |||
a = 6 | b | b = 15 | a | a, b | |
A = ga mod p | B = gb mod p | ||||
A = 56 mod 23 = 83> | B = 515 mod 23 = 19 | ||||
B = 19 | A = 8 | A = 8, B = 19 | |||
s = Ba mod p | s = Ab mod p | ||||
s = 196 mod 23 = 2 | s = 815 mod 23 = 2 | s = 19a mod 23 s = 8b mod 23 |
|||
s = 2 | s = 2 | s |
Blue values are public, whereas red values are private (secret).
Alas, Diffie-Hellman key exchange has a fatal flaw.
While it protects against eavesdroppers like Eve, it doesn't help against someone a bit more… malicious.
What if Mallory sat in the middle of a
conversation between Alice and Bob,
pretending to be the other person?
Alice would do her key exchange with Mallory, while Bob would do a different key exchange with Mallory.
In the end, Alice and Bob would both think that they had securely exchanged keys, but really they would both be talking to Mallory.
Alice and Bob are both having perfectly
secret conversations with Mallory.
In cryptography, we would say that we have achieved confidentiality, but we don't yet have authenticity.
So what about his other idea?
What if, instead of two people sharing a key...
...each person had two keys?
A public key that they could give out to anyone.
And a private key that they kept to themselves.
And what if that public key could encrypt messages
that only the private key could decrypt?
Well, it was great idea.
But he — nor anybody else at the time — knew
how to make the mathematics work.
Luckily for all of us, a trio of folks by the name of Ron Rivest, Adi Shamir, and Leonard Adleman discovered a way to make it work.
You might know them by their initials: RSA
RSA is based on a one-way mathematical function known as a "trapdoor function." These functions are easy to do in one direction, but difficult to do in the other direction.
More specifically, RSA relies on the difficulty of factoring the multiple of two large prime numbers.
Is it divisible by 2? Nope. How about 3? 5? 7? ...
† For more information on how this is done,
see the Wikipedia article on the RSA cryptosystem.
Let's take a look at how RSA encryption
works in practice…
First, Alice asks Bob for a copy of his public key.
Once she has Bob's public key,
she then encrypts her message with it.
Next, Alice sends the ciphertext to Bob.
Finally, Bob decrypts the ciphertext with his private key.
Because Eve doesn't have Bob's private key,
she can't decrypt Alice's message.
Now with 100 people, we just need 100 public-private key pairs, instead of 4950 shared keys.
Plus, Bob can encrypt a special piece of information with his private key that anybody can decrypt with his public key to assert his identity.
We call this signing, but more on that later.
This is 16x larger to transmit in addition to being far slower
Also, how does Alice know that she's actually getting Bob's public key when she asks for it?
We'll get to that, but there's a few more
concepts we need to understand first!
We're finally getting to the good stuff!
What if instead of just sending Bob the plain hash, she created a digital signature by encrypting the hash with her private key and sent that to Bob?
That's why we called this signing from earlier!
Bob would then decrypt it with Alice's public key to get back to the original hash. He would then compare it to the message's hash to see if it actually came from Alice.
Sure, Mallory could make up her own message and say it was from Alice. But since she doesn't have Alice's private key, she can't create the correct digital signature.
So when Bob tries to decrypt the digital signature with Alice's public key, it would produce a pile of gibberish that doesn't match the correct one!
Of course, if Mallory could forge a message that produced the same hash as Alice's message, she could just reuse Alice's digital signature.
But because hash functions are collision resistant, Mallory can't do that either!
No wonder Mallory looks so grumpy!
Plus, since only Alice is able to create the proper digital signature, she can't deny that she created the message.
In cryptography, we call this non-repudiation.
First, Alice takes her message and
encrypts it with Bob's public key.
Next, she creates a hash of the ciphertext, and encrypts it with her private key, creating a digital signature.
She then sends the encrypted message
and the digital signature to Bob.
Bob takes the encrypted message from Alice,
and decrypts it with his private key.
To verify the message came from Alice, he creates a hash of the ciphertext, just as Alice did.
Bob then takes the digital signature
and decrypts it with Alice's public key.
If the two hashes match,
he knows the message came from Alice.
We have confidentiality, because the message was encrypted with Bob's public key.
We have authentication, because only someone with Alice's private key could create a signature that would properly decrypt with Alice's public key.
And we have integrity, because if the ciphertext or digital signature were modified or replaced, then the resulting hashes wouldn't match.
Not shown: In real-world implementations, we typically create a symmetric key and encrypt the message with that. The symmetric key is then encrypted using the public key of Bob and signed with Alice's private key.
Using public-key cryptography and digital
signatures gets us a complete cryptosystem.
But one unsolved problem remains.
How does Alice know that she
actually has Bob's public key?
What if I told you that there was a way?
What if we created a document about Bob,
containing his public key?
It would be kind of like a driver's license, but for cryptography.
And what if there was someone that
we trusted to verify identities?
Just like how your government verifies your driver's license identity.
They could sign Bob's document,
verifying that the public key actually
belonged to Bob and nobody else.
As long as we trusted them to properly verify
the ownership of the public keys, we could
let them sign anybody's document…
…whether it be Alice's, Bob's, or Eve's.
This way, we'd only need to find a way to securely acquire a small number of verifier's public keys.
We call these so-called verifiers
certificate authorities.
And those documents?
Well, they're called certificates.
Your operating system and browser come preloaded with a collection of certificates that contain the public keys of these certificate authorities (CAs).
This is the certificate for www.mozilla.org.
DigiCert was the certificate authority
that issued the certificate.
Certificate authorities have a
number of responsibilities:
Let's take a look at an actual certificate…
Serial Number: 04:e4:eb:1e:7f:8c:51:09:db:bf:0c:1c:7f:41:16:91 Issuer: C=US, O=DigiCert Inc, OU=digicert.com, CN=DigiCert High Assurance EV CA-1 Validity: Not Before: Dec 2 00:00:00 2013 GMT Not After : Dec 7 12:00:00 2015 GMT Subject: C=US, ST=CA, L=Mountain View, O=Mozilla Foundation, CN=www.mozilla.org Subject Public Key Info: Public Key Algorithm: rsaEncryption Public-Key: (2048 bit) Modulus: 00:b8:71:c1:e0:d1:87:20:8d:bc:56:6e:16:ad:21: 63:64:de:58:33:46:c4:06:e5:5b:3d:cc:1b:c0:10: [...] Exponent: 65537 (0x10001) X509v3 extensions: X509v3 Subject Alternative Name: DNS:www.mozilla.org, DNS:mozilla.org X509v3 Key Usage: Digital Signature, Key Encipherment, TLS Web Server Authentication, TLS Web Client Authentication X509v3 CRL Distribution Points: URI:http://crl3.digicert.com/evca1-g4.crl Signature Algorithm: sha1WithRSAEncryption 19:e3:3a:a0:e8:3e:82:7f:dc:ad:56:f3:eb:b1:5c:b0:73:37: 6e:cf:df:51:08:f5:0b:36:17:dc:f8:a4:1e:df:95:8e:aa:47: [...]
If you were paying attention, you may have noticed that there are two certificates listed above www.mozilla.org in what is called the certificate chain.
The top-level certificate is the root certificate.
This is the certificate your browser comes bundled with.
The center certificate is the intermediate certificate.
CAs don't sign end-user certificates with their root certificate, as the risk is too high.
If the private key for the root were ever compromised, every certificate the CA had issued would instantly become invalid when trust in the public key was revoked.
Instead, CAs create an intermediate key pair and certificate, and sign that with the root certificate.
This intermediate CA then issues all end-user certificates. It's periodically replaced as technology progresses.
Finally, you have the end-entity (EE) or leaf certificate.
This is the certificate that identifies
an individual or organization.
This whole system of trust, registration, certificates, chains, authorities, and revocation is collectively known as the public key infrastructure, or PKI.
Now that we've got a strong grasp of all the cryptographic primitives, let's walk through the most important implementation.
Transport Layer Security (TLS), previously known as Secure Sockets Layer (SSL), is the protocol used by applications to communicate securely across a network.
It's the protocol that protects our web browsing, emails, instant messages, and much more.
First, Alice sends Bob a random number and the list of cipher suites that she supports.
A cipher suite is a combination of key exchange, authentication, encryption, and message authentication.
Feast your eyes: ECDHE-RSA-AES128-GCM-SHA256
Bob then selects a different random number and chooses a cipher suite, from Alice's options.
He'll also generate his Diffie-Hellman parameters, if that cipher suite uses DH for key exchange.
These random numbers are used later in key generation to ensure that an attacker can't just replay Alice's messages, in a replay attack.
He then takes both random numbers and the public Diffie-Hellman parameters and signs them with the private key for his certificate, to create a digital signature.
He then sends Alice:
First, Alice takes Bob's certificate (sans digital signature) and creates a hash of it.
Then, she takes the digital signature in the certificate and decrypts it with the intermediate CA's public key.
If the two hashes match, she knows that the certificate was signed by the intermediate CA.
She repeats this process for every certificate in the chain until reaching the one signed by the root CA's public key.
Remember: this key came bundled with her browser!
If the two hashes match, she knows that the certificate chain is valid and Bob's public key actually belongs to Bob: the CA confirmed it with its digital signature.
Now Alice just needs to verify that she's talking to Bob, and not just somebody with Bob's certificate.
Remember, since certificates are public,
everybody has access to Bob's certificate.
First, Alice takes the random numbers and DH parameters that “Bob” sent and hashes them.
Then she takes the digital signature that “Bob” also sent, and decrypts it with the public key in Bob's certificate.
If the two hashes match, it could only have been created by somebody with Bob's private key.
It's Bob, people!! Bob generated that signature! 😂
Since p and q are already chosen, Alice just needs to choose her private DH parameter.
Alice then sends Bob: ga mod p = A
They use s and the random numbers
to create a master secret which they use to
generate a bunch of shared symmetric keys!
Don't hyperventilate! We'll get to message authentication codes in a bit!
Now they simply begin communicating with their symmetric keys, using the cipher and mode of operation specified in the chosen cipher suite.
But before they do that, they must verify that their previous conversation, known as a handshake, wasn't a complete farce!
Now that they've arrived at their keys, Alice takes their entire previous conversation and hashes it.
She then combines it with the master secret
and the message “client finished” and
encrypts it with her encryption key.
She then takes this encrypted “finished” message
and sends it over to Bob.
Bob takes encrypted “finished” message
and decrypts it with Alice's encryption key.
At the same time, he takes the handshake he
believes they had and hashes it, as Alice did.
If the two “finished” messages match, he knows that
their handshake hadn't been tampered with.
This prevents someone from replacing the
cipher suite choices with something very weak.
Bob then encrypts the hash of their handshake, the master secret and the message “server finished” and
encrypts it with his encryption key.
He then sends his version of the encrypted
“finished” message over to Alice.
Alice takes Bob's encrypted “finished” message
and decrypts it with Bob's encryption key.
If the hashed handshake and master secret match Bob's “finished” message, she also knows their handshake hadn't been tampered with.
Alice takes her message and combines it with her symmetric MAC key in a special way.
She then takes that combination, and hashes it. This produces a hash-based message authentication code, or HMAC.
Without knowing the symmetric MAC key, you can't produce the same HMAC.
Who are the two people with the symmetric MAC keys?
Why, it's our good friends Alice and Bob.
So when she encrypts a message, she sends along the HMAC for that message.
Bob takes the ciphertext and decrypts it, returning back to original plaintext.
Bob then combines it with the symmetric MAC key that only he and Alice have, producing an HMAC.
If the HMACs match, then he knows the message hasn't been tampered with and that it came from someone else who possesses the symmetric MAC key: Alice!
Think of HMACs as a kind of digital signature that you can do with symmetric-key cryptography.
However, unlike digital signatures, they can only show that the message came from a party who possesses the symmetric key.
They don't provide authenticity or non-repudiation.
Once Alice and Bob are finished communicating, they throw away the symmetric keys and Diffie-Hellman parameters.
This ensures that their conversation can't be decrypted in the future, a feature known as forward secrecy.
Note: you can use RSA instead of Diffie-Hellman during the handshake,
but you lose forward secrecy.
Who here is ready to go write your own cryptography?
Go on. Raise your hands. You know you want to.
Noooooooooooo! It's a trap!
Bad crypto | Bad implementations |
---|---|
DES | BREACH / CRIME |
MD5 | FREAK |
RC4 | Goto Fail |
SHA-1 | Heartbleed |
Logjam | |
Lucky Thirteen | |
POODLE | |
WEP (Wireless Encryption) | |
Windows LM Hashes | |
(and on and on and on and on) |
Most vulnerabilities in cryptography are due to people using it incorrectly, and not due to the crypto itself.
Even when some of the best cryptographic engineers in the world work on crypto, they can still get it wrong.
Don't make the same mistake: don't write your own crypto!
You run a photo sharing site with a public API:
Everything look great?
Turns out, that piece of code is horribly insecure, due to length-extension attacks on SHA-2.
Should you be expected to know this?
No, probably not. It's an easy mistake to make, and demonstrates just how easy it is to screw up when you develop your own cryptography.
Don't believe me? The developers at Flickr, Vimeo,
and many others all made this exact mistake.
↓ P.S. Another example lies below! ↓
↓ How 'bout now? ↓
Woe! This code insecure too, due to a timing attack.
Cryptography isn't like programming.
When you make a mistake in your code,
you get immediate feedback when it crashes,
fails to work, or doesn't compile.
With cryptography, you only know you've screwed up when someone makes off with all your data.
Don't write your own crypto!
Don'ts | Do's |
---|---|
Combine primitives yourself | Use a purpose-built library |
Guess at which cipher suites to use | Use the Mozilla Server Side TLS guidelines |
Write your own encryption protocol | Use TLS and SSH |
Review your own cryptosystem | Have it reviewed by Security Engineering (Richard Barnes, Eric Rescorla) |
Skip using crypto because it's too hard | Contact Enterprise Information Security in #infosec |
Most of the demos in this presentation are actually
done these days with ECDHE instead of DHE!
Post-presentation, reference definitions belooooooow!