Padding Oracles: The Poo in Poodle
Buckle up, 'cause this involves cryptography and it's going to get a bit technical. That said, I'll try to keep it at a high level (nothing like S-boxes and avalanching); as long as you understand exclusive-or (xor) you'll have no trouble following along. To (hopefully) make it easier, I'll put major points in sections with headings.
First of all, a padding oracle has nothing to do with putting sumo suits on databases. They're a vulnerability in cryptographic systems arising from encryption algorithms needing data (we'll call it plaintext) to come in blocks of a fixed size (inventively called the blocksize). If your encryption algorithm works on blocks of size 4, but you've only got 3 bytes of data to send it, you're going to need a way to get it up to 4 bytes. The way that that's done is called padding.
Padding
We need a way to make our messages a multiple of our encryption algorithm's blocksize while still being able to tell what the plaintext was when we decrypt it (i.e. if you pad with random bytes, how will you know what's padding vs plaintext?). Fortunately, there's already a handy standard defined for us (it's PKCS #7 if you're interested). The way it works is that you encrypt each block as normal until you get to the last one. For the last block, you calculate the padding as the blocksize - the length of the last block. E.g. if your blocksize is 4 and your last block is 3 bytes long, your padding is 1, so you append a 1 to the last block; if your last block was of length 2, you'd append 2 bytes with the value of 2. More concretely, with a blocksize of 4 bytes, the plaintext {1,2,3} becomes {1,2,3,1} and {1,3} would become {1,3,2,2}. Now, you may be wondering what would happen if your plaintext was {1,2,3,1}; would the last 1 be discarded? Nope, we always add padding, so {1,2,3,1} would become {1,2,3,1,4,4,4,4} (it's not perfect since it bloats the data, but it has the nice effect that it means we can decrypt the data).
Now we've covered the case where we don't have enough data to fill a block for encryption, and we've seen that you may have more than one block of data to encrypt, and (most) encryption algorithms work on exactly one block of data, so we need to figure out what to do about that. Turning a block cipher (encryption algorithms that work on data a block at a time; i.e. most of them) into a stream cipher (one that works on an arbitrary amount of data) is an important area of research and one that's subtly easy to mess up. So, let's dig into some of those methods (which are frequently called "modes" in crypto-speak).
From Block to Stream
The most obvious approach to encrypting a stream of data longer than one block is to just keep applying the algorithm. This particular way of doing things is called Electronic Code Book (ECB), and it has one really nasty problem: using ECB means that any particular byte will always encrypt to the same value for a given key. A demonstration may help demonstrate why that's so bad.
Starting with this plaintext image:
I applied a Caesar cipher (really simple algorithm that does a constant mapping from one byte to another; you'd never use this in practice, but it exacerbates the effects of ECB so they're easier to see), chaining them with ECB.
The result was this image:
That's obviously not doing a great job of hiding what the plaintext was. In fact, with this particular encryption algorithm, it's no more secure than a newspaper puzzle.
It's pretty clear that we're going to need something better, something that muddles the plaintext of each block before it goes into the encryption algorithm.
Our next contender mixes up each block of plaintext by applying xor to it and the result of encrypting the previous block and is called (again with the imaginative names) Cipher Block Chaining (or CBC for short; we LOVE acronyms!)
To see how CBC fares, I put it through the same encryption algorithm used for ECB and got this image (spoiler alert: you can still kind of see the shape; that's because the encryption has a blocksize of 1):
Even with such a painfully simple encryption algorithm, CBC isn't a complete failure. You actually have to look at the image to see the schooner Pac Man and you can't readily tell that there are two of them overlapping.
The only obvious problem with CBC is what do you do about the first block? You could just work from its plaintext, but then the first block would have the same issues as ECB (and the first block in any data exchange tends to have a lot of data that's the same every time; e.g. GET / HTTP/1.1\r\n)
To avoid having our first block basically be in ECB mode, we add a block of random data, called an Initialization Vector (IV; goooooooo acronyms!) Using an IV, CBC mode looks pretty secure and it was used for a very long time. Unfortunately, appearances can be deceiving and there's a devastating vulnerability hidden there. To see it, let's make a trip to Delphi to see the padding oracle.
The Padding Oracle (of CBC Mode)
Before getting into the details of the padding oracle, a quick refresher on the only thing you'll need to know about xor -- xor undoes itself. By that, I mean that if you have a xor b = c, then c xor a = b and c xor b = a. It is to itself as subtraction is to addition.
The base assumptions for the padding oracle are that you have the ciphertext (one or more blocks of it) and if one end of the communications channel receives data that it can't decrypt properly, it'll respond with an error message. One cause that it may not decrypt properly is that the padding is wrong. Looking back, we see two things: 1) we know what the last byte of a block has to be if it's the last block of a stream and 2) the plaintext will be modified by xor'ing it against the prior block's ciphertext prior to encrypting (which means that the decryption will yield the xor of the plaintext and the previous block's ciphertext).
Given those assumptions, here's how the padding oracle works. For the sake of simplicity, I'll refer to the end of the communications channel to which we're sending data to be decrypted as the server.
- Take one block of ciphertext out of the stream (doesn't really matter which one)
- Send that block to the server with a custom IV
- If you get an error that decryption failed, change the last byte of the IV and try again; repeat this step until decryption succeeds and you'll know that the last byte of the plaintext xor'ed with your IV is one of two values -- either 1 or 2 if the penultimate byte of the xor'ed plaintext is 2 (or 3 if the 2 penultimate bytes are 3, etc)
- At this point, you suspect that the plaintext xor the IV that you're sending ends in 1; so let's say that the last byte of the original plaintext is P and the last byte of the IV is I and the last byte of the xor'ed plaintext is Px. We know that P xor I = Px and we know that Px is probably 1, so we can reorder that so that Px xor I = P and now we can calculate P. The oracle has started speaking and is pronouncing the doom of your message's secrecy.
- Now that we know what the last byte of the actual plaintext is (what we found in 4), we can change our IV so that the last byte will decrypt to 2 and repeat step 3 modifying the next-to-late byte until the server doesn't report a decryption error.
- If you get through all 256 possible values for that byte without a success, then you know that you did not find the actual value of the last byte, so you drop back to step 3 and continue until (with the IV that you were using in 3, without 4's changes) it successfully decrypts again.
- Once you get a success, you've definitely found the actual value of the last byte. You can use that and the value that you got the first time to figure out what value you had generated the first time and then you'll know that the plaintext has a consecutive run of one less than that many of that number; i.e. if you find out that you had originally gotten a 3, then you know that the two penultimate bytes of the plaintext are 3.
- From steps 3-7, you now know at least the last two bytes of plaintext and you can keep applying step 3-7 moving one byte towards the front each time until you have the entire plaintext and then just keep repeating those steps for each block
Silencing the Oracle
Things look bleak; the Oracle's pronouncing our doom and Cassandra's gone hoarse from warning us. Is there anything we can do? Fortunately, yes, there is! Newer versions of TLS (Transport Layer Security, the successor to SSL) also provide other modes for block ciphers like GCM (Galois/Counter Mode) that don't give an attacker that level of control over the IV that gets fed into the xor operation.
To make sure that your traffic isn't exposed, the most straightforward protection is to completely disable SSLv3 (and earlier, obviously; SSLv2 was broken almost 20 years ago!) and only use newer versions of TLS (v 1.2 if you can, because that's where AES-GCM was introduced as an available algorithm).
No comments:
Post a Comment