How secure is Internet security? A team of researchers recently set out to crack the security of about 6.6 million sites around the Internet that use a supposedly “unbreakable” cryptosystem. The good news is that they succeeded only .2% of the time. The bad — and rather shocking — news — is that .2% of 6.6 million is almost 13,000. That’s 13,000 sites with effectively no security. The interesting part is what went wrong. It’s a gaping security hole that I’d never stopped to consider, but is obvious once it’s pointed out.
First, a quick primer on how public-key security works. Suppose you want people to be able to send you encrypted messages that only you can read. First you choose two large prime numbers — say 13 and 19. (In practice, you pick prime numbers that are several hundred digits long, but we’ll do the baby case to see what’s going on.) You multiply them together to get 247. Then you announce to the world that 247 is your “public key”.
Now suppose I want to send you a message including my credit card number and a catalogue order. The first thing my browser does is take my entire message and encode it as a single number in some simple way (say by stringing together my credit card number, the catalogue number of the part I’m ordering, my house number, a numerical representation of my street name, etc.) That number will be many many digits long, but for purposes of our simple example, let’s pretend it turns out to be 53. Then my browser does this: It cubes the number 53, getting 148877, divides that result by your public key (247), getting a remainder of 183, and sends that remainder to your browser.
(Yes, this is slightly oversimplified. No, the simplifications don’t matter.)
Now when your browser receives the coded message “183″, it somehow has to reverse the process to figure out that the true message is “53″. It turns out there’s a very quick and easy way to reverse that process — provided you know that 247 is 13 times 19. You do know that, because that’s how you came up with 247 as your public key in the first place. So decoding my message is easy for you. But when the bad guy intercepts the message, hoping to harvest my credit card number, he’s missing the key piece of knowledge that 247 = 13 x 19, so he’s stuck.
(Of course if your public key were really 247, he could succeed by brute force, just trying every possible original message to see which one encrypts to 183. But if your key and my message are hundreds of digits long, this is unthinkably impractical.)
That’s all well and good, provided the bad guy can’t figure out how to factor your public key. We have no proof that he can’t, but lots of good reasons to think that it’s incredibly unlikely. That’s what much internet security is based on.
So in this context “cracking Internet security” means gathering the public keys from a whole lot of sites and trying to factor them. Here’s the researchers’ insight: If I give you a single number and ask you to factor it, that’s really really hard. But if I give you two numbers, tell you that they have a factor in common, and ask you to find that factor, that’s really really easy, using a method invented thousands of years ago and recorded by Euclid.
So. My public key is 247, which I got by multiplying 13 times 19. You have essentially no way of factoring that. (Of course, you can actually factor it easily because it’s such a small number, but we’re pretending it’s huge.) But another site, using similar security, has a public key of 437, which they got by multiplying 19 times 23. Suppose you come along and say: “Hmmm. I know that each of these guys used similar software to generate large primes. Maybe their software gave similar results, so maybe they generated a prime in common. Let’s try using Euclid’s method to find that prime. Bingo! The prime is 19!”
Now you can easily factor my 247 and the other guy’s 437, just by dividing them each by 19.
The ingenuity is recognizing that two hard problems can be a lot easier than one. Factoring 247: near impossible. Factoring 437: near impossible. Factoring them both, given the correct guess that they’ve got a factor in common: Easy.
As long as people keep using the same sort of software in the same sort of way to generate large primes, this is going to keep happening.
Someday, as computing power grows, all of today’s public keys will be factorable directly. It’s been assumed all along that this is no problem; we’ll just go to longer public keys. Whatever the current state of computing power, we can always choose public keys that are beyond its reach. This new research is a warning that if people keep picking keys the way they do now, going to longer ones won’t help very much.
Riffing on Stalin’s “One death is a tragedy; a million deaths is a statistic”, the researchers write that
Factoring one 1024-bit RSA modulus [that's a fancy name for a multi-hundred digit public key] would be historic; factoring 12720 such moduli is a statistic. The former is still out of reach for the academic community … The latter comes as an unwelcome warning that underscores the difficulty of key generation in the real world.
(Hat tip to Dick Lipton, who has some additional very interesting things to say about the expected likelihood of this problem.)