History and implementation status of Opportunistic Encryption for IPsec

(as sent to the cryptography mailing list)

FreeS/WAN

In light of the NSA achievements, a few people asked about the FreeS/WAN IPsec OE efforts and whatever happened to it.

The short answer is, we failed and got distracted. The long answer follows below. At the end I will talk about the current plans that have lingered in the last two years to revive this initiative. Below I will use the word “we” a lot. Its meaning changes based on the context as various communities touched, merged, intersected and drifted apart.

NOTE: On September 28, 2013 there is be a memorial service in Ann Arbour for Hugh Daniel, manager of the old IPsec FreeS/WAN Project. Various crypto people will attend, including a bunch of us from freeswan. Hugh would have loved nothing better than his memorial service being used as a focal point to talk about “new OE”, so that’s what we will do on Saturday and Sunday. If you are interested in attending, feel free to contact me.

OE in a nutshell

For those not familiar with IPsec OE as per FreeS/WAN implementation. When activated, a host would install a blocking policy for 0.0.0.0/0. Every packet to an IP address would trigger the kernel to hold the packet and signal the IKE daemon to go find an IPsec policy for that destination. If found, the tunnel would be build, and an IPsec tunnel to the remote IP would be established, and packets would flow. If no policy was found, a “pass” hole was poked so packets would go out unencrypted. Public keys for IP addresses were looked up in the reverse DNS by the IKE daemon based on the destination address. To help with roaming clients (roadwarriors), initiators could store their public key in their FQDN, and convey their FQDN as ID when performing IKE so the remote peer could look up their public key in the forward DNS. This came at the price of two dynamic clients not being able to do OE to each other. (turns out they couldn’t anyway, because of NAT)

What were the reasons for failing to encrypt the internet with OE IPsec (in no particular order):

1) Fragmentation of IPsec kernel stacks

In part due to the early history of FreeS/WAN combined with the export restrictions at the time. Instead of spending more time on IKE and key management for large scale enduser IPsec, we ended up wasting a lot of time fixing the FreeS/WAN KLIPS IPsec stack module for each Linux release. Another IPsec stack, which we dubbed XFRM/NETKEY appeared around 2.6.9 and was backported to 2.4.x. It was terribly incomplete and severely broken. With KLIPS not being within the kernel tree, it was never taken into account. XFRM/NETKEY remained totally unsuitable for OE for a decade. XFRM/NETKEY now has almost all functionality needed – I found out today it shoudl finally have first+last packet caching for dynamic tunnels, which are essential for OE. Since the application’s first packet triggered the IKE mechanism, the application would start retransmitting before IKE was completed. Even when the tunnel finally came up, the application was usually still waiting on that TCP retransmit. David McCullough and I still spend a lot of time fixing up KLIPS to work with the current Linux kernel. Look at ipsec_kversion.h just to see what a nightmare it has been to support Linux 2.0 to 2.6 (libreswan removed support for anything lower then recent 2.4.x kernels)

Linux IPsec Crypto hardware acceleration in practise is only possible with KLIPS + OCF, as the mainstraim async crypto is lacking in hardware driver support. If you want to build OE into people’s router/modem/setup box, this is important, though admittingly less so as time has moved on and even embedded hardware and phones are multicore or have special crypto CPU instructions.

An effort to make the kernel the sole provider of crypto algorithms that everyone could use also failed, and the idea was abandoned when CPU crypto instructions appeared directly accessable from userland.

2) US citizens could not contribute code or patches to FreeS/WAN

This was John Gilmore’s policy to ensure the software remained free for US citizens. If no US citizen touched the code, it would be immune to any presidential National Security Letter. I believe this was actually the main reason for KLIPS not going in mainstream kernel, although personal egos of kernel people seemed to have played a role here as well. Freeswan people really tried had in 2000/2001 to hook KLIPS into the kernel just the way the kernel people wanted. (Ironically, the XFRM/NETKEY hook so bad, it even confuses tcpdump and with it every sysadmin trying to see whether or not their traffic is encrypted) I still don’t fully understand why it was never merged, as the code was GPL, and it should have just been merged in, even against John’s wishes. Someone would have stepped in as maintainer – after all the initial brunt of the work had been done and we had a functional IPsec stack.

In the summer of 2003, I talked to John and together we agreed it was time to fork. Openswan was born to clearly indicate US coders could contribute. However, at that point the (then crappy) FRM/NETKEY IPsec stack was there to prevent OE from working due to the missing first+last packet caching. The FreeS/WAN Project ended and Openswan continued. At first in good pace, but that later slowed down and OE was no longer its focal point. (Due to legal reasons, I cannot go into details regarding the openswan history)

3) Not using DNS without DNSSEC

There were various issues that caused DNSSEC to get massively delayed. We needed DNSSEC to secure our DNS based distributed public key platform. Although it would have worked fine to use DNS against passive attackers (NSA trawling), we believed it was principly wrong to trust cryptographic material that was untrusted and vulnerable against active attacks. So while the developers encouraged people to put keys in DNS even without security, no one else picked it up. It sucks to need to say ‘we told you so’. But we should have really not waited on DNSSEC.

4) Dealing with the DNS working groups at IETF

The DNS community is one of the most pedantic group of people I know. They are very smart, often right, and had been known to be extremely defense of their DNS turf. (Note that things have improved considerably and if you think this is still an issue, I’m happy to try and help)

IETF was divided about the convergence of the “security of the DNS” and the “DNS as PKI” despite that this had always been a goal of DNSSEC for a large group of people within the IETF. The FreeS/WAN people were driving DNSSEC not so much for DNS as for the key distribution. After all, you can detect DNS forging if you know your public keys.

When we had the KEY/SIG records ready to go, it was decreed that it could only be used for the DNS itself. Applications could not use this KEY record. To make that distinction more clear, on the next change in the draft protocol, KEY was obsoleted and DNSKEY introduced. So IPsec keys were relegated back to TXT, since at the time we had no Generic Record format (RFC 3597) support, so waiting for any new RRtype to get any deployment to become usuable would take years. Almost everyone was on bind4 and never upgraded left us with no other choice but the TXT. Even though we wrote the OE and IPSECKEY RFCs, OE’s only deployments were done using TXT records.

5) DNSSEC was delayed by a decade

DNSSEC deployment was slowly gaining traction, but I think we really needed the Kaminsky bug to get that extra push for DNSSEC outside the geeks of the IETF. The US government mandate for DNSSEC in .GOV helped as well. But by this time, OE was mostly forgotten.

djb repeatedly tried to peddle his own warez. While not at all realistic, it always gained a lot of hype and media attention and probably did cause delays of DNSSEC deployment.

Kaminsky himself was shooting down DNSSEC too. I personally heckled him at various Black Hat’s and ICANN conferences until we finally sat down for a couple of hours to talk about DNSSEC’s history and design goals. I’ll claim my 15 minutes of fame for having converted him. It helped having Kaminsky say that although he didn’t like the complexity, he couldn’t see anything better. DNSSEC was needed for everyone.

DNSSEC was gaining traction. Then we ran into a bunch of DNSSEC deployment issues. We had the delays due to NSEC vs NSEC3 with OPTIN, and then on top of that in 2008 when the first big ISP in Sweden turned on DNSSEC in their resolvers all that traction was blown away.

Most consumer routers ran DNS proxies that implemented DNS as “known bitstreams” instead of implemeting the actual DNS protocol. The DNSSEC OK bit caused thousands of routers to drop DNSSEC packets as “invalid DNS”. The only realistic solution: Turn it off and wait two years for those routers to get obsoleted by faster wifi standards and talk to those vendors so they would not repeat their mistake with their next generation of routers.

We now have the IPSECKEY record format (though RFC 4025 is not useful, see below) and RFC 3597 for the generic DNS record deployed on all DNS servers. And we’re on our way to have DNSSEC on every end node (see also draft-wouters-edns-tcp-chain-query-00 I just submitted to the IETF)

We have a mostly clean working UDP/TCP port 53 transport for DNSSEC on most networks (in part thanks to Google DNS). Although our hotspot handling is still a little rough, with dnssec-trigger the only tool to hack configurable DNSSEC support into the OS for our coffee shop visits when we need to rely on forged DNS.

6) When you’re NAT on the net, you’re NOT on the net.

Opportunistic Encryption relied on a clear peer to peer connection. But we managed to degrade the internet into servers and clients. NAT was the biggest problem, and with CGN around the corner, it’s not something that is going away despite IPv6 offering enough IPs for everyone. In fact, for our “new OE”, this is the biggest hurdle to overcome. When Alice cannot talk to Bob because she cannot reach him due to a (carrier grade) NAT, we are stuck wildly poking holes and hoping packets flow.

7) The reverse DNS tree is dead Jim

OE depended on the reverse tree as a security mechanism that someone who was claiming a public key for a specific IP range was actually the legitimate owner of that IP space. It was the security method for RFC-4025.

But unless you are running in a datacenter, you do not have access to the reverse DNS. It is useless as key distribtion method. On top of that, large IPv6 deployments don’t even care any more to run any authoritative DNS for their reverse.

8) BTNS

The IETF tried to revive this OE with the Better Then Nothing Security (“BTNS”) working group. Contrary to the name, they also fell into the “perfect is the enemy of good” trap and most discussion seemed to go into “channel binding” to upgrade anonymous IPsec to some kind of authenticated IPsec – at least by the time I became aware of them. In other words, the most important problem of key distribution was left outside the scope and no one actually seemed to have implemented anything. Though I have to admit, I’m behind on reading the VPN auto-discovery drafts. It is just
very discouraging to still be reading problem statement drafts. More over, I don’t think we should setup IPsec tunnels based on packets hitting the kernel. We have better ways now that we can leverage DNSSEC.

9) We were all complacent

The only interest for IPsec was for corporate VPNs. During the above listed problem periods, OE people gave up. Some walked away from IETF. While everyone gained an always-on portable IP device,
their crypto capabilities were practically non-existent. My current iphone 5 can connect to a corporate VPN, but trying to make it _just_ send out encrypted packets is impossible. Some trickery can be used to cause almost any packet to setup the VPN, but while that’s going on it is still leaking like a sieve. VPN is seen by phone vendors as a method to gain some enterprise users, not as the tool to protect the consumer. The Apple VPN client is a 10+ year old patched version of racoon. The only vendor that took VPNs seriously was RIM and we punished them by not buying their products, because we had other priorities like FourSquare, Facebook and Twitter.

We can only hope that those PRISM players are now put under economic pressure by frightened consumers to fix this. But as long as VPNs and DNSSEC is slow and error-prone, it is better for them not to go there.

The New Opportunistic Encryption

I’ve been brainstorming with various people on how to put IPsec OE back on the table. I’ve discussed this with a bunch of people around me, including the late Hugh Daniel, John Gilmore and Hugh Redelmeier of freeswan.

The packet capturing 0.0.0.0/0 policy is not a good method because we cannot make any decision on where to find a public key for an IP address. The reverse is unusable, and IP addresses change often. We used it because we had nothing better. But now we do. Since every (secure) platform now runs DNSSEC on the end node, we can use this as our decision making point. Imagine my phone running a DNSSEC resolver (say unbound) and an IKE daemon (say libreswan). The DNS server has access to the set of DNS name and matching IP address. It can lookup the key in the forward DNS zone, and hand over the public key, dns name and IP address to the IKE daemon!

1) User tells browser to go to www.cypherpunks.ca

2) browser does a lookup for the A/AAAA record of www.cypherpunks.ca

3) DNSSEC resolver performs the lookup/validation for the A/AAAA record of www.cypherpunks.ca and additionally looks up the IPSECKEY record of www.cypherpunks.ca.

4a) The resolver will wait with returning the A/AAAA record to the browser until it knows if the IPSECKEY record exists or not. If not, it releases the A/AAA answer to the application. Packets flow in the clear.

4b) The resolver finds an IPSECKEY record. It sends the pubic key, the FQDN and the IP address(es) to the IKE daemon and waits for a response. Meanwhile it does _not_ release the A/AAAA record to the application.

5) The IKE daemon sets up the IPsec tunnel. We haven’t reached agreement yet over how this should be done. There are two choices:

a) The client uses an “@anonymous” ID for itself along with sending its public key inline with IKE. The client is responsible for ensuring there is no MITM attack, as it knows the server’s public key (from DNSSEC). The responding server will just use any key it received inline if it was received for the “@anonymous” ID.

b) The initiator (aka client) uses its own FQDN-based ID. It has preconfigured its DNS so that an IPSECKEY record exists for its FQDN (protected by DNSSEC). The key is not send inline with IKE. Instead, when the responder (aka server) sees the non-anonymous ID, it will perform a DNSSEC secured lookup to obtain the IPSECKEY out of band. Both parties confirm there is no MITM.

The advantage of a) is that it leaks less user information and makes tracking users harder. The client can regularly generate another anonymous keypair. The disadvantage of a) is that it turns peers into clients and servers. And two clients cannot initiate OE to each other.

6) The tunnel is established and the IKE daemon notifies the local DNSSEC server that had instructed it to setup the IPsec tunnel.

7) The resolver releases the IP address to the application.

8) The applications starts sending packets and the IPsec policy encrypts them al.

I’m personally in favour of the @anonymous solution. But there is no reason why support for both could not be implemented.

What are some of the obstacles and work to do:

1) writing the unbound plugin

2) writing the support for @anonymous for the server-side. This includes raw keys for IKEv2 (draft-ietf-ipsecme-oob-pubkey)

3) With NAT, the client suggests an inner-IP. This could be abused or clash, We need to ‘contain’ each connection, possibly using generated ipv6 addresses 4) We cannot use the “gateway” field of RFC-4025, or people could trick a server into giving a client all communication to a certain IP address that does not belong to them

5) anonymous connections should generate throw-away keys to remain anonymous

6) implement draft-wouters-edns-tcp-chain or else latency/RTTs will prevent real-life deployment of DNSSEC validated IPSECKEYs on mobile devices.

7) This allows no upgrading from anonymous to mutually authenticated, but IKE policies can be added to the server/client that would match on different IDs (eg X.509) that work independantly of OE without introducing complicated channel binding promotion code. Other IKEv2 extensions could possible be applied to facilitate promotions.

I’m sure more implementation issues will show up once we get this going, but there are no real fundamental issues why we cannot deploy this in a couple of months of time. My plan is to get libreswan to support this version of OE. Additionally, once we use draft-wouters-edns-tcp-chain, it becomes cheap to do these lookups through the tor network. If the tor exit nodes then also feed each other with DNSSEC cache material, it should make tracing individual clients even harder.

(anyone willing to assist, especially with coding, do contact me)

Skytech and Dawson College versus Ahmed the kid

In case you had not heard it, the information of about a quarter million Canadian students was compromised in a security incident last week. These kind of breaches are so common place now, I hardly take notice of them any more. But an article about this case appeared in the National Post, “Youth expelled from Montreal college after finding ‘sloppy coding’ that compromised security of 250,000 students personal data

hamed02_small
It turns out 20 year old Ahmed Al-Khabaz found a flaw that was trivial. He reported it to the vendor. The data was not leaked online on pastbin – at least, not by Mr.Al-Khabaz. But others before Al-Khabaz could have come in, copied the data, and left.
According to the article,

After an initial meeting with Director of Information Services and Technology François Paradis on Oct. 24, where Mr. Paradis congratulated Mr. Al-Khabaz and colleague Ovidiu Mija for their work and promised that he and Skytech, the makers of Omnivox, would fix the problem immediately, things started to go downhill.

Two days later, Mr. Al-Khabaz decided to run a software program called Acunetix, designed to test for vulnerabilities in websites, to ensure that the issues he and Mija had identified had been corrected. A few minutes later, the phone rang in the home he shares with his parents.

It was Edouard Taz, the President of Skytech, the company who wrote the Omnivox software. Taz threatened to report Al-Khabaz to the RCMP promising six to twelve months jail time unless Al-Khabaz agreed to immediately meet up and sign an NDA. It goes down hill from here. Dawson College expelled the student for a “serious professional conduct issue”. Fourteen out of fifteen professors in the computer science department voted in favour to expel Al-Khabaz. And the result is that a smart young kid, who did the right thing, and made a little mistake out of curiosity, is facing serious long term effects of not being able to get a college degree. And with him having nothing left to lose, the NDA they forced him to sign became useless, and Skytech lost as well because we all now know how crappy the security of Omnivox is. Edouard Taz shot himself in the foot. It’s just too bad he had to take down Al-Khabaz with him.

And the vulnerability? We don’t know we can assume it was something as trivial as manually editing the URL in the address bar, or some simple SQL injection attack. Something that in any other industry would be called “gross negligence”, but for some reason IT companies get away with delivering cars with faulty breaks and calling the slippery road an “Advanced Persistent Threat”. And companies like Skytech, with the assistance of an ill-informed Computer Science department at Dawson College, take it out on a young kid who honestly tried to do the right thing. Security through bullying. The response of both Skytech and Dawson College was over the top, and counter-productive to everyone’s interest.

Compare this to the response of my old university, the Radboud Universiteit Nijmegen. A 19 year old kid spent all his money on a 1200 baud modem (half duplex) and uses it to login to a university LAT-TCP server to connect to something called “the internet”. He uses “ftp” and “sz” to download files, “nn” and “telnet” to read news and chat with people all over the world. He did much worse then Al-Khabaz, although he too was careful not to do any accidental damage. And like Al-Khabaz performing his scan of Skytech, this person knew he was a trespasser. An assistant professor known as “Sparky”, notices that a student who had not been active at the university, started logging in very regularly. He becomes suspicious that the account might have been compromised. So he sends a Unix Talk request to her (see man talk). The two briefly chat online. Sparky confirms the account is compromised. But instead of screaming and threatening like Mr. Taza, Sparky asks the hacker how he got into the account. The hacker tells him he used ypcat to get /etc/shadow, then ran a cracker. Sparky then tells the hacker he has two days to get his stuff from the account and leave, and not to come back using any other accounts. Three years later, our young hacker has actually enlisted with the university to study CS, and in his second year he becomes friends with Sparky – and even gets an account on the university’s historic PDP-11. He confesses the whole affair and both have a good laugh. The student later on starts an ISP, becomes a security consultant who speaks at Black Hat, and is part of a group that writes VPN software in use all over the world. And the young-hacker-turned-sysadmin also has to deal with hackers himself. One time he logs into an irc channels to tell a group of file sharing hackers that they overstayed their welcome and should not have filled up the disks, and that it was time to go – and probably stop stealing movies and ISP resources. The circle of not threatening kids who make mistakes continued.

I can’t imagine how my life would have turned out if Sparky had hunted me down and had threatened me with jail time or that he would have kicked me out of university when I was 19. I learned a lot from Sparky, and I’ve passed along the same treatment to others. I wish Mr. Al-Khabaz all the best. He deserves much better then the treatment he received from Skytech and Dawson College. Don’t we all make those mistakes when we are that young? The real mistake here, is that Omnivox is a product with severe security flaws, that could have resulted in identity theft of 250,000 Canadian Students. Perhaps Mr. Taz should focus his bullying at his own security department?