Don’t stop using IPsec just yet

Today, Laura Poitras and Jake Appelbaum spoke at the 31C3 conference and in collaboration with Der Spiegel published an interesting article on VPN exploitation by the NSA.

The “TL;DR” summary of what follows below is: If you configure your IPsec based VPN properly, you are not affected. Always use Perfect Forward Secrecy (“pfs=yes” wich is the default in libreswan IPsec) and avoid PreSharedKeys (authby=secret which is not the default in libreswan IPsec). If you really need to use PSK, use a strong shared secret that cannot be brute forced. The NSA has their own version of IKEcrack running on millions of dollars worth of CPU’s. Also, the NSA sneaks into your router to steal your PSK’s so they can decrypt all your traffic.

Update 1: from media-35513.pdf (“TURMOIL/APEX/APEX High Level Description Document”):

CES generally requires the packets from both sides of an IKE exchange and knowledge of the associated pre-shared key (PSK) in order to have a chance of recovering a key for the corresponding cipher (ESP). A major goal of APEX is to access two sides of key exchanges of traffic of interest”

Update 2: To stop the NSA HappyDance, I just commited code to libreswan IPsec to warn about weak PSKs with a message notifying the sysadmin that as of July 1st, 2015, libreswan will refuse those weak PSKs. To determine the strength of the PSK, I used the Shannon Entropy value. If the PSK has a Shannon Entropy of less than 3.5, a warning will sound and in 6 months it will refuse to use that PSK.

Update 3: Just to make it clear – To break IKE PSKs, you first need to break the initial DiffieHellman exchange, which is usually MODP1024 or MODP1536 in the bad cases (and MODP2048+ in the good cases). The exception to this is IKEv1 Aggressive Mode, where the MAC computed with the PSK as secret key is sent
in the clear. As a result one can mount an offline dictionary attack.

IPsec is not IKE

First, let’s clarify things a little bit. Although everyone is talking about breaking IPsec, what they really mean is breaking the Internet Key Exchange (IKE) that is used to negotiate and create symmetric keys that are used with IPsec for encryption and decryption. IPsec itself can use various ciphers and algorithms. It can use a separate encryption (AES, CAMELLIA, SERPENT, TWOFISH, 3DES, etc) + integrity (SHA2, SHA1, AES_XCBC, MD5, etc) mode or it can use an AEAD cipher that combines these two into one (AES GCM, AES CCM, CAMELLIA GCM, etc)

But Jake said to stop using IPsec

None of what was said by Jake or published by Der Spiegel shows any attack on these ciphers or algorithms. Even the MD5 use within IPsec (HMAC-MD5) is not as weak as people usually believe it to be by confusing it with non-hmac use of MD5. The comments Jake made and the material in the published slides show attacks against weak IKE configurations, router compromises, and possibly against some IPsec hardware accelerators. Below are my comments on those slides that are interesting in the context of IKE and IPsec.

slide7

The focus of the VPN decryption team lies with IPsec and SSH. I think that fits the deployment. These two protocols are the most widely used encryption protocols to transfer bulk data. While the hacker community might focus on OpenVPN, that’s not where the real deployment is. The mention of PPTP is curious. From what I see, PPTP is basically dead as a VPN technology, although it is still widely used in Russia as an encapsulation technique for ISPs to connect endusers. It’s probably just listed there for intercept because it is so trivial to do, you might as well gather it.

page15

This slide is harder to make sense of. I think that here the NSA is talking about IKE and IPsec combined, although they call it IPsec. Perhaps once they decrypt traffic, it goes into XKEYSCORE? They don’t seem to use XKEYSCORE as an input to crack the IKE PSKs? I’m amused by the name VULCAN-DEATHGRIP which might indicate that the HiFN (now Exar) Vulcan IPsec accelerator card might have a flaw in it that allows them to decrypt those IPsec VPNs. Also, I hope they properly mask their perl scripts for malicious decrypted traffic :P

page16

This is probably the most interesting slide. It basically states that IPsec VPNs are compromised by router compromises that steal the IKE PSKs. This might be a good time to upgrade your router firmware to the latest version, and change to new strong PSKs (or switch it over to RSA 2048+ instead)
The AppID refers to the XKEYSCORE signature name. I’m glad to see that IKEv2 is finally catching on, even within the NSA :)

page26

This slide is mentioning IPsec scripts, which might refer back to the previously mentioned genericIPSec_wrapper.pl script. This clearly indicates an IKE PSK brute force decrypting engine. If you are using weak PSKs, the NSA is decrypting your IKE traffic which will give them the keys to decrypt your IPsec traffic. Enabling PFS will help you a bit, but ultimtely you should really just switch to using RSA (or ECP) authentication and skip PSK’s alltogether. Note that the FIPS standard already disallows using PSK’s for IKE/IPsec. The NSA knows it is just too dangerous and weak.

Note to the NSA, please see RFC 4301 Section 1.1 which states:

The spelling “IPsec” is preferred and used throughout this and all related IPsec standards. All other capitalizations of IPsec (e.g., IPSEC, IPSec, ipsec) are deprecated.

On slide 31 it states: IPSec: 6640 (protocols) and 6648 (ports)

I don’t know what that refers to. Possible backhaul transport ports or part of an API port ?

On slide 33 it states: Need all the pieces (IKE and ESP for IPSec)

This really shows that the NSA is breaking IPsec by breaking IKE, which mostly seems to come down to weak PSKs and no PFS. And again, in case you still did not get it on slide 39:

page39

“TAO got the configuration files which provided us the PSKs to enable passive exploitation”

So your IPsec VPN is getting owned because your VPN gateway got owned! Run your VPN on a dedicated machine, preferably opensource, and lock remote access via ssh down using strong keys and IP filtering.

And the last interesting nugget is on page 40:

page40

“NSP had an implant which allows passive exploitation with just ESP”.

I read this to mean that the hardware or software of the system running IPsec was compromised, causing it to send valid protocol ESP packets, but creating those in such a way that these could be decrypted without knowing the ESP session keys (from IKE). Possibly by subverting the hardware number generator, or functions related to IV / ICV’s / nonces that would appear to be random but were not. Or less likely if the hardware/software was using padding bytes to reveal decryption keys.

Conclusion

If anything, I would say that IPsec and IKE have withstood the attacks of the NSA. The fact that the NSA needs implants and needs to use brutce force attacks against IKE PSKs shows that despite its kitchen sink nature, IKE and IPsec are doing quite well if you configure it properly and run it on a secure host. But misconfiguration of IKE is something we did a poor job of preventing. We can and should have more automatic IKE/IPsec deployed using strong default configurations.

And you should hear back from The Libreswan Team in the next two months regarding just that. Opportunistic Encryption: Round Two

Development of libreswan vs openswan

animated

UPDATE: Statistics updated to include 2017

Yesterday, openswan released version 2.6.40 to address CVE-2013-6466. You might be confused by its changelog (not the non-updated CHANGES) crediting me for the vast majority of code changes. . Basically all changes are pulled from the libreswan repository and are backports to openswan. The exception is their version of a patch for CVE-2013-6466. Libreswan’s fix is not a band-aid but an updated state machine. The backported libreswan fix is what is going into the updated openswan packages for RHEL5 and RHEL6 and are available at libreswan.org/security/openswan/CVE-2013-6466. RHEL7 will contain libreswan.

Basically, yesterday’s openswan 2.6.40 release brings it up to the initial libreswan-3.0 release of two years ago, plus the two CVE issues. Except that it crashes KLIPS and backported a libreswan commit that broke all non-XAUTH IPsec connections.

History and implementation status of Opportunistic Encryption for IPsec

(as sent to the cryptography mailing list)

FreeS/WAN

In light of the NSA achievements, a few people asked about the FreeS/WAN IPsec OE efforts and whatever happened to it.

The short answer is, we failed and got distracted. The long answer follows below. At the end I will talk about the current plans that have lingered in the last two years to revive this initiative. Below I will use the word “we” a lot. Its meaning changes based on the context as various communities touched, merged, intersected and drifted apart.

NOTE: On September 28, 2013 there is be a memorial service in Ann Arbour for Hugh Daniel, manager of the old IPsec FreeS/WAN Project. Various crypto people will attend, including a bunch of us from freeswan. Hugh would have loved nothing better than his memorial service being used as a focal point to talk about “new OE”, so that’s what we will do on Saturday and Sunday. If you are interested in attending, feel free to contact me.

OE in a nutshell

For those not familiar with IPsec OE as per FreeS/WAN implementation. When activated, a host would install a blocking policy for 0.0.0.0/0. Every packet to an IP address would trigger the kernel to hold the packet and signal the IKE daemon to go find an IPsec policy for that destination. If found, the tunnel would be build, and an IPsec tunnel to the remote IP would be established, and packets would flow. If no policy was found, a “pass” hole was poked so packets would go out unencrypted. Public keys for IP addresses were looked up in the reverse DNS by the IKE daemon based on the destination address. To help with roaming clients (roadwarriors), initiators could store their public key in their FQDN, and convey their FQDN as ID when performing IKE so the remote peer could look up their public key in the forward DNS. This came at the price of two dynamic clients not being able to do OE to each other. (turns out they couldn’t anyway, because of NAT)

What were the reasons for failing to encrypt the internet with OE IPsec (in no particular order):

1) Fragmentation of IPsec kernel stacks

In part due to the early history of FreeS/WAN combined with the export restrictions at the time. Instead of spending more time on IKE and key management for large scale enduser IPsec, we ended up wasting a lot of time fixing the FreeS/WAN KLIPS IPsec stack module for each Linux release. Another IPsec stack, which we dubbed XFRM/NETKEY appeared around 2.6.9 and was backported to 2.4.x. It was terribly incomplete and severely broken. With KLIPS not being within the kernel tree, it was never taken into account. XFRM/NETKEY remained totally unsuitable for OE for a decade. XFRM/NETKEY now has almost all functionality needed – I found out today it shoudl finally have first+last packet caching for dynamic tunnels, which are essential for OE. Since the application’s first packet triggered the IKE mechanism, the application would start retransmitting before IKE was completed. Even when the tunnel finally came up, the application was usually still waiting on that TCP retransmit. David McCullough and I still spend a lot of time fixing up KLIPS to work with the current Linux kernel. Look at ipsec_kversion.h just to see what a nightmare it has been to support Linux 2.0 to 2.6 (libreswan removed support for anything lower then recent 2.4.x kernels)

Linux IPsec Crypto hardware acceleration in practise is only possible with KLIPS + OCF, as the mainstraim async crypto is lacking in hardware driver support. If you want to build OE into people’s router/modem/setup box, this is important, though admittingly less so as time has moved on and even embedded hardware and phones are multicore or have special crypto CPU instructions.

An effort to make the kernel the sole provider of crypto algorithms that everyone could use also failed, and the idea was abandoned when CPU crypto instructions appeared directly accessable from userland.

2) US citizens could not contribute code or patches to FreeS/WAN

This was John Gilmore’s policy to ensure the software remained free for US citizens. If no US citizen touched the code, it would be immune to any presidential National Security Letter. I believe this was actually the main reason for KLIPS not going in mainstream kernel, although personal egos of kernel people seemed to have played a role here as well. Freeswan people really tried had in 2000/2001 to hook KLIPS into the kernel just the way the kernel people wanted. (Ironically, the XFRM/NETKEY hook so bad, it even confuses tcpdump and with it every sysadmin trying to see whether or not their traffic is encrypted) I still don’t fully understand why it was never merged, as the code was GPL, and it should have just been merged in, even against John’s wishes. Someone would have stepped in as maintainer – after all the initial brunt of the work had been done and we had a functional IPsec stack.

In the summer of 2003, I talked to John and together we agreed it was time to fork. Openswan was born to clearly indicate US coders could contribute. However, at that point the (then crappy) FRM/NETKEY IPsec stack was there to prevent OE from working due to the missing first+last packet caching. The FreeS/WAN Project ended and Openswan continued. At first in good pace, but that later slowed down and OE was no longer its focal point. (Due to legal reasons, I cannot go into details regarding the openswan history)

3) Not using DNS without DNSSEC

There were various issues that caused DNSSEC to get massively delayed. We needed DNSSEC to secure our DNS based distributed public key platform. Although it would have worked fine to use DNS against passive attackers (NSA trawling), we believed it was principly wrong to trust cryptographic material that was untrusted and vulnerable against active attacks. So while the developers encouraged people to put keys in DNS even without security, no one else picked it up. It sucks to need to say ‘we told you so’. But we should have really not waited on DNSSEC.

4) Dealing with the DNS working groups at IETF

The DNS community is one of the most pedantic group of people I know. They are very smart, often right, and had been known to be extremely defense of their DNS turf. (Note that things have improved considerably and if you think this is still an issue, I’m happy to try and help)

IETF was divided about the convergence of the “security of the DNS” and the “DNS as PKI” despite that this had always been a goal of DNSSEC for a large group of people within the IETF. The FreeS/WAN people were driving DNSSEC not so much for DNS as for the key distribution. After all, you can detect DNS forging if you know your public keys.

When we had the KEY/SIG records ready to go, it was decreed that it could only be used for the DNS itself. Applications could not use this KEY record. To make that distinction more clear, on the next change in the draft protocol, KEY was obsoleted and DNSKEY introduced. So IPsec keys were relegated back to TXT, since at the time we had no Generic Record format (RFC 3597) support, so waiting for any new RRtype to get any deployment to become usuable would take years. Almost everyone was on bind4 and never upgraded left us with no other choice but the TXT. Even though we wrote the OE and IPSECKEY RFCs, OE’s only deployments were done using TXT records.

5) DNSSEC was delayed by a decade

DNSSEC deployment was slowly gaining traction, but I think we really needed the Kaminsky bug to get that extra push for DNSSEC outside the geeks of the IETF. The US government mandate for DNSSEC in .GOV helped as well. But by this time, OE was mostly forgotten.

djb repeatedly tried to peddle his own warez. While not at all realistic, it always gained a lot of hype and media attention and probably did cause delays of DNSSEC deployment.

Kaminsky himself was shooting down DNSSEC too. I personally heckled him at various Black Hat’s and ICANN conferences until we finally sat down for a couple of hours to talk about DNSSEC’s history and design goals. I’ll claim my 15 minutes of fame for having converted him. It helped having Kaminsky say that although he didn’t like the complexity, he couldn’t see anything better. DNSSEC was needed for everyone.

DNSSEC was gaining traction. Then we ran into a bunch of DNSSEC deployment issues. We had the delays due to NSEC vs NSEC3 with OPTIN, and then on top of that in 2008 when the first big ISP in Sweden turned on DNSSEC in their resolvers all that traction was blown away.

Most consumer routers ran DNS proxies that implemented DNS as “known bitstreams” instead of implemeting the actual DNS protocol. The DNSSEC OK bit caused thousands of routers to drop DNSSEC packets as “invalid DNS”. The only realistic solution: Turn it off and wait two years for those routers to get obsoleted by faster wifi standards and talk to those vendors so they would not repeat their mistake with their next generation of routers.

We now have the IPSECKEY record format (though RFC 4025 is not useful, see below) and RFC 3597 for the generic DNS record deployed on all DNS servers. And we’re on our way to have DNSSEC on every end node (see also draft-wouters-edns-tcp-chain-query-00 I just submitted to the IETF)

We have a mostly clean working UDP/TCP port 53 transport for DNSSEC on most networks (in part thanks to Google DNS). Although our hotspot handling is still a little rough, with dnssec-trigger the only tool to hack configurable DNSSEC support into the OS for our coffee shop visits when we need to rely on forged DNS.

6) When you’re NAT on the net, you’re NOT on the net.

Opportunistic Encryption relied on a clear peer to peer connection. But we managed to degrade the internet into servers and clients. NAT was the biggest problem, and with CGN around the corner, it’s not something that is going away despite IPv6 offering enough IPs for everyone. In fact, for our “new OE”, this is the biggest hurdle to overcome. When Alice cannot talk to Bob because she cannot reach him due to a (carrier grade) NAT, we are stuck wildly poking holes and hoping packets flow.

7) The reverse DNS tree is dead Jim

OE depended on the reverse tree as a security mechanism that someone who was claiming a public key for a specific IP range was actually the legitimate owner of that IP space. It was the security method for RFC-4025.

But unless you are running in a datacenter, you do not have access to the reverse DNS. It is useless as key distribtion method. On top of that, large IPv6 deployments don’t even care any more to run any authoritative DNS for their reverse.

8) BTNS

The IETF tried to revive this OE with the Better Then Nothing Security (“BTNS”) working group. Contrary to the name, they also fell into the “perfect is the enemy of good” trap and most discussion seemed to go into “channel binding” to upgrade anonymous IPsec to some kind of authenticated IPsec – at least by the time I became aware of them. In other words, the most important problem of key distribution was left outside the scope and no one actually seemed to have implemented anything. Though I have to admit, I’m behind on reading the VPN auto-discovery drafts. It is just
very discouraging to still be reading problem statement drafts. More over, I don’t think we should setup IPsec tunnels based on packets hitting the kernel. We have better ways now that we can leverage DNSSEC.

9) We were all complacent

The only interest for IPsec was for corporate VPNs. During the above listed problem periods, OE people gave up. Some walked away from IETF. While everyone gained an always-on portable IP device,
their crypto capabilities were practically non-existent. My current iphone 5 can connect to a corporate VPN, but trying to make it _just_ send out encrypted packets is impossible. Some trickery can be used to cause almost any packet to setup the VPN, but while that’s going on it is still leaking like a sieve. VPN is seen by phone vendors as a method to gain some enterprise users, not as the tool to protect the consumer. The Apple VPN client is a 10+ year old patched version of racoon. The only vendor that took VPNs seriously was RIM and we punished them by not buying their products, because we had other priorities like FourSquare, Facebook and Twitter.

We can only hope that those PRISM players are now put under economic pressure by frightened consumers to fix this. But as long as VPNs and DNSSEC is slow and error-prone, it is better for them not to go there.

The New Opportunistic Encryption

I’ve been brainstorming with various people on how to put IPsec OE back on the table. I’ve discussed this with a bunch of people around me, including the late Hugh Daniel, John Gilmore and Hugh Redelmeier of freeswan.

The packet capturing 0.0.0.0/0 policy is not a good method because we cannot make any decision on where to find a public key for an IP address. The reverse is unusable, and IP addresses change often. We used it because we had nothing better. But now we do. Since every (secure) platform now runs DNSSEC on the end node, we can use this as our decision making point. Imagine my phone running a DNSSEC resolver (say unbound) and an IKE daemon (say libreswan). The DNS server has access to the set of DNS name and matching IP address. It can lookup the key in the forward DNS zone, and hand over the public key, dns name and IP address to the IKE daemon!

1) User tells browser to go to www.cypherpunks.ca

2) browser does a lookup for the A/AAAA record of www.cypherpunks.ca

3) DNSSEC resolver performs the lookup/validation for the A/AAAA record of www.cypherpunks.ca and additionally looks up the IPSECKEY record of www.cypherpunks.ca.

4a) The resolver will wait with returning the A/AAAA record to the browser until it knows if the IPSECKEY record exists or not. If not, it releases the A/AAA answer to the application. Packets flow in the clear.

4b) The resolver finds an IPSECKEY record. It sends the pubic key, the FQDN and the IP address(es) to the IKE daemon and waits for a response. Meanwhile it does _not_ release the A/AAAA record to the application.

5) The IKE daemon sets up the IPsec tunnel. We haven’t reached agreement yet over how this should be done. There are two choices:

a) The client uses an “@anonymous” ID for itself along with sending its public key inline with IKE. The client is responsible for ensuring there is no MITM attack, as it knows the server’s public key (from DNSSEC). The responding server will just use any key it received inline if it was received for the “@anonymous” ID.

b) The initiator (aka client) uses its own FQDN-based ID. It has preconfigured its DNS so that an IPSECKEY record exists for its FQDN (protected by DNSSEC). The key is not send inline with IKE. Instead, when the responder (aka server) sees the non-anonymous ID, it will perform a DNSSEC secured lookup to obtain the IPSECKEY out of band. Both parties confirm there is no MITM.

The advantage of a) is that it leaks less user information and makes tracking users harder. The client can regularly generate another anonymous keypair. The disadvantage of a) is that it turns peers into clients and servers. And two clients cannot initiate OE to each other.

6) The tunnel is established and the IKE daemon notifies the local DNSSEC server that had instructed it to setup the IPsec tunnel.

7) The resolver releases the IP address to the application.

8) The applications starts sending packets and the IPsec policy encrypts them al.

I’m personally in favour of the @anonymous solution. But there is no reason why support for both could not be implemented.

What are some of the obstacles and work to do:

1) writing the unbound plugin

2) writing the support for @anonymous for the server-side. This includes raw keys for IKEv2 (draft-ietf-ipsecme-oob-pubkey)

3) With NAT, the client suggests an inner-IP. This could be abused or clash, We need to ‘contain’ each connection, possibly using generated ipv6 addresses 4) We cannot use the “gateway” field of RFC-4025, or people could trick a server into giving a client all communication to a certain IP address that does not belong to them

5) anonymous connections should generate throw-away keys to remain anonymous

6) implement draft-wouters-edns-tcp-chain or else latency/RTTs will prevent real-life deployment of DNSSEC validated IPSECKEYs on mobile devices.

7) This allows no upgrading from anonymous to mutually authenticated, but IKE policies can be added to the server/client that would match on different IDs (eg X.509) that work independantly of OE without introducing complicated channel binding promotion code. Other IKEv2 extensions could possible be applied to facilitate promotions.

I’m sure more implementation issues will show up once we get this going, but there are no real fundamental issues why we cannot deploy this in a couple of months of time. My plan is to get libreswan to support this version of OE. Additionally, once we use draft-wouters-edns-tcp-chain, it becomes cheap to do these lookups through the tor network. If the tor exit nodes then also feed each other with DNSSEC cache material, it should make tracing individual clients even harder.

(anyone willing to assist, especially with coding, do contact me)