IPSec on RHEL6/CentOS6 (Don't do it)
You want to use a RHEL 6/CentOS 6 server as an IPSec/VPN gateway?
Here's the tl;dr: don't do it. Buy some Juniper SRX210s on eBay for $200 each instead.
The Linux kernel team massively broke IPSec performance somewhere between kernels 2.6.18 and 2.6.35. The good news is that it's supposedly fixed in 2.6.35. I haven't tested it, but reports are that it works OK. So if you must stay with RHEL or CentOS, compile your own kernel (I'd recommend doing that anyway).
So back to the long story.
Like us, you probably run an "ops box" or two in your PoPs to handle things like DNS, LDAP, Puppet, distributing code, etc. One of the things we have our ops boxes do is serve as IPSec tunnel endpoints back to the mothership. We don't push much traffic through the tunnel, just little things like Puppet files, code updates, internal DNS, etc. We're not using these tunnels for our core business, just the ancillary stuff.
We decided to deploy one of these boxes on CentOS 6 and get with the new program. We've been running CentOS 5.x for a while and have been pretty pleased with it overall. However, sometimes you gotta run the new distros to get the new things (or just to get the server to boot). So we went with CentOS 6 for one ops box and started to work through some things. Once we thought we had it all up and running, we enabled the IPSec tunnel using racoon (gotta build your own)...then all hell broke loose.
Not initially, but over the course of a day, things started to stall on the box. Puppet runs hiccuped. NFS requests timed out. Things were getting bad. Looking into the matter, I noticed that the power governor was causing some trouble. I also found a kernel bug report that corroborates our findings (https://bugzilla.kernel.org/show_bug.cgi?id=42981). Basically, don't run any kind of power governor on RHEL/CentOS 6 (ACPI itself can stay enabled). The problem can occur when 2 or more cores are told to idle. It doesn't take 16 or 32 like in the bug report. Hopefully the fix will get backported to the RHEL/CentOS kernels, but in the meantime just build your own kernel from the latest source.
Having found and dealt with that problem, we thought we were in the clear. It turned out to be just the tip of the iceberg. The bigger problem is that IPSec is just plain busted in these kernels. It still encrypts and authenticates everything OK, but the processing of incoming and outgoing datagrams is painfully slow. Every time iptables, the VPN, or some other kernel networking module needs to do something, you'll see ksoftirqd run. ksoftirqd is the equivalent of services.exe on a Windows box. Lots of stuff runs through it, but it generally means "server kernel stuffs." When you see lots of these run, you've got a busy box. Once ksoftirqd hits 100%, your box is toast. It doesn't matter if you have 48 cores or 2 cores. Once you see it hit 100%, the box is a goner. Here's what I mean (8-core server):
I think you can see where we enabled IPSec on the graph. The ksoftirqd stuff is represented by the light blue area (system). We disabled the power stuff on Monday. We "fixed" the IPSec stuff on Wednesday night. What's the fix? It's really obscure, but we think it's this:
echo 32768 > /proc/sys/net/ipv4/xfrm4_gc_thresh
You'll probably want to cat that value before you change it, just in case things get worse and you need to put it back. On our machines it was over 4 million. That seems broken for something that's supposed to be some sort of garbage collection threshold. I've seen some distros hardcode it at 262144. Some are at 32768. One guy said he got great IPSec performance setting it as low as 100. Your mileage may vary (a low value might break other things), so experiment with different values.
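If you'd rather not rely on remembering the old number, here's a rough sketch of the save-then-change step as two small shell functions. The function names and the backup location (/root/xfrm4_gc_thresh.orig) are made up for illustration; only the /proc path comes from the command above. Treat it as a starting point, not a tested tool.

```shell
#!/bin/sh
# Sketch only: remember the original xfrm4_gc_thresh before changing it.
# The function names and BACKUP_FILE location are hypothetical.
THRESH_FILE="${THRESH_FILE:-/proc/sys/net/ipv4/xfrm4_gc_thresh}"
BACKUP_FILE="${BACKUP_FILE:-/root/xfrm4_gc_thresh.orig}"

# Save the current value (first run only), then write the new one.
set_gc_thresh() {
    new_value="$1"
    case "$new_value" in
        ''|*[!0-9]*) echo "usage: set_gc_thresh <number>" >&2; return 1 ;;
    esac
    # Keep an existing backup, or a second run would lose the value
    # the box started with.
    if [ ! -e "$BACKUP_FILE" ]; then
        old_value=$(cat "$THRESH_FILE") || return 1
        echo "$old_value" > "$BACKUP_FILE" || return 1
    fi
    echo "$new_value" > "$THRESH_FILE"
}

# Put the saved value back if the new one makes things worse.
restore_gc_thresh() {
    [ -r "$BACKUP_FILE" ] || { echo "no backup at $BACKUP_FILE" >&2; return 1; }
    cat "$BACKUP_FILE" > "$THRESH_FILE"
}
```

Run as root, something like "set_gc_thresh 32768" tries a value and "restore_gc_thresh" backs it out. Keep in mind that anything written under /proc/sys is gone after a reboot, so once you settle on a number you'd likely also want a line such as "net.ipv4.xfrm4_gc_thresh = 32768" in /etc/sysctl.conf.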
Here's a post that led me down this path: http://en.usenet.digipedia.org/thread/16263/17508/
Don't get me wrong: adding IPSec will create more work for your CPUs. However, you should be able to send a few megabits of traffic without the box falling over.
Or just scrap version 6 altogether and run RHEL/CentOS 5. It works fine there. Personally, I think we're just going to buy a bunch of SRX210s off eBay.
Update: for those who wanted a longer term graph and didn't think the problem was fixed:
2/15/2013 04:50:46 am
Just wanted to let you know I very much appreciate you posting this article. I believe you probably just saved me a few days of work trying to get this figured out. Thank you!
6/4/2013 10:14:28 am
Leo, glad I could help out! You should definitely leave ACPI enabled on the machine, but disable the power governors in the OS. We saw a lot of latency and lower throughput when they were enabled. Disabling/stopping the cpuspeed service (if memory serves) should do the trick. The gc_thresh thing was responsible for the biggest gains though.
A NOLA native just trying to get by. I live in San Francisco and work as a digital plumber for the joint that runs this thing. (Square/Weebly) Thoughts are mine, not my company's.