What You Need To Know About TCP "SACK Panic"
Last Updated: 2019-06-19 15:56:39 UTC
by Johannes Ullrich (Version: 1)
Netflix discovered several vulnerabilities in how Linux (and in some cases FreeBSD) processes the TCP "Selective Acknowledgment (SACK)" option. The most critical of the vulnerabilities can lead to a kernel panic, rendering the system unresponsive. Patching this vulnerability is critical. Once an exploit is released, the vulnerability could be used to shut down exposed servers, and likely also clients connecting to malicious services.
|CVE|Operating System Affected|Description/Impact|
|CVE-2019-11477|Linux >= 2.6.29|SACK processing integer overflow. Leads to kernel panic.|
|CVE-2019-11478|Linux < 4.14.127|SACK Slowness or Excess Resource Usage|
|CVE-2019-5599|FreeBSD|RACK Send Map SACK Slowness|
|CVE-2019-11479|Linux (all versions)|Excess Resource Consumption Due to Low MSS Values|
You are vulnerable if you are using a current Linux system, have selective acknowledgments enabled (a common default), and are using a network card with TCP Segmentation Offload (again, a common default in modern servers). A patch has been made available. Alternatively, you can disable SACK.
Netflix included patches for the Linux kernel in its advisory. The following Linux kernel versions include the patch: 4.4.182, 4.9.182, 4.14.127, 4.19.52, 5.1.11.
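As a rough aid, the version list above can be turned into a small check. This is only a sketch: the function name is_patched is made up for illustration, it only knows the five stable series listed above, and it assumes an upstream-style version string. Distribution kernels often backport fixes without changing the upstream version number, so treat the result as a hint and check your vendor's advisory.

```shell
# Hypothetical helper, not part of the advisory: compares an upstream
# kernel version against the fixed releases listed above.
# Exit status: 0 = at or above the fix, 1 = older, 2 = series not listed.
is_patched() {
    ver=${1%%-*}                                   # drop a suffix such as "-generic"
    series=$(printf '%s\n' "$ver" | cut -d. -f1,2)  # e.g. "4.19"
    patch=$(printf '%s\n' "$ver" | cut -d. -f3 | tr -cd '0-9')
    patch=${patch:-0}                              # "4.19" counts as 4.19.0
    case "$series" in
        4.4|4.9) min=182 ;;
        4.14)    min=127 ;;
        4.19)    min=52 ;;
        5.1)     min=11 ;;
        *)       return 2 ;;                       # series not in the list above
    esac
    [ "$patch" -ge "$min" ]
}

# Example: check the running kernel and print the exit status.
#   is_patched "$(uname -r)"; echo $?
```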
What is SACK?
Each host connected to a network can send packets of a specific maximum size ("MTU"). This size depends on the network technology used, and for Ethernet, a typical size is 1500 bytes. But it can be as large as 9,000 for Ethernet. Some of this space is used for headers. With a standard 20 byte IP header, and a 20 byte TCP header, TCP packets usually can hold up to 1,460 bytes of data (the "Maximum Segment Size"). TCP will break down a data stream into segments that are small enough not to exceed this size, and hosts will communicate their respective maximum segment size to each other to reduce the chance of fragmentation.
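The arithmetic behind the 1,460 byte figure is simple enough to sketch in shell. The function name mss_for_mtu is made up for this example, and it assumes header sizes without options (20 bytes each) unless you pass different values.

```shell
# Hypothetical helper: MSS = MTU - IP header - TCP header.
# Header sizes default to 20 bytes each (no IP or TCP options).
mss_for_mtu() {
    mtu=$1
    ip_hdr=${2:-20}
    tcp_hdr=${3:-20}
    echo $(( mtu - ip_hdr - tcp_hdr ))
}

mss_for_mtu 1500   # standard Ethernet: prints 1460
mss_for_mtu 9000   # jumbo frames:      prints 8960
```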
To order packets in a TCP connection, each byte transmitted is assigned a sequence number, and the TCP header will list the sequence number of the first byte contained in the packet. A receiver will acknowledge which sequence number it received by communicating the next sequence number it expects.
Only acknowledging complete segments leads to a bit of inefficiency. A receiver cannot communicate to a sender that it already received some out-of-order data. Instead, it will continue to acknowledge the last complete set of segments it has received.
To avoid this inefficiency, SACK was introduced. It allows a receiver to notify the sender that it has received an out-of-order segment: "I received everything up to sequence number 10, and expect 11 next, but I also received 20-30". This way, the sender knows to resend only 11-19 and to continue with 31 next.
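The example can be written out as a toy calculation. This sketch uses the same simplified, inclusive numbering as the text (real SACK blocks carry a left edge and an exclusive right edge), and the function name missing_range is invented for illustration.

```shell
# Toy model of the example above, not real TCP code.
# Arguments: next expected sequence number, first and last sequence
# number of the out-of-order block that was already received.
missing_range() {
    next_expected=$1
    sack_first=$2
    sack_last=$3
    if [ "$next_expected" -lt "$sack_first" ]; then
        echo "resend ${next_expected}-$(( sack_first - 1 )), continue at $(( sack_last + 1 ))"
    else
        echo "nothing missing before the SACK block"
    fi
}

missing_range 11 20 30   # prints: resend 11-19, continue at 31
```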
What is TCP Segmentation Offload?
TCP Segmentation Offload (TSO) is a feature included in most current network cards. To reduce the work CPUs have to do to buffer and reassemble TCP segments, network cards will take care of some of the TCP processing. In this case, the operating system will handle large "packets" exceeding the MTU of the network.
What is TCP "SACK Panic"?
Operating systems need to store data until it is transmitted (and acknowledged) or received. Linux uses a data structure referred to as "Socket Buffer" to do so. In Linux, this socket buffer can hold up to 17 segments. As packets are sent and acknowledged, data is removed from the structure, or some of the data may be consolidated. Moving the data can, in some cases, lead to more than 17 segments stored, which in turn, leads to a kernel panic.
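To see why a very small MSS matters here, the following back-of-the-envelope sketch may help. All figures in it are assumptions taken from my reading of the Netflix advisory rather than from the paragraph above: 17 fragments per socket buffer, up to 32 KB per fragment on x86, 8 bytes of payload per segment at the smallest MSS, and a 16-bit counter for the number of segments. The function name is invented for the example.

```shell
# Assumed figures (per the Netflix advisory, not verified here):
#   17 fragments per socket buffer, up to 32 KB per fragment,
#   8 payload bytes per segment at the smallest MSS,
#   segment count kept in a 16-bit field (maximum 65535).
segments_needed() {
    frags=$1
    frag_bytes=$2
    payload=$3
    echo $(( frags * frag_bytes / payload ))
}

segs=$(segments_needed 17 $(( 32 * 1024 )) 8)
echo "segments: $segs, 16-bit limit: 65535"
if [ "$segs" -gt 65535 ]; then
    echo "segment count no longer fits into 16 bits"
fi
```

With a normal MSS of 1,460 bytes, the same buffer would hold only a few hundred segments, which is why the attack depends on tiny MSS values.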
What can I do to prevent this?
1. Disable SACK in Linux
You may temporarily disable SACK without a reboot. As root run:
echo 0 > /proc/sys/net/ipv4/tcp_sack
If you are using SELinux, it may block this write. In that case, you may need to switch SELinux to permissive mode first (for example with "setenforce 0").
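To confirm the current setting before and after the change, a small read-only check like the following may be handy. The function name sack_status is made up, and the optional path argument exists only so the function can be tried against a test file.

```shell
# Hypothetical helper: report whether SACK is currently enabled.
# The optional argument overrides the proc file (useful for testing).
sack_status() {
    file=${1:-/proc/sys/net/ipv4/tcp_sack}
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 0
    fi
    case "$(cat "$file")" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

sack_status   # prints enabled, disabled, or unknown
```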
To make this change permanent, add the following to /etc/sysctl.conf (or, probably cleaner, as a new file in /etc/sysctl.d):
net.ipv4.tcp_sack = 0
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_fack = 0
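Run "sysctl -p" to apply the changes without a reboot (and again, you may need to disable SELinux). Note that "sysctl -p" without arguments only reads /etc/sysctl.conf. If you used a file in /etc/sysctl.d, run "sysctl --system" or pass the file name to "sysctl -p".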
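If you go with a separate file in /etc/sysctl.d, the steps could look like the sketch below. The file name 99-disable-sack.conf and the function name are just examples I picked; the settings are the three listed above.

```shell
# Hypothetical helper: write the three settings to a drop-in file.
# The default file name is an example, not a required name.
write_sack_dropin() {
    target=${1:-/etc/sysctl.d/99-disable-sack.conf}
    cat > "$target" <<'EOF'
net.ipv4.tcp_sack = 0
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_fack = 0
EOF
}

# As root, write the file and reload all sysctl configuration files:
#   write_sack_dropin && sysctl --system
```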
2. Firewall Rules
The exploit requires very small maximum segment size settings. You can block packets advertising a small MSS in iptables:
iptables -t mangle -A PREROUTING -p tcp -m conntrack --ctstate NEW -m tcpmss ! --mss 536:65535 -j DROP
Per RFC 879, TCP requires an MTU of at least 576, leading to a minimum MSS of 536.
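To make the effect of the "! --mss 536:65535" match concrete, here is a small sketch that mirrors the range test for a few sample values. It only models the numeric comparison for SYNs that carry an MSS option; the function name is invented for this example.

```shell
# Mirrors the numeric part of "! --mss 536:65535":
# succeeds (exit 0) when the advertised MSS falls outside the allowed range.
mss_would_drop() {
    [ "$1" -lt 536 ] || [ "$1" -gt 65535 ]
}

for mss in 48 535 536 1460; do
    if mss_would_drop "$mss"; then
        echo "$mss: DROP"
    else
        echo "$mss: pass"
    fi
done
```

To review the rule once it is loaded, "iptables -t mangle -S PREROUTING" lists the rules in that chain.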
How do I know I am vulnerable?
3. Finding Vulnerable Systems
There is no easy scanning tool available to find vulnerable systems. So far, there is also no PoC exploit available that could be used to find vulnerable systems. As a quick test, you can look for systems supporting SACK (and running Linux). The following tcpdump command may help:
tcpdump -i eth0 -n 'ip[8]<65 and tcp[13]&0x2f=2' | grep 'sackOK'
This command will help identify packets with either the SYN or SYN-ACK flags set and a TTL of less than 65 (to help limit this to Linux systems). There is no simple filter for the SackOK option as it may appear in different positions, so I cheated a bit with the "grep."
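The flag test in the filter (mask 0x2f, expected value 2) is easier to follow with a few sample values. The TCP flags byte uses FIN=0x01, SYN=0x02, RST=0x04, PSH=0x08, ACK=0x10 and URG=0x20, so the mask 0x2f looks at everything except ACK. The sketch below applies the same test in shell arithmetic; the function name is made up.

```shell
# Same test as the filter: mask out the ACK bit and require that
# only SYN remains set among FIN, SYN, RST, PSH and URG.
flags_match() {
    [ $(( $1 & 0x2f )) -eq 2 ]
}

for flags in 0x02 0x12 0x10 0x03; do
    if flags_match "$flags"; then
        echo "$flags: match"
    else
        echo "$flags: no match"
    fi
done
# 0x02 (SYN) and 0x12 (SYN-ACK) match; 0x10 (ACK) and 0x03 (SYN+FIN) do not.
```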
You can use the "ethtool" command on Linux to see if TCP offloading is enabled (ethtool -k interface_name). (Thanks to Alan for pointing this out via Twitter.)
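The ethtool output is fairly long, so a small filter that picks out the relevant line can help. This sketch assumes the usual "tcp-segmentation-offload: on" line format of ethtool -k; the function name tso_state is invented, and eth0 in the usage line is only a placeholder interface name.

```shell
# Hypothetical helper: read "ethtool -k" output on stdin and print
# the state of the tcp-segmentation-offload feature.
tso_state() {
    awk '$1 == "tcp-segmentation-offload:" { print $2; found = 1 }
         END { if (!found) print "unknown" }'
}

# Usage (interface name is a placeholder):
#   ethtool -k eth0 | tso_state
```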
Vendor Statements / Patches
Johannes B. Ullrich, Ph.D., Dean of Research, SANS Technology Institute