netfilter regression causes lost pings "operation not permitted"

From: Trevor Cordes
Date: Wed Dec 07 2016 - 03:13:56 EST


Bisected down to:
870190a9ec9075205c0fa795a09fa931694a3ff1
7c9664351980aaa6a4b8837a314360b3a4ad382a

Hi! Since 4.8.x, a script of mine that pings all IPs on my LAN /24 subnet
in about 0.5s (and nmap doing the same) errors on the send() call with
"operation not permitted". This happens after a somewhat random number of
packets have already been sent. That number shrinks each time you run the
script: on the first run you'll get up to around 200 pings before the
error, then it goes down to 50 pings. If you wait, it goes back up to
around 200 pings. It almost never completes all 253 of them.

Interestingly, the problem only occurs when you ping different IPs. If
you send the same ping count using my script to just one IP, there is no
bug.

4.7.0 kernels don't have this problem: the pings go out and everything is
fine no matter how fast you repeat the script.

I bisected the bug to the above commits. I had to skip
7c9664351980aaa6a4b8837a314360b3a4ad382a because it wouldn't boot; it
just panicked on every try. So I can't narrow it down any further than
those 2 commits.

You can reproduce this bug in 4.8.8 or newer with:

# change to your LAN subnet
nmap -PE 192.168.100.0/24

Or use the test script pasted below. (Modify the top lines to suit your
LAN IPs; other netmasks will need more changes.) Sometimes you have to
run the script a few times before the error occurs.

When you see "operation not permitted", that's the symptom. Boot into
4.7.10, say, and you don't get any error.
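
Since the error may only show up after a few runs, a small wrapper can
repeat the test and tally the failing runs. This is only a sketch: the
function name count_failures and the filename ./pingtest.pl are made up
for illustration, and it relies on the test script exiting non-zero when
send() fails (which die() does) and zero when it prints "OK".

```shell
#!/bin/sh
# Run a command a given number of times and print how many runs failed.
# count_failures is a hypothetical helper name, not part of any tool.
count_failures() {
    runs=$1; shift
    failed=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # any non-zero exit status counts as a failed run
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed"
}

# Example (needs root for the raw socket; ./pingtest.pl is an assumed name):
# count_failures 10 ./pingtest.pl
```

On an affected kernel the printed count should be above zero after a
handful of runs; on 4.7.x it should stay at 0.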

I played with all the sysctls that looked relevant (ratelimit, per_sec,
max, etc.) and modified everything I could find, but nothing made the
problem go away. I *think* some had a modest effect on how many times I
could run the script before the error popped up, but even at extreme
values the bug never went away.

I'm back to the Fedora defaults now:

#sysctl -a | grep -iP 'icmp|nf_|conntrack|iptable'|grep -viP 'nf_log'
net.ipv4.icmp_echo_ignore_all = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_errors_use_inbound_ifaddr = 0
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.icmp_msgs_burst = 50
net.ipv4.icmp_msgs_per_sec = 1000
net.ipv4.icmp_ratelimit = 1000
net.ipv4.icmp_ratemask = 6168
net.ipv6.icmp.ratelimit = 1000
net.netfilter.nf_conntrack_acct = 0
net.netfilter.nf_conntrack_buckets = 65536
net.netfilter.nf_conntrack_checksum = 1
net.netfilter.nf_conntrack_count = 201
net.netfilter.nf_conntrack_events = 1
net.netfilter.nf_conntrack_expect_max = 1024
net.netfilter.nf_conntrack_generic_timeout = 600
net.netfilter.nf_conntrack_helper = 0
net.netfilter.nf_conntrack_icmp_timeout = 30
net.netfilter.nf_conntrack_log_invalid = 0
net.netfilter.nf_conntrack_max = 262144
net.netfilter.nf_conntrack_tcp_be_liberal = 0
net.netfilter.nf_conntrack_tcp_loose = 1
net.netfilter.nf_conntrack_tcp_max_retrans = 3
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_established = 432000
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
net.netfilter.nf_conntrack_timestamp = 0
net.netfilter.nf_conntrack_udp_timeout = 30
net.netfilter.nf_conntrack_udp_timeout_stream = 180
net.nf_conntrack_max = 262144


Thanks for your help!



TEST SCRIPT:::::

#!/usr/bin/perl -w
# sorry, cheesy formatting, this is a test case I just slapped together

my $subnet = '192.168.100.';
#my $single = '192.168.101.110';

use Socket;
use Symbol;
use NetAddr::IP::Lite;

sub ICMP_ECHO () { 8 }
sub ICMP_SUBCODE () { 0 }
sub ICMP_STRUCT () { 'C2S3A56' }
sub ICMP_FLAGS () { 0 }
sub ICMP_PORT () { 0 }

$sequence=0;

for $i (2..254) {

    # a fresh raw ICMP socket is created for every target address
    $protocol = (getprotobyname('icmp'))[2] or
        die('Cannot get ICMP protocol number by name - ', $!);

    $socket = Symbol::gensym;
    socket($socket, PF_INET, SOCK_RAW, $protocol) or
        die('Cannot create ICMP socket - ', $!);


    $sequence = ($sequence+1) & 0xFFFF;

    # build the echo request with a zero checksum first
    my $checksum = 0;
    my $msg = pack(
        ICMP_STRUCT,
        ICMP_ECHO,
        ICMP_SUBCODE,
        $checksum,
        $$ & 0xFFFF,
        $sequence,
        '0' x 56
    );

    # 16-bit one's-complement checksum over the whole message
    my $short = int(length($msg) / 2);
    $checksum += $_ for unpack "S$short", $msg;
    $checksum += ord(substr($msg, -1)) if length($msg) % 2;
    $checksum = ($checksum >> 16) + ($checksum & 0xFFFF);
    $checksum = ~(($checksum >> 16) + $checksum) & 0xFFFF;

    # repack with the real checksum filled in
    $msg = pack(
        ICMP_STRUCT,
        ICMP_ECHO,
        ICMP_SUBCODE,
        $checksum,
        $$ & 0xFFFF,
        $sequence,
        '0' x 56
    );

    my($address)=$single?$single:"$subnet$i";
    my $netaddr = inet_aton($address);
    my $sockaddr = pack_sockaddr_in(ICMP_PORT, $netaddr);
    # this send() is the call that fails with "operation not permitted"
    send($socket, $msg, ICMP_FLAGS, $sockaddr) or
        die("ERROR ($address) sending ICMP packet - $!");
}
print "OK\n";