Re: Kernel 4.1 hang, apparently in __inet_lookup_established

From: Eric Dumazet
Date: Wed Sep 23 2015 - 12:31:27 EST


On Wed, 2015-09-23 at 10:25 +0200, Patrick Schaaf wrote:
> Dear kernel developers,
>
> I recently started to upgrade my production hosts and VMs from the 3.14 series
> to 4.1 kernels, starting with 4.1.6. Yesterday, for the second time after I
> started these upgrades, I experienced one of our webserver VMs hanging.
>
> The first time this happened, the VM hung completely, all 5 virtual cores
> spinning at 100%, ping still worked, but nothing else, including no virsh
> console reaction - I had to destroy and restart that VM. No messages were to
> be found.
>
> Yesterday, when it happened the second time, I found the VM spinning on a
> single core only, and could still connect to it via ssh - but it stopped
> accepting Apache connections. top showed the core it spun on at 100% "si"
> (softirq) time, and it produced the messages appended below. The VM did not
> shut down properly when told to, and had to be destroyed again.
>
> If I read that dmesg output correctly it spins in __inet_lookup_established,
> which indeed reads like it has infinite spin potential. But that code itself
> did not change relative to the 3.14 series we've been running for a long time
> without the issues - so the root cause would be something else.
>
> For our production systems I'll revert to the 3.14 series, but perhaps this
> report will help somebody understand what's going on.
>
> best regards
> Patrick


You could try the following commits:

http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=ed2e923945892a8372ab70d2f61d364b0b6d9054

http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=29c6852602e259d2c1882f320b29d5c3fec0de04
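For context on the "infinite spin potential" mentioned in the report: the
established-socket lookup walks a hash chain without taking a lock, and when
the walk ends on an end marker that does not belong to the slot it started
from, it starts over. The following is only a simplified user-space sketch of
that restart pattern, not the kernel code. All names in it (struct node,
end_marker, lookup, restarts_for_missing_key) are hypothetical, and the
max_restarts guard is added purely so the sketch terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel identifiers. */
struct node {
    int key;
    uintptr_t next; /* either a node address (even) or an end marker (odd) */
};

/* End markers are odd values that encode the slot the chain belongs to. */
static uintptr_t end_marker(unsigned slot) { return ((uintptr_t)slot << 1) | 1u; }
static int is_end_marker(uintptr_t p) { return (p & 1u) != 0; }
static unsigned marker_slot(uintptr_t p) { return (unsigned)(p >> 1); }

static uintptr_t node_ref(struct node *n)
{
    uintptr_t p = (uintptr_t)n;
    assert(!is_end_marker(p)); /* node addresses must be even */
    return p;
}

/*
 * Walk the chain starting at head, looking for key. If the walk ends on a
 * marker for a different slot, assume the walk strayed onto another chain
 * and start over. max_restarts is an illustrative guard only: without it,
 * a chain that always ends on the wrong marker keeps this loop running
 * forever.
 */
static struct node *lookup(uintptr_t head, unsigned slot, int key,
                           unsigned max_restarts, unsigned *restarts)
{
    *restarts = 0;
    for (;;) {
        uintptr_t p = head;

        while (!is_end_marker(p)) {
            struct node *n = (struct node *)p;
            if (n->key == key)
                return n;
            p = n->next;
        }
        if (marker_slot(p) == slot)
            return NULL; /* clean miss: chain ended where it should */
        if (*restarts == max_restarts)
            return NULL; /* gave up (the sketch's guard, see above) */
        ++*restarts;
    }
}

/*
 * Build a two-node chain for `slot` that is terminated by the marker for
 * `tail_slot`, look up a key that is not present, and return how many
 * times the lookup restarted.
 */
static unsigned restarts_for_missing_key(unsigned slot, unsigned tail_slot,
                                         unsigned max_restarts)
{
    struct node b = { 2, end_marker(tail_slot) };
    struct node a = { 1, node_ref(&b) };
    unsigned restarts;
    struct node *hit = lookup(node_ref(&a), slot, 99, max_restarts, &restarts);

    assert(hit == NULL);
    return restarts;
}
```

In this model, a chain whose tail carries its own slot's marker yields a
clean miss with no restarts, while a chain whose tail permanently carries
another slot's marker restarts until the guard trips. A chain left in that
state by a race elsewhere would therefore produce a spin like the one
described, even if the lookup code itself had not changed.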


