Re: Regression in 5.3-rc1 and later

From: Thomas Gleixner
Date: Thu Aug 22 2019 - 09:44:54 EST


Chris,

On Thu, 22 Aug 2019, Chris Clayton wrote:

Trimmed cc list

> I've found a problem that isn't present in 5.2 series or 4.19 series
> kernels, and seems to have arrived in 5.3-rc1. The problem is that if I
> suspend (to ram) my laptop, on resume 14 minutes or more after
> suspending, I have no networking functionality. If I resume the laptop
> after 13 minutes or less, networking works fine. I haven't tried to get
> finer grained timings between 13 and 14 minutes, but can do if it would
> help.
>
> ifconfig shows that wlan0 is still up and still has its assigned ip
> address but, for instance, a ping of any other device on my network
> fails, as does pinging, say, kernel.org. I've tried "downing" the network
> with (/sbin/ifdown) and unloading the iwlmvm module and then reloading
> the module and "upping" (/sbin/ifup) the network, but my network is still
> unusable. I should add that the problem also manifests if I hibernate the
> laptop, although my testing of this has been minimal. I can do more if
> required.

What happens if you restart the network manager and/or wpa_supplicant or
whatever your distro uses for that?

> As I say, the problem first appears in 5.3-rc1, so I've bisected between
> 5.2.0 and 5.3-rc1 and that concluded with:

Just for confirmation, it's still broken as of 5.3-rc5, right? We had fixes
post rc1.

> x86/vdso: Switch to generic vDSO implementation

> To confirm my bisection was correct, I did a git checkout of
> 7ac8707479886c75f353bfb6a8273f423cfccb2. As expected, the kernel
> exhibited the problem I've described. However, a kernel built at the
> immediately preceding (parent?) commit
> (bfe801ebe84f42b4666d3f0adde90f504d56e35b) has a working network after a
> (>= 14 minute) suspend/resume cycle.

~14 minutes is odd. I can't come up with anything which rolls over, wraps
or overflows at that point.
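
As a rough sanity check of that remark, here is a minimal Python sketch. It
asks whether any plain power-of-two boundary falls between 13 and 14 minutes
when the elapsed time is counted in a few common units. The unit list and the
helper name are made up for illustration and are not taken from the report.
It only covers unscaled counters, so a value that is shifted or multiplied
before it is stored would wrap somewhere else.

```python
# Sketch: look for power-of-two boundaries inside the 13..14 minute window.
# UNITS_PER_SECOND and power_of_two_boundaries() are illustrative
# assumptions, not anything from the kernel or the bug report.

UNITS_PER_SECOND = {
    "seconds": 1,
    "milliseconds": 10**3,
    "microseconds": 10**6,
    "nanoseconds": 10**9,
}


def power_of_two_boundaries(lo_seconds, hi_seconds, units_per_second):
    """Return every exponent n with lo < 2**n <= hi, counted in the given unit."""
    lo = lo_seconds * units_per_second
    hi = hi_seconds * units_per_second
    return [n for n in range(1, 65) if lo < 2**n <= hi]


if __name__ == "__main__":
    for name, per_second in UNITS_PER_SECOND.items():
        hits = power_of_two_boundaries(13 * 60, 14 * 60, per_second)
        print(f"{name:>12}: {hits if hits else 'no power of two in window'}")
```

For all four units the window contains no such boundary, which fits the
observation that nothing obvious wraps at that point.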

Can you please provide the output of:

dmesg | grep -i TSC

Thanks,

tglx