Re: [RESEND PATCH v4] x86/hpet: Reduce HPET counter read contention

From: Prarit Bhargava
Date: Wed Aug 10 2016 - 15:02:15 EST




On 08/10/2016 02:37 PM, Long, Wai Man wrote:
> Hi,
>
> I would like to restart the discussion about the merit of this patch.
>
> This patch was created in response to a problem we have on the 16-socket Broadwell-EX systems (up to 768 logical CPUs) that were under development. About 10% of the kernel boots experienced soft lockups:
>
> [ 71.618132] NetLabel: Initializing
> [ 71.621967] NetLabel: domain hash size = 128
> [ 71.626848] NetLabel: protocols = UNLABELED CIPSOv4
> [ 71.632418] NetLabel: unlabeled traffic allowed by default
> [ 71.638679] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0
> [ 71.646504] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
> [ 71.655313] Switching to clocksource hpet
> [ 95.679135] BUG: soft lockup - CPU#144 stuck for 23s! [swapper/144:0]
> [ 95.693363] BUG: soft lockup - CPU#145 stuck for 23s! [swapper/145:0]
> [ 95.694203] Modules linked in:
> [ 95.694697] CPU: 145 PID: 0 Comm: swapper/145 Not tainted 3.10.0-327.el7.x86_64 #1
> [ 95.695580] BUG: soft lockup - CPU#582 stuck for 23s! [swapper/582:0]
> [ 95.696145] Hardware name: HP Superdome2 16s x86, BIOS Bundle: 008.001.006 SFW: 041.063.152 01/16/2016
> [ 95.698128] BUG: soft lockup - CPU#357 stuck for 23s! [swapper/357:0]
> [ 95.699774] task: ffff8cf7fecf4500 ti: ffff89787c748000 task.ti: ffff89787c748000
>
> During the bootup process, there is a short period in which the clocksource is switched to hpet to calibrate the TSCs. It is switched back to tsc once the calibration is done. The soft lockups may happen during this short period.
>
> Prarit also hit this problem with a smaller Intel box that has 96 cores (192 threads). Maybe he can supply more information about what he saw.
>

I've hit this on a system with 192 threads. The TSC is functional and has
passed the TSC sync checks during boot. When the HPET is used to resynchronize
the TSC, I occasionally see

PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0
hpet0: 8 comparators, 64-bit 24.000000 MHz counter
Switched to clocksource hpet

followed by the same NMI flood that Waiman described. After some debugging I
came to the same conclusion that Waiman had: the HPET is causing contention on
the system when many threads access it rapidly.

After applying his patch the problem no longer occurs.

P.
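
For readers following the thread, here is a rough userspace sketch of one way
to reduce contention on a slow shared counter. It is not Waiman's patch and
does not touch the kernel's HPET code. All names in it (slow_hw_read,
read_counter_shared, shared, LOCKED, demo_uncontended) are hypothetical, and
the "hardware" is simulated with an atomic variable. The assumed idea: only one
thread at a time performs the expensive read and publishes the result, while
threads that lose the race reuse the published value instead of issuing their
own read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the slow, shared MMIO counter register. */
static _Atomic uint32_t fake_hw_counter;
/* Number of simulated hardware reads, kept for illustration only. */
static _Atomic unsigned long hw_reads;

static uint32_t slow_hw_read(void)
{
    atomic_fetch_add(&hw_reads, 1);
    return atomic_fetch_add(&fake_hw_counter, 1) + 1;
}

/*
 * Lock flag (bit 32) and last published counter value (low 32 bits) are
 * packed into one word so that both can be observed with a single load.
 */
#define LOCKED ((uint64_t)1 << 32)
static _Atomic uint64_t shared;

static uint32_t read_counter_shared(void)
{
    uint64_t old = atomic_load(&shared);
    uint64_t now;

    /* Uncontended: take the lock, read the counter, publish, unlock. */
    if (!(old & LOCKED) &&
        atomic_compare_exchange_strong(&shared, &old, old | LOCKED)) {
        uint32_t v = slow_hw_read();
        atomic_store(&shared, (uint64_t)v); /* stores value, clears lock */
        return v;
    }

    /*
     * Contended: do not read the counter. Spin until the lock holder
     * publishes a new value or releases the lock, then reuse what it
     * published.
     */
    do {
        now = atomic_load(&shared);
    } while ((uint32_t)now == (uint32_t)old && (now & LOCKED));

    return (uint32_t)now;
}

/* Single-threaded usage: with no contention every call does its own read. */
static void demo_uncontended(void)
{
    assert(read_counter_shared() == 1);
    assert(read_counter_shared() == 2);
    assert(atomic_load(&hw_reads) == 2);
}
```

With many threads calling read_counter_shared() at once, most callers would
take the spin path and the number of simulated hardware reads would stay well
below the number of calls. A real implementation would also need to consider
interrupt and NMI context, which this sketch ignores.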