[tip:x86/urgent] Prevent timer value 0 for MWAITX

From: tip-bot for Janakarajan Natarajan
Date: Sun Apr 30 2017 - 07:44:37 EST


Commit-ID: 88d879d29f9cc0de2d930b584285638cdada6625
Gitweb: http://git.kernel.org/tip/88d879d29f9cc0de2d930b584285638cdada6625
Author: Janakarajan Natarajan <Janakarajan.Natarajan@xxxxxxx>
AuthorDate: Tue, 25 Apr 2017 16:44:03 -0500
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitDate: Sun, 30 Apr 2017 13:35:11 +0200

Prevent timer value 0 for MWAITX

Newer hardware has uncovered a bug in the software implementation of
using MWAITX for the delay function. A value of 0 for the timer is meant
to indicate that a timeout will not be used to exit MWAITX. On newer
hardware this can cause MWAITX to never return, which leads to NMI
soft lockup messages being printed. On older hardware, some of the other
conditions under which MWAITX can exit masked this issue. The AMD APM
does not currently document this behavior and will be updated.

Please refer to http://marc.info/?l=kvm&m=148950623231140 for
information regarding NMI soft lockup messages on an AMD Ryzen 1800X.
This has been root-caused as a 0 passed to MWAITX causing it to wait
indefinitely.

This change has the added benefit of avoiding the unnecessary setup of
MONITORX/MWAITX when the delay value is zero.

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@xxxxxxx>
Link: http://lkml.kernel.org/r/1493156643-29366-1-git-send-email-Janakarajan.Natarajan@xxxxxxx
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

---
 arch/x86/lib/delay.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index a8e91ae..29df077 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -93,6 +93,13 @@ static void delay_mwaitx(unsigned long __loops)
 {
 	u64 start, end, delay, loops = __loops;

+	/*
+	 * Timer value of 0 causes MWAITX to wait indefinitely, unless there
+	 * is a store on the memory monitored by MONITORX.
+	 */
+	if (loops == 0)
+		return;
+
 	start = rdtsc_ordered();

 	for (;;) {
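
For illustration only, the effect of the early return can be modelled in
user space. This is a minimal sketch, not the kernel code: struct
wait_model, model_mwaitx(), delay_sketch() and demo() are hypothetical
names, and no MONITORX/MWAITX instruction is executed. The model only
records whether a wait would have been armed with a timer value of 0,
which is the case the patch avoids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the model; not a kernel structure. */
struct wait_model {
	unsigned int calls;	/* number of simulated waits armed */
	bool armed_untimed;	/* true if any wait was armed with timer 0 */
};

/*
 * Hypothetical stand-in for MWAITX. A timer value of 0 means "no
 * timeout", so on real hardware only another wake-up event could end
 * the wait. Here we just record how the wait was armed.
 */
static void model_mwaitx(struct wait_model *m, uint64_t timeout)
{
	m->calls++;
	if (timeout == 0)
		m->armed_untimed = true;
}

/* Guarded delay: return before arming any wait when loops is 0. */
static void delay_sketch(struct wait_model *m, uint64_t loops)
{
	if (loops == 0)
		return;

	model_mwaitx(m, loops);
}

/* Small self-check showing that a zero delay never reaches the wait. */
static void demo(void)
{
	struct wait_model m = { 0, false };

	delay_sketch(&m, 0);
	assert(m.calls == 0 && !m.armed_untimed);

	delay_sketch(&m, 100);
	assert(m.calls == 1 && !m.armed_untimed);
}
```

With the guard in place, a zero-length delay skips the simulated wait
entirely, mirroring the side benefit noted in the changelog of avoiding
the MONITORX/MWAITX setup when the delay value is zero.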