Regression :-) Re: [GIT PULL RESEND] x86/jumplabel changes for v3.12-rc1

From: Konrad Rzeszutek Wilk
Date: Wed Sep 11 2013 - 09:47:52 EST


On Tue, Sep 10, 2013 at 07:48:44PM -0700, H. Peter Anvin wrote:
> Hi Linus,
>
> One more x86 tree for this merge window. This tree improves the
> handling of jump labels, so that most of the time we don't have to do
> a massive initial patching run. Furthermore, we will error out if the
> jump label is not what is expected, e.g. if it has been corrupted or
> tampered with.
>
> This tree does conflict with your top of tree; the resolution should be
> reasonably straightforward but let me know if you want a merged tree.
>
> The following changes since commit ad81f0545ef01ea651886dddac4bef6cec930092:
>
> Linux 3.11-rc1 (2013-07-14 15:18:27 -0700)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-jumplabel-for-linus
>
> for you to fetch changes up to fb40d7a8994a3cc7a1e1c1f3258ea8662a366916:
>
> x86/jump-label: Show where and what was wrong on errors (2013-08-06 21:54:33 -0400)

This triggers a BUG when booting a Xen guest with PV ticketlocks enabled (they
are enabled by default). It boots if I revert this merge or if I pass 'xen_nopvspin'.

With some modifications (pasted at the end) I see:

about to get started...
Unexpected op at trace_clock_global+0x6b/0x120 [ffffffff8113a21b] (0f 1f 44 00 00) /home/build/linux-konrad/arch/x86/kernel/jumpn VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.2.2-pre x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff81051e3d>]
(XEN) RFLAGS: 0000000000000292 EM: 1 CONTEXT: pv guest
(XEN) rax: 0000000000000000 rbx: ffffffff81eaaec0 rcx: 0000000000000001
(XEN) rdx: ffffffff81fac0a0 rsi: 000000000000008c rdi: 0000000000000000
(XEN) rbp: ffffffff81c01e88 rsp: ffffffff81c01e08 r8: 000000000000fffa
(XEN) r9: 0000000000000002 r10: 0000000000000000 r11: 000000000000fffd
(XEN) r12: ffffffff81ca8598 r13: ffffffff81eaaea0 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000426f0
(XEN) cr3: 0000000231c0c000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff81c01e08:
(XEN) 0000000000000001 000000000000fffd ffffffff81051e3d 000000010000e030
(XEN) 0000000000010092 ffffffff81c01e48 000000000000e02b ffffffff81051e3d
(XEN) ffffffff00000000 0000000000000000 ffffffff81952c18 0000000000000035
(XEN) 0000000000441f0f 0000000000000018 ffffff9066666666 ffffffffffffffff
(XEN) ffffffff81c01ea8 ffffffff81051eb5 0000000000441f0f 0000000000000000
(XEN) ffffffff81c01ed8 ffffffff81cfbbfb ffffffff81d6b900 ffffffffffffffff
(XEN) ffffffff81d6b900 ffffffff81d742e0 ffffffff81c01f28 ffffffff81cd3e3c
(XEN) ffffffff81cd3af2 ffffffff82051000 ffffffff82052000 ffffffff81d742e0
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) ffffffff81c01f38 ffffffff81cd35f3 ffffffff81c01ff8 ffffffff81cd833a
(XEN) 0300000100000032 0000000000000005 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 9d9822831fc9cbf5 000206a700100800
(XEN) 0000000000000001 0000000000000000 0000000000000000 0f00000060c0c748
(XEN) ccccccccccccc305 cccccccccccccccc cccccccccccccccc cccccccccccccccc
(XEN) cccccccccccccccc cccccccccccccccc cccccccccccccccc cccccccccccccccc
(XEN) cccccccccccccccc cccccccccccccccc cccccccccccccccc cccccccccccccccc
(XEN) cccccccccccccccc cccccccccccccccc cccccccccccccccc cccccccccccccccc
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.

I can boot it with 'xen_nopvspin' which leads me to believe that it is due
to:

void __init xen_init_spinlocks(void)
{

	if (!xen_pvspin) {
		printk(KERN_DEBUG "xen: PV spinlocks disabled\n");
		return;
	}

	static_key_slow_inc(&paravirt_ticketlocks_enabled); <====

Which means that all of the arch_spin_unlock call sites (which are inlined) and
the like will now be patched over.
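
For anyone following along who has not looked at static keys, here is a
minimal user-space sketch of the idea: each inlined site holds a 5-byte NOP,
and the first enable of the key rewrites every site into a 5-byte jump after
checking that the NOP is really there. Every name prefixed toy_ is
hypothetical and made up for illustration. This is not the kernel's
implementation, which patches live kernel text rather than a byte array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PATCH_SIZE 5

/* 5-byte NOP, the same bytes that appear in the "Unexpected op" message. */
static const uint8_t toy_nop5[TOY_PATCH_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* One patchable site: where the NOP lives and the jump displacement to use. */
struct toy_site {
	uint8_t *code;   /* points into a toy "text" buffer */
	int32_t rel32;   /* displacement relative to the end of the instruction */
};

/* A toy key: a reference count plus the sites it controls. */
struct toy_key {
	int enabled;
	struct toy_site *sites;
	size_t nr_sites;
};

/* Encode "jmp rel32": opcode 0xe9 followed by a little-endian displacement. */
static void toy_encode_jmp(uint8_t out[TOY_PATCH_SIZE], int32_t rel32)
{
	uint32_t u = (uint32_t)rel32;

	out[0] = 0xe9;
	out[1] = (uint8_t)(u & 0xff);
	out[2] = (uint8_t)((u >> 8) & 0xff);
	out[3] = (uint8_t)((u >> 16) & 0xff);
	out[4] = (uint8_t)((u >> 24) & 0xff);
}

/*
 * Take a reference on the key. On the 0 -> 1 transition, verify that every
 * site holds the expected NOP and then patch all of them to jumps.
 * Returns 0 on success, -1 if any site holds unexpected bytes (nothing is
 * patched in that case).
 */
static int toy_key_slow_inc(struct toy_key *key)
{
	size_t i;

	assert(key != NULL);

	if (key->enabled > 0) {
		key->enabled++;   /* already patched, only count the reference */
		return 0;
	}

	for (i = 0; i < key->nr_sites; i++) {
		if (memcmp(key->sites[i].code, toy_nop5, TOY_PATCH_SIZE) != 0)
			return -1;   /* unexpected op: refuse to patch */
	}

	for (i = 0; i < key->nr_sites; i++) {
		uint8_t jmp[TOY_PATCH_SIZE];

		toy_encode_jmp(jmp, key->sites[i].rel32);
		memcpy(key->sites[i].code, jmp, TOY_PATCH_SIZE);
	}

	key->enabled = 1;
	return 0;
}
```

The verify-before-patch step is the part that matters for this report: the
toy version returns an error where the real code would complain loudly.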

But perhaps they are not supposed to be enabled in the .smp_prepare_boot_cpu
function chain? That seems like the best place, though, as you need to enable
this before the spinlocks are used on SMP.

And the IPs are all NOPs.
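
To make that concrete: the five bytes in the message, 0f 1f 44 00 00, encode
the long-form NOP (nopl 0x0(%rax,%rax,1)), and the word 0000000000441f0f in
the guest stack dump is the same byte sequence read as a little-endian
integer. The user-space sketch below checks both facts. The toy_ names are
hypothetical, and what that stack slot was actually used for is not something
this sketch can tell you.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum toy_op { TOY_OP_NOP5, TOY_OP_JMP32, TOY_OP_OTHER };

/* Classify a 5-byte instruction slot as the long NOP, a jmp rel32, or other. */
static enum toy_op toy_classify(const uint8_t insn[5])
{
	static const uint8_t nop5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

	assert(insn != NULL);

	if (memcmp(insn, nop5, sizeof(nop5)) == 0)
		return TOY_OP_NOP5;
	if (insn[0] == 0xe9)
		return TOY_OP_JMP32;
	return TOY_OP_OTHER;
}

/* Store a 64-bit value least significant byte first, as x86 keeps it in memory. */
static void toy_store_le64(uint64_t word, uint8_t out[8])
{
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(word >> (8 * i));
}
```

Feeding toy_store_le64() the stack word 0x0000000000441f0f and passing the
first five bytes to toy_classify() yields TOY_OP_NOP5.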

Steven, ideas?


diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index ee11b7d..e37a2bb 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -23,7 +23,7 @@ union jump_code_union {
 		int offset;
 	} __attribute__((packed));
 };
-
+#include <xen/hvc-console.h>
 static void bug_at(unsigned char *ip, int line)
 {
 	/*
@@ -31,7 +31,7 @@ static void bug_at(unsigned char *ip, int line)
 	 * Something went wrong. Crash the box, as something could be
 	 * corrupting the kernel.
 	 */
-	pr_warning("Unexpected op at %pS [%p] (%02x %02x %02x %02x %02x) %s:%d\n",
+	xen_raw_printk("Unexpected op at %pS [%p] (%02x %02x %02x %02x %02x) %s:%d\n",
 	       ip, ip, ip[0], ip[1], ip[2], ip[3], ip[4], __FILE__, line);
 	BUG();
 }

Let me modify bug_at so that the 'line' can be seen, as it seems to have been
truncated.
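
One way to keep the line number visible when the output is cut short is to
print it before the long __FILE__ path. The user-space sketch below shows the
effect with a deliberately small buffer. toy_format_bug and the message
layout are assumptions for illustration only, not the change actually made
to bug_at.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the message with the line number ahead of the file path, so that
 * truncation eats the tail of the path instead of the line number.
 * Returns what snprintf() returns: the length the full message would have.
 */
static int toy_format_bug(char *buf, size_t len, const char *file, int line)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "Unexpected op (line %d) %s", line, file);
}
```

If the return value is not smaller than the buffer size, the message was
truncated, but the line number near the front survives.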