Re: [BUG REPORT] Soft Lockup in smp_call_function_single+0xD8

From: Jeff Merkey
Date: Sat Jan 30 2016 - 03:42:15 EST


Here is an MDB debugger trace of the code in question. Please note
that the flags being compared don't match what's in r11 and the
comparison bits are wrong.

(3)>

Break at 0xFFFFFFFF81680022 due to - Proceed (single step)
RAX: 0000000000000080 RBX: 0000000000000002 RCX: 00007FC9877F2A30
RDX: 0000000000000000 RSI: FFFF8800BFD9BC00 RDI: FFFF88011FCD6C80
RSP: FFFF8800CD6C7F58 RBP: 00007FC988119000 R8: FFFF8800CD6C4000
R9: 0000017C85499D0E R10: FFFF8800C17BB8F0 R11: 0000000000000246 << WRONG!!!
R12: 00007FC987AC6400 R13: 0000000000000002 R14: 0000000000000001
R15: 0000000000000000 CS: 0010 DS: 0000 ES: 0000 FS: 0000 GS: 0000 SS: 0018
IP: FFFFFFFF81680022 FLAGS: 0000000000000146 (PF ZF TF) << real flags
0xffffffff81680022 49F7C300010100 test r11,0x10100 << comparison bits correct, r11 is WRONG!!!
(3)>

Break at 0xFFFFFFFF81680029 due to - Proceed (single step)
RAX: 0000000000000080 RBX: 0000000000000002 RCX: 00007FC9877F2A30
RDX: 0000000000000000 RSI: FFFF8800BFD9BC00 RDI: FFFF88011FCD6C80
RSP: FFFF8800CD6C7F58 RBP: 00007FC988119000 R8: FFFF8800CD6C4000
R9: 0000017C85499D0E R10: FFFF8800C17BB8F0 R11: 0000000000000246
R12: 00007FC987AC6400 R13: 0000000000000002 R14: 0000000000000001
R15: 0000000000000000 CS: 0010 DS: 0000 ES: 0000 FS: 0000 GS: 0000 SS: 0018
IP: FFFFFFFF81680029 FLAGS: 0000000000000146 (PF ZF TF)
0xffffffff81680029 753C jne opportunistic_sysret_failed (0xffffffff81680067) (down)
(3)>

This maps to the following code in entry_64.S:

movq R11(%rsp), %r11 << picks up BOGUS flags here
cmpq %r11, EFLAGS(%rsp) /* R11 == RFLAGS */
jne opportunistic_sysret_failed

/*
 * SYSRET can't restore RF. SYSRET can restore TF, but unlike IRET,
 * restoring TF results in a trap from userspace immediately after
 * SYSRET. This would cause an infinite loop whenever #DB happens
 * with register state that satisfies the opportunistic SYSRET
 * conditions. For example, single-stepping this user code:
 *
 *           movq    $stuck_here, %rcx
 *           pushfq
 *           popq %r11
 *   stuck_here:
 *
 * would never get past 'stuck_here'.
 */
testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
jnz opportunistic_sysret_failed
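
For anyone who wants to check the bit arithmetic by hand, here is a
minimal C sketch that decodes the two values shown in the trace
(R11 = 0x246, FLAGS = 0x146) against the RF|TF mask. The bit values
are the architectural RFLAGS positions; the FLAG_* names and the
helper functions are made up for illustration and are not kernel
code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural RFLAGS bit values. The kernel calls the last two
 * X86_EFLAGS_TF and X86_EFLAGS_RF; the FLAG_* names here are
 * illustrative only. */
#define FLAG_PF 0x00000004ULL
#define FLAG_ZF 0x00000040ULL
#define FLAG_TF 0x00000100ULL
#define FLAG_IF 0x00000200ULL
#define FLAG_RF 0x00010000ULL

/* Hypothetical helper: nonzero when "testq $(RF|TF), %r11" would
 * leave ZF clear, i.e. when the following jnz would be taken. */
int sysret_test_fails(uint64_t r11)
{
    return (r11 & (FLAG_RF | FLAG_TF)) != 0;
}

/* Hypothetical helper: print the names of the flags set in value
 * (only the handful of bits defined above are decoded). */
void print_flags(const char *label, uint64_t value)
{
    printf("%s = 0x%llx:", label, (unsigned long long)value);
    if (value & FLAG_PF) printf(" PF");
    if (value & FLAG_ZF) printf(" ZF");
    if (value & FLAG_TF) printf(" TF");
    if (value & FLAG_IF) printf(" IF");
    if (value & FLAG_RF) printf(" RF");
    printf("\n");
}

/* Walk through the two values taken from the trace. */
void check_trace_values(void)
{
    /* The immediate in the disassembly is exactly RF|TF. */
    assert((FLAG_RF | FLAG_TF) == 0x10100ULL);

    print_flags("R11", 0x246);    /* value shown in R11 */
    print_flags("FLAGS", 0x146);  /* value shown in FLAGS */

    assert(!sysret_test_fails(0x246)); /* no RF or TF bit in R11 */
    assert(sysret_test_fails(0x146));  /* TF is set in FLAGS */
}
```

Calling check_trace_values() decodes 0x246 as PF ZF IF and 0x146 as
PF ZF TF. The two values differ only in TF and IF, and only 0x146
has a bit inside the 0x10100 mask.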


Anyway, there is your bug. Should Andy go back and fix this? Probably.

Jeff