Re: [BUG REPORT] Soft Lockup in smp_call_function_single+0xD8

From: Jeff Merkey
Date: Mon Jan 25 2016 - 18:06:38 EST


On 1/24/16, Jeff Merkey <linux.mdb@xxxxxxxxx> wrote:
> On 1/24/16, Jeff Merkey <linux.mdb@xxxxxxxxx> wrote:
>> If I single step with either kgdb, kgdb, or mdb kernel debuggers over
>> a sysret instruction anywhere in the OS, the system hard hangs in
>> smp_call_function_single after the debugger releases the system and it
>> resumes normal operation. The specific place the kernel hangs is in
>> the loop below. The softlockup detector will periodically detect
>> this condition when it occurs, but not always, most of the time the
>> system is just hung and unresponsive.
>>
>> (2)> u smp_call_function_single+d8
>> <<<< hard hang in this loop with EDX=3
>> 0xffffffff810fce48 8B55E0 mov edx,DWORD PTR
>> [rbp-32]=0xCE037DC0
>> 0xffffffff810fce4b 83E201 and edx,0x1
>> 0xffffffff810fce4e 75F6 jne
>> smp_call_function_single+0xd6 (0xffffffff810fce46) (up)
>> <<<<<
>> 0xffffffff810fce50 EBC3 jmp
>> smp_call_function_single+0xa5 (0xffffffff810fce15) (up)
>> 0xffffffff810fce52 8B05E08EC700 mov eax,[oops_in_progress]=0x0
>> 0xffffffff810fce58 85C0 test eax,eax
>> 0xffffffff810fce5a 7585 jne
>> smp_call_function_single+0x71 (0xffffffff810fcde1) (up)
>> 0xffffffff810fce5c 803D8E0C9D0000 cmp [__warned.20610]=0x00,0x0
>> 0xffffffff810fce63 0F8578FFFFFF jne
>> smp_call_function_single+0x71 (0xffffffff810fcde1) (up)
>> 0xffffffff810fce69 BE24010000 mov esi,0x124
>> 0xffffffff810fce6e 48C7C796B08C81 mov rdi,0xffffffff818cb096
>> 0xffffffff810fce75 894DBC mov DWORD PTR [rbp-68]=0x0,ecx
>> 0xffffffff810fce78 488955C0 mov QWORD PTR
>> [rbp-64]=0xFFFFFFFFFFFFFF10,rdx
>> 0xffffffff810fce7c E8FF21F8FF call warn_slowpath_null
>> 0xffffffff810fce81 C605690C9D0001 mov [__warned.20610]=0x00,0x1
>> 0xffffffff810fce88 8B4DBC mov ecx,DWORD PTR [rbp-68]=0x0
>> 0xffffffff810fce8b 488B55C0 mov rdx,QWORD PTR
>> [rbp-64]=0xFFFFFFFFFFFFFF10
>> 0xffffffff810fce8f E94DFFFFFF jmp
>> smp_call_function_single+0x71 (0xffffffff810fcde1) (up)
>> 0xffffffff810fce94 E8A71EF8FF call __stack_chk_fail
>> 0xffffffff810fce99 0F1F8000000000 nop DWORD PTR [rax]=0x0
>> (2)> g
>>
>>
>> The stack backtrace when the bug occurs is:
>>
>> smp_call_function_single+0xd8
>> unmap_page_range+0x613
>> flush_tlb_func+0x0
>> smp_call_function_many+215
>> native_flush_tlb_others+0x118
>> flush_tlb_mm_range+0x61
>> tlb_flush_mmu_tlbonly+0x6b
>> tlb_finish_mmu+0x14
>> unmap_region+0xe2
>> vma_rb_erase+0x10f
>> do_unmap+0x217
>> vm_unmap+0x41
>> SyS_munmap+0x22
>> entry_SYSCALL_64_fastpath+0x12
>>
>> I traced through this code a bunch of times in just normal operations
>> without triggering the bug to get a feel for what it normally sees in
>> EDX and it looks like someone has coded a looping function that always
>> has EDX=0 in every case I saw in the except for when this bug occurs.
>>
>> So the exact C code this maps fro objdump of kernel/smp.o is:
>>
>> 469: e8 62 fe ff ff callq 2d0 <generic_exec_single>
>> 46e: 8b 55 e0 mov -0x20(%rbp),%edx
>> * previous function call. For multi-cpu calls its even more interesting
>> * as we'll have to ensure no other cpu is observing our csd.
>> */
>> static void csd_lock_wait(struct call_single_data *csd)
>> {
>> while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
>> 471: 83 e2 01 and $0x1,%edx
>> 474: 74 cf je 445 <smp_call_function_single+0xa5>
>> 476: f3 90 pause
>> <<<<<<<<<<
>> 478: 8b 55 e0 mov -0x20(%rbp),%edx
>> 47b: 83 e2 01 and $0x1,%edx
>> 47e: 75 f6 jne 476 <smp_call_function_single+0xd6>
>> <<<<<<<<<<<
>> 480: eb c3 jmp 445 <smp_call_function_single+0xa5>
>> * Can deadlock when called with interrupts disabled.
>> * We allow cpu's that are not yet online though, as no one else can
>> * send smp call function interrupt to this cpu and as such deadlocks
>> * can't happen.
>> */
>> WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>>
>> Each time this bug occurs csd->flags is always set to a value of 3 and
>> never changes. When the system is just running normally, it seems to
>> be 0 the rest of the time. Setting EDX=0 from the debugger console
>> clears the hang condition and the system seems to recover except the
>> system reports this error from the console when you attempt to load
>> programs, indicating the ability of the system to load shared objects
>> is fritzed.
>>
>> #
>> # ls -l
>> /lib64/libc.so.6 version GLI not found << this error and no shared
>> objects will load
>> #
>> #
>>
>> Jeff
>>
>
> I am running down a trace of the MSR values for swapgs. Looks like it
> got nested somewhere down in the entry_64 code. If so, then this is
> just a symptom and not the sickness.
>
> Jeff
>

I got to the bottom of this one and its related to the function loop
locking up the system that is described above when code is allowed to
run on one processor but not the others and one of them is trying to
send an IPI.

You might want to consider some sort of timeout logic for that
function or ability for it to recover. The problem is caused by
holding all the processors except the target processor being traced
and letting it run for a trace session while the other processors are
held suspended. If I release the all processors each time I step
with the target processor the problem goes away. If I hold the
processors while stepping over a section of code this lockup will
occur and the processor the bug occurs on apparently never recovers

In the process of tracking down this bug I traced all of the swapgs
usage in all of the exception handlers and it works well. The check
in paranoid_entry and paranoid_exit seems to catch the cases where the
debugger is or has stepped over a swapgs instruction pointing to a gs
in userspace for both int1 and int3 exceptions, and recovers correctly
through paranoid_entry and paranoid exit with the correct gs and MSR
value in all the cases I traced.

I coded around this issue but it's probably a bug in using that while
loop without some way to determine a processor has gone away
temporarily. The solution is to let the processors run in between
each and every int1 trap and not hold a processor in focus mode, then
this code snippet that locks up in smp.c does not get hit.

Jeff