Re: [PATCH] kmemcheck: SMP support

From: Jan Kiszka
Date: Fri May 23 2008 - 13:55:24 EST

Vegard Nossum wrote:
> On Fri, May 23, 2008 at 7:12 PM, Jan Kiszka <jan.kiszka@xxxxxx> wrote:
>> Vegard Nossum wrote:
>>> On Fri, May 23, 2008 at 5:40 PM, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
>>>> Vegard Nossum wrote:
>>>>> This works on real hw, but not on qemu. It seems to get stuck waiting for
>>>>> one of the atomic values to change. Don't know why yet, it might just be yet
>>>>> another bug in qemu... (we've hit at least two of them so far. And they
>>>>> were real bugs too.)
>>>> I've noticed that qemu mis-reports the eip of cmpxchg if it faults (it
>>>> reports the eip of the start of the basic block, I think). Does that match
>>>> what you're seeing?
>>> You mean the EIP that gets pushed on the stack for the page fault?
>>> (That would be bad news for kmemcheck. I suppose the rest of the
>>> kernel never page faults on cmpxchg addresses?)
>>> Or do you mean the EIP that shows up in gdb?
>>> But no, it seems to be unrelated. What I hit so far were (in 0.9.0):
>>> 1. qemu doesn't set the single-stepping flag of DR6 on single-step
>>> debug exceptions.
>>> 2. qemu triggers int 0 (divide error) instead of int 2 on NMI IPIs.
>>> But both of these were fixed in the latest 0.9.1.
>> I guess you mean trunk - NMI IPIs didn't come with "old" 0.9.1.
> Are you sure? It does in fact deliver the NMI IPI as far as I can see
> and I am running from a qemu-0.9.1.tar.gz... E.g. for "-smp 3" on this
> 0.9.1 qemu:
> (first number is smp_processor_id())
> [0 pause all] <-- in page fault handler
> [1 paused] <-- in nmi handler
> [2 paused]
> [0 resume all] <-- in debug exception handler
> [2 resuming, paused = 1] <-- still in nmi handler, now exiting
> [1 resuming, paused = 0]

Revision 4205 (2008-04-13) introduced the NMI abstraction to QEMU, and 4206
added NMI IPIs - while 0.9.1 was released in January. So I'm confused
about what triggers the handler.

> But maybe I should try the trunk and see if that fixes the problem I was seeing!

And even if that hangs, either the internal gdbstub or an external gdb
(on the qemu process) may reveal where things got stuck. Keep in mind
that QEMU is fairly good at widening tiny race windows ;). But if it's
too obscure, just report it to qemu-devel.

>>> I don't yet know if what I'm hitting now is really an error with qemu.
>>> But I usually trust the real hardware more :-)
>> Try KVM as well. It is, of course, much faster than QEMU, and it comes
>> with true SMP (given you have an SMP host). With in-kernel irqchip
>> (that's default), KVM now also supports NMI IPIs. And debug registers
>> should be fine with my latest patch.
>> I'm currently trying to get debug support straight for upstream KVM and,
>> where also required, QEMU. SMP debugging is a common issue, but already
>> usable with KVM. So testers are welcome; an overview of the required
>> patches can be provided.
> Hm. Doesn't KVM require special hardware? I have just a cheap laptop
> (Pentium Dual-Core) and I doubt I will be able to run it... :-(

Yeah, forgot to mention that "minor" precondition...

