Re: kernel bug in kvm_intel

From: Tejun Heo
Date: Sun Nov 01 2009 - 05:44:55 EST


Avi Kivity wrote:
> We get a page fault immediately (next instruction) after returning from
> the guest when running with oprofile. The page fault address does not
> match anything the instruction does, so presumably it is one of the
> accesses the processor performs in order to service an NMI (ordinary
> interrupts are masked; and the fact that it happens with oprofile
> strengthens this assumption).

Ah... okay, that's tricky but IIRC faults like that can be
distinguished from regular ones via processor state, right?

> If this is correct, the fault is not in the NMI handler itself, but in
> one of the memory areas the cpu looks in to vector the NMI, which can be:
> - the IDT
> - the GDT
> - the TSS
> - the NMI stack
> Except for the IDT these are per-cpu structures, though I don't know
> whether they are allocated with the percpu infrastructure.

Don't know where the NMI stack is, but all the others are percpu.

> Here is the code in question:
>> 3ae7: 75 05          jne    3aee <vmx_vcpu_run+0x26a>
>> 3ae9: 0f 01 c2       vmlaunch
>> 3aec: eb 03          jmp    3af1 <vmx_vcpu_run+0x26d>
>> 3aee: 0f 01 c3       vmresume
>> 3af1: 48 87 0c 24    xchg   %rcx,(%rsp)
> ^^^ fault, but not at (%rsp)

Can you please post the full oops (including kernel debug messages
during boot) or give me a pointer to the original message? Also, does
the faulting address coincide with any symbol?
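
For the symbol check, here is a minimal sketch of resolving an address
against System.map-style lines ("<hex addr> <type> <name>"). It is not
part of any kernel tooling; the function names load_symbols() and
resolve() and the sample addresses in the usage note are made up for
illustration.

```python
import bisect


def load_symbols(lines):
    """Parse '<hex addr> <type> <name>' lines into a sorted (addr, name) list."""
    syms = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue  # first field is not a hex address
        syms.append((addr, parts[2]))
    syms.sort()
    return syms


def resolve(syms, addr):
    """Return (name, offset) for the closest symbol at or below addr, else None."""
    addrs = [a for a, _ in syms]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # addr lies below the first known symbol
    base, name = syms[i]
    return name, addr - base
```

With hypothetical input such as ["ffffffff81000000 T _text",
"ffffffff81600000 D example_sym"], resolve(load_symbols(lines),
0xffffffff81600010) gives ("example_sym", 16). This only matches static
symbols from the map that belongs to the running kernel; dynamically
allocated areas would still need their base addresses compared by hand.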


