Re: [patch 1/2] x86_64 page fault NMI-safe

From: Linus Torvalds
Date: Sun Jul 18 2010 - 13:37:30 EST


On Sun, Jul 18, 2010 at 4:03 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
>
> By trading off some memory, we don't need this trickery.  We can allocate
> two nmi stacks, so the code becomes:

I really don't think you need even that. See earlier in the discussion
about how we could just test %rsp itself. Which makes all the %rip
testing totally unnecessary, because we don't even need any flags, and
we have no races because %rsp is changed atomically with taking the
exception.
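
For reference, the %rsp test is just an xor and a shift. A minimal C
sketch of the same check, assuming the NMI stack is a single naturally
aligned 8kB area (that is what the shift by 13 in the asm implies). The
helper name and the macro are made up for illustration, they are not
kernel API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the NMI stack is one naturally aligned 8kB (1 << 13) area. */
#define NMI_STACK_SHIFT 13

/*
 * Hypothetical helper: returns true when both stack pointers fall inside
 * the same aligned 8kB area, i.e. they differ only in the low 13 bits.
 */
static bool same_nmi_stack(uint64_t cur_rsp, uint64_t old_rsp)
{
	return ((cur_rsp ^ old_rsp) >> NMI_STACK_SHIFT) == 0;
}
```

If the interrupted %rsp is in the same area as the current one, the NMI
probably hit while we were already on the NMI stack.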

Lookie here, the %rsp comparison really isn't that hard:

nmi:
	pushq %rax
	pushq %rdx
	movq %rsp,%rdx		# current stack top
	movq 40(%rsp),%rax	# old stack top
	xor %rax,%rdx		# same 8kB aligned area?
	shrq $13,%rdx		# ignore low 13 bits
	je it_is_a_nested_nmi	# looks nested..
non_nested:
	...
	... ok, we're not nested, do normal NMI handling ...
	...
	popq %rdx
	popq %rax
	iret

it_is_a_nested_nmi:
	cmpw $0,48(%rsp)	# double-check that it really was a nested exception
	jne non_nested		# from user space or something..
	# this is the nested case
	# NOTE! NMI's are blocked, we don't take any exceptions etc etc
	addq $-160,%rax		# 128-byte redzone on the old stack + 4 words
	movq (%rsp),%rdx
	movq %rdx,(%rax)	# old %rdx
	movq 8(%rsp),%rdx
	movq %rdx,8(%rax)	# old %rax
	movq 32(%rsp),%rdx
	movq %rdx,16(%rax)	# old %rflags
	movq 16(%rsp),%rdx
	movq %rdx,24(%rax)	# old %rip
	movq %rax,%rsp
	popq %rdx
	popq %rax
	popf
	ret $128		# restore %rip and %rsp

doesn't that look pretty simple?

NOTE! OBVIOUSLY TOTALLY UNTESTED!
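
To make the offsets in the asm easier to check, here is a rough C sketch
of the stack layout after the two pushq instructions, assuming the
standard five-word frame the CPU pushes in 64-bit mode (%rip, %cs,
%rflags, %rsp, %ss). The struct, macro and function names are
hypothetical, for illustration only:

```c
#include <stddef.h>
#include <stdint.h>

/* Stack as seen from %rsp right after "pushq %rax; pushq %rdx". */
struct nmi_entry_stack {
	uint64_t rdx;		/*  0(%rsp): pushed second */
	uint64_t rax;		/*  8(%rsp): pushed first */
	uint64_t rip;		/* 16(%rsp): start of the hardware frame */
	uint64_t cs;		/* 24(%rsp) */
	uint64_t rflags;	/* 32(%rsp) */
	uint64_t rsp;		/* 40(%rsp): old stack top */
	uint64_t ss;		/* 48(%rsp) */
};

#define REDZONE_BYTES	128	/* redzone below the old %rsp */
#define COPIED_WORDS	4	/* %rdx, %rax, %rflags, %rip */

/* Where the nested path builds its small return frame (old %rsp - 160). */
static uint64_t nested_frame_base(uint64_t old_rsp)
{
	return old_rsp - (REDZONE_BYTES + COPIED_WORDS * sizeof(uint64_t));
}

/* %rsp after "popq; popq; popf; ret $128" starting from that base. */
static uint64_t rsp_after_nested_return(uint64_t base)
{
	uint64_t rsp = base + 3 * sizeof(uint64_t);	/* %rdx, %rax, %rflags */

	return rsp + sizeof(uint64_t) + REDZONE_BYTES;	/* ret pops %rip, skips 128 */
}
```

So the three pops plus "ret $128" add 24 + 8 + 128 = 160 bytes, which
puts %rsp back exactly at the old stack top.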

Linus