Re: [PATCH 18/25] [PATCH] turn priviled operations into macros inentry.S

From: Steven Rostedt
Date: Wed Aug 08 2007 - 09:48:34 EST



--
On Wed, 8 Aug 2007, Andi Kleen wrote:

> > If you were talking about the general iretq => INTERRUPT_RETURN, then the
> > answer is "Yes, they are sufficient". The first version of lguest ran the
> > guest kernel in ring 3 (using dual page tables for guest kernel and guest
> > user). The current version I'm pushing runs lguest in ring 1, and the
> > entry.S code worked for both.
>
> How do you implement system calls then?

/me working very hard to get lguest64 ready for public display

Here's a snippet from my version of core.c. I've been thinking of ways to
optimize it, but for now it works fine. This was done for both ring 3 and
ring 1 lguest versions (this is the host running):


/*
* Update the LSTAR to point to the HV syscall handler.
* Also update the fsbase if the guest uses one.
*/
wrmsrl(MSR_LSTAR, (unsigned long)HV_OFFSET(&lguest_syscall_trampoline));

[...]
asm volatile ("pushq %2; pushq %%rsp; pushfq; pushq %3; call *%6;"
/* The stack we pushed is off by 8, due to the
previous pushq */
"addq $8, %%rsp"
: "=D"(foo), "=a"(bar)
: "i" (__KERNEL_DS), "i" (__KERNEL_CS), "0"
(vcpu->vcpu),
"1"(get_idt_table()),
"r" (sw_guest)
: "memory", "cc");

[...]
/* restore old LSTAR */
wrmsrl(MSR_LSTAR, vcpu->host_syscall);



Also in the switcher.S (The Hypervisor):

.global lguest_syscall_trampoline
.type lguest_syscall_trampoline, @function
lguest_syscall_trampoline:
/*
* Tricky, we don't have much to choose from here.
* The only way to get to our stack is with swapgs.
* but we need to save the stack too, so we have to play
* very carefully.
*/
swapgs
/* now gs points to our VCPU Guest Data */

/* first save the stack! */
movq %rsp, %gs:LGUEST_GUEST_DATA_regs_rsp

/*
* x86 arch doesn't have an easy way to find out where
* gs is located. So we need to read the MSR. But first
* we need to save off the rcx, rax and rdx.
*/
movq %rax, %gs:LGUEST_GUEST_DATA_regs_rax
movq %rdx, %gs:LGUEST_GUEST_DATA_regs_rdx
movq %rcx, %gs:LGUEST_GUEST_DATA_regs_rcx

/* Need to read manual, does rdmsr clear
* the top 32 bits of rax? */
xor %rax, %rax

movl $MSR_GS_BASE,%ecx
rdmsr
shl $32, %rdx
orq %rax, %rdx

movq %rdx, %rsp

/* see if we need to disable interrupts */
testq $(1<<9), %gs:LGUEST_GUEST_DATA_SFMASK
jz 1f
movq $0, %gs:LGUEST_GUEST_DATA_irq_enabled
jmp 2f
1:
/* Still need to clear bit 10 (just in case) */
/* (see lguest_iretq) */
movq $(1<<10), %rax
not %rax
andq %rax, %gs:LGUEST_GUEST_DATA_irq_enabled
2:
/* put back the generic regs */
movq %gs:LGUEST_GUEST_DATA_regs_rdx, %rdx
movq %gs:LGUEST_GUEST_DATA_regs_rcx, %rcx
movq %gs:LGUEST_GUEST_DATA_regs_rax, %rax

/* Is this a hypercall? */
testq $1, %gs:LGUEST_GUEST_DATA_is_hc
jnz handle_hcall

/* We have 64 bytes to play with */
addq $LGUEST_GUEST_DATA_regs, %rsp

/* do the swapgs if possible */
testq $1, %gs:LGUEST_GUEST_DATA_do_swapgs
je 1f

/* We now have a stack to use */
/* go back to the guest's gs */
swapgs
DO_SWAPGS_USE_STACK
/* and then back to HV gs */
swapgs
1:
/*
* The stack has 64 bytes of playing room.
* Which is enough to do a jump to the guest kernel.
* We store the guest LSTAR register in the scatch pad
* because we don't care if the guest messes with it.
* If it is a bad address, we fault from the guest side
* and we kill the guest. No harm done to the host.
*/
pushq $(__KERNEL_DS | 1)
pushq %gs:LGUEST_GUEST_DATA_regs_rsp
pushfq
/* Make sure we have actual interrupts on */
orq $(1<<9), 0(%rsp)
pushq $(__KERNEL_CS | 1)
pushq %gs:LGUEST_GUEST_DATA_LSTAR

swapgs
iretq



NOTE: This is still under development, since I'm going with a new design
change to try to stay more in sync with lguest32.

-- Steve

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/