Re: [PATCH RFC WIP] x86/paravirt: add register-saving thunks to reduce caller register pressure

From: Ingo Molnar
Date: Tue Feb 03 2009 - 10:12:52 EST



* Pavel Machek <pavel@xxxxxxx> wrote:

>
> > This patch seeks to alleviate this pressure by introducing wrapper
> > thunks that will do the register saving/restoring, so that the
> > callsite doesn't need to worry about it, but the callee function can
> > be conventional compiler-generated code. In many cases (particularly
> > performance-sensitive cases) the callee will be in assembler anyway,
> > and need not use the compiler's calling convention.
> >
> > Standard calling convention is:
> > arguments return scratch
> > x86-32 eax edx ecx eax ?
>
> esi edi ebp ?

No, they are not scratch [callee-clobbered] registers; they are
callee-saved.

Jeremy's table is incomplete (and incorrect on 64-bit):

| Standard calling convention is:
|          arguments        return  scratch
| x86-32   eax edx ecx      eax     ?
| x86-64   rdi rsi rdx rcx  rax     r8 r9 r10 r11

The correct one is:

x86 function calling convention, 64-bit:
----------------------------------------
arguments             | callee-saved        | extra caller-saved | return
[callee-clobbered]    |                     | [callee-clobbered] |
---------------------------------------------------------------------------
rdi rsi rdx rcx r8 r9 | rbx rbp [*] r12-r15 | r10, r11           | rax [**]

( rsp is obviously invariant across normal function calls. (gcc can 'merge'
functions when it sees tail-call optimization possibilities.) rflags is
clobbered. Leftover arguments are passed on the stack. )

[*] In the frame-pointers case rbp is fixed to the stack frame.

[**] for struct return values wider than 64 bits the return convention is a
bit more complex: up to 128 bits width we return small structures
straight in rax, rdx. For structures larger than that (3 words or
larger) the caller puts a pointer to an on-stack return struct
[allocated in the caller's stack frame] into the first argument - i.e.
into rdi. All other arguments shift up by one in this case.
Fortunately this case is rare in the kernel.
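
To make the two struct-return cases in [**] concrete, here is a minimal
sketch. The type and function names are made up for illustration, and the
register assignments in the comments are what the convention above
predicts, not something C itself can observe; compile with "gcc -O2 -S" and
read the assembly to check them on your compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit words, 128 bits in total: expected back in rax (lo), rdx (hi). */
struct pair {
	uint64_t lo;
	uint64_t hi;
};

/* Three 64-bit words: too wide for rax:rdx, so it goes through memory. */
struct triple {
	uint64_t a;
	uint64_t b;
	uint64_t c;
};

static_assert(sizeof(struct pair) == 16, "pair is exactly two words");
static_assert(sizeof(struct triple) == 24, "triple is exactly three words");

/* 'seed' is expected in rdi, the result in rax:rdx. */
struct pair make_pair(uint64_t seed)
{
	struct pair p = { .lo = seed, .hi = seed + 1 };
	return p;
}

/*
 * Here the caller is expected to pass a hidden pointer to its own
 * on-stack struct triple in rdi, so 'seed' shifts up to rsi.
 */
struct triple make_triple(uint64_t seed)
{
	struct triple t = { .a = seed, .b = seed + 1, .c = seed + 2 };
	return t;
}
```

The C source of the two functions looks identical in shape; only the
generated code differs, which is why the wide-struct case is easy to
introduce by accident.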

As you can see r8 and r9 are regparm arguments for 5 or 6 parameter calls
and not just extra scratch registers. (although they certainly can be used
as such too - all regparm arguments are callee-clobbered on x86)
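
A small sketch of the 5/6 parameter case, with hypothetical function names.
The register mapping in the comments follows the 64-bit table above and is
an expectation about the generated code, not something the C source
enforces.

```c
#include <assert.h>

/* a..f are expected in rdi, rsi, rdx, rcx, r8, r9, in that order. */
long sum6(long a, long b, long c, long d, long e, long f)
{
	return a + b + c + d + e + f;
}

/*
 * A seventh argument no longer fits into the argument registers, so
 * 'g' is expected to be passed on the stack.
 */
long sum7(long a, long b, long c, long d, long e, long f, long g)
{
	return sum6(a, b, c, d, e, f) + g;
}

/* Usage: r8 and r9 carry real arguments (e and f) in both calls. */
void check_sums(void)
{
	assert(sum6(1, 2, 3, 4, 5, 6) == 21);
	assert(sum7(1, 2, 3, 4, 5, 6, 7) == 28);
}
```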

For 32-bit we have the following conventions - the kernel is built with
-mregparm=3 and -freg-struct-return:

x86 function calling convention, 32-bit:
----------------------------------------
arguments          | callee-saved        | extra caller-saved | return
[callee-clobbered] |                     | [callee-clobbered] |
------------------------------------------------------------------------
eax edx ecx        | ebx edi esi ebp [*] | <none>             | eax [**]

( here too esp is obviously invariant across normal function calls. eflags
is clobbered. Leftover arguments are passed on the stack. )

[*] In the frame-pointers case ebp is fixed to the stack frame.

[**] We build with -freg-struct-return, which on 32-bit means similar
semantics as on 64-bit: edx can be used for a second return value
(i.e. covering structure sizes up to 64 bits) - after that it gets
more complex and more expensive: 3-word or larger struct returns get
done in the caller's frame and the pointer to the return struct goes
into regparm0, i.e. eax - the other arguments shift up and the
function's register parameters degenerate to regparm=2 in essence.
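
The 32-bit rules can be tried outside the kernel build with gcc's regparm
function attribute. The sketch below is only an approximation: REGPARM3 and
the function names are made up for this example (they are not kernel
macros), the attribute is compiled away on non-i386 targets where it does
not apply, and the eax/edx struct return additionally assumes the file is
built with -freg-struct-return. The register comments restate the
convention described above rather than anything the C code can verify.

```c
#include <assert.h>
#include <stdint.h>

/* regparm only exists on 32-bit x86; hypothetical helper macro. */
#if defined(__i386__)
# define REGPARM3 __attribute__((regparm(3)))
#else
# define REGPARM3
#endif

/* 64 bits: with -freg-struct-return expected back in eax (lo), edx (hi). */
struct pair32 {
	uint32_t lo;
	uint32_t hi;
};

/* 96 bits, i.e. three words: returned through a hidden pointer. */
struct triple32 {
	uint32_t a;
	uint32_t b;
	uint32_t c;
};

static_assert(sizeof(struct pair32) == 8, "pair32 is two words");
static_assert(sizeof(struct triple32) == 12, "triple32 is three words");

/* a, b, c are expected in eax, edx, ecx. */
REGPARM3 struct pair32 split3(uint32_t a, uint32_t b, uint32_t c)
{
	struct pair32 p = { .lo = a + b, .hi = c };
	return p;
}

/*
 * The hidden return pointer is expected to take eax, so a and b move
 * to edx and ecx and c ends up on the stack: effectively regparm=2.
 */
REGPARM3 struct triple32 pack3(uint32_t a, uint32_t b, uint32_t c)
{
	struct triple32 t = { .a = a, .b = b, .c = c };
	return t;
}
```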

> actually standard calling convention is all arguments on stack iirc but we
> use regparm=3 for kernel...?

Correct, 32-bit x86 gets built with:

KBUILD_CFLAGS += -msoft-float -mregparm=3 -freg-struct-return

Ingo