Re: Proposal for finishing the 64-bit x86 syscall cleanup

From: Jan Beulich
Date: Tue Aug 25 2015 - 03:29:45 EST


>>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> 08/24/15 11:14 PM >>>
>Thing 1: partial pt_regs
>
>64-bit fast path syscalls don't fully initialize pt_regs: bx, bp, and
>r12-r15 are uninitialized. Some syscalls require them to be
>initialized, and they have special awful stubs to do it. The entry
>and exit tracing code (except for phase1 tracing) also need them
>initialized, and they have their own messy initialization. Compat
>syscalls are their own private little mess here.
>
>This gets in the way of all kinds of cleanups, because C code can't
>switch between the full and partial pt_regs states.
>
>I can see two ways out. We could remove the optimization entirely,
>which consists of pushing and popping six more registers and adds
>about ten cycles to fast path syscalls on Sandy Bridge. It also
>simplifies and presumably speeds up the slow paths.
>
>We could also annotate which syscalls need full regs and jump to the
>slow path for them. This would leave the fast path unchanged (we
>could duplicate the syscall table so that regs-requiring syscalls
>would turn into some asm that switches to the slow path). We'd make
>the syscall table say something like:
>
>59 64 execve sys_execve:regs
>
>The fast path would have exactly identical performance and the slow
>path would presumably speed up. The down side would be additional
>complexity.

Namely: would this be any better than the current "special awful" stubs?
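
For concreteness, here is a minimal sketch of how a table-processing step
might recognize the proposed ":regs" suffix on an entry point. It is only an
illustration of the annotation idea: the struct, the function name
parse_syscall_line(), and the field sizes are all hypothetical, and this is
not how the kernel actually processes its syscall table.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of one syscall table line. */
struct syscall_entry {
	int  nr;               /* syscall number, e.g. 59 */
	char abi[16];          /* ABI column, e.g. "64" */
	char name[32];         /* syscall name, e.g. "execve" */
	char entry[64];        /* entry point with any ":flag" suffix removed */
	bool needs_full_regs;  /* true if the entry carried ":regs" */
};

/*
 * Parse a line such as "59 64 execve sys_execve:regs".
 * Returns 0 on success, -1 if the line does not have four fields
 * or carries a suffix other than ":regs".
 */
static int parse_syscall_line(const char *line, struct syscall_entry *out)
{
	char entry[64];
	char *flag;

	if (sscanf(line, "%d %15s %31s %63s",
		   &out->nr, out->abi, out->name, entry) != 4)
		return -1;

	out->needs_full_regs = false;
	flag = strchr(entry, ':');
	if (flag) {
		*flag = '\0';                    /* cut the suffix off the symbol */
		if (strcmp(flag + 1, "regs") != 0)
			return -1;               /* unknown annotation */
		out->needs_full_regs = true;
	}
	strcpy(out->entry, entry);               /* fits: both buffers are 64 bytes */
	return 0;
}
```

With something like this, an entry flagged needs_full_regs could be emitted
into the fast-path table as a stub that diverts to the slow path, while
unflagged entries keep their direct pointer. Either way a per-syscall
distinction remains, which is what the question above is about.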

>Thing 2: vdso compilation with binutils that doesn't support .cfi directives
>
>Userspace debuggers really like having the vdso properly
>CFI-annotated, and the 32-bit fast syscall entries are annotated
>manually in hexadecimal. AFAIK Jan Beulich is the only person who
>understands it.
>
>I want to be able to change the entries a little bit to clean them up
>(and possibly rework the SYSCALL32 and SYSENTER register tricks, which
>currently suck), but it's really, really messy right now because of
>the hex CFI stuff. Could we just drop the CFI annotations if the
>binutils version is too old or even just require new enough binutils
>to build 32-bit and compat kernels?

I think that's a reasonable thing - iirc the oldest binutils I'm building with
(SLE10 i.e. 2.16.91-ish) support them, and I'd suppose the equally old
RHEL's binutils do too. Not sure if there are any other long maintained
distros that might carry even older binutils.

Jan
