Re: FSGSBASE ABI considerations

From: Andy Lutomirski
Date: Mon Jul 31 2017 - 10:15:25 EST


On Sun, Jul 30, 2017 at 9:38 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Sun, Jul 30, 2017 at 8:05 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>
>> This means that, when gdb saves away a regset and reloads it using
>> PTRACE_SETREGS or similar, the effect is to load gs_base and then load
>> gs. If gs != 0, this will blow away gs_base. Without FSGSBASE, this
>> doesn't matter so much. With FSGSBASE, it means that using gdb to do,
>> say, 'print func()' may corrupt gsbase.
>>
>> What, if anything, should we do about this? One option would be to
>> make gs_base be accurate all the time (it currently isn't) and teach
>> PTRACE_SETREGS to restore in the opposite order despite the struct
>> layout.
>
> I do not think that ordering should ever matter. If it does, it means
> that you've misdesigned something. We already screwed that up with the
> msr interface, can we try to not do it again?
>
> Could we perhaps do something like:
>
> - every process starts out with CR4.FSGSBASE cleared
>
> - if we get an #UD due to the process using the {rd|wr}{gs|fs}base
> instructions, we enable FSGSBASE and mark the process as using those
> instructions.
>
> - once a process is marked as FSGSBASE, the kernel prioritizes
> FSGSBASE. We'll still save/restore the selector too, but every time we
> restore the selector, we will first do a rd*base, and then do a
> wr*base afterwards
>
> IOW, the "selector" ends up being meaningless after people have used
> fsgsbase. It is saved and restored as a _value_, but it has no effect
> what-so-ever on the actual base pointer.
>
> Yes, it's modal, but at least you don't end up in some situation where
> it matters whether you write the selector first or not.
>
> Hmm?

I hadn't thought of that approach. I have three very different objections.

- The only reason I think that FSGSBASE is worth supporting at all is
that it provides a fairly dramatic speedup to context switches by
getting rid of the awful serializing WRMSR. I tend to consider the
actual exposure of the instructions to userspace to be more trouble
than it's worth. But, with your approach, we may only get the speedup
when running SPECJava Environmentally Friendly Threads, and we'll lose
it again due to all the CR4 writes, and that would make me want to
just drop the whole thing.

- The modal approach makes the modify_ldt() consistency issue go
away, but it doesn't help with ptrace, I think, because, with ptrace,
we care about the debugger, not the debuggee.

- glibc will probably be daft and start using WRGSBASE instead of
arch_prctl and this whole idea may become irrelevant.

All that being said, we might be able to get away with treating the
selector and the base totally separately no matter what. I've
searched a bit, and I haven't come up with anything that needs
modify_ldt() to behave synchronously, presumably because its behavior
used to be so utterly erratic that user code always had to follow
modify_ldt() by an explicit segment write. The only ptrace users that
I've spotted that do anything more complicated than reading the state
and writing it back out the same way they found it are things like
gdb's 'print $gs = 43', and I find it hard to believe that there are
gdb scripts that do that and need to be supported for compatibility.
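
To make the ordering problem concrete, here is a rough userspace toy
model, not kernel code. Every name in it (toy_seg, toy_descriptor_base,
toy_load_sel_coupled, toy_load_sel_keep_base, toy_restore) is made up
for illustration, and the selector-to-base mapping is arbitrary. It
assumes a regset restore that writes the base first and the selector
second, and compares a selector write that reloads the base with one
that does rd*base / load selector / wr*base as Linus sketched.

```c
#include <stdint.h>

/* Hypothetical toy model of one segment register (think GS). */
struct toy_seg {
	uint16_t sel;	/* selector value, e.g. 43 */
	uint64_t base;	/* base address, e.g. what WRGSBASE would set */
};

/* Made-up stand-in for the base a descriptor table entry would supply. */
static uint64_t toy_descriptor_base(uint16_t sel)
{
	return (uint64_t)sel << 12;	/* arbitrary mapping, illustration only */
}

/* Coupled: loading a nonzero selector also reloads the base. */
static void toy_load_sel_coupled(struct toy_seg *s, uint16_t sel)
{
	s->sel = sel;
	if (sel != 0)
		s->base = toy_descriptor_base(sel);
}

/* Decoupled: save the base, load the selector, then put the base back. */
static void toy_load_sel_keep_base(struct toy_seg *s, uint16_t sel)
{
	uint64_t saved_base = s->base;	/* models rd*base */

	toy_load_sel_coupled(s, sel);
	s->base = saved_base;		/* models wr*base */
}

/* Restore a saved regset in struct order: base first, selector second. */
static struct toy_seg toy_restore(struct toy_seg saved, int coupled)
{
	struct toy_seg live = { 0, 0 };

	live.base = saved.base;
	if (coupled)
		toy_load_sel_coupled(&live, saved.sel);
	else
		toy_load_sel_keep_base(&live, saved.sel);
	return live;
}
```

In this model, restoring { sel = 43, base = 0x7f0000001000 } with
coupled != 0 ends up with the made-up descriptor base instead of the
saved one, while coupled == 0 gets the saved base back regardless of
the order in which the two fields are written.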

--Andy