Re: FSGSBASE ABI considerations

From: Andy Lutomirski
Date: Mon Aug 07 2017 - 15:08:13 EST


On Mon, Aug 7, 2017 at 10:35 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Aug 7, 2017 at 9:20 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>
>> Windows does something sort of like this (I think), but I don't like
>> this solution. I fully expect that someone will write a program that
>> does:
>>
>> old = rdgsbase();
>> wrgsbase(new);
>> call_very_fast_function();
>> wrgsbase(old);
>>
>> This will work if GS == 0, which is fine. The problem is that it will
>> *also* work if GS != 0 with very high probability, especially if this
>> code sequence is right after some operation that sleeps. And then
>> we'll get random crashes with very low probability, depending on where
>> the scheduler hits.
>
> It will work reliably if you just make the scheduler save/restore the
> base rather than the selector.
>
> I really think you need to walk away from the "selector is meaningful"
> model. Yes, yes, it's the legacy model, but it's the *insane* model.
>
> So screw the selector. It doesn't matter. We'll need to save/restore
> the value, but that's it. What we *really* save and restore is just
> the base pointer.
>
> Why do you care so much about the selector? If people *don't* use the
> fsgsbase, then the selector and the base of the segment will always
> match anyway (modulo the system calls that actually change the
> gdt/ldt, and we can just say that *then* selectors matter).
>
> And if people *do* use fsgsbase, then the selector is by definition
> not important.
>
> So just make the scheduler save the base first, and restore it last.
> End of problem. Your user-space code above just works. There is no
> race, it doesn't matter one whit whether GS is 0 or not, there simply
> is no problem.

I agree completely. The scheduler should do exactly this and, with my
patches applied, it does.
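
To make the "save the base first, restore it last" point concrete, here
is a small user-space model of the rdgsbase/wrgsbase sequence quoted
above being preempted. It is a sketch, not kernel code: struct seg,
DESCRIPTOR_BASE, load_selector(), switch_out() and both switch_in_*()
helpers are hypothetical names invented for this illustration, the
descriptor table is reduced to a single base value, and the "legacy"
policy is a simplified assumption about a selector-driven scheduler.

```c
#include <stdint.h>

/* Toy model of one segment register: a selector plus its hidden base. */
struct seg {
    uint16_t sel;
    uint64_t base;
};

/* Hypothetical stand-in for the GDT/LDT: every nonzero selector maps here. */
#define DESCRIPTOR_BASE 0x1000ULL

static struct seg cpu_gs;   /* simulated live GS state */

/* Loading a selector refreshes the hidden base from the descriptor
 * (the null selector gives base 0 in this model). */
static void load_selector(uint16_t sel)
{
    cpu_gs.sel = sel;
    cpu_gs.base = sel ? DESCRIPTOR_BASE : 0;
}

/* Model of WRGSBASE: changes the base only, the selector is untouched. */
static void wrgsbase(uint64_t base)
{
    cpu_gs.base = base;
}

/* Context switch out: record both values, base first. */
static void switch_out(struct seg *saved)
{
    saved->base = cpu_gs.base;
    saved->sel = cpu_gs.sel;
}

/* Assumed selector-driven policy: the saved base is only trusted when
 * the selector is 0; otherwise the descriptor's base wins. */
static void switch_in_legacy(const struct seg *saved)
{
    load_selector(saved->sel);
    if (saved->sel == 0)
        wrgsbase(saved->base);
}

/* Base-driven policy: reload the selector, then write the saved base last. */
static void switch_in_base_last(const struct seg *saved)
{
    load_selector(saved->sel);
    wrgsbase(saved->base);
}

/* Do wrgsbase(new_base), get preempted, and report the base seen on resume. */
static uint64_t base_after_preemption(uint16_t sel, uint64_t new_base,
                                      int base_last)
{
    struct seg saved;

    load_selector(sel);
    wrgsbase(new_base);
    switch_out(&saved);
    load_selector(0);           /* another task runs with a null GS */
    if (base_last)
        switch_in_base_last(&saved);
    else
        switch_in_legacy(&saved);
    return cpu_gs.base;
}
```

In this model the selector-driven policy keeps the new base only when
GS == 0, while the base-last policy keeps it for any selector, which is
the behavior being agreed on here.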

>
> So just what is the problem you're trying to solve?
>

I'm trying to avoid a situation where we implement that policy and the
interaction with modify_ldt() becomes very strange. Linux has a long
history of having ill-defined semantics on x86_64, and I don't want to
make it worse.

If we *just* change the way the scheduler works, then we end up with
modify_ldt() behaving deterministically on IVB+ and behaving
deterministically on 32-bit kernels, but having that deterministic
behavior be *different*. This makes me rather unhappy about the whole
situation.
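
One way the two deterministic behaviors could diverge is sketched below
as a toy model. The details are assumptions made for illustration, not
statements about the actual kernels: it assumes a 32-bit-style context
switch reloads only the selector, that a scheduler-only change writes
the saved base last, and that rewriting a descriptor does not touch the
live hidden base. toy_modify_ldt(), ldt_entry_base and the other names
are hypothetical.

```c
#include <stdint.h>

/* Hypothetical one-entry descriptor table standing in for the LDT. */
static uint64_t ldt_entry_base;

/* Toy model of one segment register: a selector plus its hidden base. */
struct seg {
    uint16_t sel;
    uint64_t base;
};

static struct seg cpu_gs;   /* simulated live GS state */

/* Loading a selector refreshes the hidden base from the descriptor. */
static void load_selector(uint16_t sel)
{
    cpu_gs.sel = sel;
    cpu_gs.base = sel ? ldt_entry_base : 0;
}

/* Stand-in for modify_ldt(): rewrites the descriptor only; the hidden
 * base already cached in the segment register is left alone. */
static void toy_modify_ldt(uint64_t new_base)
{
    ldt_entry_base = new_base;
}

/* Change the descriptor while its selector is loaded, go through one
 * context switch, and report the base the task ends up with.
 * restore_base_last == 0: selector-only restore (assumed 32-bit style).
 * restore_base_last == 1: saved base written last (scheduler-only change). */
static uint64_t base_after_ldt_change(int restore_base_last)
{
    struct seg saved;

    ldt_entry_base = 0x1000;
    load_selector(0x7);         /* an LDT selector; base is now 0x1000 */
    toy_modify_ldt(0x2000);     /* descriptor changes underneath us */

    saved = cpu_gs;             /* switch out */
    load_selector(0);           /* another task runs */
    load_selector(saved.sel);   /* switch back in */
    if (restore_base_last)
        cpu_gs.base = saved.base;

    return cpu_gs.base;
}
```

Under these assumptions both policies are fully deterministic, yet the
selector-only restore picks up the new descriptor base and the base-last
restore keeps the old one.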

Also, I don't want to break gdb, and even telling whether a change
breaks gdb is an incredible PITA. When gdb saves and restores a
context, it currently restores the base first and the selector second,
and I have no idea whether gdb expects restoring the selector to
update the base.
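
The ordering question can be sketched with a toy model. Nothing here is
gdb's or ptrace's real interface: set_base(), set_selector() and
DESCRIPTOR_BASE are hypothetical, and the refresh_base flag simply
switches between the two possible kernel semantics for a selector
write.

```c
#include <stdint.h>

/* Hypothetical descriptor base that any nonzero selector refers to. */
#define DESCRIPTOR_BASE 0x1000ULL

/* Toy model of the tracee's segment state. */
struct seg {
    uint16_t sel;
    uint64_t base;
};

/* Debugger writes the base directly. */
static void set_base(struct seg *s, uint64_t base)
{
    s->base = base;
}

/* Debugger writes the selector. If refresh_base is nonzero, the write
 * also reloads the base from the descriptor, as a real segment load
 * would; otherwise the base is left as it was. */
static void set_selector(struct seg *s, uint16_t sel, int refresh_base)
{
    s->sel = sel;
    if (refresh_base)
        s->base = sel ? DESCRIPTOR_BASE : 0;
}

/* Restore a saved context in the order described above:
 * base first, selector second. Returns the resulting base. */
static uint64_t restore_base_then_selector(uint16_t saved_sel,
                                           uint64_t saved_base,
                                           int refresh_base)
{
    struct seg tracee = { 0, 0 };

    set_base(&tracee, saved_base);
    set_selector(&tracee, saved_sel, refresh_base);
    return tracee.base;
}
```

With a nonzero saved selector, the restored base survives only if the
selector write leaves the base alone; if the selector write refreshes
the base, the value written first is silently replaced.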