Re: [REGRESSION] x86/entry: TIF_SINGLESTEP handling is still broken

From: Andy Lutomirski
Date: Sun Jan 31 2021 - 18:39:12 EST




> On Jan 31, 2021, at 2:57 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, Jan 31, 2021 at 2:20 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>
>> A smallish test that we could stick in selftests would be great if that’s straightforward.
>
> Side note: it would be good to have a test-case for the interaction
> with the "block step" code too.
>
> I hate our name for it ("block step"?), but it modifies TF, to only
> trap on taken branches, not after each instruction.
>
> So there's all these things that interact:
>
> - you can set TF yourself in user space with 'popf' and get a debug
> signal (not after the popf, but after the instruction _after_ popf,
> iirc)
>
> - you can set TF as a debugger and that's basically what
> TIF_SINGLESTEP fundamentally means
>
> - we have TIF_FORCED_TF, which says whether TF was set by ptrace, or
> whether it was already set independently by the application before
> ptrace, so that we can know whether to clear it or keep it set *after*
> the single-step.
>
> - you can then *modify* the behavior of TF to trap only on control
> flow changes (and we use TIF_BLOCKSTEP to specify that behavior)
>
> - and there's also obviously some very subtle and unclear rules about
> when system call instructions cause debug traps
>
> The basic TF behavior is fairly simple: it caused #DB, and we send a signal.

This is all fundamentally impossible to do fully correctly because a program can use PUSHF to read TF, and there is only one TF bit, and the app and the debugger can fight over it. The insn breakpoint mechanism is much better, but even AMD CPUs can’t (I think) be programmed to breakpoint the entire user address space. So we do our best to fudge it.

>
> The "app set TF _itself_, and we want to debug across that event" is a
> lot more interesting, but it's unclear whether anybody does it. It's
> really just a "we want to be able to debug that case too", and
> TIF_FORCED_TF means that it should be possible.
>
> I didn't test that it works, though. Sounds worth a test-case?
>

I can look. We do have tests for apps setting TF with no debugger attached.

> The TIF_BLOCKSTEP thing changes no other logic, but basically sets the
> bit in the MSR that modifies just when TF traps. I may hate the name,
> but I think it works.
>

It has certainly been busted in the past in corner cases. I don’t think we have tests.

> The odd system call tracing part I have no idea who depends on it
> (apparently "rr", which I assume is some replay thing), and I suspect
> our semantics for it has been basically random historical one, and
> it's apparently what changed.
>
> That's the one that we _really_ should have a test-case for, along
> with some documentation and code comment what the actual semantics
> need to be so that we don't break it again.

This rr thing may be tangled up with the nonsense semantics of SYSRET. I’ll muck around with Kyle’s test and try to figure out what broke.

I’m guessing the issue is that we are correctly setting TF in the EFLAGS image, but IRET helpfully only traps after the first user insn executes, which isn’t what the tracer is expects.

>
> Linus