Re: [PATCH v14 06/14] arch/x86: enable task isolation functionality

From: Andy Lutomirski
Date: Wed Aug 10 2016 - 15:18:42 EST


On Aug 10, 2016 5:30 PM, "Chris Metcalf" <cmetcalf@xxxxxxxxxxxx> wrote:
>
> On 8/10/2016 3:52 AM, Andy Lutomirski wrote:
>>
>> On Aug 9, 2016 11:30 PM, "Chris Metcalf" <cmetcalf@xxxxxxxxxxxx> wrote:
>>> @@ -91,6 +92,15 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>>  	if (emulated)
>>>  		return -1L;
>>>
>>> +	/* In isolation mode, we may prevent the syscall from running. */
>>> +	if (work & _TIF_TASK_ISOLATION) {
>>> +		if (task_isolation_syscall(regs->orig_ax) == -1) {
>>> +			regs->orig_ax = -1;
>>> +			return 0;
>>> +		}
>>> +		work &= ~_TIF_TASK_ISOLATION;
>>> +	}
>>> +
>> What is this? It's not mentioned in the changelog. It seems
>> nonsensical to me. If nothing else, you forgot to update regs->ax,
>> but I don't even know what you're trying to do.
>
>
> It's mentioned in the changelog as "Fixes a bug in x86 syscall_trace_enter()
> [seen by Francis Giraldeau]." To be fair, I didn't hear back from Francis, and
> you're right, this doesn't look like it makes any sense now. (I've added him
> to the cc's on this email; for this series I had just put him on the cover letter.)
>
> I modeled this code on a snippet from the old two-phase syscall entry work:
>
>     if (ret == SECCOMP_PHASE1_SKIP) {
>         regs->orig_ax = -1;
>         ret = 0;
>     }
>
> You got rid of this during the 4.7-rc series, but my code above was at least
> plausibly valid until then :-)
>
> Regardless, I assume that the right thing for that condition to do now when
> it triggers is to set regs->ax = -ENOSYS and return -1L? I'll update the
> git repository with that in any case.

regs->ax will already be -ENOSYS unless something changed it, but I'm
not sure what this code is trying to do. Is the idea that
task_isolation_syscall might enqueue a signal and you want to deliver
it without processing the syscall? If so, a comment would be nice.
You could even WARN_ON(!signal_pending()).
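
A minimal userspace sketch of that reading, not the actual patch. The
mock_* names are hypothetical stand-ins, and the assumption that the
isolation hook queues a signal whenever it returns -1 is exactly the
point in question above. The plain assert() stands in for
WARN_ON(!signal_pending(current)).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two pt_regs fields discussed above. */
struct mock_regs {
	long orig_ax;	/* syscall number on entry */
	long ax;	/* return value; entry code presets it to -ENOSYS */
};

/* Hypothetical stand-in for signal_pending(current). */
static bool mock_signal_pending;

/*
 * Hypothetical stand-in for task_isolation_syscall(): assumed to queue
 * a signal for the task and return -1 when the syscall must not run.
 */
static int mock_task_isolation_syscall(long nr)
{
	(void)nr;
	mock_signal_pending = true;
	return -1;
}

/* Returns the syscall number to run, or -1L to skip the syscall. */
static long mock_syscall_trace_enter(struct mock_regs *regs, bool isolated)
{
	if (isolated && mock_task_isolation_syscall(regs->orig_ax) == -1) {
		/*
		 * A signal was queued; skip the syscall so the signal is
		 * delivered on the way back to user mode. regs->ax is left
		 * alone because it already holds -ENOSYS.
		 */
		assert(mock_signal_pending);
		return -1L;
	}
	return regs->orig_ax;
}
```

With regs->ax preset to -ENOSYS by the caller, the skip path returns
-1L without touching it, and the non-isolated path returns the
original syscall number.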

>
> Thanks!
>
>
>>> #ifdef CONFIG_SECCOMP
>>> /*
>>> * Do seccomp after ptrace, to catch any tracer changes.
>>> @@ -136,7 +146,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>>
>>>  #define EXIT_TO_USERMODE_LOOP_FLAGS \
>>>  	(_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
>>> -	 _TIF_NEED_RESCHED | _TIF_USER_RETURN_NOTIFY)
>>> +	 _TIF_NEED_RESCHED | _TIF_USER_RETURN_NOTIFY | _TIF_TASK_ISOLATION)
>>>
>> Where are you updating the conditions to force use of the slow path?
>> (That's _TIF_ALLWORK_MASK.)
>
>
> Whenever _TIF_TASK_ISOLATION is set, _TIF_NOHZ is also set.

OK, but why not decouple it a bit and add it to the mask? I keep
meaning to add a BUILD_BUG_ON checking for bits in
EXIT_TO_USERMODE_LOOP_FLAGS that aren't in the appropriate slow path
masks.
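
A minimal userspace sketch of such a check, under stated assumptions:
the X_ names and bit values are made up for illustration (the real
definitions live in the x86 thread_info header and differ), and the
BUILD_BUG_ON here is a C11 _Static_assert stand-in for the kernel
macro.

```c
#include <assert.h>

/* Made-up bit positions; X_ marks them as stand-ins, not kernel values. */
#define X_TIF_SIGPENDING		(1u << 0)
#define X_TIF_NOTIFY_RESUME		(1u << 1)
#define X_TIF_UPROBE			(1u << 2)
#define X_TIF_NEED_RESCHED		(1u << 3)
#define X_TIF_USER_RETURN_NOTIFY	(1u << 4)
#define X_TIF_TASK_ISOLATION		(1u << 5)
#define X_TIF_NOHZ			(1u << 6)

#define X_EXIT_TO_USERMODE_LOOP_FLAGS \
	(X_TIF_SIGPENDING | X_TIF_NOTIFY_RESUME | X_TIF_UPROBE | \
	 X_TIF_NEED_RESCHED | X_TIF_USER_RETURN_NOTIFY | X_TIF_TASK_ISOLATION)

/* Slow-path mask with the isolation bit added explicitly (decoupled). */
#define X_TIF_ALLWORK_MASK \
	(X_EXIT_TO_USERMODE_LOOP_FLAGS | X_TIF_NOHZ)

/* Userspace stand-in for the kernel's BUILD_BUG_ON(). */
#define BUILD_BUG_ON(cond) _Static_assert(!(cond), "BUILD_BUG_ON: " #cond)

/* Fails the build if any loop flag would not force the slow path. */
BUILD_BUG_ON(X_EXIT_TO_USERMODE_LOOP_FLAGS & ~X_TIF_ALLWORK_MASK);

/* Runtime form of the same check: loop flags absent from a given mask. */
static unsigned int loop_flags_missing_from(unsigned int slow_path_mask)
{
	return X_EXIT_TO_USERMODE_LOOP_FLAGS & ~slow_path_mask;
}
```

If the isolation bit were dropped from the slow-path mask, the static
check would break the build instead of depending on another flag such
as the NOHZ bit always being set alongside it.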

>
> --
> Chris Metcalf, Mellanox Technologies
> http://www.mellanox.com
>