Re: [RFC PATCH 1/5] x86: introduce preemption disable prefix

From: Peter Zijlstra
Date: Fri Oct 19 2018 - 04:20:11 EST


On Thu, Oct 18, 2018 at 09:29:39PM -0700, Andy Lutomirski wrote:

> > Another example is __BPF_PROG_RUN_ARRAY(), which also uses
> > preempt_enable_no_resched().
>
> Alexei, I think this code is just wrong. Do you know why it uses
> preempt_enable_no_resched()?

Yes, that's a straight up bug.

It looks like I need to go fix up abuse again :/

> Oleg, the code in kernel/signal.c:
>
> preempt_disable();
> read_unlock(&tasklist_lock);
> preempt_enable_no_resched();
> freezable_schedule();
>

The purpose here is to avoid back-to-back schedule() calls, and this
pattern is one of the few correct uses of preempt_enable_no_resched().

Suppose a preemption request comes in while we hold the read_lock().
When we then do read_unlock(), preempt_count drops to 0 and we
reschedule; when we get back we instantly call into schedule() _again_.

What this code does is increment preempt_count so that read_unlock()
doesn't hit 0 and doesn't call schedule(); we then lower it to 0 without
a call to schedule(), and finally call schedule() explicitly.
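
To make the counting concrete, here is a toy userspace model of the two
sequences. It is only a sketch, not kernel code: every sim_* name is
made up for illustration, read_lock()/read_unlock() are reduced to
their preempt_count side effect (which assumes a configuration where
the lock disables preemption), and "preemption requested" is modelled
as a plain flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all sim_* names are hypothetical, not kernel APIs. */
struct sim_cpu {
	int  preempt_count;   /* > 0 means preemption is disabled */
	bool need_resched;    /* a preemption request is pending */
	int  schedule_calls;  /* how often we entered the scheduler */
};

static void sim_schedule(struct sim_cpu *c)
{
	c->schedule_calls++;
	c->need_resched = false;
}

static void sim_preempt_disable(struct sim_cpu *c)
{
	c->preempt_count++;
}

/* Models preempt_enable(): reschedule if we hit 0 with a request pending. */
static void sim_preempt_enable(struct sim_cpu *c)
{
	assert(c->preempt_count > 0);
	if (--c->preempt_count == 0 && c->need_resched)
		sim_schedule(c);
}

/* Models preempt_enable_no_resched(): drop the count, never reschedule. */
static void sim_preempt_enable_no_resched(struct sim_cpu *c)
{
	assert(c->preempt_count > 0);
	c->preempt_count--;
}

/* The lock is modelled purely as its effect on preempt_count. */
static void sim_read_lock(struct sim_cpu *c)   { sim_preempt_disable(c); }
static void sim_read_unlock(struct sim_cpu *c) { sim_preempt_enable(c); }

/* Plain unlock followed by an explicit sleep. */
int sim_naive_unlock_then_sleep(void)
{
	struct sim_cpu c = {0};

	sim_read_lock(&c);
	c.need_resched = true;  /* preemption requested under the lock */
	sim_read_unlock(&c);    /* count hits 0: first schedule */
	sim_schedule(&c);       /* explicit sleep: second schedule */
	return c.schedule_calls;
}

/* The kernel/signal.c pattern quoted above. */
int sim_pinned_unlock_then_sleep(void)
{
	struct sim_cpu c = {0};

	sim_read_lock(&c);
	c.need_resched = true;
	sim_preempt_disable(&c);            /* count 1 -> 2 */
	sim_read_unlock(&c);                /* count 2 -> 1, no schedule */
	sim_preempt_enable_no_resched(&c);  /* count 1 -> 0, no schedule */
	sim_schedule(&c);                   /* the only schedule */
	return c.schedule_calls;
}
```

In this model sim_naive_unlock_then_sleep() returns 2 and
sim_pinned_unlock_then_sleep() returns 1, which is the back-to-back
schedule() the pattern is there to avoid.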