Re: [BUG] Race between policy reload sidtab conversion and live conversion

From: Paul Moore
Date: Mon Mar 01 2021 - 09:48:27 EST


On Mon, Mar 1, 2021 at 5:36 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> On Sun, Feb 28, 2021 at 8:21 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > On Fri, Feb 26, 2021 at 6:12 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> > > On Fri, Feb 26, 2021 at 2:07 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > > > On Wed, Feb 24, 2021 at 4:35 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:

...

> > Ah, yes, you're right. I was only thinking about the problem of
> > adding an entry to the old sidtab, and not the (much more likely case)
> > of an entry being added to the new sidtab. Bummer.
> >
> > Thinking aloud for a moment - what if we simply refused to add new
> > sidtab entries if the task's sidtab pointer is "old"? Common sense
> > would tell us that this scenario should be very rare at present, and I
> > believe the testing mentioned in this thread adds some weight to that
> > claim. After all, this only affects tasks which entered into their
> > RCU protected session prior to the policy load RCU sync *AND* are
> > attempting to add a new entry to the sidtab. That *has* to be a
> > really low percentage, especially on a system that has been up and
> > running for some time. My gut feeling is this should be safe as well;
> > all of the calling code should have the necessary error handling in
> > place as there are plenty of reasons why we could normally fail to add
> > an entry to the sidtab; memory allocation failures being the most
> > obvious failure point I would suspect. The obvious downside to such
> > an approach is that those operations which do meet these criteria would
> > fail - and we should likely emit an error in this case - but is this
> > failure really worse than any other transient kernel failure,
>
> No, I don't like this approach at all. Before the sidtab refactor, it
> had been done exactly this way ...

I recognize I probably haven't made my feelings about reverts clear,
or if I have, I haven't done so recently. Let me fix that now: I
*hate* them. Further, I hate reverts with a deep, passionate hatred
that I reserve for very few things. Maybe we do have to revert this
change; as much as I *hate* reverts, they do remain an option. You
just need to be 99% sure you've exhausted all the other options first.

> Perhaps it wasn't clear from what I wrote, but I certainly don't want
> to abandon it completely. Just to revert to a safe state until we
> figure out how to do the RCU policy reload safely. The solution with
> two-way conversion seems doable, it's just not a quick and easy fix.

I suggest pursuing the two-way conversion before the revert to see
what it looks like; we can discuss it further during review.
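
For illustration only, the "refuse to add new entries when the sidtab
pointer is old" idea quoted above could be sketched roughly as below.
This is a user-space model in C, not the SELinux code: struct
model_sidtab, its "stale" flag, and model_context_to_sid() are all
hypothetical names, and the real sidtab has a different layout and its
own locking, none of which is modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MODEL_MAX_ENTRIES 8

/* Hypothetical stand-in for a sidtab; NOT the kernel structure. */
struct model_sidtab {
	const char *ctx[MODEL_MAX_ENTRIES];
	unsigned int count;
	int stale;	/* assumed flag: set once a policy reload supersedes this table */
};

/*
 * Resolve ctx to a "SID" (here simply the entry index) through *sid.
 * Entries that already exist are always served.  Adding a new entry
 * is refused with -ESTALE when the table is stale, and with -ENOMEM
 * when the table is full.
 */
int model_context_to_sid(struct model_sidtab *s, const char *ctx,
			 unsigned int *sid)
{
	unsigned int i;

	assert(s && ctx && sid);

	/* Lookups of existing entries work even on a stale table. */
	for (i = 0; i < s->count; i++) {
		if (strcmp(s->ctx[i], ctx) == 0) {
			*sid = i;
			return 0;
		}
	}

	/* Only the "add a new entry" path is refused on a stale table. */
	if (s->stale)
		return -ESTALE;

	if (s->count == MODEL_MAX_ENTRIES)
		return -ENOMEM;

	s->ctx[s->count] = ctx;
	*sid = s->count++;
	return 0;
}
```

In this model a caller holding a stale table can still resolve contexts
it already knows; only a request for a brand-new entry gets -ESTALE,
which the caller would have to treat like any other transient failure
such as -ENOMEM.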

--
paul moore
www.paul-moore.com