Re: [PATCH] sched/psi: fix use-after-free in ep_remove_wait_queue()

From: Suren Baghdasaryan
Date: Mon Feb 13 2023 - 18:50:46 EST


On Thu, Feb 9, 2023 at 11:13 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Thu, Feb 9, 2023 at 10:46 AM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> >
> > On Thu, Feb 09, 2023 at 09:09:03AM -0800, Suren Baghdasaryan wrote:
> > > On Thu, Feb 2, 2023 at 1:11 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Feb 1, 2023 at 8:56 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Feb 01, 2023 at 07:00:23PM -0800, Munehisa Kamata wrote:
> > > > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > > > > > index 8ac8b81bfee6..6e66c15f6450 100644
> > > > > > --- a/kernel/sched/psi.c
> > > > > > +++ b/kernel/sched/psi.c
> > > > > > @@ -1343,10 +1343,11 @@ void psi_trigger_destroy(struct psi_trigger *t)
> > > > > >
> > > > > > group = t->group;
> > > > > > /*
> > > > > > - * Wakeup waiters to stop polling. Can happen if cgroup is deleted
> > > > > > - * from under a polling process.
> > > > > > + * Wakeup waiters to stop polling and clear the queue to prevent it from
> > > > > > + * being accessed later. Can happen if cgroup is deleted from under a
> > > > > > + * polling process otherwise.
> > > > > > */
> > > > > > - wake_up_interruptible(&t->event_wait);
> > > > > > + wake_up_pollfree(&t->event_wait);
> > > > > >
> > > > > > mutex_lock(&group->trigger_lock);
> > > > >
> > > > > wake_up_pollfree() should only be used in extremely rare cases. Why can't the
> > > > > lifetime of the waitqueue be fixed instead?
> > > >
> > > > waitqueue lifetime in this case is linked to cgroup_file_release(),
> > > > which seems appropriate to me here. Unfortunately
> > > > cgroup_file_release() is not directly linked to the file's lifetime.
> > > > For more details see:
> > > > https://lore.kernel.org/all/CAJuCfpFZ3B4530TgsSHqp5F_gwfrDujwRYewKReJru==MdEHQg@xxxxxxxxxxxxxx/#t
> > > > So, if we want to fix the lifetime of the waitqueue, we would have to
> > > > tie cgroup_file_release() to the fput() somehow. IOW, the fix would
> > > > have to be done at the cgroups or higher (kernfs?) layer.
> > >
> > > Hi Eric,
> > > Do you still object to using wake_up_pollfree() for this case?
> > > Changing higher levels to make cgroup_file_release() be tied to fput()
> > > would be ideal but I think that would be a big change for this one
> > > case. If you agree I'll Ack this patch.
> > > Thanks,
> > > Suren.
> > >
> >
> > I haven't read the code closely in this case. I'm just letting you know that
> > wake_up_pollfree() is very much a last-resort option for when the waitqueue
> > lifetime can't be fixed.
>
> Got it. Thanks for the warning.
> I think it can be fixed, but the right fix would require a sizable
> higher-level refactoring, which might be more justifiable if we see
> more such cases in the future.
>
> > So if you want to use wake_up_pollfree(), you need to
> > explain why no other fix is possible. For example maybe the UAPI depends on the
> > waitqueue having a nonstandard lifetime.
>
> I think the changelog should explain that, for non-root cgroups, the
> waitqueue lifetime is tied to the cgroup_file_release() callback,
> which in turn is not tied to the file's lifetime. That is why the
> waitqueue and the file have different lifecycles. Would that suffice
> as the justification?

OK, in the absence of objections, I suggest resending this patch with
a changelog that explains the waitqueue lifetime and why
wake_up_pollfree() is required here.
Munehisa, feel free to reuse
https://lore.kernel.org/all/CAJuCfpFZ3B4530TgsSHqp5F_gwfrDujwRYewKReJru==MdEHQg@xxxxxxxxxxxxxx/#t
if you find it useful.
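
In case it helps when writing the changelog, here is a rough user-space
sketch of the ordering problem. It is not kernel code: every model_* name
is made up for illustration, and as far as I understand the real detaching
is done by the waiter's own wake callback when it sees POLLFREE, which
this model folds into the wakeup helper for brevity.

```c
/*
 * User-space model only, NOT kernel code. All struct and function names
 * here are hypothetical stand-ins chosen for illustration.
 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_waitqueue;

/* One poller's registration on a wait queue (think: an epoll entry). */
struct model_waiter {
	struct model_waiter *next;
	struct model_waitqueue *queue;	/* NULL once detached */
	int woken;
};

/* Wait queue head embedded in the object whose lifetime is in question. */
struct model_waitqueue {
	struct model_waiter *head;
};

/* Stand-in for the trigger: owns the queue the poller registered on. */
struct model_trigger {
	struct model_waitqueue event_wait;
};

static void model_add_waiter(struct model_waitqueue *q, struct model_waiter *w)
{
	w->queue = q;
	w->woken = 0;
	w->next = q->head;
	q->head = w;
}

/*
 * Pollfree-style wakeup: wake every waiter AND detach it, so that no
 * waiter keeps a pointer to a queue that is about to be freed.
 */
static void model_wake_up_pollfree(struct model_waitqueue *q)
{
	struct model_waiter *w = q->head;

	while (w) {
		struct model_waiter *next = w->next;

		w->woken = 1;
		w->queue = NULL;
		w->next = NULL;
		w = next;
	}
	q->head = NULL;
}

/*
 * What the poller does when its file is finally closed: unlink from the
 * queue, but only if it is still attached. Returns 1 if the queue had to
 * be touched, 0 if the waiter was already detached.
 */
static int model_remove_waiter(struct model_waiter *w)
{
	struct model_waiter **pp;

	if (!w->queue)
		return 0;

	for (pp = &w->queue->head; *pp; pp = &(*pp)->next) {
		if (*pp == w) {
			*pp = w->next;
			break;
		}
	}
	w->queue = NULL;
	w->next = NULL;
	return 1;
}

/* Returns 0 if the late removal never touches the freed queue. */
int model_demo(void)
{
	struct model_trigger *t = calloc(1, sizeof(*t));
	struct model_waiter w = { 0 };
	int touched;

	if (!t)
		return -1;

	model_add_waiter(&t->event_wait, &w);

	/* The release path runs first (the cgroup goes away)... */
	model_wake_up_pollfree(&t->event_wait);
	free(t);

	/* ...and the file is closed only later. */
	touched = model_remove_waiter(&w);

	assert(w.woken == 1);
	return touched;
}
```

model_demo() returns 0 because the waiter was detached before the trigger
was freed. With a plain wakeup that leaves the waiter attached,
model_remove_waiter() would walk the freed queue, which is the
use-after-free this patch is about.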
Thanks,
Suren.

> Again, I'm not saying that no other fix is possible, but that the
> right fix would be much more complex.
> Thanks,
> Suren.
>
> >
> > - Eric