Re: [PATCH] sched/psi: fix use-after-free in ep_remove_wait_queue()

From: Suren Baghdasaryan
Date: Thu Feb 09 2023 - 14:13:18 EST


On Thu, Feb 9, 2023 at 10:46 AM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> On Thu, Feb 09, 2023 at 09:09:03AM -0800, Suren Baghdasaryan wrote:
> > On Thu, Feb 2, 2023 at 1:11 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Feb 1, 2023 at 8:56 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Feb 01, 2023 at 07:00:23PM -0800, Munehisa Kamata wrote:
> > > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > > > > index 8ac8b81bfee6..6e66c15f6450 100644
> > > > > --- a/kernel/sched/psi.c
> > > > > +++ b/kernel/sched/psi.c
> > > > > @@ -1343,10 +1343,11 @@ void psi_trigger_destroy(struct psi_trigger *t)
> > > > >
> > > > > group = t->group;
> > > > > /*
> > > > > - * Wakeup waiters to stop polling. Can happen if cgroup is deleted
> > > > > - * from under a polling process.
> > > > > + * Wakeup waiters to stop polling and clear the queue to prevent it from
> > > > > + * being accessed later. Can happen if cgroup is deleted from under a
> > > > > + * polling process otherwise.
> > > > > */
> > > > > - wake_up_interruptible(&t->event_wait);
> > > > > + wake_up_pollfree(&t->event_wait);
> > > > >
> > > > > mutex_lock(&group->trigger_lock);
> > > >
> > > > wake_up_pollfree() should only be used in extremely rare cases. Why can't the
> > > > lifetime of the waitqueue be fixed instead?
> > >
> > > waitqueue lifetime in this case is linked to cgroup_file_release(),
> > > which seems appropriate to me here. Unfortunately
> > > cgroup_file_release() is not directly linked to the file's lifetime.
> > > For more details see:
> > > https://lore.kernel.org/all/CAJuCfpFZ3B4530TgsSHqp5F_gwfrDujwRYewKReJru==MdEHQg@xxxxxxxxxxxxxx/#t
> > > .
> > > So, if we want to fix the lifetime of the waitqueue, we would have to
> > > tie cgroup_file_release() to the fput() somehow. IOW, the fix would
> > > have to be done at the cgroups or higher (kernfs?) layer.
> >
> > Hi Eric,
> > Do you still object to using wake_up_pollfree() for this case?
> > Changing higher levels to make cgroup_file_release() be tied to fput()
> > would be ideal but I think that would be a big change for this one
> > case. If you agree I'll Ack this patch.
> > Thanks,
> > Suren.
> >
>
> I haven't read the code closely in this case. I'm just letting you know that
> wake_up_pollfree() is very much a last-resort option for when the waitqueue
> lifetime can't be fixed.

Got it. Thanks for the warning.
I think it can be fixed but the right fix would require a sizable
higher level refactoring which might be more justifiable if we have
more such cases in the future.

> So if you want to use wake_up_pollfree(), you need to
> explain why no other fix is possible. For example maybe the UAPI depends on the
> waitqueue having a nonstandard lifetime.

I think the changelog should explain that the waitqueue lifetime in
cases of non-root cgroups is tied to cgroup_file_release() callback,
which in turn is not tied to file's lifetime. That's the reason for
waitqueue and the file having different lifecycles. Would that suffice
as the justification?
Again, I'm not saying that no other fix is possible, but that the
right fix would be much more complex.
Thanks,
Suren.

>
> - Eric