Re: [PATCH] mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock
From: Jeongjun Park
Date: Thu Jul 03 2025 - 02:09:39 EST
Hello,
Dennis Zhou <dennis@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Thu, Jul 03, 2025 at 01:45:36PM +0900, Jeongjun Park wrote:
> > Christoph Lameter (Ampere) <cl@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 2 Jul 2025, Jeongjun Park wrote:
> > >
> > > > diff --git a/mm/percpu.c b/mm/percpu.c
> > > > index b35494c8ede2..0f98b857fb36 100644
> > > > --- a/mm/percpu.c
> > > > +++ b/mm/percpu.c
> > > > @@ -3355,7 +3355,13 @@ void __init setup_per_cpu_areas(void)
> > > > */
> > > > unsigned long pcpu_nr_pages(void)
> > > > {
> > > > - return pcpu_nr_populated * pcpu_nr_units;
> > > > + unsigned long flags, ret;
> > > > +
> > > > + spin_lock_irqsave(&pcpu_lock, flags);
> > > > + ret = pcpu_nr_populated * pcpu_nr_units;
> > > > + spin_unlock_irqrestore(&pcpu_lock, flags);
> > >
> > >
> > > Ummm.. What? You are protecting a single read with a spinlock? There needs
> > > to be some updating of data somewhere for this to make sense.
> > >
> > >
> > > Unless a different critical section protected by the lock sets the value
> > > intermittendly to something you are not allowed to see before a final
> > > store of a valid value. But that would be unusual.
> > >
> > > This is an academic exercise or did you really see a problem?
> > >
> > > What is racing?
> > >
> > >
> >
> > This patch is by no means an academic exercise.
> >
> > As written in the reported tag, This race has actually been reported
> > in syzbot [1].
> >
> > [1]: https://syzkaller.appspot.com/bug?extid=e5bd32b79413e86f389e
> >
>
> A report by syzbot doesn't mean it is a real problem. A production
> problem or broken test case is much more urgent.
>
> > pcpu_nr_populated is currently being write in pcpu_chunk_populated()
> > and pcpu_chunk_depopulated(), and since this two functions perform
> > pcpu_nr_populated write under the protection of pcpu_lock, there is no
> > race for write/write.
> >
> > However, since pcpu_nr_pages(), which performs a read operation on
> > pcpu_nr_populated, is not protected by pcpu_lock, races between read/write
> > can easily occur.
> >
> > Therefore, I think it is appropriate to protect it through pcpu_lock
> > according to the comment written in the definition of pcpu_nr_populated.
> >
>
> You're right that this is a race condition, but this was an intention
> choice done because the value read here is only being used to pass
> information to userspace for /proc/meminfo. As Christoph mentioned, the
> caller of pcpu_nr_pages() will never see an invalid value nor does it
> really matter either.
>
> The pcpu_lock is core to the percpu allocator and isn't something we
> would want to blindly expose either.
>
> The appropriate solution here is what Shakeel proposed to just mark the
> access as a data_race().
>
> Thanks,
> Dennis
If this data race was intentional, it makes sense why it was written
this way. I'll send v2 patch with the fix Shakeel proposed.
--
Regards,
Jeongjun Park