Re: [PATCH v2 3/3] khugepaged: Reduce race probability between migration and khugepaged
From: Lorenzo Stoakes
Date: Thu Jun 26 2025 - 00:57:53 EST

On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote:
>
> On 25/06/25 6:58 pm, Lorenzo Stoakes wrote:
> > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote:
> > > Suppose a folio is under migration, and khugepaged is also trying to
> > > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the
> > > page cache via filemap_lock_folio(), thus taking a reference on the folio
> > > and sleeping on the folio lock, since the lock is held by the migration
> > > path. Migration will then fail in
> > > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of
> > > such a race happening (leading to migration failure) by bailing out
> > > if we detect a PMD is marked with a migration entry.
> > >
> > > This fixes the migration-shared-anon-thp testcase failure on Apple M3.
> >
> > Hm is this related to the series at all? Seems somewhat unrelated?
>
> Not related.
>
> >
> > Is there a Fixes, Closes, etc.? Do we need something in stable?
>
> We don't need anything. This is an "expected race" in the sense that
> both migration and khugepaged collapse are best effort algorithms.
> I am just seeing a test failure on my system because my system hits
> the race more often. So this patch reduces the window for the race.

Does it rely on previous patches? If not, it's probably best to send this one
separately :)
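
For anyone following along, the race in the quoted commit message can be
modelled roughly in userspace as below. This is only a sketch, not kernel
code: every model_* name is hypothetical, and it assumes folio_ref_freeze()
behaves like a compare-and-exchange that sets the refcount to zero only when
it equals the expected count. The extra reference stands in for the one the
commit message says filemap_lock_folio() takes.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a folio; only the refcount is modelled. */
struct model_folio {
	atomic_int refcount;
};

/* Models the collapse path taking an extra reference on the folio. */
static void model_folio_get(struct model_folio *f)
{
	atomic_fetch_add(&f->refcount, 1);
}

/* Models the collapse path dropping that reference again. */
static void model_folio_put(struct model_folio *f)
{
	atomic_fetch_sub(&f->refcount, 1);
}

/*
 * Assumed semantics of a refcount freeze: succeed, and set the count to
 * zero, only if the count is exactly what the caller expects.
 */
static bool model_ref_freeze(struct model_folio *f, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(&f->refcount, &old, 0);
}

/*
 * Models one migration attempt. If collapse_raced is true, the collapse
 * path holds an extra reference while migration tries to freeze, so the
 * freeze sees expected + 1 and fails.
 */
static bool model_migrate(struct model_folio *f, int expected,
			  bool collapse_raced)
{
	bool frozen;

	if (collapse_raced)
		model_folio_get(f);

	frozen = model_ref_freeze(f, expected);

	if (collapse_raced)
		model_folio_put(f);

	return frozen;
}
```

With the refcount starting at the expected value, model_migrate() fails when
collapse_raced is true and leaves the count unchanged, and succeeds
otherwise. Bailing out early on a PMD migration entry corresponds to never
taking that extra reference while migration is in progress.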