Re: [PATCH 0/3] migrate_pages: fix deadlock in batched synchronous migration

From: Andrew Morton
Date: Sat Feb 25 2023 - 23:55:56 EST


On Fri, 24 Feb 2023 22:11:42 +0800 Huang Ying <ying.huang@xxxxxxxxx> wrote:

> Two deadlock bugs were reported for the migrate_pages() batching
> series.

"migrate_pages(): batch TLB flushing"

> Thanks Hugh and Pengfei. Analysis shows that if we have locked
> folios other than the one we are migrating, it's not safe in
> general to wait synchronously, for example to wait for writeback
> to complete or to wait to lock a buffer head.
>
> So 1/3 fixes the deadlock in a simple way, by disabling batching
> support for synchronous migration. The change is straightforward
> and easy to understand. 3/3 then reintroduces batching for
> synchronous migration by optimistically trying to migrate
> asynchronously in a batch, then falling back to synchronous
> migration one by one for the folios that failed to migrate. Tests
> show that this effectively restores the TLB flush batching
> performance for synchronous migration.
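
For anyone skimming the thread, the 3/3 approach quoted above can be
sketched roughly as below. This is a toy userspace model and not the
kernel code: struct toy_folio, toy_migrate_one() and
toy_migrate_batch() are made-up names, and "needs_wait" merely stands
in for conditions such as writeback or a contended buffer head lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct toy_folio {
	bool needs_wait;	/* e.g. under writeback: migrating it would block */
	bool migrated;
};

enum toy_mode { TOY_ASYNC, TOY_SYNC };

/* Try to migrate one folio. In async mode we never wait, we just fail. */
static bool toy_migrate_one(struct toy_folio *f, enum toy_mode mode)
{
	if (f->needs_wait && mode == TOY_ASYNC)
		return false;	/* would block while other folios are held */

	/* Sync mode models a wait done with only this one folio in hand. */
	f->needs_wait = false;
	f->migrated = true;
	return true;
}

/*
 * Pass 1: optimistic async attempt over the whole batch.
 * Pass 2: synchronous retry, one folio at a time, for the failures.
 * Returns how many folios needed the synchronous fallback.
 */
static size_t toy_migrate_batch(struct toy_folio *folios, size_t n)
{
	size_t fallback = 0;
	size_t i;

	assert(folios != NULL || n == 0);

	for (i = 0; i < n; i++)
		toy_migrate_one(&folios[i], TOY_ASYNC);

	for (i = 0; i < n; i++) {
		if (folios[i].migrated)
			continue;
		fallback++;
		toy_migrate_one(&folios[i], TOY_SYNC);
	}
	return fallback;
}
```

The point of the split is that the only pass which can block is the
second one, and it handles a single folio at a time.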

If anyone backports the "migrate_pages(): batch TLB flushing" series
into their kernels, they will want to know about such fixes. So we can
help them by providing suitable Link: tags.

Such a Link: may also be helpful to people who are performing git
bisection searches for some issue but who keep stumbling over the
issues which this series addresses.

Being lazy, I slapped

Fixes: 6f7d760e86fa ("migrate_pages: move THP/hugetlb migration support check to simplify code")

on all three, as this was the final patch in that series. Inaccurate,
but it means that these fixes will land in a suitable place if anyone
needs them.