Re: [BUG]userfaultfd_move fails to move a folio when swap-in occurs concurrently with swap-out
From: David Hildenbrand
Date: Tue May 27 2025 - 07:06:59 EST
EBUSY
The pages in the source virtual memory range are either
pinned or not exclusive to the process. The kernel might
only perform lightweight checks for detecting whether the
pages are exclusive. To make the operation more likely to
succeed, KSM should be disabled, fork() should be avoided
or MADV_DONTFORK should be configured for the source
virtual memory area before fork().
Note the "lightweight" and "more likely to succeed".
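As a rough userspace illustration of the man page advice above, the sketch below prepares a region that will later be used as a move source. The helper name alloc_move_source() is hypothetical; only mmap(), madvise() and munmap() are real calls. It applies MADV_DONTFORK before any fork() so that a child cannot end up sharing the pages. It does not touch KSM, and it cannot rule out EBUSY from other causes such as the swap-out race discussed in this thread.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of any library): map an anonymous,
 * private region meant to serve as the source of a later move, and
 * mark it MADV_DONTFORK so a child created by fork() does not inherit
 * the mapping and the pages stay exclusive to this process.
 * Returns NULL on failure.
 */
void *alloc_move_source(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Must be done before fork() to have any effect on the child. */
    if (madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

A caller would still need to treat EBUSY from the move as a possible outcome and decide on a fallback, since the exclusivity checks are only "lightweight".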
Initially, my point was that an exclusive folio (single-process case)
should be movable.
Yeah, I wish we didn't need that PAE (PageAnonExclusive) hack in the swapin code.
I was asking myself if we could just ... wait for writeback to end in
that case?
I mean, if we had to swap in the folio, we would also have to wait
for disk I/O ... so here we would have to wait for disk I/O as well.
We could either wait for writeback before mapping the folio, or set the
PAE bit and map the page R/O, to then wait for writeback during write
faults.
The latter has the downside of more complexity during write faults
(check whether the page is under writeback, then check whether we
require this sync I/O during write faults even though PAE is
set ...).
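To make the two options concrete, here is a toy userspace model. It is not kernel code: struct toy_folio and every toy_* function are made-up names for illustration, and "mapped writable" in option 1 is a simplification. It only shows where the wait for writeback would land in each variant.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_folio {
    bool under_writeback;  /* swap-out I/O still in flight */
    bool anon_exclusive;   /* stands in for the PAE bit */
    bool mapped_writable;  /* how the page is currently mapped */
    int  waits;            /* how often we blocked on writeback */
};

/* Pretend to block until writeback has finished. */
static void toy_wait_writeback(struct toy_folio *f)
{
    if (f->under_writeback) {
        f->under_writeback = false;
        f->waits++;
    }
}

/* Option 1: wait for writeback before mapping the folio. */
void toy_map_wait_first(struct toy_folio *f)
{
    toy_wait_writeback(f);
    f->anon_exclusive = true;
    f->mapped_writable = true;
}

/* Option 2: set PAE and map read-only without waiting. */
void toy_map_readonly(struct toy_folio *f)
{
    f->anon_exclusive = true;
    f->mapped_writable = false;
}

/* Option 2, later: the write fault has to do the waiting instead. */
void toy_write_fault(struct toy_folio *f)
{
    if (f->mapped_writable)
        return;
    toy_wait_writeback(f);
    f->mapped_writable = true;
}
```

In this model, option 1 pays the wait up front on every such swapin, while option 2 defers it to the first write and needs the extra write-fault branch, which is the added complexity mentioned above.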
Now I understand this isn’t a bug, but rather a compromise made due
to implementation constraints.
That is a good summary!
Perhaps the remaining value of this report is that it helped better
understand scenarios beyond fork where a move might also fail.
I truly appreciate your time and your clear analysis.
YW :)
--
Cheers,
David / dhildenb