Re: [PATCH v4 3/4] mm: Optimize mprotect() by PTE-batching

From: Dev Jain
Date: Sat Jun 28 2025 - 08:40:46 EST



On 28/06/25 5:04 pm, Dev Jain wrote:
Use folio_pte_batch to batch process a large folio. Reuse the folio from
prot_numa case if possible.

For all cases other than the PageAnonExclusive case, if the case holds true
for one pte in the batch, then it holds true for the other ptes in the
batch too; for pte_needs_soft_dirty_wp(), this is because we do not pass
FPB_IGNORE_SOFT_DIRTY. modify_prot_start_ptes() collects the dirty
and access bits across the batch, so we also batch across
pte_dirty(): this is correct since the dirty bit on the PTE really is
just an indication that the folio got written to, so even if a given PTE
is not actually dirty (but one of the PTEs in the batch is), the wp-fault
optimization can be made.
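
To illustrate the dirty/access collection described above, here is a minimal
userspace sketch. It is not kernel code: struct toy_pte and
collect_batch_bits() are hypothetical names invented for this example, and
the real work is done by modify_prot_start_ptes(). The sketch only shows the
idea that one pte value can stand in for the whole batch once the two bits
are OR-ed together.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a pte: only the two bits discussed above. */
struct toy_pte {
	bool dirty;
	bool young;	/* access bit */
};

/*
 * Return a single pte value representing the batch: dirty/young are set
 * if they are set on any pte in ptes[0..nr).
 */
struct toy_pte collect_batch_bits(const struct toy_pte *ptes, size_t nr)
{
	struct toy_pte acc;
	size_t i;

	assert(nr > 0);
	acc = ptes[0];
	for (i = 1; i < nr; i++) {
		acc.dirty = acc.dirty || ptes[i].dirty;
		acc.young = acc.young || ptes[i].young;
	}
	return acc;
}
```

With this model, a batch where only one pte is dirty yields a dirty
accumulated value, which is why a decision based on pte_dirty() can be
applied to the batch as a whole.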

The crux now is how to batch around the PageAnonExclusive case; we must
check the corresponding condition for every single page. Therefore, from
the large folio batch, we determine a sub batch of ptes mapping pages with
the same PageAnonExclusive state, process that sub batch, then
determine and process the next sub batch, and so on. Note that this does
not cause any extra overhead; if, say, the size of the folio batch
is 512, then the sub batch processing will take 512 iterations in total,
which is the same as what we would have done before.
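
The sub batch walk can be sketched in plain userspace C as follows. This is
an illustration under assumptions, not the patch itself: the bool array
stands in for the per-page PageAnonExclusive state, and sub_batch_len() and
process_batch() are hypothetical helpers named only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Length of the run starting at 'start' whose entries share the same
 * state as excl[start], capped at max_len.
 */
size_t sub_batch_len(const bool *excl, size_t start, size_t max_len)
{
	size_t len = 1;

	assert(max_len > 0);
	while (len < max_len && excl[start + len] == excl[start])
		len++;
	return len;
}

/*
 * Walk a batch of nr pages in sub batches of equal state. Returns the
 * number of sub batches; *pages_visited counts the pages covered.
 */
size_t process_batch(const bool *excl, size_t nr, size_t *pages_visited)
{
	size_t start = 0, nr_sub = 0;

	*pages_visited = 0;
	while (start < nr) {
		size_t len = sub_batch_len(excl, start, nr - start);

		/* A real implementation would commit ptes [start, start + len) here. */
		*pages_visited += len;
		start += len;
		nr_sub++;
	}
	return nr_sub;
}
```

Each page is covered by exactly one sub batch, so the pages visited always
add up to the batch size, regardless of how many sub batches there are.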

Signed-off-by: Dev Jain <dev.jain@xxxxxxx>
---

Forgot to add:

Co-developed-by: Ryan Roberts <ryan.roberts@xxxxxxx>
Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>

as this patch is almost identical to the diff Ryan had suggested.