Re: [PATCH v4] arm64/mm: Optimize loop to reduce redundant operations of contpte_ptep_get

From: David Hildenbrand
Date: Thu May 08 2025 - 04:30:29 EST


On 08.05.25 09:03, Xavier Xia wrote:
> This commit optimizes the contpte_ptep_get and contpte_ptep_get_lockless
> function by adding early termination logic. It checks if the dirty and
> young bits of orig_pte are already set and skips redundant bit-setting
> operations during the loop. This reduces unnecessary iterations and
> improves performance.
>
> In order to verify the optimization performance, a test function has been
> designed. The function's execution time and instruction statistics have
> been traced using perf, and the following are the operation results on a
> certain Qualcomm mobile phone chip:

In the future, please don't post vN+1 as a reply to vN.

--
Cheers,

David / dhildenb
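
For readers following the thread, the early-termination idea described in
the quoted commit message can be illustrated with a minimal userspace
sketch. This is not the patch itself: the flag bits, the block size, and the
names demo_fold_flags, DEMO_PTE_YOUNG, DEMO_PTE_DIRTY and DEMO_CONT_PTES are
hypothetical stand-ins, and the real arm64 PTE layout and kernel helpers
differ. The sketch only shows that once both bits have been accumulated,
reading the remaining entries cannot change the result, so the loop may stop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; NOT the arm64 PTE layout. */
#define DEMO_PTE_YOUNG  (UINT64_C(1) << 0)
#define DEMO_PTE_DIRTY  (UINT64_C(1) << 1)
/* Assumed number of entries in one contiguous block. */
#define DEMO_CONT_PTES  16

/*
 * Fold the young/dirty bits of every entry in a block into 'orig'.
 * Stops as soon as both bits are set in 'orig', because further
 * entries can no longer change the result.
 * If 'visited' is non-NULL, it receives the number of entries read.
 */
static uint64_t demo_fold_flags(const uint64_t *ptes, uint64_t orig,
                                size_t *visited)
{
    const uint64_t both = DEMO_PTE_YOUNG | DEMO_PTE_DIRTY;
    size_t i;

    for (i = 0; i < DEMO_CONT_PTES; i++) {
        /* Early termination: nothing left to learn from later entries. */
        if ((orig & both) == both)
            break;
        orig |= ptes[i] & both;
    }

    if (visited)
        *visited = i;
    return orig;
}
```

With a block in which entry 2 is young and entry 5 is dirty, the sketch
reads six entries instead of sixteen; if orig already carries both bits on
entry, it reads none.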