On 11/04/2024 15:07, Lance Yang wrote:
On Thu, Apr 11, 2024 at 9:48 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
[...]
+
+ if (!folio_trylock(folio))
+ continue;
This is still wrong. This should all be protected by the "if
(folio_test_swapcache(folio) || folio_test_dirty(folio))" as it was previously
so that you only call folio_trylock() if that condition is true. You are
unconditionally locking here, then unlocking, then relocking below if the
condition is met. Just put everything inside the condition and lock once.
I'm not sure if it's safe to call folio_mapcount() without holding the
folio lock.
As mentioned earlier by David in the v2[1]
What could work for large folios is making sure that #ptes that map the
folio here correspond to the folio_mapcount(). And folio_mapcount()
should be called under folio lock, to avoid racing with swapout/migration.
[1] https://lore.kernel.org/all/5cc05529-eb80-410e-bc26-233b0ba0b21f@xxxxxxxxxx/
But I'm not suggesting that you should call folio_mapcount() without the lock.
I'm proposing this:
if (folio_test_swapcache(folio) || folio_test_dirty(folio)) {
if (!folio_trylock(folio))
continue;
/*
- * If folio is shared with others, we mustn't clear
- * the folio's dirty flag.
+ * If we have a large folio at this point, we know it is
+ * fully mapped so if its mapcount is the same as its
+ * number of pages, it must be exclusive.
*/
- if (folio_mapcount(folio) != 1) {
+ if (folio_mapcount(folio) != folio_nr_pages(folio)) {
folio_unlock(folio);
continue;
}
IIUC, if the folio is clean and not in the swapcache, we still need to
compare the number of batched PTEs against folio_mapcount().
Why? That's not how the old code worked. In fact the comment says that the
reason for the exclusive check is to avoid marking a dirty *folio* as clean if
shared; that would be bad because we could throw away data that others relied
upon. It's perfectly safe to clear the dirty flag from the *pte* even if it is
shared; the ptes are private to the process so that won't affect sharers.
You should just follow the pattern already estabilished by the original code.
The only difference is that because the folio is now (potentially) large, you
have to change the way to detect exclusivity.