Re: [PATCH v2 1/1] fs/splice: add missing callback for inaccessible pages

From: Dave Hansen
Date: Fri May 01 2020 - 12:32:49 EST


On 5/1/20 12:18 AM, Christian Borntraeger wrote:
>> unlock_page();
>> get_page();
>> // ^ OK because I have a ref
>> // do DMA on inaccessible page
>>
>> Because the make_secure_pte() code isn't looking for a *specific*
>> 'expected' value, it has no way of noticing that the extra ref snuck in
>> there.
> I think the expected calculation is actually doing that, giving back the minimum
> value when no one else has any references that are valid for I/O.
>
> But I might not have understood what you are trying to tell me?

I was wrong. I was looking at migrate_page_move_mapping():

> 	int expected_count = expected_page_refs(mapping, page) + extra_count;
...
> 	xas_lock_irq(&xas);
> 	if (page_count(page) != expected_count || xas_load(&xas) != page) {
> 		xas_unlock_irq(&xas);
> 		return -EAGAIN;
> 	}
>
> 	if (!page_ref_freeze(page, expected_count)) {
> 		xas_unlock_irq(&xas);
> 		return -EAGAIN;
> 	}

I saw the check for page_count(page) *and* the page_ref_freeze() call.
My assumption was that both were needed. My assumption was wrong. (I
think the migrate_page_move_mapping() code may actually be doing a
superfluous check.)
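
To illustrate why the page_count() check looks redundant: page_ref_freeze()
is a compare-and-exchange of 'expected' -> 0, so it already fails if the
count is anything other than the expected value. Below is a minimal
userspace sketch of that behavior. The fake_* names are hypothetical
stand-ins, not kernel API, and only the refcount is modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct fake_page {
	atomic_int refcount;
};

/*
 * Models page_ref_freeze(): atomically swap 'expected' -> 0.
 * Fails (and leaves the count untouched) if the count is any other value,
 * which is why a separate "count != expected" test beforehand adds nothing
 * for correctness.
 */
static bool fake_ref_freeze(struct fake_page *p, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(&p->refcount, &old, 0);
}

/* Models page_ref_unfreeze(): only legal on a frozen (zero) refcount. */
static void fake_ref_unfreeze(struct fake_page *p, int count)
{
	assert(atomic_load(&p->refcount) == 0);
	atomic_store(&p->refcount, count);
}
```

With an extra reference present (count 3, expected 2) the freeze simply
fails, with or without the earlier comparison.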

The larger point, though, is that the s390 code ensures no extra
references exist upon entering make_secure_pte(), but it still has no
mechanism to prevent future, new references to page cache pages from
being created.

The one existing user of expected_page_refs() freezes the refs then
*removes* the page from the page cache (that's what the xas_lock_irq()
is for). That stops *new* refs from being acquired.

The s390 code is missing an equivalent mechanism.
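
A rough userspace sketch of the "freeze, then remove from the page cache"
pattern, assuming a one-slot cache in place of the xarray and no locking.
All toy_* names are hypothetical; this models the idea, not the actual
migration code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	atomic_int refcount;
};

/* Hypothetical one-slot stand-in for the page cache. */
struct toy_cache {
	_Atomic(struct toy_page *) slot;
};

/* Lookup: find the page, then take a reference unless the count is zero. */
static struct toy_page *toy_lookup(struct toy_cache *c)
{
	struct toy_page *p = atomic_load(&c->slot);
	int old;

	if (!p)
		return NULL;

	old = atomic_load(&p->refcount);
	do {
		if (old == 0)
			return NULL;	/* frozen or freed: no new reference */
	} while (!atomic_compare_exchange_weak(&p->refcount, &old, old + 1));

	return p;
}

/*
 * Freeze the refcount, drop the page from the cache, then unfreeze minus
 * the cache's own reference. Once the slot is cleared, later lookups can
 * no longer find the page, so no *new* references can appear.
 */
static bool toy_remove(struct toy_cache *c, struct toy_page *p, int expected)
{
	int old = expected;

	if (!atomic_compare_exchange_strong(&p->refcount, &old, 0))
		return false;	/* someone holds an unexpected reference */

	atomic_store(&c->slot, NULL);
	atomic_store(&p->refcount, expected - 1);
	return true;
}
```

The second step is the part with no s390 equivalent: checking the count
once says nothing about lookups that happen afterwards.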

One example:

page_ref_freeze();
// page->_refcount==0 now
find_get_page();
// ^ sees a "freed" page
page_ref_unfreeze();

find_get_page() will either fail to *find* the page because it will see
page->_refcount==0 and think it is freed (not great), or it will
VM_BUG_ON_PAGE() in __page_cache_add_speculative().
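
The first failure mode can be sketched in userspace as follows, assuming
the speculative get behaves like an "add unless zero". The spec_* names are
hypothetical, and the VM_BUG_ON_PAGE() variant is not modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct spec_page {
	atomic_int refcount;
};

/*
 * Models a speculative reference: increment the count unless it is zero.
 * A frozen page has a zero count, so it is indistinguishable from a page
 * that has already been freed.
 */
static bool spec_get_unless_zero(struct spec_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero test repeats. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

While the page is frozen but still present in the page cache, every lookup
that reaches this point is refused, even though the page is live.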

My bigger point is that this patch doesn't systematically stop finding
page cache pages that are arch-inaccessible. This patch hits *one* of
those sites.