On 23.06.25 16:58, Jens Axboe wrote:...>>> When only pinning a single tail page (iovec.iov_len = pagesize), it works as expected.
On 6/23/25 6:22 AM, David Hildenbrand wrote:
On 23.06.25 12:10, David Hildenbrand wrote:
On 23.06.25 11:53, Alexander Potapenko wrote:
On Mon, Jun 23, 2025 at 11:29?AM 'David Hildenbrand' via
syzkaller-bugs <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
So, if we pinned two tail pages but end up calling io_release_ubuf()->unpin_user_page()
on the head page, meaning that "imu->bvec[i].bv_page" points at the wrong folio page
(IOW, one we never pinned).
So it's related to the io_coalesce_buffer() machinery.
And in fact, in there, we have this weird logic:
/* Store head pages only*/
new_array = kvmalloc_array(nr_folios, sizeof(struct page *), GFP_KERNEL);
...
Essentially discarding the subpage information when coalescing tail pages.
I am afraid the whole io_check_coalesce_buffer + io_coalesce_buffer() logic might be
flawed (we can -- in theory -- coalesc different folio page ranges in
a GUP result?).
@Jens, not sure if this only triggers a warning when unpinning or if we actually mess up
imu->bvec[i].bv_page, to end up pointing at (reading/writing) pages we didn't even pin in the first
place.
Can you look into that, as you are more familiar with the logic?
Leaving this all quoted and adding Pavel, who wrote that code. I'm
currently away, so can't look into this right now.
I did some more digging, but ended up being all confused about io_check_coalesce_buffer() and io_imu_folio_data().
Assuming we pass a bunch of consecutive tail pages that all belong to the same folio, then the loop in io_check_coalesce_buffer() will always
run into the
if (page_folio(page_array[i]) == folio &&
page_array[i] == page_array[i-1] + 1) {
count++;
continue;
}
case, making the function return "true" ... in io_coalesce_buffer(), we then store the head page ... which seems very wrong.
In general, storing head pages when they are not the first page to be coalesced seems wrong.