Re: Excessive page cache occupies DMA32 memory

From: Robin Murphy
Date: Tue Jul 22 2025 - 06:08:00 EST


On 2025-07-22 8:24 am, Greg KH wrote:
On Tue, Jul 22, 2025 at 11:05:11AM +0500, Muhammad Usama Anjum wrote:
Adding ath/mhi and dma API developers to the discussion.

On 7/22/25 10:32 AM, Greg KH wrote:
On Mon, Jul 21, 2025 at 06:13:10PM +0100, Matthew Wilcox wrote:
On Mon, Jul 21, 2025 at 08:03:12PM +0500, Muhammad Usama Anjum wrote:
Hello,

When 10-12GB out of total 16GB RAM is being used as page cache
(active_file + inactive_file) at suspend time, the drivers fail to allocate
dma memory at resume as dma memory is either occupied by the page cache or
fragmented. Example:

kworker/u33:5: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0

Just to be clear, this is not a page cache problem. The driver is asking
us to do a 512kB allocation without doing I/O! This is a ridiculous
request that should be expected to fail.
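
For reference, the 512kB figure follows from the log line above: order:7 means 2^7 physically contiguous pages. The sketch below parses that line and decodes the mode. It assumes 4 KiB base pages and the flag bit values found in recent kernels' include/linux/gfp_types.h (these are not stable and differ between kernel versions); the helper names are hypothetical and exist only for this illustration.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB base pages

# Assumed bit values from recent kernels; not stable across versions.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x400: "__GFP_DIRECT_RECLAIM",
    0x800: "__GFP_KSWAPD_RECLAIM",
}


def parse_failure(line):
    """Extract (order, mode) from a 'page allocation failure' log line."""
    m = re.search(r"order:(\d+), mode:(0x[0-9a-fA-F]+)", line)
    if m is None:
        raise ValueError("not a page allocation failure line")
    return int(m.group(1)), int(m.group(2), 16)


def alloc_bytes(order):
    """Size of a contiguous allocation of 2**order pages."""
    return PAGE_SIZE << order


def decode_gfp(mode):
    """Names of the known flag bits that are set in mode."""
    return [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]


line = ("kworker/u33:5: page allocation failure: order:7, "
        "mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0")
order, mode = parse_failure(line)
print(alloc_bytes(order) // 1024, "KiB")  # 512 KiB
print(decode_gfp(mode))
```

Under these assumptions the decoded mode contains the two reclaim bits and __GFP_DMA32 but neither __GFP_IO nor __GFP_FS, which matches the GFP_NOIO|GFP_DMA32 annotation in the log.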

The solution, whatever it may be, is not related to the page cache.
I reject your diagnosis. Almost all of the page cache is clean and
could be dropped (as far as I can tell from the output below).

Now, I'm not too familiar with how the page allocator chooses to fail
this request. Maybe it should be trying harder to drop bits of the page
cache. Maybe it should be doing some compaction.
That's very thoughtful. I'll look into why the page allocator isn't dropping
cache or doing compaction.

I am not inclined to
go digging on your behalf, because frankly I'm offended by the suggestion
that the page cache is at fault.
I apologize; that wasn't my intention.


Perhaps somebody else will help you, or you can dig into this yourself.

I'm with Matthew, this really looks like a driver bug somehow. If there
is page cache memory that is "clean", the driver should be able to
access it just fine if really required.

What exact driver(s) is having this problem? What is the exact error,
and on what lines of code?
The issue occurs on both ath11k and mhi drivers during resume, when
dma_alloc_coherent(GFP_KERNEL) fails and returns -ENOMEM. This failure has
been observed at multiple points in these drivers.

For example, in the mhi driver, the failure is triggered when the
MHI's st_worker gets scheduled-in at resume.

mhi_pm_st_worker()
-> mhi_fw_load_handler()
-> mhi_load_image_bhi()
-> mhi_alloc_bhi_buffer()
-> dma_alloc_coherent(GFP_KERNEL) returns -ENOMEM

And what is the exact size you are asking for here?
What is the dma ops set to for your system? Are you sure that is
working properly for your platform? What platform is this exactly?

The driver isn't asking for DMA32 here, so that shouldn't be the issue,
so why do you feel it is? Have you tried using the tracing stuff for
dma allocations to see exactly what is going on for this failure?

I'm guessing the device has a 32-bit DMA mask, and the allocation ends up in
__dma_direct_alloc_pages(), which then adds GFP_DMA32 in order to try to
satisfy the mask via regular page allocation. How GFP_KERNEL turns into
GFP_NOIO, though, given that the DMA layer certainly isn't (knowingly)
messing with __GFP_IO or __GFP_FS, is more of a mystery... I suppose "during
resume" is the red flag there - is this worker perhaps trying to run too
early in some restricted context before the rest of the system has fully
woken up?
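
To make the two halves of that guess concrete, here is a rough user-space model. It is only a sketch: the flag values are assumed from recent kernels' include/linux/gfp_types.h, and zone_flag_for_mask() is a hypothetical simplification that ignores ZONE_DMA, bus limits and address translation. It is not the kernel's actual code.

```python
# Assumed flag values (recent kernels); they are not stable across versions.
GFP_DMA32 = 0x04
GFP_IO = 0x40          # __GFP_IO
GFP_FS = 0x80          # __GFP_FS
GFP_RECLAIM = 0xC00    # __GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM
GFP_NOIO = GFP_RECLAIM
GFP_KERNEL = GFP_RECLAIM | GFP_IO | GFP_FS


def dma_bit_mask(bits):
    """Mask with the low `bits` bits set, like the kernel's DMA_BIT_MASK()."""
    return (1 << bits) - 1


def zone_flag_for_mask(dma_mask):
    """Hypothetical simplification: a device limited to 32 address bits
    gets GFP_DMA32 added so pages come from memory it can reach."""
    if dma_mask <= dma_bit_mask(32):
        return GFP_DMA32
    return 0


# What a GFP_KERNEL request from a 32-bit-mask device would look like:
expected = GFP_KERNEL | zone_flag_for_mask(dma_bit_mask(32))
# What the failure log actually reported:
observed = 0xC04
# Bits that were requested but are absent from the logged mode:
missing = expected & ~observed

print(hex(expected), hex(observed), hex(missing))
```

With these assumed values, the expected mode is 0xcc4 and the logged one is 0xc04, so the difference is exactly __GFP_IO | __GFP_FS (0xc0). The DMA32 bit is accounted for by the mask; the two missing bits are the part that still needs explaining.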

Thanks,
Robin.


I think you need to do a bit more debugging :)

thanks,

greg k-h