Re: [PATCH v4] mm/page_alloc: bail out on fatal signal during reclaim/compaction retry attempt

From: Vlastimil Babka
Date: Mon May 31 2021 - 07:35:35 EST


On 5/31/21 1:33 PM, Michal Hocko wrote:
> On Thu 20-05-21 15:29:01, Aaron Tomlin wrote:
>> A customer experienced a low-memory situation and decided to issue a
>> SIGKILL (i.e. a fatal signal). Instead of promptly terminating as one
>> would expect, the aforementioned task remained unresponsive.
>>
>> Further investigation indicated that the task was "stuck" in the
>> reclaim/compaction retry loop. Now, it does not make sense to retry
>> compaction when a fatal signal is pending.
>
> Is this really true in general? The memory reclaim is retried even when
> fatal signals are pending. Why should compaction be different? I do
> agree that retrying way too much is bad but is there any reason why this
> special case doesn't follow the max retry logic?

Compaction doesn't do anything if a fatal signal is pending: it bails out
immediately, and the checks are rather frequent. So why retry?
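
To illustrate the point, here is a minimal userspace sketch, not the actual
mm/page_alloc.c code. All names in it (struct task, try_compact(),
alloc_slowpath(), bail_on_signal) are hypothetical stand-ins; the only thing
it assumes from the discussion above is that compaction returns immediately
without progress when a fatal signal is pending.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; the real kernel checks
 * fatal_signal_pending(current) instead of a plain flag. */
struct task {
	bool fatal_signal;
};

enum compact_result {
	COMPACT_ABORTED_BY_SIGNAL,	/* bailed out, no work done */
	COMPACT_NO_PROGRESS,		/* did work, freed nothing */
};

/* Assumed behaviour: with a fatal signal pending, compaction
 * returns right away and cannot make progress. */
static enum compact_result try_compact(const struct task *t)
{
	if (t->fatal_signal)
		return COMPACT_ABORTED_BY_SIGNAL;
	return COMPACT_NO_PROGRESS;
}

/* Simplified retry loop. Returns how many compaction attempts were made.
 * With bail_on_signal set, the loop stops once an attempt was aborted
 * by a pending fatal signal instead of using up max_retries. */
static int alloc_slowpath(const struct task *t, int max_retries,
			  bool bail_on_signal)
{
	int attempts = 0;

	for (int retry = 0; retry < max_retries; retry++) {
		enum compact_result res = try_compact(t);

		attempts++;
		if (bail_on_signal && res == COMPACT_ABORTED_BY_SIGNAL)
			break;
	}
	return attempts;
}
```

In this sketch, a killed task with max_retries = 16 goes through 16 no-op
attempts when bail_on_signal is false, and a single attempt when it is true.
A task without a pending signal is unaffected either way.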