Re: [PATCH] mm: warn about allocations which stall for too long

From: Balbir Singh
Date: Sat Sep 24 2016 - 09:18:59 EST

Next message: Mathieu Desnoyers: "Re: [RFC PATCH v2 5/5] tracing: add sched_update_prio"
Previous message: Stefan Richter: "Re: [PATCH 02/10] firewire-net: Rename a jump label in fwnet_broadcast_start()"
Next in thread: Michal Hocko: "Re: [PATCH] mm: warn about allocations which stall for too long"

On 24/09/16 03:34, Dave Hansen wrote:
> On 09/23/2016 01:15 AM, Michal Hocko wrote:
>> +	/* Make sure we know about allocations which stall for too long */
>> +	if (!(gfp_mask & __GFP_NOWARN) && time_after(jiffies, alloc_start + stall_timeout)) {
>> +		pr_warn("%s: page alloction stalls for %ums: order:%u mode:%#x(%pGg)\n",
>> +				current->comm, jiffies_to_msecs(jiffies-alloc_start),
>> +				order, gfp_mask, &gfp_mask);
>> +		stall_timeout += 10 * HZ;
>> +		dump_stack();
>> +	}
>
> This would make an awesome tracepoint. There's probably still plenty of
> value to having it in dmesg, but the configurability of tracepoints is
> hard to beat.

Agreed, an awesome tracepoint, and a great place to trigger other tracepoints. With
stall_timeout increasing by 10 * HZ every time the warning fires, do we only care about
the first instance where we exceed stall_timeout? Do we debug just that instance?
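
To make the question concrete, here is a minimal user-space sketch of the
escalating check in the quoted hunk. It is not kernel code: SIM_HZ,
sim_time_after() and the stall_* helpers are hypothetical names made up for
illustration, and the initial timeout of 10 seconds is an assumption, since
the hunk only shows the "+= 10 * HZ" increment.

```c
#include <assert.h>
#include <stdio.h>

#define SIM_HZ 1000UL	/* assumed ticks per second, stands in for HZ */

/* Wraparound-safe "a is after b", modelled on the kernel's time_after(). */
static int sim_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

struct stall_state {
	unsigned long alloc_start;	/* tick at which the allocation began */
	unsigned long stall_timeout;	/* ticks until the next warning is due */
	unsigned int warnings;		/* number of warnings emitted so far */
};

static void stall_init(struct stall_state *s, unsigned long now)
{
	s->alloc_start = now;
	s->stall_timeout = 10 * SIM_HZ;	/* assumption: first warning after 10s */
	s->warnings = 0;
}

/* Mirrors the quoted check: warn once, then push the deadline out by 10s. */
static int stall_check(struct stall_state *s, unsigned long now)
{
	if (!sim_time_after(now, s->alloc_start + s->stall_timeout))
		return 0;

	printf("allocation stalls for %lums\n",
	       (now - s->alloc_start) * 1000UL / SIM_HZ);
	s->warnings++;
	s->stall_timeout += 10 * SIM_HZ;
	return 1;
}

/* Poll one stalled allocation every poll_every ticks and count warnings. */
unsigned int stall_count_warnings(unsigned long start,
				  unsigned long total_ticks,
				  unsigned long poll_every)
{
	struct stall_state s;
	unsigned long elapsed;

	assert(poll_every > 0);
	stall_init(&s, start);
	for (elapsed = 0; elapsed <= total_ticks; elapsed += poll_every)
		stall_check(&s, start + elapsed);
	return s.warnings;
}
```

Under these assumptions a single allocation that stalls for 35 seconds, checked
every 100 ticks, reports three times (just past 10s, 20s and 30s), so it is not
only the first crossing that gets reported. A tracepoint placed in the same
branch would fire at the same points.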

Balbir Singh.
