Re: [Intel-gfx] [PATCH 2/3] mm, notifier: Catch sleeping/blocking for !blockable

From: Tvrtko Ursulin
Date: Fri Nov 23 2018 - 08:23:54 EST



On 23/11/2018 13:12, Daniel Vetter wrote:
On Fri, Nov 23, 2018 at 1:46 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:

On Fri 23-11-18 13:38:38, Daniel Vetter wrote:
On Fri, Nov 23, 2018 at 12:12:37PM +0100, Michal Hocko wrote:
On Thu 22-11-18 17:51:05, Daniel Vetter wrote:
We need to make sure implementations don't cheat and don't have a
possible schedule/blocking point deeply buried where review can't
catch it.

I'm not sure whether this is the best way to make sure all the
might_sleep() callsites trigger, and it's a bit ugly in the code flow.
But it gets the job done.

Yeah, it is quite ugly. Especially because it makes DEBUG config
behavior much different. So is this really worth it? Has this already
discovered any existing bug?

Given that we need an oom trigger to hit this we're not hitting this in CI
(oom is just way too unpredictable to even try). I'd kinda like to also add
some debug interface so I can provoke an oom kill of a specially prepared
process, to make sure we can reliably exercise this path without killing
the kernel accidentally. We do similar tricks for our shrinker already.

Create a task with oom_score_adj = 1000 and trigger the oom killer via
sysrq and you should get a predictable oom invocation and execution.

Ah right. We kinda do that already in an attempt to get the tests
killed without the runner, for accidental oom. Just didn't think about
this in the context of intentionally firing the oom. I'll try whether
I can bake up some new subtest in our userptr/mmu-notifier testcases.

Very handy trick - I think I will look at applying it in the shrinker area as well.
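
For reference, a rough sketch of how the trick described above could look from userspace. This is only an illustration under assumptions, not what any existing test does: the helper names (mark_as_oom_victim, trigger_oom_killer, main) are made up here, it needs root, and it relies only on the standard /proc/<pid>/oom_score_adj and /proc/sysrq-trigger files. A forked child volunteers as the victim so the parent (the test runner) should survive.

```python
import os
import signal

OOM_SCORE_ADJ_MAX = 1000  # highest value accepted by oom_score_adj


def mark_as_oom_victim(pid="self", proc_root="/proc"):
    """Make the given task the preferred oom victim (hypothetical helper)."""
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(str(OOM_SCORE_ADJ_MAX))
    return path


def trigger_oom_killer(sysrq_path="/proc/sysrq-trigger"):
    """Ask the kernel to run the oom killer via sysrq 'f' (hypothetical helper)."""
    with open(sysrq_path, "w") as f:
        f.write("f")


def main():
    # Pipe used only so the parent knows the child has raised its score.
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: volunteer as the victim, then wait to be killed.
        os.close(r)
        mark_as_oom_victim()
        os.write(w, b"x")
        signal.pause()
        os._exit(0)
    # Parent: wait for the child to be ready, then fire the oom killer.
    os.close(w)
    os.read(r, 1)
    trigger_oom_killer()
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        print("child terminated by signal", os.WTERMSIG(status))
    else:
        print("child exited without being killed")


if __name__ == "__main__":
    main()
```

A real subtest would additionally have the child set up the userptr/mmu-notifier state of interest before signalling readiness; that part is left out here.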

Regards,

Tvrtko