Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

From: Andrey Grodzovsky
Date: Mon Apr 30 2018 - 13:18:49 EST




On 04/30/2018 12:25 PM, Eric W. Biederman wrote:
Christian KÃnig <ckoenig.leichtzumerken@xxxxxxxxx> writes:

Hi Eric,

sorry for the late response, was on vacation last week.

Am 26.04.2018 um 02:01 schrieb Eric W. Biederman:
Andrey Grodzovsky <Andrey.Grodzovsky@xxxxxxx> writes:

On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
On 04/25, Andrey Grodzovsky wrote:
here (drm_sched_entity_fini) is also a bad idea, but we still want to be
able to exit immediately
and not wait for GPU jobs completion when the reason for reaching this code
is because of KILL
signal to the user process who opened the device file.
Can you hook f_op->flush method?
THANKS! That sounds like a really good idea to me and we haven't investigated
into that direction yet.
For the backwards compatibility concerns you cite below the flush method
seems a much better place to introduce the wait. You at least really
will be in a process context for that. Still might be in exit but at
least you will be legitimately be in a process.

But this one is called for each task releasing a reference to the the file, so
not sure I see how this solves the problem.
The big question is why do you need to wait during the final closing a
file?
As always it's because of historical reasons. Initially user space pushed
commands directly to a hardware queue and when a processes finished we didn't
need to wait for anything.

Then the GPU scheduler was introduced which delayed pushing the jobs to the
hardware queue to a later point in time.

This wait was then added to maintain backward compability and not break
userspace (but see below).
That make sense.

The wait can be terminated so the wait does not appear to be simply a
matter of correctness.
Well when the process is killed we don't care about correctness any more, we
just want to get rid of it as quickly as possible (OOM situation etc...).

But it is perfectly possible that a process submits some render commands and
then calls exit() or terminates because of a SIGTERM, SIGINT etc.. In this case
we need to wait here to make sure that all rendering is pushed to the hardware
because the scheduler might need resources/settings from the file
descriptor.

For example if you just remove that wait you could close firefox and get garbage
on the screen for a millisecond because the remaining rendering commands where
not executed.

So what we essentially need is to distinct between a SIGKILL (which means stop
processing as soon as possible) and any other reason because then we don't want
to annoy the user with garbage on the screen (even if it's just for a few
milliseconds).
I see a couple of issues.

- Running the code in release rather than in flush.

Using flush will catch every close so it should be more backwards
compatible. f_op->flush always runs in process context so looking at
current makes sense.

- Distinguishing between death by SIGKILL and other process exit deaths.

In f_op->flush the code can test "((tsk->flags & PF_EXITING) &&
(tsk->code == SIGKILL))" to see if it was SIGKILL that terminated
the process.

What about Oleg's note not to rely on tsk->code == SIGKILL (still not clear why ?)
and that this entire check is racy (against what ?) ? Or is it relevant to .release hook
only ?

Andrey


- Dealing with stuck queues (where this patchset came in).

For stuck queues you are going to need a timeout instead of the current
indefinite wait after PF_EXITING is set. From what you have described a
few milliseconds should be enough. If PF_EXITING is not set you can
still just make the wait killable and skip the timeout if that will give
a better backwards compatible user experience.

What can't be done is try and catch SIGKILL after a process has called
do_exit. A dead process is a dead process.

Eric