Re: [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case

From: Olivier Langlois
Date: Sat Jan 15 2022 - 14:24:37 EST


On Fri, 2022-01-14 at 18:12 -0600, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
> > On Tue, Jan 11, 2022 at 10:51 AM Eric W. Biederman
> > <ebiederm@xxxxxxxxxxxx> wrote:
> > >
> > > +       while ((n == -ERESTARTSYS) && test_thread_flag(TIF_NOTIFY_SIGNAL)) {
> > > +               tracehook_notify_signal();
> > > +               n = __kernel_write(file, addr, nr, &pos);
> > > +       }
> >
> > This reads horribly wrongly to me.
> >
> > That "tracehook_notify_signal()" thing *has* to be renamed before we
> > have anything like this that otherwise looks like "this will just
> > loop forever".
> >
> > I'm pretty sure we've discussed that "tracehook" thing before - the
> > whole header file is misnamed, and most of the functions in there
> > are too.
> >
> > As an ugly alternative, open-code it, so that it's clear that "yup,
> > that clears the TIF_NOTIFY_SIGNAL flag".
>
> A cleaner alternative looks to be modifying the pipe code to use
> wake_up_XXX instead of wake_up_interruptible_XXX and then having code
> that does pipe_write_killable instead of pipe_write_interruptible.

Do not forget that the problem might not be limited to the pipe FS as
Oleg Nesterov pointed out here:

https://lore.kernel.org/io-uring/20210614141032.GA13677@xxxxxxxxxx/

This is why I liked your patch fixing __dump_emit. If the only
problem is the unclear name of tracehook_notify_signal(), then the
name should be fixed rather than solving the problem in a different
way.
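
To illustrate why the retry loop quoted above does not spin forever, here
is a minimal userspace model, not kernel code. Every model_* name is
hypothetical. It assumes the write keeps failing with -ERESTARTSYS while
the flag standing in for TIF_NOTIFY_SIGNAL is set, and that the notify
helper clears that flag before running pending work (which may raise it
again a bounded number of times).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric value as the kernel's ERESTARTSYS; redefined for userspace. */
#define MODEL_ERESTARTSYS 512

/* Hypothetical stand-in for the dumping task's state. */
struct model_task {
	bool notify_signal;    /* models TIF_NOTIFY_SIGNAL */
	int  pending_notifies; /* times pending work re-raises the flag */
	int  writes_attempted; /* bookkeeping for the example only */
};

/* Models the write: it is interrupted for as long as the flag is set. */
static long model_write(struct model_task *t, size_t nr)
{
	t->writes_attempted++;
	if (t->notify_signal)
		return -MODEL_ERESTARTSYS;
	return (long)nr;
}

/*
 * Open-coded model of the notify step: clear the flag first, then run
 * pending work, which in this model may set the flag again.
 */
static void model_clear_notify_signal(struct model_task *t)
{
	assert(t->notify_signal);
	t->notify_signal = false;
	if (t->pending_notifies > 0) {
		t->pending_notifies--;
		t->notify_signal = true;
	}
}

/* Same shape as the quoted loop, with the flag handling made explicit. */
static long model_dump_emit(struct model_task *t, size_t nr)
{
	long n = model_write(t, nr);

	while (n == -MODEL_ERESTARTSYS && t->notify_signal) {
		model_clear_notify_signal(t);
		n = model_write(t, nr);
	}
	return n;
}
```

With notify_signal set and pending_notifies == 2, model_dump_emit()
retries until the flag stays clear and then returns the byte count. The
loop only ends because the notify step clears the flag, which is exactly
what the current function name fails to convey.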
>
> There is also a question of how all of this should interact with the
> freezer, as I think changing from interruptible to killable means
> that the coredumps become unfreezable.
>
> I am busily simmering this on my back burner and I hope I can come up
> with something sensible.

IMHO, fixing the problem on the emit function side has the merit of
being future-proof if something other than io_uring starts raising
the TIF_NOTIFY_SIGNAL flag in the future.

But I am wondering why no one has commented on my proposal of
cancelling io_uring before generating the core dump, thereby stopping
it from flipping TIF_NOTIFY_SIGNAL while the core dump is generated.

Is there something wrong with my proposed approach?
https://lore.kernel.org/lkml/cover.1629655338.git.olivier@xxxxxxxxxxxxxx/

It has flawlessly created many dozens of io_uring app core dumps for
me over the last few months...

Olivier