Re: perf_event_open+clone = unkillable process

From: Eric W. Biederman
Date: Mon Feb 04 2019 - 22:01:10 EST


Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:

> On Mon, 4 Feb 2019, Dmitry Vyukov wrote:
>
>> On Mon, Feb 4, 2019 at 10:27 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> >
>> > On Fri, 1 Feb 2019, Dmitry Vyukov wrote:
>> >
>> > > On Fri, Feb 1, 2019 at 5:48 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> > > >
>> > > > Hello,
>> > > >
>> > > > The following program creates an unkillable process that eats CPU.
>> > > > /proc/pid/stack is empty, I am not sure what other info I can provide.
>> > > >
> >> > > > Tested on upstream commit 4aa9fc2a435abe95a1e8d7f8c7b3d6356514b37a.
>> > > > Config is attached.
>> > >
>> > > Looking through other reproducers that create unkillable processes, I
>> > > think I found a much simpler reproducer (below). It's single threaded
> >> > > and just sets up a SIGBUS handler and does timer_create+timer_settime to
>> > > send repeated SIGBUS. The resulting process can't be killed with
>> > > SIGKILL.
>> > > +Thomas for timers.
>> >
>> > +Oleg, Eric
>> >
>> > That's odd. With some tracing I can see that SIGKILL is generated and
> >> > queued, but it's not delivered for some weird reason. I'm traveling for the
> >> > next few days, so I won't be able to do much about it. Will look later this
>> > week.
>>
>> Just a random thought looking at the repro: can constant SIGBUS
>> delivery starve delivery of all other signals (incl SIGKILL)?
>
> Indeed. SIGBUS is 7, SIGKILL is 9 and next_signal() delivers the lowest
> number first....

We do have the special case in complete_signal that causes most of the
signal delivery work of SIGKILL to happen when SIGKILL is queued.

I need to look at your reproducer. The SIGBUS would have to be a
per-thread signal to cause problems in next_signal.
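
To make the ordering concern concrete, here is a rough userspace model of
"lowest number first" selection from a pending bitmask. It is only a
sketch, not the kernel's next_signal(): the helper names and the MODEL_*
constants are made up for illustration, the signal numbers are the ones
quoted above (SIGBUS 7, SIGKILL 9), and it ignores blocked masks and the
split between per-thread and shared pending sets.

```c
#include <strings.h>   /* ffs() */

/* Signal numbers as quoted in this thread; hypothetical names so the
 * sketch does not depend on the platform's <signal.h> values. */
enum { MODEL_SIGBUS = 7, MODEL_SIGKILL = 9 };

/* Hypothetical helper: bit (sig - 1) of the mask represents signal sig. */
static unsigned int sig_bit(int sig)
{
    return 1u << (sig - 1);
}

/* Hypothetical helper, not kernel code: return the lowest-numbered
 * signal set in the pending mask, or 0 if nothing is pending.
 * ffs() returns the 1-based index of the least significant set bit,
 * which matches the signal number under the encoding above. */
static int pick_lowest_pending(unsigned int pending)
{
    return ffs((int)pending);
}
```

With both bits set, pick_lowest_pending() returns 7, so in this model a
SIGBUS that is re-queued every time it is handled would always be picked
ahead of a pending SIGKILL.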

It is definitely worth fixing if there is any way for userspace to block
SIGKILL.

Eric