Re: [BUG] 2.4.x RT signal leak with kupdated (and maybe others)

From: Benjamin Herrenschmidt
Date: Tue Sep 30 2003 - 12:49:58 EST


> When I wrote the kupdate code, only the real time signals could be
> queued. Now things have changed to carry the siginfo for non-RT too. The
> fact we clear the pending by hand is what allows more than a RT signal
> to be stacked, we shouldn't clear the bitflag unless we dequeue the
> signal too. That's definitely a bug (though a minor one ;)

"Minor" but leads to interesting results in the end when coupled
with something like noflushd that regulary send those signals ;)

Not only we leak them, but we also get nr_queued_signals reaching
nr_max_signals. This has the side effect of making do_notify_parent()
silently fail when a pthread is dead (libpthread use an RT signal).

The end result is that after a few days, a machine running noflushd
and thread intensive apps like evolution and gkrellm will have dozens
(or even hundreds) of zombies as the child threads are never reclaimed
by libpthread "manager" thread since it never gets the signal...

> I sure agree it should be fixed with a dequeue_signal in kupdate.

I'll cook something tomorrow (I'm away from any 2.4 machine tonight).

> BTW, things like this in daemonize don't protect against allocating
> signals (the kernel deamon should flush_signals once in a while in the
> main loop to do that):
>
> /* Block and flush all signals */
> sigfillset(&blocked);
> sigprocmask(SIG_BLOCK, &blocked, NULL);
> flush_signals(current);
>
> But it's not a big problem you shouldn't send signals to daemons
> anyways, only kupdate gives a semantic to signals.

Interesting... though hopefully, I didn't see anybody else causing
such a constant increase of nr_queued_signals so far on this laptop...

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/