Re: [RESEND 2] [PATCH] rlimits: Print more information when limits are exceeded

From: Arun Raghavan
Date: Sat Feb 18 2017 - 10:47:17 EST




On Sat, 18 Feb 2017, at 02:07 PM, Arun Raghavan wrote:
> This dumps some information in logs when a process exceeds its CPU or RT
> limits (soft and hard). Makes debugging easier when userspace triggers
> these limits.
>
> Signed-off-by: Arun Raghavan <arun@xxxxxxxxxxxxxxxx>
> ---
> kernel/time/posix-cpu-timers.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> Hello,
> This has come up a couple of times in the past, but we haven't been
> able to resolve whatever issues were pointed out.
>
> In the meantime, we have frustrated users who don't know where they're
> getting a SIGKILL from, and I'd really like to have a way for people to
> not have to go through this.
>
> The issues that came up the last time were:
>
> 1. SIGXCPU messages shouldn't be needed since they can be caught: it's
>    still useful to have the log because it isn't always possible to pin
>    down the thread causing the problem in userspace.
>
> 2. SIGKILL logging should be centralised: there seem to be multiple
>    paths that trigger a SIGKILL -- and it seemed a bit ugly to try to
>    add a reason parameter on all of them for the KILL case. Any other
>    suggestions on how to deal with this?
>
> I'm happy to fix this up to actually make it this time, but if there
> aren't none, just pushing this out will make our lives a little less
> painful.

That was meant to read -- "... if there aren't blocking objections to
this, just pushing this out will make our lives a little less painful."

-- Arun