Re: [PATCH v2] hung_task : check the value of "sysctl_hung_task_timeout_sec"

From: Satoru Takeuchi
Date: Fri Mar 28 2014 - 07:56:43 EST


Hi Liu,

At Wed, 26 Mar 2014 15:56:58 +0800,
Liu hua wrote:
>
> 于 2014/3/26 0:25, Satoru Takeuchi 写道:
> > At Tue, 25 Mar 2014 16:58:58 +0800,
> > Liu hua wrote:
> >>
> >> 于 2014/3/24 4:50, Satoru Takeuchi 写道:
> >>> At Sun, 23 Mar 2014 15:54:04 +0800,
> >>> Liu Hua wrote:
> >>>>
> >>>> As sysctl_hung_task_timeout_sec is unsigned long, when this value is
> >>>> larger then LONG_MAX/HZ, the function schedule_timeout_interruptible in
> >>>> watchdog will return immediately without sleep and with print :
> >>>>
> >>>> [ 205.452934] schedule_timeout: wrong timeout value ffffffffffffff83
> >>>>
> >>>> and then the funtion watchdog will call schedule_timeout_interruptible again
> >>>> and again. The screen will be filled with
> >>>> "schedule_timeout: wrong timeout value ffffffffffffff83"
> >>>>
> >>>> This patch does some check and correction in timeout_jiffies, to let the
> >>>> function schedule_timeout_interruptible allways get the valid parameter.
> >>>>
> >>>> Cc: <stable@xxxxxxxxxxxxxxx>
> >>>> Signed-off-by: Liu Hua <sdu.liu@xxxxxxxxxx>
> >>>> ---
> >>>> kernel/hung_task.c | 8 ++++++--
> >>>> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> >>>> index 6df6149..f992286 100644
> >>>> --- a/kernel/hung_task.c
> >>>> +++ b/kernel/hung_task.c
> >>>> @@ -174,8 +174,12 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >>>>
> >>>> static unsigned long timeout_jiffies(unsigned long timeout)
> >>>> {
> >>>> - /* timeout of 0 will disable the watchdog */
> >>>> - return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
> >>>> + /* timeout of 0 or >= LONG_MAX/HZ will disable the watchdog */
> >>>> + if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT))
> >>>
> >>> You should check whether sysctl_hung_task_timeout_sec > MAX_SCHEDULE_TIMEOUT/HZ
> >>> or not when setting this parameter instead. Then this check ins't necessary here.
> >>>
> >>> # Just FYI, MAX_SCHEDULE_TIMEOUT should be MAX_SCHEDULE_TIMEOUT/HZ here.
> >>>
> >>> Thanks,
> >>> Satoru
> >>
> >> Yes, how about this :
> >
> > I confirmed the followings.
> >
> > - 3.14-rc8: system hunged up with "hung_task_timeout_secs > LONG_MAX/HZ".
> > - 3.14-rc8 with your patch: works fine. I can't set the above mentioned value any more.
> >
> > Writing possible values (0..LONG_MAX/HZ) in Documentation/sysctl/kernel.txt
> > make this patch better.
> >
> > Thanks,
> > Satoru
>
> Thanks to you attention and suggestion. I remade this patch as following.
> Is it appropriate to be reposted with tag "PATCH v3"
>
> Subject: [PATCH v3] hung_task : check the value of "sysctl_hung_task_timeout_sec"

It looks good to me.

Thanks,
Satoru

>
> As sysctl_hung_task_timeout_sec is unsigned long, when this value is
> larger then LONG_MAX/HZ, the function schedule_timeout_interruptible in
> watchdog will return immediately without sleep and with print :
>
> [ 205.452934] schedule_timeout: wrong timeout value ffffffffffffff83
>
> and then the funtion watchdog will call schedule_timeout_interruptible
> again and again. The screen will be filled with
> "schedule_timeout: wrong timeout value ffffffffffffff83"
>
> This patch does some check and correction in sysctl, to let the
> function schedule_timeout_interruptible allways get the valid parameter.
>
> Signed-off-by: Liu Hua <sdu.liu@xxxxxxxxxx>
> Tested-by: Satoru Takeuchi <satoru.takeuchi@xxxxxxxxx>
> ---
> Documentation/sysctl/kernel.txt | 1 +
> kernel/sysctl.c | 6 ++++++
> 2 files changed, 7 insertions(+)
>
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index e55124e..855d9b3 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -317,6 +317,7 @@ for more than this value report a warning.
> This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
>
> 0: means infinite timeout - no checking done.
> +Possible values to set are in range {0..LONG_MAX/HZ}.
>
> ==============================================================
>
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 49e13e1..aae21e8 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -144,6 +144,11 @@ static int min_percpu_pagelist_fract = 8;
> static int ngroups_max = NGROUPS_MAX;
> static const int cap_last_cap = CAP_LAST_CAP;
>
> +/*this is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs */
> +#ifdef CONFIG_DETECT_HUNG_TASK
> +static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
> +#endif
> +
> #ifdef CONFIG_INOTIFY_USER
> #include <linux/inotify.h>
> #endif
> @@ -995,6 +1000,7 @@ static struct ctl_table kern_table[] = {
> .maxlen = sizeof(unsigned long),
> .mode = 0644,
> .proc_handler = proc_dohung_task_timeout_secs,
> + .extra2 = &hung_task_timeout_max,
> },
> {
> .procname = "hung_task_warnings",
> --
> 1.9.0
>
> Thanks,
> Liu Hua
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/