Re: + itimers-fix-itimer-many-thread-hang.patch added to -mm tree

From: Frank Mayhar
Date: Mon Sep 15 2008 - 13:51:41 EST


On Sun, 2008-09-14 at 21:50 +0400, Oleg Nesterov wrote:
> s/mm-commits/lkml/
>
> (Frank, please don't forget to CC me ;)

Did I forget last time? Oops.

> Really minor nit. Suppose that task_cputime_zero(tsk) == T and
> task_cputime_zero(sig) == F. In that case the task_cputime_expired(&task_sample)
> check is not needed; perhaps it makes sense to reformat this function a bit:
>
> static inline int fastpath_timer_check(struct task_struct *tsk,
> 					struct signal_struct *sig)
> {
> 	if (!task_cputime_zero(&tsk->cputime_expires)) {
> 		struct task_cputime task_sample = {
> 			.utime = tsk->utime,
> 			.stime = tsk->stime,
> 			.sum_exec_runtime = tsk->se.sum_exec_runtime
> 		};
>
> 		if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
> 			return 1;
> 	}
>
> 	if (!task_cputime_zero(&sig->cputime_expires)) {
> 		struct task_cputime group_sample;
>
> 		thread_group_cputime(tsk, &group_sample);
> 		if (task_cputime_expired(&group_sample, &sig->cputime_expires))
> 			return 1;
> 	}
>
> 	return 0;
> }
>
> This way it also looks more symmetrical.

Yeah, I like it better this way myself. Also...

> I'd suggest moving the "if (!sig)" check into fastpath_timer_check();
> run_posix_cpu_timers() doesn't use sig, but this is a matter of taste.

...I agree with this. It looks better and removes the sig dereference
from run_posix_cpu_timers() where it is otherwise unused.
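
Concretely, I'm planning something along these lines. This is just a
sketch against my current tree, not the final diff; the point is that the
sig lookup and the NULL check both move inside the fast-path helper, so
the caller never touches tsk->signal:

static inline int fastpath_timer_check(struct task_struct *tsk)
{
	struct signal_struct *sig = tsk->signal;

	/* Per-task check, exactly as in your version above. */
	if (!task_cputime_zero(&tsk->cputime_expires)) {
		struct task_cputime task_sample = {
			.utime = tsk->utime,
			.stime = tsk->stime,
			.sum_exec_runtime = tsk->se.sum_exec_runtime
		};

		if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
			return 1;
	}

	/* The "if (!sig)" check now lives here instead of in the caller. */
	if (sig && !task_cputime_zero(&sig->cputime_expires)) {
		struct task_cputime group_sample;

		thread_group_cputime(tsk, &group_sample);
		if (task_cputime_expired(&group_sample, &sig->cputime_expires))
			return 1;
	}

	return 0;
}

With that, the fast path in run_posix_cpu_timers() is simply:

	if (!fastpath_timer_check(tsk))
		return;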

> This is a bit misleading: lock_task_sighand() can't fail here, or we have a bug.
> We already checked ->signal != NULL, and the task is current, so we can use
> spin_lock(&tsk->sighand->siglock).
>
> To clarify: if lock_task_sighand() could fail, fastpath_timer_check()
> would not be safe. So I'd suggest the patch below:

Okay, I get it. (This actually matches an earlier iteration of the code,
but at the time I wasn't sure enough of my understanding to depend on
lock_task_sighand() not failing. Things have changed enough by now,
though, that it makes sense again.)
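
So the slow path will just take the lock directly, roughly like this
(again a sketch, not the final diff):

	/*
	 * The fast path has already seen tsk->signal != NULL, and tsk is
	 * current, so the sighand cannot go away underneath us here.
	 */
	spin_lock(&tsk->sighand->siglock);
	/* ... walk the timer lists and collect expired timers ... */
	spin_unlock(&tsk->sighand->siglock);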

> > +unsigned long long thread_group_sched_runtime(struct task_struct *p)
> > +{
> > +	unsigned long flags;
> > +	u64 ns;
> > +	struct rq *rq;
> > +	struct task_cputime totals;
> > +
> > +	rq = task_rq_lock(p, &flags);
> > +	thread_group_cputime(p, &totals);
> > +	ns = totals.sum_exec_runtime + task_delta_exec(p, rq);
> > 	task_rq_unlock(rq, &flags);
>
> Hmm. This is used by cpu_clock_sample_group_locked(), which has already
> called thread_group_cputime(). Yes, without task_rq_lock(), but
> thread_group_cputime() is not "atomic" anyway. And please note that
> thread_group_sched_runtime() is not really "group": we don't account
> task_delta_exec() for the other threads. Perhaps we can kill this function?
> Or, at least, perhaps we can simplify this:

Deferring to your superior knowledge, I've made the suggested changes.
My intent had been to retain the original structure of the code, but as
you say, that code is now redundant.
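
(For the record, the effect is that the CPUCLOCK_SCHED case just reads
the totals the function already gathered via thread_group_cputime().
Something like the sketch below, using the variable names from my
current tree and assuming a task_delta_exec() variant that takes the rq
lock itself:

	case CPUCLOCK_SCHED:
		/* cputime came from the thread_group_cputime() call above. */
		cpu->sched = cputime.sum_exec_runtime + task_delta_exec(p);
		break;

so thread_group_sched_runtime() itself can simply go away.)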

> BTW, with or without this patch cpu_clock_sample_group() doesn't need
> ->siglock, afaics.

Fixed. I removed cpu_clock_sample_group_locked() entirely and moved the
guts of it to cpu_clock_sample_group().
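
The merged function ends up looking roughly like this; again a sketch
from my work-in-progress tree, so take the details with a grain of salt:

static int cpu_clock_sample_group(const clockid_t which_clock,
				  struct task_struct *p,
				  union cpu_time_count *cpu)
{
	struct task_cputime cputime;

	/* No ->siglock needed; thread_group_cputime() does its own summing. */
	thread_group_cputime(p, &cputime);
	switch (CPUCLOCK_WHICH(which_clock)) {
	default:
		return -EINVAL;
	case CPUCLOCK_PROF:
		cpu->cpu = cputime_add(cputime.utime, cputime.stime);
		break;
	case CPUCLOCK_VIRT:
		cpu->cpu = cputime.utime;
		break;
	case CPUCLOCK_SCHED:
		cpu->sched = cputime.sum_exec_runtime + task_delta_exec(p);
		break;
	}
	return 0;
}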

I have a few more things to do; expect a new iteration of the patch
tonight or tomorrow.
--
Frank Mayhar <fmayhar@xxxxxxxxxx>
Google, Inc.
