Re: [PATCH 2.6.27-rc5] Fix itimer/many thread hang.

From: Frank Mayhar
Date: Tue Sep 09 2008 - 16:31:09 EST


On Tue, 2008-09-09 at 20:01 +0400, Oleg Nesterov wrote:
> The patch has a lot of
>
> rcu_read_lock();
> sig = rcu_dereference(tsk->signal);
>
> This is bogus, task_struct->signal is not protected by RCU.

Roland suggested this approach; I just want to snapshot the field
atomically. I guess I can just do an atomic read...

> However, at first glance the code (this and other funcs) looks correct...
> Either tsk == current, or the code runs under ->siglock. Or we know that
> ->signal can't go away (wait_task_zombie).
>
> As for this particular function, it seems to me that ->signal == NULL
> is not possible, no?

That's not completely clear to me. I'm allowing for the possibility
that it might be called during, say, process teardown. It's used in so
many places that I'm uncomfortable leaving the == NULL check out.

> Please remove the false RCU stuff.

OK.

> Btw, this function has a lot of callers, perhaps it is better to
> uninline it.

If that's the consensus I'll do so. I assumed that speed was more
important than space in this case. Am I mistaken?

> > static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
> > {
> > struct signal_struct *sig;
> > int ret;
> >
> > if (clone_flags & CLONE_THREAD) {
> > - atomic_inc(&current->signal->count);
> > - atomic_inc(&current->signal->live);
> > - return 0;
> > + ret = thread_group_cputime_clone_thread(current, tsk);
> > + if (likely(!ret)) {
> > + atomic_inc(&current->signal->count);
> > + atomic_inc(&current->signal->live);
> > + }
>
> So, the first CLONE_THREAD creates ->cputime.totals. After that
> thread_group_cputime_account_xxx() start to use it even if the task
> doesn't have the attached cpu timers.
>
> Stupid question: can't we allocate .totals in posix_cpu_timer_create() /
> set_process_cpu_timer() ?

That was the original plan but we (that is, Roland and I) decided to
eliminate the separate storage for the dead-threads totals. It's now
all kept in the totals field, for the whole thread group. Note that
this doesn't affect single-threaded processes at all. Multithreaded
processes, though, now keep their times in the new field, eliminating
the old dead-thread fields from the signal structure, reducing the bloat
this patch generates by just a bit.

> Let's suppose the task doesn't have cpu timers. Currently, in this case
> run_posix_cpu_timers() quickly checks UNEXPIRED() and returns. With this
> patch we call fastpath_timer_check(). The first task_cputime_expired()
> returns 0, so we are doing thread_group_cputime()->for_each_possible_cpu().
>
> Not good, this code runs every timer tick. Perhaps it makes sense
> to add a fastpath check.

After looking at the code again, you're right. I paid so much attention
to fast-pathing the unexpired timer case that I forgot to handle the
_no_-timer case. I've added code to fastpath_timer_check() to handle
that case.

I'll be addressing the rest of your comments (particularly the RCU
stuff) and producing a new patch fairly shortly.
--
Frank Mayhar <fmayhar@xxxxxxxxxx>
Google, Inc.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/