Re: [RFC][PATCH 09/10] taskstats: Fix exit CPU time accounting

From: Balbir Singh
Date: Tue Sep 28 2010 - 04:36:25 EST


* Martin Schwidefsky <schwidefsky@xxxxxxxxxx> [2010-09-27 15:42:57]:

> On Sun, 26 Sep 2010 20:11:27 +0200
> Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> > Hi,
> >
> > On 09/24, Michael Holzheu wrote:
> > >
> > > On Thu, 2010-09-23 at 19:10 +0200, Oleg Nesterov wrote:
> > > >
> > > > On 09/23, Michael Holzheu wrote:
> > > > >
> > > > > > Currently there are code paths (e.g. for kthreads) where the consumed
> > > > > > CPU time is not accounted to the parent's cumulative counters.
> > > >
> > > > Could you explain more?
> > >
> > > I think one place was "khelper" (kmod.c). It is created with
> > > kernel_thread() and it exits without having accounted the times with
> > > sys_wait() to the parent's ctimes.
> >
> > No. Well yes, it is not accounted, but this is not because it is
> > kthread.
>
> We noticed that behavior with kernel threads but as you point out
> the problem is bigger than that.
>
> > To simplify the discussion, let's talk about utime/cutime only,
> > and let's forget about the multithreading.
> >
> > It is very simple, currently Linux accounts the exiting task's
> > utime and adds it to ->cutime _only_ if parent does do_wait().
> > If parent ignores SIGCHLD, the child reaps itself and it is not
> > accounted.
> >
> > I do not know why it was done this way, but I'm afraid we can't
> > change this historical behaviour.
>
> Why? I would consider it to be a BUG() that the time is not accounted.
> Independent of the fact that a parent wants to see the SIGCHLD and
> the exit status of its child the process time of the child should be
> accounted, no? And I'm not a particular fan of the "this has always
> been that way" reasoning.
>
> > > Ok, the problem is that I did not consider exiting threads that are not
> > > thread group leaders. When they exit the ctime of the parent is not
> > > updated. Instead the time is accumulated in the signal struct.
> >
> > I think I am a bit confused, but see above. With or without threads
> > the whole process can exit without accounting.
>
> Got the part about self-reaping processes. But there is another issue:
> consider an exiting thread where the group leader is still active.
> The time for the thread will be added to the utime/stime fields in
> the signal structure. Taskstats will happily ignore that time while
> the group leader is still running.
>

Why do you say that? I'm not sure your comment is very clear. In
fill_tgid, we do

1. Accumulate signal stats (contains stats for dead threads)
2. Accumulate stats for current threads

fill_tgid_exit does something similar.
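
To make the two steps concrete, here is a rough userspace model of
that accumulation. It is only a sketch and not the code in
kernel/taskstats.c: the struct names, field names and the function
model_fill_tgid() are all made up for illustration, and it assumes
the per-group totals for exited threads are already available in one
place, which is the role the signal struct plays above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration, not the kernel's structures. */
struct model_thread {
	uint64_t utime;
	uint64_t stime;
};

struct model_tgroup {
	/* Totals folded in from threads that have already exited. */
	uint64_t dead_utime;
	uint64_t dead_stime;
	/* Threads of the group that are still running. */
	const struct model_thread *live;
	size_t nr_live;
};

struct model_totals {
	uint64_t utime;
	uint64_t stime;
};

/* Sum the CPU time of a thread group: dead threads first, then live ones. */
static struct model_totals model_fill_tgid(const struct model_tgroup *tg)
{
	struct model_totals sum;
	size_t i;

	assert(tg != NULL);
	assert(tg->nr_live == 0 || tg->live != NULL);

	/* Step 1: start from the stats kept for dead threads. */
	sum.utime = tg->dead_utime;
	sum.stime = tg->dead_stime;

	/* Step 2: add the stats of every thread that is still alive. */
	for (i = 0; i < tg->nr_live; i++) {
		sum.utime += tg->live[i].utime;
		sum.stime += tg->live[i].stime;
	}

	return sum;
}
```

In this model the time of an exited non-leader thread still shows up
in the group total while the leader is running, because step 1 picks
it up from the dead-thread totals.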

> Please keep in mind that we want to get to a point where it is
> possible to get a 100% coverage of cpu cycles in the last snapshot
> cycle through the taskstats interface. Otherwise the precise top
> would not be very precise ..

--
Three Cheers,
Balbir