Re: I/O statistics per process

From: Andrew Morton
Date: Thu Sep 28 2006 - 15:10:51 EST


On Thu, 28 Sep 2006 11:55:38 -0700
Jay Lan <jlan@xxxxxxxxxxxx> wrote:

> Andrew Morton wrote:
> > On Wed, 27 Sep 2006 23:22:02 +0200
> > "roland" <devzero@xxxxxx> wrote:
> >
> >>thanks. tried to contact redflag, but they don`t answer. maybe support is
> >>being on holiday.... !?
> >>
> >>linux kernel hackers - there is really no standard way to watch i/o metrics
> >>(bytes read/written) at process level?
> >
> > The patch csa-accounting-taskstats-update.patch in current -mm kernels
> > (whcih is planned for 2.6.19) does have per-process chars-read and
> > chars-written accounting ("Extended accounting fields"). That's probably
> > not waht you really want, although it might tell you what you want to know.
> >
> >>it`s extremly hard for the admin to track down, what process is hogging the
> >>disk - especially if there is more than one task consuming cpu.
>
> Rolend,
>
> The per-process chars-read and chars-writeen accounting is made
> available through taskstats interface (see Documentation/accounting/
> taskstats.txt) in 2.6.18-mm1 kernel. Unfortunately, the user-space CSA
> package is still a few months away. You may, for now, write your
> own taskstats application or go a long way to port the in-kernel
> implementation of pagg/job/csa.
>
> However, the "Externded acocunting fields" patch does not provide you
> straight forward answer. The patch provides accounting data only at
> process termination (just like the BSD accounting) and it seems that
> you want to see which run-away application (ie, alive) eating up your
> disk. The taskstats interface offers a query mode (command-response),
> but currently only delayacct uses that mode. We would need to make
> those data available in the query mode in order for application to
> see accounting data of live processes.

ow. That is a rather important enhancement to have.

> >
> > csa-accounting-taskstats-update.patch makes that information available to
> > userspace.
> >
> > But it's approximate, because
> >
> > - it doesn't account for disk readahead
> >
> > - it doesn't account for pagefault-initiated reads (althought it easily
> > could - Jay?)
> >
> > - it overaccounts for a process writing to an already-dirty page.
> >
> > (We could fix this too: nuke the existing stuff and do
> >
> > current->wchar += PAGE_CACHE_SIZE;
> >
> > in __set_page_dirty_[no]buffers().) (But that ends up being wrong if
> > someone truncates the file before it got written)
> >
> > - it doesn't account for file readahead (although it easily could)
> >
> > - it doesn't account for pagefault-initiated readahead (it could)
> >
> >
> > hm. There's actually quite a lot we could do here to make these fields
> > more accurate and useful. A lot of this depends on what the definition of
> > these fields _is_. Is is just for disk IO? Is it supposed to include
> > console IO, or what?

I'd be interested in your opinions on all the above, please.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/