Re: I/O statistics per process

From: Andrew Morton
Date: Wed Sep 27 2006 - 18:56:23 EST

On Wed, 27 Sep 2006 23:22:02 +0200
"roland" <devzero@xxxxxx> wrote:

> thanks. tried to contact redflag, but they don`t answer. maybe support is
> being on holiday.... !?
> linux kernel hackers - there is really no standard way to watch i/o metrics
> (bytes read/written) at process level?

The patch csa-accounting-taskstats-update.patch in current -mm kernels
(whcih is planned for 2.6.19) does have per-process chars-read and
chars-written accounting ("Extended accounting fields"). That's probably
not waht you really want, although it might tell you what you want to know.

> it`s extremly hard for the admin to track down, what process is hogging the
> disk - especially if there is more than one task consuming cpu.

Sure. Doing this is actually fairly tricky because disk writes are almost
always deferred. We'd need to remember which process dirtied some memory,
then track that info all the way down to the disk IO level, then perform
the accounting operations at IO submit-time or completion time, on a
per-page basis. It isn't rocket-science, but it's a lot of stuff and some

> meanwhile i found blktrace and read into the documenation. looks really cool
> and seems to be very powerful tool - but it it`s seems a little bit
> "oversized" and not the right tool for this. seems to be for
> tracing/debugging/analysis
> what about "with following patch,
> userspace processes/utilities will be able to access per process I/O
> statistics. for example, top like utilites can use this information" which
> has been posted to lkml one year ago ? any update on this ?

csa-accounting-taskstats-update.patch makes that information available to

But it's approximate, because

- it doesn't account for disk readahead

- it doesn't account for pagefault-initiated reads (althought it easily
could - Jay?)

- it overaccounts for a process writing to an already-dirty page.

(We could fix this too: nuke the existing stuff and do

current->wchar += PAGE_CACHE_SIZE;

in __set_page_dirty_[no]buffers().) (But that ends up being wrong if
someone truncates the file before it got written)

- it doesn't account for file readahead (although it easily could)

- it doesn't account for pagefault-initiated readahead (it could)

hm. There's actually quite a lot we could do here to make these fields
more accurate and useful. A lot of this depends on what the definition of
these fields _is_. Is is just for disk IO? Is it supposed to include
console IO, or what?
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at