Re: [PATCH 0/4] block: Per-partition block IO performance histograms

From: Jeff Moyer
Date: Thu Apr 15 2010 - 09:40:23 EST


Divyesh Shah <dpshah@xxxxxxxxxx> writes:

> The following patchset implements per partition 2-d histograms for IO to block
> devices. The 3 types of histograms added are:
>
> 1) request histograms - 2-d histogram of total request time in ms (queueing +
> service) broken down by IO size (in bytes).
> 2) dma histograms - 2-d histogram of total service time in ms broken down by
> IO size (in bytes).
> 3) seek histograms - 1-d histogram of seek distance
>
> All of these histograms are per-partition. The first 2 are further divided into
> separate read and write histograms. The buckets for these histograms are
> configurable via config options as well as at runtime (per-device).
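
[Editor's illustration, not taken from the patchset: a minimal userspace
sketch of the 2-d bucketing described above, where one axis is time in ms
and the other is IO size in bytes. The bucket bounds and the names io_hist,
io_hist_record and io_hist_reset are assumptions made for this example;
the thread does not show the patchset's actual defaults or interfaces.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical inclusive upper bounds; the real defaults are not shown here. */
static const uint64_t time_bounds_ms[] = { 1, 10, 100, 1000 };
static const uint64_t size_bounds_bytes[] = { 4096, 65536, 1048576 };

/* One extra bucket per axis catches values above the last bound. */
#define N_TIME (sizeof(time_bounds_ms) / sizeof(time_bounds_ms[0]) + 1)
#define N_SIZE (sizeof(size_bounds_bytes) / sizeof(size_bounds_bytes[0]) + 1)

/* One such table would exist per partition and per direction (read/write). */
struct io_hist {
	uint64_t count[N_TIME][N_SIZE];
};

/* Return the index of the first bound >= value, or n for the overflow bucket. */
static size_t bucket_index(const uint64_t *bounds, size_t n, uint64_t value)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (value <= bounds[i])
			break;
	return i;
}

/* Account one completed IO by its elapsed time and its size. */
static void io_hist_record(struct io_hist *h, uint64_t time_ms, uint64_t bytes)
{
	size_t t, s;

	assert(h != NULL);
	t = bucket_index(time_bounds_ms, N_TIME - 1, time_ms);
	s = bucket_index(size_bounds_bytes, N_SIZE - 1, bytes);
	h->count[t][s]++;
}

/* Mirrors the "write any value to reset" behaviour described in the cover letter. */
static void io_hist_reset(struct io_hist *h)
{
	memset(h, 0, sizeof(*h));
}
```

[With these assumed bounds, a 4096-byte IO that took 5 ms is counted in
count[1][0]. The request and dma histograms would differ only in which
time interval is passed in as time_ms.]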

Do you also keep track of statistics for the entire device? The I/O
schedulers operate at the device level, not the partition level.

> These histograms have proven very valuable to us over the years to understand
> the seek distribution of IOs over our production machines, detect large
> queueing delays, find latency outliers, etc. by being used as part of an
> always-on monitoring system.
>
> They can be reset by writing any value to them which makes them useful for
> tests and debugging too.
>
> This was initially written by Edward Falk in 2006 and I've forward ported
> and improved it a few times across kernel versions.
>
> He had also sent a very old version of this patchset (minus some features like
> runtime configurable buckets) back then to lkml - see
> http://lkml.indiana.edu/hypermail/linux/kernel/0611.1/2684.html
> Some of the reasons mentioned for not including these patches are given below.
>
> I'm requesting re-consideration for this patchset in light of the following
> arguments.
>
> 1) This can be done with blktrace too, why add another API?
[...]
> This is about 1.8% average throughput loss per thread.
> The extra cpu time spent with blktrace is in addition to this loss of
> throughput. This overhead will only go up on faster SSDs.

I don't see any analysis of the overhead of your patch set. Would you
mind providing those numbers?

Thanks,
Jeff