Re: [PATCH] block: make iolatency avg_lat exponentially decay

From: Johannes Weiner
Date: Tue Jul 31 2018 - 17:19:05 EST


Hi Dennis,

this generally looks good to me. Just two small nitpicks:

On Tue, Jul 31, 2018 at 01:36:47PM -0700, Dennis Zhou wrote:
> @@ -135,6 +135,24 @@ struct iolatency_grp {
> struct child_latency_info child_lat;
> };
>
> +#define BLKIOLATENCY_MIN_WIN_SIZE (100 * NSEC_PER_MSEC)
> +#define BLKIOLATENCY_MAX_WIN_SIZE NSEC_PER_SEC
> +/*
> + * These are the constants used to fake the fixed-point moving average
> + * calculation just like load average. The latency window is bucketed to
> + * try to approximately calculate average latency for the last 1 minute.
> + */
> +#define BLKIOLATENCY_NR_EXP_FACTORS 5
> +#define BLKIOLATENCY_EXP_BUCKET_SIZE (BLKIOLATENCY_MAX_WIN_SIZE / \
> + (BLKIOLATENCY_NR_EXP_FACTORS - 1))
> +static const u64 iolatency_exp_factors[BLKIOLATENCY_NR_EXP_FACTORS] = {
> + 2045, // exp(1/600) - 600 samples
> + 2039, // exp(1/240) - 240 samples
> + 2031, // exp(1/120) - 120 samples
> + 2023, // exp(1/80) - 80 samples
> + 2014, // exp(1/60) - 60 samples

It might be useful to mention FIXED_1 by name in the comment here. The
comment says "fixed-point" and "load average", but since the numbers
are derived directly from that constant, it'd be good to name it, I
think.
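
To illustrate the relationship: if I read the table right, each entry is
FIXED_1 / exp(1/nr_samples), rounded to the nearest integer, with
FIXED_1 = 2048 as in the load average code. Below is a rough userspace
sketch of that derivation. exp_neg() and exp_factor() are hypothetical
helper names, not anything from the patch, and the Taylor series is only
there to keep the sketch free of libm.

```c
#include <assert.h>
#include <stdint.h>

#define FIXED_1 2048	/* 1.0 in 11-bit fixed point, as in the load average code */

/* exp(-x) for small x >= 0 via a truncated Taylor series (avoids libm) */
static double exp_neg(double x)
{
	double term = 1.0, sum = 1.0;
	int i;

	for (i = 1; i < 20; i++) {
		term *= -x / i;
		sum += term;
	}
	return sum;
}

/* hypothetical helper: round(FIXED_1 / exp(1 / nr_samples)) */
static uint64_t exp_factor(unsigned int nr_samples)
{
	assert(nr_samples > 0);
	return (uint64_t)(FIXED_1 * exp_neg(1.0 / nr_samples) + 0.5);
}
```

Calling exp_factor() with 600, 240, 120, 80 and 60 should give back the
five values in the table, which is why naming FIXED_1 next to them would
make the comment self-explanatory.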

> @@ -462,7 +480,7 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
> struct child_latency_info *lat_info;
> struct blk_rq_stat stat;
> unsigned long flags;
> - int cpu;
> + int cpu, exp_idx;
>
> blk_rq_stat_init(&stat);
> preempt_disable();
> @@ -480,11 +498,10 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
>
> lat_info = &parent->child_lat;
>
> - iolat->total_lat_avg =
> - div64_u64((iolat->total_lat_avg * iolat->total_lat_nr) +
> - stat.mean, iolat->total_lat_nr + 1);
> -
> - iolat->total_lat_nr++;
> + exp_idx = min_t(int, BLKIOLATENCY_NR_EXP_FACTORS - 1,
> + iolat->cur_win_nsec / BLKIOLATENCY_EXP_BUCKET_SIZE);
> + CALC_LOAD(iolat->total_lat_avg, iolatency_exp_factors[exp_idx],
> + stat.mean);

The load average keeps the running value in fixed-point representation
to avoid rounding errors. I guess because this is IO time in ns, the
values are so much higher than the FIXED_1 denominator (2048) that
rounding errors are negligible, and we don't need to bother with it.

Can you mention that in a comment, please?
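
For what it's worth, here is a small userspace sketch of why I think the
rounding doesn't matter at this magnitude. decay_avg() mirrors the
CALC_LOAD arithmetic as I understand it, but on a running value that is
not pre-scaled by FIXED_1; decay_avg() and run_avg() are hypothetical
names and the sample values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FSHIFT	11
#define FIXED_1	(1 << FSHIFT)	/* 2048 */

/* one CALC_LOAD-style step on an average that is NOT scaled by FIXED_1 */
static uint64_t decay_avg(uint64_t avg, uint64_t factor, uint64_t sample)
{
	assert(factor <= FIXED_1);
	return (avg * factor + sample * (FIXED_1 - factor)) >> FSHIFT;
}

/* feed the same sample n times and return the resulting average */
static uint64_t run_avg(uint64_t avg, uint64_t factor, uint64_t sample, int n)
{
	while (n-- > 0)
		avg = decay_avg(avg, factor, sample);
	return avg;
}
```

With single-digit values the truncation in the shift dominates:
run_avg(5, 2014, 10, 1000) never moves off 5. With ns-scale values, e.g.
run_avg(5000000, 2014, 10000000, 1000), the result ends up within
roughly FIXED_1 / (FIXED_1 - factor), so about 60 here, of the 10ms
target, which is noise for our purposes.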