Re: [PATCH v2] blk-throttle: simplify logic by token bucket algorithm

From: Hong zhi guo
Date: Wed Oct 16 2013 - 02:09:47 EST


Hi, Vivek,

Thanks for your careful review. I'll rename t_c to last_dispatch, it's
much better.

For the big burst issue, I've different opinion. Let's discuss it.

Any time a big IO means a big burst. Even if it's throttled at first
time, queued in
the service_queue, and then waited for a while, when it's delivered
out, it IS still
a big burst. So throttling the bio for another while makes no sense.

If a group has been idle for 5 minutes, then it owns the credit to
deliver a big IO
(within 300 * bps bytes). And the extra credit will be cut off after
the delivery.

Waiting for your comments.

Zhiguo


On Wed, Oct 16, 2013 at 1:32 AM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Mon, Oct 14, 2013 at 05:09:17PM +0800, Hong Zhiguo wrote:
>
> [..]
>
> Hi Hong,
>
> Thanks for the token based throttling implementation. In general it looks
> good and it simplies the logic. Couple of comments/concerns below.
>
>> @@ -133,14 +135,13 @@ struct throtl_grp {
>> /* IOPS limits */
>> unsigned int iops[2];
>>
>> - /* Number of bytes disptached in current slice */
>> - uint64_t bytes_disp[2];
>> - /* Number of bio's dispatched in current slice */
>> - unsigned int io_disp[2];
>> + /* Token for throttling by bps */
>> + uint64_t bytes_token[2];
>> + /* Token for throttling by iops */
>> + unsigned int io_token[2];
>>
>> - /* When did we start a new slice */
>> - unsigned long slice_start[2];
>> - unsigned long slice_end[2];
>> + /* Time check-point */
>> + unsigned long t_c[2];
>
> Can we rename this variable to say last_dispatch[2].
>
> [..]
>>
>> @@ -852,41 +708,26 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
>> unsigned long *wait)
>> {
>> bool rw = bio_data_dir(bio);
>> - u64 bytes_allowed, extra_bytes, tmp;
>> - unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
>> -
>> - jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
>> -
>> - /* Slice has just started. Consider one slice interval */
>> - if (!jiffy_elapsed)
>> - jiffy_elapsed_rnd = throtl_slice;
>> -
>> - jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, throtl_slice);
>> + u64 extra_bytes, token;
>> + unsigned long jiffy_wait;
>>
>> - tmp = tg->bps[rw] * jiffy_elapsed_rnd;
>> - do_div(tmp, HZ);
>> - bytes_allowed = tmp;
>> + token = (u64)tg->bps[rw] * (jiffies - tg->t_c[rw]);
>
> So here we seem to be calculating how many tokens a group is eligible
> for. This is done based on since last time check. And last time check
> was updated in last dispatch from the group.
>
> What happens if no IO happens in a group for some time, say for 5 minutes,
> and then one big IO comes in. IIUC, we will allow a big burst of IO for
> the first IO (tokens worth of full 5 minutes will be given to this group).
> I think we should have a mechanism where we don't allow too big a burst
> for the first IO. (Possibly limit the burst to THROTL_BURST_IO and
> THROTL_BURST_BYTES until and unless a bigger IO is already queued and
> we are waiting for it).
>
> In previous time slice logic, I took care of it by extend time slice
> by 100ms and if no IO happens in the group for 100ms, time slice will
> expire and next IO will get at max of 100ms of worth of credit (equivalent
> of burst limits).
>
> Thanks
> Vivek



--
best regards
Hong Zhiguo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/