Re: [PATCH v2 1/2] blk-iocost: add refcounting for iocg

From: Yu Kuai
Date: Mon Jan 09 2023 - 20:39:57 EST


Hi,

在 2023/01/10 2:23, Tejun Heo 写道:
Yeah, that's unfortunate. There are several options here:

1. Do what you originally suggested - bypass to root after offline. I feel
uneasy about this. Both iolatency and throtl clear their configs on
offline but that's punting to the parent. For iocost it'd be bypassing
all controls, which can actually be exploited.

2. Make all possible IO issuers use blkcg_[un]pin_online() and shift the
iocost shutdown to pd_offline_fn(). This likely is the most canonical
solution given the current situation but it's kinda nasty to add another
layer of refcnting all over the place.

3. Order blkg free so that parents are never freed before children. You did
this by adding refcnts in iocost but shouldn't it be possible to simply
shift blkg_put(blkg->parent) in __blkg_release() to blkg_free_workfn()?

As I tried to explain before, we can make sure blkg_free() is called
in order, but blkg_free() from remove cgroup can concurrent with
deactivate policy, and we can't guarantee the order of ioc_pd_free()
that is called both from blkg_free() and blkcg_deactivate_policy().
Hence I don't think #3 is possible.

I personaly prefer #1, I don't see any real use case about the defect
that you described, and actually in cgroup v1 blk-throtl is bypassed to
no limit as well.

I'm not sure about #2, that sounds a possible solution but I'm not quite
familiar with the implementations here.

Consider that bfq already has such refcounting for bfqg, perhaps
similiar refcounting is acceptable?

Thanks,
Kuai