Re: [PATCH 3/4] writeback, memcg: Implement cgroup_writeback_by_id()

From: Jan Kara
Date: Thu Aug 15 2019 - 10:54:26 EST


On Sat 03-08-19 07:01:54, Tejun Heo wrote:
> Implement cgroup_writeback_by_id() which initiates cgroup writeback
> from bdi and memcg IDs. This will be used by memcg foreign inode
> flushing.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> ---
> fs/fs-writeback.c | 64 +++++++++++++++++++++++++++++++++++++++
> include/linux/writeback.h | 4 +++
> 2 files changed, 68 insertions(+)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 6129debdc938..5c79d7acefdb 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -880,6 +880,70 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
> wb_put(last_wb);
> }
>
> +/**
> + * cgroup_writeback_by_id - initiate cgroup writeback from bdi and memcg IDs
> + * @bdi_id: target bdi id
> + * @memcg_id: target memcg css id
> + * @nr_pages: number of pages to write
> + * @reason: reason why some writeback work initiated
> + * @done: target wb_completion
> + *
> + * Initiate flush of the bdi_writeback identified by @bdi_id and @memcg_id
> + * with the specified parameters.
> + */
> +int cgroup_writeback_by_id(u64 bdi_id, int memcg_id, unsigned long nr,
> + enum wb_reason reason, struct wb_completion *done)
> +{
> + struct backing_dev_info *bdi;
> + struct cgroup_subsys_state *memcg_css;
> + struct bdi_writeback *wb;
> + struct wb_writeback_work *work;
> + int ret;
> +
> + /* lookup bdi and memcg */
> + bdi = bdi_get_by_id(bdi_id);
> + if (!bdi)
> + return -ENOENT;
> +
> + rcu_read_lock();
> + memcg_css = css_from_id(memcg_id, &memory_cgrp_subsys);
> + if (memcg_css && !css_tryget(memcg_css))
> + memcg_css = NULL;
> + rcu_read_unlock();
> + if (!memcg_css) {
> + ret = -ENOENT;
> + goto out_bdi_put;
> + }
> +
> + /* and find the associated wb */
> + wb = wb_get_create(bdi, memcg_css, GFP_NOWAIT | __GFP_NOWARN);
> + if (!wb) {
> + ret = -ENOMEM;
> + goto out_css_put;
> + }

One more thought: You don't want the "_create" part here, do you? If
there's any point in writing back using this wb, it must be attached to
some inode and thus it must exist. In the normal case wb_get_create() will
just fetch the reference and be done with it but when you feed garbage into
this function due to id going stale or frn structures getting corrupted due
to concurrent access, you can be creating bogus wb structures in bdi...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR