Re: [PATCH v2 03/10] blktrace: fix debugfs use after free

From: Luis Chamberlain
Date: Wed Apr 22 2020 - 03:48:07 EST


On Wed, Apr 22, 2020 at 12:27:15AM -0700, Christoph Hellwig wrote:
> On Sun, Apr 19, 2020 at 07:45:22PM +0000, Luis Chamberlain wrote:
> > +{
> > + struct dentry *dir = NULL;
> > +
> > + /* This can happen if we have a bug in the lower layers */
> > + dir = debugfs_lookup(kobject_name(q->kobj.parent), blk_debugfs_root);
> > + if (dir) {
> > + pr_warn("%s: registering request_queue debugfs directory twice is not allowed\n",
> > + kobject_name(q->kobj.parent));
> > + dput(dir);
> > + return -EALREADY;
> > + }
>
> I don't see why we need this check. If it is valueable enough we
> should have a debugfs_create_dir_exclusive or so that retunrns an error
> for an exsting directory, instead of reimplementing it in the caller in
> a racy way. But I'm not really sure we need it to start with.

In short races, and even with synchronous request_queue removal I'm
seeing the race is still possible, but that's due to some other races
I'm going to chase down now.

The easier solution really is to just have a debugfs dir created for
each partition if debugfs is enabled, this way the directory will
always be there, and the lookups are gone.

> > +
> > + q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent),
> > + blk_debugfs_root);
> > + if (!q->debugfs_dir)
> > + return -ENOMEM;
> > +
> > + return 0;
> > +}
> > +
> > +void blk_queue_debugfs_unregister(struct request_queue *q)
> > +{
> > + debugfs_remove_recursive(q->debugfs_dir);
> > + q->debugfs_dir = NULL;
> > +}
>
> Which to me suggests we can just fold these two into the callers,
> with an IS_ENABLED for the creation case given that we check for errors
> and the stub will always return an error.

Sorry not sure I follow this.

> > debugfs_create_files(q->debugfs_dir, q, blk_mq_debugfs_queue_attrs);
> >
> > /*
> > @@ -856,9 +853,7 @@ void blk_mq_debugfs_register(struct request_queue *q)
> >
> > void blk_mq_debugfs_unregister(struct request_queue *q)
> > {
> > - debugfs_remove_recursive(q->debugfs_dir);
> > q->sched_debugfs_dir = NULL;
> > - q->debugfs_dir = NULL;
> > }
>
> This function is weird - the sched dir gets removed by the
> debugfs_remove_recursive, so just leaving a function that clears
> a pointer is rather odd. In fact I don't think we need to clear
> either sched_debugfs_dir or debugfs_dir anywhere.

Indeed. Will clean it up.

> > @@ -975,6 +976,14 @@ int blk_register_queue(struct gendisk *disk)
> > goto unlock;
> > }
> >
> > + ret = blk_queue_debugfs_register(q);
> > + if (ret) {
> > + blk_trace_remove_sysfs(dev);
> > + kobject_del(&q->kobj);
> > + kobject_put(&dev->kobj);
> > + goto unlock;
> > + }
> > +
>
> Please use a goto label to consolidate the common cleanup code.

Sure.

> Also I think these generic debugfs changes probably should be separate
> to the blktrace changes.

I'll try to do that.

> > static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts,
> > + struct request_queue *q,
> > struct blk_trace *bt)
> > {
> > struct dentry *dir = NULL;
> >
> > + /* This can only happen if we have a bug on our lower layers */
> > + if (!q->kobj.parent) {
> > + pr_warn("%s: request_queue parent is gone\n", buts->name);
> > + return NULL;
> > + }
>
> Why is this not simply a WARN_ON_ONCE()?

I'll actually remove it and instead fix the race where it happens.

> > + if (blk_trace_target_disk(buts->name, kobject_name(q->kobj.parent))) {
> > + if (!q->debugfs_dir) {
> > + pr_warn("%s: expected request_queue debugfs_dir is not set\n",
> > + buts->name);
> > + return NULL;
> > + }
> > + /*
> > + * debugfs_lookup() is used to ensure the directory is not
> > + * taken from underneath us. We must dput() it later once
> > + * done with it within blktrace.
> > + */
> > + dir = debugfs_lookup(buts->name, blk_debugfs_root);
> > + if (!dir) {
> > + pr_warn("%s: expected request_queue debugfs_dir dentry is gone\n",
> > + buts->name);
> > + return NULL;
> > + }
> > + /*
> > + * This is a reaffirmation that debugfs_lookup() shall always
> > + * return the same dentry if it was already set.
> > + */
> > + if (dir != q->debugfs_dir) {
> > + dput(dir);
> > + pr_warn("%s: expected dentry dir != q->debugfs_dir\n",
> > + buts->name);
> > + return NULL;
> > + }
> > + bt->backing_dir = q->debugfs_dir;
> > + return bt->backing_dir;
> > + }
>
> Even with the gigantic commit log I don't get the point of this
> code. It looks rather sketchy and I can't find a rationale for it.

Yeah I think this is going to be much easier on the eyes with the
revert to synchronous request_queue removal first.

Luis