Re: [PATCH v3 46/57] perf: Simplify pmu_dev_alloc()

From: Greg KH
Date: Mon Jun 12 2023 - 09:09:19 EST


On Mon, Jun 12, 2023 at 02:18:03PM +0200, Greg KH wrote:
> On Mon, Jun 12, 2023 at 11:44:00AM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 12, 2023 at 11:07:59AM +0200, Peter Zijlstra wrote:
> > >
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > > ---
> > > kernel/events/core.c | 65 ++++++++++++++++++++++++---------------------------
> > > 1 file changed, 31 insertions(+), 34 deletions(-)
> > >
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -11285,49 +11285,46 @@ static void pmu_dev_release(struct devic
> > >
> > > static int pmu_dev_alloc(struct pmu *pmu)
> > > {
> > > + int ret;
> > >
> > > + struct device *dev __free(put_device) =
> > > + kzalloc(sizeof(struct device), GFP_KERNEL);
> > > + if (!dev)
> > > + return -ENOMEM;
> > >
> > > + dev->groups = pmu->attr_groups;
> > > + device_initialize(dev);
> > >
> > > + dev_set_drvdata(dev, pmu);
> > > + dev->bus = &pmu_bus;
> > > + dev->release = pmu_dev_release;
> > >
> > > + ret = dev_set_name(dev, "%s", pmu->name);
> > > if (ret)
> > > + return ret;
> > >
> > > + ret = device_add(dev);
> > > if (ret)
> > > + return ret;
> > >
> > > + struct device *del __free(device_del) = dev;
> >
> > Greg, I'm not very familiar with the whole device model, but it seems
> > unfortunate to me that one has to call device_del() explicitly if we
> > already have a put_device() queued.
> >
> > Is there a saner way to write this?
>
> Ok, the "problem" here is that you have chosen the "complex" way to
> initialize a struct device. As such, you have to do more housekeeping
> than if you were to just use the simple interface.
>
> The rule is, after you call device_initialize() you HAVE to call
> put_device() on the pointer if something goes wrong and you want to
> clean up properly. If the device_add() call failed, then put_device()
> is all you HAVE to call. If the device_add() call succeeded, then you
> HAVE to call device_del() first, and only then drop the reference with
> put_device().
>
> Yeah, it's a pain, but you are trying to hand-roll code that is not a
> "normal" path for a struct device, sorry.
>
> I don't know if you really can encode all of that crazy logic in the
> cleanup api, UNLESS you can "switch" the cleanup function at a point in
> time (i.e. after device_add() is successful). Is that possible?
>
> Anyway, let me see about just cleaning up this code in general, I don't
> think you need the complex interface here for a tiny struct device at
> all, which would make this specific instance moot :)
>
> Also, nit, you are racing with userspace by attempting to add new device
> files _AFTER_ the device is registered with the driver core. This whole
> thing can be made simpler I hope, give me a bit...

Nope, I was wrong. I can fix the race condition, but the logic here for
how to init and clean up on errors is right. You want this because you
are a bus, so you need the two-step init/teardown process,
sorry.
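
To spell out the two-step rule, here is a minimal userspace sketch of
the unwind order. It is only a model: the toy_* names and the fail_*
flags are made up for illustration and are not the driver core API. All
it tracks is a reference count and an "added" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct device; NOT the kernel type. */
struct toy_device {
	int refs;	/* models the embedded kobject reference count */
	bool added;	/* true between a successful add and the del */
	bool released;	/* set when the last reference is dropped */
};

/* Models device_initialize(): the caller now owns one reference. */
static void toy_device_initialize(struct toy_device *d)
{
	d->refs = 1;
	d->added = false;
	d->released = false;
}

/* Models put_device(): dropping the last reference "releases" the device. */
static void toy_put_device(struct toy_device *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->released = true;	/* the real core would call ->release() */
}

/* Models device_add(); 'fail' is a hypothetical knob to force an error. */
static int toy_device_add(struct toy_device *d, bool fail)
{
	if (fail)
		return -1;
	d->added = true;
	return 0;
}

/* Models device_del(): only valid after a successful add. */
static void toy_device_del(struct toy_device *d)
{
	assert(d->added);
	d->added = false;
}

/*
 * Two-step init with the matching two-step unwind:
 *   add failed              -> put only
 *   add ok, later failure   -> del, then put
 */
static int toy_register(struct toy_device *d, bool fail_add, bool fail_late)
{
	int ret;

	toy_device_initialize(d);

	ret = toy_device_add(d, fail_add);
	if (ret)
		goto put;

	if (fail_late) {	/* e.g. a later sysfs update failing */
		ret = -2;
		goto del;
	}

	return 0;

del:
	toy_device_del(d);
put:
	toy_put_device(d);
	return ret;
}
```

Calling toy_register(&d, true, false) takes the put-only path, while
toy_register(&d, false, true) goes through del and then put. That is the
same shape as the del_dev/free_dev labels in pmu_dev_alloc() today.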

Here's the patch I came up with to get rid of the race, but it doesn't
really help you out here at all :(

------------------------
From foo@baz Mon Jun 12 03:07:54 PM CEST 2023
Date: Mon, 12 Jun 2023 15:07:54 +0200
To: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Subject: [PATCH] perf/core: fix narrow startup race when creating the perf nr_addr_filters sysfs file


Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>


diff --git a/kernel/events/core.c b/kernel/events/core.c
index db016e418931..d2a6182ad090 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11351,9 +11351,32 @@ static DEVICE_ATTR_RW(perf_event_mux_interval_ms);
static struct attribute *pmu_dev_attrs[] = {
&dev_attr_type.attr,
&dev_attr_perf_event_mux_interval_ms.attr,
+ &dev_attr_nr_addr_filters.attr,
+ NULL,
+};
+
+static umode_t pmu_dev_is_visible(struct kobject *kobj, struct attribute *a, int n)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct pmu *pmu = dev_get_drvdata(dev);
+
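+ /* Only nr_addr_filters is conditional on the PMU. */
+ if (a != &dev_attr_nr_addr_filters.attr)
+ return a->mode;
+
+ /* Hide it for PMUs that do not support address filters. */
+ return pmu->nr_addr_filters ? a->mode : 0;
+}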
+
+static struct attribute_group pmu_dev_attr_group = {
+ .is_visible = pmu_dev_is_visible,
+ .attrs = pmu_dev_attrs,
+};
+
+static const struct attribute_group *pmu_dev_groups[] = {
+ &pmu_dev_attr_group,
NULL,
};
-ATTRIBUTE_GROUPS(pmu_dev);

static int pmu_bus_running;
static struct bus_type pmu_bus = {
@@ -11389,18 +11412,11 @@ static int pmu_dev_alloc(struct pmu *pmu)
if (ret)
goto free_dev;

- /* For PMUs with address filters, throw in an extra attribute: */
- if (pmu->nr_addr_filters)
- ret = device_create_file(pmu->dev, &dev_attr_nr_addr_filters);
-
- if (ret)
- goto del_dev;
-
- if (pmu->attr_update)
+ if (pmu->attr_update) {
ret = sysfs_update_groups(&pmu->dev->kobj, pmu->attr_update);
-
- if (ret)
- goto del_dev;
+ if (ret)
+ goto del_dev;
+ }

out:
return ret;