Re: dev->of_node overwrite can cause device loading with differentdriver

From: Russell King - ARM Linux
Date: Sun Sep 15 2013 - 06:04:43 EST


On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote:
> I think there are three options to solve this:
>
> 1. Break out of the driver list iteration loop as soon as a driver probe
> function fails. This way there is no possibility for another driver
> to be probed with a device struct that was changed by the first
> driver. But it would remove the support to fall back to
> another driver for a device. I am not aware of any device that is
> supported by multiple drivers.

What if the incorrect driver (the one which created this platform device)
is the one which was matched first?

> 2. We could restore the device struct completely or partially (only
> of_node) after a probe failure. This would avoid driver probes with
> unclean device structs, but introduces some overhead.

I don't think it's about changing an existing device structure. The
problem is more to do with this:

- We have a platform device with an associated of_node.
- The of_node is used to match it to its device driver.
- This device driver spawns a new platform device, and copies the
of_node to the new platform device (so that the _intended_ driver
can get access to the DT properties)
- The DD layer tries to match this new platform device with a driver,
and _can_ match this new platform device against the device driver
which created it due to the of_node matching.

There's two solutions here:

1. get rid of this yucky "lets spawn a new device" stuff - if you
actually work out what's going on with MUSB and its use of this
platform device, it's _really_ horrid, and that's putting it
mildly. Let's call the device being probed, devA, and briefly:

new_dev = platform_device_alloc();

devA->platform_data = glue
glue->dev = devA
glue->musb = new_dev
new_dev->parent = devA

set_drvdata(devA, glue)

platform_device_add(new_dev);

musb->controller is the new device, callbacks into this layer do:

glue = get_drvdata(musb->controller->parent)

that's not too bad, because this is accessing devA's driver data
from within its owning driver... until you find this:

musb = glue_to_musb(glue)

which is defined as:

#define glue_to_musb(g) platform_get_drvdata(g->musb)

glue->musb is the _child_ platform device, and we're accessing the
child's driver data here.

This seems to me to be a layering violation, and also rather racy when
you consider that glue_to_musb() gets used from workqueue contexts
(and I don't see a flush_workqueue() call in these stub drivers.)
What if the new platform device gets unbound just before the workqueue
fires?

Another thing to note here is that platform_device_add_data() takes a
copy of the data - in the case of omap2430, this is kzalloc'd but
is pointlessly kept around until this driver is removed (via the devm_
mechanisms.)

The last thing I don't like in these drivers is this:

memset(musb_resources, 0x00, sizeof(*musb_resources) *
ARRAY_SIZE(musb_resources));

musb_resources[0].name = pdev->resource[0].name;
musb_resources[0].start = pdev->resource[0].start;
musb_resources[0].end = pdev->resource[0].end;
musb_resources[0].flags = pdev->resource[0].flags;

musb_resources[1].name = pdev->resource[1].name;
musb_resources[1].start = pdev->resource[1].start;
musb_resources[1].end = pdev->resource[1].end;
musb_resources[1].flags = pdev->resource[1].flags;

musb_resources[2].name = pdev->resource[2].name;
musb_resources[2].start = pdev->resource[2].start;
musb_resources[2].end = pdev->resource[2].end;
musb_resources[2].flags = pdev->resource[2].flags;

ret = platform_device_add_resources(musb, musb_resources,
ARRAY_SIZE(musb_resources));

A little knowledge of the driver model will reveal that the above
is utterly pointless - and can be simplified to:

ret = platform_device_add_resources(musb, pdev->resource,
pdev->num_resources);

as platform_device_add_resources() copies the passed resource
structures via kmemdup() already. There's no reason for this
device driver to make its own copy of the resources just to have
them re-copied. It's also a lot safer in case fewer than three
resources are supplied. It's also a lot less hastle if additional
resources are added (like what's happened recently to these drivers.)

2. don't pass the of_node via the platform_device, but as part of
the platform device's data.

Is there any reason why musb isn't implemented as a library which these
stub drivers can hook into?

I'd much prefer (1), because it gets rid of the horrid dma_mask stuff in
these drivers, turning it more into a "conventional" driver.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/