Re: [PATCH v2 08/11] driver core: fw_devlink: Make cycle detection more robust

From: Andy Shevchenko
Date: Fri Jan 27 2023 - 04:43:28 EST


On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> fw_devlink could only detect a single and simple cycle because it relied
> mainly on device link cycle detection code that only checked for cycles
> between devices. The expectation was that the firmware wouldn't have
> complicated cycles and multiple cycles between devices. That expectation
> has been proven to be wrong.
>
> For example, fw_devlink could handle:
>
> +-+ +-+
> |A+------> |B+
> +-+ +++
> ^ |
> | |
> +----------+
>
> But it couldn't handle even something as "simple" as:
>
> +---------------------+
> | |
> v |
> +-+ +-+ +++
> |A+------> |B+------> |C|
> +-+ +++ +-+
> ^ |
> | |
> +----------+
>
> But firmware has even more complicated cycles like:
>
> +---------------------+
> | |
> v |
> +-+ +---+ +++
> +--+A+------>| B +-----> |C|<--+
> | +-+ ++--+ +++ |
> | ^ | ^ | |
> | | | | | |
> | +---------+ +---------+ |
> | |
> +------------------------------+
>
> And this is without including parent child dependencies or nodes in the
> cycle that are just firmware nodes that'll never have a struct device
> created for them.
>
> The proper way to treat these devices it to not force any probe ordering
> between them, while still enforce dependencies between node in the
> cycles (A, B and C) and their consumers.
>
> So this patch goes all out and just deals with all types of cycles. It
> does this by:
>
> 1. Following dependencies across device links, parent-child and fwnode
> links.
> 2. When it find cycles, it mark the device links and fwnode links as
> such instead of just deleting them or making the indistinguishable
> from proxy SYNC_STATE_ONLY device links.
>
> This way, when new nodes get added, we can immediately find and mark any
> new cycles whether the new node is a device or firmware node.

...

> + * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
> + * depend on @con. This function can detect multiple cyles between @sup_handle

A single space is enough.

> + * and @con. When such dependency cycles are found, convert all device links
> + * created solely by fw_devlink into SYNC_STATE_ONLY device links. Also, mark

Ditto.

> + * all fwnode links in the cycle with FWLINK_FLAG_CYCLE so that when they are
> + * converted into a device link in the future, they are created as
> + * SYNC_STATE_ONLY device links. This is the equivalent of doing

Ditto.

> + * fw_devlink=permissive just between the devices in the cycle. We need to do
> + * this because, at this point, fw_devlink can't tell which of these
> + * dependencies is not a real dependency.
> + *
> + * Return true if one or more cycles were found. Otherwise, return false.

Return:

(you may run `kernel-doc -v ...` to see all warnings)

...

> +static bool __fw_devlink_relax_cycles(struct device *con,
> + struct fwnode_handle *sup_handle)
> +{
> + struct fwnode_link *link;
> + struct device_link *dev_link;

> + struct device *sup_dev = NULL, *par_dev = NULL;

You can put it the first line since it's long enough.
But why do you need sup_dev assignment?

> + bool ret = false;
> +
> + if (!sup_handle)
> + return false;
> +
> + /*
> + * We aren't trying to find all cycles. Just a cycle between con and
> + * sup_handle.
> + */
> + if (sup_handle->flags & FWNODE_FLAG_VISITED)
> + return false;
> +
> + sup_handle->flags |= FWNODE_FLAG_VISITED;

> + sup_dev = get_dev_from_fwnode(sup_handle);
> +

I would put it closer to the condition:

> + /* Termination condition. */
> + if (sup_dev == con) {

/* Get supplier device and check for termination condition */
sup_dev = get_dev_from_fwnode(sup_handle);
if (sup_dev == con) {

> + ret = true;
> + goto out;
> + }
> +
> + /*
> + * If sup_dev is bound to a driver and @con hasn't started binding to
> + * a driver, @sup_dev can't be a consumer of @con. So, no need to

sup_dev or @sup_dev? What's the difference? Should you spell one of them
in full?

> + * check further.
> + */
> + if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&

As in the comment above, the single space is enough.

> + con->links.status == DL_DEV_NO_DRIVER) {
> + ret = false;
> + goto out;
> + }
> +
> + list_for_each_entry(link, &sup_handle->suppliers, c_hook) {
> + if (__fw_devlink_relax_cycles(con, link->supplier)) {
> + __fwnode_link_cycle(link);
> + ret = true;
> + }
> + }
> +
> + /*
> + * Give priority to device parent over fwnode parent to account for any
> + * quirks in how fwnodes are converted to devices.
> + */

> + if (sup_dev) {
> + par_dev = sup_dev->parent;
> + get_device(par_dev);
> + } else {
> + par_dev = fwnode_get_next_parent_dev(sup_handle);
> + }

if (sup_dev)
par_dev = get_device(sup_dev->parent);
else
par_dev = fwnode_get_next_parent_dev(sup_handle);

> + if (par_dev)
> + ret |= __fw_devlink_relax_cycles(con, par_dev->fwnode);

Instead I would rather do a similar pattern of the ret assignment as elsewhere
in the function.

if (par_dev && __fw_devlink_relax_cycles(con, par_dev->fwnode))
ret = true;

> + if (!sup_dev)
> + goto out;
> +
> + list_for_each_entry(dev_link, &sup_dev->links.suppliers, c_node) {
> + /*
> + * Ignore a SYNC_STATE_ONLY flag only if it wasn't marked as a
> + * such due to a cycle.
> + */
> + if (device_link_flag_is_sync_state_only(dev_link->flags) &&
> + !(dev_link->flags & DL_FLAG_CYCLE))
> + continue;
> +
> + if (__fw_devlink_relax_cycles(con,
> + dev_link->supplier->fwnode)) {

Keep it on one line.

> + fw_devlink_relax_link(dev_link);
> + dev_link->flags |= DL_FLAG_CYCLE;
> + ret = true;
> + }
> + }
> +
> +out:
> + sup_handle->flags &= ~FWNODE_FLAG_VISITED;
> + put_device(sup_dev);
> + put_device(par_dev);
> + return ret;
> +}

--
With Best Regards,
Andy Shevchenko