Re: [PATCH v1 1/2] driver core: Add fw_devlink.sync_state command line param

From: Doug Anderson
Date: Tue Feb 28 2023 - 17:33:21 EST


Hi,

On Thu, Feb 23, 2023 at 11:05 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> When all devices that could probe have finished probing, this parameter
> controls what to do with devices that haven't yet received their
> sync_state() calls.
>
> fw_devlink.sync_state=strict is the default and the driver core will
> continue waiting on consumers to probe successfully in the future.

This description is misleading / borderline wrong. You say that when
"sync_state=strict" that you'll wait on consumers to probe
successfully in the future. As talked about below, I think that when
the pre-existing "deferred_probe_timeout" (which you're tying into)
expires, it's unlikely that devices will probe successfully in the
future. Sure, it's possible, but in general once the
"deferred_probe_timeout" expires then the system is done waiting for
new devices to show up. While it's still _possible_ to add new
devices, you need to take care to deal with the fact that some
important devices might have already given up and also that you're
adding these new devices in strict dependency order...

IMO better would be to say something like when sync_state=strict that
you'll just leave resources in a high power state if not all devices
have shown up and the system thinks probing is done.


> This
> is the default behavior since calling sync_state() when all the
> consumers haven't probed could make some systems unusable/unstable.
>
> fw_devlink.sync_state=timeout will cause the driver core to give up
> waiting on consumers and call sync_state() on any devices that haven't
> yet received their sync_state() calls. This option is provided for
> systems that won't become unusable/unstable as they might be able to
> save power (depends on state of hardware before kernel starts) if all
> devices get their sync_state().

While I don't object to this being a kernel command line flag, the
default should also be a Kconfig option. The kernel command line is
not a great place for general configuration. As we jam too much stuff
in the kernel command line it gets unwieldy quickly. IMO:

* Kconfig: the right place for stuff for config options that a person
building the kernel might want to tweak.

* Kernel command line: the right place for a user of a pre-built
kernel to tweak; also (sometimes) the right place for the bootloader
to pass info to the kernel; also a good place for debug options that a
kernel engineer might want to tweak w/out rebuilding the kernel.

In this case it makes sense for the person building the kernel to
choose a default that makes sense for the hardware that their kernel
is targetting. It can also make sense for a user of a pre-built kernel
to tweak this if their hardware isn't working correctly. Thus it makes
sense for Kconfig to choose the default and the kernel command line to
override.


> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> ---
> .../admin-guide/kernel-parameters.txt | 12 ++++
> drivers/base/base.h | 1 +
> drivers/base/core.c | 58 +++++++++++++++++++
> drivers/base/dd.c | 6 ++
> 4 files changed, 77 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 6cfa6e3996cf..f0bf2f40af64 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1594,6 +1594,18 @@
> dependencies. This only applies for fw_devlink=on|rpm.
> Format: <bool>
>
> + fw_devlink.sync_state =

Is there a reason this is nested under "fw_devlink"? The sysfs
attribute "sync_state" that you modify in patch #2 doesn't reference
"fw_devlink" at all.


> + [KNL] When all devices that could probe have finished
> + probing, this parameter controls what to do with
> + devices that haven't yet received their sync_state()
> + calls.
> + Format: { strict | timeout }
> + strict -- Default. Continue waiting on consumers to
> + probe successfully.
> + timeout -- Give up waiting on consumers and call
> + sync_state() on any devices that haven't yet
> + received their sync_state() calls.

Some description needs to be included about how long the timeout is.
Specifically, tie it into the "deferred_probe_timeout" feature since
that's what you're using.


> +
> gamecon.map[2|3]=
> [HW,JOY] Multisystem joystick and NES/SNES/PSX pad
> support via parallel port (up to 5 devices per port)
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index 726a12a244c0..6fcd71803d35 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -209,6 +209,7 @@ extern void device_links_no_driver(struct device *dev);
> extern bool device_links_busy(struct device *dev);
> extern void device_links_unbind_consumers(struct device *dev);
> extern void fw_devlink_drivers_done(void);
> +extern void fw_devlink_probing_done(void);
>
> /* device pm support */
> void device_pm_move_to_tail(struct device *dev);
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index f9297c68214a..929ec218f180 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1727,6 +1727,26 @@ static int __init fw_devlink_strict_setup(char *arg)
> }
> early_param("fw_devlink.strict", fw_devlink_strict_setup);
>
> +#define FW_DEVLINK_SYNC_STATE_STRICT 0
> +#define FW_DEVLINK_SYNC_STATE_TIMEOUT 1

I don't care tons, but I feel like this should be an enum, or a bool.


> +
> +static int fw_devlink_sync_state;
> +static int __init fw_devlink_sync_state_setup(char *arg)
> +{
> + if (!arg)
> + return -EINVAL;
> +
> + if (strcmp(arg, "strict") == 0) {
> + fw_devlink_sync_state = FW_DEVLINK_SYNC_STATE_STRICT;
> + return 0;
> + } else if (strcmp(arg, "timeout") == 0) {
> + fw_devlink_sync_state = FW_DEVLINK_SYNC_STATE_TIMEOUT;
> + return 0;
> + }
> + return -EINVAL;
> +}
> +early_param("fw_devlink.sync_state", fw_devlink_sync_state_setup);
> +
> static inline u32 fw_devlink_get_flags(u8 fwlink_flags)
> {
> if (fwlink_flags & FWLINK_FLAG_CYCLE)
> @@ -1797,6 +1817,44 @@ void fw_devlink_drivers_done(void)
> device_links_write_unlock();
> }
>
> +static int fw_devlink_dev_sync_state(struct device *dev, void *data)
> +{
> + struct device_link *link = to_devlink(dev);
> + struct device *sup = link->supplier;
> +
> + if (!(link->flags & DL_FLAG_MANAGED) ||
> + link->status == DL_STATE_ACTIVE || sup->state_synced ||
> + !dev_has_sync_state(sup))
> + return 0;
> +
> + if (fw_devlink_sync_state == FW_DEVLINK_SYNC_STATE_STRICT) {
> + dev_warn(sup, "sync_state() pending due to %s\n",
> + dev_name(link->consumer));

This warning message is (IMO) an important feature of your patch. IMO
it deserves a mention in the commit message and even if (for some
reason) we decide we don't like the concept of forcing sync_state
after a timeout then we should still find a way to get this warning
message printed out. Maybe promote it to its own patch?

Specifically, I think this warning message gets printed out after
we've given up waiting for devices to show up. At this point
-EPROBE_DEFER becomes an error that we won't retry. That means that we
expect that sync state will _never_ be called in the future and that
resources will be left enabled / in a higher power state than needed.

I would perhaps also make it sound a little scarier since, IMO, this
is a problem that really shouldn't be "shipped" if this is an embedded
kernel. Maybe something like:

sync_state pending (%s); resources left in high power state


> + return 0;
> + }
> +
> + if (!list_empty(&sup->links.defer_sync))
> + return 0;
> +
> + dev_warn(sup, "Timed out. Calling sync_state()\n");

nit: since you aren't directly calling it after this print (you're
adding it to the queue), maybe change to "Forcing sync_state()".