Re: [PATCH] platform/chrome: cros_ec_typec: Check for EC driver

From: Akihiko Odaki
Date: Thu Apr 07 2022 - 13:04:06 EST


On 2022/04/08 1:28, Guenter Roeck wrote:
On Wed, Apr 6, 2022 at 6:16 PM Akihiko Odaki <akihiko.odaki@xxxxxxxxx> wrote:
[ ... ]
ec_dev = dev_get_drvdata(&typec->ec->ec->dev);

I completely missed the part that this is not on the parent.

+ if (!ec_dev)
+ return -EPROBE_DEFER;
[ ... ]

1. The parent exists and dev_get_drvdata(pdev->dev.parent) returns a
non-NULL value. However, dev_get_drvdata(&typec->ec->ec->dev) returns
NULL. (Yes, that is confusing.) I'm wondering

I am actually surprised that typec->ec->ec is not NULL. The underlying
problem (or at least one underlying problem) is that it is set in
cros_ec_register():

/* Register a platform device for the main EC instance */
ec_dev->ec = platform_device_register_data(ec_dev->dev, "cros-ec-dev",
PLATFORM_DEVID_AUTO, &ec_p,
sizeof(struct cros_ec_platform));

"cros-ec-dev" is the mfd device which instantiates the character
device. On devicetree (arm64) systems, the typec device is registered
as child of google,cros-ec-spi and thus should be instantiated only
after the spi device has been instantiated. The same should happen on
ACPI systems, but I don't know if that is really correct.

I don't know what exactly is happening, but apparently typec
registration happens in parallel with cros-ec-dev registration, which
is delayed because the character device is not loaded. As mentioned, I
don't understand why typec->ec->ec is not NULL. Can you check what it
points to?

If I read the code correctly, the registration itself happens
synchronously, and platform_device_register_data() always returns a
non-NULL value (on failure it returns an ERR_PTR() such as -ENOMEM
rather than NULL). The driver, however, can be bound asynchronously, and
dev_get_drvdata(&typec->ec->ec->dev) can return NULL as a consequence.
The call trace when asynchronous driver binding is scheduled would look
like the following:
platform_device_register_data()
  platform_device_register_resndata()
    platform_device_register_full()
    - This always creates and returns platform_device.
      platform_device_add()
      - This adds the created platform_device.
        device_add()
          bus_probe_device()
            device_initial_probe()
              __device_attach()
              - This schedules asynchronous probing.

typec->ec->ec should be pointing to the correct platform_device, since
the patched driver works without an Oops on my computer. At the very
least, it is not NULL.
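
To illustrate the ordering described above, here is a minimal userspace
sketch, not the actual kernel code. It assumes a simplified struct device
with only a driver_data field, and typec_probe_model() is a hypothetical
stand-in for the check added by the patch. EPROBE_DEFER is defined by hand
with the value the kernel uses, because userspace errno.h does not
provide it.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel-internal errno value; defined here because userspace lacks it. */
#define EPROBE_DEFER 517

/* Simplified model of struct device: only the driver data pointer. */
struct device {
	void *driver_data;
};

/* Simplified model of struct platform_device. */
struct platform_device {
	struct device dev;
};

/* Userspace models of the kernel helpers with the same names. */
void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

void dev_set_drvdata(struct device *dev, void *data)
{
	assert(dev != NULL);
	dev->driver_data = data;
}

/*
 * Hypothetical model of the consumer probe. The platform_device already
 * exists (registration is synchronous), but its driver may not have been
 * bound yet, so the driver data can still be NULL. In that case, ask to
 * be probed again later instead of dereferencing NULL.
 */
int typec_probe_model(struct platform_device *ec_pdev, void **ec_dev_out)
{
	void *ec_dev = dev_get_drvdata(&ec_pdev->dev);

	if (!ec_dev)
		return -EPROBE_DEFER;

	*ec_dev_out = ec_dev;
	return 0;
}
```

In this model, calling typec_probe_model() before dev_set_drvdata() has
run yields -EPROBE_DEFER, and calling it again afterwards succeeds, which
corresponds to the deferred probe being retried once the other driver has
bound.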

Regards,
Akihiko Odaki


Thanks,
Guenter

dev_get_drvdata(pdev->dev.parent) returned NULL in the following crash
log, but that would be a problem distinct from the one my patch handles:
https://lore.kernel.org/lkml/CABXOdTe9u_DW=NZM1-J120Gu1gibDy8SsgHP3bJwwLsE_iuLAQ@xxxxxxxxxxxxxx/

2. My patch returns -EPROBE_DEFER instead of -ENODEV, and I confirmed
that the device is eventually instantiated.

Regards,
Akihiko Odaki


Guenter

Thanks,

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/platform/chrome?id=ffebd90532728086007038986900426544e3df4e