Re: [PATCH v4] amba: Remove deferred device addition

From: Sudeep Holla
Date: Tue Jul 05 2022 - 07:05:32 EST


On Tue, Jul 05, 2022 at 01:39:34AM -0700, Saravana Kannan wrote:
> The uevents generated for an amba device need PID and CID information
> that's available only when the amba device is powered on, clocked and
> out of reset. So, if those resources aren't available, the information
> can't be read to generate the uevents. To workaround this requirement,
> if the resources weren't available, the device addition was deferred and
> retried periodically.
>
> However, this deferred addition retry isn't based on resources becoming
> available. Instead, it's retried every 5 seconds and causes arbitrary
> probe delays for amba devices and their consumers.
>
> Also, maintaining a separate deferred-probe like mechanism is
> maintenance headache.
>
> With this commit, instead of deferring the device addition, we simply
> defer the generation of uevents for the device and probing of the device
> (because drivers needs PID and CID to match) until the PID and CID
> information can be read. This allows us to delete all the amba specific
> deferring code and also avoid the arbitrary probing delays.
>
> Cc: Rob Herring <robh@xxxxxxxxxx>
> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> Cc: John Stultz <john.stultz@xxxxxxxxxx>
> Cc: Saravana Kannan <saravanak@xxxxxxxxxx>
> Cc: Linus Walleij <linus.walleij@xxxxxxxxxx>
> Cc: Sudeep Holla <sudeep.holla@xxxxxxx>
> Cc: Nicolas Saenz Julienne <nsaenzjulienne@xxxxxxx>
> Cc: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
> Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> Cc: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
> Cc: Russell King <linux@xxxxxxxxxxxxxxx>
> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> ---
>
> v1 -> v2:
> - Dropped RFC tag
> - Complete rewrite to not use stub devices.
>
> v2 -> v3:
> - Flipped the if() condition for hard-coded periphids.
> - Added a stub driver to handle the case where all amba drivers are
> modules loaded by uevents.
> - Cc Marek after I realized I forgot to add him.
>
> v3 -> v4:
> - Finally figured out and fixed the issue reported by Kefeng (bus match
> can't return an error other than -EPROBE_DEFER).
> - I tested the patch on "V2P-CA15" on qemu
> - Marek tested v3, but that was so long ago and the rebase wasn't clean,
> so I didn't include the tested-by.
>
> Marek/Kefeng,
>
> Mind giving a Tested-by?
>
> -Saravana
>
> drivers/amba/bus.c | 317 +++++++++++++++++++++------------------------
> 1 file changed, 145 insertions(+), 172 deletions(-)
>
> diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
> index 0e3ed5eb367b..9590c93b2aea 100644
> --- a/drivers/amba/bus.c
> +++ b/drivers/amba/bus.c
> @@ -130,11 +130,100 @@ static struct attribute *amba_dev_attrs[] = {
> };
> ATTRIBUTE_GROUPS(amba_dev);
>
> +static int amba_read_periphid(struct amba_device *dev)
> +{
> + struct reset_control *rstc;
> + u32 size, pid, cid;
> + void __iomem *tmp;
> + int i, ret;
> +
> + ret = dev_pm_domain_attach(&dev->dev, true);
> + if (ret) {
> + dev_dbg(&dev->dev, "can't get PM domain: %d\n", ret);
> + goto err_out;
> + }
> +
> + ret = amba_get_enable_pclk(dev);
> + if (ret) {
> + dev_dbg(&dev->dev, "can't get pclk: %d\n", ret);
> + goto err_pm;
> + }
> +
> + /*
> + * Find reset control(s) of the amba bus and de-assert them.
> + */
> + rstc = of_reset_control_array_get_optional_shared(dev->dev.of_node);
> + if (IS_ERR(rstc)) {
> + ret = PTR_ERR(rstc);
> + if (ret != -EPROBE_DEFER)
> + dev_err(&dev->dev, "can't get reset: %d\n", ret);
> + goto err_clk;
> + }
> + reset_control_deassert(rstc);
> + reset_control_put(rstc);
> +
> + size = resource_size(&dev->res);
> + tmp = ioremap(dev->res.start, size);
> + if (!tmp) {
> + ret = -ENOMEM;
> + goto err_clk;
> + }
> +

I am seeing constant crash roughly around here. The read below is triggering
synchronous external abort. And adding even a single printk above this
anywhere seem to be making the issue disappear. While any prints after
the below reads just defer the issue and finally hit for some other device
while it always hits for the first delayed amba device.

> + /*
> + * Read pid and cid based on size of resource
> + * they are located at end of region
> + */
> + for (pid = 0, i = 0; i < 4; i++)
> + pid |= (readl(tmp + size - 0x20 + 4 * i) & 255) << (i * 8);
> + for (cid = 0, i = 0; i < 4; i++)
> + cid |= (readl(tmp + size - 0x10 + 4 * i) & 255) << (i * 8);
> +

I haven't spent time debugging this any further.

--
Regards,
Sudeep

1. No logs, always crash

Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc5-next-20220705-00001-g69f8b939ba4c #244
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jun 21 2022
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : amba_read_periphid+0x128/0x270
lr : amba_read_periphid+0x114/0x270
sp : ffff80000803bc40
x29: ffff80000803bc40 x28: f675a09e14a55da9 x27: ffff800009c56068
x26: 0000000000000007 x25: ffff00080035f500 x24: ffff0008016c6468
x23: ffff80000a5be848 x22: ffff80000a446088 x21: 0000000000000000
x20: ffff0008003e1c00 x19: 0000000000001000 x18: ffffffffffffffff
x17: 000000000000001c x16: 000000005885f043 x15: ffff000800032a1c
x14: ffffffffffffffff x13: 0000000000000000 x12: 0101010101010101
x11: ffff800008000000 x10: 0000000022010000 x9 : 0140000000000000
x8 : 0040000000000001 x7 : 0000800019e9b000 x6 : 0000000000000000
x5 : ffff800048175000 x4 : 0000000000000000 x3 : ffff800008175fe0
x2 : 0068000000000f13 x1 : 0000000000000000 x0 : ffff800008175000
Call trace:
amba_read_periphid+0x128/0x270
amba_match+0x24/0x90
__driver_attach+0x2c/0x1c4
bus_for_each_dev+0x60/0xd0
driver_attach+0x24/0x30
bus_add_driver+0xf4/0x230
driver_register+0xbc/0x110
amba_stub_drv_init+0x6c/0x9c
do_one_initcall+0x6c/0x1c0
kernel_init_freeable+0x34c/0x444
kernel_init+0x28/0x13c
ret_from_fork+0x10/0x20
Code: d1008263 52800004 8b030003 52800006 (b9400062)
---[ end trace 0000000000000000 ]---

2. With prints after 2 for loops above

amba_read_periphid 20030000.tpiu
amba_read_periphid 20040000.funnel
amba_read_periphid 20070000.etr
amba_read_periphid 20100000.stm
amba_read_periphid 20120000.replicator
Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc5-next-20220705-00001-g69f8b939ba4c-dirty #243
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jun 21 2022
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : readl+0x4/0x20
lr : amba_read_periphid+0x138/0x250
sp : ffff80000803bc40
x29: ffff80000803bc40 x28: f675a09e14a55da9 x27: ffff800009c56068
x26: 0000000000000007 x25: ffff00080035f500 x24: ffff000800248568
x23: ffff80000a5be848 x22: ffff80000a446088 x21: 0000000000001000
x20: ffff0008003e1c00 x19: 00000000fffffff4 x18: ffffffffffffffff
x17: 000000040044ffff x16: 00400032b5503510 x15: 0000000000000000
x14: ffffffffffffffff x13: 0000000000000000 x12: 0101010101010101
x11: ffff800008000000 x10: 0000000022010000 x9 : 0140000000000000
x8 : 0040000000000001 x7 : 0000800019e9b000 x6 : 0000000000000000
x5 : ffff800048175000 x4 : 0000000000000000 x3 : ffff800008175000
x2 : ffff800008175fe0 x1 : 0000000000000000 x0 : ffff800008175fe0
Call trace:
readl+0x4/0x20
amba_match+0x24/0x90
__driver_attach+0x2c/0x1c4
bus_for_each_dev+0x60/0xd0
driver_attach+0x24/0x30
bus_add_driver+0xf4/0x230
driver_register+0xbc/0x110
amba_stub_drv_init+0x6c/0x9c
do_one_initcall+0x6c/0x1c0
kernel_init_freeable+0x34c/0x444
kernel_init+0x28/0x13c
ret_from_fork+0x10/0x20
Code: d50323bf d65f03c0 00000000 d503233f (b9400000)
---[ end trace 0000000000000000 ]---