Re: Probing devices by their less-specific "compatible" bindings (here: brcmnand)

From: Florian Fainelli
Date: Fri Mar 17 2023 - 17:54:47 EST


+William,

On 3/17/23 03:02, Rafał Miłecki wrote:
Hi, I just spent a few hours debugging a hidden hw lockup and I need to
consult on driver core code behaviour.

I have a BCM4908 SoC based board with a NAND controller on it.


### Hardware binding

Hardware details:
arch/arm64/boot/dts/broadcom/bcmbca/bcm4908.dtsi

Relevant part:
nand-controller@1800 {
    compatible = "brcm,nand-bcm63138", "brcm,brcmnand-v7.1", "brcm,brcmnand";
    reg = <0x1800 0x600>, <0x2000 0x10>;
    reg-names = "nand", "nand-int-base";
};

The above binding is based on the documentation:
Documentation/devicetree/bindings/mtd/brcm,brcmnand.yaml


### Linux drivers

Linux has separate drivers for a few of Broadcom's NAND controller bindings:

1. drivers/mtd/nand/raw/brcmnand/bcm63138_nand.c for:
brcm,nand-bcm63138

2. drivers/mtd/nand/raw/brcmnand/brcmnand.c for:
brcm,brcmnand-v2.1
brcm,brcmnand-v2.2
brcm,brcmnand-v4.0
brcm,brcmnand-v5.0
brcm,brcmnand-v6.0
brcm,brcmnand-v6.1
brcm,brcmnand-v6.2
brcm,brcmnand-v7.0
brcm,brcmnand-v7.1
brcm,brcmnand-v7.2
brcm,brcmnand-v7.3

3. drivers/mtd/nand/raw/brcmnand/brcmstb_nand.c for:
brcm,brcmnand


### Problem

At first Linux probes my hardware using the driver for the
"brcm,nand-bcm63138" compatible string, bcm63138_nand.c. That's good.

If that fails however (.probe() returns an error) then the Linux core
starts probing using drivers for less specific bindings.

Why does it fail?


In my case probing with the driver for the "brcm,brcmnand" string,
brcmstb_nand.c, results in ignoring SoC specific bits and causes a
hardware lockup. Hw isn't initialized properly and
writel_relaxed(0x00000009, base + 0x04) just makes it hang.

Well, the missing piece here is that brcmnand.c is a library driver, so
it needs an entry point, and the next one that matches is brcmstb_nand.c.
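
To make the fallback concrete, here is a minimal userspace sketch in C of the behaviour described in this thread. It is a simplified model, not the driver core code: struct model_driver, model_match(), model_probe_device() and the probe stubs are hypothetical names, and the order in which the two drivers are tried is an assumption. The SoC-specific stub is assumed to defer, and the generic stub simply reports success (on real hardware this is where the lockup would happen).

```c
#include <stddef.h>
#include <string.h>

#define MODEL_EPROBE_DEFER 517 /* numeric value of the kernel's EPROBE_DEFER */

/* Hypothetical stand-in for a platform driver with an OF match table. */
struct model_driver {
	const char *name;
	const char *const *compatibles; /* NULL-terminated list */
	int (*probe)(void);             /* 0 on success, negative on error */
};

/* Return 1 if any of the device's compatible strings is in the driver's table. */
static int model_match(const struct model_driver *drv,
		       const char *const *dev_compat)
{
	for (size_t i = 0; dev_compat[i]; i++)
		for (size_t j = 0; drv->compatibles[j]; j++)
			if (!strcmp(dev_compat[i], drv->compatibles[j]))
				return 1;
	return 0;
}

/*
 * Offer the device to each matching driver in turn. A failed probe
 * (including a deferral) does not stop the walk, so a driver that only
 * matches the generic string still gets its chance.
 * Returns the driver that bound, or NULL if none did.
 */
static const struct model_driver *
model_probe_device(const struct model_driver *const *drivers, size_t n,
		   const char *const *dev_compat)
{
	for (size_t i = 0; i < n; i++) {
		if (!model_match(drivers[i], dev_compat))
			continue;
		if (drivers[i]->probe() == 0)
			return drivers[i];
	}
	return NULL;
}

/* Stubs: the SoC-specific probe defers, the generic probe reports success. */
static int model_bcm63138_probe(void) { return -MODEL_EPROBE_DEFER; }
static int model_brcmstb_probe(void) { return 0; }

static const char *const bcm63138_ids[] = { "brcm,nand-bcm63138", NULL };
static const char *const brcmstb_ids[] = { "brcm,brcmnand", NULL };

static const struct model_driver model_bcm63138 = {
	"bcm63138_nand", bcm63138_ids, model_bcm63138_probe
};
static const struct model_driver model_brcmstb = {
	"brcmstb_nand", brcmstb_ids, model_brcmstb_probe
};

/* Assumed order: SoC-specific entry point first, generic one second. */
static const struct model_driver *const model_drivers[] = {
	&model_bcm63138, &model_brcmstb
};

/* The compatible list from the bcm4908.dtsi node quoted earlier. */
static const char *const bcm4908_compat[] = {
	"brcm,nand-bcm63138", "brcm,brcmnand-v7.1", "brcm,brcmnand", NULL
};
```

Calling model_probe_device(model_drivers, 2, bcm4908_compat) in this model returns the brcmstb_nand entry: no driver in the list claims "brcm,brcmnand-v7.1" on its own, so once the SoC-specific probe fails the generic string is the only remaining match.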


That obviously isn't an acceptable behavior for me. So I'm wondering
what's going wrong here.

Should Linux avoid probing with less-specific compatible strings?
Or should I not claim hw to be "brcm,brcmnand" compatible if it REQUIRES
SoC-specific handling?

An extra note: that fallback probing happens even with .probe()
returning -EPROBE_DEFER. This actually smells fishy to me on the Linux
core part.
I'm not an expert but I think the core should wait for an actual error
without trying less-specific compatible strings & drivers.
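
The policy suggested in this note can be sketched as a small C function. This only illustrates the behaviour being asked for, not what the driver core does today: model_bind_stop_on_defer() is a hypothetical name, and the input is assumed to be the probe results of the matching drivers, already ordered from the most to the least specific compatible string.

```c
#include <stddef.h>

#define MODEL_EPROBE_DEFER 517 /* numeric value of the kernel's EPROBE_DEFER */

/*
 * probe_results[i] is what the i-th matching driver's probe would return
 * (0 on success, negative errno on failure).
 * Returns the index of the driver that binds, -1 if none binds, or
 * -MODEL_EPROBE_DEFER if the device should simply be retried later.
 */
static int model_bind_stop_on_defer(const int *probe_results, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (probe_results[i] == 0)
			return (int)i;
		if (probe_results[i] == -MODEL_EPROBE_DEFER)
			return -MODEL_EPROBE_DEFER; /* wait, do not fall back */
		/* any other error: try the next, less specific driver */
	}
	return -1;
}
```

With results { -MODEL_EPROBE_DEFER, 0 } the function returns -MODEL_EPROBE_DEFER and the generic driver is never tried; with a hard error such as { -22, 0 } it falls through and returns index 1.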


--
Florian