Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

From: Doug Anderson
Date: Tue May 03 2022 - 12:13:33 EST


Hi,

On Tue, May 3, 2022 at 8:54 AM Krzysztof Kozłowski
<k.kozlowski.k@xxxxxxxxx> wrote:
>
> On Tue, 19 Apr 2022 at 18:55, Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
>
> > > Except shuffling the compatibles in bindings, you are changing the
> > > meaning of final "google,lazor" compatible. The bootloader works as
> > > expected - from most specific (rev5-sku6) to most generic compatible
> > > (google,lazor) but why do you need to advertise the latest rev as
> > > "google,lazor"? Why the bootloader on latest rev (e.g. rev7) cannot bind
> > > to rev7 compatible?
> >
> > The problem really comes along when a board strapped as -rev8 comes
> > along that is a board spin (and thus a new revision) but "should" be
> > invisible to software. Since it should be invisible to software we
> > want it to boot without any software changes. As per my previous mail,
> > sometimes HW guys make these changes without first consulting software
> > (since it's invisible to SW!) and we want to make sure that they're
> > still going to strap as "-rev8".
>
> If you want to boot it without any SW changes, do not change the SW.
> Do not change the DTB. If you admit that you want to change DTB, so
> the SW, sure, change it and accept the outcome - you have a new
> compatible. This new compatible can be or might be not compatible with
> rev7. Up to you.
>
> >
> > So what happens with this -rev8 board? The bootloader will check and
> > it won't see any device tree that advertises "google,lazor-rev8",
> > right?
>
> Your bootloader looks for a specific rev8, which is not compatible
> with rev7 (or is it? I lost the point of your example)

Actually the whole point is that _we don't know_ if -rev7 and -rev8
are compatible.

Think of it this way. You've got component A on your board and you
power it up with 1.8 V. We run out of component A and we decide to
replace it with component B. The vendor promises that component B is a
drop-in replacement for component A. You boot up a few devices with
component B and everything looks good. You build a whole lot of
products.

Sometime down the line you start getting failure reports. It turns out
that products that have component B are sporadically failing in the
field. After talking to the vendor, they suggest that we need to power
component B with 1.85 V instead of 1.80 V. Luckily we can adjust the
voltage with the PMIC, but component A's vendor doesn't want you to
bump the voltage up to 1.85 V.

Even though we originally thought that the two boards were 100%
compatible, it later turns out that they're not.

So as a general principle, if we make big changes to a product we
increment the board revision strappings even if we think it's
invisible to software. This can help us get out of sticky situations
in the future.


> and you ship
> it with a DTB which has rev7, but not rev8. You control both pieces -
> bootloader and DTB. You cannot put incompatible pieces of firmware
> (one behaving entirely different than other) and expect proper output.
> This is why you also have bindings.

...and by "you" in "*you* control both pieces" you mean some
collection of people spread across several companies and several
countries and who don't always communicate well with each other. If
they believe that a change should be invisible to software, folks
building the hardware in China don't always send me a heads up in
California, but I still want them to bump the revision number just in
case they messed up and we do need a software change down the road.


> > If _all_ lazor revisions all include the "google,lazor"
> > compatible then the bootloader won't have any way to know which to
> > pick. The bootloader _doesn't_ have the smarts to know that "-rev7" is
> > closest to "-rev8".
>
> rev7 the next in the compatible list, isn't it? So bootloader picks up
> the fallback...

No. The bootloader works like this (just looking at the revision
strappings and ignoring the SKU strappings):

1. Read board strappings and get an ID (like "8")

2. Look for "google,lazor-rev8".

3. If it's not there, look for "google,lazor"

4. If it's not there then that's bad.

...so "-rev7" is _not_ in the compatible list for "-rev8".
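
To make the steps above concrete, here is a rough sketch of that lookup.
This is not the real bootloader code: the function names, the DTB names
and the table of compatibles are all made up for illustration, and SKU
strappings are ignored just like in the list above.

```python
# Hypothetical sketch of the revision matching described above.
# Nothing here is taken from the actual bootloader source.

def candidate_compatibles(rev, board="google,lazor"):
    """Strings searched for, most specific first (steps 2 and 3)."""
    return [f"{board}-rev{rev}", board]


def pick_dtb(dtbs, rev):
    """dtbs maps a made-up DTB name to its top-level compatible list."""
    for wanted in candidate_compatibles(rev):
        for name, compat in dtbs.items():
            if wanted in compat:
                return name  # first hit wins; no "closest revision" logic
    return None  # step 4: nothing matched, that's bad


# Only the newest device tree advertises the bare "google,lazor".
dtbs = {
    "lazor-r6": ["google,lazor-rev6", "qualcomm,sc7180"],
    "lazor-r7": ["google,lazor-rev7", "google,lazor", "qualcomm,sc7180"],
}

print(pick_dtb(dtbs, 6))  # exact match on "google,lazor-rev6"
print(pick_dtb(dtbs, 8))  # no "-rev8" anywhere, falls back to "google,lazor"
```

Note that the lookup never asks for "google,lazor-rev7" when the
strapping says 8. If every entry in the table carried "google,lazor",
the fallback would simply return whichever one it hit first.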


> > It'll just randomly pick one of the "google,lazor"
> > boards. :( This is why we only advertise "google,lazor" for the newest
> > device tree.
> >
> > Yes, I agree it's not beautiful but it's what we ended up with. I
> > don't think we want to compromise on the ability to boot new revisions
> > without software changes because that will just incentivize people to
> > not increment the board revision. The only other option would be to
> > make the bootloader smart enough to pick the "next revision down" but
> > so far they haven't been willing to do that.
>
> Just choose the fallback and follow Devicetree spec...

It does choose the fallback and follow the devicetree spec, but the
bootloader doesn't have rules to consider "-rev7" as a fallback for
"-rev8".


> > I guess the question, though, is what action should be taken. I guess
> > options are:
> >
> > 1. Say that the above requirement that new "invisible" HW revs can
> > boot w/ no software changes is not a worthy requirement. Personally, I
> > wouldn't accept this option.
> >
> > 2. Ignore. Don't try to document top level compatible for these devices.
> >
> > 3. Document the compatible and accept that it's going to shuffle around a lot.
> >
> > 4. Try again to get the bootloader to match earlier revisions as fallbacks.
> >
> >
> > > > Now we can certainly argue back and forth above the above scheme and
> > > > how it's terrible and/or great, but it definitely works pretty well
> > > > and it's what we've been doing for a while now. Before that we used to
> > > > proactively add a whole bunch of "future" revisions "just in case".
> > > > That was definitely worse and had the same problem that we'd have to
> > > > shuffle compatibles. See, for instance `rk3288-veyron-jerry.dts`.
> > > >
> > > > One thing we _definitely_ don't want to do is to give HW _any_
> > > > incentive to make board spins _without_ changing the revision. HW
> > > > sometimes makes spins without first involving software and if it
> > > > doesn't boot because they updated the board ID then someone in China
> > > > will just put the old ID in and ship it off. That's bad.
> > > >
> > > > --
> > > >
> > > > But I guess this doesn't answer your question: how can userspace
> > > > identify what board this is running? I don't have an answer to that,
> > > > but I guess I'd say that the top-level "compatible" isn't really it.
> > >
> > > It can, the same as bootloader, by looking at the most specific
> > > compatible (rev7).
> > >
> > > > If nothing else, I think just from the definition it's not guaranteed
> > > > to be right, is it? From the spec: "Specifies a list of platform
> > > > architectures with which this platform is compatible." The key thing
> > > > is "a list". If this can be a list of things then how can you use it
> > > > to uniquely identify what one board you're on?
> > >
> > > The most specific compatible identifies or, like recently Rob confirmed
> > > in case of Renesas, the list of compatibles:
> > > https://lore.kernel.org/linux-devicetree/Yk2%2F0Jf151gLuCGz@xxxxxxxxxxxxxxxxxx/
> >
> > I'm confused. If the device tree contains the compatibles:
> >
> > "google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"
> >
> > You want to know what board you're on and you look at the compatible,
> > right? You'll decide that you're on a "google,lazor-rev4" which is the
> > most specific compatible. ...but you could have booted a
> > "google,lazor-rev3". How do you know?
>
> Applying the wrong DTB on the wrong device will always give you the
> wrong answer. You can try too boot google,lazor-rev3 on x86 PC and it
> does not make it a google,lazor-rev3...

I don't understand what you're saying here. If a device tree has the compatible:

"google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"

You wouldn't expect to boot it on an x86 PC, but you would expect to
boot it on either a "google,lazor-rev4" _or_ a "google,lazor-rev3".
Correct? Now, after we've booted, software wants to look at the
compatible of the device tree that was booted. The most specific entry
in that device tree is "google,lazor-rev4". ...but we could have
booted it on a "google,lazor-rev3". How can you know?
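
Here is the same question as a small sketch. It assumes (my assumption,
for illustration only) that software identifies the board by taking the
first, most specific entry of the compatible list, and the helper names
are hypothetical.

```python
# Hypothetical illustration; the compatible list is the one quoted above.
COMPAT = [
    "google,lazor-rev4",
    "google,lazor-rev3",
    "google,lazor",
    "qualcomm,sc7180",
]


def dtb_matches(compat, strapped_rev):
    """Would a board strapped with this revision accept this device tree?"""
    return f"google,lazor-rev{strapped_rev}" in compat


def most_specific(compat):
    """What software sees if it trusts the first compatible entry."""
    return compat[0]


for strapped_rev in (3, 4):
    # Both boards boot the same device tree...
    assert dtb_matches(COMPAT, strapped_rev)
    # ...and both report the same "most specific" compatible afterwards.
    print(strapped_rev, most_specific(COMPAT))
```

The strapped revision never shows up in the answer, so the compatible
alone can't tell the two boards apart.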

-Doug