Re: [PATCH 2/3] drm/etnaviv: Don't ignore errors on getting clocks

From: Russell King - ARM Linux admin
Date: Wed May 20 2020 - 11:00:48 EST


On Wed, May 20, 2020 at 04:04:39PM +0200, Lucas Stach wrote:
> Am Mittwoch, den 20.05.2020, 15:38 +0200 schrieb Lubomir Rintel:
> > Yes. This means that in fact "core" is the only required clock for all
> > implementations of vivante,gc and the common binding needs to be updated
> > to reflect that. I'll follow with a patch that does that, unless there
> > are strong objections.
> >
> > If there are implementations that require different clock inputs, then they
> > need to use additional compatible string for the particular flavor and the
> > binding should have conditionals for them. Something like this:
> >
> > if:
> > properties:
> > compatible:
> > contains:
> > const: fsl,imx6sx-gpu
> > then:
> > properties:
> > clocks:
> > minItems: 4
>
> The DT binding of a device should describe the hardware of the device,
> not the specific integration into a SoC. Now it's a bit hard to make
> any definite statements about the Vivante GC GPU module itself, as most
> of the information we have is from reverse engineering. It's pretty
> clear though that the GPU module has at least 2 clock inputs: axi and
> core, as there is a feature bit that tells us if it's okay to gate the
> axi clock independently from core.
>
> I'm not 100% sure about the older cores as found in Dove, but all the
> more recent cores allow to clock the shader partition independently of
> the core partition, so that's another clock input.
>
> Now when it comes to a SoC integration, it's totally fine to have all
> those GPU module clock inputs fed from the same clock source and behind
> a shared gate maybe. But that doesn't change the clock inputs from the
> device perspective, it's still 3 independent clock inputs, which then
> just point to the same clock source in the DT.
>
> imx6sx.dtsi is even a precedent of such a setup: all module clock
> inputs are fed by a common clock and share a single gate.

It's very good to stand on a pedistal, and try to dictate what you
want, but what you want is not always possible, and that is the case
here. You have made your idea of how you see the hardware quite
plain.

Getting back to reality, again, what we know is that there is one
clock for the GC600 on Dove, since are register settings to control
that clock.

What we also know is that there is an AHB clock, but how that relates
to the rest of the system, we have no documentation. It is that is
used for the AHB bus, which incidentally includes the GC600 for access
to its register set. I don't remember whether AHB peripherals require
the clock or not, that is something that would need to be checked.
If they do, then it seems that clock is missing from the binding.

Then there's a description of the AXI interface to the GC600 which
mentions no clock (see 10.7.2 below). It has a clock, but what that
clock is, and how it relates to the rest of the system is not clearly
specified.

So, again, what you're basically asking for is for someone to guess
and throw some ficticious model into DT.

No, if it's not known, then we should not be guessing and throwing
random garbage into a SoC DT description - once it's in the DT
description, it has to be supported going forward, and if it's later
found to be incorrect, then we have a problem.

So, I don't see how Dove can meet your requirements, and I think you
have to compromise on this, and just accept that Dove only has one
known clock.

If you want to continue being pedantic about this, then in order to
support that pedanty, we need to add an AHB clock to all Vivante GC
descriptions as well since the AHB interface requires it - there it
is called HCLK in the AHB specs. Here is the extract from the Dove
documentation for the AHB interface:

10.7.1 AHB Interface
The main features of the AHB interface are:
n 256k addressable register space
n 32-bit accesses only (no bursts)
n 32-bit data bus
n Error response for illegal accesses
n Asynchronous interface to the Graphics Core
n Interrupt support
The interface uses a separate clock that is slower
than the AXI Bus clock, thus allowing a more
relaxed logic design. Because the Core clock is
different from the interface clock, design within
the GPU Core ensures that incoming and outgoing
data are handled properly when they cross clock
boundaries.
The GPU block occupies 256 KB (64k 32-bit words) of
the system address space. An AHB ERROR response is
returned if an illegal access is detected. Only
32-bit reads and writes are permitted.

10.7.2 AXI Interface
The main features of the AXI interface are:
n 64-bit independent read and write data buses
n Multiple Burst length (8 bytes, 16 bytes,
32 bytes, or 64 bytes)
n High-performance out-of-order / 16 multiple
outstanding accesses
n Supports out-of-order return data for
different clients
n Asynchronous interface to the graphics core
The AXI Interface included in the graphics GPU is
used to retrieve data from the memory attached to
the AXI Interconnect.
The possible graphics clients using the AXI to fetch
data are:
n DMA (to fetch commands, states, index and
vertex data)
n Rasterizer
n Texture engine (for reading of texture data)
n Pixel engine (for reading and writing of Z and
color buffer)

Now, as far as clocking goes:

4.3 Clocking
4.3.1 Clock Domains
The 88AP510 has one PLL for the CPU and DDR Memory
Controller, and a Core PLL that generates the AXI,
Mbus, and internal clocks.
The CPU and DDR Memory Controller operate
synchronously at a ratio of N:1 (N=1..10), 5:2 or 7:2.
The GPU, VMetaâ, and Display controller operate in
the AXI bus clock domain, which is asynchronous to
the CPU clock domain. The synchronization between
the AXI and CPU clock domains is implemented by the
Memory Controller Bridge (MCB). The Mbus is clocked
at Tclk, which is asynchronous to the AXI clock
domain. Synchronization between the Mbus and the AXI
clock domains is accomplished by the Upstream bridge
and the Downstream bridge.
4.3.2 Core PLL Clock Tree
The GPU, VMetaâ, and Display Controller derive the
operating clock from the Core PLL through the
dedicated PLL clock dividers, and are asynchronous
to the AXI interconnect bus clock. The clock
tree is depicted in Figure 5.

Figure 4: AXI Units Clock Tree

<diagram of Core-PLL feeding the display controller,
VMeta and GPU clocks, through dividers, which are
not the AXI clock but the functional clocks for these
peripherals. No mention of how the AXI clock is
derived.>

Figure 5 is something entirely different, so I suspect they mean figure
4 there.

So, as I say, you're basically asking for someone to make something up
to suit your requirements and throw that into DT.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up