Re: [PATCH v3 3/3] clocksource: timer-riscv: Set CLOCK_EVT_FEAT_C3STOP based on DT

From: Conor Dooley
Date: Sat Nov 26 2022 - 09:51:32 EST


Hey all,

On Fri, Nov 25, 2022 at 11:44:01PM +0000, Conor Dooley wrote:
> On Fri, Nov 25, 2022 at 01:13:04PM +0000, Conor Dooley wrote:
> > On Fri, Nov 25, 2022 at 04:51:05PM +0530, Anup Patel wrote:
> > > We should set CLOCK_EVT_FEAT_C3STOP for a clock_event_device only
> > > when riscv,timer-cant-wake-cpu DT property is present in the RISC-V
> > > timer DT node.
> > >
> > > This way CLOCK_EVT_FEAT_C3STOP feature is set for clock_event_device
> > > based on RISC-V platform capabilities rather than having it set for
> > > all RISC-V platforms.
> >
> > I need to go do some testing on what setting the C3STOP flag does on
> > platforms other than PolarFire SoC. I'm not sure that we should be
> > enabling this flag *at all* until we know that it does not break on
> > other platforms too.
>
> I tried my fu540 & fu740 - both of those seem to exhibit broken timer
> behaviour with C3STOP set. Ethernet doesn't work upstream on the
> VisionFive, so I didn't go through the hassle of testing that - but I
> would imagine it is the same as the fu740. Whenever I get a VisionFive 2
> I'll give that a try too.
>
> I did try the D1 (thanks for fielding my dumb questions Samuel) but I
> was not able to get the thing to boot if I disabled the sunxi timer :/
> Ethernet would not come up in U-Boot, clearly I did something not
> right..
>
> Obviously we need to fix things & get it backported etc, so taking a
> pragmatic approach: I think that it is better to merge this stuff even
> though there's a pretty good chance that it'll break the SBI timer on
> a D1, since that timer is not intended to be used there.
>
> It does make me worried about some of the other platforms though, like
> that Bouffalolabs SoC that Jisheng sent in a DT for. It's also using
> thead stuff so I wonder if it needs C3STOP too. I've added Jisheng to
> CC :)
>
> > > Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx>
> > > ---
> > > drivers/clocksource/timer-riscv.c | 10 ++++++++++
> > > 1 file changed, 10 insertions(+)
> > >
> > > diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
> > > index a0d66fabf073..0c8bdd168a45 100644
> > > --- a/drivers/clocksource/timer-riscv.c
> > > +++ b/drivers/clocksource/timer-riscv.c
> > > @@ -28,6 +28,7 @@
> > > #include <asm/timex.h>
> > >
> > > static DEFINE_STATIC_KEY_FALSE(riscv_sstc_available);
> > > +static bool riscv_timer_cant_wake_cpu;
> > >
> > > static int riscv_clock_next_event(unsigned long delta,
> > > struct clock_event_device *ce)
> > > @@ -85,6 +86,8 @@ static int riscv_timer_starting_cpu(unsigned int cpu)
> > >
> > > ce->cpumask = cpumask_of(cpu);
> > > ce->irq = riscv_clock_event_irq;
> > > + if (riscv_timer_cant_wake_cpu)
> > > + ce->features |= CLOCK_EVT_FEAT_C3STOP;
> > > clockevents_config_and_register(ce, riscv_timebase, 100, 0x7fffffff);
> > >
> > > enable_percpu_irq(riscv_clock_event_irq,
> > > @@ -139,6 +142,13 @@ static int __init riscv_timer_init_dt(struct device_node *n)
> > > if (cpuid != smp_processor_id())
> > > return 0;
> > >
> > > + child = of_find_compatible_node(NULL, NULL, "riscv,timer");
> > > + if (child) {
> > > + riscv_timer_cant_wake_cpu = of_property_read_bool(child,
> > > + "riscv,timer-cant-wake-cpu");
> > > + of_node_put(child);
> > > + }
> > > +
> > > domain = NULL;
> > > child = of_get_compatible_child(n, "riscv,cpu-intc");
> > > if (!child) {
>
> Anyway, the mechanics of the change here look good to me. The re-use of
> child is understandable but a little odd though, since riscv,timer /is
> not/ actually a child. That's a relatively minor thing to change though.
>
> I'm still not happy about turning on C3STOP when we have not figured out
> why it's breaking timer behaviour, but I think that's the lesser of two
> evils. Somewhat reluctantly:
> Reviewed-by: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
>
> I'll try to spend some time looking into why it's broken.
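
On the naming nit above, a minimal userspace sketch (a mock, not kernel
code) of the same lookup with a variable named for what it holds. The
mock_* names, the flat node table and the flag value are assumptions made
purely for illustration; the real driver would keep using
of_find_compatible_node() and of_property_read_bool() as in the hunk:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock value for illustration; the real flag lives in <linux/clockchips.h>. */
#define MOCK_CLOCK_EVT_FEAT_C3STOP 0x8u

/* Hypothetical stand-in for a DT node: a compatible string and one property. */
struct mock_dt_node {
	const char *compatible;
	bool cant_wake_cpu;	/* models "riscv,timer-cant-wake-cpu" */
};

/*
 * Search a flat table by compatible string. The match is found anywhere
 * in the table, not among the children of some parent node.
 */
static const struct mock_dt_node *
mock_find_compatible(const struct mock_dt_node *nodes, size_t n,
		     const char *compat)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(nodes[i].compatible, compat) == 0)
			return &nodes[i];
	return NULL;
}

/* Return base_features, with C3STOP added only when the property is set. */
static unsigned int mock_timer_features(const struct mock_dt_node *nodes,
					size_t n, unsigned int base_features)
{
	/* Named "timer" rather than "child", since it is not a child node. */
	const struct mock_dt_node *timer;

	timer = mock_find_compatible(nodes, n, "riscv,timer");
	if (timer && timer->cant_wake_cpu)
		base_features |= MOCK_CLOCK_EVT_FEAT_C3STOP;

	return base_features;
}
```

If the node or the property is missing, the features come back unchanged,
which matches the default the hunk gives platforms without the property.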

Right, so some good news! Samuel provided me with an OpenSBI setup to
actually test that timer, and C3STOP is currently breaking the timers on
the D1 too! IOW the same thing happens there: timer durations are rounded
up to the next jiffy. He then suggested the fix for it too, see below the
scissors :)
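
To put rough numbers on that rounding, here is a small userspace sketch of
the arithmetic. It is not kernel code: the helper name is hypothetical and
the 250 Hz tick rate used in the note below is an assumption, the actual
CONFIG_HZ on these boards may differ.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round a duration up to a whole number of jiffies
 * for a given tick rate. hz must be non-zero.
 */
static uint64_t round_up_to_jiffy_ns(uint64_t duration_ns, unsigned int hz)
{
	const uint64_t jiffy_ns = 1000000000ULL / hz;

	/* Integer ceiling division, then scale back to nanoseconds. */
	return ((duration_ns + jiffy_ns - 1) / jiffy_ns) * jiffy_ns;
}
```

Assuming hz = 250, a jiffy is 4 ms, so a 1 ms timeout comes back as 4 ms
and anything just over 4 ms comes back as 8 ms.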

I think the revert in patch 1 is still needed (to preserve suspend
functionality for existing platforms) but the commit message needs to be
changed.

Perhaps, it should become:
> From: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
>
> This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.
>
> On the subject of suspend, the RISC-V SBI spec states:
> > Request the SBI implementation to put the calling hart in a platform
> > specific suspend (or low power) state specified by the suspend_type
> > parameter. The hart will automatically come out of suspended state and
> > resume normal execution when it receives an interrupt or platform
> > specific hardware event.
>
> This does not cover whether any given events actually reach the hart or
> not, just what the hart will do if it receives an event. On PolarFire
> SoC, and potentially other SiFive based implementations, events from the
> RISC-V timer do reach a hart during suspend. This is not the case for
> the implementation on the Allwinner D1 - there timer events are not
> received during suspend.
>
> To prevent a device from entering an unrecoverable sleep state, the
> C3STOP feature was enabled unconditionally for the RISC-V timer driver.
> Unfortunately, this will have disabled sleep states used by existing
> platforms.
>
> Fortunately, the D1 has a second timer, which is "currently used in
> preference to the RISC-V/SBI timer driver" so a revert here does not
> hurt operation of D1 in its current form.
>
> Ultimately, a DeviceTree property (or node) will be added to encode the
> behaviour of the timers, but until then revert the addition of
> CLOCK_EVT_FEAT_C3STOP.
>
> Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/
> Link: https://lore.kernel.org/linux-riscv/bf6d3b1f-f703-4a25-833e-972a44a04114@xxxxxxxxxxxx/
> Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
> CC: Samuel Holland <samuel@xxxxxxxxxxxx>
> CC: Anup Patel <anup@xxxxxxxxxxxxxx>
> CC: Palmer Dabbelt <palmer@xxxxxxxxxxx>
> Reviewed-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> Acked-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> Acked-by: Samuel Holland <samuel@xxxxxxxxxxxx>
> Signed-off-by: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx>

Anyways, I think the new order of the patchset would have the below as
patch 1 & the current series on top of that. With those changes, I am
happy with the series & thanks for your (plural) help in figuring all of
this out!

Thanks,
Conor.

-- >8 --