Re: [PATCH v4 0/3] Optimize tlbflush path

From: Atish Patra
Date: Sat Sep 28 2019 - 00:23:41 EST


On Fri, 2019-08-30 at 19:50 -0700, Paul Walmsley wrote:
> Hi Atish,
>
> On Thu, 22 Aug 2019, Atish Patra wrote:
>
> > This series adds few optimizations to reduce the trap cost in the
> > tlb
> > flush path. We should only make SBI calls to remote tlb flush only
> > if
> > absolutely required.
>
> The patches look great. My understanding is that these optimization
> patches may actually be a partial workaround for the TLB flushing bug
> that
> we've been looking at for the last month or so, which can corrupt
> memory
> or crash the system.
>
> If that's the case, let's first root-cause the underlying
> bug. Otherwise
> we'll just be papering over the actual issue, which probably could
> still
> occur even with this series, correct? Since it contains no explicit
> fixes?
>
>
I have verified the glibc locale install issue both in Qemu and
Unleashed. I don't see any issue with OpenSBI master + Linux v5.3
kernel.

As per our investigation, it looks like a hardware errata with
Unleashed board as the memory corruption issue only happens in case of
tlb range flush. In RISC-V, sfence.vma can only be issued at page
boundary. If the range is larger than that, OpenSBI has to issue
multiple sfence.vma calls back to back leading to possible memory
corruption.

Currently, OpenSBI has a platform feature i.e. "tlb_range_flush_limit"
that allows to configure tlb flush threshold per platform. Any tlb
flush range request greater than this threshold, is converted to a full
flush. Currently, it is set to the default value 4K for every
platform[1]. Glibc locale install memory corruption only happens if
this threshold is changed to a higher value i.e. 1G. This doesn't
change anything in OpenSBI code path except the fact that it will issue
many sfence.vma instructions back to back instead of one.

If the hardware team at SiFive can look into this as well, it would be
great.

To conclude, we think this issue need to be investigated by hardware
team and the kernel patch can be merged to get the performance benefit.

[1]
https://github.com/riscv/opensbi/blob/master/include/sbi/sbi_platform.h#L40



> - Paul
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/linux-riscv

--
Regards,
Atish