Re: [PATCH v3 0/2] riscv: errata: thead: use riscv_nonstd_cache_ops for CMO
From: Jisheng Zhang
Date:  Thu Oct 12 2023 - 10:53:01 EST
On Thu, Oct 12, 2023 at 03:36:28PM +0100, Conor Dooley wrote:
> On Thu, Oct 12, 2023 at 10:21:08PM +0800, Jisheng Zhang wrote:
> > On Thu, Oct 12, 2023 at 10:14:54PM +0800, Jisheng Zhang wrote:
> > > Previously, we used the alternative mechanism to dynamically patch
> > > the CMO operations for THEAD C906/C910 during boot for performance
> > > reasons. But as pointed out by Arnd, "there is already a significant
> > > cost in accessing the invalidated cache lines afterwards, which is
> > > likely going to be much higher than the cost of an indirect branch".
> > > And indeed, there's no performance difference with GMAC and eMMC in
> > > my tests on a Sipeed Lichee Pi 4A board.
> > > 
> > > Use riscv_nonstd_cache_ops for THEAD C906/C910 CMO to simplify
> > > the alternative code, and to achieve Arnd's goal -- "I think
> > > moving the THEAD ops at the same level as all nonstandard operations
> > > makes sense, but I'd still leave CMO as an explicit fast path that
> > > avoids the indirect branch. This seems like the right thing to do both
> > > for readability and for platforms on which the indirect branch has a
> > > noticeable overhead."
> > > 
> > > To make bisecting easy, I use two patches here: patch1 does the conversion,
> > > which just mimics the current CMO behavior via riscv_nonstd_cache_ops; I
> > > assume no functional changes. patch2 uses T-HEAD PA-based CMO
> > > instructions so that we don't need to convert PA to VA.
> > > 
> > > Hi Guo,
> > > 
> > > I didn't use wback_inv for wback as you suggested during the v1 review;
> > > this can be left as a future optimization.
> > > 
> > > Thanks
> > > 
> > > since v2:
> > >   - collect Reviewed-by tag
> > 
> > Oh, I missed the tag collection, but I know maintainers are using b4, which can
> > collect and apply tags automatically ;). Let me know if you want a new
> > version.
> 
> It doesn't collect tags (AFAIU) from earlier revisions though.
Oops, I didn't know this before. I just sent out v4 with real tag collection to
make the merging process smooth.
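
For reference, the dispatch pattern discussed in the quoted cover letter (nonstandard ops reached through a table of function pointers, with the standard CMO path kept as a direct fast path) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not code from the patches: every demo_* name is hypothetical, the struct is loosely modeled on riscv_nonstd_cache_ops, and the "cache operations" just record which path was taken instead of issuing any instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's phys_addr_t (assumption). */
typedef uint64_t demo_phys_addr_t;

/* Loosely modeled on struct riscv_nonstd_cache_ops; demo_* names are hypothetical. */
struct demo_nonstd_cache_ops {
	void (*wback)(demo_phys_addr_t paddr, size_t size);
	void (*inv)(demo_phys_addr_t paddr, size_t size);
	void (*wback_inv)(demo_phys_addr_t paddr, size_t size);
};

/* Records which path the last write-back took, so the behavior is observable. */
enum demo_path { DEMO_PATH_NONE, DEMO_PATH_STANDARD, DEMO_PATH_VENDOR };
static enum demo_path demo_last_path = DEMO_PATH_NONE;

/* Empty until a vendor registers its ops; NULL members mean "not provided". */
static struct demo_nonstd_cache_ops demo_registered_ops;

/* A real vendor hook would issue the vendor's cache instructions here. */
static void demo_vendor_wback(demo_phys_addr_t paddr, size_t size)
{
	(void)paddr;
	(void)size;
	demo_last_path = DEMO_PATH_VENDOR;
}

/* Only wback is provided; inv and wback_inv stay NULL in this sketch. */
static const struct demo_nonstd_cache_ops demo_vendor_ops = {
	.wback = demo_vendor_wback,
};

/* Copy the vendor's table into the registered slot. */
static void demo_register_cache_ops(const struct demo_nonstd_cache_ops *ops)
{
	assert(ops != NULL);
	demo_registered_ops = *ops;
}

static void demo_dma_wback(demo_phys_addr_t paddr, size_t size)
{
	/* Nonstandard path: one indirect branch through the ops table. */
	if (demo_registered_ops.wback) {
		demo_registered_ops.wback(paddr, size);
		return;
	}
	/* Standard path: stays a direct sequence with no indirect branch. */
	demo_last_path = DEMO_PATH_STANDARD;
}
```

Calling demo_dma_wback() before any registration takes the standard path; after demo_register_cache_ops(&demo_vendor_ops), the same call goes through the function pointer. The indirect branch is therefore paid only on platforms that register nonstandard ops.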