Re: [PATCH] arm64: asid: Optimize cache_flush for SMT

From: Guo Ren
Date: Mon Jun 24 2019 - 08:26:17 EST


On Mon, Jun 24, 2019 at 7:40 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> I'm very confused by this patch. The title says arm64, yet the code is
> under arch/csky/, and the code in question refers to HARTs, which IIUC
> is RISC-V terminology.
This patch was meant to answer Catalin's question:
> While the algorithm may seem fairly generic, the semantics have a few
> corner cases specific to each architecture. See [1] for a description of
> the semantics we need on arm64 (CnP is a feature where the hardware
> threads of the same core can share the TLB; the original algorithm
> violated the requirements when this feature was enabled).
Here is my reply to Catalin:
C-SKY SMP has only one hart per core, but here is a patch [1] with my
thoughts on avoiding duplicate TLB flushes on SMT:
[1] https://lore.kernel.org/linux-csky/1561305869-18872-1-git-send-email-guoren@xxxxxxxxxx/T/#u

Our discussion is in this thread:
https://lore.kernel.org/linux-arm-kernel/20190624102209.ngwtosgr5fvp3ler@willie-the-truck/T/#m92396a2f238c9eece660cdc0f275e787531d4ec1

>
> On Mon, Jun 24, 2019 at 12:04:29AM +0800, guoren@xxxxxxxxxx wrote:
> > From: Guo Ren <ren_guo@xxxxxxxxx>
> >
> > The hardware threads of one core could share the same TLB for SMT+SMP
> > system. Assume hardware threads number sequence like this:
> >
> > | 0 1 2 3 | 4 5 6 7 | 8 9 a b | c d e f |
> >   core1     core2     core3     core4
>
> Given this is the Linux logical CPU ID rather than a physical CPU ID,
> this assumption is not valid. For example, CPUs may be renumbered across
> kexec.
>
> Even if this were a physical CPU ID, this doesn't hold on arm64 (e.g.
> due to big.LITTLE).
That's OK for csky: on C-SKY SMP the logical CPU ID is the same as the physical one.

>
> > Current algorithm seems is correct for SMT+SMP, but it'll give some
> > duplicate local_tlb_flush. Because one hardware threads local_tlb_flush
> > will also flush other hardware threads' TLB entry in one core TLB.
>
> Does any architecture specification mandate that behaviour?
>
> That isn't true for arm64, I have no idea whether RISC-V mandates that,
> and as below it seems this is irrelevant on C-SKY.
Harts in one core share the same TLB, and I think one hart flushing the TLB will
also affect the other harts in the same core. So we need only one TLB flush per
core.
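
To illustrate the idea only (this is a sketch, not the code from the patch):
given a bitmap of CPUs with a pending flush, keep one CPU per core. The names
one_flush_per_core, THREADS_PER_CORE and SKETCH_NR_CPUS are hypothetical, and
the sketch assumes 4 hardware threads per core with contiguous numbering as in
the diagram above, which may not hold on other architectures.

```c
#include <stdint.h>

/* Assumptions for this sketch: 16 CPUs, 4 contiguous threads per core. */
#define THREADS_PER_CORE 4u
#define SKETCH_NR_CPUS   16u

/*
 * Reduce a bitmap of CPUs with a pending TLB flush to at most one CPU
 * per core, on the assumption that all threads of a core share one TLB.
 */
static uint32_t one_flush_per_core(uint32_t pending)
{
	uint32_t result = 0;
	unsigned int core;

	for (core = 0; core < SKETCH_NR_CPUS / THREADS_PER_CORE; core++) {
		/* Bits belonging to the hardware threads of this core. */
		uint32_t core_mask = ((1u << THREADS_PER_CORE) - 1u)
				     << (core * THREADS_PER_CORE);
		uint32_t hit = pending & core_mask;

		if (hit)
			result |= hit & (~hit + 1u);	/* keep lowest set bit */
	}

	return result;
}
```

For example, if threads 0, 1, 4 and 5 have a pending flush (0x0033), only
threads 0 and 4 (0x0011) would need to do the local flush.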

>
> > So we can use bitmap to reduce local_tlb_flush for SMT.
> >
> > C-SKY cores don't support SMT and the patch is no benefit for C-SKY.
>
> As above, this patch is very confusing -- if this doesn't benefit C-SKY,
> why modify the C-SKY code?
Ditto: it's in answer to Catalin's question, and the patch was compiled for csky.

Best Regards
Guo Ren