Re: [PATCH] Fix: xtensa: add missing sync_core

From: Mathieu Desnoyers
Date: Tue Sep 12 2017 - 09:05:01 EST


----- On Sep 12, 2017, at 12:37 AM, Max Filippov jcmvbkbc@xxxxxxxxx wrote:

> Hi Mathieu,
>
> On Tue, Aug 29, 2017 at 11:55 AM, Mathieu Desnoyers
> <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>> ----- On Aug 28, 2017, at 1:12 PM, Max Filippov jcmvbkbc@xxxxxxxxx wrote:
>>> On Mon, Aug 28, 2017 at 12:36 AM, Mathieu Desnoyers
>>> <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>>>> The membarrier system call now requires all architectures to implement
>>>> sync_core(). On Xtensa, it is provided by the EXTW instruction.
>>>>
>>>> [ Completely untested! Can someone on the xtensa side confirm whether
>>>> EXTW is the right way to serialize core execution and try it out ? ]
>>>
>>> Thanks for the patch. I'm currently travelling, I'll give it a try next week
>>> once I'm back at work.
>>
>> I think we may need to flush the icache to make it consistent with the dcache
>> too on xtensa, in addition to the EXTW. The goal here is to allow JIT engines
>> to reclaim and re-use memory after they discard dynamically generated code.
>> This is similar to what we'd need to do on arm32, where they have inconsistent
>> d/i-caches.
>
> my understanding is that to support JIT engines on xtensa we need to do
> icache/dcache synchronization, this procedure is covered in the ISYNC
> instruction description in the ISA book, it involves MEMW and ISYNC,
> but not EXTW. EXTW is meant to work as a CPU barrier that orders all
> externally visible CPU signals, which seems unnecessary.

Good point! Yes, combining MEMW and ISYNC seems to be what we need here.

My upcoming proposal for core-serializing membarrier command issues
smp_mb() (memw) and sync_core(). I would then think that just using the
"isync" instruction is a good candidate for the sync_core() implementation,
given that it is similar to the effect of the core serializing "cpuid"
instruction on Intel.

>
> Interestingly, currently we don't have MEMW between dcache flush and
> icache invalidation, so I need to add it to be consistent with the documented
> procedure. Then I believe that sync_core implementation should invoke
> flush_dcache_all followed by MEMW followed by invalidate_icache_all.
> Does that sound right?

In a different leg of the thread targeting arm64, we discussed that we
should probably only make that membarrier command serialize the core,
without taking care of flushing the caches. Cache flushing instructions
can be executed from user-space on some architectures, or through another
arch-specific system call that targets the address range that needs to be
flushed.

For SMP variants of xtensa, that cache flushing system call should consider
whether icache invalidation needs to be performed on all CPUs when reclaiming
JIT code.

Thoughts ?

Thanks,

Mathieu


>
> --
> Thanks.
> -- Max

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com