Re: Re: [RFC][PATCH 2/5] mips/atomic: Fix loongson_llsc_mb() wreckage

From: huangpei
Date: Thu Apr 25 2019 - 01:11:05 EST


In my opinion. patch 2/3 is about Loongson's bug, and patch 4/5 is another theme.

Let me explain the bug more specific:

the bug ONLY matters in following situation:

#. more than one cpu (assume cpu A and B) doing ll/sc on same shared var V

#. speculative memory access from A cause A erroneously succeed sc operation, since the
erroneously successful sc operation violate the coherence protocol. (here coherence protocol means the rules that CPU follow to implement ll/sc right)

#. B succeed sc operation too, but this sc operation is right both logically and follow
the coherence protocol, and makes A's sc wrong logically since only ONE sc operation
can succeed.

If it is not LL/SC but other memory access from B on V, A's ll/sc can follow the atomic semantics even if A violate the coherence protocol in the same situation.

In one wordï the bug only affect local cpuâs ll/sc operation, and affect MP system.


PS:

If local_t is only ll/sc manipulated by current CPUï then no need fix it.






> -----ååéä-----
> åää: "Paul Burton" <paul.burton@xxxxxxxx>
> åéæé: 2019-04-25 05:18:04 (ææå)
> æää: "Peter Zijlstra" <peterz@xxxxxxxxxxxxx>
> æé: "stern@xxxxxxxxxxxxxxxxxxx" <stern@xxxxxxxxxxxxxxxxxxx>, "akiyks@xxxxxxxxx" <akiyks@xxxxxxxxx>, "andrea.parri@xxxxxxxxxxxxxxxxxxxx" <andrea.parri@xxxxxxxxxxxxxxxxxxxx>, "boqun.feng@xxxxxxxxx" <boqun.feng@xxxxxxxxx>, "dlustig@xxxxxxxxxx" <dlustig@xxxxxxxxxx>, "dhowells@xxxxxxxxxx" <dhowells@xxxxxxxxxx>, "j.alglave@xxxxxxxxx" <j.alglave@xxxxxxxxx>, "luc.maranget@xxxxxxxx" <luc.maranget@xxxxxxxx>, "npiggin@xxxxxxxxx" <npiggin@xxxxxxxxx>, "paulmck@xxxxxxxxxxxxx" <paulmck@xxxxxxxxxxxxx>, "will.deacon@xxxxxxx" <will.deacon@xxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "torvalds@xxxxxxxxxxxxxxxxxxxx" <torvalds@xxxxxxxxxxxxxxxxxxxx>, "Huacai Chen" <chenhc@xxxxxxxxxx>, "Huang Pei" <huangpei@xxxxxxxxxxx>
> äé: Re: [RFC][PATCH 2/5] mips/atomic: Fix loongson_llsc_mb() wreckage
>
> Hi Peter,
>
> On Wed, Apr 24, 2019 at 02:36:58PM +0200, Peter Zijlstra wrote:
> > The comment describing the loongson_llsc_mb() reorder case doesn't
> > make any sense what so ever. Instruction re-ordering is not an SMP
> > artifact, but rather a CPU local phenomenon. This means that _every_
> > LL/SC loop needs this barrier right in front to avoid the CPU from
> > leaking a memop inside it.
>
> Does it?
>
> The Loongson bug being described here causes an sc to succeed
> erroneously if certain loads or stores are executed between the ll &
> associated sc, including speculatively. On a UP system there's no code
> running on other cores to race with us & cause our sc to fail - ie. sc
> should always succeed anyway, so if the bug hits & the sc succeeds
> what's the big deal? It would have succeeded anyway. At least that's my
> understanding based on discussions with Loongson engineers a while ago.
>
> Having said that, if you have a strong preference for adding the barrier
> in UP systems anyway then I don't really object. It's not like anyone's
> likely to want to run a UP kernel on the affected systems, nevermind
> care about a miniscule performance impact.
>
> One possibility your change could benefit would be if someone ran Linux
> on a subset of cores & some non-Linux code on other cores, in which case
> there could be something to cause the sc to fail. I've no idea if that's
> something these Loongson systems ever do though.
>
> > For the branch speculation case; if futex_atomic_cmpxchg_inatomic()
> > needs one at the bne branch target, then surely the normal
> > __cmpxch_asmg() implementation does too. We cannot rely on the
>
> s/cmpxch_asmg/cmpxchg_asm/
>
> > barriers from cmpxchg() because cmpxchg_local() is implemented with
> > the same macro, and branch prediction and speculation are, too, CPU
> > local.
>
> Similar story - cmpxchg_local() ought not have have CPUs racing for
> access to the memory in question. Having said that I don't know the
> details of when Loongson clears LLBit (ie. causes an sc to fail), so if
> it does that on based upon access to memory at a larger granularity than
> the 32b or 64b value being operated on then that could be a problem so
> I'm pretty happy with adding these barriers.
>
> Thanks,
> Paul


åäåææåäåæçäçæçèåéèääå2åæ 100095çè: +86 (10) 62546668äç: +86 (10) 62600826www.loongson.cnæéäååéäåæéèäçæææéååçåäçåäæïäéäåéçäéååäååçääæççãçæääåäääääååäçïåæääéäåéæé ååæéãååææåïæéäååéääçäæãåææéææéäïèæçåçèæéäéçåääååéæéäã

This email and its attachments contain confidential information from Loongson
Technology Corporation Limited, which is intended only for the person or entity
whose address is listed above. Any use of the information contained herein in
any way (including, but not limited to, total or partial disclosure,
reproduction or dissemination) by persons other than the intended recipient(s)
is prohibited. If you receive this email in error, please notify the sender by
phone or email immediately and delete it.