On Wed, Jun 04, 2025 at 01:32:08PM +0800, Miaoqing Pan wrote:
On 6/4/2025 10:34 AM, Miaoqing Pan wrote:
On 6/3/2025 7:51 PM, Johan Hovold wrote:
On Tue, Jun 03, 2025 at 06:52:37PM +0800, Baochen Qiang wrote:
On 6/2/2025 4:03 PM, Johan Hovold wrote:
No, the barrier is needed between reading the head pointer and
accessing
descriptor fields, that's what matters.
You can still end up with reading stale descriptor data even when
ath11k_hal_srng_dst_get_next_entry() returns non-NULL due to
speculation
(that's what happens on the X13s).
The fact is that a dma_rmb() does not even prevent speculation, no
matter where it is
placed, right?
It prevents the speculated load from being used.
If so the whole point of dma_rmb() is to prevent from compiler
reordering
or CPU reordering, but is it really possible?
The sequence is
1# reading HP
srng->u.dst_ring.cached_hp = READ_ONCE(*srng-
u.dst_ring.hp_addr);
2# validate HP
if (srng->u.dst_ring.tp == srng->u.dst_ring.cached_hp)
return NULL;
3# get desc
desc = srng->ring_base_vaddr + srng->u.dst_ring.tp;
4# accessing desc
ath11k_hal_desc_reo_parse_err(... desc, ...)
Clearly each step depends on the results of previous steps. In this
case the compiler/CPU
is expected to be smart enough to not do any reordering, isn't it?
Steps 3 and 4 can be done speculatively before the load in step 1 is
complete as long as the result is discarded if it turns out not to be
needed.
If the condition in step 2 is true and step 3 speculatively loads
descriptor from TP before step 1, could this cause issues?
Sorry for typo, if the condition in step 2 is false and step 3
speculatively loads descriptor from TP before step 1, could this cause
issues?
Almost correct; the descriptor can be loaded (from TP) before the head
pointer is loaded and thus before the condition in step 2 has been
evaluated. And if the condition in step 2 later turns out to be false,
step 4 may use stale data from before the head pointer was updated.