The Empty/Drop descriptor is intentionally issued by the hardware during a backpressure scenario.
On 6/5/2025 6:00 PM, Johan Hovold wrote:
> On Thu, Jun 05, 2025 at 04:41:32PM +0800, Baochen Qiang wrote:
>> On 6/4/2025 10:45 PM, Johan Hovold wrote:
>>> Add the missing memory barrier to make sure that destination ring
>>> descriptors are read after the head pointers to avoid using stale data
>>> on weakly ordered architectures like aarch64.
>>>
>>> The barrier is added to the ath12k_hal_srng_access_begin() helper for
>>> symmetry with follow-on fixes for source ring buffer corruption which
>>> will add barriers to ath12k_hal_srng_access_end().
>>>
>>> Note that this may fix the empty descriptor issue recently worked around
>>> by commit 51ad34a47e9f ("wifi: ath12k: Add drop descriptor handling for
>>> monitor ring").
>>
>> why? I would expect drunk cookies are valid in case of HAL_MON_DEST_INFO0_EMPTY_DESC,
>> rather than anything caused by reordering.
>
> Based on a quick look it seemed like this could possibly fall in the
> same category as some of the other workarounds I've spotted while
> looking into these ordering issues (e.g. f9fff67d2d7c ("wifi: ath11k:
> Fix SKB corruption in REO destination ring")).
>
> If you say this one is clearly unrelated, I'll drop the comment.

Praneesh, could you comment here since you made that change?
>>> @@ -343,9 +343,6 @@ static int ath12k_ce_completed_recv_next(struct ath12k_ce_pipe *pipe,
>>>  		goto err;
>>>  	}
>>> -	/* Make sure descriptor is read after the head pointer. */
>>> -	dma_rmb();
>>> -
>>>  	*nbytes = ath12k_hal_ce_dst_status_get_length(desc);
>>>  	*skb = pipe->dest_ring->skb[sw_index];
>>> diff --git a/drivers/net/wireless/ath/ath12k/hal.c b/drivers/net/wireless/ath/ath12k/hal.c
>>> index 91d5126ca149..9eea13ed5565 100644
>>> --- a/drivers/net/wireless/ath/ath12k/hal.c
>>> +++ b/drivers/net/wireless/ath/ath12k/hal.c
>>> @@ -2126,13 +2126,24 @@ void *ath12k_hal_srng_src_get_next_reaped(struct ath12k_base *ab,
>>>  void ath12k_hal_srng_access_begin(struct ath12k_base *ab, struct hal_srng *srng)
>>>  {
>>> +	u32 hp;
>>> +
>>>  	lockdep_assert_held(&srng->lock);
>>> -	if (srng->ring_dir == HAL_SRNG_DIR_SRC)
>>> +	if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
>>>  		srng->u.src_ring.cached_tp =
>>>  			*(volatile u32 *)srng->u.src_ring.tp_addr;
>>> -	else
>>> -		srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
>>> +	} else {
>>> +		hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
>>> +
>>> +		if (hp != srng->u.dst_ring.cached_hp) {
>>> +			srng->u.dst_ring.cached_hp = hp;
>>> +			/* Make sure descriptor is read after the head
>>> +			 * pointer.
>>> +			 */
>>> +			dma_rmb();
>>> +		}
>>> +	}
>>
>> This consumes additional CPU cycles in hot path, which is a concern to me.
>> Based on that, I prefer the v1 implementation.
>
> The conditional avoids a memory barrier in case the ring is empty, so
> for all callers but ath12k_ce_completed_recv_next() it's an improvement
> over v1 in that sense.
>
> I could make the barrier unconditional, which will only add one barrier
> to ath12k_ce_completed_recv_next() in case the ring is empty compared to
> v1. Perhaps that's a good compromise if you worry about the extra
> comparison?

I guess the unconditional barrier also has impact on performance? If so I am not sure
which one is better then ...

Let's just keep it as is and see what others think.

> I very much want to avoid having both explicit barriers in the caller
> and barriers in the hal end() helper. I think it should be either or.
>
> Johan
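
For anyone following the conditional-versus-unconditional question, here is a minimal userspace sketch of the pattern under discussion. It is only an analogue, not the driver code: struct demo_ring and the demo_*() helpers are hypothetical names, and a C11 acquire fence stands in for dma_rmb(), which in the kernel orders reads against device DMA writes rather than against another CPU thread. The point is that the fence is only issued when the head pointer has moved, so an empty ring costs one load and one comparison.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u

/* Hypothetical stand-in for a destination ring shared with a producer. */
struct demo_ring {
	_Atomic uint32_t hp;      /* head pointer, advanced by the producer */
	uint32_t cached_hp;       /* consumer's last observed head pointer */
	uint32_t tp;              /* consumer's tail pointer */
	uint32_t desc[RING_SIZE]; /* descriptors, written before hp advances */
};

/* Producer: fill the descriptor first, then publish it by advancing hp. */
static void demo_produce(struct demo_ring *r, uint32_t value)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	r->desc[hp % RING_SIZE] = value;
	atomic_store_explicit(&r->hp, hp + 1, memory_order_release);
}

/*
 * Consumer, conditional variant: refresh cached_hp and issue the fence
 * only if the head pointer moved. Returns the number of fences issued.
 */
static int demo_access_begin(struct demo_ring *r)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	if (hp == r->cached_hp)
		return 0; /* ring unchanged: no new descriptors, no fence */

	r->cached_hp = hp;
	/* Make sure descriptors are read after the head pointer. */
	atomic_thread_fence(memory_order_acquire);
	return 1;
}

/* Consumer: pop one descriptor if any is pending; returns 1 on success. */
static int demo_consume(struct demo_ring *r, uint32_t *out)
{
	if (r->tp == r->cached_hp)
		return 0;

	*out = r->desc[r->tp % RING_SIZE];
	r->tp++;
	return 1;
}
```

An unconditional variant would drop the comparison and always issue the fence, trading a branch for a barrier on every call, including calls that find the ring empty. Which is cheaper on real hardware is exactly the open question above; the sketch does not answer it.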