Re: [PATCH v2 1/4] wifi: ath12k: fix dest ring-buffer corruption
From: Baochen Qiang
Date: Mon Jun 16 2025 - 07:00:01 EST
On 6/16/2025 5:29 PM, Praneesh P wrote:
>
> On 6/5/2025 4:19 PM, Baochen Qiang wrote:
>>
>> On 6/5/2025 6:00 PM, Johan Hovold wrote:
>>> On Thu, Jun 05, 2025 at 04:41:32PM +0800, Baochen Qiang wrote:
>>>> On 6/4/2025 10:45 PM, Johan Hovold wrote:
>>>>> Add the missing memory barrier to make sure that destination ring
>>>>> descriptors are read after the head pointers to avoid using stale data
>>>>> on weakly ordered architectures like aarch64.
>>>>>
>>>>> The barrier is added to the ath12k_hal_srng_access_begin() helper for
>>>>> symmetry with follow-on fixes for source ring buffer corruption which
>>>>> will add barriers to ath12k_hal_srng_access_end().
>>>>>
>>>>> Note that this may fix the empty descriptor issue recently worked around
>>>>> by commit 51ad34a47e9f ("wifi: ath12k: Add drop descriptor handling for
>>>>> monitor ring").
>>>> why? I would expect drunk cookies are valid in case of HAL_MON_DEST_INFO0_EMPTY_DESC,
>>>> rather than anything caused by reordering.
>>> Based on a quick look it seemed like this could possibly fall in the
>>> same category as some of the other workarounds I've spotted while
>>> looking into these ordering issues (e.g. f9fff67d2d7c ("wifi: ath11k:
>>> Fix SKB corruption in REO destination ring")).
>>>
>>> If you say this one is clearly unrelated, I'll drop the comment.
>> Praneesh, could you comment here since you made that change?
> Empty/Drop descriptor is intentionally issued by the hardware during backpressure scenario
> and is unrelated to the issue discussed in this patch series.
Thanks Praneesh.
Johan, according to that, please drop the comment.
>>>>> @@ -343,9 +343,6 @@ static int ath12k_ce_completed_recv_next(struct ath12k_ce_pipe
>>>>> *pipe,
>>>>> goto err;
>>>>> }
>>>>> - /* Make sure descriptor is read after the head pointer. */
>>>>> - dma_rmb();
>>>>> -
>>>>> *nbytes = ath12k_hal_ce_dst_status_get_length(desc);
>>>>> *skb = pipe->dest_ring->skb[sw_index];
>>>>> diff --git a/drivers/net/wireless/ath/ath12k/hal.c b/drivers/net/wireless/ath/ath12k/
>>>>> hal.c
>>>>> index 91d5126ca149..9eea13ed5565 100644
>>>>> --- a/drivers/net/wireless/ath/ath12k/hal.c
>>>>> +++ b/drivers/net/wireless/ath/ath12k/hal.c
>>>>> @@ -2126,13 +2126,24 @@ void *ath12k_hal_srng_src_get_next_reaped(struct ath12k_base
>>>>> *ab,
>>>>> void ath12k_hal_srng_access_begin(struct ath12k_base *ab, struct hal_srng *srng)
>>>>> {
>>>>> + u32 hp;
>>>>> +
>>>>> lockdep_assert_held(&srng->lock);
>>>>> - if (srng->ring_dir == HAL_SRNG_DIR_SRC)
>>>>> + if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
>>>>> srng->u.src_ring.cached_tp =
>>>>> *(volatile u32 *)srng->u.src_ring.tp_addr;
>>>>> - else
>>>>> - srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
>>>>> + } else {
>>>>> + hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
>>>>> +
>>>>> + if (hp != srng->u.dst_ring.cached_hp) {
>>>> This consumes additional CPU cycles in hot path, which is a concern to me.
>>>>
>>>> Based on that, I prefer the v1 implementation.
>>> The conditional avoids a memory barrier in case the ring is empty, so
>>> for all callers but ath12k_ce_completed_recv_next() it's an improvement
>>> over v1 in that sense.
>>>
>>> I could make the barrier unconditional, which will only add one barrier
>>> to ath12k_ce_completed_recv_next() in case the ring is empty compared to
>>> v1. Perhaps that's a good compromise if you worry about the extra
>>> comparison?
>> I guess the unconditional barrier also has impact on performance? If so I am not sure
>> which one is better then ...
>>
>> Let's just keep it as is and see what others think.
>>
>>> I very much want to avoid having both explicit barriers in the caller
>>> and barriers in the hal end() helper. I think it should be either or.
>>>
>>>>> + srng->u.dst_ring.cached_hp = hp;
>>>>> + /* Make sure descriptor is read after the head
>>>>> + * pointer.
>>>>> + */
>>>>> + dma_rmb();
>>>>> + }
>>>>> + }
>>> Johan
>>