Re: [PATCH net v3] net: sched: fix packet stuck problem for lockless qdisc

From: Yunsheng Lin
Date: Tue Apr 13 2021 - 03:57:38 EST


On 2021/4/13 15:12, Hillf Danton wrote:
> On Tue, 13 Apr 2021 11:34:27 Yunsheng Lin wrote:
>> On 2021/4/13 11:26, Hillf Danton wrote:
>>> On Tue, 13 Apr 2021 10:56:42 Yunsheng Lin wrote:
>>>> On 2021/4/13 10:21, Hillf Danton wrote:
>>>>> On Mon, 12 Apr 2021 20:00:43 Yunsheng Lin wrote:
>>>>>>
>>>>>> Yes, the below patch seems to fix the data race described in
>>>>>> the commit log.
>>>>>> Then what is the difference between my patch and your patch below:)
>>>>>
>>>>> Hehe, this is one of the tough questions over a bunch of weeks.
>>>>>
>>>>> If a seqcount can detect the race between skb enqueue and dequeue then we
>>>>> can't see any excuse for not rolling back to the point without NOLOCK.
>>>>
>>>> I am not sure I understood what you meant above.
>>>>
>>>> As I understand it, the below patch is essentially the same as
>>>> your previous patch; the only difference I see is that it uses
>>>> qdisc->pad instead of __QDISC_STATE_NEED_RESCHEDULE.
>>>>
>>>> So instead of proposing another patch, it would be better if you
>>>> commented on my patch and improved upon it.
>>>>
>>> Happy to do that after you show how it helps revert NOLOCK.
>>
>> Actually I am not going to revert NOLOCK, but to add optimizations
>> to it if the patch fixes the packet stuck problem.
>>
> Fix is not optimization, right?

For this patch, it is a fix.
In case you missed it, I do have a few ideas to optimize the
lockless qdisc:

1. RFC patch to add lockless qdisc bypass optimization:

https://patchwork.kernel.org/project/netdevbpf/patch/1616404156-11772-1-git-send-email-linyunsheng@xxxxxxxxxx/

2. Implement lockless enqueuing for lockless qdisc using the idea
from Jason and Toke. It shows a noticeable performance increase with
1-4 threads running, using the below prototype based on ptr_ring:

static inline int __ptr_ring_multi_produce(struct ptr_ring *r, void *ptr)
{
	int producer, next_producer;

	do {
		producer = READ_ONCE(r->producer);
		if (unlikely(!r->size) || r->queue[producer])
			return -ENOSPC;
		next_producer = producer + 1;
		if (unlikely(next_producer >= r->size))
			next_producer = 0;
	} while (cmpxchg_relaxed(&r->producer, producer, next_producer) != producer);

	/* Make sure the pointer we are storing points to a valid data. */
	/* Pairs with the dependency ordering in __ptr_ring_consume. */
	smp_wmb();

	WRITE_ONCE(r->queue[producer], ptr);
	return 0;
}

3. Maybe it is possible to remove the netif_tx_lock for lockless qdisc
too, because dev_hard_start_xmit is also under the protection of
qdisc_run_begin()/qdisc_run_end() (if there is only one qdisc using
a netdev queue, which is true for pfifo_fast, I believe).

4. Remove the qdisc->running seqcount operation for lockless qdisc, which
is mainly used to do heuristic locking on q->busylock for locked qdisc.
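
For anyone who wants to play with the reservation scheme from item 2
outside the kernel, here is a rough userspace sketch using C11 atomics.
It is only an analogue under stated assumptions, not the kernel code:
struct demo_ring, DEMO_RING_SIZE and demo_ring_multi_produce() are
made-up names, the fixed-size array stands in for the ptr_ring queue,
and a release store stands in for the smp_wmb() + WRITE_ONCE() pair.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_RING_SIZE 4	/* hypothetical size, for illustration only */

/* Hypothetical stand-in for struct ptr_ring; a NULL slot means "empty". */
struct demo_ring {
	_Atomic int producer;
	int size;
	_Atomic(void *) queue[DEMO_RING_SIZE];
};

static inline int demo_ring_multi_produce(struct demo_ring *r, void *ptr)
{
	int producer, next_producer;

	producer = atomic_load_explicit(&r->producer, memory_order_relaxed);
	do {
		/* A non-NULL slot at the producer index means the ring is full. */
		if (!r->size ||
		    atomic_load_explicit(&r->queue[producer], memory_order_relaxed))
			return -ENOSPC;
		next_producer = producer + 1;
		if (next_producer >= r->size)
			next_producer = 0;
		/* On failure the CAS reloads 'producer' with the current index. */
	} while (!atomic_compare_exchange_weak_explicit(&r->producer, &producer,
							next_producer,
							memory_order_relaxed,
							memory_order_relaxed));

	/* Publish the pointer only after the data it points to is written. */
	atomic_store_explicit(&r->queue[producer], ptr, memory_order_release);
	return 0;
}
```

Each producer first claims a slot index with the compare-and-swap and
only then stores its pointer, so two producers that succeed in the CAS
from different index values write to different slots.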

>
>> Is there any reason why you want to revert it?
>>
> I think you know Jiri's plan and it would be nice to wait a couple of
> months for it to complete.

I am not sure I am aware of Jiri's plan.
Is there any link referring to the plan?
