Re: [PATCH net] net: hdlc_x25: Use qdisc to queue outgoing LAPB frames

From: Jakub Kicinski
Date: Sat Jan 30 2021 - 14:17:29 EST


On Sat, 30 Jan 2021 06:29:20 -0800 Xie He wrote:
> On Fri, Jan 29, 2021 at 5:36 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> > I'm still struggling to wrap my head around this.
> >
> > Did you test your code with lockdep enabled? Which Qdisc are you using?
> > You're queuing the frames back to the interface they came from - won't
> > that cause locking issues?
>
> Hmm... Thanks for bringing this to my attention. I indeed find issues
> when the "noqueue" qdisc is used.
>
> When using a qdisc other than "noqueue", when sending an skb:
> "__dev_queue_xmit" will call "__dev_xmit_skb";
> "__dev_xmit_skb" will call "qdisc_run_begin" to mark the beginning of
> a qdisc run, and if the qdisc is already running, "qdisc_run_begin"
> will fail, then "__dev_xmit_skb" will just enqueue this skb without
> starting qdisc. There is no problem.
>
> When using "noqueue" as the qdisc, when sending an skb:
> "__dev_queue_xmit" will try to send this skb directly. Before it does
> that, it will first check "txq->xmit_lock_owner" and will find that
> the current cpu already owns the xmit lock, it will then print a
> warning message "Dead loop on virtual device ..." and drop the skb.
>
> A solution can be queuing the outgoing L2 frames in this driver first,
> and then using a tasklet to send them to the qdisc TX queue.
>
> Thanks! I'll make changes to fix this.

Sounds like too much effort for a sub-optimal workaround.
The qdisc semantics are broken in the proposed scheme (double
counting packets) - both in terms of statistics and if the user
decides to add a policer, filter etc.

Another worry is that something may just inject a packet with
skb->protocol == ETH_P_HDLC but unexpected structure (IDK if
that's a real concern).

It may be better to teach LAPB to stop / start the internal queue.
The lower level drivers just need to call LAPB instead of making
the start/wake calls directly to the stack, and LAPB can call the
stack. Would that not work?
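
To make the idea concrete, here is a rough userspace model of the flow
control I have in mind. It is only a sketch, not kernel code: every
sketch_* name, the queue depths and the struct layout are made up for
illustration, and none of these hooks exist in LAPB today. The driver
calls sketch_lapb_stop_queue() / sketch_lapb_wake_queue() where it would
otherwise call netif_stop_queue() / netif_wake_queue(), and LAPB alone
decides when the stack is stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_LAPB_QLEN 4	/* hypothetical depth of LAPB's internal queue */
#define SKETCH_HW_RING   2	/* hypothetical depth of the driver's TX ring */

struct sketch_dev {
	int lapb_q[SKETCH_LAPB_QLEN];	/* frames held inside LAPB */
	size_t head, count;
	bool lapb_stopped;	/* driver asked LAPB to stop feeding it */
	bool stack_stopped;	/* LAPB asked the stack to stop feeding it */
	size_t hw_inflight;	/* frames currently owned by the hardware */
	size_t hw_sent;		/* frames the hardware has completed */
};

/* Driver -> LAPB: stands in for a direct netif_stop_queue() call. */
static void sketch_lapb_stop_queue(struct sketch_dev *d)
{
	d->lapb_stopped = true;
}

/* Driver TX path: take one frame, stop LAPB once the ring is full. */
static void sketch_drv_xmit(struct sketch_dev *d, int frame)
{
	(void)frame;		/* payload is irrelevant to the model */
	assert(d->hw_inflight < SKETCH_HW_RING);
	d->hw_inflight++;
	if (d->hw_inflight == SKETCH_HW_RING)
		sketch_lapb_stop_queue(d);
}

/*
 * LAPB: push queued frames to the driver while it accepts them, then
 * stop the stack only if the internal queue is completely full.
 */
static void sketch_lapb_kick(struct sketch_dev *d)
{
	while (!d->lapb_stopped && d->count > 0) {
		int frame = d->lapb_q[d->head];

		d->head = (d->head + 1) % SKETCH_LAPB_QLEN;
		d->count--;
		sketch_drv_xmit(d, frame);
	}
	d->stack_stopped = (d->count == SKETCH_LAPB_QLEN);
}

/* Driver -> LAPB: stands in for a direct netif_wake_queue() call. */
static void sketch_lapb_wake_queue(struct sketch_dev *d)
{
	d->lapb_stopped = false;
	sketch_lapb_kick(d);
}

/* Queue one frame inside LAPB; returns false if the queue is full. */
static bool sketch_lapb_enqueue(struct sketch_dev *d, int frame)
{
	if (d->count == SKETCH_LAPB_QLEN)
		return false;
	d->lapb_q[(d->head + d->count) % SKETCH_LAPB_QLEN] = frame;
	d->count++;
	sketch_lapb_kick(d);
	return true;
}

/* Driver TX completion: free one ring slot and wake LAPB. */
static void sketch_drv_tx_done(struct sketch_dev *d)
{
	assert(d->hw_inflight > 0);
	d->hw_inflight--;
	d->hw_sent++;
	sketch_lapb_wake_queue(d);
}
```

In this model frames pass through the stack's queue only once, so
statistics and any policer or filter would see each packet a single
time, and frames generated by LAPB itself never re-enter
__dev_queue_xmit on the same device.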