RE: [PATCH RFC] net/sched: adjust device watchdog timer to detect stopped queue at right time

From: Praveen Kannoju
Date: Mon May 06 2024 - 10:03:16 EST


Thank you, Jakub.
Your comments have been addressed in the v2 patch. We have tested it internally and it works as expected. Please review it and let us know if any additional changes are needed.

-
Praveen.

> -----Original Message-----
> From: Jakub Kicinski <kuba@xxxxxxxxxx>
> Sent: 04 May 2024 01:00 AM
> To: Praveen Kannoju <praveen.kannoju@xxxxxxxxxx>
> Cc: jhs@xxxxxxxxxxxx; xiyou.wangcong@xxxxxxxxx; jiri@xxxxxxxxxxx; davem@xxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@xxxxxxxxxx>; Rama Nichanamatlu
> <rama.nichanamatlu@xxxxxxxxxx>; Manjunath Patil <manjunath.b.patil@oraclecom>
> Subject: Re: [PATCH RFC] net/sched: adjust device watchdog timer to detect stopped queue at right time
>
> On Fri, 3 May 2024 14:28:13 +0000 Praveen Kannoju wrote:
> > > > txq = netdev_get_tx_queue(dev, i);
> > > > trans_start = READ_ONCE(txq->trans_start);
> > > > - if (netif_xmit_stopped(txq) &&
> > > > - time_after(jiffies, (trans_start +
> > > > - dev->watchdog_timeo))) {
> > > > - timedout_ms = jiffies_to_msecs(jiffies - trans_start);
> > > > - atomic_long_inc(&txq->trans_timeout);
> > > > - break;
> > > > + if (netif_xmit_stopped(txq)) {
> > >
> > > please use continue instead of adding another indentation level
> >
> > > We need to decide whether to break out of the loop or update
> > > "oldest_start" only when the queue is stopped. Hence one more level of
> > > indentation is needed. Can you please elaborate on using "continue" in
> > > the existing condition instead of adding a new indentation level?
>
> If the queue is not stopped, continue. Split the condition into multiple ifs.
>
> > > > + dev->watchdog_timeo))) {
> > > > + timedout_ms = jiffies_to_msecs(current_jiffies -
> > > > + trans_start);
> > > > + atomic_long_inc(&txq->trans_timeout);
> > > > + break;
> > > > + }
> > > > + next_check = trans_start + dev->watchdog_timeo -
> > > > + current_jiffies;
> > >
> > > this will give us "next_check" for last queue. Let's instead find the oldest trans_start in the loop. Do:
> > >
> > > unsigned long oldest_start = jiffies;
> > >
> > > then in the loop:
> > >
> > > oldest_start = min(...)
>
> BTW, the min() I suggested here needs to be an if (time_after(...)); we can't use bare min() to compare jiffies, because they may wrap.
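
For reference, the two review points above (skip non-stopped queues with "continue", and track the oldest trans_start with a wrap-safe comparison rather than a bare min()) can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not the v2 patch: sketch_time_after() is a stand-in that mirrors the kernel's time_after() macro, and struct sketch_txq and sketch_scan() are hypothetical names invented for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's time_after(a, b): true if a is later than b.
 * The signed difference keeps the result correct across counter wraparound,
 * provided the two values are less than LONG_MAX ticks apart. */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Hypothetical, simplified view of a TX queue for this sketch. */
struct sketch_txq {
    bool stopped;               /* plays the role of netif_xmit_stopped(txq) */
    unsigned long trans_start;  /* tick count of the last transmit */
};

/* Scan the queues. Returns the index of the first stopped queue that has
 * exceeded the timeout, or -1 if none has. When -1 is returned,
 * *oldest_start holds the oldest trans_start among the stopped queues
 * (or 'now' if no queue is stopped); it is left untouched on a timeout. */
static int sketch_scan(const struct sketch_txq *q, size_t n,
                       unsigned long now, unsigned long timeo,
                       unsigned long *oldest_start)
{
    unsigned long oldest = now;

    for (size_t i = 0; i < n; i++) {
        unsigned long trans_start = q[i].trans_start;

        /* Queue not stopped: nothing to check, no extra indentation. */
        if (!q[i].stopped)
            continue;

        /* Stopped for longer than the timeout: report it. */
        if (sketch_time_after(now, trans_start + timeo))
            return (int)i;

        /* Wrap-safe replacement for oldest = min(oldest, trans_start). */
        if (sketch_time_after(oldest, trans_start))
            oldest = trans_start;
    }

    *oldest_start = oldest;
    return -1;
}
```

With a stopped queue whose trans_start sits just below the wrap point (for example ULONG_MAX - 10) and another at 3, a bare min() would pick 3 as the "oldest" value, while the time_after-style comparison correctly keeps ULONG_MAX - 10.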