Re: [net PATCH v2] octeontx2-af: Unlock contexts in the queue context cache in case of fault detection

From: Simon Horman
Date: Fri Feb 24 2023 - 04:07:41 EST


On Fri, Feb 24, 2023 at 08:39:20AM +0000, Sai Krishna Gajula wrote:
> Hi Simon,
>
> > -----Original Message-----
> > From: Simon Horman <simon.horman@xxxxxxxxxxxx>
> > Sent: Thursday, February 23, 2023 6:47 PM
> > To: Sai Krishna Gajula <saikrishnag@xxxxxxxxxxx>
> > Cc: davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx;
> > pabeni@xxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > Sunil Kovvuri Goutham <sgoutham@xxxxxxxxxxx>; Suman Ghosh
> > <sumang@xxxxxxxxxxx>
> > Subject: Re: [net PATCH v2] octeontx2-af: Unlock contexts in the queue
> > context cache in case of fault detection
> >
> >
> > On Thu, Feb 23, 2023 at 04:31:25PM +0530, Sai Krishna wrote:
> > > From: Suman Ghosh <sumang@xxxxxxxxxxx>
> > >
> > > > NDC caches contexts of frequently used queues (Rx and Tx queues).
> > > > Due to a HW erratum, when NDC detects a fault/poison while
> > > > accessing contexts it could go into an illegal state where a cache
> > > > line could get locked forever. To make sure all cache lines in NDC are
> > > > available for optimum performance, upon fault/lockerror/poison errors
> > > > scan through all cache lines in NDC and clear the lock bit.
> > >
> > > > Fixes: 4a3581cd5995 ("octeontx2-af: NPA AQ instruction enqueue support")
> > > Signed-off-by: Suman Ghosh <sumang@xxxxxxxxxxx>
> > > Signed-off-by: Sunil Kovvuri Goutham <sgoutham@xxxxxxxxxxx>
> > > Signed-off-by: Sai Krishna <saikrishnag@xxxxxxxxxxx>
> >
> > ...
> >
> > > > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
> > > index 389663a13d1d..6508f25b2b37 100644
> > > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
> > > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
> > > @@ -884,6 +884,12 @@ int rvu_cpt_lf_teardown(struct rvu *rvu, u16
> > > pcifunc, int blkaddr, int lf, int rvu_cpt_ctx_flush(struct rvu *rvu,
> > > u16 pcifunc); int rvu_cpt_init(struct rvu *rvu);
> > >
> > > +/* NDC APIs */
> > > +#define NDC_MAX_BANK(rvu, blk_addr) (rvu_read64(rvu, \
> > > + blk_addr, NDC_AF_CONST) & 0xFF)
> > > +#define NDC_MAX_LINE_PER_BANK(rvu, blk_addr) ((rvu_read64(rvu, \
> > > > + blk_addr, NDC_AF_CONST) & 0xFFFF0000) >> 16)
> >
> > Perhaps not appropriate to include as part of a fix, as NDC_MAX_BANK is
> > being moved from elsewhere, but I wonder if this might be more cleanly
> > implemented using FIELD_GET().
>
> We will modify and send a separate patch for all the possible macros that can be replaced by FIELD_GET().

Thanks, much appreciated.
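
For reference, a minimal userspace sketch of what the FIELD_GET() form might
look like. The GENMASK_ULL() and FIELD_GET() definitions below are simplified
stand-ins for the kernel helpers in linux/bits.h and linux/bitfield.h, and the
mask names and helper functions are hypothetical, not taken from the patch.
The bit ranges follow the masks in the posted macros (0xFF and 0xFFFF0000).

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros of the same name.
 * The real kernel versions add compile-time sanity checks on the mask.
 */
#define GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
/* Dividing by the lowest set bit of the mask shifts the field down to bit 0. */
#define FIELD_GET(mask, reg) \
	(((reg) & (mask)) / ((mask) & (~(mask) + 1)))

/* Hypothetical field names for the two NDC_AF_CONST fields used in the patch. */
#define NDC_AF_CONST_MAX_BANK		GENMASK_ULL(7, 0)
#define NDC_AF_CONST_MAX_LINE_PER_BANK	GENMASK_ULL(31, 16)

/* Takes the raw register value; in the driver it would come from rvu_read64(). */
static inline uint64_t ndc_max_bank(uint64_t ndc_af_const)
{
	return FIELD_GET(NDC_AF_CONST_MAX_BANK, ndc_af_const);
}

static inline uint64_t ndc_max_line_per_bank(uint64_t ndc_af_const)
{
	return FIELD_GET(NDC_AF_CONST_MAX_LINE_PER_BANK, ndc_af_const);
}

/* Checks that the FIELD_GET() form matches the open-coded mask-and-shift. */
static inline void ndc_const_selfcheck(uint64_t v)
{
	assert(ndc_max_bank(v) == (v & 0xFF));
	assert(ndc_max_line_per_bank(v) == ((v & 0xFFFF0000) >> 16));
}
```

The point of the mask-based form is that the field position is stated once,
in the mask, rather than split between a hex constant and a shift count.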

> > ...
> >
> > > > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
> > > index 1729b22580ce..bc6ca5ccc1ff 100644
> > > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
> > > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
> > > @@ -694,6 +694,7 @@
> > > #define NDC_AF_INTR_ENA_W1S (0x00068)
> > > #define NDC_AF_INTR_ENA_W1C (0x00070)
> > > #define NDC_AF_ACTIVE_PC (0x00078)
> > > +#define NDC_AF_CAMS_RD_INTERVAL (0x00080)
> > > #define NDC_AF_BP_TEST_ENABLE (0x001F8)
> > > #define NDC_AF_BP_TEST(a) (0x00200 | (a) << 3)
> > > #define NDC_AF_BLK_RST (0x002F0)
> > > @@ -709,6 +710,8 @@
> > > (0x00F00 | (a) << 5 | (b) << 4)
> > > #define NDC_AF_BANKX_HIT_PC(a) (0x01000 | (a) << 3)
> > > #define NDC_AF_BANKX_MISS_PC(a) (0x01100 | (a) << 3)
> > > +#define NDC_AF_BANKX_LINEX_METADATA(a, b) \
> > > + (0x10000 | (a) << 3 | (b) << 3)
> >
> > It looks a little odd that both a and b are shifted by 3 bits.
> > If it's intended then perhaps it would be clearer to write this as:
> >
> > #define NDC_AF_BANKX_LINEX_METADATA(a, b) \
> > (0x10000 | ((a) | (b)) << 3)
>
> will send v3 patch.

Likewise, thanks.
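
To illustrate the point about the shifts, here is a small userspace sketch
comparing the two spellings of the macro. The macro and function names are
hypothetical; only the expressions are copied from the patch and from the
suggestion above. It shows that the two forms produce the same offsets, and
that with both arguments shifted by 3, swapping a and b yields the same
offset. Whether that matches the intended register layout is not something
this sketch can answer.

```c
#include <assert.h>
#include <stdint.h>

/* Expression as posted in the patch (both a and b shifted by 3). */
#define METADATA_AS_POSTED(a, b)	(0x10000 | (a) << 3 | (b) << 3)
/* Equivalent spelling suggested in review. */
#define METADATA_SUGGESTED(a, b)	(0x10000 | ((a) | (b)) << 3)

/* Returns 1 if both forms agree for all a, b in [0, limit), else 0. */
static inline int metadata_forms_agree(uint32_t limit)
{
	for (uint32_t a = 0; a < limit; a++)
		for (uint32_t b = 0; b < limit; b++)
			if (METADATA_AS_POSTED(a, b) != METADATA_SUGGESTED(a, b))
				return 0;
	return 1;
}

/* Returns 1 if swapping the two indices gives the same offset. */
static inline int metadata_offsets_collide(uint32_t a, uint32_t b)
{
	return METADATA_AS_POSTED(a, b) == METADATA_AS_POSTED(b, a);
}
```

For example, (a, b) = (1, 2) and (2, 1) both map to offset 0x10018 here,
which is why the identical shift stands out when reading the macro.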