Re: [PATCH 3/4] mtd: nand: mxc_nand: support software ECC

From: Miquel Raynal
Date: Tue May 07 2024 - 03:45:55 EST


Hi Sascha,

s.hauer@xxxxxxxxxxxxxx wrote on Tue, 7 May 2024 09:12:30 +0200:

> On Mon, May 06, 2024 at 05:51:06PM +0200, Miquel Raynal wrote:
> > Hi Miquel,
> >
> > miquel.raynal@xxxxxxxxxxx wrote on Mon, 6 May 2024 16:05:08 +0200:
> >
> > > Hi Sascha,
> > >
> > > s.hauer@xxxxxxxxxxxxxx wrote on Wed, 17 Apr 2024 09:13:30 +0200:
> > >
> > > > To support software ECC we still need the driver provided read_oob,
> > > > read_page_raw and write_page_raw ops, so set them unconditionally
> > > > no matter which engine_type we use. The OOB layout on the other hand
> > > > represents the layout the i.MX ECC hardware uses, so set this only
> > > > when NAND_ECC_ENGINE_TYPE_ON_HOST is in use.
> > > >
> > > > With these changes the driver can be used with software BCH ECC which
> > > > is useful for NAND chips that require a stronger ECC than the i.MX
> > > > hardware supports.
> > > >
> > > > Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> > > > ---
> > > > drivers/mtd/nand/raw/mxc_nand.c | 9 +++++----
> > > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/mtd/nand/raw/mxc_nand.c b/drivers/mtd/nand/raw/mxc_nand.c
> > > > index fc70c65dea268..f44c130dca18d 100644
> > > > --- a/drivers/mtd/nand/raw/mxc_nand.c
> > > > +++ b/drivers/mtd/nand/raw/mxc_nand.c
> > > > @@ -1394,15 +1394,16 @@ static int mxcnd_attach_chip(struct nand_chip *chip)
> > > > chip->ecc.bytes = host->devtype_data->eccbytes;
> > > > host->eccsize = host->devtype_data->eccsize;
> > > > chip->ecc.size = 512;
> > > > - mtd_set_ooblayout(mtd, host->devtype_data->ooblayout);
> > > > +
> > > > + chip->ecc.read_oob = mxc_nand_read_oob;
> > > > + chip->ecc.read_page_raw = mxc_nand_read_page_raw;
> > > > + chip->ecc.write_page_raw = mxc_nand_write_page_raw;
> >
> > A second thought on this. Maybe you should consider keeping these for
> > on-host operations only.
> >
> > The read/write_page_raw operations are supposed to detangle the data
> > organization to show a proper [all data][all oob] organization to the
> > user.
>
> Let me take one step back. The organisation in the raw NAND is like this
> when using hardware ECC:
>
> [512b data0][16b oob0][512b data1][16b oob1][512b data2][16b oob2][512b data3][16b oob3]
>
> For a standard 2k+64b NAND. The read/write_page_raw operations detangle
> this and present the data to the user like this:
>
> [2048b data][64b OOB]
>
> Is this the correct behaviour or should that be changed?

I believe this is the correct behaviour, yes.
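
To make the expected conversion concrete, here is a minimal standalone
sketch (plain userspace C, not the driver's code) of the detangling for
the 2k+64b example above. The helper names and the fixed geometry
macros are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry from the example: 2k+64b page, four 512b ECC steps. */
#define STEP_DATA	512
#define STEP_OOB	16
#define NUM_STEPS	4
#define STEP_RAW	(STEP_DATA + STEP_OOB)		/* 528 */
#define PAGE_DATA	(STEP_DATA * NUM_STEPS)		/* 2048 */
#define PAGE_OOB	(STEP_OOB * NUM_STEPS)		/* 64 */

/*
 * Hypothetical helper: turn the on-flash stream
 * [data0][oob0]...[data3][oob3] into [2048b data][64b OOB].
 */
static void deinterleave_page(const uint8_t *raw, uint8_t *data, uint8_t *oob)
{
	for (size_t i = 0; i < NUM_STEPS; i++) {
		const uint8_t *chunk = raw + i * STEP_RAW;

		memcpy(data + i * STEP_DATA, chunk, STEP_DATA);
		memcpy(oob + i * STEP_OOB, chunk + STEP_DATA, STEP_OOB);
	}
}

/*
 * Hypothetical inverse: build the interleaved stream from
 * [2048b data][64b OOB], as a raw write path would need.
 */
static void interleave_page(const uint8_t *data, const uint8_t *oob, uint8_t *raw)
{
	for (size_t i = 0; i < NUM_STEPS; i++) {
		uint8_t *chunk = raw + i * STEP_RAW;

		memcpy(chunk, data + i * STEP_DATA, STEP_DATA);
		memcpy(chunk + STEP_DATA, oob + i * STEP_OOB, STEP_OOB);
	}
}
```

The caller is assumed to provide a raw buffer of PAGE_DATA + PAGE_OOB
bytes; running one helper after the other should give back the
original buffer.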

> (Side note: The GPMI NAND driver behaves differently here. It has the
> same interleaved organisation on the chip and also presents the same
> interleaved organisation to the user when using read_page_raw)

I'd say the GPMI driver is wrong?

> With my current approach for software ECC the same layout is used on the
> NAND chip. It would interleave the data with the OOB on the NAND chip
> and, since using the same read/write_page_raw operations, also presents
> [2048b data][64b OOB] to the user.

No need, I believe the only reason for interleaving is that your
hardware ECC engine works like that (writes the ECC bytes slightly
after each chunk of data). So if you don't use on-host hardware ECC,
you don't need to deal with this data layout.

> This works fine currently, but means that NAND_CMD_RNDOUT can't be used.
> Using NAND_CMD_RNDOUT to position the cursor at offset 512b for example
> doesn't give you the second subpage, but instead oob0. Positioning the
> cursor at offset 2048 doesn't give you the start of OOB, but some
> position in the middle of data3.
>
> Ok, NAND_CMD_RNDOUT can't be used for hardware ECC and there's no way
> around it. For software ECC we could change the organisation in the chip
> to be [2048b data][64b oob]. With that NAND_CMD_RNDOUT then could be
> used with software ECC.
>
> You say that NAND_CMD_RNDOUT is a basic command that is supported by all
> controllers, and yes, it is also supported with the mxc_nand controller.
> You just can't control how many bytes are transferred between the NAND
> chip and the controller. When using NAND_CMD_RNDOUT to read a few bytes
> at a certain page offset we'll end up reading 512 bytes discarding most
> of it. For the next ECC block we would move the cursor forward using
> another NAND_CMD_RNDOUT command, again read 512 bytes and discard most of
> it (although the desired data would have been in the first read already).

I'm not sure the controller limitations are so bad in this case. The
core helpers (using the same example) will ask for:
- 512b at offset 0
- 512b at offset 512...
- and finally 64b at offset 2048.
In practice it does not look like a huge drawback? I don't understand
in which case so much data would be read and then discarded?
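
For reference, a small standalone sketch (again plain userspace C with
hypothetical names, not kernel code) of where a given column address
lands in the interleaved layout. It only illustrates the offsets
discussed above, assuming the 512b + 16b step geometry of the example.

```c
#include <stdbool.h>

/* Assumed geometry from the example: 512b data + 16b OOB per ECC step. */
#define STEP_DATA	512
#define STEP_OOB	16
#define STEP_RAW	(STEP_DATA + STEP_OOB)	/* 528 */

/* Hypothetical description of a position in the interleaved page. */
struct col_pos {
	bool in_oob;		/* true if the column falls into an OOB chunk */
	unsigned int step;	/* ECC step index (0..3 for a 2k page) */
	unsigned int offset;	/* byte offset inside that data or OOB chunk */
};

/* Map a physical column of the interleaved layout to (step, region, offset). */
static struct col_pos locate_interleaved(unsigned int column)
{
	struct col_pos pos;
	unsigned int within = column % STEP_RAW;

	pos.step = column / STEP_RAW;
	pos.in_oob = within >= STEP_DATA;
	pos.offset = pos.in_oob ? within - STEP_DATA : within;

	return pos;
}
```

With this mapping, column 512 is the first byte of oob0 and column 2048
is byte 464 of data3, whereas with a linear [2048b data][64b OOB]
layout the same columns are the start of the second subpage and the
start of the OOB area. The caller is assumed to keep the column below
2112.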

> So I think NAND_CMD_RNDOUT should really be avoided for this controller,
> even though we might be able to support it.

I also mentioned the monolithic accessors which try to avoid these
random column changes. You probably want to check them out, they might
just avoid the need for NAND_CMD_RNDOUT by forcing full page accesses
directly. The reason why they were introduced is not exactly our
current use case, but it feels like they might be handy.

658beb663960 ("mtd: rawnand: Expose monolithic read/write_page_raw() helpers")
0e7f4b64ea46 ("mtd: rawnand: Allow controllers to overload soft ECC hooks")

Thanks,
Miquèl