Re: [PATCH 4/4] mtd: nand: mxc_nand: disable subpage reads

From: Miquel Raynal
Date: Mon May 06 2024 - 12:41:49 EST


Hi Sascha,

s.hauer@xxxxxxxxxxxxxx wrote on Mon, 22 Apr 2024 12:53:38 +0200:

> On Fri, Apr 19, 2024 at 11:46:57AM +0200, Miquel Raynal wrote:
> > Hi Sascha,
> >
> > s.hauer@xxxxxxxxxxxxxx wrote on Thu, 18 Apr 2024 13:43:15 +0200:
> >
> > > On Thu, Apr 18, 2024 at 11:32:44AM +0200, Miquel Raynal wrote:
> > > > Hi Sascha,
> > > >
> > > > s.hauer@xxxxxxxxxxxxxx wrote on Thu, 18 Apr 2024 08:48:08 +0200:
> > > >
> > > > > On Wed, Apr 17, 2024 at 09:13:31AM +0200, Sascha Hauer wrote:
> > > > > > The NAND core enabled subpage reads when a largepage NAND is used with
> > > > > > SOFT_ECC. The i.MX NAND controller doesn't support subpage reads, so
> > > > > > clear the flag again.
> > > > > >
> > > > > > Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> > > > > > ---
> > > > > > drivers/mtd/nand/raw/mxc_nand.c | 2 ++
> > > > > > 1 file changed, 2 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/mtd/nand/raw/mxc_nand.c b/drivers/mtd/nand/raw/mxc_nand.c
> > > > > > index f44c130dca18d..19b46210bd194 100644
> > > > > > --- a/drivers/mtd/nand/raw/mxc_nand.c
> > > > > > +++ b/drivers/mtd/nand/raw/mxc_nand.c
> > > > > > @@ -1667,6 +1667,8 @@ static int mxcnd_probe(struct platform_device *pdev)
> > > > > > if (err)
> > > > > > goto escan;
> > > > > >
> > > > > > + this->options &= ~NAND_SUBPAGE_READ;
> > > > > > +
> > > > >
> > > > > Nah, it doesn't work like this. It turns out the BBT is read using
> > > > > subpage reads before we can disable them here.
> > > > >
> > > > > This is the code in nand_scan_tail() we stumble upon:
> > > > >
> > > > > /* Large page NAND with SOFT_ECC should support subpage reads */
> > > > > switch (ecc->engine_type) {
> > > > > case NAND_ECC_ENGINE_TYPE_SOFT:
> > > > > if (chip->page_shift > 9)
> > > > > chip->options |= NAND_SUBPAGE_READ;
> > > > > break;
> > > > >
> > > > > default:
> > > > > break;
> > > > > }
> > > > >
> > > > > So the code assumes subpage reads are ok when SOFT_ECC is in use, which
> > > > > in my case is not true. I guess some drivers depend on the
> > > > > NAND_SUBPAGE_READ bit magically be set, so simply removing this code is
> > > > > likely not an option. Any ideas what to do?
> > > >
> > > > Can you elaborate why subpage reads are not an option in your
> > > > situation? While subpage writes depend on chip capabilities, reads
> > > > however should always work: it's just the controller selecting the
> > > > column where to start and then reading less data than it could from the
> > > > NAND cache. It's a very basic NAND controller feature, and I remember
> > > > this was working on eg. an i.MX27.
> > >
> > > On the i.MX27 reading a full 2k page means triggering one read operation
> > > per 512 bytes in the NAND controller, so it would be possible to read
> > > subpages by triggering only one read operation instead of four in a row.
> > >
> > > The newer SoCs like i.MX25 always read a full page with a single read
> > > operation. We could likely read subpages by temporarily configuring the
> > > controller for a 512b page size NAND.
> > >
> > > I just realized the real problem comes with reading the OOB data. With
> > > software BCH the NAND layer hardcodes the read_subpage hook to
> > > nand_read_subpage() which uses nand_change_read_column_op() to read the
> > > OOB data. This uses NAND_CMD_RNDOUT and I have now idea if/how this can
> > > be implemented in the i.MX NAND driver. Right now the controller indeed
> > > reads some data and then the SRAM buffer really contains part of the
> > > desired OOB data, but also part of the user data.
> >
> > NAND_CMD_RNDOUT is impossible to avoid,
>
> Apparently it has been possible until now. NAND_CMD_RNDOUT has never
> been used with this driver and it also doesn't work like expected.
>
> One problem is that the read_page_raw() and write_page_raw() are not
> implemented like supposed by the NAND layer. The i.MX NAND controller
> uses a syndrome type ECC layout, meaning that the user data and OOB data
> is interleaved, so the raw r/w functions should normally pass/expect the
> page data in interleaved format. Unfortunately the raw functions are not
> implemented like that, instead they detangle the data themselves. This
> also means that setting the cursor using NAND_CMD_RNDOUT will not put
> the cursor at a meaningful place, as the raw functions are not really
> exect/return the raw page data.
>
> This could be fixed, but the raw operations are also exposed to
> userspace, so fixing these would mean that we might break some userspace
> applications.

As answered to patch 3/4 I believe you need other raw page helpers for
the software ECC path, just because the existing functions are tight to
the on-host ECC logic and do what they are expected to do (I believe).

Creating another set of raw page helpers should be straightforward to
do if that's really needed (mainly for performance purposes, but we're
not yet there). Using the core helpers should work, the only thing is
supporting properly the NAND_CMD_RNDOUT path, which should be possible
at a rather low cost, it really is a very very basic command, I know no
controller without this feature, even old ones.

> The other point is that with using software BCH ecc the NAND layer
> requests me to read 7 bytes at offset 0x824. This can't be really
> implemented in the i.MX NAND driver. It only allows us to read a full
> 512 byte subpage, so whenever the NAND layer requests me to read a few
> bytes the controller will always transfer 512 bytes from which I then
> ignore most of it (and possibly trigger another 512 bytes transfer when
> reading the ECC for the next subpage).

If you manage to get the NAND_CMD_RNDOUT op working I believe you'll be
tempted to use memcpy32_fromio() with just a slightly rounded up length.

> I think these issues can all be handled somehow, but this comes at a
> rather high price, both in effort and the risk of breaking userspace.
> It would be far easier to tell the NAND layer not to do subpage reads.

I understand it may feel like that but that is likely not the right
approach. I just think about another possibility: using monolithic
reads if the controller is too constrained this way you might end up
avoiding the RNDOUT command (might require a bit of tweaking in the
core, I no longer remember exactly).

Good luck, I really appreciate this effort.
Miquèl