RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND Flash Controller

From: Naga Sureshkumar Relli
Date: Mon Dec 17 2018 - 08:21:16 EST


Hi Miquel,

> -----Original Message-----
> From: Miquel Raynal [mailto:miquel.raynal@xxxxxxxxxxx]
> Sent: Wednesday, December 12, 2018 6:48 PM
> To: Naga Sureshkumar Relli <nagasure@xxxxxxxxxx>
> Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxx>; robh@xxxxxxxxxx; richard@xxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; marek.vasut@xxxxxxxxx; linux-mtd@xxxxxxxxxxxxxxxxxxx;
> nagasuresh12@xxxxxxxxx; Michal Simek <michals@xxxxxxxxxx>;
> computersforpeace@xxxxxxxxx; dwmw2@xxxxxxxxxxxxx
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> Naga Sureshkumar Relli <nagasure@xxxxxxxxxx> wrote on Wed, 12 Dec 2018
> 13:07:42 +0000:
>
> > Hi Miquel,
> >
> > > -----Original Message-----
> > > From: Miquel Raynal [mailto:miquel.raynal@xxxxxxxxxxx]
> > > Sent: Wednesday, December 12, 2018 2:40 PM
> > > To: Naga Sureshkumar Relli <nagasure@xxxxxxxxxx>
> > > Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxx>; robh@xxxxxxxxxx;
> > > richard@xxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > marek.vasut@xxxxxxxxx; linux-mtd@xxxxxxxxxxxxxxxxxxx;
> > > nagasuresh12@xxxxxxxxx; Michal Simek <michals@xxxxxxxxxx>;
> > > computersforpeace@xxxxxxxxx; dwmw2@xxxxxxxxxxxxx
> > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > > for Arasan NAND Flash Controller
> > >
> > > Hi Naga,
> > >
> > > Naga Sureshkumar Relli <nagasure@xxxxxxxxxx> wrote on Wed, 12 Dec 2018
> > > 09:04:16 +0000:
> > >
> > > > Hi Miquel,
> > > >
> > > > > -----Original Message-----
> > > > > From: Miquel Raynal [mailto:miquel.raynal@xxxxxxxxxxx]
> > > > > Sent: Wednesday, December 12, 2018 1:42 PM
> > > > > To: Naga Sureshkumar Relli <nagasure@xxxxxxxxxx>
> > > > > Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxx>;
> > > > > robh@xxxxxxxxxx; richard@xxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > > marek.vasut@xxxxxxxxx; linux-mtd@xxxxxxxxxxxxxxxxxxx;
> > > > > nagasuresh12@xxxxxxxxx; Michal Simek <michals@xxxxxxxxxx>;
> > > > > computersforpeace@xxxxxxxxx; dwmw2@xxxxxxxxxxxxx
> > > > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add
> > > > > support for Arasan NAND Flash Controller
> > > > >
> > > > > Hi Naga,
> > > > >
> > > > > Naga Sureshkumar Relli <nagasure@xxxxxxxxxx> wrote on Wed, 12 Dec 2018
> > > > > 05:27:03 +0000:
> > > > >
> > > > > > Hi Boris & Miquel,
> > > > > >
> > > > > > An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
> > > > > > > There I said I would take a default error count value of 16
> > > > > > > and, during page read, compare the error count register
> > > > > > > value with it; if the value is equal to or greater than the
> > > > > > > default count (16), I check for erased pages.
> > > > > > > But bit[7:0] in ECC_Error_Count_Register (0x38) is updated for each error that occurs.
> > > > > > > Link:
> > > > > > > https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html
> > > > > > > (check the NAND module, ECC_Error_Count_Register).
> > > > > > >
> > > > > > > I mean that previously I depended on the total error count value
> > > > > > > (bit[16:8]), but we can simply check bit[7:0] to see whether an error occurred.
> > > > > > > I tried this approach and I don't see any issues with it.
> > > > > > > I ran ubifs with this and I can see that the bit[7:0] count
> > > > > > > is updated for an erased page read; I will then use
> > > > > > > nand_check_erased_ecc_chunk() to see the bitflips.
> > > > > > >
> > > > > > > If it is OK, I will update the driver and send a new patch,
> > > > > > > but do you have any other comments on v12?
> > > > >
> > > > > Is 'nandbiterrs -i' running correctly now?
> > > > Yes, but with some changes in driver.
> > > > I have added the log and changes done in https://lkml.org/lkml/2018/11/23/705.
> > >
> > > No, I don't see a working nandbiterrs there, sorry.
> > The log that I have attached is from the mtd_nandbiterrs test. As per the
> > ARASAN controller ECC mechanism, it will correct up to 24 bits. After that the test will fail.
>
> There is a distinction between:
> 1/ The driver fails to correct more than 24 bits and notifies the
> caller that the page read is somehow corrupted.
> 2/ The driver fails to correct more than 24 bits but does not complain
> about it. In our case, the caller (the test tool) will compare the
> page written and read: if they do not match, it means the driver is
> broken because the driver reported a successful operation despite
> the fact that it returned a corrupted page.
>
> You are in the second case, we expect the driver to behave like in 1/.
I tried with mtd-utils (the nandbiterrs test) and here is the output:
root@xilinx-zc1751-dc2-2018_1:~# nandbiterrs -i /dev/mtd0
incremental biterrors test
Successfully corrected 0 bit errors per subpage
Inserted biterror @ 1/7
Successfully corrected 1 bit errors per subpage
Inserted biterror @ 3/7
Successfully corrected 2 bit errors per subpage
Inserted biterror @ 5/7
Successfully corrected 3 bit errors per subpage
Inserted biterror @ 7/7
Successfully corrected 4 bit errors per subpage
Inserted biterror @ 8/7
Successfully corrected 5 bit errors per subpage
Inserted biterror @ 10/7
Successfully corrected 6 bit errors per subpage
Inserted biterror @ 12/7
Successfully corrected 7 bit errors per subpage
Inserted biterror @ 14/7
Successfully corrected 8 bit errors per subpage
Inserted biterror @ 17/7
Successfully corrected 9 bit errors per subpage
Inserted biterror @ 19/7
Successfully corrected 10 bit errors per subpage
Inserted biterror @ 21/7
Successfully corrected 11 bit errors per subpage
Inserted biterror @ 23/7
Successfully corrected 12 bit errors per subpage
Inserted biterror @ 24/7
Successfully corrected 13 bit errors per subpage
Inserted biterror @ 26/7
Successfully corrected 14 bit errors per subpage
Inserted biterror @ 28/7
Successfully corrected 15 bit errors per subpage
Inserted biterror @ 30/7
Successfully corrected 16 bit errors per subpage
Inserted biterror @ 32/7
Successfully corrected 17 bit errors per subpage
Inserted biterror @ 34/7
Successfully corrected 18 bit errors per subpage
Inserted biterror @ 36/7
Successfully corrected 19 bit errors per subpage
Inserted biterror @ 38/7
Successfully corrected 20 bit errors per subpage
Inserted biterror @ 41/7
Successfully corrected 21 bit errors per subpage
Inserted biterror @ 43/7
Successfully corrected 22 bit errors per subpage
Inserted biterror @ 45/7
Successfully corrected 23 bit errors per subpage
Inserted biterror @ 47/7
Successfully corrected 24 bit errors per subpage
Inserted biterror @ 48/7
Successfully corrected 25 bit errors per subpage
Inserted biterror @ 50/7
ECC failure, invalid data despite read success
root@xilinx-zc1751-dc2-2018_1:~#
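
To spell out what that last line means in terms of the two cases described above, here is a minimal sketch. It is not code from mtd-utils or from the driver: classify_read() and the read_verdict names are made up for illustration, and I am assuming that a failed read shows up as a negative return value such as -EBADMSG.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome labels, not taken from mtd-utils or the kernel. */
enum read_verdict {
	READ_OK,             /* read succeeded and the data matches */
	READ_ECC_REPORTED,   /* case 1/: the failure was reported to the caller */
	READ_SILENT_CORRUPT  /* case 2/: success reported, but the data is wrong */
};

/*
 * read_ret: return value of the page read (assumed negative on failure,
 *           e.g. -EBADMSG for an uncorrectable ECC error).
 * written:  the data that was programmed into the page.
 * readback: the data returned by the read.
 */
static enum read_verdict classify_read(int read_ret,
				       const unsigned char *written,
				       const unsigned char *readback,
				       size_t len)
{
	/* The read path complained, so the caller knows the page is bad. */
	if (read_ret < 0)
		return READ_ECC_REPORTED;

	/* Read claimed success: the data must now match what was written. */
	if (memcmp(written, readback, len) != 0)
		return READ_SILENT_CORRUPT;

	return READ_OK;
}
```

With this model, the run above ends in READ_SILENT_CORRUPT, while the expected behaviour once the ECC strength is exceeded is READ_ECC_REPORTED.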

But even in this case, the test reports an ECC failure while the read itself is reported as successful.
That means the controller can detect errors in a read page only up to 24 bits.
Beyond that, there is no way to tell the upper layers that the page is bad, because of this limitation in the controller.
Could you please suggest an alternative way to report the errors in that case?
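
For reference, the erased-page check I described at the start of this thread can be sketched roughly as below. This is a user-space model only, not the driver code: ecc_page_err_count() and erased_chunk_bitflips() are hypothetical names, the bit[7:0] field is the one mentioned above for ECC_Error_Count_Register, and the second helper is a simplified stand-in for the idea behind nand_check_erased_ecc_chunk() (the real helper also takes the ECC and extra OOB buffers).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption from this thread: bit[7:0] of the error count register. */
#define ECC_ERR_CNT_MASK 0xffu

/* Extract the bit[7:0] error count from a raw register value. */
static unsigned int ecc_page_err_count(uint32_t reg)
{
	return reg & ECC_ERR_CNT_MASK;
}

/*
 * Count the bits that are 0 in a chunk expected to be erased (all 0xff).
 * Returns the number of bitflips if it does not exceed the threshold,
 * or -EBADMSG as soon as the threshold is exceeded.
 */
static int erased_chunk_bitflips(const uint8_t *buf, size_t len, int threshold)
{
	int flips = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t zeros = (uint8_t)~buf[i];

		/* Clear the lowest set bit once per zero bit found in buf[i]. */
		while (zeros) {
			zeros &= (uint8_t)(zeros - 1);
			if (++flips > threshold)
				return -EBADMSG;
		}
	}

	return flips;
}
```

The idea is that a non-zero bit[7:0] count triggers the erased-chunk check, and a negative result from that check is what would have to be reported to the upper layers as a failure.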

Thanks,
Naga Sureshkumar Relli

>
> >
> > I am running the mtd-utils nandbiterrs test now. I will let you know once I have completed it.
>
> Yes please, prefer using the userspace tools.
>
>
> Thanks,
> Miquèl