RE: [PATCH net-next 3/3] net: stmmac: Introducing support for Page Pool

From: Jose Abreu
Date: Mon Jul 22 2019 - 10:06:20 EST


From: Jon Hunter <jonathanh@xxxxxxxxxx>
Date: Jul/22/2019, 13:05:38 (UTC+00:00)

>
> On 22/07/2019 12:39, Jose Abreu wrote:
> > From: Lars Persson <lists@xxxxxxx>
> > Date: Jul/22/2019, 12:11:50 (UTC+00:00)
> >
> >> On Mon, Jul 22, 2019 at 12:18 PM Ilias Apalodimas
> >> <ilias.apalodimas@xxxxxxxxxx> wrote:
> >>>
> >>> On Thu, Jul 18, 2019 at 07:48:04AM +0000, Jose Abreu wrote:
> >>>> From: Jon Hunter <jonathanh@xxxxxxxxxx>
> >>>> Date: Jul/17/2019, 19:58:53 (UTC+00:00)
> >>>>
> >>>>> Let me know if you have any thoughts.
> >>>>
> >>>> Can you try the attached patch?
> >>>>
> >>>
> >>> The log says someone calls panic() right?
> >>> Can we try to figure out where that happens during the stmmac init phase?
> >>>
> >>
> >> The reason for the panic is hidden in this one line of the kernel logs:
> >> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> >>
> >> The init process is killed by SIGSEGV (signal 11 = 0xb).
> >>
> >> I would suggest you look for data corruption bugs in the RX path. If
> >> the code is fetched from the NFS mount then a corrupt RX buffer can
> >> trigger a crash in userspace.
> >>
> >> /Lars
> >
> >
> > Jon, I'm not familiar with ARM. Are the buffer addresses being allocated
> > in a coherent region? Can you try the attached patch, which adds a full
> > memory barrier before the sync?
>
> TBH I am not sure about the buffer addresses either. The attached patch
> did not help. Same problem persists.

OK. I'm just guessing at this stage, but can you disable SMP?

We have to narrow down whether this is a coherency issue. You said that
booting without NFS and then mounting the share manually works, so can
you share logs with the same debug prints in that condition, so that we
can compare them?
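
For reference when comparing the logs: the exitcode=0x0000000b value in
the panic line quoted above decodes to SIGSEGV, as Lars noted. A minimal
sketch in Python, assuming the value follows the usual Linux wait-status
layout (low 7 bits = terminating signal, bit 7 = core-dump flag). The
helper name decode_exitcode is made up for illustration and is not part
of any kernel tooling:

```python
import signal


def decode_exitcode(code: int) -> str:
    """Describe a Linux wait-status style exit code (assumed layout)."""
    sig = code & 0x7F          # low 7 bits: terminating signal, 0 = normal exit
    core = bool(code & 0x80)   # bit 7: core-dump flag
    if sig == 0:
        # Normal exit: the exit status sits in bits 8..15.
        return f"exited normally, status {(code >> 8) & 0xFF}"
    try:
        name = signal.Signals(sig).name
    except ValueError:
        # Signal number not known to this platform's signal table.
        name = f"signal {sig}"
    return f"killed by {name} ({sig})" + (", core dumped" if core else "")


if __name__ == "__main__":
    # Value taken from the panic line in the quoted log.
    print(decode_exitcode(0x0000000B))
```

This only interprets the number; it says nothing about why init
faulted, which is what the log comparison is meant to show.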

---
Thanks,
Jose Miguel Abreu