Re: [GIT PULL] parisc huge page support for v4.4

From: Mikulas Patocka
Date: Sat Dec 26 2015 - 07:09:32 EST




On Tue, 24 Nov 2015, Mikulas Patocka wrote:

>
>
> On Tue, 24 Nov 2015, Helge Deller wrote:
>
> > * Mikulas Patocka <mpatocka@xxxxxxxxxx>:
> > >
> > >
> > > On Tue, 24 Nov 2015, Helge Deller wrote:
> > >
> > > > > Hi
> > > > >
> > > > > Since the kernel 4.4-rc2 I'm getting frequent boot failures on PA-RISC.
> > > > > When I revert this patchset, the crashes are gone.
> > > >
> > > > > [ 3.296666] CPU(s): 4 out of 4 PA8900 (Shortfin) at 1000.000000 MHz online
> > > >
> > > > Hi Mikulas,
> > > >
> > > > Yes, I've seen this as well.
> > > > It affects only the PA8900 CPUs, while all PA8500-PA8700 machines seem to work fine.
> > > > I do have a temporary 3-line patch to avoid the crashes which I'll push to my tree shortly.
> > > > I'm still investigating why it only affects the PA8900 CPUs, but I assume
> > > > it's related to the cache aliasing of those CPUs.
> > > > I'll keep you updated.
> > > >
> > > > Helge
> > >
> > > The PA-RISC specification doesn't allow aliasing on non-equivalent
> > > addresses. Can the kernel map a piece of kernel data to another virtual
> > > address? If yes, we can't use big pages to map kernel data.
> >
> > Can you please try the two patches below?
> > The first one disables mapping kernel text/data on huge pages on
> > PA8800/PA8900 CPUs. Patch works for me on my Mako PA8800.
> >
> > Independent of my huge page patch, the second patch disables the TLB
> > flush optimization we added earlier. It seems calling flush_tlb_all()
> > doesn't reliably flush TLBs on all CPUs, so it's better to fall back to
> > the loop implementation.
> >
> > Helge
>
> The kernel with these patches works fine so far.
>
> Mikulas

BTW, I looked at the "*ptep = entry;" assignment in set_huge_pte_at() in
arch/parisc/mm/hugetlbpage.c, and it looks like a serious bug. PA-RISC has
no atomic instructions for modifying page table entries, so the TLB handler
takes a spinlock and modifies the page table entry non-atomically. If you
modify the page table entry without holding that spinlock, you may race
with the TLB handler on another CPU and your modification may be lost.
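
To make the interleaving concrete, here is a small userspace sketch that
replays it deterministically. It is only a model, not kernel code: pte_t as
a plain integer, PTE_ACCESSED, model_lock_held and replay() are all
hypothetical names made up for illustration, and the flag merely stands in
for pa_tlb_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pte_t;           /* hypothetical: a PTE modelled as an integer */

#define PTE_ACCESSED ((pte_t)1)   /* hypothetical bit, not the real parisc layout */

/* Stand-in for pa_tlb_lock. A flag is enough because the model is
 * single-threaded and only replays one fixed interleaving. */
static bool model_lock_held;

/* Replay one interleaving of a TLB handler (CPU 0) and a PTE writer
 * (CPU 1) and return the final PTE value. */
static pte_t replay(pte_t old_entry, pte_t new_entry, bool writer_takes_lock)
{
    pte_t pte = old_entry;
    pte_t handler_copy;
    bool writer_done = false;

    /* CPU 0: the TLB handler takes the lock and loads the PTE. */
    assert(!model_lock_held);
    model_lock_held = true;
    handler_copy = pte;

    /* CPU 1: an unlocked "*ptep = entry" lands in the middle of the
     * handler's read-modify-write. A locking writer would spin here. */
    if (!writer_takes_lock) {
        pte = new_entry;
        writer_done = true;
    }

    /* CPU 0: the handler stores its stale copy back and drops the lock. */
    pte = handler_copy | PTE_ACCESSED;
    model_lock_held = false;

    /* CPU 1: a locking writer gets the lock only now, so its store
     * comes after the handler's and survives. */
    if (!writer_done) {
        assert(!model_lock_held);
        model_lock_held = true;
        pte = new_entry;
        model_lock_held = false;
    }

    return pte;
}
```

In this model, replay(0x1000, 0x2000, false) returns the old entry with
the accessed bit set, so the writer's update is lost, while
replay(0x1000, 0x2000, true) returns the new entry.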

The comment says something about double locking on pa_tlb_lock, but
pa_tlb_lock isn't held when that function is called.

Mikulas