Re: frequent lockups in 3.18rc4

From: Frederic Weisbecker
Date: Wed Nov 19 2014 - 18:08:03 EST

Next message: Tobias Klausmann: "Re: 3.18-rc regression: drm/nouveau: use shared fences for readable objects"
Previous message: Andy Lutomirski: "Re: frequent lockups in 3.18rc4"
In reply to: Andy Lutomirski: "Re: frequent lockups in 3.18rc4"
Next in thread: Thomas Gleixner: "Re: frequent lockups in 3.18rc4"

On Wed, Nov 19, 2014 at 02:59:01PM -0800, Andy Lutomirski wrote:
> On Wed, Nov 19, 2014 at 2:56 PM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> > On Wed, Nov 19, 2014 at 10:56:26PM +0100, Thomas Gleixner wrote:
> >> On Wed, 19 Nov 2014, Frederic Weisbecker wrote:
> >> > I got a report lately involving context tracking. Not sure if it's
> >> > the same here but the issue was that context tracking uses per cpu data
> >> > and per cpu allocations use vmalloc, and a vmalloc'ed area can fault due to
> >> > lazy paging.
> >>
> >> This is complete nonsense. pcpu allocations are populated right
> >> away. Otherwise no single line of kernel code which uses dynamically
> >> allocated per cpu storage would be safe.
> >
> > Note this isn't faulting because part of the allocation is swapped out. No,
> > it's all reserved in physical memory, but it's a lazy allocation.
> > Part of it isn't yet addressed in the P[UGM?]D. That's what vmalloc_fault() is for.
> >
> > So it's a non-blocking, non-sleeping fault, which is why it's probably fine
> > most of the time except on code that isn't fault-safe. And I suspect that
> > most people assume that kernel data won't fault so probably some other
> > places have similar issues.
> >
> > That's a long standing issue. We even had to convert the perf callchain
> > allocation to ad-hoc kmalloc() based per cpu allocation to get over vmalloc
> > faults. At that time, NMIs couldn't handle faults and many callchains were
> > populated in NMIs. We had serious crashes because of per cpu memory faults.
>
> Is there seriously more than 512GB of per-cpu virtual space or
> whatever's needed to exceed a single pgd on x86_64?

No idea, I'm clueless about -mm details.
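
For reference, the arithmetic behind the 512GB figure can be sketched as below. This is a minimal user-space sketch, assuming x86_64 4-level paging with 4 KiB pages; the helper names (pgd_entry_span, pgd_index_of, pgd_entries_spanned) are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 4-level paging constants: 9 index bits per level. */
#define PGDIR_SHIFT  39
#define PTRS_PER_PGD 512

/* Bytes of virtual address space covered by one top-level (PGD) entry. */
static inline uint64_t pgd_entry_span(void)
{
    return UINT64_C(1) << PGDIR_SHIFT;   /* 2^39 bytes = 512 GiB */
}

/* Index of the PGD entry that translates a given virtual address. */
static inline unsigned int pgd_index_of(uint64_t vaddr)
{
    return (unsigned int)((vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Number of distinct PGD entries touched by [start, start + len).
 * Assumes len > 0 and that start + len does not wrap around.
 */
static inline unsigned int pgd_entries_spanned(uint64_t start, uint64_t len)
{
    assert(len > 0);
    uint64_t first = start >> PGDIR_SHIFT;
    uint64_t last = (start + len - 1) >> PGDIR_SHIFT;
    return (unsigned int)(last - first + 1);
}
```

Note that a range does not have to be larger than 512GB to involve a second PGD entry; it only has to land in, or straddle into, a different 512GB-aligned slot. For example, pgd_entries_spanned() returns 2 for an 8 KiB range that starts 4 KiB below a 512GB boundary.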

>
> And there are definitely places that access per-cpu data in contexts
> in which a non-IST fault is not allowed. Maybe not dynamic per-cpu
> data, though.

It probably happens to be fine because the code that first accesses the
related data is fault-safe. Or maybe not, and some state is silently messed
up somewhere.

This doesn't leave a comfortable feeling.
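
To make the "first access faults, later ones don't" behaviour concrete, here is a toy user-space model of lazy top-level page table population, in the spirit of what vmalloc_fault() is described as doing above. It is only a sketch under stated assumptions: struct toy_pgd, toy_map_in_reference() and toy_access() are hypothetical names, a nonzero entry stands in for "next level present", and none of this is the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 4-level paging constants. */
#define PGDIR_SHIFT  39
#define PTRS_PER_PGD 512

/* Toy top-level table: a nonzero entry means "next level present". */
struct toy_pgd {
    uint64_t entry[PTRS_PER_PGD];
};

enum toy_result {
    TOY_HIT,            /* entry already present in the task's table */
    TOY_FAULT_FIXED_UP, /* entry was missing, copied from the reference */
    TOY_BAD_ADDRESS     /* entry missing in the reference table as well */
};

static unsigned int toy_pgd_index(uint64_t vaddr)
{
    return (unsigned int)((vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/* Populate only the reference table, as a fresh mapping would. */
static void toy_map_in_reference(struct toy_pgd *ref, uint64_t vaddr,
                                 uint64_t next_level)
{
    assert(next_level != 0);
    ref->entry[toy_pgd_index(vaddr)] = next_level;
}

/*
 * Model one access through a task's own table. A missing entry is the
 * "fault"; the fixup copies the entry from the reference table without
 * blocking or sleeping.
 */
static enum toy_result toy_access(struct toy_pgd *task,
                                  const struct toy_pgd *ref,
                                  uint64_t vaddr)
{
    unsigned int i = toy_pgd_index(vaddr);

    if (task->entry[i] != 0)
        return TOY_HIT;
    if (ref->entry[i] == 0)
        return TOY_BAD_ADDRESS;
    task->entry[i] = ref->entry[i];
    return TOY_FAULT_FIXED_UP;
}
```

In this model, the first toy_access() to a freshly mapped address through a given task table returns TOY_FAULT_FIXED_UP and every later one returns TOY_HIT, so whether anything goes wrong depends entirely on whether that first accessor can tolerate the fault.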