Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

From: Peter Hurley
Date: Thu Sep 26 2013 - 18:21:46 EST


On 09/26/2013 05:58 PM, Andrew Morton wrote:
On Thu, 26 Sep 2013 17:42:52 -0400 Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:

On 09/26/2013 02:05 PM, Andrew Morton wrote:
On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:

The issue with a single large kmalloc is that it may fail where
3 separate, page-or-less kmallocs would not have.

Or vmalloc fails first, because of internal fragmentation of the vmap
arena. This problem plus vmalloc's slowness are the reasons why
vmalloc should be avoided.

Ok, no vmalloc.

A tremendous number of places in the kernel perform higher-order
allocations nowadays. The page allocator works damn hard to service
them and I expect that switching to kmalloc here will be OK.

I've had order-4 allocation failures before on 10Gb.

Yep. But this allocation will be order=2, yes? And
PAGE_ALLOC_COSTLY_ORDER=3. So if that thing is working correctly,
order=2 will do a lot better than order=4.

PAGE_ALLOC_COSTLY_ORDER was a subtlety I wasn't aware of; thanks
for the info.
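
For readers following along, the order arithmetic above can be sketched
in plain userspace C. This is only an illustration, not kernel code:
alloc_order() and is_costly() are hypothetical helper names, the 4096-byte
page size is an assumption, and COSTLY_ORDER just mirrors the
PAGE_ALLOC_COSTLY_ORDER=3 value quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors PAGE_ALLOC_COSTLY_ORDER=3 as quoted in the thread. */
#define COSTLY_ORDER 3

/*
 * Hypothetical helper: smallest order such that
 * (page_size << order) bytes can hold `size` bytes.
 */
unsigned int alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;

	assert(page_size > 0);
	while ((page_size << order) < size)
		order++;
	return order;
}

/* Hypothetical helper: an order above the threshold counts as "costly". */
int is_costly(unsigned int order)
{
	return order > COSTLY_ORDER;
}
```

With 4096-byte pages, a 16 KiB request comes out as order 2 (below the
threshold) and a 64 KiB request as order 4 (above it), which is the
distinction being drawn between the two cases.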

In fact, the
nouveau driver switched to vmalloc for that very reason (commit
d005f51eb93d71cd40ebd11dd377453fa8c8a42a, drm/nouveau: use vmalloc
for pgt allocation).

Sigh. I'm not aware of any reports of anyone hitting arena
fragmentation problems yet, so it remains a theoretical thing. But the
more we use vmalloc, the more likely it becomes. And because the usage
sites are so disparate, fixing it will be pretty horrid.

For this reason (plus vmalloc is slow), I do think it's better to do
the old

	foo = kmalloc(__GFP_NOWARN);
	if (!foo)
		foo = vmalloc();

thing. It's ugly, but will greatly reduce the amount of vmallocing
which happens.

Someone had a patch a while back which wraps this operation (and the
corresponding free) into library functions. I said yuk and it wasn't
merged. Perhaps that was a mistake.
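
A wrapper of the kind described might look roughly like the userspace
sketch below. It is not the patch mentioned above and not kernel code:
contig_alloc() and mapped_alloc() are hypothetical stand-ins for kmalloc
and vmalloc, CONTIG_LIMIT is an arbitrary made-up failure threshold, and
struct buf with buf_alloc()/buf_free() are invented for illustration. The
point is only the shape: try the cheap allocator first, fall back on
failure, and have the free side know which path was taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: pretend contiguous allocations above 4 pages always fail. */
#define CONTIG_LIMIT ((size_t)4 * 4096)

/* Hypothetical stand-in for kmalloc(size, ... | __GFP_NOWARN). */
void *contig_alloc(size_t size)
{
	return size <= CONTIG_LIMIT ? malloc(size) : NULL;
}

/* Hypothetical stand-in for vmalloc(size). */
void *mapped_alloc(size_t size)
{
	return malloc(size);
}

/* Records which allocator produced the pointer so the free can match. */
struct buf {
	void *ptr;
	bool from_mapped;
};

/* Try the contiguous allocator first; fall back only if it fails. */
struct buf buf_alloc(size_t size)
{
	struct buf b = { contig_alloc(size), false };

	if (!b.ptr) {
		b.ptr = mapped_alloc(size);
		b.from_mapped = (b.ptr != NULL);
	}
	return b;
}

/*
 * Corresponding free. A kernel version would pick between the two
 * free routines here; in this sketch both stand-ins use free().
 */
void buf_free(struct buf *b)
{
	free(b->ptr);
	b->ptr = NULL;
	b->from_mapped = false;
}
```

In this sketch an 8 KiB request is served by the first allocator and a
1 MiB request falls through to the second, so the slow path is only hit
when the fast one cannot deliver.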

I would suggest either
1. documenting the bulk of our conversation in either/both
mm/vmalloc.c:vmalloc() and include/linux/slab.h
or
2. requiring that new vmalloc() users get your ack.

Regards,
Peter Hurley