Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

From: Ingo Molnar
Date: Wed Aug 14 2013 - 07:27:50 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, Aug 13, 2013 at 4:10 PM, Nathan Zimmer <nzimmer@xxxxxxx> wrote:
> >
> > The only mm structure we are adding to is a new flag in page->flags.
> > That didn't seem too much.
>
> I don't agree.
>
> I see only downsides, and no upsides. Doing the same thing *without* the
> downsides seems straightforward, so I simply see no reason for any extra
> flags or tests at runtime.

The code as presented clearly looks more involved, and is neither simple
nor zero-cost - I was hoping for a much simpler approach.

I see three solutions:

- Speed up the synchronous memory init code: live migrate to the node
being set up via set_cpus_allowed(), to make sure the init is always
fast and local.

Pros: if it solves the problem then mem init is still synchronous,
deterministic and essentially equivalent to what we do today - so
relatively simple and well-tested, with no 'large machine' special
path.

Cons: it might not be enough and we might not have scheduling
enabled on the affected nodes yet.

- Speed up the synchronous memory init code by parallelizing the key,
most expensive initialization portion, the setup of the page head
arrays, on a per node basis, via SMP function-calls.

Pros: by far the fastest synchronous option. (It will also test the
power budget and the mains fuses right during bootup.)

Cons: more complex and depends on SMP cross-calls being available at
mem init time. Not necessarily hotplug friendly.

- Avoid the problem by punting to async mem init.

Pros: it gets us to a minimal working system quickly and leaves the
memory code relatively untouched.

Cons: makes memory state asynchronous and non-deterministic.
Stats either fluctuate shortly after bootup or have to be faked.
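
To illustrate the shape of the second option, here is a rough user-space
sketch, not kernel code. It assumes pthreads as a stand-in for SMP
function-calls, and 'struct fake_page', init_node_pages() and
init_all_nodes() are hypothetical names made up for this example. Each
worker only touches its own node's range, and the caller waits for all
workers, so the init stays synchronous from the caller's point of view:

```c
/* User-space analogy only: threads stand in for per-node CPUs. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_NODES 64

/* Hypothetical stand-in for the kernel's struct page. */
struct fake_page {
	unsigned long flags;
	int nid;		/* owning node */
	int initialized;
};

/* One work item per node: a contiguous range of pages. */
struct node_work {
	struct fake_page *pages;	/* first page of this node's range */
	size_t nr_pages;
	int nid;
};

/* Worker: touches only its own node's range, so no locking is needed. */
static void *init_node_pages(void *arg)
{
	struct node_work *w = arg;
	size_t i;

	for (i = 0; i < w->nr_pages; i++) {
		w->pages[i].flags = 0;
		w->pages[i].nid = w->nid;
		w->pages[i].initialized = 1;
	}
	return NULL;
}

/*
 * Split 'total' pages evenly across 'nr_nodes' workers and wait for all
 * of them, so the caller still sees a synchronous init.
 * Returns 0 on success, -1 if a thread could not be created.
 */
static int init_all_nodes(struct fake_page *pages, size_t total, int nr_nodes)
{
	pthread_t tid[MAX_NODES];
	struct node_work work[MAX_NODES];
	size_t per_node, start = 0;
	int n, started = 0, ret = 0;

	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);
	per_node = total / nr_nodes;

	for (n = 0; n < nr_nodes; n++) {
		/* The last node picks up the remainder. */
		size_t len = (n == nr_nodes - 1) ? total - start : per_node;

		work[n].pages = pages + start;
		work[n].nr_pages = len;
		work[n].nid = n;
		if (pthread_create(&tid[n], NULL, init_node_pages, &work[n])) {
			ret = -1;
			break;
		}
		started++;
		start += len;
	}

	/* Join everything that was started, even on failure. */
	for (n = 0; n < started; n++)
		pthread_join(tid[n], NULL);

	return ret;
}
```

A caller would hand init_all_nodes() the whole page array plus a node
count and only continue once it returns. The real thing would split by
actual node memory ranges rather than evenly, and would have to cope
with cross-calls possibly not being available that early.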

Thanks,

Ingo