Re: [RFC 0/2] Delay initializing of large sections of memory

From: Nathan Zimmer
Date: Fri Jun 21 2013 - 13:18:21 EST


On 06/21/2013 12:03 PM, H. Peter Anvin wrote:
On 06/21/2013 09:51 AM, Greg KH wrote:
On Fri, Jun 21, 2013 at 11:25:32AM -0500, Nathan Zimmer wrote:
This rfc patch set delays initializing large sections of memory until we have
started cpus. This has the effect of reducing startup times on large memory
systems. On 16TB it can take over an hour to boot and most of that time
is spent initializing memory.

We avoid that bottleneck by delaying initialization until after we have
started multiple cpus and can initialize in a multithreaded manner.
This allows us to actually reduce boot time rather than just moving around
the point of initialization.

Mike and I have worked on this set for a while, with him doing most of the
heavy lifting, and are eager for some feedback.
Why make this a config option at all? Why not just always do this if the
memory size is larger than some specific number (like 8TB)?

Otherwise the distros will always enable this option, and having it be a
configuration choice doesn't make any sense.

Since you made it a compile time option, it would be good to know how
much code it adds, but otherwise I agree with Greg here... this really
shouldn't need to be an option. It *especially* shouldn't need to be a
hand-set runtime option (which looks quite complex, to boot.)
The patchset as a whole is just over 400 lines so it doesn't add a lot.
If I were to pull the .config option it would probably remove 30 lines.

The command line option is too complex, but I haven't yet found a way to get
some of the data at runtime.



I suspect the cutoff for this should be a lot lower than 8 TB even, more
like 128 GB or so. The only concern is to not set the cutoff so low
that we can end up running out of memory or with suboptimal NUMA
placement just because of this.
Even at lower amounts of RAM there is a positive impact. It knocks time off
boot even with as little as 1TB of RAM.

Also, in case it is not bloody obvious: whatever memory the kernel image
was loaded into MUST be considered "online", even if it is loaded way high.

-hpa

Ok
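
For illustration only, the two constraints raised above (enable deferral
automatically once total memory passes some cutoff, and never defer the
memory holding the kernel image) could be sketched roughly as below. This is
not code from the patch set. The names DEFER_CUTOFF_BYTES, struct mem_range,
and may_defer_init() are hypothetical, and the 128 GB value is just the
figure floated in this thread, not a tested threshold.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cutoff; 128 GB is only the number suggested in the thread. */
#define DEFER_CUTOFF_BYTES (128ULL << 30)

/* Hypothetical physical address range, [start, end). */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/* True if the two half-open ranges share at least one byte. */
bool ranges_overlap(const struct mem_range *a, const struct mem_range *b)
{
	return a->start < b->end && b->start < a->end;
}

/*
 * Decide whether initialization of range r may be postponed until
 * secondary CPUs are running.
 *  - Small systems never defer, so there is no risk of running short
 *    of usable memory early in boot.
 *  - Anything overlapping the kernel image is always initialized early,
 *    no matter how high the image was loaded.
 */
bool may_defer_init(uint64_t total_bytes,
		    const struct mem_range *r,
		    const struct mem_range *kernel_image)
{
	if (total_bytes < DEFER_CUTOFF_BYTES)
		return false;
	if (ranges_overlap(r, kernel_image))
		return false;
	return true;
}
```

A real implementation would presumably also need to keep enough memory
online per NUMA node to avoid the placement problems mentioned above; that
is left out of this sketch.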