Re: [RFC v2 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

From: Ingo Molnar
Date: Mon Aug 05 2013 - 05:58:21 EST



* Nathan Zimmer <nzimmer@xxxxxxx> wrote:

> We are still restricting ourselves to 2MiB initialization to
> keep the patch set a little smaller and more clear.
>
> We are still struggling with the expand(). Nearly always, the first
> reference to a struct page is in the middle of the 2MiB region.
> We were unable to find a good solution. Also, given the strong warning
> at the head of expand(), we did not feel experienced enough to refactor
> it to make things always reference the 2MiB page first. The only other
> fastpath impact left is the expansion in prep_new_page.

I suppose it's about this chunk:

@@ -860,6 +917,7 @@ static inline void expand(struct zone *zone, struct page *page,
 		area--;
 		high--;
 		size >>= 1;
+		ensure_page_is_initialized(page);
 		VM_BUG_ON(bad_range(zone, &page[size]));

where ensure_page_is_initialized() does, in essence:

+	while (aligned_start_pfn < aligned_end_pfn) {
+		if (pfn_valid(aligned_start_pfn)) {
+			page = pfn_to_page(aligned_start_pfn);
+
+			if (PageUninitialized2m(page))
+				expand_page_initialization(page);
+		}
+
+		aligned_start_pfn += PTRS_PER_PMD;
+	}

where aligned_start_pfn is the PFN rounded down to a 2MB boundary.

That looks like an expensive loop to execute for a single page: there are
512 pages in a 2MB range, so on average this iterates 256 times, for every
single page of allocation. Right?

I might be missing something, but why not just represent the
initialization state in 2MB chunks: it is either fully uninitialized, or
fully initialized. If any page in the 'middle' gets allocated, all page
heads have to get initialized.

That should make the fast path test fairly cheap, basically just
PageUninitialized2m(page) has to be tested - and that will fail in the
post-initialization fastpath.
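
To make the suggestion concrete, here is a rough user-space sketch of the
"one state per 2MB chunk" idea. It is not the patch's code: toy_page,
mem_map[] as a flat array, chunk_head_pfn(), init_chunk() and
ensure_chunk_initialized() are all made-up names for illustration. It
assumes 4KB base pages (so 512 pages per 2MB chunk) and it assumes the
"uninitialized" flag lives on the first page of each chunk, which the
patch may or may not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4KB pages, so a 2MB chunk holds 512 of them. */
#define PAGES_PER_CHUNK	512UL
#define NR_CHUNKS	4UL
#define NR_PAGES	(NR_CHUNKS * PAGES_PER_CHUNK)

/* Toy stand-in for struct page; not the kernel's layout. */
struct toy_page {
	bool uninitialized_2m;	/* only meaningful on a chunk's first page */
	bool initialized;	/* set once the page has been set up */
};

static struct toy_page mem_map[NR_PAGES];
static unsigned long slow_path_calls;	/* counts whole-chunk initializations */

/* Simulated boot: every chunk starts out fully uninitialized. */
static void toy_boot(void)
{
	unsigned long pfn;

	slow_path_calls = 0;
	for (pfn = 0; pfn < NR_PAGES; pfn++) {
		mem_map[pfn].initialized = false;
		mem_map[pfn].uninitialized_2m = (pfn % PAGES_PER_CHUNK) == 0;
	}
}

/* Round a pfn down to the first pfn of its 2MB chunk. */
static inline unsigned long chunk_head_pfn(unsigned long pfn)
{
	return pfn & ~(PAGES_PER_CHUNK - 1);
}

/* Slow path: initialize all 512 pages of the chunk, then clear the flag. */
static void init_chunk(unsigned long head_pfn)
{
	unsigned long i;

	for (i = 0; i < PAGES_PER_CHUNK; i++)
		mem_map[head_pfn + i].initialized = true;
	mem_map[head_pfn].uninitialized_2m = false;
	slow_path_calls++;
}

/* Fast path: a single flag test once the chunk has been initialized. */
static inline void ensure_chunk_initialized(unsigned long pfn)
{
	unsigned long head = chunk_head_pfn(pfn);

	if (mem_map[head].uninitialized_2m)
		init_chunk(head);
}
```

In this toy model, after toy_boot() a first call such as
ensure_chunk_initialized(700) initializes pfns 512..1023 in one go, and any
later call for a pfn in that chunk only performs the flag test. Locking and
pfn_valid() holes are ignored here.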

Thanks,

Ingo