Re: [RFC REBASED 5/5] powerpc/mm/slice: use the dynamic high slice size to limit bitmap operations

From: Aneesh Kumar K.V
Date: Wed Feb 28 2018 - 02:00:04 EST

On 02/28/2018 12:23 PM, Nicholas Piggin wrote:
> On Tue, 27 Feb 2018 18:11:07 +0530
> "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> wrote:

>> Nicholas Piggin <npiggin@xxxxxxxxx> writes:

>>> On Tue, 27 Feb 2018 14:31:07 +0530
>>> "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> wrote:
>>>> Christophe Leroy <christophe.leroy@xxxxxx> writes:
>>>>> The number of high slices a process might use now depends on its
>>>>> address space size, and what allocation address it has requested.
>>>>>
>>>>> This patch uses that limit throughout call chains where possible,
>>>>> rather than use the fixed SLICE_NUM_HIGH for bitmap operations.
>>>>> This saves some cost for processes that don't use very large address
>>>>> spaces.
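
(For illustration only, here is a small user-space sketch of what using
the dynamic limit means. The shift and range values are stand-ins picked
to keep the arithmetic visible; they are not taken from the patch.)

#include <stdio.h>

/* Stand-in values: 1TB high slices, 512TB maximum address space. */
#define SLICE_HIGH_SHIFT	40
#define SLICE_NUM_HIGH		(1UL << (49 - SLICE_HIGH_SHIFT))	/* 512 */
#define GET_HIGH_SLICE_INDEX(addr)	((addr) >> SLICE_HIGH_SHIFT)

int main(void)
{
	unsigned long slb_addr_limit = 1UL << 47;	/* 128TB default limit */
	unsigned long high_slices = GET_HIGH_SLICE_INDEX(slb_addr_limit);

	/* Bitmap walks only need to cover 'high_slices' entries (128 here)
	 * instead of always touching all SLICE_NUM_HIGH (512) of them. */
	printf("scan %lu of %lu high slices\n", high_slices, SLICE_NUM_HIGH);
	return 0;
}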

>>>> I haven't really looked at the final code. One of the issues we had was
>>>> with the below scenario:
>>>>
>>>> mmap(addr, len) where addr < 128TB and addr + len > 128TB. We want to
>>>> make sure we build the mask such that we don't find the addr available.
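
(Purely as an illustration of that scenario, with made-up numbers: the
worry is that if the mask for [addr, addr + len) is only built up to the
current slb_addr_limit, the slice above 128TB silently drops out and the
range can look available when it should not.)

#include <stdio.h>

#define SLICE_HIGH_SHIFT	40	/* stand-in: 1TB high slices */
#define GET_HIGH_SLICE_INDEX(addr)	((addr) >> SLICE_HIGH_SHIFT)

int main(void)
{
	unsigned long limit = 1UL << 47;		/* 128TB default limit */
	unsigned long addr  = limit - (1UL << 40);	/* 127TB, below the limit */
	unsigned long len   = 2UL << 40;		/* 2TB, crosses the limit */

	/* The request needs high slices 127 and 128, but a mask built only
	 * up to GET_HIGH_SLICE_INDEX(limit) == 128 entries covers indices
	 * 0..127 and says nothing about slice 128. */
	printf("request covers slices %lu..%lu, limit covers 0..%lu\n",
	       GET_HIGH_SLICE_INDEX(addr),
	       GET_HIGH_SLICE_INDEX(addr + len - 1),
	       GET_HIGH_SLICE_INDEX(limit) - 1);
	return 0;
}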

>>> We should run it through the mmap regression tests. I *think* we moved
>>> all of that logic from the slice code to get_unmapped_area before going
>>> into slices. I may have missed something though; it would be good to
>>> have more eyes on it.

>> mmap(-1, ...) failed with the test. Something like the below fixes it:

>> @@ -756,7 +770,7 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
>>  	mm->context.low_slices_psize = lpsizes;
>>  
>>  	hpsizes = mm->context.high_slices_psize;
>> -	high_slices = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
>> +	high_slices = SLICE_NUM_HIGH;
>>  	for (i = 0; i < high_slices; i++) {
>>  		mask_index = i & 0x1;
>>  		index = i >> 1;

>> I guess for everything in the mm_context_t, we should compute it up to
>> SLICE_NUM_HIGH. The reason for the failure was that, even though we
>> recompute the slice mask cached in the mm_context when slb_addr_limit
>> changes, it was still derived from high_slices_psize, which was computed
>> with the lower value.
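
(A toy user-space model of that failure, not the kernel code:
high_slices_psize packs one 4-bit psize per slice, two per byte, just
like the quoted loop. If only the first
GET_HIGH_SLICE_INDEX(slb_addr_limit) entries are ever written, anything
that later walks past that point, such as the cached slice-mask
recalculation, reads stale values.)

#include <stdio.h>
#include <string.h>

#define NUM_HIGH	16	/* stand-in for SLICE_NUM_HIGH */
#define OLD_LIMIT	8	/* stand-in for GET_HIGH_SLICE_INDEX(slb_addr_limit) */

/* One 4-bit psize per high slice, two slices packed per byte. */
static unsigned int get_psize(const unsigned char *hpsizes, int i)
{
	return (hpsizes[i >> 1] >> ((i & 1) * 4)) & 0xf;
}

static void set_psize(unsigned char *hpsizes, int i, unsigned int psize)
{
	int index = i >> 1, mask_index = i & 1;

	hpsizes[index] = (hpsizes[index] & ~(0xf << (mask_index * 4))) |
			 (psize << (mask_index * 4));
}

int main(void)
{
	unsigned char hpsizes[NUM_HIGH / 2];
	int i;

	memset(hpsizes, 0, sizeof(hpsizes));

	/* The bug being described: only the first OLD_LIMIT entries get the
	 * new psize; everything above keeps whatever was there before. */
	for (i = 0; i < OLD_LIMIT; i++)
		set_psize(hpsizes, i, 0x5);

	/* Walking the full range shows stale entries above OLD_LIMIT; this
	 * is what a later slice-mask recalculation would be reading. */
	for (i = 0; i < NUM_HIGH; i++)
		printf("slice %2d psize %#x%s\n", i, get_psize(hpsizes, i),
		       i >= OLD_LIMIT ? "  <- stale" : "");
	return 0;
}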

> Okay, thanks for catching that, Aneesh. I guess that's a slow path so it
> should be okay. Christophe, if you're taking care of the series, can you
> fold it in? Otherwise I'll do that after yours gets merged.


Should we also compute the mm_context_t.slice_mask using SLICE_NUM_HIGH and
skip the recalc_slice_mask_cache when we change the addr limit?
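
(Again only a toy sketch of what that alternative would look like, with
made-up values: compute the cached mask over the full SLICE_NUM_HIGH
range up front, so growing slb_addr_limit does not require recalculating
it.)

#include <stdio.h>

#define NUM_HIGH	16	/* stand-in for SLICE_NUM_HIGH */

/* Build the mask over the full range once; it stays valid when the
 * address limit later grows, so no recalculation is needed then. */
static unsigned long compute_mask(const unsigned char *psize_of, unsigned int psize)
{
	unsigned long mask = 0;
	int i;

	for (i = 0; i < NUM_HIGH; i++)
		if (psize_of[i] == psize)
			mask |= 1UL << i;
	return mask;
}

int main(void)
{
	unsigned char psize_of[NUM_HIGH];
	int i;

	for (i = 0; i < NUM_HIGH; i++)
		psize_of[i] = 0x5;	/* every slice initialised up front */

	/* The address limit would then only matter when the mask is used,
	 * not when it is computed. */
	printf("mask for psize 0x5: %#lx\n", compute_mask(psize_of, 0x5));
	return 0;
}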

-aneesh