Re: [PATCH v2 3/5] powerpc/mm: Allow more than 16 low slices

From: Christophe LEROY
Date: Fri Jan 19 2018 - 03:59:44 EST




On 19/01/2018 at 09:30, Aneesh Kumar K.V wrote:
Christophe Leroy <christophe.leroy@xxxxxx> writes:

While the implementation of the "slices" address space allows
a significant number of high slices, it limits the number of
low slices to 16 due to the use of a single u64 low_slices_psize
element in struct mm_context_t.

On the 8xx, the minimum slice size is the size of the area
covered by a single PMD entry, i.e. 4M in 4K pages mode and 64M in
16K pages mode. This means we could have up to 1024 and 64
slices respectively.
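
As a quick check of the figures above, here is a minimal sketch of the arithmetic. It assumes the slices span the full 32-bit (4 GiB) address space of the 8xx; the names LOW_SPACE_BYTES and sketch_num_slices() are illustrative only and do not exist in the kernel.

```c
#include <stdint.h>

/* Assumption: slices cover the whole 32-bit (4 GiB) address space. */
#define LOW_SPACE_BYTES (UINT64_C(1) << 32)

/* Number of slices of a given size needed to cover that space. */
static inline uint64_t sketch_num_slices(uint64_t slice_bytes)
{
	return LOW_SPACE_BYTES / slice_bytes;
}
```

With 4M slices this gives 1024 and with 64M slices it gives 64, both well above the 16 entries a single u64 of 4-bit encodings can hold.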

In order to overcome this limitation, this patch switches the
handling of low_slices to BITMAPs as done already for high_slices.

Does it have a performance impact? When we switched high_slices,
that was one of the questions asked. Now with a topdown search we should
mostly be using the high_slices. But it would be good to get numbers for
ppc64 for this change.

It should have almost no performance impact at all, because all the bitmap functions use a simplified path when the number of bits is small and constant:

- ret->low_slices = 0;
+ slice_bitmap_zero(ret->low_slices, SLICE_NUM_LOW);


static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
	if (small_const_nbits(nbits))
		*dst = 0UL;
	else {
		unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
		memset(dst, 0, len);
	}
}



- dst->low_slices |= src->low_slices;
+ slice_bitmap_or(dst->low_slices, dst->low_slices, src->low_slices,
+ SLICE_NUM_LOW);


static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
			const unsigned long *src2, unsigned int nbits)
{
	if (small_const_nbits(nbits))
		*dst = *src1 | *src2;
	else
		__bitmap_or(dst, src1, src2, nbits);
}
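
To make the argument concrete, here is a userspace sketch of the fast path used by the two helpers quoted above. It is not the kernel code: the sketch_ names are made up, SLICE_NUM_LOW is hard-coded to 16 as an assumption, and small_const_nbits() is written from memory of include/linux/bitmap.h, so treat the exact definition as approximate. It relies on the GCC/Clang builtin __builtin_constant_p().

```c
#include <limits.h>
#include <string.h>

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* True only if nbits is a compile-time constant that fits in one long. */
#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)

/* Assumed value for illustration: 16 low slices. */
#define SLICE_NUM_LOW 16

static inline void sketch_bitmap_zero(unsigned long *dst, unsigned int nbits)
{
	if (small_const_nbits(nbits))
		*dst = 0UL;	/* single store, same as "low_slices = 0" */
	else
		memset(dst, 0, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
}

static inline void sketch_bitmap_or(unsigned long *dst,
				    const unsigned long *src1,
				    const unsigned long *src2,
				    unsigned int nbits)
{
	if (small_const_nbits(nbits)) {
		*dst = *src1 | *src2;	/* single OR, same as "|=" */
	} else {
		/* Generic path: one word at a time. */
		for (size_t k = 0; k < BITS_TO_LONGS(nbits); k++)
			dst[k] = src1[k] | src2[k];
	}
}
```

When the call is inlined with a constant such as SLICE_NUM_LOW and optimisation is enabled, the compiler can drop the generic branch, so the generated code should match the old scalar assignment and OR. Without inlining or optimisation the generic branch is taken, which gives the same result.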





Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxx>
---
v2: Using slice_bitmap_xxx() macros instead of bitmap_xxx() functions.

arch/powerpc/include/asm/book3s/64/mmu.h | 2 +-
arch/powerpc/include/asm/mmu-8xx.h | 2 +-
arch/powerpc/include/asm/paca.h | 2 +-
arch/powerpc/kernel/paca.c | 3 +-
arch/powerpc/mm/hash_utils_64.c | 13 ++--
arch/powerpc/mm/slb_low.S | 8 ++-
arch/powerpc/mm/slice.c | 104 +++++++++++++++++--------------
7 files changed, 74 insertions(+), 60 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index c9448e19847a..27e7e9732ea1 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -91,7 +91,7 @@ typedef struct {
struct npu_context *npu_context;
#ifdef CONFIG_PPC_MM_SLICES
- u64 low_slices_psize; /* SLB page size encodings */
+ unsigned char low_slices_psize[8]; /* SLB page size encodings */

Can that 8 be a #define?

Sure
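
One possible shape for that #define, as a sketch only: it assumes 16 low slices with a 4-bit page size encoding each (which is what the old u64 held), and the names LOW_SLICE_ARRAY_SZ, SLICE_PSIZE_BITS, struct sketch_context and the two accessors are hypothetical, not taken from the patch.

```c
#define SLICE_NUM_LOW      16	/* assumed: 16 low slices on ppc64 */
#define SLICE_PSIZE_BITS   4	/* assumed: 4-bit psize encoding per slice */
/* Hypothetical name: two 4-bit encodings per byte, so 8 bytes in total. */
#define LOW_SLICE_ARRAY_SZ (SLICE_NUM_LOW * SLICE_PSIZE_BITS / 8)

struct sketch_context {
	unsigned char low_slices_psize[LOW_SLICE_ARRAY_SZ];
};

/* Read the 4-bit encoding of one slice (even slice = low nibble). */
static inline unsigned int
sketch_get_low_psize(const struct sketch_context *ctx, unsigned int slice)
{
	unsigned int shift = (slice & 1) * SLICE_PSIZE_BITS;

	return (ctx->low_slices_psize[slice >> 1] >> shift) & 0xf;
}

/* Write the 4-bit encoding of one slice, leaving its neighbour intact. */
static inline void
sketch_set_low_psize(struct sketch_context *ctx, unsigned int slice,
		     unsigned int psize)
{
	unsigned int shift = (slice & 1) * SLICE_PSIZE_BITS;
	unsigned char *p = &ctx->low_slices_psize[slice >> 1];

	*p = (unsigned char)((*p & ~(0xf << shift)) | ((psize & 0xf) << shift));
}
```

Deriving the size from the slice count keeps the array in step if the number of low slices changes later.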



unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
unsigned long slb_addr_limit;
#else

-aneesh


Christophe