Re: [PATCH v2] bitmap: speedup in bitmap_find_free_region when order is 0

From: Andrew Morton
Date: Mon Apr 08 2013 - 23:10:09 EST


On Tue, 9 Apr 2013 11:44:46 +0900 Chanho Min <chanho.min@xxxxxxx> wrote:

> If bitmap_find_free_region() is called with order=0, we can reduce the
> number of loop iterations needed to find one free bit. First, it scans
> the bitmap array one long at a time, then finds one free bit within
> that long.
>
> On a 32-bit system with a 1024-bit bitmap, the worst case needs 1024
> loop iterations to find one free bit. With this patch applied, it takes
> 64 iterations. The cost is one additional iteration if the free bit is
> in the first word of the bitmap; from the second word onward, the
> search is significantly faster.
>
> Changes compared to v1:
> - Modified unnecessarily complicated code.
> - Fixed the buggy code if `bits' is not a multiple of BITS_PER_LONG.
>
> ...
>
> --- a/lib/bitmap.c
> +++ b/lib/bitmap.c
> @@ -1099,6 +1099,39 @@ done:
> }
>
> /**
> + * bitmap_find_free_one - find a mem region
> + * @bitmap: array of unsigned longs corresponding to the bitmap
> + * @bits: number of bits in the bitmap
> + *
> + * Find one of free (zero) bits in a @bitmap of @bits bits and
> + * allocate them (set them to one).
> + *
> + * Return the bit offset in bitmap of the allocated region,
> + * or -errno on failure.
> + */
> +static int __bitmap_find_free_one(unsigned long *bitmap, int bits)
> +{
> + int pos, end = BITS_PER_LONG, i;
> + int nlongs_reg = BITS_TO_LONGS(bits);

Still wrong, I think - BITS_TO_LONGS() rounds up.
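
To make the rounding concrete, here is a small userspace sketch. The two
macros are stand-ins written to mirror the kernel ones of the same names,
and valid_bits_in_last_word() is a hypothetical helper invented for this
illustration, not something in lib/bitmap.c:

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-ins for the kernel macros of the same names. */
#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))
#define BITS_TO_LONGS(nr) (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * Hypothetical helper: how many bits of the final word belong to a
 * bitmap of `bits' bits.  Because BITS_TO_LONGS() rounds up, the final
 * word is only partly valid whenever `bits' is not a multiple of
 * BITS_PER_LONG.
 */
static int valid_bits_in_last_word(int bits)
{
	int rem;

	assert(bits > 0);
	rem = bits % BITS_PER_LONG;
	return rem ? rem : BITS_PER_LONG;
}
```

So a bitmap of BITS_PER_LONG + 8 bits occupies two longs, and only the
low 8 bits of the second one may be handed out.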

> + int last_bits = bits % BITS_PER_LONG;
> +
> + for (i = 0 ; i < nlongs_reg ; i++) {

No space before the semicolon, please. checkpatch should warn about
this but it seems to be broken.

> + if (bitmap[i] != ~0UL) {
> + if (i == (nlongs_reg - 1) && last_bits)
> + end = last_bits;
> + for (pos = 0 ; pos < end ; pos++) {
> + if (!__reg_op(&bitmap[i], pos, 0,
> + REG_OP_ISFREE))
> + continue;
> + __reg_op(&bitmap[i], pos, 0, REG_OP_ALLOC);
> + return pos;
> + }
> + }
> + }
> + return -ENOMEM;
> +}
> +
> +/**
> * bitmap_find_free_region - find a contiguous aligned mem region
> * @bitmap: array of unsigned longs corresponding to the bitmap
> * @bits: number of bits in the bitmap
> @@ -1116,6 +1149,9 @@ int bitmap_find_free_region(unsigned long *bitmap, int bits, int order)
> {
> int pos, end; /* scans bitmap by regions of size order */
>
> + if (order == 0)
> + return __bitmap_find_free_one(bitmap, bits);
> +
> for (pos = 0 ; (end = pos + (1 << order)) <= bits; pos = end) {
> if (!__reg_op(bitmap, pos, order, REG_OP_ISFREE))
> continue;

It seems excessively complicated to me. Why not change
bitmap_find_free_region() to skip the leading all-ones words and when
it finds a not-all-ones word, adjust `pos' then fall into the existing
bit-at-a-time search?

In fact we could use the 64-bits-at-a-time search for allocations other
than order-zero:

--- a/lib/bitmap.c~a
+++ a/lib/bitmap.c
@@ -1117,6 +1117,12 @@ int bitmap_find_free_region(unsigned lon
int pos, end; /* scans bitmap by regions of size order */

for (pos = 0 ; (end = pos + (1 << order)) <= bits; pos = end) {
+ if ((pos & (BITS_PER_LONG - 1)) == 0) {
+ if (bitmap[pos / BITS_PER_LONG] == ~0UL) {
+ pos += BITS_PER_LONG;
+ continue;
+ }
+ }
if (!__reg_op(bitmap, pos, order, REG_OP_ISFREE))
continue;
__reg_op(bitmap, pos, order, REG_OP_ALLOC);

(that's presumably slow and buggy, but you get the idea ;))
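
For anyone who wants to play with the idea outside the kernel, below is a
minimal userspace sketch of the word-skipping search. It is an
illustration under assumptions, not a proposed patch: region_is_free(),
region_alloc() and find_free_region_sketch() are hypothetical names that
stand in for the __reg_op() calls, the helpers test one bit at a time for
clarity rather than speed, and failure is reported as -1 instead of
-ENOMEM.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical helper: nonzero if all 1 << order bits at pos are zero. */
static int region_is_free(const unsigned long *bitmap, int pos, int order)
{
	int i;

	for (i = pos; i < pos + (1 << order); i++)
		if (bitmap[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return 0;
	return 1;
}

/* Hypothetical helper: set all 1 << order bits starting at pos. */
static void region_alloc(unsigned long *bitmap, int pos, int order)
{
	int i;

	for (i = pos; i < pos + (1 << order); i++)
		bitmap[i / BITS_PER_LONG] |= 1UL << (i % BITS_PER_LONG);
}

/*
 * Find and allocate an aligned free region of 1 << order bits.
 * Returns the bit offset of the region, or -1 if none was found.
 */
static int find_free_region_sketch(unsigned long *bitmap, int bits, int order)
{
	int size, pos = 0;

	assert(order >= 0 && order < 31);
	size = 1 << order;

	while (pos + size <= bits) {
		/*
		 * At a word boundary, an all-ones word rules out every
		 * region that overlaps it.  Skip the larger of one word
		 * and one region so that pos stays aligned to size.
		 */
		if ((pos & (BITS_PER_LONG - 1)) == 0 &&
		    bitmap[pos / BITS_PER_LONG] == ~0UL) {
			pos += size > BITS_PER_LONG ? size : BITS_PER_LONG;
			continue;
		}
		if (region_is_free(bitmap, pos, order)) {
			region_alloc(bitmap, pos, order);
			return pos;
		}
		pos += size;
	}
	return -1;
}
```

With order 0 this reduces to a word-at-a-time scan followed by a
bit-at-a-time search inside the first word that has room, without a
separate helper function.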

Another obvious inefficiency in bitmap_find_free_region() is that when
it inspects a region at `pos' for 1<<order zero bits and fails to find
them, it resumes the search at pos+1. Dumb - it should resume
searching at the next-one-bit, rounded up to the next 1<<order.

Obviously nobody tried very hard here - is any poor soul using this
code for large bitmaps? I guess "yes", as 1024-CPU machines exist.

