Re: [PATCH] perf bench: Add benchmark of find_next_bit

From: Andi Kleen
Date: Fri Jul 24 2020 - 10:45:05 EST


On Fri, Jul 24, 2020 at 12:19:59AM -0700, Ian Rogers wrote:
> for_each_set_bit, or similar functions like for_each_cpu, may be hot
> within the kernel. If many bits were set then one could imagine on
> Intel a "bt" instruction with every bit may be faster than the function
> call and word length find_next_bit logic. Add a benchmark to measure
> this.

> This benchmark on AMD rome and Intel skylakex shows "bt" is not a good
> option except for very small bitmaps.

Small bitmaps are a common case in the kernel (e.g. cpu bitmaps).

But the current code isn't that great for small bitmaps. It always looks
horrific when I look at PT traces or brstackinsn output, especially since
it was optimized purely for code size at some point.

It would probably be better to have different implementations for
different sizes.
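
As a rough illustration of what "different implementations for different
sizes" could look like, here is a minimal user-space sketch, not the
kernel's actual code. All function names (find_next_bit_small,
find_next_bit_generic, find_next_bit_sketch) are hypothetical, and it
assumes a GCC/Clang compiler for __builtin_ctzl. Bitmaps that fit in one
word take a branch-light inline path; larger ones fall back to a
word-by-word scan.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical fast path: the whole bitmap fits in one word
 * (1 <= size <= BITS_PER_LONG). Returns size if no bit is found. */
static inline size_t find_next_bit_small(unsigned long word,
                                         size_t size, size_t offset)
{
    if (offset >= size)
        return size;
    word &= ~0UL << offset;              /* drop bits below offset */
    if (size < BITS_PER_LONG)
        word &= (1UL << size) - 1;       /* drop bits at or above size */
    return word ? (size_t)__builtin_ctzl(word) : size;
}

/* Hypothetical generic path: scan one word at a time. */
static size_t find_next_bit_generic(const unsigned long *addr,
                                    size_t size, size_t offset)
{
    while (offset < size) {
        size_t idx = offset / BITS_PER_LONG;
        unsigned long word = addr[idx] & (~0UL << (offset % BITS_PER_LONG));

        if (word) {
            size_t bit = idx * BITS_PER_LONG + (size_t)__builtin_ctzl(word);
            return bit < size ? bit : size;
        }
        offset = (idx + 1) * BITS_PER_LONG;  /* start of the next word */
    }
    return size;
}

/* Dispatch on bitmap size. If size is a compile-time constant, the
 * compiler can drop the unused branch after inlining. */
static inline size_t find_next_bit_sketch(const unsigned long *addr,
                                          size_t size, size_t offset)
{
    if (size == 0)
        return 0;
    if (size <= BITS_PER_LONG)
        return find_next_bit_small(addr[0], size, offset);
    return find_next_bit_generic(addr, size, offset);
}
```

Whether a split like this actually wins would need to be measured, e.g.
with the benchmark from this patch.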

-Andi