Re: [PATCH v4 1/2] lib/test_bitops: Add benchmark test for fns()

From: Yury Norov
Date: Sun May 05 2024 - 16:40:06 EST


On Mon, May 06, 2024 at 02:48:14AM +0800, Kuan-Wei Chiu wrote:
> On Sun, May 05, 2024 at 01:11:53PM +0000, David Laight wrote:
> > From: Yury Norov
> > > Sent: 01 May 2024 17:30
> > >
> > > On Wed, May 01, 2024 at 09:20:46PM +0800, Kuan-Wei Chiu wrote:
> > > > Introduce a benchmark test for the fns(). It measures the total time
> > > > taken by fns() to process 1,000,000 test data generated using
> > > > get_random_bytes() for each n in the range [0, BITS_PER_LONG).
> > > >
> > > > example:
> > > > test_bitops: fns: 5876762553 ns, 64000000 iterations
> > >
> > > So... 5 seconds for a test sounds too much. I see the following patch
> > > improves it dramatically, but in general let's stay in a range of
> > > milliseconds. On other machines it may run much slower and trigger
> > > a stall watchdog.
> > >
> > > > Signed-off-by: Kuan-Wei Chiu <visitorckw@xxxxxxxxx>
> > >
> > > Suggested-by: Yury Norov <yury.norov@xxxxxxxxx>
> > >
> > > > ---
> > > >
> > > > Changes in v4:
> > > > - Correct get_random_long() -> get_random_bytes() in the commit
> > > > message.
> > > >
> > > > lib/test_bitops.c | 22 ++++++++++++++++++++++
> > > > 1 file changed, 22 insertions(+)
> > > >
> > > > diff --git a/lib/test_bitops.c b/lib/test_bitops.c
> > > > index 3b7bcbee84db..ed939f124417 100644
> > > > --- a/lib/test_bitops.c
> > > > +++ b/lib/test_bitops.c
> > > > @@ -50,6 +50,26 @@ static unsigned long order_comb_long[][2] = {
> > > > };
> > > > #endif
> > > >
> > > > +static unsigned long buf[1000000];
> > >
> > > Can you make it __init, or allocate with kmalloc_array(), so that 64M
> > > of memory will not last forever in the kernel?
> > >
> > > > +static int __init test_fns(void)
> > > > +{
> > > > + unsigned int i, n;
> > > > + ktime_t time;
> > > > +
> > > > + get_random_bytes(buf, sizeof(buf));
> > > > + time = ktime_get();
> > > > +
> > > > + for (n = 0; n < BITS_PER_LONG; n++)
> > > > + for (i = 0; i < 1000000; i++)
> > > > + fns(buf[i], n);
> > >
> > > What concerns me here is that fns() is a in fact a const function, and
> > > the whole loop may be eliminated. Can you make sure it's not your case
> > > because 450x performance boost sounds a bit too much to me.
> > >
> > > You can declare a "static volatile __used __init" variable to assign
> > > the result of fns(), and ensure that the code is not eliminated
> >
> > Yep, without 'c' this compiler to 'return 0'.
> >
> > static inline unsigned long fns(unsigned long word, unsigned int n)
> > {
> > while (word && n--)
> > word &= word - 1;
> > return word ? __builtin_ffs(word) : 8 * sizeof (long);
> > }
> >
> > unsigned long buf[1000000];
> >
> > volatile int c;
> >
> > int test_fns(void)
> > {
> > unsigned int i, n;
> >
> > for (n = 0; n < 8*sizeof (long); n++)
> > for (i = 0; i < 1000000; i++)
> > c = fns(buf[i], n);
> > return 0;
> > }
> >
> > You are also hitting the random number generator.
> > It would be better to use a predictable sequence.
> > Then you could, for instance, add up all the fns() results
> > and check you get the expected value.
> >
> > With a really trivial 'RNG' (like step a CRC one bit) you
> > could do it inside the loop and not nee a buffer at all.
> >
> Hi David,
>
> I do think that conducting correctness testing here is a good idea.
> However, we are about to change the return value of fns() from return
> BITS_PER_LONG to return >= BITS_PER_LONG [1][2] when the nth bit is not
> found. Therefore, using a fixed input series here and checking the sum
> of return values may not accurately test it. Do you know if there are
> any other more suitable testing methods?

Hi David,

Let's do one thing in a time. test_fns() is about performance testing,
and that's it. Kuan-Wei wrote it specifically for that.

He also chose to use random numbers to feed the fns(), and that's a
reasonable choice. I see nothing wrong in his approach with the array,
as well as what you proposed. But the test is done, and it works. If
you think it's worth testing the function with your approach, feel
free to submit your test, I'll take it just as well, and we'll have
even better coverage.

Regarding summing up all the fns() returns to check correctness - I
think it's a wrong idea. At first because current behavior when we
return BITS_PER_LONG is not a contract, just implementation detail.
And even now, on 64- and 32-bit arches fns() will sum up to different
numbers.

Second, this summing up doesn't sound reliable. Imagine a case when
the function fails one time returning a less-by-one number, and in the
following run - a greater-by-one. So the sum will be correct, and the
test will not catch the error.

We're running a test_find_nth_bit(), and to me it's enough for testing
fns(). But if you'd like, please send a more specific test.

Thanks,
Yury