Re: [PATCH 3/5] lib/bitmap: add test for bitmap_{from,to}_arr64

From: Yury Norov
Date: Mon Feb 27 2023 - 15:23:45 EST


On Mon, Feb 27, 2023 at 12:12:01PM -0800, Guenter Roeck wrote:
> On 2/27/23 11:24, Yury Norov wrote:
> > On Mon, Feb 27, 2023 at 06:59:12AM -0800, Guenter Roeck wrote:
> > > On 2/27/23 06:46, Alexander Lobakin wrote:
> > > > From: Yury Norov <yury.norov@xxxxxxxxx>
> > > > Date: Sat, 25 Feb 2023 16:06:45 -0800
> > > >
> > > > > On Sat, Feb 25, 2023 at 04:05:02PM -0800, Yury Norov wrote:
> > > > > > On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> > > > > > > > Test newly added bitmap_{from,to}_arr64() functions similarly to
> > > > > > > > already existing bitmap_{from,to}_arr32() tests.
> > > > > > > >
> > > > > > > > Signed-off-by: Yury Norov <yury.norov@xxxxxxxxx>
> > > > > > >
> > > > > > > Ever since this test has been in the tree, several of my boot tests show
> > > > > > > lots of messages such as
> > > > > > >
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> > > >
> > > > Hmmm, the whole 4 bytes weren't touched.
> > > >
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
> > > >
> > > > This is where it gets worse...
> > > >
> > > > > > > ...
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)
> > > >
> > > > I don't see the pattern in how the actual result gets generated. But the
> > > > problem is in the bitmap code rather than in the subtest -- "must be"s
> > > > are fully correct.
> > > >
> > > > Given that the 0xa5s are present in the upper 32 bits, it is Big Endian
> > > > I guess? Maybe even 32-bit Big Endian? Otherwise I'd start wondering
> > > > how come it doesn't reproduce on x86_64 :D
> > > >
> > >
> > > It does reproduce on 32-bit x86 builds, and as far as I can see
> > > it is only seen with 32-bit little endian systems.
> >
> > Hi Guenter, Alexander,
> >
> > I think that the reason for the failures like this:
> >
> > > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> >
> > is that bitmap_to_arr64 is overly optimized for 32-bit LE architectures.
> >
> > Regarding this:
> >
> > > test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> >
> > I am not sure what happens, but because this again happens on 32-bit
> > LE only, I hope the following fix will help too.
> >
> > Can you please check if the patch works for you? I don't have a 32-bit LE
> > machine at hand, and all my 32-bit VMs (arm and i386) refuse to load the
> > latest kernels for some weird reason, so it's only build-tested.
> >
> > I'll give it a full run when I restore my 32-bit setups.
> >
> > Thanks,
> > Yury
> >
> > From 2881714db497aed103e310865da075e7b0ce7e1a Mon Sep 17 00:00:00 2001
> > From: Yury Norov <yury.norov@xxxxxxxxx>
> > Date: Mon, 27 Feb 2023 09:21:59 -0800
> > Subject: [PATCH] lib/bitmap: drop optimization of bitmap_{from,to}_arr64
> >
> > bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE
> > architectures when it's wired to bitmap_copy_clear_tail().
> >
> > bitmap_copy_clear_tail() takes care of unused bits in the bitmap up to
> > the next word boundary. But on 32-bit machines, when copying bits from a
> > bitmap to an array of 64-bit words, the unused part of the recipient
> > array is expected to be cleared up to the 64-bit boundary, so with
> > bitmap_copy_clear_tail() the last 4 bytes may stay untouched.
> >
> > While the copying part of the optimization works correctly, that clear-tail
> > trick makes the corresponding tests reasonably fail when nbits % 64 <= 32:
> >
> > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> >
> > Fix it by removing bitmap_{from,to}_arr64() optimization for 32-bit LE
> > arches.
> >
> > Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> > Fixes: 0a97953fd2210 ("lib: add bitmap_{from,to}_arr64")
> > Signed-off-by: Yury Norov <yury.norov@xxxxxxxxx>
>
> Tested with 32-bit i386 image. With this patch on top of
> v6.2-12765-g982818426a0f, the log messages are gone. Without this patch,
> they are still seen.
>
> Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx>

Thanks!

Then, I'll submit it properly together with a fix for fail_counter.

Thanks,
Yury
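
For readers following the thread, here is a minimal userspace sketch of
the failure mode described in the commit message. It is not the kernel
code: uint32_t stands in for unsigned long on a 32-bit LE machine, and
the names to_arr64_shortcut() and to_arr64_generic() are hypothetical.
The first models "copy words, clear the tail up to the 32-bit word
boundary", the second models clearing up to the 64-bit boundary.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 32u	/* assumed stand-in for BITS_PER_LONG on 32-bit */

/*
 * Hypothetical model of the shortcut: each 32-bit bitmap word is placed
 * into the matching half of a 64-bit element (LE layout), and unused
 * bits are cleared only inside the last 32-bit word that was copied.
 */
static void to_arr64_shortcut(uint64_t *buf, const uint32_t *bitmap,
			      unsigned int nbits)
{
	unsigned int words = (nbits + WORD_BITS - 1) / WORD_BITS;
	unsigned int i;

	for (i = 0; i < words; i++) {
		uint32_t w = bitmap[i];
		unsigned int shift = (i % 2) * WORD_BITS;

		/* Tail is cleared up to the 32-bit word boundary only. */
		if (i == words - 1 && nbits % WORD_BITS)
			w &= (1u << (nbits % WORD_BITS)) - 1;

		buf[i / 2] &= ~((uint64_t)0xffffffffu << shift);
		buf[i / 2] |= (uint64_t)w << shift;
	}
}

/*
 * Hypothetical model of the non-optimized path: build every 64-bit
 * element from up to two 32-bit words, then clear the tail up to the
 * 64-bit boundary.
 */
static void to_arr64_generic(uint64_t *buf, const uint32_t *bitmap,
			     unsigned int nbits)
{
	unsigned int words = (nbits + WORD_BITS - 1) / WORD_BITS;
	unsigned int elems = (nbits + 63) / 64;
	unsigned int e;

	for (e = 0; e < elems; e++) {
		uint64_t v = bitmap[2 * e];

		if (2 * e + 1 < words)
			v |= (uint64_t)bitmap[2 * e + 1] << 32;
		buf[e] = v;
	}

	if (nbits % 64)
		buf[elems - 1] &= ~0ULL >> (64 - nbits % 64);
}

/* Reproduces the nbits == 1 case from the report on a poisoned buffer. */
static void check_nbits_1(void)
{
	const uint32_t bitmap[2] = { 0xffffffffu, 0xffffffffu };
	uint64_t a = 0xa5a5a5a5a5a5a5a5ULL;
	uint64_t b = 0xa5a5a5a5a5a5a5a5ULL;

	to_arr64_shortcut(&a, bitmap, 1);
	to_arr64_generic(&b, bitmap, 1);

	assert(a == 0xa5a5a5a500000001ULL);	/* upper 4 bytes untouched */
	assert(b == 0x0000000000000001ULL);	/* what the test expects */
}
```

Calling check_nbits_1() from a small main() shows the shortcut leaving
the 0xa5 poison in the upper half, which matches the first message in
the report. When the last word copied lands in the upper half of an
element (for example nbits == 33), both variants agree in this model.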