Re: [linus:master] [btrfs] 5f58d783fd: xfstests.btrfs.172.fail

From: Anand Jain
Date: Wed Mar 22 2023 - 19:49:53 EST


On 17/03/2023 12:14, Anand Jain wrote:


On 3/17/23 09:17, kernel test rboot wrote:

Greeting,

FYI, we noticed xfstests.btrfs.172.fail due to commit (built with gcc-11):

commit: 5f58d783fd7823b2c2d5954d1126e702f94bfc4c ("btrfs: free device in btrfs_close_devices for a single device filesystem")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: xfstests
version: xfstests-i386-5a5e419-1_20220926
with following parameters:

    disk: 6HDD
    fs: btrfs
    test: btrfs-logwrites

test-description: xfstests is a regression test suite for xfs and other files ystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


we did this test on i386 and we found the issue doesn't always happen,
on both this commit and v6.3-rc2, it failed in about half of tests.
however, always clean on parent commit 519b7e13b5ae8dd3.

519b7e13b5ae8dd3 5f58d783fd7823b2c2d5954d112                    v6.3-rc2
---------------- --------------------------- ---------------------------
        fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
            |             |             |             |             |
            :14          23%           7:13          33% 10:19    xfstests.btrfs.172.fail
          14:14         -27%           6:13         -17% 9:19    xfstests.btrfs.172.pass



If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@xxxxxxxxx

2023-03-08 04:47:50 export TEST_DIR=/fs/sdb1
2023-03-08 04:47:50 export TEST_DEV=/dev/sdb1
2023-03-08 04:47:50 export FSTYP=btrfs
2023-03-08 04:47:50 export SCRATCH_MNT=/fs/scratch
2023-03-08 04:47:50 mkdir /fs/scratch -p
2023-03-08 04:47:50 export SCRATCH_DEV_POOL="/dev/sdb2 /dev/sdb3 /dev/sdb4 /dev/sdb5 /dev/sdb6"
2023-03-08 04:47:50 export LOGWRITES_DEV=/dev/sdb2
2023-03-08 04:47:50 export SCRATCH_DEV=/dev/sdb6
2023-03-08 04:47:50 unset SCRATCH_DEV_POOL
2023-03-08 04:47:50 sed "s:^:btrfs/:" //lkp/benchmarks/xfstests/tests/btrfs-logwrites
2023-03-08 04:47:50 ./check btrfs/291 btrfs/206 btrfs/196 btrfs/192 btrfs/190 btrfs/172
FSTYP         -- btrfs
PLATFORM      -- Linux/i686 lkp-hsw-d01 6.1.0-rc8-00284-g5f58d783fd78 #1 SMP Wed Mar  8 11:05:36 CST 2023
MKFS_OPTIONS  -- /dev/sdb6
MOUNT_OPTIONS -- /dev/sdb6 /fs/scratch

btrfs/172       [failed, exit status 1]- output mismatch (see /lkp/benchmarks/xfstests/results//btrfs/172.out.bad)
     --- tests/btrfs/172.out    2022-09-26 09:38:15.000000000 +0000
     +++ /lkp/benchmarks/xfstests/results//btrfs/172.out.bad 2023-03-08 04:47:54.134026209 +0000
     @@ -1,3 +1,5 @@
      QA output created by 172
      wrote 5242880/5242880 bytes at offset 0
      XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

     +_check_btrfs_filesystem: filesystem on /dev/sdb6 is inconsistent


Thank you for reporting it. I am currently looking into it. It appears
that the device is being freed after it's closed, which could indicate
that we overlooked something at close, just a hypothesis at this point.

-Anand




     +(see /lkp/benchmarks/xfstests/results//btrfs/172.full for details)


Hmm, I'm unable to reproduce the issue. However, it's possible that the
O_EXCL open for the 'btrfs check' command failed because 'systemd-udev'
was scanning the btrfs device at the same time. I noticed from the dmesg
that the 'systemd-udevd' thread was running [1] at some point. I'm not
entirely sure if it raced with the 'btrfs check' command to successfully
acquire the O_EXCL lock. If you could send me the 'testcase.full' and
'testcase.out.bad' logs from the system, I could verify the issue.

[1]
[ 337.769932][ T6408] BTRFS: device fsid 8dcaa8fb-b317-4e13-9e04-e9b63fe91948 devid 1 transid 6 /dev/sdb6 scanned by systemd-udevd (6408)


Could you please also attach the full log and out.bad file for
the test case in the report as shown below?

/lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.full
and
/lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.out.bad


Thanks, Anand

     ...
     (Run 'diff -u /lkp/benchmarks/xfstests/tests/btrfs/172.out /lkp/benchmarks/xfstests/results//btrfs/172.out.bad'  to see the entire diff)
btrfs/190        15s
btrfs/192        101s
btrfs/196        163s
btrfs/206        4s
btrfs/291       [not run] kernel btrfs isn't configured with verity support
Ran: btrfs/172 btrfs/190 btrfs/192 btrfs/196 btrfs/206 btrfs/291
Not run: btrfs/291
Failures: btrfs/172
Failed 1 of 6 tests




To reproduce:

         git clone https://github.com/intel/lkp-tests.git
         cd lkp-tests
         sudo bin/lkp install job.yaml           # job file is attached in this email
         bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
         sudo bin/lkp run generated-yaml-file

         # if come across any failure that blocks the test,
         # please remove ~/.lkp and /lkp dir to run from a clean state.