Re: [BUG] seltests/iommu: runaway ./iommufd consuming 99% CPU after a failed assert()

From: Mirsad Todorovac
Date: Sat Mar 23 2024 - 16:13:42 EST




On 3/19/24 14:58, Jason Gunthorpe wrote:
> On Tue, Mar 12, 2024 at 07:35:40AM +0100, Mirsad Todorovac wrote:
>> Hi,
>>
>> (This is verified on the second test box.)
>>
>> In the most recent 6.8.0 release of torvalds tree kernel with selftest configs on,
>> process ./iommufd appears to consume 99% of a CPU core for quite a while in an
>> endless loop:
>
> There is a "bug" in the kselftest framework where if you call a
> kselftest assertion from the setup/teardown it infinite loops
>
> The fix I know is to replace kselftest assertions with normal assert()
>
> But I don't see an obvious thing here saying you are hitting that..
>
> Jason

Hi,

I'm not deep enough into kselftest to attempt that fix myself.

Yet, with the v6.8-11743-ga4145ce1e7bc build, ./iommufd did not get stuck.
Instead, I got these 10 failed tests:

# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # enforce_dirty: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
# not ok 156 iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # set_dirty_tracking: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
# not ok 157 iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # device_dirty_capability: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
# not ok 158 iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
# not ok 159 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap_no_clear: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
# not ok 160 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
.
.
.
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # enforce_dirty: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
# not ok 166 iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # set_dirty_tracking: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
# not ok 167 iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # device_dirty_capability: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
# not ok 168 iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
# not ok 169 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap_no_clear: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
# not ok 170 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
.
.
.
# # FAILED: 170 / 180 tests passed.
# # Totals: pass:170 fail:10 xfail:0 xpass:0 skip:0 error:0
not ok 1 selftests: iommu: iommufd # exit=1

It seems like the same assertion failed in all 10 failed tests?

However, I am not smart enough to figure out why ...

Apparently, from the source, mmap() does not return a mapping at the requested address:

1746 assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0);
1747 vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE,
1748 mmap_flags, -1, 0);
→ 1749 assert(vrc == self->buffer);
1750

But I am not deep enough into the source to figure out what was intended and what went
wrong :-/
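
For reference, here is a minimal sketch of how the two ways that check can fail
could be told apart. The helper name map_at() and its return convention are
hypothetical and not part of iommufd.c, and the flags the selftest actually
passes in mmap_flags are not visible in the excerpt above, so the caller
supplies them here. mmap() reports failure by returning MAP_FAILED with errno
set; without MAP_FIXED it may also succeed at an address other than the hint.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from iommufd.c): try to map `size` bytes at
 * `want` with caller-supplied flags and report why the result differs.
 *
 * Returns  0 if the mapping landed exactly at `want`,
 *         -1 if mmap() itself failed (errno is printed),
 *          1 if mmap() succeeded but at a different address.
 */
static int map_at(void *want, size_t size, int flags)
{
	void *got;
	int err;

	assert(size > 0);

	got = mmap(want, size, PROT_READ | PROT_WRITE, flags, -1, 0);
	if (got == MAP_FAILED) {
		err = errno;
		fprintf(stderr, "mmap failed: %s (errno %d)\n",
			strerror(err), err);
		return -1;
	}
	if (got != want) {
		fprintf(stderr, "mmap returned %p, wanted %p\n", got, want);
		munmap(got, size);
		return 1;
	}
	return 0;
}
```

Calling something like this in place of the bare assert() would print the errno
value (for example ENOMEM) when the mapping itself fails, which should narrow
down what goes wrong in the _huge variants.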

Best regards,
Mirsad Todorovac