Re: Strange issues with epoll since 5.0

From: Eric Wong
Date: Sat Apr 27 2019 - 20:49:06 EST


Deepa Dinamani <deepa.kernel@xxxxxxxxx> wrote:
> I tried to replicate the failure on qemu.
> I do not see the failure with N=32.

> Does it work for N < 32?

Depends on the number of cores you have. I have 4 cores (8 threads
with HT), so I needed a lot of load on the machine to get it to
fail (it takes about 1 minute).

cmogstored is intended to run on machines that were already
saturated in CPU/memory from other processes, but not HDD I/O
bandwidth.

> Does any other signal work?

SIGCONT does, via:

perl -i -p -e 's/SIGURG/SIGCONT/g' `git ls-files`

> Are there any other architectures that fail?

I don't have other arches (well, 32-bit x86, but I've never
really tried cmogstored on that, even).

> Could you help me figure out how to run just the one test that is failing?

Just running one test won't trigger since it needs a busy
machine; but:

make test/mgmt_auto_adjust.log
(and "rm test/mgmt_auto_adjust.log" if you want to rerun)
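
A rough sketch of how the rerun-under-load loop could be scripted,
assuming plain POSIX sh. The helper names (run_until_fail, start_load,
stop_load, one_run) are made up for illustration and are not part of
cmogstored, and busy loops are only a stand-in for real load; how many
are needed depends on the machine:

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with cmogstored.

# run_until_fail MAX CMD...: rerun CMD until it fails or MAX attempts pass.
# Returns 0 if a failure was seen, 1 if every attempt succeeded.
run_until_fail() {
    max=$1; shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "failed on attempt $i"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no failure in $max attempts"
    return 1
}

# start_load N: spawn N busy loops to keep the CPUs occupied.
# Their PIDs are collected in LOAD_PIDS.
start_load() {
    LOAD_PIDS=
    n=$1
    while [ "$n" -gt 0 ]; do
        ( while :; do :; done ) &
        LOAD_PIDS="$LOAD_PIDS $!"
        n=$((n - 1))
    done
}

# stop_load: kill the busy loops started by start_load.
stop_load() {
    if [ -n "$LOAD_PIDS" ]; then
        kill $LOAD_PIDS 2>/dev/null
    fi
    LOAD_PIDS=
}

# one_run: remove the old log so make reruns the test, then run it.
one_run() {
    rm -f test/mgmt_auto_adjust.log && make test/mgmt_auto_adjust.log
}
```

After sourcing the above from the top of the cmogstored tree, something
like "start_load 16; run_until_fail 100 one_run; stop_load" would keep
rerunning the test until it fails. The 16 and 100 are guesses, not
numbers I have measured.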

Thanks for looking into this. FWIW, cmogstored uses epoll in
strange and uncommon ways, which has led to kernel bugfixes
in the past.