Re: [PATCH v5] erofs: add per-cpu threads for decompression as an option

From: Gao Xiang
Date: Thu Feb 23 2023 - 14:08:24 EST




On 2023/2/24 02:52, Gao Xiang wrote:
Hi Eric,

On 2023/2/24 02:29, Eric Biggers wrote:
Hi,

On Wed, Feb 08, 2023 at 05:33:22PM +0800, Gao Xiang wrote:
From: Sandeep Dhavale <dhavale@xxxxxxxxxx>

Using per-cpu thread pool we can reduce the scheduling latency compared
to workqueue implementation. With this patch scheduling latency and
variation is reduced as per-cpu threads are high priority kthread_workers.

The results were evaluated on arm64 Android devices running 5.10 kernel.

I see that this patch was upstreamed.  Meanwhile, commit c25da5b7baf1d
("dm verity: stop using WQ_UNBOUND for verify_wq") was also upstreamed.

Why is this more complex solution better than simply removing WQ_UNBOUND?

I do think it's a specific issue on specific arm64 hardwares (assuming
qualcomm, I don't know) since WQ_UNBOUND decompression once worked well
on the hardwares I once used (I meant Hisilicon, and most x86_64 CPUs,
I tested at that time) compared with per-cpu workqueue.

Also RT threads are also matchable with softirq approach.  In addition,
many configurations work without dm-verity.

Also for dm-verity use cases, EROFS will reuse the dm-verity context
directly rather than kick off a new context. Yet I'm not sure there
are still users using EROFS without dm-verity as I said above.

Anyway, the original scheduling issue sounds strange for me (with my
own landing experiences) in the beginning, and I have no way to
confirm the cases. Just hopefully it could be resolved from the
developer inputs and finally benefit to end users.

I've already did my own stress test with this new configuration as
well without explicit regression.


I don't have more time to dig into it for now but it's important to
resolve this problem on some arm64 hardwares first.  Also it's an
optional stuff, if the root cause of workqueue issue can be resolved,
we could consider drop it then.

Thsnka,
Gao Xiang


- Eric