Re: [PATCH] eventpoll: Fix priority inversion problem
From: Nam Cao
Date: Mon May 19 2025 - 05:52:43 EST
On Mon, May 19, 2025 at 11:25:51AM +0200, Florian Bezdeka wrote:
> Hi all,
>
> sorry for top-posting, but I think it makes sense in this case as I'm
> trying to merge different workstreams, likely working on the same
> problem showing up in different colors.
>
> Main goal is to make everybody aware of the other stream / patch
> series.
>
> We have colleagues from Bytedance working on non-RT performance
> optimizations related to CONFIG_CFS_BANDWIDTH at [1].
>
> This series came to light while searching for a solution for an RT
> lockup, reported at [2].
>
> We heavily tested [1] during the last month on RT and can report
> success now. In our tests we saw read-lock holder preemption only
> within the epoll interface. It might be that [1] fixes more potential
> issues in this regard.
>
> Today [3] (= the patch I'm replying to, see below) got posted:
> Linutronix is reworking the epoll infrastructure.
>
> I would love to learn how/if the combination of [1] and [3] fits
> together.
[1] fixes the stall problem involving the rw semaphore that epoll uses, but
it doesn't fix the possible priority inversion within epoll.

[3] fixes the priority inversion problem in epoll by no longer using the rw
semaphore, but it doesn't do anything about the rw semaphore itself.

So I propose we keep both.
Best regards,
Nam
> My understanding right now is that [1] fixes a CFS issue (throttling
> while holding a lock is not ideal on !RT, but might cause a critical
> lockup on RT), while [3] addresses a similar (RT) problem in epoll.
>
> Best regards,
> Florian
>
> [1] https://lore.kernel.org/all/20250409120746.635476-1-ziqianlu@xxxxxxxxxxxxx/
> [2] https://lore.kernel.org/linux-rt-users/xhsmhttqvnall.mognet@xxxxxxxxxxxxxxxxxxx/
> [3] https://lore.kernel.org/linux-rt-users/20250519074016.3337326-1-namcao@xxxxxxxxxxxxx/T/#u