On Tue, 24 Jun 2025 13:02:31 +0800
Lance Yang <lance.yang@xxxxxxxxx> wrote:
On 2025/6/24 11:53, Masami Hiramatsu (Google) wrote:
On Tue, 24 Jun 2025 09:44:55 +0800
Lance Yang <lance.yang@xxxxxxxxx> wrote:
On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote:
On Thu, 12 Jun 2025 12:19:25 +0800
Lance Yang <ioworker0@xxxxxxxxx> wrote:
From: Lance Yang <lance.yang@xxxxxxxxx>
When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
reader-owned rwsem can lead to false positives in blocker tracking.
To mitigate this, let’s try to clear the owner field on unlock, as a NULL
owner is better than a stale one for diagnostics.
Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and
remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS.
Thanks for the feedback! I see your point about the dependency ;)
Personlly, I'd perfer to keep them separate. The reasoning is that
they addreess two distinct things, and I think splitting them makes
this series clearer and easier to review ;)
Patch #1 focuses on "ownership tracking": Its only job is to make
the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned())
globally available when blocker tracking is enabled.
Patch #2, on the other hand, is about "reader-owner cleanup": It
introduces a functional change to the unlock path, trying to clear
the owner field for reader-owned rwsems.
But without clearing the owner, the owner information can be
broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is,
You're right, the owner info would be broken without the cleanup logic
in patch #2. But ...
I think those cannot be decoupled. For example, comparing the
result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are
enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the
result is different.
The actual blocker tracking for rwsems is only turned on in patch #3.
So, there's no case where the feature is active without the cleanup
logic already being in place.
Does this reasoning make sense to you?
Sorry, no. I think "reader-owner cleanup" is a part of "ownership
tracking" as DEBUG_RWSEMS does (and that keeps consistency of
the ownership tracking behavior same as DEBUG_RWSEM).
I thought this step-by-step approach was a bit cleaner, since there are
currently only two users for these owner helpers (DEBUG_RWSEMS and
DETECT_HUNG_TASK_BLOCKER).
I think the step-by-step approach fits better if the feature is evolving
(a working feature is already there.) I don't like the intermediate
state which does not work correctly, because if we have a unit test(
like kUnit) it should fail. If you can say "this finds the rwsem
owner as same as what the CONFIG_DEBUG_RWSEM is doing", it is simpler
to explain what you are doing, and easy to understand.