Re: [PATCH v2 2/2] sched/tracing: Add TASK_RTLOCK_WAIT to TASK_REPORT

From: Valentin Schneider
Date: Wed Jan 19 2022 - 13:39:04 EST


On 18/01/22 12:10, Eric W. Biederman wrote:
> Valentin Schneider <valentin.schneider@xxxxxxx> writes:
>>
>> Alternatively, TASK_RTLOCK_WAIT could be masqueraded as
>> TASK_(UN)INTERRUPTIBLE when reported to userspace - it is actually somewhat
>> similar, unlike TASK_IDLE vs TASK_UNINTERRUPTIBLE for instance. The
>> handling in get_task_state() will be fugly, but it might be preferable over
>> exposing a detail userspace might not need to be made aware of?
>
> Right.
>
> Frequently I have seen people do a cost/benefit analysis.
>
> If the benefit is enough, and tracking down the userspace programs that
> need to be verified to work with the change is inexpensive enough, the
> change is made. Always keeping in mind that if something was missed and
> the change causes a regression, the change will need to be reverted.
>
> If there is little benefit or the cost to track down userspace is great
> enough the work is put in to hide the change from userspace. Just
> because it is too much trouble to expose it to userspace.
>
> I honestly don't have any kind of sense about how hard it is to verify
> that a userspace regression won't result from a change like this. I
> just know that the question needs to be asked.
>

I see it as: does it actually make sense to expose a new state? All the
information it conveys is: "this task took a lock that is
substituted by a sleepable lock under PREEMPT_RT". Now that you've brought
this up, I don't really see much value in that vs just conveying that the
task is sleeping on a lock, i.e. just report the same as if it had gone
through rt_mutex_lock(), aka:

---
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d00837d12b9d..ac7b3eef4a61 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1626,6 +1626,14 @@ static inline unsigned int __task_state_index(unsigned int tsk_state,
 	if (tsk_state == TASK_IDLE)
 		state = TASK_REPORT_IDLE;

+	/*
+	 * We're lying here, but rather than expose a completely new task state
+	 * to userspace, we can make this appear as if the task had gone through
+	 * a regular rt_mutex_lock() call.
+	 */
+	if (tsk_state == TASK_RTLOCK_WAIT)
+		state = TASK_UNINTERRUPTIBLE;
+
 	return fls(state);
 }
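
To make the effect on the reported state concrete, here is a rough
standalone userspace sketch of the mapping with the hunk above applied. It
is not the kernel code: the *_sketch names are made up for illustration,
and the constant values and the "RSDTtXZPI" letter table are assumptions
modeled on what include/linux/sched.h looked like around v5.16.

```c
#include <assert.h>

/* Assumed values, modeled on include/linux/sched.h around v5.16. */
#define TASK_RUNNING		0x0000
#define TASK_INTERRUPTIBLE	0x0001
#define TASK_UNINTERRUPTIBLE	0x0002
#define __TASK_STOPPED		0x0004
#define __TASK_TRACED		0x0008
#define EXIT_DEAD		0x0010
#define EXIT_ZOMBIE		0x0020
#define TASK_PARKED		0x0040
#define TASK_NOLOAD		0x0400
#define TASK_RTLOCK_WAIT	0x1000

#define TASK_IDLE		(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)
#define TASK_REPORT		(TASK_RUNNING | TASK_INTERRUPTIBLE | \
				 TASK_UNINTERRUPTIBLE | __TASK_STOPPED | \
				 __TASK_TRACED | EXIT_DEAD | EXIT_ZOMBIE | \
				 TASK_PARKED)
#define TASK_REPORT_IDLE	(TASK_REPORT + 1)

/* Hypothetical stand-in for fls(): 1-based index of the highest set bit, 0 if none. */
static unsigned int fls_sketch(unsigned int x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Userspace sketch of __task_state_index() with the proposed hunk applied. */
static unsigned int task_state_index_sketch(unsigned int tsk_state,
					    unsigned int tsk_exit_state)
{
	unsigned int state = (tsk_state | tsk_exit_state) & TASK_REPORT;

	if (tsk_state == TASK_IDLE)
		state = TASK_REPORT_IDLE;

	/* Report an rtlock sleeper as if it had gone through rt_mutex_lock(). */
	if (tsk_state == TASK_RTLOCK_WAIT)
		state = TASK_UNINTERRUPTIBLE;

	return fls_sketch(state);
}

/* Assumed letter table: one character per report index. */
static char task_index_to_char_sketch(unsigned int index)
{
	static const char state_char[] = "RSDTtXZPI";

	assert(index < sizeof(state_char) - 1);
	return state_char[index];
}
```

Under those assumptions, task_state_index_sketch(TASK_RTLOCK_WAIT, 0)
yields index 2, which maps to 'D', the same letter as a task in
TASK_UNINTERRUPTIBLE. Without the extra check, the TASK_REPORT mask would
drop the TASK_RTLOCK_WAIT bit and the task would be reported at index 0,
i.e. 'R'.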