Issue in PI boosting code in __sched_setscheduler

From: Ronny Meeus
Date: Tue Mar 17 2015 - 16:11:16 EST


I'm using a patched kernel that I get from Monta-Vista; it is based on
the 3.10 kernel with some RT patches.
We ported an application from the pSOS RTOS to Linux using
Xenomai-Mercury (a library that maps pSOS tasks to POSIX threads).

One of the patches applied to our kernel is:
"[PATCH RT 3/4] sched: Consider pi boosting in setscheduler" (see
https://lkml.org/lkml/2012/12/22/77).
I see that this code is also present in today's mainline kernel (for
example in 3.19).

We have several threads running in the real-time priority domain.
ThreadA: running at prio -33.
ThreadB: running at prio -35.

ThreadA obtains a PI protected mutex and continues to execute code in
the critical section.
ThreadB tries to obtain the same mutex and this makes the kernel boost
the priority of ThreadA to -35.

While holding the lock, ThreadA changes its priority to -99 to
implement a critical section (Xenomai internals). After a short
period, the latter critical section is left and the call to lower the
priority to its original one (-33) is issued to the kernel.

I would expect that at this moment the priority is lowered to -35,
since this is the priority of the thread waiting on the mutex (ThreadB),
but instead the priority is not changed and stays at -99. This is
caused by the patch mentioned above; the relevant part of the code is
copied below.
Since the critical section protected by the mutex takes rather long,
we start to miss important events processed by higher priority threads.

If I disable the code introduced by the patch, the events are not missed.
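
To make the sequence concrete, here is a rough user-space model of what
I think happens. It is not kernel code: the struct, the function names
and the exact condition of the shortcut are my own assumptions, based
on my reading of the comment in the code copied at the end of this
mail. Priorities use the notation of this mail (a more negative value
is a higher priority).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a task; none of these names exist in the kernel. */
struct model_task {
    int  normal_prio;     /* priority requested via setscheduler */
    int  effective_prio;  /* priority the scheduler actually uses */
    bool has_waiter;      /* a thread blocks on a PI mutex we hold */
    int  top_waiter_prio; /* priority of that waiter, if any */
};

/* Returns the higher of two priorities (the more negative value). */
static int higher_prio(int a, int b)
{
    return a < b ? a : b;
}

/* A waiter shows up on a PI mutex held by t: boost t if needed. */
static void model_add_waiter(struct model_task *t, int waiter_prio)
{
    t->has_waiter = true;
    t->top_waiter_prio = waiter_prio;
    t->effective_prio = higher_prio(t->effective_prio, waiter_prio);
}

/*
 * Assumed shortcut: if a waiter with at least the new priority exists,
 * only the normal priority is stored and the effective one is left alone.
 */
static void model_setscheduler(struct model_task *t, int new_prio)
{
    t->normal_prio = new_prio;
    if (t->has_waiter && t->top_waiter_prio <= new_prio)
        return; /* shortcut taken: effective_prio is not recomputed */
    t->effective_prio = t->has_waiter
        ? higher_prio(new_prio, t->top_waiter_prio)
        : new_prio;
}

/* Replays the ThreadA sequence and returns its final effective priority. */
static int model_run_scenario(void)
{
    struct model_task a = { .normal_prio = -33, .effective_prio = -33 };

    model_add_waiter(&a, -35);    /* ThreadB blocks on the mutex */
    assert(a.effective_prio == -35);

    model_setscheduler(&a, -99);  /* enter the short critical section */
    assert(a.effective_prio == -99);

    model_setscheduler(&a, -33);  /* leave it: the shortcut is taken */
    return a.effective_prio;
}
```

Under these assumptions model_run_scenario() returns -99, which matches
what I observe: the last call stores -33 as the normal priority but
never touches the effective one.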

My question about this behavior: in my view it is not correct to
keep the thread at the higher priority and "assume" that the critical
section will not take much longer.
I think the only correct solution is to lower the priority of the
calling thread to the higher of "the new priority (-33)" and
"the priority of the highest-priority task waiting on the mutex (-35)".
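
As a small sketch of the rule I have in mind (the function name and
parameters are made up for illustration, and this is plain user-space C
rather than a kernel patch), using the priority notation of this mail
where a more negative value is a higher priority:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: the priority I would expect the task to run at
 * after a setscheduler call while it may still hold a PI mutex.
 */
static int expected_effective_prio(int new_prio, bool has_waiter,
                                   int top_waiter_prio)
{
    if (!has_waiter)
        return new_prio;
    /* The higher priority is the more negative value. */
    return top_waiter_prio < new_prio ? top_waiter_prio : new_prio;
}

/* The cases from this mail, written down as checks. */
static void expected_prio_examples(void)
{
    /* Back to -33 while ThreadB (-35) waits: stay boosted at -35. */
    assert(expected_effective_prio(-33, true, -35) == -35);
    /* Raising to -99 while ThreadB waits: run at -99. */
    assert(expected_effective_prio(-99, true, -35) == -99);
    /* No waiter at all: simply use the new priority. */
    assert(expected_effective_prio(-33, false, 0) == -33);
}
```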

Thanks.

Best regards,
Ronny


static int __sched_setscheduler(struct task_struct *p,
                                const struct sched_attr *attr,
                                bool user)

[...]

        /*
         * Special case for priority boosted tasks.
         *
         * If the new priority is lower or equal (user space view)
         * than the current (boosted) priority, we just store the new
         * normal parameters and do not touch the scheduler class and
         * the runqueue. This will be done when the task deboost
         * itself.
         */
        if (rt_mutex_check_prio(p, newprio)) {
                __setscheduler_params(p, attr);
                task_rq_unlock(rq, p, &flags);
                return 0;
        }