Re: [PATCH] locking/mutex: Mark racy reads of owner->on_cpu

From: Waiman Long
Date: Thu Dec 02 2021 - 10:47:18 EST


On 12/2/21 06:53, Marco Elver wrote:
On Thu, 2 Dec 2021 at 11:13, Marco Elver <elver@xxxxxxxxxx> wrote:
One of the more frequent data races reported by KCSAN is the racy read
in mutex_spin_on_owner(), which is usually reported as "race of unknown
origin" without showing the writer. This is due to the racing write
occurring in kernel/sched. Locally enabling KCSAN in kernel/sched shows:

| write (marked) to 0xffff97f205079934 of 4 bytes by task 316 on cpu 6:
| finish_task kernel/sched/core.c:4632 [inline]
| finish_task_switch kernel/sched/core.c:4848
| context_switch kernel/sched/core.c:4975 [inline]
| __schedule kernel/sched/core.c:6253
| schedule kernel/sched/core.c:6326
| schedule_preempt_disabled kernel/sched/core.c:6385
| __mutex_lock_common kernel/locking/mutex.c:680
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| read to 0xffff97f205079934 of 4 bytes by task 322 on cpu 3:
| mutex_spin_on_owner kernel/locking/mutex.c:370
| mutex_optimistic_spin kernel/locking/mutex.c:480
| __mutex_lock_common kernel/locking/mutex.c:610
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| value changed: 0x00000001 -> 0x00000000

This race is clearly intentional, and the potential for miscompilation
is slim due to surrounding barrier() and cpu_relax(), and the value
being used as a boolean.

Nevertheless, marking this reader would more clearly denote intent and
make it obvious that concurrency is expected. Use READ_ONCE() to avoid
having to reason about compiler optimizations now and in future.

Similarly, mark the read to owner->on_cpu in mutex_can_spin_on_owner(),
which immediately precedes the loop executing mutex_spin_on_owner().

Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
[...]

Kefeng kindly pointed out that there is an alternative, which would
refactor owner_on_cpu() from rwsem that would address both mutex and
rwsem:
https://lore.kernel.org/all/b641f1ea-6def-0fe4-d273-03c35c4aa7d6@xxxxxxxxxx/

Preferences?

I would like to see owner_on_cpu() extracted out from kernel/locking/rwsem.c into include/linux/sched.h right after vcpu_is_preempted(), for instance, and with READ_ONCE() added. Then it can be used in mutex.c as well. This problem is common to both mutex and rwsem.

Cheers,
Longman

Thanks,
-- Marco