RE: [PATCH RT] nvdimm: make lane acquirement RT aware

From: Liu, Yongxin
Date: Sun Mar 10 2019 - 20:45:52 EST



> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx [mailto:linux-kernel-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Sebastian Andrzej Siewior
> Sent: Friday, March 8, 2019 17:42
> To: Liu, Yongxin
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-rt-users@xxxxxxxxxxxxxxx;
> tglx@xxxxxxxxxxxxx; rostedt@xxxxxxxxxxx; dan.j.williams@xxxxxxxxx;
> pagupta@xxxxxxxxxx; Gortmaker, Paul; linux-nvdimm@xxxxxxxxxxxx
> Subject: Re: [PATCH RT] nvdimm: make lane acquirement RT aware
>
> On 2019-03-08 00:07:41 [+0000], Liu, Yongxin wrote:
> > The lane is critical resource which needs to be protected. One CPU can
> use only one
> > lane. If CPU number is greater than the number of total lane, the lane
> can be shared
> > among CPUs.
> >
> > In non-RT kernel, get_cpu() disable preemption by calling
> preempt_disable() first.
> > Only one thread on the same CPU can get the lane.
> >
> > In RT kernel, if we only use raw_smp_processor_id(), this doesn't
> protect the lane.
> > Thus two threads on the same CPU can get the same lane at the same time.
> >
> > In this patch, two-level lock can avoid race condition for the lane.
>
> but you still have the ndl_lock->lock which protects the resource. So in
> the unlikely (but possible event) that you switch CPUs after obtaining
> the CPU number you block on the lock. No harm is done, right?

The resource "lane" can be acquired recursively, so "ndl_lock->lock" is a conditional lock.

ndl_count->count is per CPU.
ndl_lock->lock is per lane.

Here is an example:
Thread A on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> get "ndl_lock->lock"
--> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" due to "ndl_count->count++".

Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" ("ndl_count->count"
was changed by Thread A)

If we use raw_smp_processor_id(), no matter which CPU the thread was migrated to,
if there is another thread running on the old CPU, there will be race condition
due to per CPU variable "ndl_count->count".


Thanks,
Yongxin

>
> Sebastian