Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context

From: Oleg Nesterov
Date: Mon Dec 10 2012 - 13:16:57 EST


On 12/10, Srivatsa S. Bhat wrote:
>
> On 12/10/2012 02:27 AM, Oleg Nesterov wrote:
> > On 12/07, Srivatsa S. Bhat wrote:
> >>
> >> 4. No deadlock possibilities
> >>
> >> Per-cpu locking is not the way to go if we want to have relaxed rules
> >> for lock-ordering. Because, we can end up in circular-locking dependencies
> >> as explained in https://lkml.org/lkml/2012/12/6/290
> >
> > OK, but this assumes that, contrary to what Steven said, read-write-read
> > deadlock is not possible when it comes to rwlock_t.
>
> What I meant is, with a single (global) rwlock, you can't deadlock like that.

Ah. I greatly misunderstood Steven's email,

http://marc.info/?l=linux-pm&m=135482212307876

Somehow I didn't notice he described the deadlock with _two_ rwlocks, I
wrongly thought that his point was that read_lock() is not recursive (like
down_read).

> Let me know if my assumptions are incorrect!

No, sorry, I misunderstood Steven.


> > However. If this is true, then compared to preempt_disable/stop_machine
> > livelock is possible. Probably this is fine, we have the same problem with
> > get_online_cpus(). But if we can accept this fact I feel we can simplify
> > this somehow... Can't prove, only feel ;)
>
> Not sure I follow..

I meant that write_lock_irqsave(&hotplug_rwlock) in take_cpu_down()
can spin "forever".

Suppose that reader_acked() == T on every CPU, so that
get_online_cpus_atomic() always takes read_lock(&hotplug_rwlock).

It is possible that this lock will never be released by readers,

	CPU_0                           CPU_1

	get_online_cpus_atomic()
	                                get_online_cpus_atomic()
	put_online_cpus_atomic()

	get_online_cpus_atomic()
	                                put_online_cpus_atomic()

	                                get_online_cpus_atomic()
	put_online_cpus_atomic()

and so on.
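
To make the interleaving above concrete, here is a toy userspace model
(a sketch only, not kernel code). It replays the get/put pattern for two
CPUs and tracks how many readers hold the lock. The names struct step,
NR_CPUS_SIM and min_readers_seen() are made up for illustration, and the
rwlock itself is reduced to a plain reader count, on the assumption that
a writer can get in only when that count reaches zero.

```c
#include <assert.h>

#define NR_CPUS_SIM 2	/* hypothetical: two CPUs, as in the diagram */

struct step {
	int cpu;	/* which CPU acts */
	int delta;	/* +1 models get_online_cpus_atomic(), -1 models put_online_cpus_atomic() */
};

/*
 * Replay 'rounds' repetitions of the overlapping pattern and return the
 * smallest reader count ever observed. A writer spinning in write_lock()
 * could only succeed at a moment when this count is 0.
 */
static int min_readers_seen(int rounds)
{
	/* Steady-state cycle once CPU_0 holds the read side:
	 * CPU_1 get, CPU_0 put, CPU_0 get, CPU_1 put. */
	static const struct step cycle[] = {
		{ 1, +1 }, { 0, -1 }, { 0, +1 }, { 1, -1 },
	};
	int held[NR_CPUS_SIM] = { 1, 0 };	/* CPU_0 did the first get */
	int readers = 1;
	int min = readers;
	int r;
	unsigned int i;

	for (r = 0; r < rounds; r++) {
		for (i = 0; i < sizeof(cycle) / sizeof(cycle[0]); i++) {
			const struct step *s = &cycle[i];

			held[s->cpu] += s->delta;
			/* no nesting in this model, and no put without a get */
			assert(held[s->cpu] == 0 || held[s->cpu] == 1);

			readers += s->delta;
			if (readers < min)
				min = readers;
		}
	}
	return min;
}
```

However many rounds are replayed, the count never drops below one, so
in this model the writer never sees a free lock even though each
individual read section is short.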


> Reader-side:
> -> read_lock() your per-cpu rwlock and proceed.
>
> Writer-side:
> -> for_each_online_cpu(cpu)
>        write_lock(per-cpu rwlock of 'cpu');

Yes, yes, this is clear.
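
Just to spell out the scheme quoted above, a rough userspace sketch
follows. It is not the proposed kernel implementation: the toy_rwlock
type, NR_CPUS_SIM and all function names are hypothetical, C11 atomics
stand in for rwlock_t, and the operations are trylock variants (with the
writer backing out on failure) only so that the behaviour can be checked
from a single thread. The quoted writer side would spin in write_lock()
instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SIM 4	/* hypothetical CPU count for this model */

/* Toy rwlock: state > 0 is the reader count, 0 is free, -1 is write-locked. */
struct toy_rwlock {
	atomic_int state;
};

static struct toy_rwlock percpu_lock[NR_CPUS_SIM];	/* zero-initialized: all free */

static bool toy_read_trylock(struct toy_rwlock *l)
{
	int old = atomic_load(&l->state);

	while (old >= 0) {
		/* on failure 'old' is refreshed and the writer check repeats */
		if (atomic_compare_exchange_weak(&l->state, &old, old + 1))
			return true;
	}
	return false;	/* a writer holds this lock */
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	atomic_fetch_sub(&l->state, 1);
}

static bool toy_write_trylock(struct toy_rwlock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

static void toy_write_unlock(struct toy_rwlock *l)
{
	atomic_store(&l->state, 0);
}

/* Reader side: touch only the lock of the CPU we are running on. */
static bool reader_trylock(int cpu)
{
	return toy_read_trylock(&percpu_lock[cpu]);
}

static void reader_unlock(int cpu)
{
	toy_read_unlock(&percpu_lock[cpu]);
}

/* Writer side: take every per-cpu lock in CPU order, back out on failure. */
static bool writer_trylock_all(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SIM; cpu++) {
		if (!toy_write_trylock(&percpu_lock[cpu])) {
			while (--cpu >= 0)
				toy_write_unlock(&percpu_lock[cpu]);
			return false;
		}
	}
	return true;
}

static void writer_unlock_all(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SIM; cpu++)
		toy_write_unlock(&percpu_lock[cpu]);
}
```

The reader fast path touches only its own CPU's lock, while the writer
pays for a walk over all of them. A single reader on any CPU is enough
to keep the writer out.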

> Also, like Tejun said, one of the important measures for per-cpu rwlocks
> should be that, if a user replaces global rwlocks with percpu rwlocks (for
> performance reasons), he shouldn't suddenly end up in numerous deadlock
> possibilities which never existed before. The replacement should continue to
> remain safe, and perhaps improve the performance.

Sure, I agree.

Oleg.
