Re: smp_call_function_single lockups

From: Frederic Weisbecker
Date: Wed Apr 01 2015 - 10:22:48 EST


On Wed, Feb 11, 2015 at 12:42:10PM -0800, Linus Torvalds wrote:
> [ Added Frederic to the cc, since he's touched this file/area most ]
>
> On Wed, Feb 11, 2015 at 11:59 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > So the caller has a really hard time guaranteeing that CSD_LOCK isn't
> > set. And if the call is done in interrupt context, for all we know it
> > is interrupting the code that is going to clear CSD_LOCK, so CSD_LOCK
> > will never be cleared at all, and csd_lock() will wait forever.
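> >

(If I'm reading this right, the bad interleaving is something like the
sketch below -- the function names are real, the timeline is made up:)

	/* CPU X, process context: running a queued csd */
	csd->func(csd->info);
		/* <- IRQ arrives here, before csd_unlock(csd) runs */

	/* CPU X, hard irq context */
	csd_lock(csd);
		/*
		 * csd_lock_wait() spins on CSD_FLAG_LOCK forever: the
		 * csd_unlock() that would clear it is in the very code
		 * we just interrupted.
		 */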
> >
> > So I actually think that for the async case, we really *should* unlock
> > before doing the callback (which is what Thomas' old patch did).
> >
> > And we might well be better off doing something like
> >
> > WARN_ON_ONCE(csd->flags & CSD_LOCK);
> >
> > in smp_call_function_single_async(), because that really is a hard requirement.
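> >

(Something like the sketch below, I suppose -- untested, written against
the patch further down, and assuming ->func/->info are pre-set by the
caller, as smp_call_function_single_async() already requires:)

	int smp_call_function_single_async(int cpu, struct call_single_data *csd)
	{
		int err = 0;

		preempt_disable();

		/* We could deadlock if we had to wait here with irqs disabled! */
		if (WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK))
			csd_lock_wait(csd);

		/* The caller now owns the csd until the callback unlocks it */
		csd->flags = CSD_FLAG_LOCK;
		smp_wmb();

		err = generic_exec_single(cpu, csd, csd->func, csd->info);
		preempt_enable();

		return err;
	}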
> >
> > And it strikes me that hrtick_csd is one of these cases that do this
> > with interrupts disabled, and use the callback for serialization. So I
> > really wonder if this is part of the problem..
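> >

(For reference, the hrtick path in kernel/sched/core.c looks roughly like
this, trimmed -- the csd callback is what clears hrtick_csd_pending, so
the callback really is the serialization point:)

	static void __hrtick_start(void *arg)
	{
		struct rq *rq = arg;

		raw_spin_lock(&rq->lock);
		__hrtick_restart(rq);
		rq->hrtick_csd_pending = 0;	/* allow the next async send */
		raw_spin_unlock(&rq->lock);
	}

	void hrtick_start(struct rq *rq, u64 delay)
	{
		...
		if (rq == this_rq()) {
			__hrtick_restart(rq);
		} else if (!rq->hrtick_csd_pending) {
			smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
			rq->hrtick_csd_pending = 1;
		}
	}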
> >
> > Thomas? Am I missing something?
>
> Ok, this is a more involved patch than I'd like, but making the
> *caller* do all the CSD maintenance actually cleans things up.
>
> And this is still completely untested, and may be entirely buggy. What
> do you guys think?
>
> Linus

> kernel/smp.c | 78 ++++++++++++++++++++++++++++++++++++------------------------
> 1 file changed, 47 insertions(+), 31 deletions(-)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index f38a1e692259..2aaac2c47683 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -19,7 +19,7 @@
>
>  enum {
>  	CSD_FLAG_LOCK		= 0x01,
> -	CSD_FLAG_WAIT		= 0x02,
> +	CSD_FLAG_SYNCHRONOUS	= 0x02,
>  };
>
>  struct call_function_data {
> @@ -107,7 +107,7 @@ void __init call_function_init(void)
>   */
>  static void csd_lock_wait(struct call_single_data *csd)
>  {
> -	while (csd->flags & CSD_FLAG_LOCK)
> +	while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
>  		cpu_relax();
>  }
>
> @@ -121,19 +121,17 @@ static void csd_lock(struct call_single_data *csd)
>  	 * to ->flags with any subsequent assignments to other
>  	 * fields of the specified call_single_data structure:
>  	 */
> -	smp_mb();
> +	smp_wmb();
>  }
>
>  static void csd_unlock(struct call_single_data *csd)
>  {
> -	WARN_ON((csd->flags & CSD_FLAG_WAIT) && !(csd->flags & CSD_FLAG_LOCK));
> +	WARN_ON(!(csd->flags & CSD_FLAG_LOCK));
>
>  	/*
>  	 * ensure we're all done before releasing data:
>  	 */
> -	smp_mb();
> -
> -	csd->flags &= ~CSD_FLAG_LOCK;
> +	smp_store_release(&csd->flags, 0);
>  }
>
>  static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_single_data, csd_data);
> @@ -144,13 +142,16 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_single_data, csd_data);
>   * ->func, ->info, and ->flags set.
>   */
>  static int generic_exec_single(int cpu, struct call_single_data *csd,
> -			       smp_call_func_t func, void *info, int wait)
> +			       smp_call_func_t func, void *info)
>  {
> -	struct call_single_data csd_stack = { .flags = 0 };
> -	unsigned long flags;
> -
> -
>  	if (cpu == smp_processor_id()) {
> +		unsigned long flags;
> +
> +		/*
> +		 * We can unlock early even for the synchronous on-stack case,
> +		 * since we're doing this from the same CPU..
> +		 */
> +		csd_unlock(csd);
>  		local_irq_save(flags);
>  		func(info);
>  		local_irq_restore(flags);
> @@ -161,21 +162,9 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
>  	if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))
>  		return -ENXIO;
>
> -
> -	if (!csd) {
> -		csd = &csd_stack;
> -		if (!wait)
> -			csd = this_cpu_ptr(&csd_data);
> -	}
> -
> -	csd_lock(csd);
> -
>  	csd->func = func;
>  	csd->info = info;
>
> -	if (wait)
> -		csd->flags |= CSD_FLAG_WAIT;
> -
>  	/*
>  	 * The list addition should be visible before sending the IPI
>  	 * handler locks the list to pull the entry off it because of
> @@ -190,9 +179,6 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
>  	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
>  		arch_send_call_function_single_ipi(cpu);
>
> -	if (wait)
> -		csd_lock_wait(csd);
> -
>  	return 0;
>  }
>
> @@ -250,8 +236,17 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
>  	}
>
>  	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
> -		csd->func(csd->info);
> -		csd_unlock(csd);
> +		smp_call_func_t func = csd->func;
> +		void *info = csd->info;
> +
> +		/* Do we wait until *after* callback? */
> +		if (csd->flags & CSD_FLAG_SYNCHRONOUS) {
> +			func(info);
> +			csd_unlock(csd);
> +		} else {
> +			csd_unlock(csd);
> +			func(info);
> +		}
>  	}

Just to clarify: the kind of lockup this is expected to fix is the case
where the IPI is resent from the IPI handler itself, right?
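
I.e. roughly this pattern -- callback and csd names made up, just to
illustrate the question:

	/* hypothetical example, not from the tree */
	static void my_func(void *info)
	{
		/*
		 * Re-send from inside the IPI handler: with the old code
		 * the csd is still locked here (csd_unlock() only runs
		 * after this callback returns), so the csd_lock() in
		 * generic_exec_single() would spin on ourselves forever.
		 */
		smp_call_function_single_async(some_cpu, &my_csd);
	}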

Thanks.