Re: [RFC][PATCH] irq_work

From: Peter Zijlstra
Date: Fri Jun 25 2010 - 04:17:09 EST


On Fri, 2010-06-25 at 10:12 +0800, Huang Ying wrote:
>
> It is better to add "void *data" field in this struct to allow same
> function can be used for multiple struct irq_work.

No, simply do:

struct my_foo {
struct irq_work work;
/* my extra data */
}

void my_func(struct irq_work *work)
{
struct my_foo *foo = container_of(work, struct my_foo, work);

/* tada! */
}


> And I think IRQ is the implementation detail here, so irq_work is
> probably not a good name. nmi_return_notifier or nmi_callback is better?

Well, its ran in hard-irq context, so its an irq work. There's nothing
that says it can only be used from NMI context.

> > +void irq_work_run(void)
> > +{
> > + struct irq_work *list;
> > +
> > + list = xchg(&__get_cpu_var(irq_work_list), CALLBACK_TAIL);
> > + while (list != CALLBACK_TAIL) {
> > + struct irq_work *entry = list;
> > +
> > + list = list->next;
> > + entry->func(entry);
> > +
> > + entry->next = NULL;
>
> entry->next = NULL should be put before entry->func(entry), so that we
> will not lose a notification from NMI. And maybe check irq_work_list for
> several times to make sure nothing is lost.

But then _sync() will return before its done executing.

I think clearing after the function is done executing is the only sane
semantics (and yes, I should fix the current perf code).

You can always miss an NMI since it can always happen before the
callback gets done, and allowing another enqueue before the callback is
complete is asking for trouble.

> > + /*
> > + * matches the mb in cmpxchg() in irq_work_queue()
> > + */
> > + smp_wmb();
> > + }
> > +}
>
> I don't know why we need smp_wmb() here and smp_rmb() in
> irq_work_pending(). The smp_<x>mb() in original perf_pending_xxx code is
> not necessary too. Because smp_<x>mb is invoked in wake_up_process() and
> __wait_event() already.

The smp_wmb() wants to be before ->next = NULL; so that all writes are
completed before we release the entry. To that same effect _sync() and
_queue need the (r)mb.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/