Re: [PATCH v5 3/4] kvm: Create kvm_clear_irq()

From: Gleb Natapov
Date: Wed Jul 18 2012 - 06:53:15 EST


On Wed, Jul 18, 2012 at 01:51:05PM +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 01:36:08PM +0300, Gleb Natapov wrote:
> > On Wed, Jul 18, 2012 at 01:33:35PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> > > > On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > > > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > > > > > > _Seems_ racy, or _is_ racy? Please identify the race.
> > > > > > >
> > > > > > > Look at this:
> > > > > > >
> > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > > int irq_source_id, int level)
> > > > > > > {
> > > > > > > /* Logical OR for level trig interrupt */
> > > > > > > if (level)
> > > > > > > set_bit(irq_source_id, irq_state);
> > > > > > > else
> > > > > > > clear_bit(irq_source_id, irq_state);
> > > > > > >
> > > > > > > return !!(*irq_state);
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > Now:
> > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > >
> > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > and interrupt will get cleared.
> > > > > > >
> > > > > > This will hardly happen on x86 especially since bit is set with
> > > > > > serialized instruction.
> > > > >
> > > > > Probably. But it does make me a bit uneasy. Why don't we pass
> > > > > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > > > > kvm_irq_line_state to under pic_lock/ioapic_lock? We can then use
> > > > > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering simpler
> > > > > and saving an atomic op in the process.
> > > > >
> > > > With my patch I do not see why we can't change them to unlocked variant
> > > > without moving them anywhere. The only requirement is to not use RMW
> > > > sequence to set/clear bits. The ordering of setting does not matter. The
> > > > ordering of reading is.
> > >
> > > You want to use __set_bit/__clear_bit on the same word
> > > from multiple CPUs, without locking?
> > > Why won't this lose information?
> > Because it is not RMW. If it is then yes, you can't do that.
>
> You are saying __set_bit does not do RMW on x86? Interesting.
I think it doesn't.

> It's probably not a good idea to rely on this I think.
>
The code is no in arch/x86 so probably no. Although it is used only on
x86 (and ia64 which has broken kvm anyway).

> > >
> > > In any case, it seems simpler and safer to do accesses under lock
> > > than rely on specific use.
> > >
> > > > --
> > > > Gleb.
> >
> > --
> > Gleb.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/