Re: dealing with barriers (was Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation)

From: Nick Piggin
Date: Thu Nov 01 2007 - 08:39:05 EST


On Thursday 01 November 2007 20:51, Stefan Richter wrote:
> Nick Piggin wrote:
> > On Thursday 01 November 2007 12:49, Stefan Richter wrote:
> >> fw_device.node_id and fw_device.generation are accessed without mutexes.
> >> We have to ensure that all readers will get to see node_id updates
> >> before generation updates.
> >
> > Hi, a few points:
>
> Thanks, this is appreciated.
>
> > - can change it to use spinlocks instead? This would be most
> > preferable.
>
> It could be done, but I don't find it preferable for this purpose.
> Although the struct members in question are basically part of fw-core's
> API (kernel-internal API), there won't be a lot of places which access
> these members.
>
> (Besides, lockless algorithms are essential to FireWire driver code
> everywhere where CPU and controller interact. So IMO these lockless
> accesses don't constitute a whole new level of complexity in these
> drivers.)

Fair enough.


> > - if not, you need comments.
>
> So far I tended to disagree every time when checkpatch.pl bugged me
> about barriers without comments. But maybe that was because I had a
> wrong idea of what kind of comment should go there. There would
> certainly be no value in a comment which effectively says "trust me, I
> know we need a barrier here". However, thinking about it again, it
> occurs to me that I should add the following comments:
>
> 1.) At respective struct type definitions: A comment on how struct
> members are to be accessed and why.

That's something that can be really helpful I think, yes.


> 2.) At the occurrences of rmb() and wmb(): A comment on which
> accesses the particular barrier divides. This is in order to get
> the barriers properly updated (i.e. moved or removed) when
> surrounding code is reordered, APIs reworked, or whatever.

Yes, this is the main thing to comment at the actual site of the barirer.
Although it isn't always 100% clear with locks either (because you might
do some non-critical operatoins inside a lock section), it just tends to
be clearer what it is being locked, and what other paths are being locked
against.

So describing what accesses are being ordered is good. Also a brief
description of how the lockless read-side works, if it's not obvious
(or you've already described that in your #1).


> Of course this division of the Why and the What only applies to accesses
> like those in the patch --- i.e. API-like data types which aren't
> abstracted by accessor functions --- while there may be other
> requirements on comments for other usage types of barriers.
>
> Or am I still wrong in my judgment of how barriers should be documented?

I think what you've said sounds good. I think that even memory barriers
within some subsystem can use commenting, even if only a line or two.


> > - you also seem to be missing rmb()s now. I see a couple in the
> > firewire directory, but nothing that seems to be ordering loads
> > of these particular fields.
>
> That's right. Of course the wmb()s don't make sense if the reader side
> isn't fixed up accordingly. I did this for all places which I spotted
> yesterday in the patches
> "firewire: fw-core: react on bus resets while the config ROM is being
> fetched", http://lkml.org/lkml/2007/10/31/464 (Hmm, this has an unclear
> changelog.)
> "firewire: fw-sbp2: enforce read order of device generation and node
> ID", http://lkml.org/lkml/2007/10/31/465

Ah, right, didn't see them. Thanks, that looks more useful now ;)


> I didn't fold all these patches into one because these two other patches
> include also other changes to the respective read sides. I should have
> pointed this out in this first patch.

Sure. I might personally still separate out the memory ordering fix
and put all the rmb and wmb in a single patch, but that's no big
deal.


> Also, it could be that I have overlooked one more reader last night; I
> will reexamine it.
>
> > - use smp_*mb() if you are just ordering regular cacheable RAM
> > accesses.
>
> Hmm, right. Looks like the smp_ variants are indeed the right thing to
> do here.

OK.


> > Also, diffstat is a bit wrong... maybe you posted the wrong version?
>
> Oops, this happened when I worked on what became the "fw-core: react on
> bus resets..." patch. So the diffstat is wrong but...

Ah, that's no problem. I manage to screw up patches in much worse
ways than this, don't worry ;)

Thanks,
Nick
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/