Re: [PATCH 1/3] relay: Fix 4 off-by-one errors occuring whenwriting to a CPU buffer.

From: Tom Zanussi
Date: Mon Jun 16 2008 - 01:22:43 EST



On Sat, 2008-06-14 at 17:52 +0300, Eduard - Gabriel Munteanu wrote:
> On Fri, 13 Jun 2008 23:40:37 -0500
> Tom Zanussi <tzanussi@xxxxxxxxx> wrote:
>
> > I'm wondering if the all-zeroes at the end of the buffer might be
> > another case of the all-zeroes you were seeing due to cross-cpu
> > reading you decribed in the other patch. In any case, I'm pretty
> > sure this patch isn't doing what you think it is, and don't see how
> > it could have fixed the problem (see below). There may still be a
> > bug somewhere, but it would be good to be able to reproduce it. Does
> > it happen even when running on a single cpu?
>
> Hi,
>
> I noticed this problem after adding those spinlocks. As far as I can
> tell, having (offset == subbuf_size + 1) at any given moment allows the
> read() handler to see inconsistent offsets:
> 1. writer sets offset = subbuf_size + 1
> 2. writer releases spinlock
> 3. read() acquires spinlock and reads the wrong offset
> 4. read() releases spinlock
> 5. next writer corrects the offset at the next write
>
> > This case, offset being 1 larger than the subbuf size, is how we note
> > a full sub-buffer, so changing this will break full-subbuffer cases.
>
> No, it won't. Maximum length messages result in the following condition:
> start + offset == subbuf_size
> This happens because a buffer of length subbuf_size actually ranges
> from zero to (subbuf_size - 1) in regard to how it is addressed. Then,
> subbuf_size + 1 isn't just outside the bounds, but one more byte off.
> "Visual" example:
> subbuf_size = 4
> |[ ][ ][ ][ ]|[ ]
> 0 1 2 3 subbuf_size
>
> So, a full subbufer means offset equals subbuf_size, that is, the next
> empty slot is just outside the subbuffer.
>

Yes, I understand that - what I meant was that the subbuf_size + 1
condition happens only in the buffer-full case (i.e. no reader or
lagging reader), but not during the normal filling of a subbuffer, which
you describe correctly.

So apparently what you're seeing is zeroes being read when there's a
buffer-full condition? If so, we need to figure out exactly why that's
happening to see whether your fix is really what's needed; I haven't
seen problems in the buffer-full case before and I think your fix would
break it even if it fixed your read problem. So it would be good to be
able to reproduce it first.

Tom


>
> Eduard

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/