Re: Process Hang in __read_seqcount_begin

From: Eric Dumazet
Date: Fri Oct 26 2012 - 12:48:07 EST


On Fri, 2012-10-26 at 09:15 -0700, Peter LaDow wrote:
> On Tue, Oct 23, 2012 at 9:32 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> > Could you try following patch ?
>
> So, I applied your patch. And so far, it seems to have fixed the
> issue. I've had my systems running for 48 hours, and no lockup in
> iptables. Usually, I could get a lockup to occur within 12 to 24
> hours, and running this long tells me this patch may have things
> fixed.
>
> Now, I have a couple of questions.
>
> First, I'm not sure how this actually fixes things. It does seem
> there was a race before. But it isn't clear to me how this patch
> eliminates the race. I fear that this patch only reduces the window
> in which a race could occur, but doesn't eliminate it completely.
>

...

> Second, perhaps use seqlock_t instead of a seqcount_t for xt_recseq?
> This eliminates the problem. But given that it has been using
> seqcount_t for a long time, and is still using it, and nobody else has
> had this issue appear, makes me wonder if this problem isn't something
> unique to the RT patches. Perhaps only use seqlock_t in built for
> PREEMPT_RT_FULL?
>
> Third, recent RT patches (such as 3.6.2-rt4) have added
> preempt_disable_rt() and preempt_enable_rt() calls inside of
> read_seqcount_begin() and write_seqcount_end() respectively. The call
> to local_bh_disable/local_bh_enable doesn't do anything to
> disable/enable preemption (in 3.6.2-rt4), but it does in 3.03.36-rt58.
> But in 3.0.36-rt58 it only did so if not in an atomic context. And
> it doesn't appear to be an atomic context since local_bh_disable
> increments preempt_count by SOFTIRQ_OFFSET. And it appeared that
> building with SMP enabled, even though it did call
> preempt_disable_rt(), indirectly, through local_bh_disable(), but the
> lockup still occurred.
>
> Finally, after more testing, if this patch proves a solution to the
> problem, we could apply it locally. But what kind of testing would be
> required as part of a submission back to the general kernel/RT folk?
> The patch is easy enough to generate, but if it can't be proven that
> this patch actually fixes anything, I fail to see how it would be
> useful.
>
> Thanks,
> Pete LaDow


Upstream kernel is fine, there is no race, as long as :

local_bh_disable() disables BH and preemption.

Apparently RT changes this, so RT needs to change the code.

cmpxchg() has strong guarantees (and is also slower than upstream code),
and seems a reasonable way to fix RT/iptables



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/