Re: Process Hang in __read_seqcount_begin

From: Eric Dumazet
Date: Fri Oct 26 2012 - 17:05:52 EST


On Fri, 2012-10-26 at 11:51 -0700, Peter LaDow wrote:
> (I've added netfilter and linux-rt-users to try to pull in more help).
>
> On Fri, Oct 26, 2012 at 9:48 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> > Upstream kernel is fine, there is no race, as long as :
> >
> > local_bh_disable() disables BH and preemption.
>
> Looking at the unpatched code in net/ipv4/netfilter/ip_tables.c, it
> doesn't appear that any of the code checks the return value for
> xt_write_recseq_begin to determine if it is safe to write. And neither
> does the newly patched code. How did the mainline code prevent
> corruption of the tables it is updating?
>

Do you know what per-cpu data is in the Linux kernel?

> Why isn't there something like
>
> while ( (addend = xt_write_recseq_begin()) == 0 );
>
> To make sure that only one person has write access to the tables?
> Better yet, why not use a seqlock_t instead?
>

Because it's not needed. Really, I don't know why you want that.

Once you are sure a thread cannot be interrupted by a softirq and
cannot migrate to another cpu, access to percpu data doesn't need any
other synchronization at all.

The following sequence is safe:

addend = (__this_cpu_read(xt_recseq.sequence) + 1) & 1;
/*
 * This is kind of a write_seqcount_begin(), but addend is 0 or 1.
 * We don't check the addend value, to avoid a test and conditional
 * jump, since addend is most likely 1.
 */
__this_cpu_add(xt_recseq.sequence, addend);

Because any other thread will use different percpu data.
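
To make the addend logic concrete, here is a rough userspace model of the
sequence above. It is only a sketch: the recseq_model array, the model_*
names and the explicit cpu argument are made up for illustration, the
matching "end" step is my assumption of how the counter gets closed, and
the memory barriers and BH/preemption disabling of the real kernel code
are left out entirely.

```c
#include <assert.h>

#define NR_CPUS_MODEL 4	/* hypothetical number of simulated cpus */

/* Userspace stand-in for the percpu xt_recseq.sequence counter. */
static unsigned int recseq_model[NR_CPUS_MODEL];

/*
 * Same arithmetic as in the mail: addend is 1 when the counter is
 * even (outermost writer on this cpu) and 0 when it is already odd
 * (nested writer on the same cpu).
 */
static unsigned int model_write_recseq_begin(int cpu)
{
	unsigned int addend = (recseq_model[cpu] + 1) & 1;

	recseq_model[cpu] += addend;
	return addend;
}

/* Assumed counterpart: add the same addend back when the write is done. */
static void model_write_recseq_end(int cpu, unsigned int addend)
{
	recseq_model[cpu] += addend;
}

/* Nested write on cpu 0; returns the final counter value of cpu 0. */
static unsigned int model_nested_demo(void)
{
	unsigned int outer = model_write_recseq_begin(0);	/* 0 -> 1 */
	unsigned int inner = model_write_recseq_begin(0);	/* stays 1 */

	assert(outer == 1 && inner == 0);
	assert(recseq_model[0] & 1);	/* odd: write in progress */

	model_write_recseq_end(0, inner);	/* adds 0, still odd */
	model_write_recseq_end(0, outer);	/* 1 -> 2, even again */

	assert(recseq_model[1] == 0);	/* other cpus are never touched */
	return recseq_model[0];
}
```

The nested call gets addend 0, so it leaves the counter odd and only the
outermost end makes it even again. The counters of the other cpus are
never written, which is why no lock or retry loop is needed there.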

