Re: A regression in recent 3.2 kernel: bdi_dirty_limit() divide error

From: Peter Zijlstra
Date: Sun Jan 08 2012 - 05:19:30 EST


On Sun, 2012-01-08 at 10:33 +0800, Wu Fengguang wrote:
> On Sat, Jan 07, 2012 at 05:35:25PM +0100, Peter Zijlstra wrote:
> > On Sat, 2012-01-07 at 22:56 +0800, Wu Fengguang wrote:
> > > Subject:
> > > Date: Sat Jan 07 22:50:45 CST 2012
> > >
> > > The uninitialized shift may lead to denominator=0 in
> > > prop_fraction_percpu() and divide error in bdi_dirty_limit().
> >
> > I'm not seeing how, only prop_change_shift() can change ->index, and it
> > does that after it writes ->pg[index]->shift.
>
> Then I have no clue why bdi_dirty_limit() would hit a divide error at all.

You and me both. The weird thing is, this code hasn't been changed in
forever, and I can't recall any such weirdness.

In fact, prop_fraction_percpu() sets the denominator to period_2 +
(global_count & counter_mask).

The only way to make that 0 is to overflow the unsigned long. Did the
crash happen on 32-bit? I never saw the initial report.

But even then, we limit PROP_MAX_SHIFT to 3*BITS_PER_LONG/4, so I don't
think that could ever overflow.
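
The argument above can be checked with a small user-space model. This is
only a sketch: model_denominator() is a hypothetical name, not the kernel
function, and it assumes the shift has already been set to a valid value
between 1 and PROP_MAX_SHIFT.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG  ((int)(sizeof(unsigned long) * CHAR_BIT))
#define PROP_MAX_SHIFT (3 * BITS_PER_LONG / 4)

/*
 * Hypothetical user-space model of the denominator described above:
 * period_2 + (global_count & counter_mask). Not the kernel code itself.
 */
static unsigned long model_denominator(int shift, unsigned long global_count)
{
	unsigned long period_2, counter_mask;

	/* Assumption: shift has been initialized to a valid value. */
	assert(shift >= 1 && shift <= PROP_MAX_SHIFT);

	period_2 = 1UL << (shift - 1);
	counter_mask = period_2 - 1;

	/* The result lies in [period_2, 2 * period_2 - 1], so it is never 0. */
	return period_2 + (global_count & counter_mask);
}
```

With shift capped at PROP_MAX_SHIFT, the largest possible result in this
model is 2^PROP_MAX_SHIFT - 1, which fits in an unsigned long on both
32-bit and 64-bit builds.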

> prop_change_shift() does
>
> change ->pg[index]->shift
> smp_wmb()
> change ->index
>
> Will the read side prop_fraction_percpu() need some read memory barrier?

It actually has one, see prop_get_global()...
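
For illustration, the write/read pairing discussed above can be modelled
in user space with C11 fences. This is a sketch under assumptions, not
the kernel code: the model_* names and struct layouts are hypothetical,
the release and acquire fences stand in for smp_wmb() and smp_rmb(), and
any writer-side locking or waiting for old readers is left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the global state and its descriptor. */
struct model_global {
	int shift;
};

struct model_descriptor {
	atomic_int index;		/* selects the live slot: 0 or 1 */
	struct model_global pg[2];
};

/* Writer: fill in the spare slot first, then publish it via ->index. */
static void model_change_shift(struct model_descriptor *pd, int shift)
{
	int index = atomic_load_explicit(&pd->index, memory_order_relaxed) ^ 1;

	assert(shift >= 1);
	pd->pg[index].shift = shift;
	/* Plays the role of smp_wmb(): slot contents before the index. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&pd->index, index, memory_order_relaxed);
}

/* Reader: load ->index, then order later reads of the slot after it. */
static const struct model_global *model_get_global(struct model_descriptor *pd)
{
	int index = atomic_load_explicit(&pd->index, memory_order_relaxed);

	/* Plays the role of smp_rmb(): index before the slot contents. */
	atomic_thread_fence(memory_order_acquire);
	return &pd->pg[index];
}
```

In this model, a reader that observes the new index also observes the
shift written before it, so it should not see a half-initialized slot.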