Re: [ANNOUNCE] Reiserfs/kill-bkl tree v2

From: Chris Mason
Date: Mon Aug 03 2009 - 09:28:22 EST


On Sun, Aug 02, 2009 at 10:04:40PM -0700, Roland Dreier wrote:
>
> > Well, don't waste too much time on it (beyond the due diligence
> > level) - Andi forgot that the right way to stress-test patches is to
> > get through the review process and then through the integration
> > trees which have far more test exposure than any single contributor
> > can test.
> >
> > Patch submitters cannot possibly test every crazy possibility that
> > is out there - nor should they: it just doesn't scale. What we expect
> > people to do is to write clean patches, to test the bits on their
> > own boxes and submit them to lkml and address specific review
> > feedback.
>
> I respectfully disagree in this case. For patches that touch, say,
> something hardware dependent where the patch submitter doesn't have all
> the variations on the hardware, yes, I agree, scale the testing by
> running the code on many machines. But for the code in question, where
> some very fundamental and complex changes are being made to filesystem
> locking, I don't think that testing really scales -- after all, if there
> is some race then it's quite likely that testers will just see some rare
> filesystem corruption, which could easily waste weeks of debugging
> before the BKL/reiserfs patches were even implicated.

Definitely, the cost of the rare bug is much higher. The good news is
that reiserfs tends to pile its races into a few spots. Most of them
can be found with a 12-hour run of the namesys stress.sh program under a
lot of memory pressure. Compile with preemption on and you'll have
a good test on any SMP machine.

http://oss.oracle.com/~mason/stress.sh

stress.sh just copies a source directory into the test filesystem, then
reads it back and deletes it in a loop. I'd run with 50 procs and
enough memory pressure for the box to lightly swap (booting with mem= is
a fine way to create memory pressure). This way you make sure to hammer
on the metadata writeback paths, which is where all of the difficult
races come in.
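
For anyone who wants to see the shape of that loop before fetching the
script, here is a rough Python sketch of the copy/read/delete cycle. It
is not the actual stress.sh: the function names, the per-worker
directory naming, and the use of a fork-based process pool are all
assumptions made for illustration, and it only runs on Unix-like
systems.

```python
import multiprocessing
import os
import shutil


def read_tree(root):
    """Read every file under root; return the total number of bytes read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as f:
                while True:
                    chunk = f.read(1 << 16)
                    if not chunk:
                        break
                    total += len(chunk)
    return total


def worker(src, test_fs, worker_id, iterations):
    """Copy src into the test filesystem, read it back, delete it; repeat."""
    # Hypothetical naming scheme: one private directory per worker.
    dest = os.path.join(test_fs, "stress-%d" % worker_id)
    for _ in range(iterations):
        shutil.copytree(src, dest)
        read_tree(dest)
        shutil.rmtree(dest)


def run(src, test_fs, procs=50, iterations=1):
    """Start procs workers in parallel; return their exit codes."""
    # Assumption: the "fork" start method, which is Unix-only.
    ctx = multiprocessing.get_context("fork")
    workers = [ctx.Process(target=worker, args=(src, test_fs, i, iterations))
               for i in range(procs)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    return [p.exitcode for p in workers]
```

Something like run("/usr/src/linux", "/mnt/test", procs=50,
iterations=1000) would approximate the suggested load; both paths are
placeholders. The memory pressure still has to come from outside, for
example by booting with mem=.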

Testing with an fsx-linux process running at the same time will make
sure all of the mmap/truncate paths are working correctly as well.
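
To give an idea of the kind of check involved, below is a much reduced
Python sketch of an fsx-style loop: random writes and truncates against
one file, with the contents verified through a read-only mmap against an
in-memory copy. It is not fsx-linux and covers only a fraction of what
that tool does; the function name, operation mix, and size limits are
assumptions chosen for illustration.

```python
import mmap
import random


def fsx_sketch(path, ops=1000, max_size=1 << 16, seed=0):
    """Random write/truncate/mmap-read loop; returns the final file size."""
    rng = random.Random(seed)
    model = bytearray()  # what the file is expected to contain
    # buffering=0 so every write reaches the kernel immediately.
    with open(path, "w+b", buffering=0) as f:
        for _ in range(ops):
            op = rng.choice(("write", "truncate", "mapread"))
            if op == "write":
                off = rng.randrange(max_size)
                length = rng.randrange(1, 4097)
                data = bytes(rng.getrandbits(8) for _ in range(length))
                data = data[:max_size - off]
                f.seek(off)
                f.write(data)
                end = off + len(data)
                if len(model) < end:
                    # Writing past EOF leaves a zero-filled hole.
                    model.extend(b"\0" * (end - len(model)))
                model[off:end] = data
            elif op == "truncate":
                size = rng.randrange(max_size)
                f.truncate(size)
                if size < len(model):
                    del model[size:]
                else:
                    model.extend(b"\0" * (size - len(model)))
            else:
                if len(model) == 0:
                    continue  # an empty file cannot be mapped
                with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                    if m[:] != bytes(model):
                        raise AssertionError("mmap contents differ from model")
    return len(model)
```

Pointing path at a file on the filesystem under test while the
copy/read/delete load runs is the idea; the real fsx-linux exercises
many more operations and should be preferred for actual testing.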

-chris
