pre-2.3.9-5_SMP-page-LRU_F [Re: fs corruption with pre-2.3.9-5 +

Andrea Arcangeli (andrea@suse.de)
Wed, 30 Jun 1999 17:45:13 +0200 (CEST)


On Wed, 30 Jun 1999, Andrea Arcangeli wrote:

>With only this little patch applied to pre-2.3.9-5 I completely corrupted
>my test machine just by running a memory hog + bonnie at the same time.

(I forgot to mention that the machine that got screwed up is UP, running
an SMP kernel.)

Now I have the test machine running again with pre-2.3.9-5 + my LRU patch
(without the little patch that calls ll_rw_block from
try_to_free_buffers()) and it has been working _fine_ for 73 minutes, so
the corruption problem is hidden _completely_ by removing the little patch
from the last email. (Note that the LRU patch includes the ll_rw_block and
genhd.c parts of the patch from the previous email.) So the problem was
definitely triggered by the patch I posted in the first email of the
thread.

Since the page-LRU patch seems to be really fine (on my UP hardware, the
only place where I can afford to trash the FS without a major development
stall), I'll upload it, and I would appreciate it if somebody who can risk
corrupting the fs of an _SMP_ machine could give it a try under very high
VM load (swapping out tons of data in a loop, for example).
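
For anyone who wants to generate that kind of load, a memory hog can be as
simple as the sketch below. It is only an illustration, not the tool used
for the tests in this thread: hog_touch_pages(), hog_run() and their
parameters are made-up names, and the page size is passed in by the caller
rather than queried from the system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: write one byte into every
 * page of buf, `passes` times, so that each page is dirtied again and
 * again. The pointer is volatile so the stores are not optimized away.
 * Returns the number of page writes performed.
 */
size_t hog_touch_pages(volatile unsigned char *buf, size_t bytes,
                       size_t page_size, unsigned passes)
{
    size_t touched = 0;

    assert(page_size > 0);
    for (unsigned p = 0; p < passes; p++) {
        for (size_t off = 0; off < bytes; off += page_size) {
            buf[off] = (unsigned char)(p + 1);  /* dirty this page */
            touched++;
        }
    }
    return touched;
}

/*
 * Allocate `mbytes` megabytes and dirty them `passes` times.
 * Returns the number of page writes, or 0 if the allocation failed.
 */
size_t hog_run(size_t mbytes, size_t page_size, unsigned passes)
{
    size_t bytes = mbytes << 20;
    unsigned char *buf = malloc(bytes);
    size_t touched;

    if (buf == NULL)
        return 0;
    touched = hog_touch_pages(buf, bytes, page_size, passes);
    free(buf);
    return touched;
}
```

Calling hog_run() with more megabytes than the machine has free RAM, and a
large pass count, while bonnie runs in parallel is one way to approximate
the load described here.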

The patch itself should be 100% safe (and finally it seems that it doesn't
trigger the pre-2.3.9-5 bugs anymore) and it should give you a big boost
under high VM load (when the memory _not_ used as cache is relevant). I
would also very much like somebody to run some benchmarks with and without
the patch applied (a kernel compile would be interesting too). Somebody
asked me why I don't do the benchmarks myself. The reason is that I stress
test the kernel all the time to trigger bugs. A kernel compile is not evil
enough to trigger bugs in the VM or to generate fs corruption. (I always
run with 20/30 Mbyte of swap in a loop.)

I would also like Linus and Ingo to take a look at the code. The
features of the patch are:

o rewrote shrink_mmap from scratch. Now it is SMP-threaded and
  more than one shrink_mmap is allowed to run at the same time.
  It's threaded with respect to the page cache and to the big kernel
  lock. All the locking uses the per-page PG_locked bit and the
  new pagemap_lru_lock spinlock.
o Avoid running in O(nr_phys_pages) in shrink_mmap; run close to
  O(1) even if the cache is near zero. This makes a _huge_
  difference for swap and keeps us from being fooled into swapping
  out too early because the cache is not well distributed over the
  memmap.
o shm memory is swapped out asynchronously, using the swap cache to
  resolve the read-write locking. (needed since 2.3.8 or 7, I don't
  remember)
o clustering in the get_swap_cache code: try to find an empty cluster
  before getting a fragmented one.
o persistence of the data in the swap space: the same page is always
  going to be swapped out to the same place on swap, even after a
  swapin fault.
o fix in the higherbit/lowerbit logic in swap_free.
o do_wp_page will do the COW SMP-threaded, using the locking of the
  page to synchronize with swap_out().
o all the swap_cache code has been carefully put under the big kernel
  lock, since to take over the swap cache in do_wp_page we must
  make sure that nobody finds the swap_cache_page while
  we are taking it over.
o ll_rw_block is multidevice capable (but really this bit could
  also be removed, since I don't call ll_rw_block from
  try_to_free_pages anymore, to avoid triggering the fs corruption).
o debugging BH_Shared bitflag. I like it very much; we'll be allowed
  to remove it later. It verifies that a shared buffer belongs to a
  page still in the page cache.
o fix for free_page_and_swap_cache to avoid sleeping in vmtruncate.
o race fix in inode.c if mark_buffer_dirty(bh, 1) can sleep.
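
To make the shrink_mmap items more concrete, here is a user-space toy
model of scanning an LRU list from its old end with a bounded budget. It
is a sketch under assumptions, not code from the patch: struct toy_page,
struct toy_lru and the lru_* functions are hypothetical names, the model
is single-threaded, and where the patch uses pagemap_lru_lock and the
PG_locked bit the toy only has a plain "locked" flag.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a page; the field names are illustrative only. */
struct toy_page {
    struct toy_page *prev, *next;  /* links in the LRU list */
    int locked;                    /* stands in for PG_locked */
    int referenced;                /* recently used, gets a second chance */
    int freed;                     /* set once the page has been reclaimed */
};

/* Circular list with a dummy head: head.next is youngest, head.prev oldest. */
struct toy_lru {
    struct toy_page head;
};

void lru_init(struct toy_lru *lru)
{
    lru->head.prev = lru->head.next = &lru->head;
}

static void lru_unlink(struct toy_page *pg)
{
    pg->prev->next = pg->next;
    pg->next->prev = pg->prev;
}

/* Insert a page at the young end of the list. */
void lru_add(struct toy_lru *lru, struct toy_page *pg)
{
    pg->next = lru->head.next;
    pg->prev = &lru->head;
    lru->head.next->prev = pg;
    lru->head.next = pg;
}

/*
 * Look at no more than `count` pages, oldest first, and free up to `want`
 * of them. The work is bounded by `count`, not by the number of physical
 * pages in the machine. Returns the number of pages freed.
 */
int lru_shrink(struct toy_lru *lru, int count, int want)
{
    int freed = 0;

    assert(count >= 0 && want >= 0);
    while (count-- > 0 && freed < want) {
        struct toy_page *pg = lru->head.prev;

        if (pg == &lru->head)
            break;                      /* list is empty */
        lru_unlink(pg);
        if (pg->locked) {
            lru_add(lru, pg);           /* busy: skip it for now */
            continue;
        }
        if (pg->referenced) {
            pg->referenced = 0;         /* second chance, rotate */
            lru_add(lru, pg);
            continue;
        }
        pg->freed = 1;                  /* reclaim the page */
        freed++;
    }
    return freed;
}
```

In this model the cost of one lru_shrink() call depends only on the scan
budget, which is the kind of behaviour the O(nr_phys_pages) versus O(1)
remark is about.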

The interesting case to benchmark is when the cache in the system is very
small, but I also expect an interesting speedup from a plain kernel
compile (with make -j40 or more).
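
The swap clustering item from the feature list can be illustrated with a
small user-space model. This is only a sketch of the general idea (keep
filling the current cluster, then look for a completely empty cluster, and
only then accept a fragmented slot). struct toy_swap, toy_get_swap_slot()
and the cluster size are assumptions made for the example and are not
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy swap area: map[i] == 0 means slot i is free. Names are made up. */
struct toy_swap {
    unsigned char *map;
    size_t nslots;
    size_t cluster;   /* slots per cluster */
    size_t next;      /* next slot to try in the current cluster */
    size_t left;      /* slots not yet tried in the current cluster */
};

/* Returns the index of the allocated slot, or -1 if the area is full. */
long toy_get_swap_slot(struct toy_swap *sw)
{
    assert(sw->cluster > 0);

    /* 1. Keep filling the cluster we are already writing to. */
    while (sw->left > 0) {
        size_t slot = sw->next++;

        sw->left--;
        if (sw->map[slot] == 0) {
            sw->map[slot] = 1;
            return (long)slot;
        }
    }

    /* 2. Look for a completely empty, aligned cluster and start on it. */
    for (size_t base = 0; base + sw->cluster <= sw->nslots;
         base += sw->cluster) {
        size_t i;

        for (i = 0; i < sw->cluster; i++)
            if (sw->map[base + i] != 0)
                break;
        if (i == sw->cluster) {
            sw->map[base] = 1;
            sw->next = base + 1;
            sw->left = sw->cluster - 1;
            return (long)base;
        }
    }

    /* 3. No empty cluster left: accept any free slot (fragmented). */
    for (size_t i = 0; i < sw->nslots; i++) {
        if (sw->map[i] == 0) {
            sw->map[i] = 1;
            return (long)i;
        }
    }
    return -1;
}
```

With eight slots, a cluster size of four and slot 1 already in use, the
model hands out slots 4, 5, 6 and 7 first and only then falls back to the
fragmented slots 0, 2 and 3.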

The patch can be downloaded from:

ftp://e-mind.com/pub/andrea/kernel-patches/pre-2.3.9-5_SMP-page-LRU_F.gz

It will also show up very soon on the far faster mirrors:

ftp://ftp.suse.com/pub/people/andrea/kernel-patches/pre-2.3.9-5_SMP-page-LRU_F.gz
ftp://ftp.linux.it/pub/People/andrea/kernel-patches/pre-2.3.9-5_SMP-page-LRU_F.gz
ftp://master.softaplic.com.br/pub/andrea/kernel-patches/pre-2.3.9-5_SMP-page-LRU_F.gz

If you try it and you get a problem make sure to send a bug report! :)

Thanks.

Andrea

PS. It should be obvious that the patch is against pre-patch-2.3.9-5.gz
downloadable from ftp.*.kernel.org/pub/linux/kernel/testing/
(btw, pre-patch-2.3.9-5.gz itself is against clean 2.3.8).
