Re: 2.2.8_andrea1.bz2

Chuck Lever (cel@monkey.org)
Thu, 13 May 1999 00:23:11 -0400 (EDT)


On Wed, 12 May 1999, Andrea Arcangeli wrote:
> Now I use the swap cache for doing async swapout of shm memory and this in
> turn mean that I am been allowed to kill the swap-lockmap.

it might be better for the page/swap cache to use a separate hash function
for swap caching -- that way you can use s(i+o) for page caching, and
something like s(offset) for swap caching. i've never been able to get
this to work, though. there is probably some interaction between the swap
cache and the page cache that i still don't understand.

> There's my VM-lru code + my buffer.c redesign.

the only problem with LRU is that every time you touch a page or buffer,
you have to do list magic, which is more expensive than just setting a
"referenced" bit. perhaps if touch_buffer() just set the page's reference
bit, that would help almost as much as a true LRU design, but without the
overhead of maintaining a list?

> BTW, I seen the buffer.c changes of 2.2.8. You have killed (not fixed ;)
> flushtime. At least you could have removed also flushtime from the struct
> buffer_head to avoid wasting time in useless initializations ;).

especially the logic in getblk() right after the get_hash_table() call.
eliminating that takes out a conditional branch that is taken only rarely,
but is executed every time get_hash_table() finds a matching buffer
(which is about 80% of the time).

> So I rejected all the buffer.c changes included in 2.2.8 (I bet 2.2.8 will
> be faster then 2.2.7 simply because update can't harm performances anymore
> because sync_old_buffers don't know about flushtime anymore ;) and I think
> my approch will be still slightly faster than 2.2.8 (if somebody would do
> a benchmark between 2.2.8 and 2.2.8_andrea1.bz2 while writing a lot of
> data to disk in the same place many times (like the gdbm case) I would be
> glad).

i ran some benchmarks this afternoon that compare 2.2.7 and 2.2.8 under a
moderate to heavy multitasking load. 2.2.8 turns out to perform
significantly worse than 2.2.7, at least for this benchmark. i ran these
on a dual 200Mhz PPro with 128M of RAM and two SCSI-2 fireballs. offered
load varied from moderate CPU bound -- 16 concurrent scripts -- to heavy
CPU and I/O bound -- 64 concurrent scripts.

kernel 16 scripts 32 scripts 48 scripts 64 scripts

2.2.7 1861.7 1777.9 1730.6 1675.6
s=4.10 s=5.98 s=3.91 s=11.48
elapsed: 49 min 19 sec

2.2.8 1791.6 1739.0 1683.5 1624.2
s=4.32 s=10.86 s=5.01 s=18.99
elapsed: 51 min 31 sec

here are the top kernel CPU users for 2.2.7

16679 do_wp_page 30.2156
10983 d_lookup 52.8029
10077 filemap_nopage 12.5962
7556 proc_root_lookup 24.8553
7264 do_anonymous_page 56.7500
5003 file_read_actor 62.5375
4820 __get_free_pages 10.2119
4220 schedule 4.4328
3490 system_call 62.3214

and it looks almost the same for 2.2.8, except schedule() is two places
higher in the list:

17451 do_wp_page 31.6141
10903 filemap_nopage 13.6288
8779 d_lookup 42.2067
7773 do_anonymous_page 60.7266
7221 proc_root_lookup 24.0700
5809 schedule 6.0259
4991 __get_free_pages 10.5742
4987 file_read_actor 62.3375
3690 system_call 65.8929

- Chuck Lever

--
corporate:	<chuckl@netscape.com>
personal:	<chucklever@netscape.net> or <cel@monkey.org>

The Linux Scalability project: http://www.citi.umich.edu/projects/linux-scalability/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/