Re: 2.2.8_andrea1.bz2

Andrea Arcangeli (andrea@e-mind.com)
Thu, 13 May 1999 15:27:04 +0200 (CEST)


On Thu, 13 May 1999, Chuck Lever wrote:

>it might be better for the page/swap cache to use a separate hash function
>for swap caching -- that way you can use s(i+o) for page caching, and

Actually, in some cases we may have an offset that is not PAGE_SIZE
aligned in the page cache too (see the ZMAGIC issue ;). So I think it
would be more generic to do s(i+o+offset) (sct's version) or something
similar, to account for differences in the low order bits of the offset.
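
To make the idea a bit more concrete, here is a toy userspace sketch
(made-up names and constants, only loosely modelled on the real hash and
not meant as a patch) of what mixing the raw offset in could look like:

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_HASH_BITS	11
#define PAGE_HASH_SIZE	(1UL << PAGE_HASH_BITS)

/*
 * Toy model of the hash being discussed: "ino" stands in for the inode
 * pointer and "offset" is the raw byte offset.  Besides the page index
 * (offset >> PAGE_SHIFT), the raw offset itself is mixed in, so two
 * entries whose offsets differ only in the sub-PAGE_SIZE bits (the
 * ZMAGIC case, or swap entries) end up in different buckets.
 */
static unsigned long page_hashfn(unsigned long ino, unsigned long offset)
{
	unsigned long x = ino + (offset >> PAGE_SHIFT) + offset;

	/* fold the high bits back in, then mask down to the table size */
	return (x + (x >> PAGE_HASH_BITS)) & (PAGE_HASH_SIZE - 1);
}

int main(void)
{
	/* same page index, offsets differing only below PAGE_SIZE */
	printf("%lu\n", page_hashfn(42, 0x2000));
	printf("%lu\n", page_hashfn(42, 0x2010));
	return 0;
}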

>> There's my VM-lru code + my buffer.c redesign.
>
>the only problem with LRU is that every time you touch a page or buffer,
>you have to do list magic, which is more expensive than just setting a
>"referenced" bit. perhaps if touch_buffer() just set the page's reference
>bit, that would help almost as much as a true LRU design, but without the
>overhead of maintaining a list?

Yes, agreed, and I already fixed this some time ago. Now I never move
pages to the top of the lru from getblk or find_page; I only set the
reference bit, as the stock kernel does (nothing more). Then at
shrink_mmap time I clear the reference bit and put the page at the top
of the list (in shrink_mmap I would pay that cost anyway, even if the
reference bit were not set). This removed the bottleneck completely,
according to the numbers people sent me. The nice thing is that the
pages at the end of the lru are usually just the old ones (reference bit
not set), and with the lru I completely avoid wasting tons of time
browsing the page map by hand. And this is a huge improvement (at least
when memory gets used and there are more in-use pages than pages that
sit in the cache only for caching purposes).
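
If it helps to see the shape of the scheme without digging through my
patch, here is a toy userspace sketch (made-up names and structures, not
the real buffer.c/filemap.c code and not my actual patch):

#include <stdio.h>

/*
 * Toy sketch of the scheme described above: cache hits only set a
 * reference bit, and the lru list is only touched from the shrinking
 * side, which has to look at the page anyway.
 */
struct tpage {
	struct tpage *prev, *next;
	int referenced;
	int id;
};

/* circular lru list with a sentinel: lru.next = young end, lru.prev = old end */
static struct tpage lru = { &lru, &lru, 0, -1 };

static void lru_del(struct tpage *p)
{
	p->prev->next = p->next;
	p->next->prev = p->prev;
}

static void lru_add_young(struct tpage *p)
{
	p->next = lru.next;
	p->prev = &lru;
	lru.next->prev = p;
	lru.next = p;
}

/* what getblk()/find_page() do on a cache hit: O(1), no list surgery at all */
static void cache_hit(struct tpage *p)
{
	p->referenced = 1;
}

/* what shrink_mmap() does: referenced pages get their bit cleared and move
 * back to the young end; the first old, untouched page is the victim */
static struct tpage *shrink_pick_victim(void)
{
	struct tpage *p = lru.prev;

	while (p != &lru) {
		struct tpage *next = p->prev;	/* next candidate, toward the young end */

		if (!p->referenced) {
			lru_del(p);		/* old and untouched: evict it */
			return p;
		}
		p->referenced = 0;		/* second chance: rotate to the young end */
		lru_del(p);
		lru_add_young(p);
		p = next;
	}
	return NULL;				/* everything was recently referenced */
}

int main(void)
{
	struct tpage pg[3] = { { 0, 0, 0, 0 }, { 0, 0, 0, 1 }, { 0, 0, 0, 2 } };
	int i;

	for (i = 0; i < 3; i++)
		lru_add_young(&pg[i]);	/* list (young to old): 2 1 0 */

	cache_hit(&pg[0]);		/* touch the oldest page... */
	printf("victim: %d\n", shrink_pick_victim()->id);	/* ...so page 1 is evicted */
	return 0;
}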

>> BTW, I seen the buffer.c changes of 2.2.8. You have killed (not fixed ;)
>> flushtime. At least you could have removed also flushtime from the struct
>> buffer_head to avoid wasting time in useless initializations ;).
>
>especially the logic in getblk() right after the get_hash_table() call.
>eliminating that takes out a conditional branch that is taken only rarely,
>but is executed every time get_hash_table() finds a matching buffer
>(which is about 80% of the time).

I didn't understand exactly what you mean here. There are no changes in
2.2.8 related to getblk(). But reading your comment above I had a new
idea (maybe it is what you were suggesting): basically it makes no sense
to call get_hash_table() and add an extra call plus an additional branch
to the code in getblk(); we could use find_buffer() directly there and
do the b_count++ by hand (which is all get_hash_table() would otherwise
do on top of find_buffer()). To show you my idea here is a completely
untested (but quite obviously right ;) patch against 2.2.8.

--- linux-2.2.8/fs/buffer.c	Wed May 12 13:34:28 1999
+++ buffer.c	Thu May 13 15:12:06 1999
@@ -697,8 +697,9 @@
 	int isize;
 
 repeat:
-	bh = get_hash_table(dev, block, size);
+	bh = find_buffer(dev, block, size);
 	if (bh) {
+		bh->b_count++;
 		if (!buffer_dirty(bh)) {
 			bh->b_flushtime = 0;
 		}

>kernel      16 scripts   32 scripts   48 scripts   64 scripts
>
>2.2.7       1861.7       1777.9       1730.6       1675.6
>            s=4.10       s=5.98       s=3.91       s=11.48
>            elapsed: 49 min 19 sec
>
>2.2.8       1791.6       1739.0       1683.5       1624.2
>            s=4.32       s=10.86      s=5.01       s=18.99
>            elapsed: 51 min 31 sec

BTW, did you ever try to benchmark one of my latest patches,
ftp://e-mind.com/pub/andrea/kernel/2.2.8_andrea1.bz2 ? I would be
curious to see whether it makes a relevant difference or not.

>and it looks almost the same for 2.2.8, except schedule() is two places
>higher in the list:

Interesting (though in part it is a feature). But could you apply this patch:

--- linux-2.2.8/kernel/sched.c	Wed May 12 13:36:53 1999
+++ sched.c	Thu May 13 15:22:44 1999
@@ -272,7 +272,7 @@
 	int cpu, best_cpu, weight, best_weight, i;
 	unsigned long flags;
 
-	best_weight = 0; /* prevents negative weight */
+	best_weight = 3; /* prevents too easy rescheduling */
 
 	spin_lock_irqsave(&runqueue_lock, flags);
 
@@ -323,7 +323,7 @@
 	struct task_struct *tsk;
 
 	tsk = cpu_curr(this_cpu);
-	if (preemption_goodness(tsk, p, this_cpu) > 0)
+	if (preemption_goodness(tsk, p, this_cpu) > 3)
 		tsk->need_resched = 1;
 #endif
 }

Allowing reschedules only when there is a significant difference in
priority between the current task and the woken-up task should not hurt
interactivity very much, but in my opinion it should slightly decrease
the scheduling rate.
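
Spelled out as a toy (made-up numbers, not the real goodness()
arithmetic), the effect of the change is simply:

#include <stdio.h>

#define PREEMPTION_THRESHOLD 3	/* the "3" from the patch above */

/*
 * Toy model: only ask for a reschedule when the woken-up task beats the
 * currently running one by more than the threshold.
 */
static int should_preempt(int current_goodness, int wakeup_goodness)
{
	return wakeup_goodness - current_goodness > PREEMPTION_THRESHOLD;
}

int main(void)
{
	printf("%d\n", should_preempt(10, 12));	/* 0: too small a difference */
	printf("%d\n", should_preempt(10, 15));	/* 1: big enough to preempt */
	return 0;
}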

But I don't like the stock kernel reschedule_idle() (for example, this
check below makes no sense at all to me):

[..]
	if ((p->processor == cpu) && related(cpu_curr(cpu), p))
		return;
[..]

so I suggest you also try out the reschedule_idle patch I posted to the
list (with subject `[RFC] pre-2.2.8-4 schedule issues') to see if it
makes a difference (you can change the best_weight low limit to 3 there
too).

Thanks ;-).

Andrea Arcangeli
