Re: More clues, was: Re: Weird IDE/Triton behavior

Gerard Roudier (groudier@club-internet.fr)
Sat, 11 Jan 1997 15:38:38 +0000 (GMT)


Hi Leonard!

These reports are very interesting. Let me attempt an explanation, or at
least a speculation.

The buffer cache tries to grow its pool at GFP_BUFFER priority. For this
priority, __get_free_pages() does not try to reclaim pages from the
unified page cache.

From __get_free_pages():

if (priority != GFP_BUFFER && try_to_free_page(priority, dma, 1))

Since try_to_free_page() is never called for GFP_BUFFER requests, if the
existing pages stay referenced the buffer cache will never get new memory
and will have to make do with its current pool. Meanwhile, the page cache
will keep trying to grab memory for itself ...
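
To make this concrete, here is a heavily simplified sketch of the 2.0-era
allocation loop (an illustrative reconstruction, not the exact kernel
source): once the free list is down to the reserve, a GFP_BUFFER caller
returns empty-handed instead of reclaiming anything.

	/* Simplified sketch of the 2.0-era __get_free_pages() loop;
	 * an illustrative reconstruction, not the exact kernel source. */
	unsigned long __get_free_pages(int priority, unsigned long order, int dma)
	{
		unsigned long page;

	repeat:
		if (priority == GFP_ATOMIC || nr_free_pages > min_free_pages) {
			page = rmqueue(order, dma);  /* pull from the free lists */
			if (page)
				return page;
		}
		/* The reclaim path: GFP_BUFFER callers (the buffer cache
		 * growing its pool) skip it entirely, so they can only use
		 * pages that are already free. */
		if (priority != GFP_BUFFER && try_to_free_page(priority, dma, 1))
			goto repeat;
		return 0;
	}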

On the other hand, the circular algorithm of shrink_mmap() may, when system
activity is not random enough, never free pages that the buffer cache can get.
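
For those without the source at hand, the scan works roughly like this
(again a simplified sketch with 2.0-style names; try_to_free_this_page()
is a hypothetical stand-in for the inline page-cache and buffer freeing
code): a clock hand sweeps mem_map circularly, clears the referenced bit
of any referenced page it meets, and only frees a page it reaches twice
without an intervening reference. A workload that keeps re-touching the
same pages can therefore starve the scan indefinitely.

	/* Simplified sketch of the 2.0-era shrink_mmap() clock scan;
	 * an illustrative reconstruction, not the exact kernel source. */
	int shrink_mmap(int priority, int dma)
	{
		static unsigned long clock = 0;   /* persists across calls */
		unsigned long limit = MAP_NR(high_memory); /* # of mem_map entries */
		unsigned long count = limit >> priority;
		struct page *page = mem_map + clock;

		while (count-- != 0) {
			if (++clock >= limit) {   /* wrap: circular scan */
				clock = 0;
				page = mem_map;
			} else
				page++;
			/* Second chance: a referenced page is not freed, only
			 * stripped of its referenced bit.  If every page is
			 * touched again before the hand comes back around,
			 * nothing is ever freed. */
			if (clear_bit(PG_referenced, &page->flags))
				continue;
			if (try_to_free_this_page(page))  /* hypothetical helper */
				return 1;
		}
		return 0;
	}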

My feeling is that the more main memory a system has, the more likely this
strange behaviour is to occur.

However, I don't think that normal system activity is affected too much by
this weird cache-balancing behaviour.

Gerard.

On Sat, 11 Jan 1997, Leonard N. Zubkoff wrote:

> From: tenthumbs@cybernex.net
> Date: Sat, 11 Jan 1997 05:53:02 GMT
>
> Further testing reveals that this is probably not a disk problem but
> a memory management one.
>
> Immediately after booting the 2.0.27 kernel, I can reproduce the slow
> results every time. If I run the system for a while doing something
> like compiling a kernel or running X or generally churning free memory,
> the problem disappears! From then on, the test runs as fast as with the
> other kernels. Interesting.
>
> The kernel size also seems important. I built a special kernel (with
> profiling support) to test this phenomenon, but it does not exhibit the
> behavior. Other 2.0.27 kernels I have (with different configurations)
> don't show this slowdown.
>
> Luckily, I found out that you can profile a kernel even if it wasn't
> originally compiled with profiling support. (A definite feature.) One
> very obvious difference between a slow run and a fast one is that
> shrink_specific_buffers is called 6970 times in a slow run but *never*
> in a fast one.
>
> I know nothing about Linux memory management, but it appears that there
> are certain initial conditions that can cause big performance hits.
>
> Any experts have any ideas?
>
> I ran into a similar problem yesterday. A normal reboot hit the mount limit
> and ran e2fsck on two of my partitions. The first (4GB) completed after 30
> minutes and the second (8GB, 2-way stripe) after 60! After the system booted,
> I manually ran e2fsck again and the slowest only took a little over 3 minutes.
>
> Note that e2fsck uses the raw device and hence 1KB blocks. Perhaps this is
> related. I did manage to capture one report from the magic ScrollLock keys in
> the kernel messages buffer. Perhaps this will help someone locate the problem.
>
> Leonard
>
> Mem-info:
> Free pages: 6560kB
> ( 0*4kB 0*8kB 0*16kB 27*32kB 9*64kB 40*128kB = 6560kB)
> Swap cache: add 0/0, delete 71228/0, find 0/0
> Free swap: 1176732kB
> 65536 pages of RAM
> 1653 free pages
> 1264 reserved pages
> 342 pages shared
> Buffer memory: 230680kB
> Buffer heads: 230640
> Buffer blocks: 230611
> Buffer[0] mem: 227577 buffers, 2 used (last=2), 0 locked, 0 protected, 0 dirty 0 shrd
> Buffer[4] mem: 2895 buffers, 0 used (last=0), 0 locked, 0 protected, 2895 dirty 0 shrd
> Size [ LAV]:   Free   Clean  Unshar  Lck  Lck1  Dirty  Shared
>  512 [   0]:      0       0       0    0     0      0       0
> 1024 [ 641]:    118  227575       0    0     0   2895       0
> 2048 [   0]:      0       0       0    0     0      0       0
> 4096 [   0]:     21       2       0    0     0      0       0
> 8192 [   0]:      0       0       0    0     0      0       0
> Networking buffers in use : 0
> Network buffers locked by drivers : 0
> Total network buffer allocations : 1
> Total failed network buffer allocs : 0
> Total free while locked events : 0
> IP fragment buffer size : 0