Re: knfsd and ext2? Huh?

From: Alexei I. Adamovich (lexa@adam.botik.ru)
Date: Mon Jun 12 2000 - 09:58:06 EST


"Alexei I. Adamovich" <lexa@adam.botik.ru> (AA) wrote on Mon, 5 Jun
2000 17:52:04 +0400 (MSD):
AA> Already working, spent some time during weekend trying to invent
AA> testing methodology for being able to confirm the fact that we have
AA> knfsd issue fixed--when it will be fixed. But found some other knfsd
AA> problem: it seems like knfsd doesn't work properly with ext2 partitions
AA> also. In short:
AA> Stress.sh had some error reported before being completely stalled when
AA> working on nfs-mounted knfsd-served ext2 partition (the same picture
AA> as with ReiserFS, but EXT2 resisted twice or four times longer than
AA> ReiserFS). I'll send longer report if somebody is interested.

Hans Reiser <hans@reiser.to> (HR), in his turn, wrote on Fri, 09 Jun
2000 22:40:20 -0700:
HR> Chip Salzenberg wrote:
HR> >
HR> > According to Hans Reiser:
HR> > > knfsd crashes for both ext2 and reiserfs
HR> >
HR> > I've not heard any such reports about knfsd crashing with ext2, at
HR> > least not the knfsd in 2.4 and the trond/brown/dhiggen patch set for
HR> > 2.2. And I've been paying attention. What are you talking about?
HR> > --
HR> > Chip Salzenberg - a.k.a. - <chip@valinux.com>
HR> > "I wanted to play hopscotch with the impenetrable mystery of existence,
HR> > but he stepped in a wormhole and had to go in early." // MST3K
HR> ...
HR> All I did was mindlessly repeat the words of Alan Cox.
HR> knfsd is supposed to be buggy. Actually, that is not entirely true, lexa also
HR> told me he can make it crash for ext2.

And Alan Cox <alan@lxorguk.ukuu.org.uk> replied on Sat, 10 Jun 2000
12:34:43 +0100 (BST):
AX> > According to Hans Reiser:
AX> > > [Alan Cox &] lexa also told me he can make it crash for ext2.
AX> >
AX> > Has either of them been using the current Trond/Neil/Dave patch set?
AX>
AX> Right now Im using the 2.2.16 set - since the prune_dcache stuff went in
AX> I've not seen any hangs

1. I'm using plain (unpatched) 2.4.0-test1:
> lexa@adam:~ > uname -a
> Linux adam 2.4.0-test1 #1 SMP Sat Jun 3 08:41:46 MSD 2000 i686 unknown

2. GCC is 2.95.2:
> lexa@adam:~ > gcc -v
> Reading specs from /usr/lib/gcc-lib/i486-suse-linux/2.95.2/specs
> gcc version 2.95.2 19991024 (release)

3. The box is a dual-Celeron SMP system:
> lexa@adam:~ > less /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 6
> model name : Celeron (Mendocino)
> stepping : 5
> cpu MHz : 501.146554
> cache size : 128 KB
> fdiv_bug : no
> hlt_bug : no
> sep_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
> bogomips : 999.42
>
> processor : 1
> ... (same bunch of lines here)

4. RAM is 128 MB:
> lexa@adam:~ > free
> total used free shared buffers cached
> Mem: 126648 103072 23576 0 1792 44064
> -/+ buffers/cache: 57216 69432
> Swap: 136512 768 135744

5. The disk is a 25 GB IDE drive; the tested ext2 partition (/dev/hda6)
is a 5 GB logical partition inside an 8 GB extended partition:
> adam:~ # fdisk -l
>
> Disk /dev/hda: 255 heads, 63 sectors, 3111 cylinders
> Units = cylinders of 16065 * 512 bytes
>
> Device Boot Start End Blocks Id System
> /dev/hda1 1 2 16033+ 83 Linux
> /dev/hda2 * 3 1024 8209215 c Win95 FAT32 (LBA)
> /dev/hda3 2070 3110 8361832+ 83 Linux
> /dev/hda4 1025 2069 8393962+ 5 Extended
> /dev/hda5 1025 1041 136521 82 Linux swap
> /dev/hda6 1042 1694 5245191 83 Linux
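
As a cross-check of the sizes quoted in item 5, the cylinder ranges in
the fdisk listing can be converted to 1 KB blocks by hand. This is only
a sketch: it assumes the geometry fdisk printed (16065 sectors of 512
bytes per cylinder), and the function name cyl_to_kib is made up for
illustration.

```shell
# Convert an inclusive cylinder range to 1 KB blocks.
# Assumption: 16065 sectors per cylinder, 512-byte sectors (from fdisk -l).
cyl_to_kib() {
    start=$1
    end=$2
    sectors=$(( ($end - $start + 1) * 16065 ))
    # Two 512-byte sectors per 1 KB block; integer division rounds down.
    echo $(( $sectors / 2 ))
}

cyl_to_kib 1042 1694   # /dev/hda6
cyl_to_kib 1025 2069   # /dev/hda4 (extended)
```

For /dev/hda6 this gives 5245222 blocks, about 5 GB, against 5245191 in
the Blocks column; the difference of 31.5 KB is 63 sectors, which looks
like the one track skipped at the start of a logical partition. For
/dev/hda4 it gives 8393962, matching the listing.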

6. Recipe:
a) run mke2fs on /dev/hda6;
b) mount -t ext2 /dev/hda6 /mnt
c) mkdir /mnt/pub
d) cd /mnt/pub; cp -a /usr/bin /usr/X11R6 .
e) kexportfs -o rw,no_root_squash 127.0.0.1:/mnt/pub
f) mount -t nfs 127.0.0.1:/mnt/pub/ /testfs
g) from /root/benchmarks/stress, run the stress script on /testfs with
20 concurrent processes:
> adam:~/benchmarks/stress # ./Stress.sh -n 20 -c Documentation /testfs
> Number of concurrent processes: 20
> Content directory: Documentation (size: 4256 KB)
> Warning: This will DELETE anything in /testfs/stress. Type yes to confirm.
> yes
> Created /testfs/stress directory.
> Computing MD5 sums over content directory: done.
> Starting stress test processes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> Process IDs: 982 983 985 987 989 992 994 995 996 1000 1001 1002 1004 1005 1010 1012 1013 1014 1016 1020
> Press ^C to kill all processes
> D1 W1 R1 D1 W1 R1 D1 W1 R1 D1 W1 R1 D1 W1 R1 D2 W2 D1 W1 R2 D2 R1 W2
> ...
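
Steps a) to f) of the recipe can be collected into one place. The
sketch below only prints the commands for a given device and mount
points; it does not execute anything. The function name repro_steps and
its three parameters are hypothetical, and the commands themselves are
copied from the recipe (with "&&" instead of ";" after cd, so that cp is
skipped if the cd fails).

```shell
# Print the reproduction steps a) to f) for review.
# Nothing is executed here. mke2fs DESTROYS all data on the device,
# so check the output before feeding it to a root shell.
repro_steps() {
    dev=$1   # block device to format, e.g. /dev/hda6
    mnt=$2   # local mount point,      e.g. /mnt
    nfs=$3   # NFS mount point,        e.g. /testfs
    cat <<EOF
mke2fs $dev
mount -t ext2 $dev $mnt
mkdir $mnt/pub
cd $mnt/pub && cp -a /usr/bin /usr/X11R6 .
kexportfs -o rw,no_root_squash 127.0.0.1:$mnt/pub
mount -t nfs 127.0.0.1:$mnt/pub/ $nfs
EOF
}

repro_steps /dev/hda6 /mnt /testfs
```

Step g) is left out on purpose, because Stress.sh asks for an
interactive "yes" before it deletes anything under the target directory.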

7. RESULTS vary, ranging from error reports:
> R9 R6 W8 D17 W17 D11 D18 W18 D2 D19 R5 W19 R13 R14 R1 D20 W20 D7 W11
> 151c151
> < 0e265ce983ccd5575685fd9dee8429ed ./kernel-docs.txt
> ---
> > fa40746447a701403c93ff71db9d7e8f ./kernel-docs.txt
> D12 W2 D9 R15 W7 D6 W12 R3 R16 R4 R10 W9 R8 D13 D5 240c240
> < bf00bdcc002d24ade27aea441d32f679 ./pm.txt
> ---
> > 6ebec3253ddea00c804c70806b86b03e ./pm.txt
> D14 W6 D1 R17 R18 W5 W13 R19 W14 W1 D15 R20 R11 D3 R2 D16 D4 D10 D8
> W15 R7 D17 R12 W3 W16 D18 W4 W10 W8 R9 R6 W17 D19 W18

to all stress processes stalling and, beyond that, to
A COMPLETE FREEZE OF THE WHOLE SYSTEM.
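
The error reports above have the shape of a diff between two lists of
MD5 sums. The internals of Stress.sh are not shown here, so the
following is only a guess at the kind of check involved, not the
script's actual code: build a sorted md5sum manifest of the reference
tree and of the copy, then diff the two. The names md5_manifest and
verify_copy are made up for illustration.

```shell
# Build a manifest of "md5sum  ./path" lines, sorted by path,
# for every regular file under the directory given as $1.
md5_manifest() {
    ( cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2 )
}

# Compare two directory trees by manifest. diff prints the mismatching
# entries; the return status is non-zero when the trees differ.
verify_copy() {
    ref=$(mktemp) && new=$(mktemp) || return 2
    md5_manifest "$1" > "$ref"
    md5_manifest "$2" > "$new"
    status=0
    diff "$ref" "$new" || status=$?
    rm -f "$ref" "$new"
    return $status
}

# Usage: verify_copy REFERENCE_DIR COPY_DIR
```

A file whose content changed on its way through NFS would then show up
as a pair of "<" and ">" lines with different sums for the same path.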

I also found some nfs messages in /var/log/messages:
> Jun 10 14:23:32 adam kernel: nfs: task 43039 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43028 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43032 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43030 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43052 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43034 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43036 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43048 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43043 can't get a request slot
> Jun 10 14:23:32 adam kernel: nfs: task 43040 can't get a request slot
> Jun 10 14:29:29 adam kernel: nfs: task 43054 can't get a request slot
> Jun 10 14:29:29 adam kernel: nfs: task 43055 can't get a request slot
> Jun 10 14:29:29 adam kernel: nfs: task 43058 can't get a request slot
> Jun 10 14:29:29 adam kernel: nfs: task 43056 can't get a request slot
> Jun 10 14:29:29 adam kernel: nfs: task 43060 can't get a request slot
> Jun 10 14:29:29 adam kernel: nfs: task 43057 can't get a request slot
> Jun 10 14:29:29 adam kernel: nfs: task 43059 can't get a request slot
> ...
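
To see how these messages cluster in time, they can be counted per
timestamp. A minimal sketch, assuming the usual syslog line layout shown
above (month, day, time as the first three fields); the function name
count_slot_errors is hypothetical.

```shell
# Count "can't get a request slot" messages per syslog timestamp.
# Reads the file(s) given as arguments, or stdin if there are none.
# The "." in "can.t" matches the apostrophe and avoids shell quoting.
count_slot_errors() {
    awk '/nfs: task [0-9]+ can.t get a request slot/ {
             n[$1 " " $2 " " $3]++
         }
         END { for (t in n) print t, n[t] }' "$@" | LC_ALL=C sort
}

# Usage: count_slot_errors /var/log/messages
```

Each output line is a timestamp followed by the number of matching
messages logged in that second. The sort is purely textual, so the
order is chronological only within a single month.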

I'm not sure whether the prune_dcache stuff is in 2.4.0-test1, but I
definitely saw some of Neil's stuff in there (BTW: I like it very much;
it opens a way to filesystem-specific nfsd support).

Thanks,

   Alexei I. Adamovich
----------------------------------------------------------------
Res. Centre for Multiprocessor Systems, | e-mail:
PSI RAS, Pereslavl-Zalessky 152140 Russia | lexa@adam.botik.ru
----------------------------------------------------------------
