Re: Kernel 2.6.14.2 - Hard link count is wrong

From: Clemens Koller
Date: Mon Nov 21 2005 - 06:17:01 EST


Hello, Massimiliano, Hi Daniel!

Massimiliano Hofer wrote:
I'm back. First of all I beg your pardon for blaming 2.6.14.2, I forgot I rebooted with 2.6.13.4.

I'm indeed working on 2.6.14.2 and I get the problem with findutils 4.2.20 and
up. I've checked my system's reiserfs partition... no problems. Otherwise,
the machine is pretty simple:
/$ mount
/dev/hda1 on / type reiserfs (rw)
devpts on /dev/pts type devpts (rw)

/$ find --version
GNU find version 4.2.20
Features enabled: D_TYPE O_NOFOLLOW(enabled)

/$ find |grep smb.conf
./etc/samba/smb.conf.default
./etc/samba/smb.conf
./usr/man/man5/smb.conf.5.gz
find: WARNING: Hard link count is wrong for .: this may be a bug in your filesystem driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched.

And after an update to stock:

/$ find --version
GNU find version 4.2.26
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION

It's still the same:

/$ find |grep smb.conf
./etc/samba/smb.conf.default
./etc/samba/smb.conf
./usr/man/man5/smb.conf.5.gz
find: WARNING: Hard link count is wrong for .: this may be a bug in your filesystem driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched.

The Patch, Daniel mentioned didn't make it into the latest findutils-4.2.26, but if
I re-apply it manually (attachment), it tells me:

/$ find --version
GNU find version 4.2.26-ck
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION

/$ find |grep smb.conf
./etc/samba/smb.conf.default
./etc/samba/smb.conf
./usr/man/man5/smb.conf.5.gz
find: WARNING: Hard link count (44) is wrong for ./proc: this may be a bug in your filesystem driver. Automatically turning on fid's -noleaf option. Earlier results may have failed to include directories that should have been searched.

So, I guess that something in /proc is messed up a little.
Anymore checks I can do?

Anyway I couldn't reproduce the bug in 2.6.14.2, but it is easily reproducible with 2.6.13.4.
Compiling pcmcia in the kernel (but leaving yenta and 8139too as a module) I never get errors on /proc/bus, but I get the same results on /proc/bus/pci.
It seems a broader hotplug problem. Of corse it could be already solved in 2.6.14, as I can't reproduce it.

Well, I don't think so. I can easily reproduce that on my machine here:

$ uname -a
Linux zefix 2.6.14.2 #2 Tue Nov 15 19:08:57 CET 2005 i686 unknown unknown GNU/Linux

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 5
model name : Pentium II (Deschutes)
stepping : 2
cpu MHz : 300.715
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips : 602.39

$ cat /proc/pci
PCI devices found:
Bus 0, device 0, function 0:
Class 0600: PCI device 8086:7180 (rev 3).
Master Capable. Latency=64.
Prefetchable 32 bit memory at 0xe4000000 [0xe7ffffff].
Bus 0, device 1, function 0:
Class 0604: PCI device 8086:7181 (rev 3).
Master Capable. Latency=64. Min Gnt=8.
Bus 0, device 4, function 0:
Class 0601: PCI device 8086:7110 (rev 2).
Bus 0, device 4, function 1:
Class 0101: PCI device 8086:7111 (rev 1).
Master Capable. Latency=32.
I/O at 0xb800 [0xb80f].
Bus 0, device 4, function 2:
Class 0c03: PCI device 8086:7112 (rev 1).
IRQ 5.
Master Capable. Latency=32.
I/O at 0xb400 [0xb41f].
Bus 0, device 4, function 3:
Class 0680: PCI device 8086:7113 (rev 2).
IRQ 9.
Bus 0, device 10, function 0:
Class 0200: PCI device 10ec:8139 (rev 16).
IRQ 10.
Master Capable. Latency=33. Min Gnt=32.Max Lat=64.
I/O at 0xb000 [0xb0ff].
Non-prefetchable 32 bit memory at 0xe1800000 [0xe18000ff].
Bus 0, device 13, function 0:
Class 0401: PCI device 1274:5880 (rev 2).
IRQ 5.
Master Capable. Latency=32. Min Gnt=12.Max Lat=128.
I/O at 0xa800 [0xa83f].
Bus 1, device 0, function 0:
Class 0300: PCI device 1002:475a (rev 58).
IRQ 11.
Master Capable. Latency=64. Min Gnt=8.
Prefetchable 32 bit memory at 0xe3000000 [0xe3ffffff].
I/O at 0xd800 [0xd8ff].
Non-prefetchable 32 bit memory at 0xe2000000 [0xe2000fff].

Hmm, okay, I see, the PCIID's are missing. :-(
That's basically a Intel PIIX4 chipset, RTL8139, and a ATI Rage II card.
Can we narrow down the problem a little? Hmm, do you need more info?

Thanks so far,
best greets,
--
Clemens Koller
_______________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Str. 45/1
81379 Muenchen
Germany

http://www.anagramm.de
Phone: +49-89-741518-50
Fax: +49-89-741518-19
--- findutils-4.2.26/find/find.c~ 2005-11-11 08:41:37.000000000 +0100
+++ findutils-4.2.26/find/find.c 2005-11-21 11:47:35.000000000 +0100
@@ -1947,8 +1947,8 @@ process_dir (char *pathname, char *name,
* doesn't really handle hard links with Unix semantics.
* In the latter case, -noleaf should be used routinely.
*/
- error(0, 0, _("WARNING: Hard link count is wrong for %s: this may be a bug in your filesystem driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched."),
- parent);
+ error(0, 0, _("WARNING: Hard link count (%d) is wrong for %s: this may be a bug in your filesystem driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched."),
+ statp->st_nlink, pathname);
state.exit_status = 1; /* We know the result is wrong, now */
options.no_leaf_check = true; /* Don't make same
mistake again */