Re: NFS Data CORRUPTION Between Linux and SunOS 5.5.1

Bill Hawes (whawes@transmeta.com)
Thu, 13 Aug 1998 18:48:06 -0700


Ben McCann wrote:

> We use Linux 2.1.x for software development where Linux workstations
> NFS mount filesystems on a Sun UltraSparc server. The Ultra runs
> SunOS 5.5.1.
>
> We ran 2.1.84 with no problems. We recently upgraded our build
> environment to 2.1.102. (We've been using 2.1.102 in application
> testing for a couple of months so we decided it was stable enough
> to use for compiling and linking too).
>
> Immediately after upgrading, we noticed that our executable files
> were corrupted during the link phase of a build. Remember that
> the objects and the executable are all stored on the UltraSparc
> server. If we link under Linux 2.1.84 then there is no corruption
> and if it is 2.1.102 then there IS corruption.
>
> ====> I've repeated this with 2.1.115 so the bug is still alive
> ====> in the latest edition of the kernel.
>
> This is a very puzzling bug. We do NOT see corruption when we link
> directly to the local hard drive and we don't see corruption when
> we NFS mount another 2.1.102 Linux box and link on its file system.
>
> The only corruption occurs when running 'ld' under 2.1.102 (or
> 2.1.115) and writing the executable to a SunOS 5.5.1 NFS server.
> (BTW, we're using GNU ld version 2.8.1 (with BFD linux-2.8.1.0.1)).

Hi Ben,

A couple of experiments you could try to help track down the problem ...

If certain files are consistently corrupted, is there a pattern to the
offsets and data that are different?
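
One way to look for such a pattern is to compare a good copy of the executable (for example, one linked to local disk) against the corrupted copy and list the byte ranges that differ. The following is only a sketch: the function names are made up for illustration, and the block sizes 4096 and 8192 are assumptions (a typical page size and a common NFS write size), not values taken from this setup.

```python
import sys


def diff_ranges(good: bytes, bad: bytes):
    """Return (start, end) byte ranges, end exclusive, where the buffers differ.

    A length mismatch is reported as one extra trailing range.
    """
    ranges = []
    start = None
    common = min(len(good), len(bad))
    for i in range(common):
        if good[i] != bad[i]:
            if start is None:
                start = i          # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(good) != len(bad):
        ranges.append((common, max(len(good), len(bad))))
    return ranges


def describe(ranges, block_sizes=(4096, 8192)):
    """Format each range with its offset modulo some assumed block sizes."""
    lines = []
    for start, end in ranges:
        mods = ", ".join(f"start%{b}={start % b}" for b in block_sizes)
        lines.append(f"offset {start:#x} length {end - start} ({mods})")
    return lines


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python cmpfiles.py GOOD_FILE BAD_FILE  (script name is arbitrary)
    with open(sys.argv[1], "rb") as f_good, open(sys.argv[2], "rb") as f_bad:
        for line in describe(diff_ranges(f_good.read(), f_bad.read())):
            print(line)
```

If the differing ranges start on, or have lengths that are multiples of, one of the block sizes, that would point toward a page-level or write-request-level problem rather than random damage.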

Try turning on NFS and RPC level debugging (echo 65535 to both
/proc/sys/sunrpc/nfs_debug and /proc/sys/sunrpc/rpc_debug) and link a file
Linux-to-Linux, then Linux-to-Sun. Then diff the logs and look for patterns
that might pinpoint the problem.
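
Raw debug logs from two runs will differ in many incidental numbers, so a plain diff can be noisy. A rough sketch of one way to reduce that noise follows, assuming each run's kernel messages have been saved to a text file. The helper names and the masking rule (replace `0x` hex literals and decimal runs with `N`) are illustrative assumptions, not something the debug output format guarantees to be useful.

```python
import difflib
import re
import sys

# Matches 0x-prefixed hex literals or runs of decimal digits.
_NUMBER = re.compile(r"0x[0-9a-fA-F]+|\d+")


def normalize(line: str) -> str:
    """Mask numbers so that only the shape of each message is compared."""
    return _NUMBER.sub("N", line.rstrip("\n"))


def diff_logs(a_lines, b_lines):
    """Return unified-diff lines between two logs after masking numbers."""
    a = [normalize(line) for line in a_lines]
    b = [normalize(line) for line in b_lines]
    return list(
        difflib.unified_diff(a, b, "linux-to-linux", "linux-to-sun", lineterm="")
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python difflogs.py LINUX_LOG SUN_LOG  (script name is arbitrary)
    with open(sys.argv[1]) as f_a, open(sys.argv[2]) as f_b:
        for line in diff_logs(f_a.readlines(), f_b.readlines()):
            print(line)
```

Masking numbers also hides offsets and sizes, so once a suspicious message shows up it is worth going back to the unmasked logs to read the actual values.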

Are you using TCP or UDP? If TCP, try UDP.

Regards,
Bill
