Re: NFS Data CORRUPTION Between Linux and SunOS 5.5.1

Ben McCann (bmccann@indusriver.com)
Fri, 14 Aug 1998 08:35:36 -0400


Sorry, I forgot to include a description of the corruption itself.
I have built 'good' and 'bad' versions of the file and compared
them. The corruption always follows the same pattern and multiple
corruptions have been seen in the file:

1. The corruption always begins on a 4096 byte aligned offset
in the file (i.e. on a page boundary).

2. 1, 2, or 3 bytes of ZERO are written at the beginning of the page
and the rest of the page is SHIFTED by that amount. (When we first
saw this we thought a SCSI controller was failing on the Sun
server but we've not had any problems with data written via
NFS to this Sun from a bunch of WinNT boxes we have here. And,
as I said earlier, 2.1.84 works fine).

3. The location of the smashed page or pages is random. The first
is usually 4 or 5 megabytes into the file (which is 11M long) but
occasionally it is only 56K into the file.

4. The number of corrupted blocks in an 11M file is small, like
5 or 10.

Hope this provides a clue. I couldn't fathom why the data was
SHIFTED because that implies the page was COPIED someplace.
How many places in the NFS logic COPY entire pages? Perhaps that
is a place to look.
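
For anyone who wants to check their own files for the same pattern, here
is a minimal sketch of the comparison in Python. It is not the tool I
used; the function names are made up for illustration, and it assumes the
bytes pushed off the end of a shifted page are simply lost rather than
spilling into the next page.

```python
PAGE_SIZE = 4096  # page size from the description above


def find_shifted_pages(good: bytes, bad: bytes,
                       page: int = PAGE_SIZE, max_shift: int = 3):
    """Return a list of (offset, shift) for each page of `bad` that looks
    like the same page of `good` prefixed with 1..max_shift zero bytes and
    shifted right by that amount.

    Assumption: the tail of the page that is pushed past the page boundary
    is dropped, so only the first (page - shift) good bytes are compared.
    """
    hits = []
    for off in range(0, min(len(good), len(bad)), page):
        g = good[off:off + page]
        b = bad[off:off + page]
        if g == b:
            continue  # page is intact
        for shift in range(1, max_shift + 1):
            if len(b) <= shift:
                break
            zero_prefix = b[:shift] == bytes(shift)
            shifted_body = b[shift:] == g[:len(b) - shift]
            if zero_prefix and shifted_body:
                hits.append((off, shift))  # report the smallest matching shift
                break
    return hits


def scan_files(good_path: str, bad_path: str):
    """Hypothetical helper: load both copies and report the shifted pages."""
    with open(good_path, "rb") as f:
        good = f.read()
    with open(bad_path, "rb") as f:
        bad = f.read()
    return find_shifted_pages(good, bad)
```

Calling scan_files() on the good and bad copies returns the byte offset
of each damaged page and the size of the zero prefix. Pages that differ
in some other way are ignored, so an empty result does not prove the
file is clean.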

Now, a few questions:

1. How do I vary the NFS block size? (Larry asked that I try that).

2. How can I tell if I am using UDP versus TCP? I've done NOTHING
to explicitly configure NFS. We just use RedHat 5.0 out of the box
with the 2.1.X kernels.

3. Given I can determine UDP vs. TCP, how do I change it to the
other? Can I assume SunOS 5.5 supports both?

I'll run the NFS debug log experiment today and send you both the
diffs.

Last, we have CONFIG_NFS_FS and CONFIG_NFSD set up as kernel modules
and we have the RPMs 'nfs-server-2.2beta29-2' and
'nfs-server-clients-2.2beta29-2' installed.

-Ben McCann

-- 
Ben McCann                              Indus River Networks
                                        31 Nagog Park
                                        Acton, MA, 01720
email: bmccann@indusriver.com           web: www.indusriver.com 
phone: (978) 266-8140                   fax: (978) 266-8111
