Re: [OTP] SMP board recommendations?

From: David D.W. Downey (pgpkeys@hislinuxbox.com)
Date: Fri Feb 16 2001 - 00:03:27 EST


Thank you all for your response.

Andre (ASL), thanks for the assist. Laurie and Janine took care of me.
Asus CUV4X-D mobo with 1GB of buffered ECC RAM. I'm in the process of
transfering all the hardware to the new board. I'll let you know if this
new board solves the APIC errors and the random lockups under heavy I/O
problems.

I do have one more problem that I just can NOT track down.

2.4.1-ac10 kernel on the old Abit VP6 mobo. I'm getting curious errors
from the 2.4.1, 2.4.1-ac10, and 2.4.2-pre[#] kernels.

I've been using

dd if=/dev/zero of=/tmp/testdd.img bs=1024k count=1500

for testing of I/O on the various boards I have here. Now, the funny part
is that I get "file size limit exceeded" at around 1.0GB. I was getting
this under the 2.4.2-pre# kernels so i switched to straight 2.4.1 and got
the same problem. I switched to the 2.4.1-ac# line and the problem
disappeared. Guess what? It's baaacckk!

So, I did a strace of the dd command and got the following from it

execve("/bin/dd", ["dd", "if=/dev/zero", "of=/tmp/testing.img", "bs=1024k", "count=1500"], [/* 22 vars */]) = 0
brk(0) = 0x804e7b8
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=7852, ...}) = 0
old_mmap(NULL, 7852, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=1183326, ...}) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200\215"..., 4096) = 4096
old_mmap(NULL, 947548, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40017000
mprotect(0x400f7000, 30044, PROT_NONE) = 0
old_mmap(0x400f7000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xdf000) = 0x400f7000
old_mmap(0x400fb000, 13660, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x400fb000
close(3) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x400ff000
mprotect(0x40017000, 917504, PROT_READ|PROT_WRITE) = 0
mprotect(0x40017000, 917504, PROT_READ|PROT_EXEC) = 0
munmap(0x40015000, 7852) = 0
personality(PER_LINUX) = 0
getpid() = 195
brk(0) = 0x804e7b8
brk(0x804e7f0) = 0x804e7f0
brk(0x804f000) = 0x804f000
open("/dev/zero", O_RDONLY|O_LARGEFILE) = 3
open("/tmp/testing.img", O_RDWR|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
rt_sigaction(SIGINT, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGINT, {0x804ada8, [], 0x4000000}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGQUIT, {0x804ada8, [], 0x4000000}, NULL, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGPIPE, {0x804ada8, [], 0x4000000}, NULL, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGUSR1, {0x804ae70, [], 0x4000000}, NULL, 8) = 0
old_mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40100000
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576

********* BIG ASS SNIP **********

read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = -1 EFBIG (File too large)
--- SIGXFSZ (File size limit exceeded) ---
+++ killed by SIGXFSZ +++

Now, notice the beginning file creation call. It starts out with
O_LARGEFILE but ends with EFBIG. Since I'm not totally familiar with the
kernel code I could be wrong on my next statement and if I am, please tell
me, but it looks like it changes the file creation call from LARGEFILE to
EFBIG (or is this just the error call itself?)

Now, the kernel is supposed to be able to handle creating a 4TB file(?),
so 1.0GB should be nothing to it. NOTHING changed betwen it working and
not working. No hardware changes, no software additions, no recompiles of
existing applications/daemons.. nothing.

So, my question is now one of "What gives?" Any clues on how I can check
to see what's going wrong? Is my gut feeling that it's changing the file
type wrong? (IIUC, there are different open() calls for different size
files? No, I have nothing to base this one, just something I flashed on
and thought might explain the problem.)

I'm learning here guys, so please be gentle. You folks are the only ones I
have with the experience to tell me when I'm just fscked in the head and
when I'm bang on.

-- 
David D.W. Downey - RHCE
Consulting Engineer
Ensim Corporation - Sunnyvale, CA

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Feb 23 2001 - 21:00:12 EST