Re: [OOPS] 2.6.9-rc4, dual Opteron, NUMA, 8GB

From: Brad Fitzpatrick
Date: Wed Oct 13 2004 - 15:35:35 EST


On Wed, 13 Oct 2004, Randy.Dunlap wrote:

> Brad Fitzpatrick wrote:
> > On Wed, 13 Oct 2004, Jeff Garzik wrote:
> >
> >
> >>Brad Fitzpatrick wrote:
> >>
> >>>I'm reporting an oops. Details follow.
> >>>
> >>>I have two of these machines. I will happily be anybody's guinea pig
> >>>to debug this. (more details, access to machine, try patches, kernels...)
> >>>Machines aren't in production.
> >>>
> >>>- Brad
> >>>
> >>>
> >>>Kernel: 2.6.9-rc4 vanilla (.config below)
> >>>
> >>>Hardware: IBM eServer 325, Dual Opteron 8GB ram (more info below)
> >>>
> >>>Pre-crash and crash:
> >>>
> >>>a1:~# mke2fs /dev/mapper/raid10-data
> >>>mke2fs 1.35 (28-Feb-2004)
> >>>Filesystem label=
> >>>OS type: Linux
> >>>Block size=4096 (log=2)
> >>>Fragment size=4096 (log=2)
> >>>25608192 inodes, 51200000 blocks
> >>>2560000 blocks (5.00%) reserved for the super user
> >>>First data block=0
> >>>1563 block groups
> >>>32768 blocks per group, 32768 fragments per group
> >>>16384 inodes per group
> >>>Superblock backups stored on blocks:
> >>> 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
> >>> 4096000, 7962624, 11239424, 20480000, 23887872
> >>>
> >>>Writing inode tables: 1091/1563
> >>>Message from syslogd@localhost at Wed Oct 13 11:46:01 2004 ...
> >>>localhost kernel: Oops: 0000 [1] SMP
> >>>
> >>>Message from syslogd@localhost at Wed Oct 13 11:46:01 2004 ...
> >>>localhost kernel: CR2: 0000000000001770
> >>
> >>
> >>What's your block device configuration? What block devices are sitting
> >>on top of what other block devices?
> >
> >
> > /dev/mapper/raid10-data is a LV taking 200GB of a 280GB VG ("raid10") with
> > a single PV in it: /dev/sdb1 -- ips driver, IBM ServeRAID 6M card,
> > representing a RAID 10 atop 8 SCSI disks.
> >
> > I just made a new kernel without NUMA and made a filesystem on /dev/sdb1
> > directly instead of using LVM and it worked fine, if a little slowly.
> >
> > Now that I know it /can/ work, I'll try and narrow down whose fault it is:
> > NUMA or LVM.
>
> Very similar to
> http://marc.theaimsgroup.com/?l=linux-kernel&m=109328505204081&w=2
> and its follow-up:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=109330259511819&w=2
>
> but no solutions there.

Well, good to know I'm not alone? :-)

I was just about to mail and report that disabling NUMA does help:

NUMA + mke2fs on LVM: OOPS (mailed earlier)
no NUMA + mke2fs on LVM: okay
NUMA + mke2fs on sdb1: OOPS (below)
no NUMA + mke2fs on sdb1: okay

no NUMA + mount e2fs on LVM: okay
no NUMA + mount e2fs on sdb1: okay
NUMA + mount e2fs on LVM: okay
NUMA + mount e2fs on sdb1: untested, assume okay
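
A minimal sketch of how the four mke2fs rows above isolate the culprit. It only tabulates the results already listed; the names `results` and `factor_predicts_outcome` are hypothetical, chosen for illustration, and nothing here was run on the machine itself.

```python
# mke2fs outcomes from the matrix above, keyed by (NUMA enabled?, target device).
results = {
    (True, "LVM"): "oops",
    (False, "LVM"): "okay",
    (True, "sdb1"): "oops",
    (False, "sdb1"): "okay",
}


def factor_predicts_outcome(index):
    """Return True if the factor at `index` in the key fully determines the outcome."""
    outcomes_by_value = {}
    for key, outcome in results.items():
        outcomes_by_value.setdefault(key[index], set()).add(outcome)
    # The factor predicts the outcome if each of its values maps to one outcome only.
    return all(len(outcomes) == 1 for outcomes in outcomes_by_value.values())


numa_predicts = factor_predicts_outcome(0)
device_predicts = factor_predicts_outcome(1)

print("NUMA setting predicts the oops:", numa_predicts)
print("LVM vs. raw sdb1 predicts the oops:", device_predicts)
```

In this table the outcome follows the NUMA setting and is independent of whether LVM sits on top of sdb1, which is why LVM can be set aside for now.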


Oops when running mke2fs on /dev/sdb1, with NUMA enabled:

Oct 13 13:24:37 localhost kernel: Unable to handle kernel paging request at 0000000000001770 RIP:
Oct 13 13:24:37 localhost kernel: <ffffffff8015efe4>{kmem_getpages+132}
Oct 13 13:24:37 localhost kernel: PML4 1f8fe6067 PGD 1f8fef067 PMD 0
Oct 13 13:24:37 localhost kernel: Oops: 0000 [1] SMP
Oct 13 13:24:37 localhost kernel: CPU 0
Oct 13 13:24:37 localhost kernel: Modules linked in: af_packet tsdev mousedev joydev usbhid ohci_hcd hw_random amd74xx evdev tg3 dm_mod ide_generic ide_cd ide_core cdrom rtc ext3 jbd mbcache sd_mod ips mptscsih mptbase scsi_mod unix
Oct 13 13:24:37 localhost kernel: Pid: 3145, comm: mke2fs Not tainted 2.6.9-rc4
Oct 13 13:24:37 localhost kernel: RIP: 0010:[kmem_getpages+132/432] <ffffffff8015efe4>{kmem_getpages+132}
Oct 13 13:24:37 localhost kernel: RSP: 0018:00000101f81b7aa8 EFLAGS: 00010213
Oct 13 13:24:37 localhost kernel: RAX: ffffffff7fffffff RBX: 00000101fffc9680 RCX: 0000000000000000
Oct 13 13:24:37 localhost kernel: RDX: 0000010000011700 RSI: 00000100000119c0 RDI: 0000010000012500
Oct 13 13:24:37 localhost kernel: RBP: 00000101fffc9680 R08: 000001016bc01000 R09: 00000101fffc96e8
Oct 13 13:24:37 localhost kernel: R10: 0000000000000000 R11: 00000000fffffffa R12: 00000101fffc9680
Oct 13 13:24:37 localhost kernel: R13: 0000000000000000 R14: 00000101fffc9728 R15: 0000000000000001
Oct 13 13:24:37 localhost kernel: FS: 0000002a95ddb4a0(0000) GS:ffffffff803df300(0000) knlGS:0000000000000000
Oct 13 13:24:37 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 13 13:24:37 localhost kernel: CR2: 0000000000001770 CR3: 0000000000101000 CR4: 00000000000006e0
Oct 13 13:24:37 localhost kernel: Process mke2fs (pid: 3145, threadinfo 00000101f81b6000, task 00000101fe9b4030)
Oct 13 13:24:37 localhost kernel: Stack: 000001016c3fa000 0000000000000000 0000000000000050 ffffffff8015ff6e
Oct 13 13:24:37 localhost kernel: 0000005000000010 000000000000003c 00000100fbf6b000 00000101fffc9680
Oct 13 13:24:37 localhost kernel: 00000101fffc96c8 00000101fffc9728
Oct 13 13:24:37 localhost kernel: Call Trace:<ffffffff8015ff6e>{cache_grow+190} <ffffffff801601c6>{cache_alloc_refill+422}
Oct 13 13:24:37 localhost kernel: <ffffffff801604b6>{kmem_cache_alloc+54} <ffffffff8017d761>{alloc_buffer_head+17}
Oct 13 13:24:37 localhost kernel: <ffffffff8017aeba>{create_buffers+42} <ffffffff8017b884>{create_empty_buffers+20}
Oct 13 13:24:37 localhost kernel: <ffffffff8017bcdf>{__block_prepare_write+175} <ffffffff80180190>{blkdev_get_block+0}
Oct 13 13:24:37 localhost kernel: <ffffffff8017c78a>{block_prepare_write+26} <ffffffff80158dc4>{generic_file_buffered_write+404}
Oct 13 13:24:37 localhost kernel: <ffffffff80193fae>{inode_update_time+158} <ffffffff801594dd>{generic_file_aio_write_nolock+765}
Oct 13 13:24:37 localhost kernel: <ffffffff801595b5>{generic_file_write_nolock+165} <ffffffff80134ef3>{__wake_up+67}
Oct 13 13:24:37 localhost kernel: <ffffffff802a46fe>{thread_return+41} <ffffffff80136890>{autoremove_wake_function+0}
Oct 13 13:24:37 localhost kernel: <ffffffff8018128a>{blkdev_file_write+26} <ffffffff801789e4>{vfs_write+228}
Oct 13 13:24:37 localhost kernel: <ffffffff80178b13>{sys_write+83} <ffffffff8011195a>{system_call+126}
Oct 13 13:24:37 localhost kernel:
Oct 13 13:24:37 localhost kernel:
Oct 13 13:24:37 localhost kernel: Code: 48 8b 91 70 17 00 00 76 07 b8 00 00 00 80 eb 0a 48 b8 00 00
Oct 13 13:24:37 localhost kernel: RIP <ffffffff8015efe4>{kmem_getpages+132} RSP <00000101f81b7aa8>
Oct 13 13:24:37 localhost kernel: CR2: 0000000000001770
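
A rough sketch of how the faulting address can be cross-checked against the dump above. It assumes the "Code:" bytes begin at the faulting RIP, and it hand-decodes only this one instruction form (REX.W + 8B /r with a 32-bit displacement); it is not a general disassembler. The names `CODE`, `REGS`, and `REG_VALUES` are illustrative.

```python
import struct

# First 7 bytes of the "Code:" line, assumed to start at the faulting RIP.
CODE = bytes.fromhex("48 8b 91 70 17 00 00")

# 64-bit register names indexed by their 3-bit encoding (no REX.R/REX.B extension).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

# Register value taken from the oops dump (RCX: 0000000000000000).
REG_VALUES = {"rcx": 0x0}

rex, opcode, modrm = CODE[0], CODE[1], CODE[2]
assert rex == 0x48     # REX prefix with only W set: 64-bit operand size
assert opcode == 0x8B  # MOV r64, r/m64

mod = modrm >> 6
reg = (modrm >> 3) & 0b111
rm = modrm & 0b111
assert mod == 0b10 and rm != 0b100  # base register + disp32, no SIB byte

# The displacement is a little-endian signed 32-bit value after the ModRM byte.
disp = struct.unpack_from("<i", CODE, 3)[0]
dest, base = REGS[reg], REGS[rm]
fault_addr = (REG_VALUES[base] + disp) & 0xFFFFFFFFFFFFFFFF

print(f"mov {disp:#x}(%{base}),%{dest}")
print(f"faulting address: {fault_addr:#018x}")
```

Under those assumptions the instruction is a load from offset 0x1770 off RCX, and with RCX at zero the computed address equals the reported CR2. That is consistent with reading a field at offset 0x1770 through a NULL structure pointer, though which structure that is cannot be told from the dump alone.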



Randy, if you're interested and you're actually at OSDL Beaverton, I'm
just across the street from you. I could carry this 1U server and 3U
drive cabinet over to you! :)

Who's responsible for the K8_NUMA stuff? I'd love to work with them to
narrow this down.

- Brad


>
> --
> ~Randy
> MOTD: Always include version info.
> (Again. Sometimes I think ln -s /usr/src/linux/.config .signature)
>
>