Re: Software raid in 2.2.16 broken?

From: Camm Maguire (camm@enhanced.com)
Date: Thu Jul 27 2000 - 11:47:42 EST


There is an errata patch on Alan Cox's website that fixes this. Its a
one liner.

Take care,

"nate@firetrail.com" <nate@firetrail.com> writes:

> I sent this email to debian-user & debian-devel@lists.debian.org had 1
> other user that had this same problem. It is quite severe.
>
> below is the original email about the kernel puking when running e2fsck on
> a software raid(level 1) device:
>
>
> -- begin original email --
> To: Debian user mailinglist <debian-user@lists.debian.org>
> Subject: e2fsck+raid+linux 2.2.16 = crash?
> From: "nate@firetrail.com" <nate@firetrail.com>
> Date: Fri, 14 Jul 2000 15:40:34 -0700 (PDT)
> cc: debian-devel@lists.debian.org
>
>
>
> Today...
>
> One of my systems crashed due to SCSI disk errors.
>
> when it came back up it crashed again. so i had to go to the site (2 hour
> drive!@) and took a look at it.
>
> it seems when the system tries to run e2fsck on the raid set the system
> crashes with the message:
>
> Jul 14 13:36:30 galactica kernel: VFS: grow_buffers: size = 16384
> Jul 14 13:36:31 galactica last message repeated 665 times
> Jul 14 13:36:31 galactica kernel: 384
> Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
> Jul 14 13:36:31 galactica last message repeated 814 times
> Jul 14 13:36:31 galactica kernel: 384
> Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
> Jul 14 13:36:31 galactica last message repeated 468 times
>
> it goes on and on and on..(the people on site left it in that state for a
> good hour and a half until i got there)
>
> This did not happen with 2.2.15. This is also the first time i have run
> e2fsck on the raid drive since upgrading to 2.2.16. As a result i have
> been forced to mount the raid array unclean just so i could get the system
> up and running so the rest of the staff at the site could go home(its
> co-located). I didn't think to reboot to 2.2.15 and run e2fsck on it
> before i left i was in a real rush.
>
> System configuration:
>
> Asus P2L97-DS (this board does not have scsi despite the "DS" model code)
> Dual P2-233
> 128MB PC100 SDRAM x 3 (384MB total)
> Matrox G100 AGP Video
> Adaptec AHA 2940UW PCI
> 3Com 3C905C w/drivers from www.3com.com
> Generic ATAPI 32x CDROM
> Root device: IBM DDRS-34560D 4.3GB
> Raid system: IBM-DNES-30917OW 9.1GB (x2) Software raid mode 1
> Additional Storage: QUANTUM VIKING 4.5WSE 4.5GB (<- the cause of the
> initial crash)
>
>
> Linux Distribution: Debian GNU/Linux 2.1r4
> Installed: Mid-march
> Installed with kernel: 2.2.14+ow1 (www.openwall.com/linux)
> Currently running: 2.2.16+ow1
>
> Raid version: 0.36.4
>
> this happened again and again, and i finally traced it down to this by
> disabling automounting of /dev/md0 in /etc/fstab and the system booted
> fine. ckraid was run multiple times with no crash(it ran everytime the
> system crashed due to other reasons). Once i tried to run e2fsck on
> /dev/md0 the errors(above) flooded the screen until i ctrl-alt-del.
>
> I'm not a kernel hacker but i was curious if this is a known issue? i do
> read kernel traffic every week but i haven't seen anything specifically
> realted to raid mentioned (that i can remember). I think what i will do is
> download the data off the raid set, and reformat it so it can be
> "clean" again. or reboot to 2.2.15 and run e2fsck on it..
>
> i can try to provide more info if needed, let me know, i just dont want to
> have to drive up there again if i can avoid it:)
>
> help!
>
> nate
>
> :::
> http://www.aphroland.org/
> http://www.linuxpowered.net/
> aphro@aphroland.org
> 3:24pm up 1:30, 1 user, load average: 0.00, 0.01, 0.00
>
> -- end original email --
>
> one user on debian-user pointed me to a patch:
>
> http://www.linux.org.uk/VERSION/relnotes.2216.html
>
> but it appears this code is already in 2.2.16. This does not happen in
> 2.2.15.
>
> any ideas? btw, im not on the list so if you could please cc: me when
> responding.
>
> thanks!
>
> nate
>
>
> :::
> http://www.aphroland.org/
> http://www.linuxpowered.net/
> aphro@aphroland.org
> 3:36pm up 23:04, 2 users, load average: 0.04, 0.01, 0.00
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>
>

-- 
Camm Maguire			     			camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jul 31 2000 - 21:00:24 EST