Re: Possible dcache BUG

From: Gene Heskett
Date: Mon Aug 09 2004 - 23:16:00 EST


On Sunday 08 August 2004 14:39, Andrew Morton wrote:
>Gene Heskett <gene.heskett@xxxxxxxxxxx> wrote:
>> On Thursday 05 August 2004 23:24, Linus Torvalds wrote:
>>
>> [...]
>>
>> >I'll commit the obvious one-liner fix, since it might explain
>> > _some_ problems people have seen.
>> >
>> > Linus
>>
>> I had to reboot late last night, out of memory and things (like
>> mozilla (1.7.2) were dying, but nothing in the logs.
>
>Please wait for it to happen again, then send the contents of
>/proc/meminfo, /proc/slabinfo and then do
>
> su
> dmesg -c
> echo m > /proc/sysrq-trigger
> dmesg > foo
>
>and send foo as well.

I just had to reboot again. Top was showing about 50 megs free, and
there was about 60 megs in the swap. Top wasn't showing anything
else of interest that I noted.

I've been gone all day, a long day at that, 12 hours. We had another
blowup in the hi voltage at the tv transmitter, and we'll be sometime
tomorrow getting things back to normal there. Its 40 years old, and
quite far up the far end of the "bathtub curve".

I left about 10:15 and came back in about 22:30. A friend had been
trying to reach me over an alsa problem, and I'd opened a shell and
was showing him how the new 2.6 modprobe.conf worked. When we were
done, I hit a q to quit less, and (surprise) the whole shell went
away, and I could not start another shell, each attempt being
reported as an error 5 on the kickstart panel at the bottom of the
screen after the new window opened and reclosed in about 100
milliseconds per attempt. I quit the top program to free that shell,
and thinking maybe I was being attacked, entered a 'w' at the prompt,
and that shell went away too, with the same error. That left me with
the tail on the log, which at that point still wasn't showing me
anything but a samba restart I do once daily else it dies from a
profound lack of interest anyway.

I right clicked on the screen and selected quit X.

It quit, but then a trap error was reported.

I typed "reboot" and the machine reported no more processes at this
run level and was then DOA, requireing a tap on the reset button to
bring it back to life. On the subsequent e2fsck's, /dev/hda8 had
this error:

i_dir_acl for inode 654880 (/lib/local/ar_YE is 42752 but s/b zero.

And then dropped me to a shell to run e2fsck without any options.
Which I did. Eventually it asked me if I wanted to clear that inode,
so I answered 'y' and it finished without any other errors, but when
I did the ctl-d to reboot, it still wanted to do an e2fsck on
everything, which passed. So now I'm rebooted, but without anything
of meaning (there is nothing in the logs) to report. Any evidence of
the debacle is now gone.

Also, during the reboot, I'm blind from "ok, booting the kernel" until
the line in something that sets the default font is executed, setting
it to "lat0_sun16" at which time I have readable info on the screen
again. I don't recall seeing that particular font mentioned in a
make xconfig, so I've no idea how to make it use it from square one
so I can read the dmesg as it goes by the first time. I have
iso-8859-1 compiled in, along with codepage 437 for US useage, with
everything else as modular.

How can I fix this "blind" time?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.24% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/