Re: swsusp is at it... again

From: fchabaud@free.fr
Date: Thu Mar 07 2002 - 04:54:22 EST



On 5 Mar, Pavel Machek wrote:
> Hi!
>
>> > After about 20 resume cycles (compiled kernel with swsusp making
>> > machine suspend/resume) I got that nasty FS corruption, again.
>> >
>> > So...
>> >
>> > 1) Maybe your ext3 patches are not at fault.
>>
>> I suspect all this comes from suspension failure and immediate resume. I
>> have re-enabled your panic! I believe that if a task isn't stopped
>
> Okay, I think I can try that. [Do you think you can send me your diffs
> to ext3?]

I attach my diff relative to your patch, which I applied to 2.4.18.

>
>> I also made a modification in stopping tasks, to stop normal tasks and then
>> kernel threads (I had to add a new PF_KERNTHREAD flag). Perhaps the bug
>> has to do with the *order* of stopping processes (I think of that
>> because kernel messages are written to log files: what happens if
>> the kjournald thread is stopped and a task still writes?)
>
> Nothing that bad should happen... kjournald is only _delayed_ right?
> And it could be delayed by scheduling as well.

Actually, I'm not sure it's that simple. I have spent hours trying to
figure out exactly what's happening, but I'm not confident in that
assumption. All transactions in the journal have an expiration time based
on jiffies, and I'm not sure jiffies are correctly restored on resume. Are
they? Maybe this transaction expiration is handled in a way that is not
innocuous in our context.

>
>> > 2) Be careful using swsusp patch. Real careful.
>> >
>> > 3) Don't trust fsck. At this kind of corruption, e2fsck 1.19 will
>> > report "clean" but will not repair it, putting your fs into
>> > self-destruct mode. Bad bad. It's fixed in new versions. Always run
>> > fsck twice, second time with -f.
>>
>> tune2fs -e panic
>> is also a good precaution at least for ext3 filesystems because all my
>> root inode crashes were preceded by ext3-error messages and these
>> messages were sometimes several hours before effective crash.
>
> Yes.

Unfortunately, that was not sufficient, at least in my last crash. The good
point is that this time I had to reinstall, so the filesystem should now be
clean: no trace of earlier crashes should remain ;-)

Last but not least, I won't be able to work on swsusp for a while due to
other priorities :-(

--
Florent Chabaud
http://www.ssi.gouv.fr | http://fchabaud.free.fr




