Re: [Patch] jbd commit code deadloop when installing Linux

From: Zou Nan hai
Date: Wed Jun 28 2006 - 04:35:30 EST


On Wed, 2006-06-28 at 16:04, Andrew Morton wrote:
> On 28 Jun 2006 14:02:57 +0800
> Zou Nan hai <nanhai.zou@xxxxxxxxx> wrote:
>
> > > > However, I think cond_resched_lock and cond_resched_softirq also need
> > > > fixing to make the semantics consistent.
> > > >
> > > > Please check the following patch.
> > > >
> > >
> > > Ah. I think the return value from these functions should mean "something
> > > disruptive happened", if you like.
> > >
> > > See, the callers of cond_resched_lock() aren't interested in whether
> > > cond_resched_lock() actually called schedule(). They want to know whether
> > > cond_resched_lock() dropped the lock. Because if the lock was dropped, the
> > > caller needs to take some special action, regardless of whether schedule()
> > > was finally called.
> > >
> > > So I think the patch I queued is OK, agree?
> >
> > I am afraid that code which checks cond_resched_lock(), such as
> > log_do_checkpoint() in fs/jbd/checkpoint.c, may fall into an endless
> > retry under some conditions. Could it?
>
> Oh crap, yes. If need_resched() and system_state==SYSTEM_BOOTING then
> cond_resched_lock() will drop the lock but won't schedule. So it'll return
> true but won't clear need_resched() and the caller will lock up.
>
> So if cond_resched_foo() ends up dropping the lock it _must_ call
> schedule() to clear need_resched().
>
> So, how about this (it needs some code comments!)
>
>

The patch works in the install test environment.
However, I still have a concern about cond_resched_lock(): on a UP
kernel it returns 1 if schedule() happened, even though it does not
actually drop any lock. Those semantics seem to differ from the SMP kernel.
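
To make the UP/SMP difference concrete, here is a minimal userspace
sketch. It is not kernel code: the struct and function names are
hypothetical, and the UP behaviour is modelled only from the description
in this message (return 1 after scheduling, no real lock released).

```c
#include <stdbool.h>

/* Hypothetical userspace model, not kernel code. */
struct resched_result {
    int ret;            /* value the modelled cond_resched_lock() returns */
    bool lock_dropped;  /* whether a real lock was released and retaken */
};

/*
 * smp == false models a UP kernel, where (per the message above) no
 * real lock is dropped even though the function still reports 1.
 */
static struct resched_result model_cond_resched_lock(bool smp, bool need_resched)
{
    struct resched_result r = { 0, false };

    if (need_resched) {
        r.lock_dropped = smp;  /* assumption: only SMP has a lock to drop */
        r.ret = 1;             /* schedule() happened */
    }
    return r;
}
```

In this model both configurations return 1 when a reschedule happens, but
only the SMP case sets lock_dropped, which is the mismatch a caller
relying on "1 means the lock was dropped" would see.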

That said, the only code I found that checks the return value of
cond_resched_lock() is in fs/jbd/checkpoint.c...
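
The lockup discussed in the quoted text can be sketched with a small
userspace model of such a caller. Everything here is hypothetical (the
names are not kernel APIs, and the second variant is only one possible
way to follow the rule "if the lock is dropped, schedule() must run",
not necessarily what the queued patch does). The loop is bounded by
max_iters so the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sched_model {
    bool need_resched;  /* stands in for need_resched() */
    bool booting;       /* stands in for system_state == SYSTEM_BOOTING */
};

/* Problem case: drops the lock but skips schedule() while booting. */
static bool drop_without_schedule(struct sched_model *m)
{
    if (!m->need_resched)
        return false;
    /* the lock would be dropped here */
    if (!m->booting)
        m->need_resched = false;  /* schedule() ran and cleared the flag */
    /* the lock would be retaken here */
    return true;                  /* reports "lock was dropped" */
}

/* Assumed alternative: only drop the lock when schedule() can run. */
static bool drop_only_if_schedule(struct sched_model *m)
{
    if (!m->need_resched || m->booting)
        return false;             /* cannot schedule, so keep the lock */
    m->need_resched = false;      /* lock dropped, schedule() ran */
    return true;
}

/* Caller that restarts its scan whenever the lock was reported dropped. */
static int retry_scan(struct sched_model *m,
                      bool (*resched)(struct sched_model *), int max_iters)
{
    int iters = 0;

    assert(max_iters > 0);
    while (iters < max_iters) {
        iters++;
        if (!resched(m))
            break;                /* scan finished without interruption */
    }
    return iters;                 /* max_iters means the loop never settled */
}
```

With need_resched and booting both set, retry_scan() with
drop_without_schedule() keeps restarting until max_iters is reached,
while drop_only_if_schedule() lets it finish on the first pass.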

Zou Nan hai
