Re: [PATCH] Fix commit of ordered data buffers

From: Jan Kara
Date: Wed Sep 14 2005 - 07:05:43 EST


> Jan Kara <jack@xxxxxxx> wrote:
> >
> > When a buffer is locked it does not mean that write-out is in progress. We
> > have to check if the buffer is dirty and if it is we have to submit it
> > for write-out. We unconditionally move the buffer to t_locked_list so
> > that we don't confuse an unprocessed buffer with one not yet given to
> > ll_rw_block(). This subtly changes the meaning of buffer states in
> > t_locked_list - an unlocked buffer (for list users other than
> > journal_commit_transaction()) does not mean it has been written. But
> > only journal_unmap_buffer() cares, and it now checks that the buffer
> > is not dirty.
>
> Seems complex. It means that t_locked_list takes on an additional (and
> undocumented!) meaning.
Sorry, if we agree on some final form I'll add the appropriate
comment to the header file.

> Also, I don't think it works. See ll_rw_block()'s handling of
> already-locked buffers.
We send it to disk with SWRITE - hence ll_rw_block() waits for the buffer
lock for us. Or do you have something else in mind?
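
To make the WRITE vs. SWRITE distinction concrete, here is a minimal
userspace sketch of how I understand ll_rw_block() to treat an
already-locked buffer. It is a model only, not kernel code: struct
model_bh, model_ll_rw_block() and the result codes are made-up names, and
"waiting" is simulated by simply clearing the lock flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a buffer_head; not the kernel struct. */
struct model_bh {
	bool locked;
	bool dirty;
};

enum model_rw { MODEL_WRITE, MODEL_SWRITE };
enum model_result { MODEL_SKIPPED, MODEL_CLEAN, MODEL_SUBMITTED };

/* Stands in for sleeping in lock_buffer() until the current holder unlocks. */
static void model_wait_for_unlock(struct model_bh *bh)
{
	bh->locked = false;
}

enum model_result model_ll_rw_block(enum model_rw rw, struct model_bh *bh)
{
	if (bh->locked) {
		if (rw == MODEL_WRITE)
			return MODEL_SKIPPED;	/* trylock failed: buffer is left alone */
		model_wait_for_unlock(bh);	/* SWRITE: wait for the lock instead */
	}
	bh->locked = true;
	if (!bh->dirty) {
		bh->locked = false;	/* nothing to write: unlock and move on */
		return MODEL_CLEAN;
	}
	bh->dirty = false;	/* test-and-clear dirty, then submit */
	assert(bh->locked);	/* buffer stays locked until I/O completion */
	return MODEL_SUBMITTED;
}
```

In this model a locked-and-dirty buffer passed with MODEL_WRITE comes back
MODEL_SKIPPED and stays dirty, while the same buffer passed with
MODEL_SWRITE is submitted.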

> An alternative is to just lock the buffer in journal_commit_transaction(),
> if it was locked-and-dirty. And remove the call to ll_rw_block() and
> submit the locked buffers by hand.
Yes, this has the advantage that we can move the buffer to t_locked_list
at the right time and so we don't change the semantics of t_locked_list.
OTOH the locking will be a bit more complicated (we'd need to acquire and
drop j_list_lock for almost every bh, while currently we do it only once
per batch) - the code would have to be like:

spin_lock(&journal->j_list_lock);
while (commit_transaction->t_sync_datalist) {
	jh = commit_transaction->t_sync_datalist;
	bh = jh2bh(jh);
	journal_grab_journal_head(bh);
	if (buffer_dirty(bh)) {
		get_bh(bh);
		/* lock_buffer() may sleep, so drop the list lock first */
		spin_unlock(&journal->j_list_lock);
		lock_buffer(bh);
		if (buffer_dirty(bh)) {
			/* submit the buffer */
		}
		jbd_lock_bh_state(bh);
		spin_lock(&journal->j_list_lock);
		/* Check that somebody did not move the jh elsewhere */
	} else {
		if (!inverted_lock(journal, bh))
			goto write_out_data;
	}
	__journal_temp_unlink_buffer(jh);
	__journal_file_buffer(jh, commit_transaction, BJ_Locked);
	journal_put_journal_head(bh);
	jbd_unlock_bh_state(bh);
}

If you prefer something like this I can code it up...
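
The "check that somebody did not move the jh elsewhere" step is needed
because j_list_lock is not held while we sleep in lock_buffer(). Purely as
an illustration, here is a single-threaded userspace sketch of that
drop/retake/recheck pattern. All names (struct race_jh,
race_refile_after_sleep(), race_unfile()) are hypothetical, and the
concurrent actor is simulated by a callback run inside the unlocked window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which list the (hypothetical) journal head currently sits on. */
enum race_list { RACE_NONE, RACE_SYNC_DATA, RACE_LOCKED };

struct race_jh {
	enum race_list list;
};

typedef void (*race_fn)(struct race_jh *jh);

/* Example race: another path unfiles the buffer while no list lock is held. */
void race_unfile(struct race_jh *jh)
{
	jh->list = RACE_NONE;
}

/* Returns true if we refiled the buffer, false if it moved under us. */
bool race_refile_after_sleep(struct race_jh *jh, race_fn race)
{
	assert(jh->list == RACE_SYNC_DATA);	/* found under the list lock */

	/* List lock dropped here; lock_buffer() may sleep in this window. */
	if (race != NULL)
		race(jh);

	/* List lock retaken: recheck before touching the list linkage. */
	if (jh->list != RACE_SYNC_DATA)
		return false;
	jh->list = RACE_LOCKED;
	return true;
}
```

With no race the buffer ends up on RACE_LOCKED; with race_unfile() the
function backs off and leaves the buffer where the other path put it.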

> That would mean that if someone had redirtied a buffer which was on
> t_sync_datalist *while* it was under writeout, we'd end up waiting on that
> writeout to complete before submitting more I/O. But I suspect that's
> pretty rare.
>
> One thing which concerns me with your approach is livelocks: if some process
> sits in a tight loop writing to the same part of the same file, will it
> cause kjournald to get stuck?
No, because as soon as we find the buffer in t_sync_datalist we move
it to t_locked_list and submit it for IO, so redirtying it afterwards
does not make us visit it again - this case is one reason why I gave
t_locked_list that new meaning.
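
A toy model of why the loop terminates, assuming a writer that redirties
every buffer right after we process it. This is a userspace sketch with
made-up names (struct commit_buf, commit_data()), not the JBD code: each
buffer is refiled off the sync-data list before submission, so it is
visited at most once no matter how often it is redirtied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum commit_list { COMMIT_SYNC_DATALIST, COMMIT_LOCKED_LIST };

/* Hypothetical stand-in for a data buffer attached to the transaction. */
struct commit_buf {
	enum commit_list list;
	bool dirty;
	int submissions;
};

/* Return the first buffer still on the sync-data list, or NULL if none. */
static struct commit_buf *commit_next_sync(struct commit_buf *bufs, int n)
{
	for (int i = 0; i < n; i++)
		if (bufs[i].list == COMMIT_SYNC_DATALIST)
			return &bufs[i];
	return NULL;
}

/* Process the sync-data list; returns the number of loop iterations. */
int commit_data(struct commit_buf *bufs, int n)
{
	struct commit_buf *b;
	int iterations = 0;

	while ((b = commit_next_sync(bufs, n)) != NULL) {
		b->list = COMMIT_LOCKED_LIST;	/* refile before submitting */
		if (b->dirty) {
			b->dirty = false;
			b->submissions++;	/* one write-out per buffer */
		}
		b->dirty = true;	/* hostile writer redirties at once */
		iterations++;
		assert(iterations <= n);	/* bounded by the list length */
	}
	return iterations;
}
```

Every buffer is left dirty again by the simulated writer, yet the loop runs
exactly once per buffer because nothing puts a buffer back on the sync-data
list.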

> The problem we have here is "was the buffer dirtied before this commit
> started, or after?". In the former case we are obliged to write it. In
> the latter case we are not, and in trying to do this we risk livelocking.

Honza
--
Jan Kara <jack@xxxxxxx>
SuSE CR Labs