Re: ext3 performance bottleneck as the number of spindles gets large

From: Stephen C. Tweedie (sct@redhat.com)
Date: Thu Jun 20 2002 - 04:54:49 EST


Hi,

On Wed, Jun 19, 2002 at 05:54:46PM -0700, Andrew Morton wrote:

> The vague plan there is to replace lock_kernel with lock_journal
> where appropriate. But ext3 scalability work of this nature
> will be targetted at the 2.5 kernel, most probably.

I think we can do better than that, with care. lock_journal could
easily become a read/write lock to protect the transaction state
machine, as there's really only one place --- the commit thread ---
where we end up changing the state of a transaction itself (eg. from
running to committing). For short-lived buffer transformations, we
already have the datalist spinlock.

There are a few intermediate types of operation, such as the
do_get_write_access. That's a buffer operation, but it relies on us
being able to allocate memory for the old version of the buffer if we
happen to be committing the bh to disk already. All of those cases
are already prepared to accept BKL being dropped during the memory
allocation, so there's no problem with doing the same for a short-term
buffer spinlock; and if the journal_lock is only taken shared in such
places, then there's no urgent need to drop that over the malloc.

Even the commit thread can probably avoid taking the journal lock in
many cases --- it would need it exclusively while changing a
transaction's global state, but while it's just manipulating blocks on
the committing transaction it can probably get away with much less
locking.

Cheers,
 Stephen
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jun 23 2002 - 22:00:21 EST