Re: minor JBD tunings

From: Andrew Morton (akpm@osdl.org)
Date: Fri Jul 04 2003 - 12:36:09 EST


Alex Tomas <bzzz@tmi.comex.ru> wrote:
>
> I'm trying to improve JBD performance.

Not a bad thing to be doing ;)

I made quite a few changes, mainly removing some unused function args.

This code in log_do_checkpoint() confuses me:

                transaction = journal->j_checkpoint_transactions;
                while (transaction->t_tid <= start_tid) {
                        transaction = transaction->t_cpnext;
                        if (transaction == journal->j_checkpoint_transactions)
                                goto finish;
                }

what are you actually searching for here? What is the intent?

If seems to be saying "I want to find the oldest transaction to checkpoint,
but I refuse to checkpoint the most recently checkpointed transaction",
yes?

If so, then it could be simplified to:

        transaction = journal->j_checkpoint_transactions->t_cpprev;

could it not?

You'll never actually wrap around to the most-recently-checkpointed
transaction in here because that would mean that there is only a single
checkpoint transaction. And when there's only a single checkpointed
transaction this function will never be called, because there is sufficient
log space.

Unless someone diddles with the journal parameters to make the max
transaction size larger than journal_size/4. So to be robust in the
presence of such a change, we shouldn't make any explicit provision to
avoid checkpointing the trasnaction at journal->j_checkpoint_transactions.

So I think the above line is sufficient.

Some other possible improvements to the checkpointing code would be:

- Get some async writeout underway before we actually run out of journal
  space.

- Don't wait on each transaction individually: schedule enough IO to
  reclaim sufficient log space and then wait on all those transactions.

errr, what's going on here? __flush_buffer() unconditionally does

        *freed = 1;

so log_do_checkpoint() will _always_ break out of the loop, without having
waited on any I/O. log_do_checkpoint() returns to caller, caller calls
log_do_checkpoint() again, we walk over the same transaction again, not
scheduling any new I/O. Look like it'll all turn into a busy loop?

 fs/jbd/checkpoint.c | 62 ++++++++++++++++++++++++++++++++++++---------------
 fs/jbd/journal.c | 4 +--
 fs/jbd/transaction.c | 2 -
 include/linux/jbd.h | 4 +--
 4 files changed, 49 insertions(+), 23 deletions(-)

diff -puN fs/jbd/checkpoint.c~jbd-commit-tricks fs/jbd/checkpoint.c
--- 25/fs/jbd/checkpoint.c~jbd-commit-tricks 2003-07-04 10:08:22.000000000 -0700
+++ 25-akpm/fs/jbd/checkpoint.c 2003-07-04 10:08:22.000000000 -0700
@@ -72,14 +72,19 @@ static int __try_to_free_cp_buf(struct j
 /*
  * __log_wait_for_space: wait until there is space in the journal.
  *
- * Called under j-state_lock *only*. It will be unlocked if we have to wait
+ * Called under j_state_lock *only*. It will be unlocked if we have to wait
  * for a checkpoint to free up some space in the log.
  */
-
-void __log_wait_for_space(journal_t *journal, int nblocks)
+void __log_wait_for_space(journal_t *journal)
 {
+ int nblocks;
+
         assert_spin_locked(&journal->j_state_lock);
 
+ nblocks = journal->j_max_transaction_buffers;
+ if (journal->j_committing_transaction)
+ nblocks += journal->j_committing_transaction->
+ t_outstanding_credits;
         while (__log_space_left(journal) < nblocks) {
                 if (journal->j_flags & JFS_ABORT)
                         return;
@@ -91,9 +96,13 @@ void __log_wait_for_space(journal_t *jou
                  * were waiting for the checkpoint lock
                  */
                 spin_lock(&journal->j_state_lock);
+ nblocks = journal->j_max_transaction_buffers;
+ if (journal->j_committing_transaction)
+ nblocks += journal->j_committing_transaction->
+ t_outstanding_credits;
                 if (__log_space_left(journal) < nblocks) {
                         spin_unlock(&journal->j_state_lock);
- log_do_checkpoint(journal, nblocks);
+ log_do_checkpoint(journal);
                         spin_lock(&journal->j_state_lock);
                 }
                 up(&journal->j_checkpoint_sem);
@@ -222,14 +231,17 @@ __flush_batch(journal_t *journal, struct
  *
  * Called with j_list_lock held.
  * Called under jbd_lock_bh_state(jh2bh(jh)), and drops it
+ * Sets *freed true if the transaction was dropped.
  */
 static int __flush_buffer(journal_t *journal, struct journal_head *jh,
                         struct buffer_head **bhs, int *batch_count,
- int *drop_count)
+ int *drop_count, int *freed)
 {
         struct buffer_head *bh = jh2bh(jh);
         int ret = 0;
 
+ *freed = 1;
+
         if (buffer_dirty(bh) && !buffer_locked(bh) && jh->b_jlist == BJ_None) {
                 J_ASSERT_JH(jh, jh->b_transaction == NULL);
 
@@ -262,6 +274,8 @@ static int __flush_buffer(journal_t *jou
                 if (__try_to_free_cp_buf(jh)) {
                         (*drop_count)++;
                         ret = last_buffer;
+ if (last_buffer)
+ *freed = 1;
                 }
         }
         return ret;
@@ -280,12 +294,12 @@ static int __flush_buffer(journal_t *jou
  * The journal should be locked before calling this function.
  */
 
-/* @@@ `nblocks' is unused. Should it be used? */
-int log_do_checkpoint(journal_t *journal, int nblocks)
+int log_do_checkpoint(journal_t *journal)
 {
         int result;
         int batch_count = 0;
         struct buffer_head *bhs[NR_BATCH];
+ int start_tid;
 
         jbd_debug(1, "Start checkpoint\n");
 
@@ -308,14 +322,23 @@ int log_do_checkpoint(journal_t *journal
          * degenerates into a busy loop at unmount time.
          */
         spin_lock(&journal->j_list_lock);
+ if (journal->j_checkpoint_transactions)
+ start_tid = journal->j_checkpoint_transactions->t_tid - 1;
         while (journal->j_checkpoint_transactions) {
                 transaction_t *transaction;
                 struct journal_head *jh, *last_jh, *next_jh;
                 int drop_count = 0;
                 int cleanup_ret, retry = 0;
                 tid_t this_tid;
+ int freed = 0;
 
- transaction = journal->j_checkpoint_transactions->t_cpnext;
+ transaction = journal->j_checkpoint_transactions;
+ while (transaction->t_tid <= start_tid) {
+ transaction = transaction->t_cpnext;
+ if (transaction == journal->j_checkpoint_transactions)
+ goto finish;
+ }
+ start_tid = transaction->t_tid;
                 this_tid = transaction->t_tid;
                 jh = transaction->t_checkpoint_list;
                 last_jh = jh->b_cpprev;
@@ -333,14 +356,22 @@ int log_do_checkpoint(journal_t *journal
                                 break;
                         }
                         retry = __flush_buffer(journal, jh, bhs, &batch_count,
- &drop_count);
+ &drop_count, &freed);
                 } while (jh != last_jh && !retry);
- if (batch_count) {
+
+ if (batch_count)
                         __flush_batch(journal, bhs, &batch_count);
+ /*
+ * transaction was freed, so we return to check whether
+ * sufficient log space was made available.
+ */
+ if (freed)
+ break;
+
+ if (retry) {
+ start_tid--;
                         continue;
                 }
- if (retry)
- continue;
 
                 /*
                  * If someone emptied the checkpoint list while we slept, we're
@@ -349,12 +380,6 @@ int log_do_checkpoint(journal_t *journal
                 if (!journal->j_checkpoint_transactions)
                         break;
                 /*
- * If someone cleaned up this transaction while we slept, we're
- * done
- */
- if (journal->j_checkpoint_transactions->t_cpnext != transaction)
- continue;
- /*
                  * Maybe it's a new transaction, but it fell at the same
                  * address
                  */
@@ -368,6 +393,7 @@ int log_do_checkpoint(journal_t *journal
                 cleanup_ret = __cleanup_transaction(journal, transaction);
                 J_ASSERT(drop_count != 0 || cleanup_ret != 0);
         }
+finish:
         spin_unlock(&journal->j_list_lock);
         result = cleanup_journal_tail(journal);
         if (result < 0)
diff -puN fs/jbd/transaction.c~jbd-commit-tricks fs/jbd/transaction.c
--- 25/fs/jbd/transaction.c~jbd-commit-tricks 2003-07-04 10:08:22.000000000 -0700
+++ 25-akpm/fs/jbd/transaction.c 2003-07-04 10:08:22.000000000 -0700
@@ -214,7 +214,7 @@ repeat_locked:
         if (__log_space_left(journal) < needed) {
                 jbd_debug(2, "Handle %p waiting for checkpoint...\n", handle);
                 spin_unlock(&transaction->t_handle_lock);
- __log_wait_for_space(journal, needed);
+ __log_wait_for_space(journal);
                 goto repeat_locked;
         }
 
diff -puN fs/jbd/journal.c~jbd-commit-tricks fs/jbd/journal.c
--- 25/fs/jbd/journal.c~jbd-commit-tricks 2003-07-04 10:08:22.000000000 -0700
+++ 25-akpm/fs/jbd/journal.c 2003-07-04 10:08:22.000000000 -0700
@@ -1076,7 +1076,7 @@ void journal_destroy(journal_t *journal)
         spin_lock(&journal->j_list_lock);
         while (journal->j_checkpoint_transactions != NULL) {
                 spin_unlock(&journal->j_list_lock);
- log_do_checkpoint(journal, 1);
+ log_do_checkpoint(journal);
                 spin_lock(&journal->j_list_lock);
         }
 
@@ -1284,7 +1284,7 @@ int journal_flush(journal_t *journal)
         spin_lock(&journal->j_list_lock);
         while (!err && journal->j_checkpoint_transactions != NULL) {
                 spin_unlock(&journal->j_list_lock);
- err = log_do_checkpoint(journal, journal->j_maxlen);
+ err = log_do_checkpoint(journal);
                 spin_lock(&journal->j_list_lock);
         }
         spin_unlock(&journal->j_list_lock);
diff -puN include/linux/jbd.h~jbd-commit-tricks include/linux/jbd.h
--- 25/include/linux/jbd.h~jbd-commit-tricks 2003-07-04 10:08:22.000000000 -0700
+++ 25-akpm/include/linux/jbd.h 2003-07-04 10:08:22.000000000 -0700
@@ -992,9 +992,9 @@ int log_start_commit(journal_t *journal,
 int __log_start_commit(journal_t *journal, tid_t tid);
 int journal_start_commit(journal_t *journal, tid_t *tid);
 int log_wait_commit(journal_t *journal, tid_t tid);
-int log_do_checkpoint(journal_t *journal, int nblocks);
+int log_do_checkpoint(journal_t *journal);
 
-void __log_wait_for_space(journal_t *journal, int nblocks);
+void __log_wait_for_space(journal_t *journal);
 extern void __journal_drop_transaction(journal_t *, transaction_t *);
 extern int cleanup_journal_tail(journal_t *);
 

_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jul 07 2003 - 22:00:23 EST