Re: spin lock contention: lru_list lock

From: David S. Miller (davem@redhat.com)
Date: Sat Feb 19 2000 - 11:42:32 EST


   Date: Sat, 19 Feb 2000 17:17:01 +0100
   From: Manfred Spraul <manfreds@colorfullife.com>

   Yes, the problem is that gtcd polls /dev/cdrom, and that every close()
   call calls __invalidate_buffers(). __invalidate_buffers() must walk the
   complete buffer chain, and as I wrote the buffer chain contained ~
   100000 entries.

   Either we ignore the problem, or we could add a per-kdev_t linked list.

It's a very difficult issue, and I investigated it fully when I
initially threaded the buffer cache layer. The heart of the
difficulty can be seen by simply examining what try_to_free_buffers()
needs to do.

It is clearly the case that buffers can be accessed and found via
several tables: the hash table for (block, kdev) keyed searches, the
lru lists for dirtying and buffer flushing/unlocking, and the free
lists when creating new buffers. Finally, the unused list is accessed
when grabbing buffer heads to tag page I/O.
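
To make this concrete, here is an abridged, from-memory sketch of how
a buffer head of this era is threaded onto all of those tables at
once (see include/linux/fs.h for the real layout, this is
illustrative rather than exact):

        struct buffer_head {
                struct buffer_head *b_next;      /* hash chain, keyed on
                                                  * (block, dev) */
                unsigned long b_blocknr;         /* block number */
                kdev_t b_dev;                    /* device */
                /* ... */
                struct buffer_head *b_next_free; /* lru or free list,   */
                struct buffer_head *b_prev_free; /* doubly linked       */
                struct buffer_head *b_this_page; /* circular list of the
                                                  * buffers in one page */
                /* ... */
        };

One buffer head is thus reachable from the hash table, from an lru or
free list, and from its page all at the same time, which is exactly
why freeing one safely is so delicate.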

try_to_free_buffers() is just given an arbitrary page; we are not
told explicitly whether its buffers are true buffer cache entries, or
just I/O tags for an entry in the page cache. (We can find out, which
is why I am describing all of this in such detail.)

So the code there is super conservative: it simply locks out every
entry point by which a buffer could possibly be found, in any table.
Without question, this scheme is much easier to verify than a
fine-grained one.
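
In rough outline, with lock names as I remember them from fs/buffer.c
(treat this as a sketch, not the literal code; buffer_busy() here
just stands for the usual locked/dirty/in-use checks):

        spin_lock(&lru_list_lock);
        write_lock(&hash_table_lock);

        /* With every lookup path locked out, the page's buffers can
         * be examined and torn down without another CPU finding one
         * of them halfway through. */
        bh = page->buffers;
        do {
                if (buffer_busy(bh))
                        goto busy;      /* drop all locks, give up */
                bh = bh->b_this_page;
        } while (bh != page->buffers);
        /* ... unlink each one from the hash, lru and free lists ... */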

This function must return a status which represents the disposition
of the buffers after trying to free them, so it would be dangerous to
do the locking on each buffer individually. For example, if we were
to scan and lock based upon where each individual buffer currently
is, try to free that one, and then redo the locking for the next
buffer, buffers we had already dealt with could reappear in the
tables.
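
To illustrate the hazard, here is the broken fine-grained idea in
sketch form; lock_for() and maybe_unlink() are invented names, this
is not a real interface:

        bh = page->buffers;
        do {
                spinlock_t *lock = lock_for(bh); /* lock only the table
                                                  * bh currently lives on */
                spin_lock(lock);
                maybe_unlink(bh);
                spin_unlock(lock);
                /* Window: a buffer we already examined is no longer
                 * protected, so another CPU can find it again and
                 * refile or redirty it, and the page-wide status we
                 * are accumulating goes stale. */
                bh = bh->b_this_page;
        } while (bh != page->buffers);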

In my initial threading attempts, I wished to experiment with the
effects of per-hashchain locks in the buffer cache hash table. This
has potentially good scalability properties; however, the issues
surrounding correct operation of try_to_free_buffers(), as described
above, prevented me from doing so.

I do not mean to imply that relaxing the locks here is impossible.

The problem which would need to be solved is to somehow "grab" or
"freeze" all buffer heads attached to a page, even if they are
active. It can be done, but it would be tricky to get all of the
semantics correct everywhere.
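
One possible shape for such a freeze, assuming the page->buffers ring
and an atomic b_count; the helper name freeze_page_buffers() is
invented here:

        /* Pin every buffer head on the page so that no table walker
         * can free one underneath us.  Note that pinning alone does
         * not stop refile_buffer() from moving a pinned buffer from
         * list to list, which is part of what makes the full
         * semantics tricky. */
        static void freeze_page_buffers(struct page *page)
        {
                struct buffer_head *bh = page->buffers;

                do {
                        atomic_inc(&bh->b_count);
                        bh = bh->b_this_page;
                } while (bh != page->buffers);
        }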

Once that was accomplished, another attempt could be made to use
per-hashchain and per-lru-list locking to hopefully reduce
contention.

I know this would not properly fix the case you mention, but it does
demonstrate why adjusting the locking strategy would be a bit
difficult.

   I have another problem with flush_dirty_buffers():

   flush_dirty_buffers() quits as soon as it finds the first buffer where
   b_flushtime has not yet expired.

   I can't find the code that enforces this chronological ordering, esp.
   since the time until a buffer must be written to disk is different for
   metadata blocks and normal blocks.
   [just grep for b_flushtime, there is no sort algorithm]

If the age_super and age_buffer tunables were identical, and
refile_buffer() always added to the tail of the lru list, it might be
correct.

However, of course, neither of these two things is true.
So I believe your analysis is correct :-)
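
For concreteness, the scan in question looks roughly like this
(paraphrased from fs/buffer.c, not verbatim):

        spin_lock(&lru_list_lock);
        bh = lru_list[BUF_DIRTY];
        for (i = nr_buffers_type[BUF_DIRTY]; i-- > 0; bh = next) {
                next = bh->b_next_free;
                /* The early exit assumes everything after bh is
                 * younger, i.e. that the list is sorted by
                 * b_flushtime -- but metadata and data expire on
                 * different schedules (age_super vs. age_buffer)
                 * and refile_buffer() does not insert in flushtime
                 * order. */
                if (time_before(jiffies, bh->b_flushtime))
                        break;
                /* ... queue bh for writeback ... */
        }
        spin_unlock(&lru_list_lock);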

I wonder how much it really matters...

Later,
David S. Miller
davem@redhat.com



