Re: [PATCH 22/27] locking/lockdep: Reuse list entries that are no longer in use

From: Bart Van Assche
Date: Mon Dec 03 2018 - 11:40:53 EST


On Sat, 2018-12-01 at 21:24 +-0100, Peter Zijlstra wrote:
+AD4 On Thu, Nov 29, 2018 at 08:48:50AM -0800, Bart Van Assche wrote:
+AD4 +AD4 On Thu, 2018-11-29 at 13:01 +-0100, Peter Zijlstra wrote:
+AD4 +AD4 +AD4 On Thu, Nov 29, 2018 at 11:49:02AM +-0100, Peter Zijlstra wrote:
+AD4 +AD4 +AD4 +AD4 On Wed, Nov 28, 2018 at 03:43:20PM -0800, Bart Van Assche wrote:
+AD4 +AD4 +AD4 +AD4 +AD4 /+ACo
+AD4 +AD4 +AD4 +AD4 +AD4 +ACo Remove all dependencies this lock is
+AD4 +AD4 +AD4 +AD4 +AD4 +ACo involved in:
+AD4 +AD4 +AD4 +AD4 +AD4 +ACo-/
+AD4 +AD4 +AD4 +AD4 +AD4 +- list+AF8-for+AF8-each+AF8-entry+AF8-safe(entry, tmp, +ACY-all+AF8-list+AF8-entries, alloc+AF8-entry) +AHs
+AD4 +AD4 +AD4 +AD4 +AD4 if (entry-+AD4-class +ACEAPQ class +ACYAJg entry-+AD4-links+AF8-to +ACEAPQ class)
+AD4 +AD4 +AD4 +AD4 +AD4 continue+ADs
+AD4 +AD4 +AD4 +AD4 +AD4 links+AF8-to +AD0 entry-+AD4-links+AF8-to+ADs
+AD4 +AD4 +AD4 +AD4 +AD4 WARN+AF8-ON+AF8-ONCE(entry-+AD4-class +AD0APQ links+AF8-to)+ADs
+AD4 +AD4 +AD4 +AD4 +AD4 list+AF8-del+AF8-rcu(+ACY-entry-+AD4-lock+AF8-order+AF8-entry)+ADs
+AD4 +AD4 +AD4 +AD4 +AD4 +- list+AF8-move(+ACY-entry-+AD4-alloc+AF8-entry, +ACY-free+AF8-list+AF8-entries)+ADs
+AD4 +AD4 +AD4 +AD4 +AD4 entry-+AD4-class +AD0 NULL+ADs
+AD4 +AD4 +AD4 +AD4 +AD4 entry-+AD4-links+AF8-to +AD0 NULL+ADs
+AD4 +AD4 +AD4 +AD4 +AD4 check+AF8-free+AF8-class(zapped+AF8-classes, class)+ADs
+AD4 +AD4 +AD4 +AD4
+AD4 +AD4 +AD4 +AD4 Hurm.. I'm confused here.
+AD4 +AD4 +AD4 +AD4
+AD4 +AD4 +AD4 +AD4 The reason you cannot re-use lock+AF8-order+AF8-entry for the free list is
+AD4 +AD4 +AD4 +AD4 because list+AF8-del+AF8-rcu(), right? But if so, then what ensures the
+AD4 +AD4 +AD4 +AD4 list+AF8-entry is not re-used before it's grace-period?
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 Also+ADs if you have to grow lock+AF8-list by 16 bytes just to be able to free
+AD4 +AD4 +AD4 it, a bitmap allocator is much cheaper, space wise.
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 Some people seem to really care about the static image size, and
+AD4 +AD4 +AD4 lockdep's .data section does matter to them.
+AD4 +AD4
+AD4 +AD4 How about addressing this by moving removed list entries to a +ACI-zapped+AF8-entries+ACI
+AD4 +AD4 list and only moving list entries from the zapped+AF8-entries list to the
+AD4 +AD4 free+AF8-list+AF8-entries list after an RCU grace period? I'm not sure that it is
+AD4 +AD4 possible to implement that approach without introducing a new list+AF8-head in
+AD4 +AD4 struct lock+AF8-list.
+AD4
+AD4 I think we can do this with a free bitmap and an array of 2 pending
+AD4 bitmaps and an index. Add newly freed entries to the pending bitmap
+AD4 indicated by the current index, when complete flip the index -- such
+AD4 that further new bits go to the other pending bitmap -- and call+AF8-rcu().
+AD4
+AD4 Then, on the call+AF8-rcu() callback, ie. after a GP has happened, OR our
+AD4 pending bitmap into the free bitmap, and when the other pending bitmap
+AD4 isn't empty, flip the index again and start it all again.
+AD4
+AD4 This ensures there is at least one full GP between setting a bit and it
+AD4 landing in the free mask.

Hi Peter,

How about the following alternative which requires only two bitmaps instead
of three:
- Maintain two bitmaps, one for the free entries and one for the entries
that are being freed.
- Protect all accesses to both bitmaps with the graph lock.
- zap+AF8-class() sets a bit in the +ACI-being freed+ACI bitmap for the entries that
should be freed after a GP.
- Instead of making free+AF8-zapped+AF8-classes() wait for a grace period by calling
synchronize+AF8-sched(), use call+AF8-rcu() and do the freeing work from inside the
RCU callback.
- From inside the RCU callback, set a bit in the +ACI-free+ACI bitmap for all entries
that have a bit set in the +ACI-being freed+ACI bitmap and clears the +ACI-being freed+ACI
bitmap.

Thanks,

Bart.