[PATCH v3 02/18] locking/lockdep: Add description and explanation in lockdep design doc

From: Yuyang Du
Date: Thu Mar 21 2019 - 03:57:50 EST


More words are added to lockdep design document regarding key concepts,
which helps people understand the design as well as read the reports.

Signed-off-by: Yuyang Du <duyuyang@xxxxxxxxx>
---
Documentation/locking/lockdep-design.txt | 80 ++++++++++++++++++++++++--------
1 file changed, 60 insertions(+), 20 deletions(-)

diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt
index 49f58a0..1dcceaa 100644
--- a/Documentation/locking/lockdep-design.txt
+++ b/Documentation/locking/lockdep-design.txt
@@ -15,51 +15,91 @@ tens of thousands of) instantiations. For example a lock in the inode
struct is one class, while each inode has its own instantiation of that
lock class.

-The validator tracks the 'state' of lock-classes, and it tracks
-dependencies between different lock-classes. The validator maintains a
-rolling proof that the state and the dependencies are correct.
-
-Unlike an lock instantiation, the lock-class itself never goes away: when
-a lock-class is used for the first time after bootup it gets registered,
-and all subsequent uses of that lock-class will be attached to this
-lock-class.
+The validator tracks the 'usage state' of lock-classes, and it tracks the
+dependencies between different lock-classes. The dependency can be
+understood as lock order, where L1 -> L2 suggests L1 depends on L2, which
+can also be expressed as a forward dependency (L1 -> L2) or a backward
+dependency (L2 <- L1). From lockdep's perspective, the two locks (L1 and L2)
+are not necessarily related as opposed to in some modules an order must be
+followed. Here it just means that order ever happened. The validator
+maintains a continuing effort to prove that the lock usages and their
+dependencies are correct or the validator will shoot a splat if they are
+potentially incorrect.
+
+Unlike a lock instance, a lock-class itself never goes away: when a
+lock-class's instance is used for the first time after bootup the class gets
+registered, and all (subsequent) instances of that lock-class will be mapped
+to the lock-class.

State
-----

-The validator tracks lock-class usage history into 4 * nSTATEs + 1 separate
-state bits:
+The validator tracks lock-class usage history and divides the usage into
+(4 usages * n STATEs + 1) categories:

+Where the 4 usages can be:
- 'ever held in STATE context'
- 'ever held as readlock in STATE context'
- 'ever held with STATE enabled'
- 'ever held as readlock with STATE enabled'

-Where STATE can be either one of (kernel/locking/lockdep_states.h)
- - hardirq
- - softirq
+Where the n STATEs are coded in kernel/locking/lockdep_states.h and as of
+now they include:
+- hardirq
+- softirq

+Where the last 1 category is:
- 'ever used' [ == !unused ]

-When locking rules are violated, these state bits are presented in the
-locking error messages, inside curlies. A contrived example:
+When locking rules are violated, these usage bits are presented in the
+locking error messages, inside curlies, with a total of 2 * n STATEs bits.
+See a contrived example:

modprobe/2287 is trying to acquire lock:
- (&sio_locks[i].lock){-.-...}, at: [<c02867fd>] mutex_lock+0x21/0x24
+ (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24

but task is already holding lock:
- (&sio_locks[i].lock){-.-...}, at: [<c02867fd>] mutex_lock+0x21/0x24
+ (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24


-The bit position indicates STATE, STATE-read, for each of the states listed
-above, and the character displayed in each indicates:
+For a given lock, the bit positions from left to right indicate the usage
+of the lock and readlock (if exists), for each of the n STATEs listed
+above respectively, and the character displayed at each bit position
+indicates:

'.' acquired while irqs disabled and not in irq context
'-' acquired in irq context
'+' acquired with irqs enabled
'?' acquired in irq context with irqs enabled.

-Unused mutexes cannot be part of the cause of an error.
+The bits are illustrated with an example:
+
+ (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
+ ||||
+ ||| \-> softirq disabled and not in softirq context
+ || \--> acquired in softirq context
+ | \---> hardirq disabled and not in hardirq context
+ \----> acquired in hardirq context
+
+
+For a given STATE, whether the lock is ever acquired in that STATE context
+and whether that STATE is enabled yields four possible cases as shown in the
+table below. It is worth noting that the bit character is able to indicate
+which exact case is for the lock as of the reporting time.
+
+ -------------------------------------------
+ | | irq enabled | irq disabled |
+ -------------------------------------------
+ | ever in irq | ? | - |
+ -------------------------------------------
+ | never in irq | + | . |
+ -------------------------------------------
+
+The character '-' suggests irq is disabled because if otherwise, the
+charactor '?' would have been shown instead. Similar deduction can be
+applied for '+' too.
+
+Unused locks (e.g., mutexes) cannot be part of the cause of an error.


Single-lock state rules:
--
1.8.3.1