[PATCH tip/core/rcu 17/86] rcu: fix boost-tracing bug and update tracing documentation

From: Paul E. McKenney
Date: Sun May 01 2011 - 09:35:31 EST


From: Paul E. McKenney <paul.mckenney@xxxxxxxxxx>

Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
---
Documentation/RCU/trace.txt | 169 +++++++++++++++++++++++++++++++++++++------
kernel/rcutree_plugin.h | 2 +-
2 files changed, 149 insertions(+), 22 deletions(-)

diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index 5aefd5f..0346d3c 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -10,34 +10,37 @@ for rcutree and next for rcutiny.

CONFIG_TREE_RCU and CONFIG_TREE_PREEMPT_RCU debugfs Files and Formats

-These implementations of RCU provides five debugfs files under the
+These implementations of RCU provides seven debugfs files under the
top-level directory RCU: rcu/rcudata (which displays fields in struct
rcu_data), rcu/rcudata.csv (which is a .csv spreadsheet version of
rcu/rcudata), rcu/rcugp (which displays grace-period counters),
-rcu/rcuhier (which displays the struct rcu_node hierarchy), and
+rcu/rcuhier (which displays the struct rcu_node hierarchy),
rcu/rcu_pending (which displays counts of the reasons that the
-rcu_pending() function decided that there was core RCU work to do).
+rcu_pending() function decided that there was core RCU work to do),
+rcu/rcutorture (which displays rcutorture test progress), and, if the
+kernel is built with CONFIG_RCU_BOOST, rcu/rcuboost (which displays RCU
+boosting statistics).

The output of "cat rcu/rcudata" looks as follows:

rcu_sched:
- 0!c=423090 g=423091 pq=1 pqc=423090 qp=1 dt=86475/1/0 df=16319 of=163 ri=1519 ql=0 qs=.... b=10 ci=1460693 co=1648 ca=6448
- 1!c=423329 g=423330 pq=1 pqc=423329 qp=1 dt=90875/1/0 df=16231 of=157 ri=1249 ql=0 qs=.... b=10 ci=1459002 co=1614 ca=3310
- 2!c=423370 g=423371 pq=1 pqc=423370 qp=1 dt=69661/1/0 df=16125 of=163 ri=1469 ql=0 qs=.... b=10 ci=1610701 co=2015 ca=2378
- 3!c=422967 g=422968 pq=1 pqc=422967 qp=1 dt=70349/1/0 df=12528 of=163 ri=1450 ql=0 qs=.... b=10 ci=1427543 co=1430 ca=897
- 4!c=423196 g=423197 pq=1 pqc=423196 qp=0 dt=38935/1/0 df=10959 of=177 ri=1657 ql=0 qs=.... b=10 ci=1562249 co=1896 ca=533
- 5!c=422950 g=422951 pq=1 pqc=422950 qp=0 dt=25127/1/0 df=5895 of=167 ri=1549 ql=0 qs=.... b=10 ci=1777260 co=2137 ca=274
- 6!c=423396 g=423397 pq=1 pqc=423396 qp=1 dt=22639/1/0 df=4590 of=149 ri=1572 ql=0 qs=.... b=10 ci=1471186 co=1530 ca=243
- 7 c=460203 g=460203 pq=1 pqc=460202 qp=0 dt=937087/1/0 df=3298 of=149 ri=1584 ql=6 qs=N.W. b=10 ci=4026154 co=1948 ca=135
+ 0!c=423090 g=423091 pq=1 pqc=423090 qp=1 dt=86475/1/0 df=16319 of=163 ri=1519 ql=0 qs=.... kt=0/W b=10 ci=1460693 co=1648 ca=6448
+ 1!c=423329 g=423330 pq=1 pqc=423329 qp=1 dt=90875/1/0 df=16231 of=157 ri=1249 ql=0 qs=.... kt=0/W b=10 ci=1459002 co=1614 ca=3310
+ 2!c=423370 g=423371 pq=1 pqc=423370 qp=1 dt=69661/1/0 df=16125 of=163 ri=1469 ql=0 qs=.... kt=0/W b=10 ci=1610701 co=2015 ca=2378
+ 3!c=422967 g=422968 pq=1 pqc=422967 qp=1 dt=70349/1/0 df=12528 of=163 ri=1450 ql=0 qs=.... kt=0/W b=10 ci=1427543 co=1430 ca=897
+ 4!c=423196 g=423197 pq=1 pqc=423196 qp=0 dt=38935/1/0 df=10959 of=177 ri=1657 ql=0 qs=.... kt=0/W b=10 ci=1562249 co=1896 ca=533
+ 5!c=422950 g=422951 pq=1 pqc=422950 qp=0 dt=25127/1/0 df=5895 of=167 ri=1549 ql=0 qs=.... kt=0/W b=10 ci=1777260 co=2137 ca=274
+ 6!c=423396 g=423397 pq=1 pqc=423396 qp=1 dt=22639/1/0 df=4590 of=149 ri=1572 ql=0 qs=.... kt=0/W b=10 ci=1471186 co=1530 ca=243
+ 7 c=460203 g=460203 pq=1 pqc=460202 qp=0 dt=937087/1/0 df=3298 of=149 ri=1584 ql=6 qs=N.W. kt=0/W b=10 ci=4026154 co=1948 ca=135
rcu_bh:
- 0!c=18446744073709551494 g=18446744073709551494 pq=0 pqc=18446744073709551493 qp=1 dt=86475/1/0 df=11 of=0 ri=0 ql=0 qs=.... b=10 ci=112 co=0 ca=0
- 1!c=18446744073709551496 g=18446744073709551496 pq=1 pqc=18446744073709551495 qp=0 dt=90875/1/0 df=15 of=0 ri=0 ql=0 qs=.... b=10 ci=143 co=0 ca=0
- 2!c=18446744073709551496 g=18446744073709551496 pq=1 pqc=18446744073709551495 qp=0 dt=69661/1/0 df=21 of=0 ri=1 ql=0 qs=.... b=10 ci=88 co=0 ca=0
- 3!c=18446744073709551494 g=18446744073709551494 pq=1 pqc=18446744073709551493 qp=0 dt=70349/1/0 df=13 of=0 ri=0 ql=0 qs=.... b=10 ci=100 co=0 ca=0
- 4!c=18446744073709551494 g=18446744073709551494 pq=0 pqc=18446744073709551493 qp=1 dt=38935/1/0 df=17 of=0 ri=0 ql=0 qs=.... b=10 ci=36 co=0 ca=0
- 5!c=18446744073709551494 g=18446744073709551494 pq=0 pqc=18446744073709551493 qp=1 dt=25127/1/0 df=7 of=0 ri=0 ql=0 qs=.... b=10 ci=32 co=0 ca=0
- 6!c=18446744073709551496 g=18446744073709551496 pq=1 pqc=18446744073709551495 qp=0 dt=22639/1/0 df=9 of=0 ri=0 ql=0 qs=.... b=10 ci=44 co=0 ca=0
- 7 c=182 g=182 pq=1 pqc=181 qp=0 dt=937087/1/0 df=14 of=0 ri=1 ql=0 qs=.... b=10 ci=627 co=0 ca=0
+ 0!c=18446744073709551494 g=18446744073709551494 pq=0 pqc=18446744073709551493 qp=1 dt=86475/1/0 df=11 of=0 ri=0 ql=0 qs=.... kt=0/W b=10 ci=112 co=0 ca=0
+ 1!c=18446744073709551496 g=18446744073709551496 pq=1 pqc=18446744073709551495 qp=0 dt=90875/1/0 df=15 of=0 ri=0 ql=0 qs=.... kt=0/W b=10 ci=143 co=0 ca=0
+ 2!c=18446744073709551496 g=18446744073709551496 pq=1 pqc=18446744073709551495 qp=0 dt=69661/1/0 df=21 of=0 ri=1 ql=0 qs=.... kt=0/W b=10 ci=88 co=0 ca=0
+ 3!c=18446744073709551494 g=18446744073709551494 pq=1 pqc=18446744073709551493 qp=0 dt=70349/1/0 df=13 of=0 ri=0 ql=0 qs=.... kt=0/W b=10 ci=100 co=0 ca=0
+ 4!c=18446744073709551494 g=18446744073709551494 pq=0 pqc=18446744073709551493 qp=1 dt=38935/1/0 df=17 of=0 ri=0 ql=0 qs=.... kt=0/W b=10 ci=36 co=0 ca=0
+ 5!c=18446744073709551494 g=18446744073709551494 pq=0 pqc=18446744073709551493 qp=1 dt=25127/1/0 df=7 of=0 ri=0 ql=0 qs=.... kt=0/W b=10 ci=32 co=0 ca=0
+ 6!c=18446744073709551496 g=18446744073709551496 pq=1 pqc=18446744073709551495 qp=0 dt=22639/1/0 df=9 of=0 ri=0 ql=0 qs=.... kt=0/W b=10 ci=44 co=0 ca=0
+ 7 c=182 g=182 pq=1 pqc=181 qp=0 dt=937087/1/0 df=14 of=0 ri=1 ql=0 qs=.... kt=0/W b=10 ci=627 co=0 ca=0

The first section lists the rcu_data structures for rcu_sched, the second
for rcu_bh. Note that CONFIG_TREE_PREEMPT_RCU kernels will have an
@@ -146,6 +149,23 @@ o "qs" gives an indication of the state of the callback queue
If there are no callbacks in a given one of the above states,
the corresponding character is replaced by ".".

+o "kt" is the per-CPU kernel-thread state. The digit preceding
+ the slash is zero if there is no work pending and 1 otherwise.
+ The character after the slash is as follows:
+
+ "S" The kernel thread is stopped, in other words, all
+ CPUs corresponding to this rcu_node structure are
+ offline.
+
+ "R" The kernel thread is running.
+
+ "W" The kernel thread is waiting because there is no work
+ for it to do.
+
+ "Y" The kernel thread is yielding to avoid hogging CPU.
+
+ "?" Unknown value, indicates a bug.
+
o "b" is the batch limit for this CPU. If more than this number
of RCU callbacks is ready to invoke, then the remainder will
be deferred.
@@ -356,6 +376,113 @@ o "nn" is the number of times that this CPU needed nothing. Alert
is due to short-circuit evaluation in rcu_pending().


+The output of "cat rcu/rcutorture" looks as follows:
+
+rcutorture test sequence: 0 (test in progress)
+rcutorture update version number: 615
+
+The first line shows the number of rcutorture tests that have completed
+since boot. If a test is currently running, the "(test in progress)"
+string will appear as shown above. The second line shows the number of
+update cycles that the current test has started, or zero if there is
+no test in progress.
+
+
+The output of "cat rcu/rcuboost" looks as follows:
+
+0:5 tasks=.... kt=W ntb=0 neb=0 nnb=0 j=2f95 bt=300f
+ balk: nt=0 egt=989 bt=0 nb=0 ny=0 nos=16
+6:7 tasks=.... kt=W ntb=0 neb=0 nnb=0 j=2f95 bt=300f
+ balk: nt=0 egt=225 bt=0 nb=0 ny=0 nos=6
+
+This information is output only for rcu_preempt. Each two-line entry
+corresponds to a leaf rcu_node strcuture. The fields are as follows:
+
+o "n:m" is the CPU-number range for the corresponding two-line
+ entry. In the sample output above, the first entry covers
+ CPUs zero through five and the second entry covers CPUs 6
+ and 7.
+
+o "tasks=TNEB" gives the state of the various segments of the
+ rnp->blocked_tasks list:
+
+ "T" This indicates that there are some tasks that blocked
+ while running on one of the corresponding CPUs while
+ in an RCU read-side critical section.
+
+ "N" This indicates that some of the blocked tasks are preventing
+ the current normal (non-expedited) grace period from
+ completing.
+
+ "E" This indicates that some of the blocked tasks are preventing
+ the current expedited grace period from completing.
+
+ "B" This indicates that some of the blocked tasks are in
+ need of RCU priority boosting.
+
+ Each character is replaced with "." if the corresponding
+ condition does not hold.
+
+o "kt" is the state of the RCU priority-boosting kernel
+ thread associated with the corresponding rcu_node structure.
+ The state can be one of the following:
+
+ "S" The kernel thread is stopped, in other words, all
+ CPUs corresponding to this rcu_node structure are
+ offline.
+
+ "R" The kernel thread is running.
+
+ "W" The kernel thread is waiting because there is no work
+ for it to do.
+
+ "Y" The kernel thread is yielding to avoid hogging CPU.
+
+ "?" Unknown value, indicates a bug.
+
+o "ntb" is the number of tasks boosted.
+
+o "neb" is the number of tasks boosted in order to complete an
+ expedited grace period.
+
+o "nnb" is the number of tasks boosted in order to complete a
+ normal (non-expedited) grace period. When boosting a task
+ that was blocking both an expedited and a normal grace period,
+ it is counted against the expedited total above.
+
+o "j" is the low-order 16 bits of the jiffies counter in
+ hexadecimal.
+
+o "bt" is the low-order 16 bits of the value that the jiffies
+ counter will have when we next start boosting, assuming that
+ the current grace period does not end beforehand. This is
+ also in hexadecimal.
+
+o "balk: nt" counts the number of times we didn't boost (in
+ other words, we balked) even though it was time to boost because
+ there were no blocked tasks to boost. This situation occurs
+ when there is one blocked task on one rcu_node structure and
+ none on some other rcu_node structure.
+
+o "egt" counts the number of times we balked because although
+ there were blocked tasks, none of them were blocking the
+ current grace period, whether expedited or otherwise.
+
+o "bt" counts the number of times we balked because boosting
+ had already been initiated for the current grace period.
+
+o "nb" counts the number of times we balked because there
+ was at least one task blocking the current non-expedited grace
+ period that never had blocked. If it is already running, it
+ just won't help to boost its priority!
+
+o "ny" counts the number of times we balked because it was
+ not yet time to start boosting.
+
+o "nos" counts the number of times we balked for other
+ reasons, e.g., the grace period ended first.
+
+
CONFIG_TINY_RCU and CONFIG_TINY_PREEMPT_RCU debugfs Files and Formats

These implementations of RCU provides a single debugfs file under the
@@ -422,9 +549,9 @@ o "neb" is the number of expedited grace periods that have had
o "nnb" is the number of normal grace periods that have had
to resort to RCU priority boosting since boot.

-o "j" is the low-order 12 bits of the jiffies counter in hexadecimal.
+o "j" is the low-order 16 bits of the jiffies counter in hexadecimal.

-o "bt" is the low-order 12 bits of the value that the jiffies counter
+o "bt" is the low-order 16 bits of the value that the jiffies counter
will have at the next time that boosting is scheduled to begin.

o In the line beginning with "normal balk", the fields are as follows:
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 22a6a46..a21413d 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1086,7 +1086,7 @@ static void rcu_initiate_boost_trace(struct rcu_node *rnp)
else if (rnp->gp_tasks != NULL && rnp->qsmask != 0)
rnp->n_balk_notblocked++;
else if (rnp->gp_tasks != NULL &&
- ULONG_CMP_GE(jiffies, rnp->boost_time))
+ ULONG_CMP_LT(jiffies, rnp->boost_time))
rnp->n_balk_notyet++;
else
rnp->n_balk_nos++;
--
1.7.3.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/