[RFC v2 1/2] swait: add idle variants which don't contribute to load average

From: Luis R. Rodriguez
Date: Thu Jun 15 2017 - 14:49:03 EST


There are cases where folks are using an interruptible swait when
using kthreads. This is rather confusing given you'd expect
interruptible waits to be -- interruptible, but kthreads are not
interruptible ! The reason for such practice though is to avoid
having these kthreads contribute to the system load average.

When systems are idle some kthreads may spend a lot of time blocking if
using swait_event_timeout(). This would contribute to the system load
average. On systems without preemption this would mean the load average
of an idle system is bumped to 2 instead of 0. On systems with PREEMPT=y
this would mean the load average of an idle system is bumped to 3
instead of 0.

This adds proper API using TASK_IDLE to make such goals explicit and
avoid confusion.

Suggested-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxxxx>
---
include/linux/swait.h | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index 2c700694d50a..105c70e23286 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -194,4 +194,29 @@ do { \
__ret; \
})

+#define __swait_event_idle(wq, condition) \
+ ___swait_event(wq, condition, TASK_IDLE, 0, schedule())
+
+#define swait_event_idle(wq, condition) \
+({ \
+ int __ret = 0; \
+ if (!(condition)) \
+ __ret = __swait_event_idle(wq, condition); \
+ __ret; \
+})
+
+#define __swait_event_idle_timeout(wq, condition, timeout) \
+ ___swait_event(wq, ___wait_cond_timeout(condition), \
+ TASK_IDLE, timeout, \
+ __ret = schedule_timeout(__ret))
+
+#define swait_event_idle_timeout(wq, condition, timeout) \
+({ \
+ long __ret = timeout; \
+ if (!___wait_cond_timeout(condition)) \
+ __ret = __swait_event_idle_timeout(wq, \
+ condition, timeout); \
+ __ret; \
+})
+
#endif /* _LINUX_SWAIT_H */
--
2.11.0