Re: [RFC PATCH 1/2] pipe: introduce busy wait for pipe

From: Subhra Mazumdar
Date: Tue Sep 04 2018 - 20:51:16 EST




On 08/31/2018 09:09 AM, Steven Sistare wrote:
On 8/30/2018 4:24 PM, subhra mazumdar wrote:
Introduce a pipe_ll_usec field for pipes that indicates the number of
microseconds a thread should spin if the pipe is empty or full before
sleeping. This is similar to network sockets. Workloads like hackbench in
pipe mode benefit significantly from this by avoiding the sleep and wakeup
overhead. Other similar use cases can benefit. pipe_wait_flag is used to
signal any thread busy waiting. pipe_busy_loop_timeout checks if spin time
is over.

Signed-off-by: subhra mazumdar <subhra.mazumdar@xxxxxxxxxx>
---
include/linux/pipe_fs_i.h | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index e7497c9..fdfd2a2 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -1,6 +1,8 @@
#ifndef _LINUX_PIPE_FS_I_H
#define _LINUX_PIPE_FS_I_H
+#include <linux/sched/clock.h>
+
#define PIPE_DEF_BUFFERS 16
#define PIPE_BUF_FLAG_LRU 0x01 /* page is on the LRU */
@@ -54,6 +56,8 @@ struct pipe_inode_info {
unsigned int waiting_writers;
unsigned int r_counter;
unsigned int w_counter;
+ unsigned int pipe_ll_usec;
+ unsigned long pipe_wait_flag;
struct page *tmp_page;
struct fasync_struct *fasync_readers;
struct fasync_struct *fasync_writers;
@@ -157,6 +161,21 @@ static inline int pipe_buf_steal(struct pipe_inode_info *pipe,
return buf->ops->steal(pipe, buf);
}
+static inline unsigned long pipe_busy_loop_current_time(void)
+{
+ return (unsigned long)(local_clock() >> 10);
Why ">> 10"? local_clock() has nanosec units, and you compare to the tunable
pipe_ll_usec which has microsec units. Should be ">> 3". Better yet, redefine
the tunable to have nanosec units. I suspect you will need very large values
of the tunable to show similar results.
It's 2^10, i.e. a divide by 1024, which approximates the nanosecond to
microsecond conversion. I don't think using nanosec units is necessary. It is
unlikely that data will be read or written within nanoseconds.
sk_busy_loop_timeout for sockets uses microseconds too.
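
To make the unit question concrete, here is a small userspace sketch (not
part of the patch) comparing the shift by 10 against an exact divide by
1000 and against the suggested shift by 3. The helper names are made up
for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: approximate ns -> us the way the patch does,
 * i.e. divide by 2^10 = 1024 using a shift. */
static inline uint64_t ns_to_us_shift(uint64_t ns)
{
	return ns >> 10;
}

/* Hypothetical helper: exact ns -> us conversion, for comparison. */
static inline uint64_t ns_to_us_exact(uint64_t ns)
{
	return ns / 1000;
}

/* Hypothetical helper: ">> 3" divides by only 8, so the result is
 * not in microsecond units at all. */
static inline uint64_t ns_shift3(uint64_t ns)
{
	return ns >> 3;
}
```

For 1,000,000 ns the shift by 10 gives 976 where the exact conversion gives
1000, so the shifted clock runs about 2.3% slow relative to true
microseconds. A shift by 3 would give 125,000 for the same input.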

Also, since this type of optimization consumes extra CPU cycles that could
be used by other tasks, show the overall CPU utilization before and after
the optimization, such as by using "time hackbench ...".
OK.

Thanks,
Subhra

- Steve

+}
+
+static inline bool pipe_busy_loop_timeout(struct pipe_inode_info *pipe,
+ unsigned long start_time)
+{
+ unsigned long bp_usec = READ_ONCE(pipe->pipe_ll_usec);
+ unsigned long end_time = start_time + bp_usec;
+ unsigned long now = pipe_busy_loop_current_time();
+
+ return time_after(now, end_time);
+}
+
/* Differs from PIPE_BUF in that PIPE_SIZE is the length of the actual
memory allocation, whereas PIPE_BUF makes atomicity guarantees. */
#define PIPE_SIZE PAGE_SIZE
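
For readers following the timeout logic in pipe_busy_loop_timeout() above,
here is a minimal userspace sketch of the same comparison. It assumes
time_after(a, b) behaves like the usual kernel macro, a signed comparison
of the difference, so the check stays correct when the counter wraps. The
sketch_* names are hypothetical and "now" is passed in as a parameter
instead of being read from local_clock(), so the logic can be exercised
deterministically.

```c
#include <stdbool.h>
#include <limits.h>

/* Userspace stand-in for the kernel's time_after(a, b): true if a is
 * later than b, even if the unsigned counter has wrapped around. */
static inline bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Mirrors the shape of pipe_busy_loop_timeout(): the spin window ends
 * ll_usec ticks after start_time; returns true once now is past it. */
static inline bool sketch_busy_loop_timeout(unsigned long ll_usec,
					    unsigned long start_time,
					    unsigned long now)
{
	/* Unsigned addition may wrap; the signed compare handles that. */
	unsigned long end_time = start_time + ll_usec;

	return sketch_time_after(now, end_time);
}
```

With start_time = 1000 and ll_usec = 50, the check is false at now = 1050
and true at now = 1051. The same holds if start_time sits just below
ULONG_MAX and end_time wraps past zero.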