[PATCH v2 0/3] tty: Make __do_SAK() less greedy in regard to tasklist_lock

From: Kirill Tkhai
Date: Wed Jan 17 2018 - 07:39:37 EST


Hi,

this patchset makes __do_SAK() hold tasklist_lock for a much shorter time
than it does now. Although this function runs in process context and
takes tasklist_lock for reading with interrupts enabled, other tasks may
want to take it for writing with interrupts disabled (e.g., forking tasks),
and those tasks may then trigger hard lockups.

I've observed several hard lockups caused by long execution of __do_SAK()
on a node with 200 big containers. A 3.10 kernel is used there, and the
mainline kernel does not differ in this respect, because the __do_SAK()
function has not changed for a long time. So the mainline kernel has this
problem too.

The patchset proposes two optimizations in __do_SAK(). The first one is
to skip threads that share the previous thread's fd table [2/3].
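
To illustrate the idea only, here is a minimal userspace sketch, not the
code from [2/3]. The names struct fd_table, struct thread and
count_fd_table_scans() are hypothetical stand-ins for the kernel
structures, and the sketch assumes the threads of one process are visited
in order, so that remembering the previously scanned table is enough to
skip sharers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's files_struct/task_struct. */
struct fd_table { int nr_fds; };
struct thread { struct fd_table *files; };

/*
 * Count how many fd tables would actually have to be scanned for an
 * array of threads. A thread is skipped when it has no fd table or when
 * it shares the table of the thread scanned just before it.
 */
static size_t count_fd_table_scans(const struct thread *threads, size_t n)
{
	const struct fd_table *prev = NULL;
	size_t scans = 0;

	assert(threads != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct fd_table *cur = threads[i].files;

		if (cur == NULL || cur == prev)
			continue;	/* nothing to scan, or already scanned */

		scans++;		/* real code would walk the open files here */
		prev = cur;
	}
	return scans;
}
```

For a process whose threads all share one fd table, this does a single
scan instead of one scan per thread.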

The second optimization is to iterate the task list under rcu_read_lock().
This allows taking tasklist_lock for a very short time, just to check that
we have reached the end of the task list. See patch [3/3] for the details.
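
As a rough userspace analogy, and not the code from [3/3], the sketch
below walks a list without taking any lock on the reader side. It uses
C11 atomics (release publish, acquire load) in place of real RCU, so it
has no grace periods and never frees nodes. The names struct task_node,
task_list_add() and count_tasks_in_session() are hypothetical, and
writers are assumed to be serialized by some lock of their own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a task on a global list; not task_struct. */
struct task_node {
	int session;				/* illustrative session id */
	_Atomic(struct task_node *) next;	/* link read by lockless readers */
};

/* List head; zero-initialized, so the list starts empty. */
static _Atomic(struct task_node *) task_list_head;

/* Writer side: initialize the node fully, then publish it at the head. */
static void task_list_add(struct task_node *node, int session)
{
	assert(node != NULL);

	node->session = session;
	atomic_store_explicit(&node->next,
		atomic_load_explicit(&task_list_head, memory_order_relaxed),
		memory_order_relaxed);
	/* Release: readers that see the node also see its fields. */
	atomic_store_explicit(&task_list_head, node, memory_order_release);
}

/* Reader side: walk the list with no lock held, counting matches. */
static size_t count_tasks_in_session(int session)
{
	size_t hits = 0;
	struct task_node *p =
		atomic_load_explicit(&task_list_head, memory_order_acquire);

	while (p != NULL) {
		if (p->session == session)
			hits++;
		p = atomic_load_explicit(&p->next, memory_order_acquire);
	}
	return hits;
}
```

The point of the analogy is that a long walk on the reader side no longer
blocks a writer that needs the list lock with interrupts disabled.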

v2: All three patches changed. Now we don't care about races with
unshare_files() and do not take tasklist_lock on reaching task
list end. Link to v1: https://lkml.org/lkml/2018/1/11/486

Thanks,
Kirill

---

Kirill Tkhai (2):
      Revert "do_SAK: Don't recursively take the tasklist_lock"
      tty: Use RCU read lock to iterate tasks and threads in __do_SAK()

Oleg Nesterov (1):
      tty: Avoid threads files iterations in __do_SAK()


drivers/tty/tty_io.c | 41 ++++++++++++++++++++++++++++-------------
1 file changed, 28 insertions(+), 13 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>