Re: [RFC PATCH 0/4 v2] Define killable version for access_remote_vm() and use it in fs/proc

From: Yang Shi
Date: Wed Feb 28 2018 - 19:17:51 EST




On 2/26/18 5:47 PM, David Rientjes wrote:
On Mon, 26 Feb 2018, Yang Shi wrote:

Rather than killable, we have patches that introduce down_read_unfair()
variants for the files you've modified (cmdline and environ) as well as
others (maps, numa_maps, smaps).
You mean you have such functionality used by google internally?

Yup, see https://lwn.net/Articles/387720.

When another thread is holding down_read() and there are queued
down_write()'s, down_read_unfair() allows for grabbing the rwsem without
queueing for it. Additionally, when another thread is holding
down_write(), down_read_unfair() allows for queueing in front of other
threads trying to grab it for write as well.
It sounds the __unfair variant make the caller have chance to jump the gun to
grab the semaphore before other waiters, right? But when a process holds the
semaphore, i.e. mmap_sem, for a long time, it still has to sleep in
uninterruptible state, right?

Right, it's solving two separate things which I think may be able to be
merged together. Killable is solving an issue where the rwsem is blocking
for a long period of time in uninterruptible sleep, and unfair is solving
an issue where reading the procfs files gets stalled for a long period of
time. We haven't run into an issue (yet) where killable would have solved
it; we just have the unfair variants to grab the rwsem asap and then, if
killable, gracefully return.

Ingo would know more about whether a variant like that in upstream Linux
would be acceptable.

Would you be interested in unfair variants instead of only addressing
killable?
Yes, I'm although it still looks overkilling to me for reading /proc.

We make certain inferences on the readablility of procfs files for other
threads to determine how much its mm's mmap_sem is contended.

I see your points here for reading /proc for system monitor. However, I'm concerned that the _unfair APIs get the processes which read /proc priority elevation (not real priority change, just look like). It might be abused by some applications, for example:

A high priority process and a low priority process are waiting for the same rwsem, if the low priority process is trying to read /proc maliciously on purpose, it can get elevated to grap the rwsem before any other processes which are waiting for the same rwsem.

Yang