Re: [RFC PATCH] Add proc interface to set PF_MEMALLOC flags

From: Mike Christie
Date: Wed Sep 11 2019 - 11:23:24 EST


On 09/10/2019 05:12 PM, Tetsuo Handa wrote:
> On 2019/09/10 3:26, Mike Christie wrote:
>> Forgot to cc linux-mm.
>>
>> On 09/09/2019 11:28 AM, Mike Christie wrote:
>>> There are several storage drivers like dm-multipath, iscsi, and nbd that
>>> have userspace components that can run in the IO path. For example,
>>> iscsi and nbd's userspace daemons may need to recreate a socket and/or
>>> send IO on it, and dm-multipath's daemon multipathd may need to send IO
>>> to figure out the state of paths and re-set them up.
>>>
>>> In the kernel these drivers have access to GFP_NOIO/GFP_NOFS and the
>>> memalloc_*_save/restore functions to control the allocation behavior,
>>> but for userspace we would end up hitting an allocation that ended up
>>> writing data back to the same device we are trying to allocate for.
>>>
>>> This patch allows the userspace daemon to set the PF_MEMALLOC* flags
>>> through procfs. It currently only supports PF_MEMALLOC_NOIO, but
>>> depending on what other drivers and userspace file systems need, for
>>> the final version I can add the other flags for that file or do a file
>>> per flag or just do a memalloc_noio file.
>
> Interesting patch. But can't we instead globally mask __GFP_NOFS / __GFP_NOIO
> rather than playing games with per-thread masking (which suffers from the
> inability to propagate the current thread's mask to other threads that are
> indirectly involved)?

If I understood you correctly, that has been discussed in the past:

https://www.spinics.net/lists/linux-fsdevel/msg149035.html

We only need this for specific threads which implement part of a storage
driver in userspace.

>
>>> +static ssize_t memalloc_write(struct file *file, const char __user *buf,
>>> +			      size_t count, loff_t *ppos)
>>> +{
>>> +	struct task_struct *task;
>>> +	char buffer[5];
>>> +	int rc = count;
>>> +
>>> +	memset(buffer, 0, sizeof(buffer));
>>> +	if (count != sizeof(buffer) - 1)
>>> +		return -EINVAL;
>>> +
>>> +	if (copy_from_user(buffer, buf, count))
>
> copy_from_user() / copy_to_user() might involve memory allocation
> via page fault which has to be done under the mask? Moreover, since
> just open()ing this file can involve memory allocation, do we forbid
> open("/proc/thread-self/memalloc") ?

I was having the daemons set the flag when they initialize.
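
Roughly, the init-time step could look like the userspace sketch below. It
is only an illustration, not code from the patch: set_memalloc_flag() and
MEMALLOC_TOKEN_LEN are hypothetical names, and the caller passes in both the
file path and the token, because the hunk quoted above only shows that the
write must be exactly 4 bytes (sizeof(buffer) - 1), not which strings are
accepted.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Only length the quoted memalloc_write() accepts: sizeof(buffer) - 1. */
#define MEMALLOC_TOKEN_LEN 4

/*
 * Hypothetical helper: write a flag token to the proc file once, at
 * daemon start-up, before the thread begins servicing IO.
 * `path` and `token` come from the caller; no token value is assumed here.
 * Returns 0 on success or a negative errno value.
 */
static int set_memalloc_flag(const char *path, const char *token)
{
	size_t len = strlen(token);
	ssize_t n;
	int fd, err = 0;

	/* Mirror the kernel-side length check so a bad token fails early. */
	if (len != MEMALLOC_TOKEN_LEN)
		return -EINVAL;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* One write() call: the handler rejects any other count. */
	n = write(fd, token, len);
	if (n < 0)
		err = -errno;
	else if ((size_t)n != len)
		err = -EIO;

	if (close(fd) < 0 && !err)
		err = -errno;
	return err;
}
```

Each thread that sits in the IO path would call this once during its own
setup, for example with the /proc/thread-self/memalloc path mentioned above
and whatever token the final interface ends up defining.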