Re: [PATCH 1/2 v2] fdmap(2)

From: Alexey Dobriyan
Date: Thu Sep 28 2017 - 06:55:56 EST


On 9/28/17, Michael Kerrisk (man-pages) <mtk.manpages@xxxxxxxxx> wrote:
> On 27 September 2017 at 17:03, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

>>> The idea is to start process. In ideal world, only bynary system calls
>>> would exist and shells could emulate /proc/* same way bash implement
>>> /dev/tcp
>>
>> Then start the process by doing it for real and making it obviously
>> useful. We should not add a pair of vaguely useful, rather weak
>> syscalls just to start a process of modernizing /proc.

Before doing it for real it would be nice to have at least a nod
from people in charge that syscalls which return binary
information are OK. Otherwise some EIATF guy will just say
"NAK /proc is fine, it always was fine".

Or look from another angle: sched_setaffinity exists but there is
no /proc counterpart, shells must use taskset(1) and world didn't end.

> I concur.
>
> Alexey, you still have not wxplained who specifically needs this
> right now, and how, precisely, they plan to use the new system calls.
> It is all very arm-wavey so far.

It is not if you read even example program in the original patch.
Any program which queries information about file descriptors
will benefit both in CPU and memory usage.

void closefrom(int start)
{
int fd[1024];
int n;

while ((n = fdmap(0, fd, sizeof(fd)/sizeof(fd[0]), start)) > 0) {
unsigned int i;

for (i = 0; i < n; i++)
close(fd[i]);

start = fd[n - 1] + 1;
}
}

CRIU naturally to know everything about descriptors of target processes:
It does:

int predump_task_files(int pid)
{
struct dirent *de;
DIR *fd_dir;
int ret = -1;

pr_info("Pre-dump fds for %d)\n", pid);

fd_dir = opendir_proc(pid, "fd");
if (!fd_dir)
return -1;

while ((de = readdir(fd_dir))) {
if (dir_dots(de))
continue;

if (predump_one_fd(pid, atoi(de->d_name)))
goto out;
}

ret = 0;
out:
closedir(fd_dir);
return ret;
}

which is again inefficient.