Re: [PATCH 24/32] vfs: syscall: Add fsopen() to prepare for superblock creation [ver #9]

From: David Howells
Date: Wed Jul 11 2018 - 04:42:38 EST


Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> Yeah, Andy is right that we should *not* make "write()" have side effects.

Note that write() has side effects all over the place: procfs, sysfs, debugfs,
tracefs, ... Though for the most part they're single-shot jobs and not
cumulative (I'm not sure this is always true for debugfs - there's a lot of
weird stuff in there).

> > (b) Keep the current structure but use a new syscall instead of write().
> >
> > (c) Keep using write() but literally just buffer the data. Then have a new
> > syscall to commit it. In other words, replace âxâ with a syscall and call
> > all the fs_context_operations helpers in that context instead of from
> > write().
>
> But yeah, b-or-c sounds fine.

I would prefer to avoid the "let's buffer everything" but rather parse the
data as we go along. What I currently do is store the parsed data in the
context and only actually *apply* it when someone sends the 'x' command.

There are two reasons for this:

(1) mount()'s error handling is slight: it can only return an error code, but
creating and mounting something has so many different and interesting
ways of going wrong and I want to be able to give better error reporting.

This gets more interesting if it happens inside a container where you
can't see dmesg.

(2) Parsing the data means you only need to store the result of the parse and
can reject anything that's unknown or contradictory.

Buffering till the end means you have to buffer *everything* - and,
unless you limit your buffer, you risk running out of RAM.

Now, I can replace the 'x' command with an ioctl() so that just writing random
rubbish to the fd won't cause anything to actually happen.

fd = fsopen("ext4");
write(fd, "s /dev/sda1");
write(fd, "o user_xattr");
ioctl(fd, FSOPEN_IOC_CREATE_SB, 0);

or I could make a special syscall for it:

fscommit(fd, FSCOMMIT_CREATE);

or:

fscommit(fd, FSCOMMIT_RECONFIGURE);

and require that you have CAP_SYS_ADMIN to enact it.

David