Re: [PATCH 32/32] [RFC] fsinfo: Add a system call to allow querying of filesystem information [ver #8]

From: David Howells
Date: Mon Jun 04 2018 - 15:04:02 EST


Arnd Bergmann <arnd@xxxxxxxx> wrote:

> > I've split the capabilities out into their own thing. I've attached the
> > revised patch below.
>
> I'm still not completely clear on how variable-length structures are
> supposed to be handled by the fsinfo syscall. It seems like a possible
> source of bugs to return a structure from the kernel that has a different
> size in kernel and user space depending on the fsinfo_cap__nr value at
> compile time. How does one e.g. guarantee there is no out of bounds access
> when you run new user space on an older kernel that has a smaller structure?

There's a buffer size parameter:

	int ret = fsinfo(int dfd,
			 const char *filename,
			 const struct fsinfo_params *params,
			 void *buffer,
			 size_t buf_size);

For a fixed-size buffer request (as opposed to a string), the fsinfo syscall
allocates an internal buffer of the size that the kernel code expects for that
request, and *not* the size the user asked for:

	/* Allocate an appropriately-sized buffer.  We will truncate the
	 * contents when we write the contents back to userspace.
	 */
	size = fsinfo_buffer_sizes[params.request];
	...
	if (buf_size > 0) {
		params.buf_size = size;
		params.buffer = kzalloc(size, GFP_KERNEL);
		if (!params.buffer)
			return -ENOMEM;
	}

so that the filesystems don't have to concern themselves with anything other
than the kernel's idea of the size.

The fsinfo() syscall truncates the reply buffer to the size the user requested
if the user requested a smaller amount. Take the fsinfo_supports struct for
example:

	struct fsinfo_supports {
		__u64	supported_stx_attributes;
		__u32	supported_stx_mask;
		__u32	supported_ioc_flags;
	};

Now imagine that in future we want to add another field, say the mask of the
Windows file attributes a filesystem supports. We can extend the struct like
so:

	struct fsinfo_supports_v2 {
		__u64	supported_stx_attributes;
		__u32	supported_stx_mask;
		__u32	supported_ioc_flags;
		__u32	supported_win_file_atts;
		__u32	__reserved[1];
	};

Note that the start of the new struct *must* correspond in layout to the
original struct. An application that doesn't know about v2 would just ask for
v1:

	struct fsinfo_supports foo;
	fsinfo(.... &foo, sizeof(foo));

and would only ever get those bits - though it would be told that there is
more data available. An application that does know about v2 might do:

	struct fsinfo_supports_v2 foo2;
	fsinfo(.... &foo2, sizeof(foo2));

If all of v2 is available, all fields will be filled in and the return value
will be sizeof(foo2). If only the v1 fields are available, the return value
will be sizeof(foo). If a v3 were added, the return value would be sizeof(v3),
and so on.
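
To make the return-value rule concrete, here's a rough userspace sketch. It is
not the kernel code: sim_fsinfo_copy() is a made-up stand-in for the copy-out
step of the syscall, and the structs are the two layouts above rewritten with
<stdint.h> types (with __reserved spelled "reserved").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* v1 layout, as in the struct above. */
struct fsinfo_supports {
	uint64_t supported_stx_attributes;
	uint32_t supported_stx_mask;
	uint32_t supported_ioc_flags;
};

/* v2 layout: v1 followed by the new field ("reserved" is __reserved above). */
struct fsinfo_supports_v2 {
	uint64_t supported_stx_attributes;
	uint32_t supported_stx_mask;
	uint32_t supported_ioc_flags;
	uint32_t supported_win_file_atts;
	uint32_t reserved[1];
};

/*
 * Hypothetical stand-in for the copy-out step: copy at most user_size bytes
 * of the kernel's reply into the user's buffer, and return the size of the
 * reply as the kernel sees it, whatever the user asked for.
 */
static size_t sim_fsinfo_copy(const void *kbuf, size_t kernel_size,
			      void *ubuf, size_t user_size)
{
	size_t n = kernel_size < user_size ? kernel_size : user_size;

	assert(kbuf && ubuf);
	memcpy(ubuf, kbuf, n);
	return kernel_size;
}
```

A v1 application talking to a "v2 kernel" gets its 16 bytes and a return value
of 24, which tells it more data exists. A v2 application talking to a "v1
kernel" gets a return value of 16, so it knows not to trust anything past the
v1 fields.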

I can improve this such that if you asked for a fixed-length option and the
kernel doesn't have enough data to fill the user buffer provided, then it
clears the remainder of the buffer. That way at least any unsupported fields
will be initialised to 0.
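
Roughly what I have in mind for the clearing, again as a userspace sketch
rather than the actual patch (sim_copy_and_clear() is a hypothetical helper
name, and the real thing would go through copy_to_user()/clear_user() rather
than memcpy()/memset()):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical copy-out helper: copy what the kernel has, zero whatever part
 * of the user's buffer the kernel could not fill, and return the kernel's
 * idea of the reply size.
 */
static size_t sim_copy_and_clear(const void *kbuf, size_t kernel_size,
				 void *ubuf, size_t user_size)
{
	size_t n = kernel_size < user_size ? kernel_size : user_size;

	assert(kbuf && ubuf);
	memcpy(ubuf, kbuf, n);
	if (user_size > n)
		memset((char *)ubuf + n, 0, user_size - n);
	return kernel_size;
}
```

With this, a caller that passes a larger struct than the kernel knows about
sees zeroes in the unknown tail instead of whatever was in its buffer before
the call.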


For the capabilities bitmask, it's not really any different conceptually. If
you want to test capability bit 47, you need to ask for 6 bytes of data. If
the kernel doesn't support that many bits, it won't necessarily give you that
many bytes. If it has, say, 13 bytes-worth of caps available, it will only
give you the first 6 bytes-worth if that's all you ask for. You presumably
weren't interested or didn't know about any more than that.
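
The arithmetic, as a small sketch. Both helpers are hypothetical and not part
of the patch, and the position of a bit within its byte (bit N at position
N % 8 of byte N / 8) is an assumption on my part for the purposes of the
example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes that must be requested so that capability bit 'bit' is covered. */
static size_t cap_bytes_needed(unsigned int bit)
{
	return bit / 8 + 1;
}

/*
 * Test a capability bit in the bytes that were actually filled in.
 * ASSUMPTION: bit N lives in byte N / 8 at bit position N % 8.
 * A bit that lies beyond the 'filled' bytes is reported as unsupported.
 */
static bool cap_test(const unsigned char *caps, size_t filled,
		     unsigned int bit)
{
	assert(caps);
	if (bit / 8 >= filled)
		return false;
	return (caps[bit / 8] >> (bit % 8)) & 1;
}
```

Here 'filled' would be the smaller of the return value and the size you asked
for, so a bit the kernel doesn't know about just reads as not set.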


As for strings, they're completely variable length anyway, so I don't think
there's a problem there.

> In any case, it would be nice to have a trivial way to query which of
> the four timestamp types are supported at all, and returning
> them separately would be one way of doing that.

	fsinfo_cap_has_atime	= 45,	/* fs supports access time */
	fsinfo_cap_has_btime	= 46,	/* fs supports birth/creation time */
	fsinfo_cap_has_ctime	= 47,	/* fs supports change time */
	fsinfo_cap_has_mtime	= 48,	/* fs supports modification time */

David