Re: [PATCH] eventfs/tracing: Add callback for release of an eventfs_inode

From: Google
Date: Wed May 01 2024 - 10:59:17 EST


On Tue, 30 Apr 2024 14:23:27 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
>
> Synthetic events create and destroy tracefs files when they are created
> and removed. The tracing subsystem has its own file descriptor
> representing the state of the events attached to the tracefs files.
> There's a race between the eventfs files and this file descriptor of the
> tracing system where the following can cause an issue:
>
> With two scripts 'A' and 'B' doing:
>
> Script 'A':
> echo "hello int aaa" > /sys/kernel/tracing/synthetic_events
> while :
> do
> echo 0 > /sys/kernel/tracing/events/synthetic/hello/enable
> done
>
> Script 'B':
> echo > /sys/kernel/tracing/synthetic_events
>
> Script 'A' creates a synthetic event "hello" and then just writes zero
> into its enable file.
>
> Script 'B' removes all synthetic events (including the newly created
> "hello" event).
>
> What happens is that the opening of the "enable" file has:
>
> {
> struct trace_event_file *file = inode->i_private;
> int ret;
>
> ret = tracing_check_open_get_tr(file->tr);
> [..]
>
> But deleting the events frees the "file" descriptor, and a "use after
> free" happens with the dereference at "file->tr".
>
> The file descriptor does have a reference counter, but there needs to be a
> way to decrement it from the eventfs when the eventfs_inode is removed
> that represents this file descriptor.
>
> Add an optional "release" callback to the eventfs_entry array structure,
> that gets called when the eventfs file is about to be removed. This allows
> for the creating on the eventfs file to increment the tracing file
> descriptor ref counter. When the eventfs file is deleted, it can call the
> release function that will call the put function for the tracing file
> descriptor.
>
> This will protect the tracing file from being freed while a eventfs file
> that references it is being opened.
>

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>

Thank you,

> Link: https://lore.kernel.org/linux-trace-kernel/20240426073410.17154-1-Tze-nan.Wu@xxxxxxxxxxxx/
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
> Reported-by: Tze-nan wu <Tze-nan.Wu@xxxxxxxxxxxx>
> Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> ---
> fs/tracefs/event_inode.c | 7 +++++++
> include/linux/tracefs.h | 3 +++
> kernel/trace/trace_events.c | 12 ++++++++++++
> 3 files changed, 22 insertions(+)
>
> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index 894c6ca1e500..dc97c19f9e0a 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -84,10 +84,17 @@ enum {
> static void release_ei(struct kref *ref)
> {
> struct eventfs_inode *ei = container_of(ref, struct eventfs_inode, kref);
> + const struct eventfs_entry *entry;
> struct eventfs_root_inode *rei;
>
> WARN_ON_ONCE(!ei->is_freed);
>
> + for (int i = 0; i < ei->nr_entries; i++) {
> + entry = &ei->entries[i];
> + if (entry->release)
> + entry->release(entry->name, ei->data);
> + }
> +
> kfree(ei->entry_attrs);
> kfree_const(ei->name);
> if (ei->is_events) {
> diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
> index 7a5fe17b6bf9..d03f74658716 100644
> --- a/include/linux/tracefs.h
> +++ b/include/linux/tracefs.h
> @@ -62,6 +62,8 @@ struct eventfs_file;
> typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
> const struct file_operations **fops);
>
> +typedef void (*eventfs_release)(const char *name, void *data);
> +
> /**
> * struct eventfs_entry - dynamically created eventfs file call back handler
> * @name: Then name of the dynamic file in an eventfs directory
> @@ -72,6 +74,7 @@ typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
> struct eventfs_entry {
> const char *name;
> eventfs_callback callback;
> + eventfs_release release;
> };
>
> struct eventfs_inode;
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 52f75c36bbca..6ef29eba90ce 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2552,6 +2552,14 @@ static int event_callback(const char *name, umode_t *mode, void **data,
> return 0;
> }
>
> +/* The file is incremented on creation and freeing the enable file decrements it */
> +static void event_release(const char *name, void *data)
> +{
> + struct trace_event_file *file = data;
> +
> + event_file_put(file);
> +}
> +
> static int
> event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
> {
> @@ -2566,6 +2574,7 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
> {
> .name = "enable",
> .callback = event_callback,
> + .release = event_release,
> },
> {
> .name = "filter",
> @@ -2634,6 +2643,9 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
> return ret;
> }
>
> + /* Gets decremented on freeing of the "enable" file */
> + event_file_get(file);
> +
> return 0;
> }
>
> --
> 2.43.0
>


--
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>