Re: [PATCH v3] remoteproc: Fix NULL vs IS_ERR() checking in rproc_create_trace_file

From: Bjorn Andersson
Date: Tue Jan 18 2022 - 14:17:56 EST


On Tue 18 Jan 10:56 CST 2022, Mathieu Poirier wrote:

> On Mon, Jan 17, 2022 at 04:31:23PM -0600, Bjorn Andersson wrote:
> > On Mon 17 Jan 11:06 CST 2022, Mathieu Poirier wrote:
> >
> > > On Wed, Jan 05, 2022 at 01:10:22PM +0000, Miaoqian Lin wrote:
> > > > The debugfs_create_file() function doesn't return NULL.
> > > > It returns error pointers. Fix check in rproc_create_trace_file
> > > > and make it returns return error pointers.
> > >
> > > s/"returns return"/return
> > >
> > > > Fix check in rproc_handle_trace to propagate the error code.
> > > >
> > > > Signed-off-by: Miaoqian Lin <linmq006@xxxxxxxxx>
> > > > ---
> > > > Changes in v2:
> > > > - return PTR_ERR(tfile) in rproc_create_trace_file
> > > > - fix check in rproc_handle_trace()
> > > > Changes in v3:
> > > > - return tfile to fix incorrect return type in v2
> > > > ---
> > > > drivers/remoteproc/remoteproc_core.c | 6 ++++--
> > > > drivers/remoteproc/remoteproc_debugfs.c | 4 +---
> > > > 2 files changed, 5 insertions(+), 5 deletions(-)
> > > >
> > >
> > > I will fix the above, add a proper "Fixes" tag and apply this patch to
> > > rproc-next when v5.17-rc1 comes out next week.
> > >
> >
> > We're actually not supposed to check debugfs_create_*() for errors.
>
> I'm interested in knowing more about this - can you expand on the specifics or
> perhaps provide a link?
>

I'm not able to find anything going into the reasoning behind it, but
you can find lots of examples where Greg says that we shouldn't do this:

$ git log --grep "no need to check return value of debugfs_create functions"

E.g.:
https://lore.kernel.org/r/20200818133701.462958-1-gregkh@xxxxxxxxxxxxxxxxxxx

> >
> > > Thanks,
> > > Mathieu
> > >
> > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > > index 775df165eb45..5608408f8eac 100644
> > > > --- a/drivers/remoteproc/remoteproc_core.c
> > > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > > @@ -656,6 +656,7 @@ static int rproc_handle_trace(struct rproc *rproc, void *ptr,
> > > > struct rproc_debug_trace *trace;
> > > > struct device *dev = &rproc->dev;
> > > > char name[15];
> > > > + int ret;
> > > >
> > > > if (sizeof(*rsc) > avail) {
> > > > dev_err(dev, "trace rsc is truncated\n");
> > > > @@ -684,9 +685,10 @@ static int rproc_handle_trace(struct rproc *rproc, void *ptr,
> > > >
> > > > /* create the debugfs entry */
> > > > trace->tfile = rproc_create_trace_file(name, rproc, trace);
> > > > - if (!trace->tfile) {
> > > > + if (IS_ERR(trace->tfile)) {
> > > > + ret = PTR_ERR(trace->tfile);
> > > > kfree(trace);
> > > > - return -EINVAL;
> > > > + return ret;
> >
> >
> > And actually catching and propagating the error here means that we will
> > start failing rproc_boot() for firmware including a RSC_TRACE when
> > debugfs is disabled...
> >
> > So if we really want to save the heap space we should at least cleanly
> > ignore the error, by cleaning up and returning 0 here.
>
> Humm... To me the _intent_ of the upstream code has always been to propagate
> errors reported by rproc_create_trace_file(). The fact that it hasn't happened
> because of inappropriate error handling is something that should be corrected.
>

I share that view, in general. I suspect that the idea with debugfs is
that it's for debugging purposes and you don't want your remoteproc to
stop working just because there might be an issue debugging it.

> That being said disabling debugfs is a common practice for production systems
> and I agree that handling such a condition by returning 0 when
> rproc_create_trace_file() returns -ENODEV is the right thing to do.
>

Right, but even with debugfs enabled, do you want to prevent your
remoteproc from booting just because the debugfs, for some reason,
wasn't able to add the trace file?

For me the question is if we should clean up the "trace" object or not,
as this only relates to the debugfs file. Ignoring the error would imply
that we just keep this memory allocated - which I'm fine with for the
sake of avoiding the error handling.
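
To make the two options concrete, here is a minimal userspace sketch, not
the actual remoteproc code. ERR_PTR()/PTR_ERR()/IS_ERR() are re-implemented
as stand-ins for the kernel helpers, and create_trace_file(),
handle_trace_keep() and handle_trace_cleanup() are hypothetical names used
only for illustration:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's error-pointer helpers (assumption). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct trace {
	void *tfile;	/* valid pointer or error pointer, never NULL */
};

/* Hypothetical stand-in for rproc_create_trace_file(). */
static void *create_trace_file(int debugfs_enabled)
{
	static int dummy_dentry;

	return debugfs_enabled ? (void *)&dummy_dentry : ERR_PTR(-ENODEV);
}

/* Option 1: ignore the result entirely and keep the trace object around. */
static int handle_trace_keep(int debugfs_enabled, struct trace **out)
{
	struct trace *trace = calloc(1, sizeof(*trace));

	if (!trace)
		return -ENOMEM;

	trace->tfile = create_trace_file(debugfs_enabled);
	*out = trace;
	return 0;
}

/* Option 2: free the trace object on failure, but still report success. */
static int handle_trace_cleanup(int debugfs_enabled, struct trace **out)
{
	struct trace *trace = calloc(1, sizeof(*trace));

	if (!trace)
		return -ENOMEM;

	trace->tfile = create_trace_file(debugfs_enabled);
	if (IS_ERR(trace->tfile)) {
		free(trace);
		trace = NULL;
	}
	*out = trace;
	return 0;
}
```

In both variants a failure to create the debug file never turns into a
failure of the caller; the only difference is whether the allocation is
kept.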

> Thanks,
> Mathieu
>
> >
> > > > }
> > > >
> > > > list_add_tail(&trace->node, &rproc->traces);
> > > > diff --git a/drivers/remoteproc/remoteproc_debugfs.c b/drivers/remoteproc/remoteproc_debugfs.c
> > > > index b5a1e3b697d9..2ae59a365b7e 100644
> > > > --- a/drivers/remoteproc/remoteproc_debugfs.c
> > > > +++ b/drivers/remoteproc/remoteproc_debugfs.c
> > > > @@ -390,10 +390,8 @@ struct dentry *rproc_create_trace_file(const char *name, struct rproc *rproc,
> > > >
> > > > tfile = debugfs_create_file(name, 0400, rproc->dbg_dir, trace,
> > > > &trace_rproc_ops);
> > > > - if (!tfile) {
> > > > + if (IS_ERR(tfile))
> > > > dev_err(&rproc->dev, "failed to create debugfs trace entry\n");
> >
> > And I therefore think this function would be better reduced to:
> >
> > return debugfs_create_file(...);
> >

Taking another look at the implementation of debugfs_create_file(), I
think this dev_err() should be removed, because debugfs_create_file()
will already have printed a more useful error.
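
As a rough userspace illustration of the reduced helper (a sketch under
stated assumptions, not the real driver code): fake_debugfs_create_file()
is a hypothetical stand-in that, like debugfs_create_file(), returns an
error pointer on failure and never NULL, and ERR_PTR()/IS_ERR() are local
re-implementations of the kernel helpers:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers (assumption). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in: error pointer on failure, never NULL. */
static void *fake_debugfs_create_file(const char *name, int fail)
{
	static int dummy_dentry;

	(void)name;
	return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_dentry;
}

/* The helper reduced to a pass-through: no check, no extra message. */
static void *create_trace_file(const char *name, int fail)
{
	return fake_debugfs_create_file(name, fail);
}
```

With this shape an old-style "if (!tfile)" test in the caller can never
fire, which is the mistake the patch set out to fix in the first place.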

Regards,
Bjorn

> > Regards,
> > Bjorn
> >
> > > > - return NULL;
> > > > - }
> > > >
> > > > return tfile;
> > > > }
> > > > --
> > > > 2.17.1
> > > >