Re: [PATCH 0/2] tracing: Detect unsafe dereferencing of pointers from trace events

From: Steven Rostedt
Date: Wed Mar 03 2021 - 06:31:51 EST


On Wed, 3 Mar 2021 09:21:39 +0800
Peter Chen <peter.chen@xxxxxxxxxx> wrote:

> On 21-03-02 09:56:05, Steven Rostedt wrote:
> > On Tue, 2 Mar 2021 16:23:55 +0800
> > Peter Chen <peter.chen@xxxxxxxxxx> wrote:
> >
> > s it looks like it uses %pa which IIUC from the printk code, it
> > > > >> dereferences the pointer to find it's virtual address. The event has
> > > > >> this as the field:
> > > > >>
> > > > >> __field(struct cdns3_trb *, start_trb_addr)
> > > > >>
> > > > >> Assigns it with:
> > > > >>
> > > > >> __entry->start_trb_addr = req->trb;
> > > > >>
> > > > >> And prints that with %pa, which will dereference pointer at the time of
> > > > >> reading, where the address in question may no longer be around. That
> > > > >> looks to me as a potential bug.
> > >
> > > Steven, thanks for reporting. Do you mind sending patch to fix it?
> > > If you have no time to do it, I will do it later.
> > >
> >
> > I would have already fixed it, but I wasn't exactly sure how this is used.
> >
> > In Documentation/core-api/printk-formats.rst we have:
> >
> > Physical address types phys_addr_t
> > ----------------------------------
> >
> > ::
> >
> > %pa[p] 0x01234567 or 0x0123456789abcdef
> >
> > For printing a phys_addr_t type (and its derivatives, such as
> > resource_size_t) which can vary based on build options, regardless of the
> > width of the CPU data path.
> >
> > So it only looks like it is used to for the size of the pointer.
> >
> > I guess something like this might work:
> >
> > diff --git a/drivers/usb/cdns3/cdns3-trace.h b/drivers/usb/cdns3/cdns3-trace.h
> > index 8648c7a7a9dd..d3b8624fc427 100644
> > --- a/drivers/usb/cdns3/cdns3-trace.h
> > +++ b/drivers/usb/cdns3/cdns3-trace.h
> > @@ -214,7 +214,7 @@ DECLARE_EVENT_CLASS(cdns3_log_request,
> > __field(int, no_interrupt)
> > __field(int, start_trb)
> > __field(int, end_trb)
> > - __field(struct cdns3_trb *, start_trb_addr)
> > + __field(phys_addr_t, start_trb_addr)
> > __field(int, flags)
> > __field(unsigned int, stream_id)
> > ),
> > @@ -230,7 +230,7 @@ DECLARE_EVENT_CLASS(cdns3_log_request,
> > __entry->no_interrupt = req->request.no_interrupt;
> > __entry->start_trb = req->start_trb;
> > __entry->end_trb = req->end_trb;
> > - __entry->start_trb_addr = req->trb;
> > + __entry->start_trb_addr = *(const phys_addr_t *)req->trb;
> > __entry->flags = req->flags;
> > __entry->stream_id = req->request.stream_id;
> > ),
> > @@ -244,7 +244,7 @@ DECLARE_EVENT_CLASS(cdns3_log_request,
> > __entry->status,
> > __entry->start_trb,
> > __entry->end_trb,
> > - __entry->start_trb_addr,
> > + /* %pa dereferences */ &__entry->start_trb_addr,
> > __entry->flags,
> > __entry->stream_id
> > )
> >
> >
> > Can you please test it? I don't have the hardware, but I also want to make
> > sure I don't break anything.
> >
>
> Hi Steve,
>
> Regarding this issue, I have one question:
> - If the virtual address is got from dma_alloc_coherent, can't we print
> this address using %pa to get its physical address (the same with DMA address),
> or its DMA address using %pad? req->trb is the virtual address got from

I'm not sure. I just looked at the vsprintf code, which simply does:

> static noinline_for_stack
> char *address_val(char *buf, char *end, const void *addr,
> struct printf_spec spec, const char *fmt)
> {
> unsigned long long num;
> int size;
>
> if (check_pointer(&buf, end, addr, spec))
> return buf;
>
> switch (fmt[1]) {
> case 'd':
> num = *(const dma_addr_t *)addr;
> size = sizeof(dma_addr_t);

The above is what is called, which dereferences addr and places it into num.

> break;
> case 'p':
> default:
> num = *(const phys_addr_t *)addr;
> size = sizeof(phys_addr_t);
> break;
> }
>
> return special_hex_number(buf, end, num, size);

This just prints the hex number defined by num. There's no physical address
calculations done via vsprintf that I'm aware of.

> }



> dma_alloc_coherent. And what's the logic for this "unsafe dereference" warning?

The actions done in TP_fast_assign() are executed at the time of the trace
event (i.e. when trace_cdns3_ep_queue() is called), but the actions of
TP_printk() are executed when the user reads the trace file (minutes,
hours, days later!). Thus, when you have a TP_printk("... %pa ...", and a
__entry->start_trb_addr referenced, that start_trb_addr may be pointing to
something that is long gone, and when you use a kernel vsnprintf()
dereferencing pointer like %pa (or %pU, %pM, etc), it may read something
that no longer exists, and is obviously unsafe.

So I created this patch series (in this thread, you can use lore to see it)
that analyzes the trace events as they get register, parses the TP_printk()
for any dereferencing pointers, and makes sure that what it is dereferencing
lives on the ring buffer. Otherwise it is likely referencing something that
may no longer be around when it gets dereferenced.

I booted an allyesconfig with this logic and your event triggered the
warning, and it was the only event in the kernel to do so. Thus, I've been
pretty good at policing trace events from doing such things (I've stopped a
lot of them from getting into the kernel by manually reviewing every trace
event that is Cc'd to me).

I hope that answers your question.

-- Steve