Re: Linux 5.12-rc7

From: Guenter Roeck
Date: Mon Apr 12 2021 - 16:05:10 EST


On 4/12/21 10:38 AM, Eric Dumazet wrote:
[ ... ]

> Yes, I think this is the real issue here. This smells like some memory
> corruption.
>
> In my traces, packet is correctly received in AF_PACKET queue.
>
> I have checked the skb is well formed.
>
> But the user space seems to never call poll() and recvmsg() on this
> af_packet socket.
>

After sprinkling the kernel with debug messages:

424 00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
424 00:01:33.693873 close(6) = 0
424 00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
424 00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
424 00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
424 00:01:33.697311 exit_group(1) = ?
424 00:01:33.698346 +++ exited with 1 +++

I only see that after adding debug messages in the kernel, so I guess there must be
a heisenbug somewhere.

Anyway, indeed, I see (another kernel debug message):

__do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8

So udhcpc doesn't even try to read the reply because it crashes after sendto()
when trying to read the current time. Unless I am missing something, that means
that the problem happens somewhere on the send side.

To make things even more interesting, it looks like the failing system call
isn't always clock_gettime().

Guenter