Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

From: Rainer Weikusat
Date: Wed Sep 30 2015 - 09:27:29 EST


Mathias Krause <minipli@xxxxxxxxxxxxxx> writes:
> On 30 September 2015 at 12:56, Rainer Weikusat <rweikusat@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>> Mathias Krause <minipli@xxxxxxxxxxxxxx> writes:
>>> On 29 September 2015 at 21:09, Jason Baron <jbaron@xxxxxxxxxx> wrote:
>>>> However, if we call connect on socket 's', to connect to a new socket 'o2', we
>>>> drop the reference on the original socket 'o'. Thus, we can now close socket
>>>> 'o' without unregistering from epoll. Then, when we either close the ep
>>>> or unregister 'o', we end up with this list corruption. Thus, this is not a
>>>> race per se, but can be triggered sequentially.
>>>
>>> Sounds profound, but the reproducer calls connect only once per
>>> socket. So there is no "connect to a new socket", no?
>>> But w/e, see below.
>>
>> In case you want some information on this: This is a kernel warning I
>> could trigger (more than once) on the single day I could so far spend
>> looking into this (3.2.54 kernel):
>>
>> Sep 15 19:37:19 doppelsaurus kernel: WARNING: at lib/list_debug.c:53 list_del+0x9/0x30()
>> Sep 15 19:37:19 doppelsaurus kernel: Hardware name: 500-330nam
>> Sep 15 19:37:19 doppelsaurus kernel: list_del corruption. prev->next should be ffff88022c38f078, but was dead000000100100
>> [snip]
>
> Is that with Jason's patch or a vanilla v3.2.54?

That's a kernel warning which occurred repeatedly (among other "link
pointer disorganization" warnings) when, a while ago, I tested the
"program with unknown behaviour" you wrote on the kernel I'm currently
supporting (as I already wrote in the original mail).
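
For reading the warning quoted above: dead000000100100 is the value of
LIST_POISON1 on x86_64 (0x00100100 plus a poison delta of
0xdead000000000000), which list_del() stores in the next pointer of an
entry after unlinking it. So the check found that the entry's
predecessor had itself already been removed from a list. Below is a
minimal user-space sketch of that check, assuming a 64-bit build. The
names checked_list_del and demo_stale_delete are made up for
illustration, and the "stale" entry is constructed by hand; this only
shows how such a message can arise, not how the kernel actually got
there.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Poison values as seen on x86_64 (assumption: 64-bit pointers). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000100100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000200200ULL)

struct list_head {
	struct list_head *next, *prev;
};

/* Insert 'new' right after 'head'. */
static void list_add(struct list_head *new, struct list_head *head)
{
	struct list_head *next = head->next;

	next->prev = new;
	new->next = next;
	new->prev = head;
	head->next = new;
}

/*
 * Hypothetical user-space model of a debug-checked list_del().
 * Returns 0 on success, -1 if a consistency check fired.
 */
static int checked_list_del(struct list_head *entry)
{
	struct list_head *prev = entry->prev;
	struct list_head *next = entry->next;

	if (next == LIST_POISON1 || prev == LIST_POISON2) {
		printf("list_del corruption, entry already deleted\n");
		return -1;
	}
	if (prev->next != entry) {
		printf("list_del corruption. prev->next should be %p, "
		       "but was %p\n", (void *)entry, (void *)prev->next);
		return -1;
	}
	if (next->prev != entry) {
		printf("list_del corruption. next->prev should be %p, "
		       "but was %p\n", (void *)entry, (void *)next->prev);
		return -1;
	}

	/* Unlink, then poison the removed entry's pointers. */
	next->prev = prev;
	prev->next = next;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
	return 0;
}

/* Delete an entry whose predecessor was already removed. */
int demo_stale_delete(void)
{
	struct list_head head = { &head, &head };
	struct list_head a, stale;
	int rc;

	list_add(&a, &head);
	rc = checked_list_del(&a);	/* a.next is now LIST_POISON1 */
	assert(rc == 0);

	/* Hand-made stale entry that still points at the removed 'a'. */
	stale.prev = &a;
	stale.next = &head;
	return checked_list_del(&stale);
}
```

Calling demo_stale_delete() from a main() prints a "prev->next should
be ..., but was ..." line with the poison value and returns -1.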