Re: Kernel crash after using new Intel NIC (igb)

From: Arun Sharma
Date: Thu May 26 2011 - 15:29:50 EST

Next message: Casey Schaufler: "Re: [PATCH v5 00/21] EVM"
Previous message: tip-bot for Arnaldo Carvalho de Melo: "[tip:perf/urgent] perf symbols: Handle /proc/sys/kernel/kptr_restrict"
In reply to: Ben Hutchings: "Re: Kernel crash after using new Intel NIC (igb)"
Next in thread: Eric Dumazet: "Re: Kernel crash after using new Intel NIC (igb)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 5/24/11 11:35 PM, Eric Dumazet wrote:

>> Another possibility is to do the list_empty() check twice. Once without
>> taking the lock and again with the spinlock held.
>
> Why ?

Part of the problem is that I don't have a precise understanding of the race condition that's causing the list to become corrupted.

All I know is that doing the check under the lock fixes it. If that slows things down, we can do a check outside the lock first (since it's cheap). But because that unlocked check can give the wrong answer, we verify it again under the lock before touching the list.
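The check-twice idea can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: the list helpers imitate the kernel's list_head, an atomic_flag stands in for the spinlock, and the names struct peer, mark_unused() and unlink_from_unused_checked() are made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal userspace imitation of the kernel's circular list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink and re-initialise, so a second call on the same node is a no-op. */
static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical stand-in for the peer node; only the list linkage matters here. */
struct peer { struct list_head unused; };

static struct list_head unused_list = { &unused_list, &unused_list };
static atomic_flag unused_lock = ATOMIC_FLAG_INIT; /* stand-in for a spinlock */

static void lock_unused(void)
{
    while (atomic_flag_test_and_set_explicit(&unused_lock, memory_order_acquire))
        ; /* spin until the flag is released */
}

static void unlock_unused(void)
{
    atomic_flag_clear_explicit(&unused_lock, memory_order_release);
}

static void peer_init(struct peer *p) { list_init(&p->unused); }

/* Put the peer on the unused list unless it is already there. */
static void mark_unused(struct peer *p)
{
    lock_unused();
    if (list_empty(&p->unused))
        list_add_tail(&p->unused, &unused_list);
    unlock_unused();
}

/* Returns true if the peer was on the unused list and has been unlinked. */
static bool unlink_from_unused_checked(struct peer *p)
{
    bool removed = false;

    /* First check: no lock taken, so the answer is only a hint. */
    if (list_empty(&p->unused))
        return false;

    /* Second check: repeated under the lock before modifying the list. */
    lock_unused();
    if (!list_empty(&p->unused)) {
        list_del_init(&p->unused);
        removed = true;
    }
    unlock_unused();
    return removed;
}
```

The unlocked read only decides whether taking the lock is worthwhile; the list is modified solely on the strength of the second, locked check. In strict C terms the unlocked read of a field that another thread may write is itself a data race, so real code would need a suitably careful read there.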

> list_del_init(&p->unused); (done under lock of course) is safe, you can
> call it twice, no problem.

Doing it twice is not a problem. But doing it when we shouldn't be doing it could be the problem.

The list modification under unused_peers.lock looks generally safe. But the control flow (based on refcnt) done outside the lock might have races.

E.g., inet_putpeer() might see the refcnt drop to zero, but before it adds the node to the unused list, another thread may call inet_getpeer() and set the refcnt back to 1. We then end up with a node that's potentially in use, but sits on the unused list.
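To make that interleaving concrete, here is a small single-threaded replay in C. It is a sketch under assumptions, not the actual inet_peer code: the struct and the helpers putpeer_drop_ref(), putpeer_add_to_unused(), getpeer() and replay_race() are hypothetical, and the "threads" are just function calls made in the problematic order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the peer node: only the two pieces of
 * state involved in the suspected race are modelled. */
struct peer {
    int  refcnt;
    bool on_unused_list;
};

/* First half of a put: drop the reference and report whether it hit zero. */
static bool putpeer_drop_ref(struct peer *p)
{
    return --p->refcnt == 0;
}

/* Second half of a put: act on the earlier (possibly stale) observation. */
static void putpeer_add_to_unused(struct peer *p)
{
    p->on_unused_list = true;
}

/* A lookup that takes a new reference on the node. */
static void getpeer(struct peer *p)
{
    p->refcnt++;
}

/* Replays the interleaving; returns true if the node ends up referenced
 * and on the unused list at the same time. */
static bool replay_race(void)
{
    struct peer p = { .refcnt = 1, .on_unused_list = false };

    bool saw_zero = putpeer_drop_ref(&p);   /* thread A: refcnt 1 -> 0 */
    getpeer(&p);                            /* thread B: refcnt 0 -> 1 */
    if (saw_zero)
        putpeer_add_to_unused(&p);          /* thread A resumes, stale decision */

    return p.refcnt > 0 && p.on_unused_list;
}
```

In this replay replay_race() returns true: the decision to add the node was made from a refcnt value that was no longer current by the time the list was touched.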

-Arun
