Re: [RFC PATCH 02/12] On Tue, 23 Sep 2008, David Miller wrote:

From: Thomas Gleixner
Date: Sat Oct 04 2008 - 07:08:38 EST


On Sat, 4 Oct 2008, Jiri Kosina wrote:
> On Fri, 3 Oct 2008, Jesse Brandeburg wrote:
> > Our experience is different. We are also testing with the "protection
> > patch" reverted.
> > We see that the problem specifically comes and goes when
> > removing/adding the use of set_memory_ro/set_memory_rw to the driver.
>
> But if this patch (which is an obvious workaround, compared to the other
> patches which fix real bugs, right?) would be catching some malicious
> accesses to the mapped EEPROM, there should be stacktraces present in the
> kernel log, right?

Exactly. An access to a ro region results in a fault. I have not seen
that trigger anywhere, but I can reproduce the trylock() WARN_ON, which
confirms that there is concurrent access to the NVRAM registers. The
backtrace pattern is similar to the one you have seen.

There are two possible bad results from that concurrent access:

1) Task A issues command A
                                Task B issues command B
   Task A writes data for A,
   which ends up in B

2) Task A acquires the software flag
   ......
                                Task B acquires the software flag
   Task A releases the software flag

   The firmware accesses NVRAM  Task B accesses the NVRAM

Both are probably serious enough to result in random NVRAM corruption.
There is no doubt: The missing serialization is a real bug.
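
For illustration only, here is a minimal userspace sketch of the kind of
trylock check mentioned above, using pthreads instead of the kernel mutex
API. The names nvm_mutex, nvm_acquire(), nvm_release() and
nvm_contention_count are hypothetical and are not taken from the e1000e
driver or from any of the patches in this series. A failed trylock means
somebody else is already inside the NVRAM access path, which is exactly
the concurrent access that must not happen.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock serializing all NVRAM register accesses. */
static pthread_mutex_t nvm_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Number of times a caller found the NVRAM path already in use. */
static atomic_int nvm_contention_count;

/*
 * Try to enter the NVRAM access path without blocking.
 * Returns true when the caller now owns the path. Returns false and
 * records a warning when another user is already inside, which is the
 * userspace analogue of a WARN_ON() on a failed trylock.
 */
static bool nvm_acquire(void)
{
	if (pthread_mutex_trylock(&nvm_mutex) != 0) {
		atomic_fetch_add(&nvm_contention_count, 1);
		fprintf(stderr, "WARNING: concurrent NVRAM access detected\n");
		return false;
	}
	return true;
}

/* Leave the NVRAM access path. Only call after a successful acquire. */
static void nvm_release(void)
{
	int ret = pthread_mutex_unlock(&nvm_mutex);

	assert(ret == 0);
	(void)ret;
}
```

A caller would wrap the whole command/data sequence (or the whole
software flag hold time) in nvm_acquire()/nvm_release(). A real fix
would block on the lock rather than bail out; the non-blocking variant
is only useful to make the contention visible.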

Your question, why this happens only now when the bug has been there
forever, is definitely a good one. My opinion is that we have either
just been lucky, or some minor modification somewhere else in the
e1000e code, or even in the generic/architecture code, removed an
accidental serializing effect.

I was not able to reproduce the trylock warning on Fedora 8, but
Fedora 10-Beta triggers it once in 50 boots. I'm not going to remove
the mutex to verify whether it actually would corrupt the NVRAM :)

In theory we should be able to reproduce the problem with older kernel
versions as well. Maybe not the corruption, but we might see the
mutex_trylock check trigger.

Thanks,

tglx