RE: [RFC PATCH 07/12] e1000e: debug contention on NVM SWFLAG

From: Thomas Gleixner
Date: Thu Oct 02 2008 - 14:02:58 EST


On Thu, 2 Oct 2008, Brandeburg, Jesse wrote:

> Olaf Kirch wrote:
> > Looks like the e1000 watchdog racing with some dhclient activity
> > (upping the interface).
>
> > I just noticed that the driver actually uses register pages. So it
> > looks like it's possible to have something like this without the
> > mutex:
> >
> > process A selects page A
> > process B selects page B
> > process A writes to register at offset A'
>
> I think that is possible, which is why the mutex patch would be good for
> the future. However we have not shown that to be happening as a root
> cause, but I don't rule it out.

Nevertheless I vote strongly for putting that check in _NOW_. The
check has proven that there is concurrent access, and that is a bug
by any measure.
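
To make the page-select interleaving quoted above concrete, here is a
minimal sketch that replays it deterministically in a single thread.
Everything in it is hypothetical and for illustration only: struct
paged_dev, the page and offset numbers, and the busy flag are not the
e1000e register layout or the check from the patch. A real driver
would serialize the select+write pair with a mutex; the flag here only
models the idea of detecting an access that starts while another one
is still in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_PAGES     4
#define REGS_PER_PAGE 32

/* Hypothetical model of a paged register file (not the real hardware). */
struct paged_dev {
	unsigned int cur_page;                   /* page chosen by the last select */
	uint16_t regs[NUM_PAGES][REGS_PER_PAGE]; /* backing store, one bank per page */
	bool busy;                               /* true while a select+write is in flight */
	unsigned int contention;                 /* number of overlapping accesses seen */
};

static void dev_init(struct paged_dev *d)
{
	memset(d, 0, sizeof(*d));
}

/* Step 1 of an access: select the page. */
static void select_page(struct paged_dev *d, unsigned int page)
{
	assert(page < NUM_PAGES);
	d->cur_page = page;
}

/* Step 2: write to an offset in whichever page is currently selected. */
static void write_reg(struct paged_dev *d, unsigned int off, uint16_t val)
{
	assert(off < REGS_PER_PAGE);
	d->regs[d->cur_page][off] = val;
}

/* Debug check: refuse and count an access that begins during another one. */
static bool access_begin(struct paged_dev *d)
{
	if (d->busy) {
		d->contention++;
		return false;
	}
	d->busy = true;
	return true;
}

static void access_end(struct paged_dev *d)
{
	d->busy = false;
}

/* Replay the unserialized interleaving; true if A's write hit the wrong page. */
static bool replay_race(struct paged_dev *d)
{
	select_page(d, 1);        /* process A selects page A */
	select_page(d, 2);        /* process B selects page B */
	write_reg(d, 5, 0xABCD);  /* process A writes to offset A' */
	return d->regs[2][5] == 0xABCD && d->regs[1][5] == 0;
}

/* Same sequence with the check: B's overlapping access is detected and skipped. */
static bool replay_checked(struct paged_dev *d)
{
	bool a_ok = access_begin(d);
	bool b_ok;

	select_page(d, 1);
	b_ok = access_begin(d);   /* B arrives while A is mid-access */
	if (b_ok)
		select_page(d, 2);
	write_reg(d, 5, 0xABCD);
	access_end(d);
	return a_ok && !b_ok && d->regs[1][5] == 0xABCD;
}
```

In replay_race() the write from A lands in page 2 although A selected
page 1. In replay_checked() the contention counter records the overlap
and the write stays in page 1.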

> so, why now? Drivers since before the e1000/e1000e split had this same
> code, with no reports of problems. This code has been heavily tested,
> and one of the platforms easily reproducing this has been available for
> 3 years now (ich8), with code that is basically unchanged in the driver.

Well, timing of events changes slightly over time, and we definitely
had some major changes in the last three years which influence timing
(high resolution timers, dynticks, NAPI, ...).

Thanks,

tglx