Re: Orinoco card fails on resume with 2.6.2 (race condition?)

From: Rob Landley
Date: Sun Mar 07 2004 - 21:33:50 EST


On Sunday 07 March 2004 18:20, David Gibson wrote:
> Sorry I've been slow to reply - still catching up from three weeks
> away.

I'm traveling around the country myself.

> On Thu, Mar 04, 2004 at 09:45:55AM +0000, Russell King wrote:
> > On Wed, Feb 25, 2004 at 02:36:15PM -0600, Rob Landley wrote:
> > > Attached are a dmesg log from my laptop resuming after a software
> > > suspend with a stock 2.6.2 kernel, and the config that 2.6.2 kernel was
> > > compiled with.
> >
> > I guess we need the orinoco people to comment on this; although I seem
> > to look after the 2.6 PCMCIA _core_, I certainly do not look after
> > PCMCIA driver issues.
> >
> > David - can you provide any insight?
>
> Um... not a lot, I'm afraid. A "Timeout waiting for command
> completion" usually indicates a low-level problem, the registers not
> responding at all, for example. On the other hand we do seem to be
> getting some interrupts and correctly processing linkstatus frames
> before it all falls over, so it doesn't look like a complete failure
> of the low-level stuff to re-activate the card on resume.

Well, thanks for the attempt.

Hmmm... I don't _think_ the software suspend logic is likely to snapshot any
strange timer state, but I dunno. On resume, the system goes into a swap
frenzy trying to get everything back in. (This thinkpad can only hold 192
megabytes of ram, and as far as I can tell Konqueror never actually frees any
memory it allocates, so I'm usually about halfway into swap.) So if there's
any possibility of the world's strangest race condition occurring on resume,
this thing could probably trigger it. My suspend script is just:

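#!/bin/sh

# Flush dirty buffers to disk before suspending.
sync
#echo -n disk > /sys/power/state
# Enter ACPI S4 (suspend to disk); the script continues from here on resume.
echo 4 > /proc/acpi/sleep
# Reset the system clock from the hardware clock.
hwclock --hctosys
# rdate -s clock.psu.edu
# Get a fresh DHCP lease on eth0.
dhclient eth0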

> > > Usually after resume the network is back and happy, but this time it
> > > went "boing". Specifically:
>
> A race condition in the re-initialization logic is plausible - the
> locking around there is very hairy - and would jibe with this
> observation.

The odd part is that dhclient thought it had negotiated a connection before
the wireless card decided to curl up and die. Too bad the card couldn't
either start over initializing the sucker again, or at least print a stack
trace of where it got confused...

I'll let you know if it happens again. That's the... second or third time
I've seen it in the past four or five months. (It happens. Just not often
enough to determine a pattern.)
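
One way to start collecting enough data for a pattern would be to have the suspend script record, after every resume, how often the "Timeout waiting for command completion" message shows up in the kernel log. What follows is only a rough sketch: the function names and the log path are made up for illustration, and nothing here comes from the orinoco driver or from the original script.

```shell
#!/bin/sh
# Sketch only: count_timeouts, log_resume and the log path below are
# hypothetical names, not part of the original suspend script.

# Read kernel log text on stdin and print how many lines contain the
# timeout message quoted earlier in this thread.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function from reporting failure in that case.
count_timeouts() {
    grep -c "Timeout waiting for command completion" || true
}

# Read kernel log text on stdin and append one line to the file named
# in $1: a timestamp plus the number of timeout messages seen.
log_resume() {
    count=$(count_timeouts)
    echo "$(date '+%Y-%m-%d %H:%M:%S') timeouts=$count" >> "$1"
}

# Intended use at the end of the suspend script (path is an assumption):
#   dmesg | log_resume /var/log/resume-orinoco.log
```

Since dmesg shows everything still in the kernel ring buffer, the counts are cumulative since boot, so a bad resume would show up as an increase from one log line to the next rather than as a nonzero value.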

Rob
