Re: page fault problems porting a network driver to 2.4.x

From: Andi Kleen (ak@suse.de)
Date: Tue Oct 24 2000 - 14:45:19 EST


On Tue, Oct 24, 2000 at 11:21:23AM -0700, Hen, Shmulik wrote:
> > The function looks something like:
> >
> > int iansHardStartXmit(struct sk_buff *skb, struct net_device *dev) {
> > int res;
> > struct net_device *base;
> >
> > spin_lock(&lock);

Normally the network code should synchronize the hard_start_xmit entry for you.
If it didn't, the lock should probably be taken with spin_lock_irqsave.
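
For illustration, the irqsave pairing looks roughly like the sketch below. It is
a userspace model, not driver code: spinlock_t and the two macros are stand-ins
written so the example compiles outside the kernel (in a real driver they come
from <linux/spinlock.h>), and xmit_lock, queued and queue_one_packet are
hypothetical names that do not appear in the original function.

```c
/* Userspace stand-ins (assumption): these only record state so the
 * pairing can be shown; they do no real locking or interrupt masking. */
typedef struct { int held; int irqs_off; } spinlock_t;

#define spin_lock_irqsave(l, flags) \
    do { (flags) = (unsigned long)(l)->irqs_off; \
         (l)->irqs_off = 1; (l)->held = 1; } while (0)
#define spin_unlock_irqrestore(l, flags) \
    do { (l)->held = 0; (l)->irqs_off = (int)(flags); } while (0)

static spinlock_t xmit_lock;  /* hypothetical driver-private lock */
static int queued;            /* hypothetical state shared with an interrupt handler */

/* Hypothetical helper: updates shared state with the lock held and
 * interrupts marked off, then restores the saved interrupt state. */
static int queue_one_packet(void)
{
    unsigned long flags;
    int n;

    spin_lock_irqsave(&xmit_lock, flags);       /* save state, disable, lock */
    n = ++queued;
    spin_unlock_irqrestore(&xmit_lock, flags);  /* unlock, restore saved state */
    return n;
}
```

The point of the irqsave form is that the same lock can then be taken from
interrupt context without deadlocking against code that was interrupted while
holding it.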

> > base = get_base_driver_by_name(name);
> >
> > if(base != NULL) {
> > res = base->hard_start_xmit(skb, base);
> > }
> >
> > spin_unlock(&lock);
> > return res;
> > }
> >
> > We used kdb in order to track down the problem and found out the following
> > stack trace:
> >
> > EBP EIP function(args)
> > 0xc4cd1c54 0xd081e3e7 [e100]__kallsyms+0xb (0xc4b595a0,

My first guess is that the kdb kernel was not compiled with frame
pointers, so this part of the trace is just stack garbage.

> > Figuring the dev->hard_start_xmit pointer got trashed somehow, we added a
> > check to make sure the same pointer is always called, and indeed this was
> > the case. Looking at the assembly code with kdb, we could see that the
> > call to the base driver is done by a 'call *%eax' command. kdb reports
> > that eax=0xffffffff after the page fault (origeax).

origeax is always -1 for exceptions; it is used as a marker that the entry
isn't a system call. Only for system calls is it the real eax. You should
probably look at the real eax a bit below.

-Andi

This archive was generated by hypermail 2b29 : Tue Oct 31 2000 - 21:00:33 EST