> From: Alan Cox <alan@cymru.net>
> Date: Tue, 16 Sep 1997 11:57:25 +0100 (BST)
>
> > would get quickly detected. Just be careful with
> > release_sock(). It might be worth to check the more obscure code
> > paths in 2.0.x that call release_sock() if they are really
> > compiled correctly. Do you have any strange crash reports left?
>
> sk->prot magically going to NULL during a tcp packet receive or
> outside of the receive during a recvmsg call
>
> Alan this sounds more like a "touch sk after it is free'd" bug rather
> than a gcc code gen problem.
>
> In any event, we should at some point figure out what caused the bad
> code in Andi's example. It could be bad constraints in an inline asm
> on Intel for all we know and gcc is not to blame.

The code looked like this:

label:
        kfree_skb(skb, FREE_READ);
        release_sock(sk);
        return 0;

kfree_skb is this inline code:

extern __inline__ void kfree_skb(struct sk_buff *skb, int rw)
{
        if (atomic_dec_and_test(&skb->users))
                __kfree_skb(skb);
}

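
For readers who do not have the kernel headers at hand, the drop-the-last-reference pattern that kfree_skb relies on can be sketched in portable C11. This is only an illustration, not the kernel's implementation: struct demo_buf, demo_dec_and_test() and demo_kfree() are made-up names, and the C11 atomic_fetch_sub() call stands in for the kernel's atomic_dec_and_test().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sk_buff: only the reference count. */
struct demo_buf {
        atomic_int users;
        bool freed;     /* set instead of really freeing, so the effect is visible */
};

/* Returns true when this decrement brought the counter down to zero. */
static bool demo_dec_and_test(atomic_int *v)
{
        /* atomic_fetch_sub() returns the value held before the subtraction */
        return atomic_fetch_sub(v, 1) == 1;
}

/* Drop one reference; only the holder of the last reference "frees". */
static void demo_kfree(struct demo_buf *b)
{
        if (demo_dec_and_test(&b->users))
                b->freed = true;
}
```

With two references held, the first demo_kfree() call only decrements the counter and the second one marks the buffer as freed.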
release_sock is this:

static inline void release_sock(struct sock *sk)
{
        barrier();
        if ((sk->sock_readers = sk->sock_readers-1) == 0)
                __release_sock(sk);
}

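
As a reminder of what barrier() is supposed to do, here is a minimal user-space sketch of the same shape. It assumes the usual GCC idiom of an empty asm statement with a "memory" clobber. struct demo_sock, demo_barrier(), demo_slow_release() and demo_release_sock() are hypothetical names invented for this example, and the assert() is an added sanity check that is not in the kernel code.

```c
#include <assert.h>

/*
 * Compiler-only barrier (GCC extended asm): it emits no instructions,
 * but the "memory" clobber tells the compiler that memory may have
 * changed, so values loaded from memory before it must be reloaded.
 */
#define demo_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-in for struct sock. */
struct demo_sock {
        int sock_readers;
        int released;   /* set by the slow path so the effect is visible */
};

/* Stand-in for __release_sock(): just records that it ran. */
static void demo_slow_release(struct demo_sock *sk)
{
        sk->released = 1;
}

static inline void demo_release_sock(struct demo_sock *sk)
{
        assert(sk->sock_readers > 0);   /* sanity check, not in the original */
        demo_barrier();
        if (--sk->sock_readers == 0)
                demo_slow_release(sk);
}
```

The clobber only concerns the contents of memory. The pointer sk itself is an ordinary local value, so the compiler is still obliged to have it in some register or stack slot after the barrier.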
Here is the faulty code that was generated (note that there is some extra
debugging code mixed in that tests %ebx, but the crash happened without the
debugging code too, so you can ignore it here):

;; %ebx is never initialised to sk beforehand; it is only used as
;; a temporary register to check the return value of a function
;; that mostly returns NULL - that's why it crashes.
.L1694:
;; kfree_skb
        leal 112(%esi),%eax
#APP
        decl 112(%esi); sete %al
#NO_APP
        testb %al,%al
        je .L1695
        pushl %esi
        call __kfree_skb
        addl $4,%esp
.L1695:
#APP
        movl %ebx,%eax          ; debugging code - ignore.
#NO_APP
        testl %eax,%eax
        je .L1699               ; end of debugging code
#APP    ; that's barrier() - it looks like the culprit
#NO_APP
;; release_sock():
;; %ebx is never loaded from the stack!
        movl 44(%ebx),%eax      ; <<<---- here is the crash.
        decl %eax
        movl %eax,44(%ebx)
        jne .L1703
        pushl %ebx
        call __release_sock
        addl $4,%esp
        popl %ebx
        popl %esi
        ret

I'll check later if egcs fixes the problem and if not I'll send in
a bug report.
-Andi