Re: [Ocfs2-users] [OCFS2] Crash at o2net_shutdown_sc()

From: Tejun Heo
Date: Mon Mar 04 2013 - 13:56:57 EST


Hello,

On Sat, Mar 02, 2013 at 09:41:54AM +0100, richard -rw- weinberger wrote:
> On Fri, Mar 1, 2013 at 10:42 PM, Srinivas Eeda <srinivas.eeda@xxxxxxxxxx> wrote:
> > Yes, that was the crash I was referring to, which stopped me from testing my
> > other patch on mainline. I think the crashes started with some workqueue
> > patches introduced by commit 57b30ae77bf00d2318df711ef9a4d2a9be0a3a2a.
> > Earlier kernels should be fine.
> >
> > Patch https://lkml.org/lkml/2012/10/18/592 tried to address one problem, and
> > it helped ramster, which uses the same ocfs2/o2net code. There still seems
> > to be another problem that crashes ocfs2.
>
> If commit 57b30ae (workqueue: reimplement cancel_delayed_work() using
> try_to_grab_pending()) introduced that regression, it is time to CC Tejun.

Hmmm.....

> >> [ 1514.840690] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> >> [ 1514.841627] IP: [<ffffffff816ce840>] kernel_sock_ioctl+0x50/0x50

I suppose it's because either sock->ops or sock->ops->ioctl is NULL?
Can someone teach me what could lead to such conditions here and how
it could be affected by cancel_delayed_work()?

> >> [ 1514.841627] Call Trace:
> >> [ 1514.841627] [<ffffffff81323776>] ? o2net_shutdown_sc+0x106/0x1e0

I suppose this is the work function? I wonder why the '?' is there, though.

> >> [ 1514.841627] [<ffffffff810013fa>] ? __switch_to+0x2a/0x4a0
> >> [ 1514.841627] [<ffffffff818a5c22>] ? _raw_spin_unlock_irq+0x12/0x40
> >> [ 1514.841627] [<ffffffff81069d26>] ? finish_task_switch+0x56/0xc0
> >> [ 1514.841627] [<ffffffff81056eb3>] process_one_work+0x133/0x510
> >> [ 1514.841627] [<ffffffff81323670>] ? o2net_sc_connect_completed+0xf0/0xf0
> >> [ 1514.841627] [<ffffffff810585ed>] worker_thread+0x15d/0x450
> >> [ 1514.841627] [<ffffffff81058490>] ? busy_worker_rebind_fn+0x100/0x100
> >> [ 1514.841627] [<ffffffff8105e10b>] kthread+0xbb/0xc0
> >> [ 1514.841627] [<ffffffff818a0000>] ? e1000_regdump+0x262/0x3be
> >> [ 1514.841627] [<ffffffff8105e050>] ? kthread_create_on_node+0x130/0x130
> >> [ 1514.841627] [<ffffffff818accac>] ret_from_fork+0x7c/0xb0
> >> [ 1514.841627] [<ffffffff8105e050>] ? kthread_create_on_node+0x130/0x130

Thanks.

--
tejun