rcu: hlist_bl_set_first_rcu hangs

From: Po-Yu Chuang
Date: Wed Jan 12 2011 - 02:17:46 EST


Dear Nick,

I am porting an ARM-based platform from 2.6.35 to current mainline.
The compiler is gcc-4.4.0.

While booting, the mainline kernel hangs after printing
"RPC: Registered tcp NFSv4.1 backchannel transport module"
with a configuration similar to the one I use for 2.6.35.
I then found that there is an infinite loop in hlist_bl_set_first_rcu().

The call sequence is:

_d_rehash @fs/dcache.c
__d_rehash @fs/dcache.c
hlist_bl_add_head_rcu @include/linux/rculist_bl.h
hlist_bl_set_first_rcu @include/linux/rculist_bl.h

Current implementation of hlist_bl_set_first_rcu:

static inline void hlist_bl_set_first_rcu(struct hlist_bl_head *h,
					struct hlist_bl_node *n)
{
	LIST_BL_BUG_ON((unsigned long)n & LIST_BL_LOCKMASK);
	LIST_BL_BUG_ON(!((unsigned long)h->first & LIST_BL_LOCKMASK));
	rcu_assign_pointer(h->first,
		(struct hlist_bl_node *)((unsigned long)n | LIST_BL_LOCKMASK));
}

The infinite loop is caused by this line:

	LIST_BL_BUG_ON(!((unsigned long)h->first & LIST_BL_LOCKMASK));

If I comment it out, the kernel boots fine.

The problem appears to be that LIST_BL_LOCKMASK is 0 when CONFIG_SMP=n
and CONFIG_DEBUG_SPINLOCK=n:

#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
#define LIST_BL_LOCKMASK 1UL
#else
#define LIST_BL_LOCKMASK 0UL
#endif

With a mask of 0, (unsigned long)h->first & LIST_BL_LOCKMASK is always 0,
so !((unsigned long)h->first & LIST_BL_LOCKMASK) is always true.

I also have CONFIG_DEBUG_LIST=y, so LIST_BL_BUG_ON() is compiled in and
the check fires. Boom! Kernel dies.
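
To illustrate the reasoning outside the kernel, here is a minimal userspace
sketch. The names LOCKMASK_SMP, LOCKMASK_UP, check_fires(), check_fires_alt()
and demo() are hypothetical and exist only for this example. check_fires()
mirrors the condition currently passed to LIST_BL_BUG_ON(). check_fires_alt()
is just one possible alternative condition that compares against the mask
itself; it is an assumption of this sketch, not a tested kernel patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two values LIST_BL_LOCKMASK can take. */
#define LOCKMASK_SMP 1UL	/* CONFIG_SMP=y or CONFIG_DEBUG_SPINLOCK=y */
#define LOCKMASK_UP  0UL	/* both options disabled */

/* Mirrors the current condition: returns true when the BUG would trigger. */
static inline bool check_fires(uintptr_t first, unsigned long mask)
{
	return !(first & mask);
}

/*
 * Alternative condition (assumption of this sketch): trigger only when
 * the masked bits differ from the mask, which can never happen for mask 0.
 */
static inline bool check_fires_alt(uintptr_t first, unsigned long mask)
{
	return (first & mask) != mask;
}

/* Walks through the cases described in this mail. */
static inline void demo(void)
{
	/* Mask 1: the check only triggers when the lock bit is clear. */
	assert(!check_fires(0x1001, LOCKMASK_SMP));
	assert(check_fires(0x1000, LOCKMASK_SMP));

	/* Mask 0: the check triggers for every pointer value. */
	assert(check_fires(0x1001, LOCKMASK_UP));
	assert(check_fires(0x1000, LOCKMASK_UP));

	/* The alternative never triggers with mask 0 ... */
	assert(!check_fires_alt(0x1000, LOCKMASK_UP));
	/* ... and still catches a clear lock bit with mask 1. */
	assert(check_fires_alt(0x1000, LOCKMASK_SMP));
	assert(!check_fires_alt(0x1001, LOCKMASK_SMP));
}
```

Calling demo() from a small main() should pass all assertions, which matches
what I see on my board: with the mask defined as 0UL the current check cannot
succeed for any value of h->first.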

best regards,
Po-Yu Chuang