Re: bisected! (WAS Re: s390x: kernel BUG at fs/ext4/inode.c:1591!)

From: Dmitry Monakhov
Date: Tue Apr 02 2013 - 06:01:48 EST


On Tue, 2 Apr 2013 00:06:24 -0400 (EDT), CAI Qian <caiqian@xxxxxxxxxx> wrote:
> Bisect indicated this is the culprit,
>
> 0e401101db49959f5783f6ee9e676124b5a183ac
> ext4: fix memory leakage in mext_check_coverage
Strange...
It changes a bug in move_extent.c (e4defrag functionality)
ASAIU you just previously stopped your bisecting process here. Right?
Is this indeed a first bad commit?
>
> This following with Dmitry's debug patch applied,
>
> CAI Qian
>
> Ã 101.408610Â ES cache assertation failed for inode: 753 es_cached ex Ã56/5/744
> 81/20Â != found ex Ã56/5/3396400/0Â retval 0 flags 5
> Ã 209.858899Â ES cache assertation failed for inode: 384 es_cached ex Ã57/7/332
> 82/20Â != found ex Ã57/7/3396400/0Â retval 0 flags 5
> Ã 209.860656Â ES cache assertation failed for inode: 384 es_cached ex Ã25/1/332
> 50/20Â != found ex Ã25/1/0/0Â retval 0 flags 0
> Ã 209.893587Â ES cache assertation failed for inode: 384 es_cached ex Ã22/1/332
> 47/20Â != found ex Ã22/1/34838/1000Â retval 1 flags 0
> Ã 209.913482Â ES cache assertation failed for inode: 384 es_cached ex Ã27/1/329
> 40/20Â != found ex Ã27/1/0/0Â retval 0 flags 0
> Ã 209.919950Â ES cache assertation failed for inode: 384 es_cached ex Ã59/5/338
> 48/20Â != found ex Ã59/5/3396400/0Â retval 0 flags 5
> Ã 209.931856Â ES cache assertation failed for inode: 384 es_cached ex Ã7/1/3292
> 0/20Â != found ex Ã7/1/35879/20Â retval 1 flags 43
> Ã 209.969282Â ES cache assertation failed for inode: 384 es_cached ex Ã35/1/361
> 97/20Â != found ex Ã35/1/36197/1000Â retval 1 flags 0
> Ã 209.969290Â ES cache assertation failed for inode: 384 es_cached ex Ã48/1/362
> 10/20Â != found ex Ã48/1/0/0Â retval 0 flags 0
> Ã 209.980724Â ES cache assertation failed for inode: 384 es_cached ex Ã13/4/334
> 89/20Â != found ex Ã13/4/2161372/0Â retval 0 flags 5
> Ã 209.980744Â ES cache assertation failed for inode: 384 es_cached ex Ã61/3/335
> 37/20Â != found ex Ã61/3/3396400/0Â retval 0 flags 5
> Ã 209.983848Â ES cache assertation failed for inode: 384 es_cached ex Ã44/2/335
> 20/20Â != found ex Ã44/2/36216/20Â retval 2 flags 43
> Ã 210.020041Â ES cache assertation failed for inode: 384 es_cached ex Ã61/3/341
> 91/20Â != found ex Ã61/3/3396400/0Â retval 0 flags 5
> Ã 210.050100Â ES cache assertation failed for inode: 384 es_cached ex Ã22/11/34
> 565/20Â != found ex Ã22/11/3396400/0Â retval 0 flags 5
> Ã 210.053271Â ES cache assertation failed for inode: 384 es_cached ex Ã15/1/334
> 90/20Â != found ex Ã15/1/33579/1000Â retval 1 flags 1
It does not looks like bigendian issue, actually I cant find any logical
system in the log. The only thing I see is that es_cache is
horribly out of sync with extent_tree.
Please try this patch:
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index fe3337a..95d27cd 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -689,6 +689,7 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
trace_ext4_es_lookup_extent_enter(inode, lblk);
es_debug("lookup extent in block %u\n", lblk);

+ return 0;
tree = &EXT4_I(inode)->i_es_tree;
read_lock(&EXT4_I(inode)->i_es_lock);


It just disable es_cache lookup feature should. This should helps to
determine whenever this is a es_cache issue or not.
> Ã 210.053275Â mpage_da_submit_io failed block=33490 != b_blocknr=33579
> Ã 210.053277Â ino:384 lbkl:15, b_state=0x00001023, b_size=4096
> Ã 210.053320Â ------------Ã cut here Â------------
> Ã 210.053323Â kernel BUG at fs/ext4/inode.c:1639!
> Ã 210.053402Â illegal operation: 0001 Ã#1Â SMP
> Ã 210.053405Â Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast
> ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ipt
> able_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defra
> g_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tab
> les iptable_filter ip_tables sg qeth_l2 vmur xfs libcrc32c dasd_fba_mod dasd_eck
> d_mod lcs dasd_mod qeth ctcm qdio ccwgroup fsm dm_mirror dm_region_hash dm_log d
> m_mod
> Ã 210.053434Â CPU: 0 Not tainted 3.8.0-rc3+ #16
> Ã 210.053436Â Process fsx (pid: 20565, task: 000000002c358000, ksp: 000000002c0
> 8f480)
> Ã 210.053439Â Krnl PSW : 0704f00180000000 00000000003033e8 (mpage_da_submit_io
> 0x3d4/0x408)
> Ã 210.053450Â R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 EA:
> 3
> Krnl GPRS: 0000000000000015 0000000000000001 0000000000000030 00000000031b4508
> Ã 210.053455Â 00000000003033e4 0000000000000000 0000000000001000 000
> 7ffff00001000
> Ã 210.053457Â 000000002c08fa98 000003d100a8c6c0 000000002c08fb68 000
> 000000000000f
> Ã 210.053460Â 00000000000082d2 000000002204d068 00000000003033e4 000
> 000002c08f970
> Ã 210.053473Â Krnl Code: 00000000003033d8: c02000215447 larl %r2,72dc
> 66
> 00000000003033de: c0e50016788f brasl %r14,5d24fc
> #00000000003033e4: a7f40001 brc 15,3033e6
> >00000000003033e8: a7f40001 brc 15,3033ea
> 00000000003033ec: a7f40001 brc 15,3033ee
> 00000000003033f0: 4120f0e8 la %r2,232(%r15)
> 00000000003033f4: a7180000 lhi %r1,0
> 00000000003033f8: 5010f0d8 st %r1,216(%r15)
> Ã 210.053497Â Call Trace:
> Ã 210.053498Â (Ã<00000000003033e4>Â mpage_da_submit_io+0x3d0/0x408)
> Ã 210.053501Â Ã<0000000000309a48>Â mpage_da_map_and_submit+0x150/0x41c
> Ã 210.053505Â Ã<000000000030a212>Â write_cache_pages_da+0x4fe/0x530
> Ã 210.053509Â Ã<000000000030a584>Â ext4_da_writepages+0x340/0x628
> Ã 210.053512Â Ã<00000000002024d2>Â __filemap_fdatawrite_range+0x6e/0x7c
> Ã 210.053518Â Ã<00000000002025fc>Â filemap_write_and_wait_range+0x54/0x8c
> Ã 210.053521Â Ã<00000000002fe0f8>Â ext4_sync_file+0x7c/0x3d8
> Ã 210.053524Â Ã<000000000023c932>Â SyS_msync+0x14e/0x1d8
> Ã 210.053528Â Ã<00000000005de66e>Â sysc_tracego+0x14/0x1a
> Ã 210.053533Â Ã<000003fffd0e1240>Â 0x3fffd0e1240
> Ã 210.053536Â Last Breaking-Event-Address:
> Ã 210.053537Â Ã<00000000003033e4>Â mpage_da_submit_io+0x3d0/0x408
> Ã 210.053540Â
> Ã 210.053542Â ---Ã end trace f387176e9fcb98d0 Â---
> Ã 210.053546Â ------------Ã cut here Â------------
> Ã 210.053548Â WARNING: at kernel/exit.c:713
> Ã 210.053550Â Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast
> ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ipt
> able_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defra
> g_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tab
> les iptable_filter ip_tables sg qeth_l2 vmur xfs libcrc32c dasd_fba_mod dasd_eck
> d_mod lcs dasd_mod qeth ctcm qdio ccwgroup fsm dm_mirror dm_region_hash dm_log d
> m_mod
> Ã 210.053571Â CPU: 0 Tainted: G D 3.8.0-rc3+ #16