linux-next regression: IO errors with ext4 and xen-blkfront

From: Jeremy Fitzhardinge
Date: Wed Oct 20 2010 - 20:04:56 EST


Hi,

When doing some regression testing with Xen on linux-next, I'm finding
that my domains are failing to get through the boot sequence due to IO
errors:

Remounting root filesystem in read-write mode: EXT4-fs (dm-0): re-mounted. Opts: (null)
[ OK ]
Mounting local filesystems: EXT3-fs: barriers not enabled
kjournald starting. Commit interval 5 seconds
EXT3-fs (xvda1): using internal journal
EXT3-fs (xvda1): mounted filesystem with writeback data mode
SELinux: initialized (dev xvda1, type ext3), uses xattr
SELinux: initialized (dev xenfs, type xenfs), uses genfs_contexts
[ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling /etc/fstab swaps: Adding 917500k swap on /dev/mapper/vg_f1364-lv_swap. Priority:-1 extents:1 across:917500k
[ OK ]
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
Entering non-interactive startup
Starting monitoring for VG vg_f1364: 2 logical volume(s) in volume group "vg_f1364" monitored
[ OK ]
ip6tables: Applying firewall rules: [ OK ]
iptables: Applying firewall rules: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0:
Determining IP information for eth0... done.
[ OK ]
Starting auditd: [ OK ]
end_request: I/O error, dev xvda, sector 0
end_request: I/O error, dev xvda, sector 0
end_request: I/O error, dev xvda, sector 9675936
Aborting journal on device dm-0-8.
Starting portreserve: EXT4-fs error (device dm-0): ext4_journal_start_sb:259: Detected aborted journal
EXT4-fs (dm-0): Remounting filesystem read-only
[ OK ]
Starting system logger: EXT4-fs (dm-0): error count: 4
EXT4-fs (dm-0): initial error at 1286479997: ext4_journal_start_sb:251
EXT4-fs (dm-0): last error at 1287618175: ext4_journal_start_sb:259
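
As a reading aid for the log above, here is a minimal sketch that pulls out the failing sectors and decodes the two ext4 error timestamps. It assumes the numbers after "initial error at" and "last error at" are seconds since the Unix epoch; the LOG excerpt, the regular expressions, and the summarize() helper are illustrative only and not part of any kernel tooling.

```python
import re
from datetime import datetime, timezone

# Excerpt copied from the console output above.
LOG = """\
end_request: I/O error, dev xvda, sector 0
end_request: I/O error, dev xvda, sector 0
end_request: I/O error, dev xvda, sector 9675936
EXT4-fs (dm-0): initial error at 1286479997: ext4_journal_start_sb:251
EXT4-fs (dm-0): last error at 1287618175: ext4_journal_start_sb:259
"""

# Hypothetical patterns written for this excerpt only.
IO_ERR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
EXT4_ERR = re.compile(
    r"EXT4-fs \(([\w-]+)\): (initial|last) error at (\d+): (\S+)"
)


def summarize(log):
    """Return (device, sector) pairs and decoded ext4 error records."""
    sectors = [(m.group(1), int(m.group(2))) for m in IO_ERR.finditer(log)]
    stamps = {}
    for m in EXT4_ERR.finditer(log):
        # Assumption: the value is seconds since the Unix epoch (UTC).
        when = datetime.fromtimestamp(int(m.group(3)), tz=timezone.utc)
        stamps[m.group(2)] = (when.isoformat(), m.group(4))
    return sectors, stamps


sectors, stamps = summarize(LOG)
for dev, sector in sectors:
    print(f"I/O error on {dev} at sector {sector}")
for kind, (when, where) in stamps.items():
    print(f"{kind} error: {when} in {where}")
```

If that epoch assumption holds, the "initial error" record dates from roughly two weeks before the "last error" one, so it may be left over from an earlier boot rather than from this run.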


I haven't tried to bisect this yet (which will be awkward because
linux-next has also introduced various Xen boot-crashing bugs), but I
wonder if you have any thoughts about what may be happening here. I
guess an obvious candidate is the barrier changes in the storage
subsystem, but I still get the same errors if I mount root with barrier=0.

Current linux-2.6 mainline is fine, so the problem is in some of the
patches targeted at the next merge window.

Thanks,
J