Re: spurious -ENOSPC on XFS

From: Mikulas Patocka
Date: Fri Jan 23 2009 - 15:14:46 EST

On Thu, 22 Jan 2009, Christoph Hellwig wrote:

> On Thu, Jan 22, 2009 at 03:59:13PM -0500, Christoph Hellwig wrote:
> > On Wed, Jan 21, 2009 at 10:24:22AM +1100, Dave Chinner wrote:
> > > Right, so you need to use internal xfs sync functions that don't
> > > have these problems. That is:
> > >
> > > error = xfs_sync_inodes(ip->i_mount, SYNC_DELWRI|SYNC_WAIT);
> > >
> > > will do a blocking flush of all the inodes without deadlocks occurring.
> > > Then you can remove the 500ms wait.
> >
> > I've given this a try with Eric's testcase from #724 in the oss bugzilla,
> > but it's not enough yet. I thinks that's because SYNC_WAIT is rather
> > meaningless for data writeout, and we need SYNC_IOWAIT instead. The
> > patch below gets the testcase working for me:
> Actually I still see failures happing sometimes. I guess tha'ts because
> our flush is still asynchronous due to the schedule_work..

If I placed
xfs_sync_inodes(ip->i_mount, SYNC_DELWRI);
xfs_sync_inodes(ip->i_mount, SYNC_DELWRI | SYNC_IOWAIT);
directly to xfs_flush_device, I got lock dependency warning (though not a
real deadlock).

So synchronous flushing definitely needs some thinking and lock
rearchitecting. If you sync it asynchronously, a deadlock possibility will
actually turn into spurious -ENOSPC (because the flushing thread will wait
untill the main thread sleeps for 500ms, drops the lock and returns


[ INFO: possible recursive locking detected ]
2.6.29-rc1-devel #2
mc/336 is trying to acquire lock:
(&(&ip->i_iolock)->mr_lock){----}, at: [<0000000010186b88>]
xfs_ilock+0x68/0xa0 [xfs]

but task is already holding lock:
(&(&ip->i_iolock)->mr_lock){----}, at: [<0000000010186bb0>]
xfs_ilock+0x90/0xa0 [xfs]

other info that might help us debug this:
2 locks held by mc/336:
#0: (&sb->s_type->i_mutex_key#8){--..}, at: [<00000000101b1560>]
xfs_write+0x340/0x6a0 [xfs]
#1: (&(&ip->i_iolock)->mr_lock){----}, at: [<0000000010186bb0>]
xfs_ilock+0x90/0xa0 [xfs]

stack backtrace:
Call Trace:
[000000000047e924] validate_chain+0x1184/0x1460
[000000000047ee88] __lock_acquire+0x288/0xae0
[000000000047f734] lock_acquire+0x54/0x80
[0000000000470b20] down_read_nested+0x40/0x60
[0000000010186b88] xfs_ilock+0x68/0xa0 [xfs]
[00000000101b4f78] xfs_sync_inodes_ag+0x218/0x320 [xfs]
[00000000101b5100] xfs_sync_inodes+0x80/0x100 [xfs]
[00000000101b51a0] xfs_flush_device+0x20/0x60 [xfs]
[000000001018e654] xfs_flush_space+0x94/0x100 [xfs]
[000000001018e95c] xfs_iomap_write_delay+0x13c/0x240 [xfs]
[000000001018f100] xfs_iomap+0x280/0x300 [xfs]
[00000000101a84d4] __xfs_get_blocks+0x74/0x260 [xfs]
[00000000101a8724] xfs_get_blocks+0x24/0x40 [xfs]
[00000000004e6800] __block_prepare_write+0x220/0x4e0
[00000000004e6b68] block_write_begin+0x48/0x100
[00000000101a785c] xfs_vm_write_begin+0x3c/0x60 [xfs]

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at