Re: xfstests failures with xfs, dax and v4.4-rc3

From: Dave Chinner
Date: Wed Dec 02 2015 - 15:45:23 EST


On Thu, Dec 03, 2015 at 07:29:10AM +1100, Dave Chinner wrote:
> On Wed, Dec 02, 2015 at 11:34:38AM -0700, Ross Zwisler wrote:
> > I'm hitting a few more test failures in my testing setup with v4.4-rc3, xfs
> > and DAX. My test setup is a pair of 4GiB PMEM partitions in a KVM virtual
> > machine. Here are the failures:
>
> Which are caused by commit 1ca1915 ("xfs: Don't use unwritten extents
> for DAX") because of this code for unwritten extent conversion in
> get_blocks:
>
> tp->t_flags |= XFS_TRANS_RESERVE;
>
> It's a minor problem compared to all the other issues DAX has right
> now, so I ignored it to get the bigger problem solved first.

Patch to fix the problem below.

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx

xfs: Don't use reserved blocks for data blocks with DAX

From: Dave Chinner <dchinner@xxxxxxxxxx>

Commit 1ca1915 ("xfs: Don't use unwritten extents for DAX") enabled
the DAX allocation call to dip into the reserve pool in case it was
converting unwritten extents rather than allocating blocks. This was
a direct copy of the unwritten extent conversion code, but had an
unintended side effect of allowing normal data block allocation to
use the reserve pool. Normal block allocation could therefore deplete
the reserve pool and prevent unwritten extent conversion at ENOSPC,
violating fallocate guarantees on preallocated space.

Fix it by checking whether the incoming map from __xfs_get_blocks()
spans an unwritten extent, and only using the reserve pool if the
allocation covers an unwritten extent.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 fs/xfs/xfs_iomap.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index f4f5b43..9ed146b 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -203,15 +203,20 @@ xfs_iomap_write_direct(
 	 * this outside the transaction context, but if we commit and then crash
 	 * we may not have zeroed the blocks and this will be exposed on
 	 * recovery of the allocation. Hence we must zero before commit.
+	 *
 	 * Further, if we are mapping unwritten extents here, we need to zero
 	 * and convert them to written so that we don't need an unwritten extent
 	 * callback for DAX. This also means that we need to be able to dip into
-	 * the reserve block pool if there is no space left but we need to do
-	 * unwritten extent conversion.
+	 * the reserve block pool for bmbt block allocation if there is no space
+	 * left but we need to do unwritten extent conversion.
 	 */
+
 	if (IS_DAX(VFS_I(ip))) {
 		bmapi_flags = XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO;
-		tp->t_flags |= XFS_TRANS_RESERVE;
+		if (ISUNWRITTEN(imap)) {
+			tp->t_flags |= XFS_TRANS_RESERVE;
+			resblks = XFS_DIOSTRAT_SPACE_RES(mp, 0) << 1;
+		}
 	}
 	error = xfs_trans_reserve(tp, &M_RES(mp)->tr_write,
 				  resblks, resrtextents);
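
For readers who want to see the failure mode from the commit message in
isolation, here is a rough userspace sketch. It is a toy model, not XFS
code: struct toy_fs, toy_reserve(), toy_dax_alloc_old() and
toy_dax_alloc_new() are all hypothetical names, and the single counter
pair only loosely stands in for the real free space and reserve pool
accounting. It assumes the only difference between the old and new
behaviour is who is allowed to fall back on the reserve.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of free space plus a reserve pool (not XFS code). */
struct toy_fs {
	uint64_t free_blocks;		/* usable by any allocation */
	uint64_t reserve_blocks;	/* held back for work that must not fail */
};

/*
 * Reserve nblks. Only callers passing may_use_reserve == true may fall
 * back on the reserve pool once free space is exhausted. Returns false
 * to model ENOSPC.
 */
static bool toy_reserve(struct toy_fs *fs, uint64_t nblks, bool may_use_reserve)
{
	assert(fs);

	if (fs->free_blocks >= nblks) {
		fs->free_blocks -= nblks;
		return true;
	}
	if (may_use_reserve && fs->free_blocks + fs->reserve_blocks >= nblks) {
		/* Use up what is left of free space, then dip into the reserve. */
		nblks -= fs->free_blocks;
		fs->free_blocks = 0;
		fs->reserve_blocks -= nblks;
		return true;
	}
	return false;
}

/* Old behaviour (assumed): every DAX allocation may dip into the reserve. */
static bool toy_dax_alloc_old(struct toy_fs *fs, uint64_t nblks, bool unwritten)
{
	(void)unwritten;
	return toy_reserve(fs, nblks, true);
}

/* New behaviour (assumed): only unwritten extent conversion may do so. */
static bool toy_dax_alloc_new(struct toy_fs *fs, uint64_t nblks, bool unwritten)
{
	return toy_reserve(fs, nblks, unwritten);
}
```

Starting from a full filesystem (no free blocks, four reserved), a
four-block data allocation through toy_dax_alloc_old() succeeds and
drains the reserve, so a later one-block conversion fails. With
toy_dax_alloc_new() the data allocation gets the ENOSPC stand-in and the
conversion still succeeds.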