Re: [PATCH] f2fs: fix to force keeping write barrier for strict fsync mode

From: Chao Yu
Date: Tue Feb 04 2020 - 20:39:54 EST


On 2020/1/24 6:18, Jaegeuk Kim wrote:
> On 01/20, Chao Yu wrote:
>> If barrier is enabled, for strict fsync mode, we should force to
>> use atomic write semantics to avoid data corruption due to no
>> barrier support in lower device.
>>
>> Signed-off-by: Chao Yu <yuchao0@xxxxxxxxxx>
>> ---
>> fs/f2fs/file.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>> index 86ddbb55d2b1..c9dd45f82fbd 100644
>> --- a/fs/f2fs/file.c
>> +++ b/fs/f2fs/file.c
>> @@ -241,6 +241,13 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
>> };
>> unsigned int seq_id = 0;
>>
>> + /*
>> + * for strict fsync mode, force to keep atomic write semantics to avoid
>> + * data corruption if lower device doesn't support write barrier.
>> + */
>> + if (!atomic && F2FS_OPTION(sbi).fsync_mode == FSYNC_MODE_STRICT)
>> + atomic = true;
>
> This allows to relax IO ordering and cache flush. I'm not sure that's what you
> want to do here for strict mode.

I intend to fix the potential data corruption described in the report below:

https://www.mail-archive.com/linux-f2fs-devel@xxxxxxxxxxxxxxxxxxxxx/msg15126.html

It occurs in this scenario:

- write page #0; persist
- overwrite page #0
- fsync
  - write data page #0 out-of-place (OPU) into the device's cache
  - write the inode page into the device's cache
  - issue flush

If SPO (sudden power-off) is triggered during the flush command, the inode page can be
persisted before data page #0. After recovery, the inode page is then recovered with the
new physical block address of data page #0, but that new physical block may contain
dummy data.
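
To make the reordering concrete, here is a toy user-space model of the scenario above.
It is only a sketch, not f2fs code: struct device, flush_with_spo() and the other names
are made up for illustration, and the "device" is assumed to persist exactly one of the
two cached writes before power is lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { OLD_BLK = 0, NEW_BLK = 1, NR_BLKS = 2 };

/* Hypothetical device model: persistent media plus a volatile write cache. */
struct device {
	char media[NR_BLKS][8];	/* persistent data blocks */
	int  inode_ptr;		/* persistent inode: block that holds page #0 */
	char cache_data[8];	/* volatile: new data destined for NEW_BLK */
	int  cache_inode_ptr;	/* volatile: pending inode update */
};

/* State right after "overwrite page #0" has been written OPU into the cache. */
static void setup(struct device *d)
{
	memset(d, 0, sizeof(*d));
	strcpy(d->media[OLD_BLK], "old");	/* page #0, already persisted */
	strcpy(d->media[NEW_BLK], "garbage");	/* stale content of the new block */
	d->inode_ptr = OLD_BLK;
	strcpy(d->cache_data, "new");
	d->cache_inode_ptr = NEW_BLK;
}

/*
 * Flush that is cut short by SPO after exactly one cached write reached the
 * media. inode_first selects which write the device happened to persist.
 */
static void flush_with_spo(struct device *d, bool inode_first)
{
	if (inode_first)
		d->inode_ptr = d->cache_inode_ptr;
	else
		memcpy(d->media[NEW_BLK], d->cache_data, sizeof(d->cache_data));
	/* power is lost here: the other cached write never reaches the media */
}

/* What a reader sees in page #0 after recovery. */
static const char *read_page0(const struct device *d)
{
	assert(d->inode_ptr >= 0 && d->inode_ptr < NR_BLKS);
	return d->media[d->inode_ptr];
}

/* Run the whole scenario once and return the recovered content of page #0. */
static const char *recover_after_spo(bool inode_first)
{
	static struct device d;

	setup(&d);
	flush_with_spo(&d, inode_first);
	return read_page0(&d);
}
```

In this model, recover_after_spo(false) still returns the old data, because the inode
keeps pointing at the old block. recover_after_spo(true) returns the stale content of
the new block, which is the corruption described above.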

So what the user sees is that, after overwrite & fsync + SPO, the old data in the file is
corrupted. If any user does care about such a case, we can enhance strict mode to avoid
the corruption and suggest that such users use fsync's strict mode.

Thoughts?

Thanks,

>
>> +
>> if (unlikely(f2fs_readonly(inode->i_sb) ||
>> is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>> return 0;
>> --
>> 2.18.0.rc1
> .
>