Re: [RFC 11/32] xfs: convert to struct inode_time

From: Dave Chinner
Date: Sat May 31 2014 - 01:55:18 EST



[ Please don't top post. ]

On Fri, May 30, 2014 at 06:22:55PM -0700, H. Peter Anvin wrote:
> On May 30, 2014 6:14:50 PM PDT, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >On Fri, May 30, 2014 at 05:41:14PM -0700, H. Peter Anvin wrote:
> >> On 05/30/2014 05:37 PM, Dave Chinner wrote:
> >> >
> >> > IOWs, the filesystem has to be able to reject any attempt to
> >> > set a timestamp that it can't represent on disk otherwise Bad
> >> > Stuff will happen,
> >>
> >> Actually it is questionable if it is worse to reject a
> >> timestamp or just let it wrap. Rejecting a valid timestamp
> >> is a bit like "You don't exist, go away."
> >
> >I think having the new system calls being able to return EINVAL
> >if the value cannot be stored permanently on disk correctly is
> >the right thing to do. Having it silently mangled by the
> >filesystem and returning "everything is just fine, trust me" is
> >close to the worst solution I can think of. That's exactly what
> >leads to overflow bugs occurring....
> >
> >> > and filesystems have to be able to specify in their on disk
> >> > format what timestamp encoding is being used. The solution
> >> > will be different for every filesystem that needs to support
> >> > time beyond 2038.
> >>
> >> Actually the cutoff can be really different for each
> >> filesystem, not necessarily 2038. However, I maintain the
> >> above still holds.
> >
> >Sure, but all filesystems are supposed to handle at least the
> >current unix epoch.
> >
> >> Consider a filesystem that kept timestamps in YYMMDDHHMMSS
> >> format. What would you have expected such a filesystem to do
> >> on Jan 1, 2000?
> >
> >Strawman.
> >
> >We don't need to cater for fundamentally broken designs that
> >can't even handle the current unix epoch correctly. If such
> >filesystems exist, then they can simply say "original unix epoch
> >support only" and do whatever crap they are doing right now.
>
> No, not a strawman. Replace with Jan 19, 2038 and you have the
> same situation.

But that's not the problem I'm talking about. The problem isn't the
roll-over date of the epoch - the problem is that we're changing the
in-memory meaning of time without changing what the filesystems
store on disk or how they translate them.

To use your example, what I'm actually talking about is the kernel
switching to CCYYMMDDHHMMSS while the filesystem has YYMMDDHHMMSS on
disk. The filesystem doesn't know the timestamp is now a different
format, so it could mangle it writing it to disk, or it could mangle
existing timestamps in the YY.. format reading them from disk and
putting them into CC.. format structures. IOWs, it will
incorrectly translate YY format dates to CC format, or translate
something in the CC format as though it was in YY format. And it
wouldn't even know what was the correct format because there's
nothing telling it on disk whether the date is in CC or YY format.

Either way, you get mangled timestamps, the filesystem doesn't know
about it because it's just storing what the kernel gives it, the
kernel thinks they are fine because they are just opaque when read
back, but the user says "what the fuck did a reboot do to all these
timestamps?".
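
To make the "CC value read as though it was YY" case concrete, here's a
rough userspace sketch. Everything in it is hypothetical and exists only
for illustration: the packed-decimal layouts come from the example above,
and struct broken_down, encode_cc() and decode_yy() are made-up names,
not anything in the kernel or in any filesystem.

```c
#include <stdint.h>

/* Hypothetical broken-down time, for illustration only. */
struct broken_down {
	int year, month, day, hour, minute, second;
};

/* Pack into the wider in-memory form: CCYYMMDDHHMMSS (14 digits). */
static uint64_t encode_cc(const struct broken_down *t)
{
	uint64_t v = (uint64_t)t->year;

	v = v * 100 + t->month;
	v = v * 100 + t->day;
	v = v * 100 + t->hour;
	v = v * 100 + t->minute;
	v = v * 100 + t->second;
	return v;
}

/*
 * Unpack a value while assuming the old YYMMDDHHMMSS layout.
 * Only two year digits are looked at and the century is implied
 * (assumed here to be 19xx), so any century in the input is lost.
 */
static struct broken_down decode_yy(uint64_t v)
{
	struct broken_down t;

	t.second = (int)(v % 100); v /= 100;
	t.minute = (int)(v % 100); v /= 100;
	t.hour   = (int)(v % 100); v /= 100;
	t.day    = (int)(v % 100); v /= 100;
	t.month  = (int)(v % 100); v /= 100;
	t.year   = 1900 + (int)(v % 100);
	return t;
}
```

Feeding encode_cc() a date in 2040 and reading the result back through
decode_yy() gives a date in 1940, and neither function has any way of
noticing, because nothing in the stored value says which layout it uses.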

Hence your example of roll-over dates is a strawman - you've
constructed a problem that is irrelevant to the issue being pointed
out.

FWIW, we already have code in the superblock and VFS to avoid such
problems on filesystems with limited timestamp resolution (i.e.
s_time_gran and current_fs_time()) so that what the VFS hands the
filesystem is exactly what the VFS expects to get back from disk
when comparing timestamps.
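
As a rough illustration of the granularity idea, the sketch below rounds
a timestamp down to what a filesystem says it can store before the
filesystem ever sees it. It is a userspace approximation and not the
kernel code: struct sketch_ts and sketch_time_trunc() are hypothetical
names, and the gran argument merely plays the role of a per-superblock
granularity in nanoseconds.

```c
#define SKETCH_NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for an in-memory timestamp. */
struct sketch_ts {
	long long sec;
	long nsec;
};

/*
 * Round a timestamp down to the granularity (in nanoseconds) that
 * the filesystem can represent, so the value handed to the
 * filesystem is the same value that will come back from disk.
 */
static struct sketch_ts sketch_time_trunc(struct sketch_ts t, long gran)
{
	if (gran <= 1)
		return t;			/* full nanosecond resolution */
	if (gran >= SKETCH_NSEC_PER_SEC)
		t.nsec = 0;			/* whole seconds only */
	else
		t.nsec -= t.nsec % gran;	/* drop the unrepresentable part */
	return t;
}
```

The point of doing this above the filesystem is that an in-memory
timestamp and the same timestamp after a round trip through disk still
compare equal.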

If we are changing the in-kernel timestamp to have a greater dynamic
range than anything we currently support on disk, then we need support
for all filesystems for similar translation and constraint. The
filesystems need to be able to tell the kernel what timestamp
range they support, and then the kernel needs to follow those
guidelines. And if the filesystem is mounted on a kernel that
doesn't support the current filesystem's timestamp format, then at
minimum that filesystem cannot do anything that writes a
timestamp....

Put simply: the filesystem defines the timestamp range that can be
used safely, not the userspace API. If the filesystem can't support
the date it is handed then that is an out-of-range error. Since
when have we accepted that it's OK to handle out-of-range data with
silent overflows or corruption of the data that we are attempting to
store? We're defining a new API to support a wider date range -
there is nothing that prevents us from saying ERANGE can be returned
for a timestamp that the filesystem cannot store correctly....
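
A minimal sketch of what that could look like, assuming each filesystem
advertises the range of seconds it can store. The structure and function
names (sketch_fs_time_limits, sketch_check_timestamp) are hypothetical
and only illustrate the shape of the check; they are not an existing
kernel interface.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-filesystem limits, in seconds since the epoch. */
struct sketch_fs_time_limits {
	int64_t min_sec;
	int64_t max_sec;
};

/*
 * Return 0 if the filesystem can store the timestamp, or -ERANGE
 * if it falls outside the advertised range. The value is never
 * wrapped or clamped silently.
 */
static int sketch_check_timestamp(const struct sketch_fs_time_limits *lim,
				  int64_t sec)
{
	if (sec < lim->min_sec || sec > lim->max_sec)
		return -ERANGE;
	return 0;
}
```

A filesystem with signed 32-bit seconds on disk would advertise
INT32_MIN to INT32_MAX here, so the first second past the 2038 rollover
gets rejected instead of being stored as a date in 1901.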

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx