Re: [RFC] Ext3 nanosecond timestamps in big inodes

From: Andreas Dilger
Date: Sun Jan 16 2005 - 16:04:48 EST


On Jan 14, 2005 22:16 +0100, Andreas Gruenbacher wrote:
> this is a spin-off of an old patch by Alex Tomas <alex@xxxxxxxxxxxxx>:
> Alex originally had nanosecond timestamps in his original patch; here is
> a rejuvenated version. Please tell me what you think. Alex also added a
> create timestamp in his original patch. Do we actually need that?

I think the create timestamp can't hurt things and might be useful in some
cases (e.g. CIFS could use this for exporting, and Lustre will also). It
would also be useful for some disk "forensics" (term used loosely) unlike
the ctime. I haven't ever been very keen on nsecond atimes, but it can't
kill us (atime is already flushed at most 1/sec I think).

> Nanoseconds consume 30 bits in the 32-bit fields. The remaining two bits
> currently are zeroed out implicitly. We could later use them remaining two
> bits for years beyond 2038.

My thought has always been to use, say, 24 bits for "nseconds>>6" and 8 bits
for "seconds<<32" or something close to this (a few bits either way) to give
us more range on the high end. In the past we talked of usecond timestamps
(which would have only been 20 bits, leaving 12 bits for seconds msb),
but the kernel ended up using nseconds for the internal timestamps.

> @@ -2501,6 +2501,14 @@ void ext3_read_inode(struct inode * inod
> + if (ei->i_extra_isize >= 2*sizeof(__le16) + 3*sizeof(__le32)) {
> + inode->i_atime.tv_nsec =
> + le32_to_cpu(raw_inode->i_atime_nsec);
> + inode->i_ctime.tv_nsec =
> + le32_to_cpu(raw_inode->i_ctime_nsec);
> + inode->i_mtime.tv_nsec =
> + le32_to_cpu(raw_inode->i_mtime_nsec);
> + }

You may as well put a macro here to convert from the raw timestamp
(probably shouldn't just be called "nsec" if we are also encoding more
info there, maybe msb_and_ns?) so we can mask (and shift) as necessary.

> @@ -2638,8 +2646,17 @@ static int ext3_do_update_inode(handle_t
> - if (EXT3_INODE_SIZE(inode->i_sb) > EXT3_GOOD_OLD_INODE_SIZE)
> + if (ei->i_extra_isize) {

I don't think this is right. This means we would only ever store nsec
timestamps if there was already other data in the large inode. Since
writing that part of the inode is free (we will be writing a full sector
anyways), no reason not to save it.

> raw_inode->i_extra_isize = cpu_to_le16(ei->i_extra_isize);
> + if (ei->i_extra_isize > 2*sizeof(__le16) + 3*sizeof(__le32)) {
> + raw_inode->i_atime_nsec =
> + cpu_to_le32(inode->i_atime.tv_nsec);
> + raw_inode->i_ctime_nsec =
> + cpu_to_le32(inode->i_ctime.tv_nsec);
> + raw_inode->i_mtime_nsec =
> + cpu_to_le32(inode->i_mtime.tv_nsec);
> + }
> + }

Macro here also to convert from nsec to the ext3-internal "msb_and_ns" format.

> +static inline struct timespec ext3_current_time(struct inode *inode)
> +{
> + return (inode->i_sb->s_time_gran == 1) ?
> + CURRENT_TIME : CURRENT_TIME_SEC;
> +}

If "s_time_gran" (I haven't seen this before but it doesn't appear to
be a part of your patch) had some useful meaning we could use it to e.g.
shift the nsec part of the timestamps as Andy requested so as not to make
the timestamps change too often.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/

Attachment: pgp00000.pgp
Description: PGP signature