Re: [PATCH v2 1/5] fat: allocate persistent inode numbers

From: J. Bruce Fields
Date: Wed Sep 12 2012 - 10:32:36 EST

Next message: Stephane Eranian: "Re: [PATCH v2 1/3] hrtimer: add hrtimer_init_cpu()"
Previous message: Zhi Yong Wu: "Re: [RFC 00/11] VFS: hot data tracking"
In reply to: Namjae Jeon: "Re: [PATCH v2 1/5] fat: allocate persistent inode numbers"
Next in thread: OGAWA Hirofumi: "Re: [PATCH v2 1/5] fat: allocate persistent inode numbers"

On Wed, Sep 12, 2012 at 11:12:56PM +0900, Namjae Jeon wrote:
> 2012/9/12 OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>:
> > Namjae Jeon <linkinjeon@xxxxxxxxx> writes:
> >
> >>>> I think that it is unfixable because we cannot know the i_pos of an
> >>>> inode changed by rename.
> >>>> And even if we knew it, there is no inode rebuild routine in -mm.
> >>>> And it cannot even be fixed by our patches.
> >>>
> >>>>> And have you tried the https://lkml.org/lkml/2012/6/29/381 patches? They
> >>>>> sound like they improve performance by enabling lookupcache.
> >>>> We checked these patches when facing the ESTALE issue in -mm.
> >>>> But they are no use; these patches just retry the system call one more
> >>>> time on an ESTALE error.
> >>>
> >>> What happens if client retried from lookup() after -ESTALE? (client NFS
> >>> doesn't have the name of entry anymore?)
> >> We would need an inode rebuild routine, because the inode has already
> >> been evicted from the cache on the server.
> >>>
> >>> I'm assuming the retry means - it restarts from building the NFS file
> >>> handle. I might be just wrong here though.
> >> As I remember, it just retries in the VFS on the NFS client. I heard this
> >> patch is needed for a very specific set of circumstances where an entry
> >> goes stale once between the lookup and the actual operation(s).
> >> It is not related to the current issue (inode cache eviction on the server).
> >
> > Suppose the server and client have just cold booted, and the client's
> > first operation is a rename, with no cache on either client or server.
> >
> > Even in this state, does the server return ESTALE? If it doesn't return
> > ESTALE, I can't understand why it is really unfixable.
> Hi OGAWA.
> The server will not return ESTALE in this case, because the client does
> not have any information about the files yet.

It does if the client mounted before the server rebooted. NFS is
designed so that servers can reboot without causing clients to fail.
(Applications will just see a delay during the reboot.)

It probably isn't possible to make this work in the case of fat.

But from fat's point of view there probably isn't much difference
between a filehandle lookup after a reboot and a filehandle lookup after
the inode's gone from cache.

I really don't see what you can do to help here. Won't anything that
allows looking up an uncached inode by filehandle also risk finding the
wrong file?

(If looking up the same filehandle ever results in finding a *different*
file from before, that's a bug. Probably a more dangerous bug than an
ESTALE--in the ESTALE case the failure is obvious whereas in the case
where you get the wrong file, you may silently corrupt data.)
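
To make the distinction concrete, here is a minimal userspace sketch, not
kernel code. It assumes a filehandle carries a slot number plus a generation
counter that is bumped whenever the slot is reused; every name in it (struct
slot, struct fhandle, fh_create, fh_remove, fh_lookup_checked,
fh_lookup_unchecked) is hypothetical and only there for illustration. The
checked lookup fails with -ESTALE once the slot has been reused, while the
unchecked one quietly hands back whatever file now lives there.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define NSLOTS 4

struct slot {                /* one location that can hold a file */
    int      in_use;
    uint32_t generation;     /* bumped each time the slot is (re)used */
    int      file_id;        /* non-negative; stands in for the file's identity */
};

struct fhandle {             /* what the server hands to the client */
    uint32_t slot;
    uint32_t generation;
};

static struct slot table[NSLOTS];

/* Put a file into a slot and return a handle that names it. */
static struct fhandle fh_create(uint32_t slot, int file_id)
{
    assert(slot < NSLOTS);
    table[slot].in_use = 1;
    table[slot].generation++;
    table[slot].file_id = file_id;
    return (struct fhandle){ .slot = slot,
                             .generation = table[slot].generation };
}

/* Drop the file; the slot is now free to be reused by another file. */
static void fh_remove(uint32_t slot)
{
    assert(slot < NSLOTS);
    table[slot].in_use = 0;
}

/* Checked lookup: any mismatch is reported as -ESTALE. */
static int fh_lookup_checked(struct fhandle fh)
{
    if (fh.slot >= NSLOTS || !table[fh.slot].in_use ||
        table[fh.slot].generation != fh.generation)
        return -ESTALE;
    return table[fh.slot].file_id;
}

/* Unchecked lookup: trusts the slot number alone, so a reused slot
 * silently resolves to a different file. */
static int fh_lookup_unchecked(struct fhandle fh)
{
    if (fh.slot >= NSLOTS || !table[fh.slot].in_use)
        return -ESTALE;
    return table[fh.slot].file_id;
}
```

In this model, creating file 100 in slot 1, removing it, and then creating
file 200 in the same slot leaves the old handle stale: the checked lookup
returns -ESTALE for it, while the unchecked lookup returns 200, which is the
wrong file.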

--b.

> I mean the NFS client does not have any old NFS FH (containing the old
> inode number) for this.
>
> >
> > If it returns ESTALE, why does it return? I'm assuming the previous code
> > path is the cached FH path.
> The main point to observe is the file handle, which is used for
> all NFS operations.
> So for any NFS operation (read, write, ...) that uses the NFS file
> handle, if the inode number changes in between, the result is ESTALE.
> The inode number changes after a rename at the NFS server when the inode
> is evicted from the cache under memory pressure.
>
> lookupcache is used at the NFS client to reduce the number of LOOKUP
> operations. But we can still get ESTALE if the inode number at the NFS
> server changes after LOOKUP, even when lookupcache is disabled.
>
> LOOKUP returns NFS FH -> [inode number changes at NFS server] ->
> we still use the old NFS FH returned from LOOKUP for any file
> operation (write, read, etc.)
> -> ESTALE is returned.
>
> Thanks!
> > --
> > OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>
