Re: [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

From: Al Viro
Date: Fri Oct 09 2015 - 21:37:48 EST


On Fri, Oct 09, 2015 at 05:19:02PM -0700, Linus Torvalds wrote:

> So in general, we should always (a) either verify all sequence points
> or (b) return -ECHILD to go into slow mode. The patch seems
>
> However, this thing was explicitly made to be this way by commit
> 766c4cbfacd8 ("namei: d_is_negative() should be checked before ->d_seq
> validation"), so while my gut feel is to consider this fix
> ObviouslyCorrect(tm), I will delay it a bit in the hope to get an ACK
> and comment from Al about the patch.
>
> Al?

Umm... I agree that the current version is wrong and it looks like this
patch is a complete fix. The only problem is the commit message -
what really happens is that 766c4cbfacd8 got things subtly wrong.
We used to treat d_is_negative() after lookup_fast() as "fail with ENOENT".
That was wrong - checking ->d_flags outside of ->d_seq protection is
unreliable and failing with hard error on what should've fallen back to
non-RCU pathname resolution is a bug.

Unfortunately, we'd pulled the test too far up and ran afoul of another
kind of staleness. Dentry might have been absolutely stable from the
RCU point of view (and we might be on UP, etc.), but stale from the
remote fs point of view. If ->d_revalidate() returns "it's actually
stale", dentry gets thrown away and original code wouldn't even have looked
at its ->d_flags. What we need is to check ->d_flags where 766c4cbfacd8 does
(prior to ->d_seq validation) but only use the result in cases where we
do not discard this dentry outright.
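
To make the ordering concrete, here is a rough user-space sketch of it. This
is not the patch and not kernel code: struct toy_dentry, toy_lookup_fast()
and the TOY_* return values are hypothetical names made up for illustration,
and ->d_revalidate() is reduced to a stored result. The only thing it is
meant to show is "sample early, use late".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	unsigned seq;		/* models ->d_seq */
	bool negative;		/* models what d_is_negative() reads from ->d_flags */
	bool needs_reval;	/* models the fs having a ->d_revalidate() */
	int reval_result;	/* models its return: > 0 valid, 0 stale */
};

/* Hypothetical non-error outcomes of the toy fast path. */
enum { TOY_FOUND = 0, TOY_SLOW_PATH = 1 };

int toy_lookup_fast(const struct toy_dentry *d, unsigned seq_seen)
{
	/* Sample the flag before validating the sequence count... */
	bool negative = d->negative;

	/* ...so that a mismatch means the sample is untrustworthy: fall back. */
	if (d->seq != seq_seen)
		return -ECHILD;

	/* Stale from the fs point of view: dentry is discarded, sample unused. */
	if (d->needs_reval && d->reval_result <= 0)
		return TOY_SLOW_PATH;

	/* Only a dentry that survived both checks may yield a hard error. */
	if (negative)
		return -ENOENT;

	return TOY_FOUND;
}
```

In this model a negative dentry that revalidation declares stale yields
TOY_SLOW_PATH rather than -ENOENT; the hard error is returned only when the
dentry is kept.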

With some explanation along the lines of the above added, consider the patch
ACKed.