Re: [PATCH 17/46] fs: Use rename lock and RCU for multi-step operations

From: Yehuda Sadeh Weinraub
Date: Mon Feb 07 2011 - 16:04:55 EST


On Mon, Feb 7, 2011 at 10:52 AM, Jim Schutt <jaschut@xxxxxxxxxx> wrote:
>
> On Wed, 2011-01-26 at 22:18 -0700, Nick Piggin wrote:
>> On Wed, Jan 26, 2011 at 9:10 AM, Yehuda Sadeh Weinraub
>> <yehudasa@xxxxxxxxx> wrote:
>> > On Wed, Jan 19, 2011 at 2:32 PM, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>> >> On Thu, Jan 20, 2011 at 9:27 AM, Yehuda Sadeh Weinraub
>> >> <yehudasa@xxxxxxxxx> wrote:
>> >>> On Tue, Jan 18, 2011 at 2:42 PM, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>> >>>> On Wed, Jan 19, 2011 at 9:32 AM, Yehuda Sadeh Weinraub
>> >>>
>> >>>>> There's an issue with ceph as it references the
>> >>>>> dentry->d_parent(->d_inode) at dentry_release(), so setting
>> >>>>> dentry->d_parent to NULL here doesn't work with ceph. Though there is
>> >>>>> some workaround for it, we would like to be sure that this one is
>> >>>>> really required so that we don't exacerbate the ugliness. The
>> >>>>> workaround is to keep a pointer to the parent inode in the private
>> >>>>> dentry structure, which will be referenced only at the .release()
>> >>>>> callback. This is clearly not ideal.
>> >>>>
>> >>>> Hmm, I'll have to think about it. Probably we can check for
>> >>>> d_count == 0 rather than parent != NULL I think?
>> >>>>
>> >>>
>> >>> That'll solve ceph's problem, don't know about how'd affect other
>> >>> stuff. We'll need to know whether this is the solution, or whether
>> >>> we'd need to introduce some other band aid fix.
>> >>
>> >> No I think it will work fine. Basically we just need to know whether
>> >> we have been deleted, and if so then we restart rather than walking
>> >> back up the parent.
>> >>
>> >> I'll send a patch in a few days. For the meantime, it's a rather
>> >> small window for ceph to worry about. So we'll have something
>> >> before -rc2 which should be OK.
>> >>
>> >
>> > I guess that it's a bit late for -rc2, should we assume that it'll be on -rc3?
>>
>> Yeah, I'm sorry I've been travelling and a bit disconnected.
>>
>> NFS folk are having a similar problem and looks like similar
>> proposed fix will do it.
>>
>> http://marc.info/?l=linux-fsdevel&m=129599823927039&w=2
>>
>> So I think it is the best way to go to restore behaviour back to what
>> filesystems already expect, to avoid more surprises in future.
>
> I think the following BUG indicates I'm hitting this problem?
> All I have to do to cause it is unlink a file.
>
> My ceph client kernel is 8dbdea8444 (master branch) from
>  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> + e41cdbb6c5 (master branch) + a3f5274e53 (unstable branch)
>  from git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git
>
> Are there any patches available for this I can test?
>
> Thanks -- Jim
>

It does look like this specific problem.
You can try cherry-picking commit 9c3db35 from the ceph git tree. It is
just a temporary workaround, and it hasn't been tested much. Hopefully
Nick will push his fix soon so that it won't be needed.
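
For illustration only, here is a rough userspace model of the workaround
idea quoted above: keep a pointer to the parent inode in the private
dentry data, so that the release callback never dereferences
dentry->d_parent. This is not the code in 9c3db35. The structs are
simplified stand-ins for the kernel ones, and demo_dentry_info,
demo_dentry_init() and demo_dentry_release() are hypothetical names. It
also ignores inode lifetime; real code would presumably need to hold a
reference on the cached inode.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, NOT the kernel's real definitions. */
struct inode {
    int release_calls;          /* counts per-parent cleanup, for the demo */
};

struct dentry {
    struct dentry *d_parent;    /* may already be NULL when release runs */
    struct inode *d_inode;
    void *d_fsdata;             /* filesystem-private per-dentry data */
};

/* Hypothetical private data: caches the parent inode up front. */
struct demo_dentry_info {
    struct inode *parent_inode;
};

/* Called while d_parent is still valid; remembers the parent inode. */
static void demo_dentry_init(struct dentry *dentry,
                             struct demo_dentry_info *di)
{
    di->parent_inode = dentry->d_parent ? dentry->d_parent->d_inode : NULL;
    dentry->d_fsdata = di;
}

/* Release path: uses only the cached pointer, never dentry->d_parent. */
static void demo_dentry_release(struct dentry *dentry)
{
    struct demo_dentry_info *di = dentry->d_fsdata;

    assert(di != NULL);
    if (di->parent_inode)
        di->parent_inode->release_calls++;  /* stand-in for real cleanup */
    dentry->d_fsdata = NULL;
}
```

In this model, clearing d_parent between init and release does not
matter, because the release path only looks at the cached pointer.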

Thanks,
Yehuda