Re: [ 16/46] NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done

From: Boaz Harrosh
Date: Wed Sep 19 2012 - 05:50:34 EST


On 09/17/2012 04:05 PM, Myklebust, Trond wrote:

>> -----Original Message-----
>> From: Greg Kroah-Hartman [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
>> Sent: Sunday, September 16, 2012 12:37 PM
>> To: Ben Hutchings
>> Cc: Myklebust, Trond; linux-kernel@xxxxxxxxxxxxxxx;
>> stable@xxxxxxxxxxxxxxx; Boaz Harrosh; Tigran Mkrtchyan
>> Subject: Re: [ 16/46] NFSv4.1: Remove a bogus BUG_ON() in
>> nfs4_layoutreturn_done
>>
>> On Sun, Sep 16, 2012 at 05:33:03PM +0100, Ben Hutchings wrote:
>>> On Wed, 2012-09-12 at 16:39 -0700, Greg Kroah-Hartman wrote:
>>>> From: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
>>>>
>>>> 3.0-stable review patch. If anyone has any objections, please let me know.
>>>>
>>>> ------------------
>>>>
>>>> From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
>>>>
>>>> commit 47fbf7976e0b7d9dcdd799e2a1baba19064d9631 upstream.
>>>>
>>>> Ever since commit 0a57cdac3f (NFSv4.1 send layoutreturn to fence
>>>> disconnected data server) we've been sending layoutreturn calls
>>>> while there is potentially still outstanding I/O to the data
>>>> servers. The reason we do this is to avoid races between replayed
>>>> writes to the MDS and the original writes to the DS.
>>>>
>>>> When this happens, the BUG_ON() in nfs4_layoutreturn_done can be
>>>> triggered because it assumes that we would never call layoutreturn
>>>> without knowing that all I/O to the DS is finished. The fix is to
>>>> remove the BUG_ON() now that the assumptions behind the test are
>>>> obsolete.
>>>>
>>>> Reported-by: Boaz Harrosh <bharrosh@xxxxxxxxxxx>
>>>> Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@xxxxxxx>
>>>> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
>>>> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>>> [...]
>>>
>>> The upstream commit has:
>>>
>>> Cc: stable@xxxxxxxxxxxxxxx [>=3.5]
>>>
>>> and so I ignored it for 3.2. Is it actually needed for the earlier
>>> stable series?
>>
>> Crud, I missed that somehow :(
>>
>> Trond, should I revert this in 3.0 and 3.4 stable kernels?
>
> Hi Greg,
>
> Applying it to those kernels should be unnecessary but harmless, so if you've already applied them then I'd say just keep them.
>
> Cheers
> Trond


Hi Trond,

I also hit this with the objects layout in 3.2.
I know that with the files layout it only triggers post-3.5,
but we've been using layoutreturn since 3.0.

Thanks
Boaz