Re: [bisected] NFS corruption with 3.4

From: Dave Jones
Date: Tue Jun 05 2012 - 08:46:13 EST


On Tue, Jun 05, 2012 at 11:16:17AM +0200, Ondrej Zary wrote:
> Hello,
> I use NFS for deploying HDD images on new machines. My machine has a second network
> card just for this, running DHCPD, TFTPD and kernel NFS server. The target
> machine is set to boot from LAN and boots SystemRescueCD from my machine with
> an autorun script that launches Partimage and deploys the HDD image (400 to
> 900 MB compressed).
>
> It worked fine for years, until now. With kernel 3.4, everything
> works only the first time after boot (and not always). Next time (next
> machine), partimage aborts almost immediately as it's probably unable to
> decompress the image file. md5sum is different on my machine vs. on the
> target (through NFS). Also SystemRescueCD boot aborts with md5 error
> sometimes. Everything works fine after rebooting back to 3.3.
>
> Bisection found this:
>
> 0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit
> commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
> Author: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxx>
> Date: Wed Mar 28 14:42:54 2012 -0700
>
> radix-tree: use iterators in find_get_pages* functions
>
> Reverting this commit in 3.4 fixes the problem.

I meant to come back to this, because I saw this problem too.

Is this patch a problem for the client or the server?
I'm assuming the server, because I saw at least a similar-sounding
problem using an OSX client -> Linux server.
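
One way to narrow down which side is at fault might be to checksum the
same image file on the server's local disk and again through the NFS
mount on a client, repeating the md5sum comparison from the report.
A minimal sketch follows; the helper names and any paths passed to them
are hypothetical, not taken from the original setup:

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large image files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(reference_path, other_path):
    """Return True if both files hash to the same MD5 digest."""
    return md5_of(reference_path) == md5_of(other_path)
```

Calling same_content() with the server-local copy and the NFS-mounted
copy would show whether the two reads agree. Note that a repeated read on
the same client may be served from its page cache, so a matching result
there does not necessarily rule out the server.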

Dave
