Re: [PATCH 0/4] Was: deferring __fput()
From: Al Viro
Date: Sun Jul 01 2012 - 23:43:15 EST
On Sun, Jul 01, 2012 at 09:46:31PM -0400, Mimi Zohar wrote:
> On Sun, 2012-07-01 at 21:57 +0100, Al Viro wrote:
> > On Sun, Jul 01, 2012 at 03:50:02PM -0400, Mimi Zohar wrote:
> > > Replacing it with a call to __fput(), the system boots.
> >
> > "it" being just the part under that if (unlikely(...))? Very interesting... If so, we
> > have some kernel thread ending up with delayed __fput() which somehow makes dracut (assuming
> > you are using fedora initramfs to go with fedora config) unhappy. With your own patch,
> > doing async __fput() in a lot of cases when this one doesn't delay past the return to
> > userland managing to survive the boot... I wonder which files end up triggering that fun
> > and which kernel thread is responsible... Could you slap a printk() in there, showing
> > file->f_dentry->d_inode->i_mode (octal) and at least file->f_dentry->d_name.name?
> > Along with the current->comm[], all under that inner if (). And see which ones end up
> > going that way by the time execve() of /sbin/init fails.
>
> pid=1 uid=0 d_name=init comm=swapper/0 dev="rootfs" mode=100775
> pid=1 uid=0 d_name=bash comm=swapper/0 dev="rootfs" mode=100755
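
For readers decoding the two lines above: a minimal userspace sketch of how such a debug line could be formatted and how the octal mode reads. The function names are hypothetical and this is not the printk() that was actually added to the kernel; the pid, uid and dev fields are taken from the reported output rather than from the original request.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical userspace mirror of the debug line quoted above.
 * In the kernel the values would come from current and struct file;
 * here they are plain arguments so the format can be checked in isolation.
 */
static int format_fput_debug(char *buf, size_t len, int pid, unsigned int uid,
                             const char *d_name, const char *comm,
                             const char *dev, mode_t mode)
{
    assert(buf != NULL && len > 0);
    /* %o prints i_mode in octal, e.g. 100775 = S_IFREG | 0775 */
    return snprintf(buf, len,
                    "pid=%d uid=%u d_name=%s comm=%s dev=\"%s\" mode=%o",
                    pid, uid, d_name, comm, dev, (unsigned int)mode);
}

/* True if mode describes a regular file with at least one execute bit set. */
static int is_executable_regular(mode_t mode)
{
    return S_ISREG(mode) && (mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
}
```

Both reported modes (100775 and 100755) decode to regular files with execute bits set, i.e. the binaries that were just unpacked into rootfs.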

OK... Here's what I suspect is going on:

* populating initramfs writes binaries there. We open files (for write) from
  the kernel thread (there's nothing other than kernel threads at that point), write to
  them, then close(). Final fput() gets delayed.
* Then we proceed to execve(). Which means mapping the binary with MAP_DENYWRITE.
  Which fails, since there's a struct file still opened for write on that sucker.

Your patch did not delay those fput() - they were done without ->mmap_sem held. So
it survived. Booting without initramfs always survives; booting with initramfs may
or may not survive, depending on the timings - if that scheduled work manages to
run by the time we do those execve(), we win. Note that async_synchronize_full()
done in init_post() might easily affect that, depending on config.

As a quick test, could you try slapping a delay somewhere around the beginning
of init_post() and see if it rescues the system?
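
The suspected ordering problem can be sketched as a toy model in plain C. Everything here is a simplified stand-in with hypothetical names (toy_inode, toy_fput_delayed() and so on), not the kernel's data structures or the patch under discussion. It also assumes that the failing execve() reports -ETXTBSY, which the thread does not state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
struct toy_inode { int writecount; };      /* > 0 means "someone has it open for write" */
struct toy_file  { struct toy_inode *inode; };

#define TOY_MAX_DELAYED 8
static struct toy_file *toy_delayed[TOY_MAX_DELAYED];  /* pending final fputs */
static size_t toy_ndelayed;

/* Opening for write pins the inode as "being written". */
static void toy_open_for_write(struct toy_file *f, struct toy_inode *inode)
{
    f->inode = inode;
    inode->writecount++;
}

/* The real release work: only here does the writer go away. */
static void toy_final_fput(struct toy_file *f)
{
    f->inode->writecount--;
    f->inode = NULL;
}

/* close() from a kernel thread: the final fput is queued, not run. */
static void toy_fput_delayed(struct toy_file *f)
{
    assert(toy_ndelayed < TOY_MAX_DELAYED);
    toy_delayed[toy_ndelayed++] = f;
}

/* The scheduled work finally running. */
static void toy_run_delayed_work(void)
{
    for (size_t i = 0; i < toy_ndelayed; i++)
        toy_final_fput(toy_delayed[i]);
    toy_ndelayed = 0;
}

/* execve() needs to deny writers; it fails while any writer remains. */
static int toy_execve(const struct toy_inode *inode)
{
    return inode->writecount > 0 ? -ETXTBSY : 0;
}
```

In this model an execve() issued between toy_fput_delayed() and toy_run_delayed_work() fails, and the same call issued afterwards succeeds, which is the timing dependence described above.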