Re: bisected boot regression post 2.6.25-rc3.. please revert

From: Ingo Molnar
Date: Mon Mar 03 2008 - 02:47:03 EST



* Arjan van de Ven <arjan@xxxxxxxxxxxxxxx> wrote:

> Hi Linus, Ingo, Hans,
>
> Please revert commit cded932b75ab0a5f9181ee3da34a0a488d1a14fd ( x86:
> fix pmd_bad and pud_bad to support huge pages ) since it prevents the
> kernel to finish booting on my (Penryn based) laptop. The boot stops
> right after freeing the init memory. Took a while to bisect (due to it
> touching page*.h, which forces a full recompile), but it definitely is
> caused by this commit...

hm, lets figure out why this patch breaks your box, ok? We obviously
have to revert it if we cannot figure it out, but lets at least try -
because the patch itself fixes a real regression and it's not obviously
wrong either. I think there might be some bug hiding somewhere that we
really want to fix instead of this revert.

Could you try to hack up a debug patch perhaps? Uninline the fucntion(s)
then add two versions (one is that breaks on your box and one is that
works on your box) of this same pmd_bad()/pud_bad() functions and do
something like this (pseudocode):

pmd_bad()
{
if (pmd_bad_working(x) != pmd_bad_broken(x))
panic_timeout++;

return pmd_bad_working(x);
}

i.e. we actually use the working function so your box should boot up
just fine - but we instrument things with the broken function as well
and detect the cases where the two values differ. It is an anomaly if
either function ever returns true instead of false.

if after bootup you have a non-zero panic_timeout then there is a
material difference somewhere. In that case try to stick a dump_stack()
into the above case as well - and if the box still boots, send us the
stackdumps. (if the box doesnt boot then perhaps the printout itself
hangs - in that case try a save_stack_trace() hack and print it out
later)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/