Re: 2.6.21-rc4-mm1

From: Andy Whitcroft
Date: Thu Mar 22 2007 - 13:08:40 EST


Con Kolivas wrote:
> On Thursday 22 March 2007 20:48, Andy Whitcroft wrote:
>> Andy Whitcroft wrote:
>>> Andy Whitcroft wrote:
>>>> Andrew Morton wrote:
>>>>> Temporarily at
>>>>>
>>>>> http://userweb.kernel.org/~akpm/2.6.21-rc4-mm1/
>>>>>
>>>>> Will appear later at
>>>>>
>>>>>
>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc
>>>>> 4/2.6.21-rc4-mm1/
>>>> [All of the below is from the pre hot-fix runs. The very few results
>>>> which are in for the hot-fix runs seem worse if anything. :( All
>>>> results should be out on TKO.]
>>>>
>>>>> - Restored the RSDL CPU scheduler (a new version thereof)
>>>> Unsure if the above is the culprit but there seems to be a smattering of
>>>> BUG's in kernbench from the schedular on several systems, and panics
>>>> which do not fully dump out.
>>>>
>>>> elm3b239 is about 2/4 kernbench being the test in progress when we
>>>> blammo in both failed tests, elm3b234 doesn't boot at all.
>>> Well I have one result through for backing RSDL out on elm3b239 and that
>>> does indeed seem to give us a successful boot and test. peterz has
>>> pointed me to an incremental patch from Con which I'll push through
>>> testing and see if that sorts it out.
>> Ok, tested the patch below on top of 2.6.21-rc4-mm1 and this seems to
>> fix the problem:
>>
>> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc4-mm1-rsdl-0.32.p
>> atch
>>
>> Hard to tell from that patch whether it will be fixed in the changes
>> already committed to the next -mm.
>>
>> Its possible that it may be fixed by the following patch:
>>
>> sched-rsdl-improvements.patch
>>
>> Which has the following slipped in at the end of the changelog:
>>
>> A tiny change checking for MAX_PRIO in normal_prio()
>> may prevent oopses on bootup on large SMP due to
>> forking off the idle task.
>>
>> Con, are all the changes in the 0.32 patch above with akpm?
>
> Yes he's queued everything in that patch you tested for the next -mm. Thanks
> very much for testing it.

No worries. I've just got through the results on the other machine in
the mix. That machine seems to be fixed by backing out RSDL and not by
the fixup 0.32 patch ...

This second machine seems to had hard very soon after user space starts
executing but without a panic. I can't say that the symptoms are very
definitive, but I do have a good result from that machine without RSDL
and not with rsdl-0.32.

The machine is a dual-core x86_64 machine: Dual Core AMD Opteron(tm)
Processor 275.

I'll let you know if I find out anything else. Shout if you want any
information or have anything you want poked or tested.

-apw


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/