[BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)

From: Harald Laabs
Date: Tue Jul 19 2011 - 16:46:07 EST


Hi,
reloading an Apache httpd can crash the kernel since 2.6.35.
It seems that tasks go away between building the list of tasks to
wake and calling wake_up_sem_queue_do() in freeary(). The task_struct
pointers from that list end up in try_to_wake_up(), where they are
sometimes NULL (0x0).
The problem did not exist in 2.6.34. It does not show up on
single-processor systems. Depending on the Apache httpd settings, it
takes only a few tries to kill the system on our 8-core servers. A
dual-core system did not crash; maybe the bug really needs more than
one physical CPU.
Various gcc versions (4.1 to 4.6) were used.
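
For anyone who wants to poke at this path without Apache, here is a
minimal userspace sketch. It assumes the crash is reached when a
System V semaphore set is removed (semctl() with IPC_RMID is what ends
up in freeary()) while other tasks sleep in semop() on it. The helper
name rmid_round() is made up for illustration, and nothing here is
confirmed to trigger the bug; it only exercises the same wake-up path.

```c
#include <errno.h>
#include <sched.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Linux leaves it to the caller to define the semctl() argument union. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Hypothetical helper: fork nchildren tasks that sleep in semop() on a
 * fresh semaphore set, then remove the set from the parent.
 * Returns the number of children whose semop() failed with EIDRM,
 * or -1 if the set could not be created.
 */
int rmid_round(int nchildren)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    /* Force the value to 0 so that a -1 operation has to block. */
    union semun_arg arg = { .val = 0 };
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    int started = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
            /* Sleeps until the set is removed, then fails with EIDRM. */
            int rc = semop(semid, &op, 1);
            _exit((rc == -1 && errno == EIDRM) ? 0 : 1);
        }
        started++;
    }

    /* Wait (bounded) until every child is queued on the semaphore. */
    for (long spin = 0; spin < 1000000L; spin++) {
        if (semctl(semid, 0, GETNCNT) >= started)
            break;
        sched_yield();
    }

    /* Removing the set wakes all sleepers from inside the kernel. */
    semctl(semid, 0, IPC_RMID);

    int woken = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            woken++;
    }
    return woken;
}
```

Calling rmid_round(8) in a loop on an SMP machine would be the obvious
thing to try. On a healthy kernel each call should simply return the
number of children it started.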

If anyone wants to crash a system using a prefork Apache httpd:
<IfModule mpm_prefork_module>
ServerLimit 512
StartServers 50
MinSpareServers 50
MaxSpareServers 100
MaxClients 200
MaxRequestsPerChild 500
</IfModule>
(The exact values do not seem to matter, but with some settings the
system did not die as quickly.)

I'm not able to understand or fix this bug myself. It is already in
bugzilla, with the call trace:
https://bugzilla.kernel.org/show_bug.cgi?id=27142

Is there any more useful information I can provide? Anything to test?
Does anyone know of changes from 2.6.34 to 2.6.35 that might have
broken this? (The diff and the changelog do not enlighten me; too
much changed and I understand little of it.)

Thanks,
Harald