* James Courtier-Dutton <James@xxxxxxxxxxxxxxxxxxxx> wrote:
This is just for info, now that we have a nice latency testing tool,
we might as well collect some useful traces that we can later work on.
Here is a trace showing a latency of 39034us. http://www.superbug.demon.co.uk/kernel/
This looks to me to be a bug somewhere. Either in the O7 patch, or in
the kernel. Surely, do_IRQ should happen quickly, and not take 39ms.
something's wrong indeed. Is this an x86 SMP system? If it's SMP then
please apply the patch below (ontop of -O7), it fixes an SMP false
positive bug in the latency timing code.
the process is looping somewhere. Here are the non-IRQ trace entries:
0.001ms (+0.000ms): __switch_to (schedule)
0.002ms (+0.000ms): finish_task_switch (schedule)
0.002ms (+0.000ms): __preempt_spin_lock (schedule)
... [lots of IRQs] ...
38.126ms (+0.362ms): preempt_schedule (schedule)
38.126ms (+0.000ms): sched_clock (schedule)
38.127ms (+0.000ms): find_next_bit (schedule)
38.127ms (+0.000ms): task_timeslice (schedule)
this shows that we are looping in __preempt_spin_lock() - most likely
via schedule()'s reqacquire_kernel_lock() code.
i.e. another CPU is holding the big kernel lock and this CPU is looping. _but_ this CPU is fully preemptible so the trace produces this false
positive.
Ingo