Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpuarea

From: Mike Travis
Date: Wed Jul 09 2008 - 10:37:35 EST


Eric W. Biederman wrote:
> Mike Travis <travis@xxxxxxx> writes:
>
>> Unfortunately it's back to the problem of faulting before x86_64_start_kernel()
>> and grub just immediately reboots. So I'm back at analyzing assembler and
>> config differences.
>
> Ok. That is a narrow window of code. So it shouldn't be too bad, nasty though.
>
> If you would like to trace through it I have attached my serial port
> debugging routines that I use in that part of the code.
>
> Eric
>
>

Very cool, thanks!!! I will start using this. (I have been using the trick
of replacing printk with early_printk so messages come out immediately instead
of from the log buffer.)

I've been able to make some more progress. I've gotten to a point where it
panics from stack overflow. I've verified this by bumping THREAD_ORDER, after
which it boots fine. Now tracking down stack usage. (I have found a couple of
new functions using set_cpus_allowed(..., CPU_MASK_ALL) instead of
set_cpus_allowed_ptr(..., CPU_MASK_ALL_PTR), but these are not in the calling
sequence, so they are not the cause.)
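
To illustrate why the by-value versus by-pointer distinction above matters for
stack usage, here is a minimal userspace sketch. It is not kernel code: the
type cpumask_sketch_t, the NR_CPUS_SKETCH value of 4096, and both function
names are assumptions made up for the example, standing in for a fixed-size
CPU bitmap and for the two calling styles.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4096 possible CPUs, picked only to make the size visible. */
#define NR_CPUS_SKETCH 4096
#define BITS_PER_LONG_SKETCH (8 * sizeof(unsigned long))
#define MASK_LONGS \
	((NR_CPUS_SKETCH + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* Hypothetical stand-in for a CPU mask: a fixed-size bitmap in a struct. */
typedef struct {
	unsigned long bits[MASK_LONGS];
} cpumask_sketch_t;

/*
 * By-value style: the parameter is a full copy of the mask, so every
 * call has to materialise that many bytes for the argument.
 * Returns the size of that copy.
 */
size_t arg_bytes_by_value(cpumask_sketch_t mask)
{
	return sizeof(mask);
}

/*
 * By-pointer style: only the address of an existing mask is passed.
 * Returns the size of the pointer parameter.
 */
size_t arg_bytes_by_pointer(const cpumask_sketch_t *mask)
{
	return sizeof(mask);
}
```

With these assumptions the by-value call carries 512 bytes of mask per
argument, while the by-pointer call carries only a pointer, which is the
kind of saving the _ptr variant is after.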

One weird thing is that early_idt_handler seems to have been called, and early
pending interrupts are one thing our simulator does not mimic for standard
Intel FSB systems. (It's designed after all to mimic our h/w, and of course
it's been booting fine under that environment.)

Two patches are in the queue to reduce this stack usage:

Subject: [PATCH 1/1] sched: Reduce stack size in isolated_cpu_setup()
Subject: [PATCH 1/1] kthread: Reduce stack pressure in create_kthread and kthreadd

The other stack pigs are:

1640 sched_domain_node_span
1576 tick_notify
1576 setup_IO_APIC_irq
1576 move_task_off_dead_cpu
1560 arch_setup_ht_irq
1560 __assign_irq_vector
1544 tick_handle_oneshot_broadcast
1352 zc0301_ioctl_v4l2
1336 i2o_cfg_compat_ioctl
1192 sn9c102_ioctl_v4l2
1176 __build_sched_domains
1152 e1000_check_options
1144 __build_all_zonelists
1128 setup_IO_APIC
1096 sched_balance_self
1096 _cpu_down
1080 do_ida_request
1064 sched_rt_period_timer
1064 native_smp_call_function_mask
1048 setup_timer_IRQ0_pin
1048 setup_ioapic_dest
1048 set_ioapic_affinity_irq
1048 set_ht_irq_affinity
1048 pci_device_probe
1048 native_machine_crash_shutdown
1032 tick_do_periodic_broadcast
1032 sched_setaffinity
1032 native_flush_tlb_others
1032 local_cpus_show
1032 local_cpulist_show
1032 irq_select_affinity
1032 irq_complete_move
1032 irq_affinity_write_proc
1032 ioapic_retrigger_irq
1032 flush_tlb_mm
1032 flush_tlb_current_task
1032 fixup_irqs
1032 do_cciss_request
1032 create_irq
1024 uv_vector_allocation_domain
1024 uv_send_IPI_allbutself
1024 smp_call_function_single
1024 smp_call_function
1024 physflat_send_IPI_allbutself
1024 pci_bus_show_cpuaffinity
1024 move_masked_irq
1024 flush_tlb_page
1024 flat_send_IPI_allbutself
1000 security_load_policy

I would think only a few of these get called early in the boot, though,
where they might also be contributing to the stack overflow.
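
As a rough sketch of the arithmetic involved, the snippet below adds up frame
sizes along a call chain and compares the total with the stack size. It
assumes 4 KiB pages and a stack of PAGE_SIZE << THREAD_ORDER bytes, and it
ignores whatever else lives on the stack. The helper names are made up for
the example, and any chain fed to it is hypothetical unless it is traced from
a real backtrace.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE_SKETCH 4096UL

/* Stack size for a given order, assuming size = page size << order. */
unsigned long thread_size(unsigned int thread_order)
{
	return PAGE_SIZE_SKETCH << thread_order;
}

/* Total bytes used by a chain of nested calls, one frame size per call. */
unsigned long chain_stack_bytes(const unsigned long *frames, size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += frames[i];
	return total;
}

/* Returns 1 if the whole chain fits in a stack of the given order, else 0. */
int chain_fits(const unsigned long *frames, size_t n,
	       unsigned int thread_order)
{
	return chain_stack_bytes(frames, n) <= thread_size(thread_order);
}
```

For example, stacking the six largest frame sizes from the list purely for
illustration (1640, 1576, 1576, 1560, 1560, 1544; they are not known to form
a real call chain) gives 9456 bytes, which does not fit at order 1 but does
fit at order 2.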

Oh yeah, I looked very closely at the differences in the assembler
for vmlinux when compiled with 4.2.0 (fails) and 4.2.4 (which boots
with the above mentioned THREAD_ORDER change) and except for some
weirdness around ident_complete it seems to be the same code. But
the per_cpu variables are in a completely different address order.
I wouldn't think that the -j10 for make could cause this but I can
verify that with -j1. But in any case, I'm sticking with 4.2.4 for
now.

Thanks,
Mike