Re: Bulk CPU Hotplug (Was Re: [PATCH] Do not force shutdown/reboot to boot cpu.)

From: Robin Holt
Date: Thu Apr 11 2013 - 17:09:09 EST


On Thu, Apr 11, 2013 at 03:08:20PM -0500, Russ Anderson wrote:
> On Thu, Apr 11, 2013 at 08:15:27PM +0530, Srivatsa S. Bhat wrote:
> > On 04/11/2013 07:53 PM, Russ Anderson wrote:
> > > On Thu, Apr 11, 2013 at 06:15:18PM +0530, Srivatsa S. Bhat wrote:
> > >>
> > >> One more thing we have to note is that, there are 4 notifiers for taking a
> > >> CPU offline:
> > >>
> > >> CPU_DOWN_PREPARE
> > >> CPU_DYING
> > >> CPU_DEAD
> > >> CPU_POST_DEAD
> > >>
> > >> The first can be run in parallel as mentioned above. The second is run in
> > >> parallel in the stop_machine() phase as shown in Russ' patch. But the third
> > >> and fourth set of notifications all end up running only on CPU0, which will
> > >> again slow down things.
> > >
> > > In my testing the third and fourth set were a small part of the overall
> > > time. Less than 10%, with cpu notifiers 90+% of the time.
> >
> > *All* of them are cpu notifiers! All of them invoke __cpu_notify() internally.
> > So how did you differentiate between them and find out that the third and
> > fourth sets take less time?
>
> I reran a test on a 1024 cpu system, using my test patch to only call
> __stop_machine() once. Added printks to show the kernel timestamp
> at various points.
>
> When calling disable_nonboot_cpus() and enable_nonboot_cpus() just after
> booting the system:
> The loop calling __cpu_notify(CPU_DOWN_PREPARE) took 376.6 seconds.
> The loop calling cpu_notify_nofail(CPU_DEAD) took 8.1 seconds.
>
> My guess is that notifiers do more work in the CPU_DOWN_PREPARE case.
>
> I also added a loop calling a new notifier (CPU_TEST) which none of
> the notifiers would recognize, to measure the time it took to spin through
> the call chain without the notifiers doing any work. It took
> 0.0067 seconds.
>
> On the actual reboot, as the system was shutting down:
> The loop calling __cpu_notify(CPU_DOWN_PREPARE) took 333.8 seconds.
> The loop calling cpu_notify_nofail(CPU_DEAD) took 2.7 seconds.

How about if you take the notifier_call_chain() function, copy it
to kernel/sys.c, and time each notifier_call() callout individually?
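
For illustration only, here is a rough user-space sketch of that idea:
walk the chain and accumulate the time spent in each callback
separately. It is not the kernel's notifier code. The struct
timed_notifier, timed_call_chain(), report_chain(), the EV_* constants
and the two sample callbacks are all made-up names, and unlike the real
chain walker it does not stop early on a callback's return value.

```c
/* User-space sketch only; every name here is hypothetical, not kernel API. */
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define NOTIFY_DONE     0
#define EV_DOWN_PREPARE 1UL   /* stand-in for an event callbacks act on */
#define EV_TEST         99UL  /* stand-in for an event nobody recognizes */

struct timed_notifier {
    const char *name;                          /* label used in the report */
    int (*notifier_call)(unsigned long event, void *data);
    struct timed_notifier *next;               /* next callback in the chain */
    double total_sec;                          /* time accumulated in this callback */
    unsigned long calls;                       /* number of invocations */
};

static double now_sec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Call every entry in the chain, timing each callout on its own.
 * Returns the number of callbacks invoked. */
int timed_call_chain(struct timed_notifier *head, unsigned long event, void *data)
{
    int invoked = 0;

    for (struct timed_notifier *nb = head; nb; nb = nb->next) {
        assert(nb->notifier_call != NULL);
        double start = now_sec();
        nb->notifier_call(event, data);
        nb->total_sec += now_sec() - start;
        nb->calls++;
        invoked++;
    }
    return invoked;
}

/* Print one line per callback so the expensive ones stand out. */
void report_chain(const struct timed_notifier *head)
{
    for (const struct timed_notifier *nb = head; nb; nb = nb->next)
        printf("%-12s calls=%lu total=%.6f s\n",
               nb->name, nb->calls, nb->total_sec);
}

/* Sample callback that only does work for EV_DOWN_PREPARE. */
int busy_cb(unsigned long event, void *data)
{
    volatile unsigned long sink = 0;

    (void)data;
    if (event == EV_DOWN_PREPARE)
        for (unsigned long i = 0; i < 1000000UL; i++)
            sink += i;
    return NOTIFY_DONE;
}

/* Sample callback that ignores every event. */
int noop_cb(unsigned long event, void *data)
{
    (void)event;
    (void)data;
    return NOTIFY_DONE;
}
```

Linking a couple of timed_notifier entries together, calling
timed_call_chain() once per CPU in the loop, and then calling
report_chain() should show which callout dominates. Passing EV_TEST
gives the bare chain-walk overhead, along the lines of the CPU_TEST
measurement quoted above.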

Robin