Re: [GIT pull] smp/hotplug updates for 4.13

From: Linus Torvalds
Date: Sun Jul 09 2017 - 14:15:09 EST


On Sun, Jul 9, 2017 at 2:07 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> + /* Wait for the CPU to reach CPUHP_AP_ONLINE_IDLE */
> wait_for_completion(&st->done);
> + BUG_ON(!cpu_online(cpu));

I realize this isn't actually a new BUG_ON(), just a moved one, but
can we *please* just agree to get rid of these machine-killing
idiocies?

The fact that it would be a bug if this triggers doesn't excuse it one
whit. BUG_ON() is simply bad.

Either it's a "this cannot happen", in which case it should be
deleted, or it's a "if this ever happens, we really should notice it",
in which case it should probably be a

if (WARN_ON)ONCE(..))
return -EBUSY;

or something.

That routine very much has an existing error path for failure to bring
up a CPU, dammit!

THERE IS NO EXCUSE FOR KILLING THE MACHINE THERE!

If it really was something core where there is no possible way to
continue, and continuing would imply that the end result will kill
your pets and everbody you love, then yes, use BUG_ON().

But if you already have code to handle the failure case, HELL NO!

I've pulled this thing (exactly because it's not a *new* BUG_ON()),
but I really think that people should look at the code they move. And
if it's a BUG_ON() or similar braindamage, they should look very
closely at what alternatives there are.

Linus