Re: 2.6.19-rc1 genirq causes either boot hang or "do_IRQ: cannothandle IRQ -1"

From: Linus Torvalds
Date: Sat Oct 07 2006 - 15:04:12 EST




On Sat, 7 Oct 2006, Eric W. Biederman wrote:
>
> I am hoping that by running the apics in a different delivery mode
> that explicitly says just deliver this interrupt to this cpu we
> will avoid the problem you are seeing.

Note that having too strict delivery modes could be a major pain in the
future, with things like multicore CPU's a lot more actively doing power
management on their own, and effectively going into sleep-states with
reasonably long latencies.

Especially with schedulers that are aware of things like that (and we
_try_, at least to some degree, and people are interested in more of it),
you can easily be in the situation that one of the cores is being fairly
actively kept in a low-power state, and can have millisecond latencies
(not to mention no L1 cache contents etc).

So I really do think that the belief that we should force irqs to a
particular core is fundamentally flawed.

We used to do lowest-priority stuff in hw, and then Intel broke it, but I
always told them that they were _stupid_ to break it. The fact is,
especially with multi-core, it actually makes a lot of sense to have
hardware decide which core to interrupt, because hardware simply
potentially knows better.

This is one of those age-old questions: in _theory_ you can do a better
job in software, but in _practice_ it's just too damn expensive and
complicated to do a perfect job especially with dynamic decisions, so in
_practice_ it tends to be better to let hardware make some of the
decisions.

We can see the same thing in instruction scheduling: in _theory_ a
compiler can do a better job of scheduling, since it can spend inordinate
amounts of resources on doing things once, and then the hardware can be
simpler and faster and never worry about it. In _practice_, however, the
biggest scheduling decisions are all dynamic at run-time, and depend on
things like cache misses etc, and only total idiots (or embedded people)
will do static scheduling these days.

I think it's a huge mistake to do static interrupt routing for the same
reason.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/