Here is a prototype patch for the compressed IRQ stubs. It compresses
them down to 7 stubs per 32-byte cache line (or part of a cache line) at
the expense of a back-to-back jmp, which has the potential to be ugly
on some pipelines (without that, we can only get 4 stubs into 32
bytes).
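As a rough check of the 7-versus-4 figures, here is a small sketch of the byte counting. The stub layout is an assumption, not quoted from the patch: each stub is taken to be a push imm8 (6a ib, 2 bytes), every stub but the last in a block takes a short jmp (eb cb, 2 bytes) to one shared jmp rel32 (e9 cd, 5 bytes), and the last stub falls through into it. The function names are made up for illustration.

```python
# Standard IA-32 instruction lengths; the stub layout itself is an assumption.
PUSH_IMM8 = 2   # 6a ib
JMP_REL8 = 2    # eb cb
JMP_REL32 = 5   # e9 cd
LINE = 32       # cache line size in bytes, as in the text

def chained_block_size(n):
    """n stubs sharing one jmp rel32; all but the last need a short jmp to it."""
    return n * PUSH_IMM8 + (n - 1) * JMP_REL8 + JMP_REL32

def standalone_block_size(n):
    """n stubs that each carry their own jmp rel32 (no back-to-back jmp)."""
    return n * (PUSH_IMM8 + JMP_REL32)

# Largest stub count whose block still fits in one cache line.
max_chained = max(n for n in range(1, LINE + 1) if chained_block_size(n) <= LINE)
max_standalone = max(n for n in range(1, LINE + 1) if standalone_block_size(n) <= LINE)

print(max_chained, chained_block_size(max_chained))
print(max_standalone, standalone_block_size(max_standalone))
```

Under these assumptions the chained layout fits 7 stubs in 31 bytes, and the standalone layout fits 4 stubs in 28 bytes.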
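You could actually get 4-byte stubs by using a 16-bit call (66 e8 ww ww, where ww ww is the 16-bit displacement). But it would be slower, since we wouldn't be pairing the call with a ret, which tends to upset the CPU's return-address prediction.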
I suspect we could get it down to three bytes by sharing the last byte of each four-byte call sequence with the first byte of the next stub:
66 e8 ff 66 e8 fc 66 e8 f9 66 e8 f6 ...
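A new stub begins every three bytes. Each one is a four-byte call whose displacement has 0x66 (the next stub's prefix) as its high byte, and whose low byte drops by 3 per stub to cancel the 3-byte advance, so every call lands on offset 0x6703 relative to the beginning of the first stub.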
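To check the arithmetic, here is a small sketch that builds the overlapping byte sequence and decodes each call. It assumes the usual call rel16 semantics (target = address of the next instruction + signed 16-bit little-endian displacement); the helper names are made up for illustration.

```python
def stub_bytes(n_stubs):
    """Emit n_stubs overlapping 3-byte stubs plus the trailing 0x66 byte.

    bytes() raises ValueError once the low displacement byte would go
    below zero, so the pattern is limited to 86 stubs.
    """
    out = bytearray()
    for i in range(n_stubs):
        out += bytes([0x66, 0xE8, 0xFF - 3 * i])
    out.append(0x66)  # high displacement byte of the last call
    return bytes(out)

def call_target(code, start):
    """Decode the 4-byte '66 e8 lo hi' call at offset start; return its target."""
    assert code[start] == 0x66 and code[start + 1] == 0xE8
    disp = int.from_bytes(code[start + 2:start + 4], "little", signed=True)
    return start + 4 + disp  # rel16 is relative to the next instruction

code = stub_bytes(4)
targets = [call_target(code, 3 * i) for i in range(4)]
print(code.hex(" "))
print([hex(t) for t in targets])
```

All four stubs decode to the same target, 0x6703. Since the low displacement byte runs from 0xff down to 0x00 in steps of 3, the trick as written covers at most 86 stubs before the shared 0x66 high byte stops working.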
Can anyone better 24 bits/stub?