Correct.Hmm, color me confused. Is there something preventing a driver fromCan't get what you're talking about. A freelist of what?
having its own freelist (or whatever), in cases where kmalloc doesn't
suffice? If not, why should the core change? And what sort of driver
measurements are you doing, to conclude that kmalloc doesn't suffice?
Of whatever the driver wants. You had pointed at some code that
boiled down to keeping a freelist of some structures in the core.
Such code doesn't _need_ to be in the core at all ...
See b) above. Doesn't matter where to put an implementation of this memory allocation model. Matters whether it's usable or not. Currently it's not, especially if a driver is complicated enuff to have transfers of arbitrary size.
Basically the idea of the custom lightweight allocation is the following: allocate a page and divide into regions of the same size, initialize and use these regions as a stack.
There already a lot of allocators that support that, like kmem_cache_t
or dma_pool or mempool_t ... no more, please! They don't all have that
"stack"/freelist notion though; easy for drivers to do that, IMO.
Drivers that don't want to hit the heap will preallocate everything
they can, and they'll probably have their own freelists. That wins
by eliminating the slab subroutine calls from what are often hot
code paths, along with their fault handling logic ... and removing
the need (and testing) for fault handling wins by improving robustness.
Whoops. How can it be optional if we want to have an async transfer?I'd have said that since this increases the per-transfer costs (to setDisagree. Let's look deeper. kmalloc itself is more heavyweght than setting up list memberships.
up and manage the list memberships) you want to increase the weight of
that part of the API, not decrease it. ;)
But kmalloc was previously optional, yes? And should still be ...
Not sure if I got you here, sorry.
The list setting commands are pretty essential and will not add a lot to the assembly code.
I'm not totally averse to such changes, but you don't seem to be making
the best arguments. Example: they're clearly not "essential" because
transfer queues work today with the lists at the spi_message level.
'5' is only an example. What if there gonna be 7, 10, 12? What if the possible cases for a driver are 1, 2, 5, 7, 12 and 14 transfers per message?
And your understanding is not quite correct. What if I'm going to send a chain of 5 messages? I'll allocate 5 spi_msg's in my case which all are gonna be of the same size -- thus the technique described above is applicable. In case of your core it's not.
I must have been deceived then by this little utility:
static inline struct spi_message *spi_message_alloc(unsigned ntrans, gfp_t flags)
{
struct spi_message *m;
m = kzalloc(sizeof(struct spi_message)
+ ntrans * sizeof(struct spi_transfer),
flags);
if (m) {
m->transfers = (void *)(m + 1);
m->n_transfer = ntrans;
}
return m;
}
Sure looks to me like spi_message_alloc(5, SLAB_ATOMIC) should do the trick.
One allocation, always the same size. And kzalloc() could easily optimize
that to use the "size-192" slab cache (or whatever), like kmalloc() does;
that'd be a small speedup.
This is mostly a stability issue, not a performance thing.
Could you elaborate on this problem you perceive? This isn't the onlyThe problem is: we're using real-time enhancements patch developed by Ingo/Sven/Daniel etc. You cannot call kmalloc from the interrupt context if you're using this patch. Yeah, yeah -- the interrupt handlers are in threads by default, but we can't go for that since we want immediate acknowledgement from the interrupt context, and that implies spi_message/spi_transfer allocation.
driver API in Linux to talk in terms of arrays describing transfers,
even potentially large arrays.
Could you elaborate a bit here? You seem to be implying that for some
reason one of your SPI related drivers must use non-threaded hardirqs,
AND (news to me, if true) that such hardirqs can't kmalloc(), AND that
it can't use any of several widely used strategies to avoid hitting
things like the slab allocator. (That last seems hardest to believe...)
I asked earlier what sort of performance measurements you're making
that are leading you to these conclusions. I'm still wondering. :)
I'm not concerned much with USB mow :), though I know guys having _a lot_ of trouble trying USB devices work within the real-time Linux enhancememts environment I was referring to earlier ;) Probably a lot of memory allocations within interrupt context... which is a always-avoid thing for any real-time system BTW...Consider how "struct scatterlist" is used, and how USB manages theNot sure if I understand why it's relevant to what we're discussing.
descriptors for isochronous transfers. They don't use linked lists
there, and haven't seemed to suffer from it.
Examples of driver interfaces in Linux that work fine using arrays to couple
a group of transfers. If you see some problem that makes a compelling
argument that the SPI must change, it would also affect drivers using
those interfaces too, at least a little bit ... right? Does it?