Re: [PATCH 2/2] slab: move kmem_cache_free to common code

From: Glauber Costa
Date: Tue Oct 23 2012 - 06:52:40 EST


On 10/23/2012 12:07 PM, Glauber Costa wrote:
> On 10/23/2012 04:48 AM, JoonSoo Kim wrote:
>> Hello, Glauber.
>>
>> 2012/10/23 Glauber Costa <glommer@xxxxxxxxxxxxx>:
>>> On 10/22/2012 06:45 PM, Christoph Lameter wrote:
>>>> On Mon, 22 Oct 2012, Glauber Costa wrote:
>>>>
>>>>> + * kmem_cache_free - Deallocate an object
>>>>> + * @cachep: The cache the allocation was from.
>>>>> + * @objp: The previously allocated object.
>>>>> + *
>>>>> + * Free an object which was previously allocated from this
>>>>> + * cache.
>>>>> + */
>>>>> +void kmem_cache_free(struct kmem_cache *s, void *x)
>>>>> +{
>>>>> + __kmem_cache_free(s, x);
>>>>> + trace_kmem_cache_free(_RET_IP_, x);
>>>>> +}
>>>>> +EXPORT_SYMBOL(kmem_cache_free);
>>>>> +
>>>>
>>>> This results in an additional indirection if tracing is off. Wonder if
>>>> there is a performance impact?
>>>>
>>> if tracing is on, you mean?
>>>
>>> Tracing already incurs overhead, not sure how much a function call would
>>> add to the tracing overhead.
>>>
>>> I would not be concerned with this, but I can measure, if you have any
>>> specific workload in mind.
>>
>> With this patch, kmem_cache_free() invokes __kmem_cache_free(),
>> that is, it add one more "call instruction" than before.
>>
>> I think that Christoph's comment means above fact.
>
> Ah, this. Ok, I got fooled by his mention to tracing.
>
> I do agree, but since freeing is ultimately dependent on the allocator
> layout, I don't see a clean way of doing this without dropping tears of
> sorrow around. The calls in slub/slab/slob would have to be somehow
> inlined. Hum... maybe it is possible to do it from
> include/linux/sl*b_def.h...
>
> Let me give it a try and see what I can come up with.
>

Ok.

I am attaching a PoC for this for your appreciation. This gets quite
ugly, but it's the way I found without including sl{a,u,o}b.c directly -
which would be even worse.

But I guess if we really want to avoid the cost of a function call,
there has to be a tradeoff...

For the record, the proposed usage for this would be:

1) Given a (inline) function, defined in mm/slab.h that translates the
cache from its object address (and then sanity checks it against the
cache parameter), translate_cache():

#define KMEM_CACHE_FREE(allocator_fn) \
void kmem_cache_free(struct kmem_cache *s, void *x) \
{ \
struct kmem_cache *cachep; \
cachep = translate_cache(s, x); \
if (!cachep) \
return; \
allocator_fn(cachep, x); \
trace_kmem_cache_free(_RET_IP_, x); \
} \
EXPORT_SYMBOL(kmem_cache_free)