Re: [PATCH v4 1/5] getcpu_cache system call: cache CPU number of running thread

From: H. Peter Anvin
Date: Tue Mar 01 2016 - 13:29:43 EST


On 02/27/16 16:39, Mathieu Desnoyers wrote:
>
> Very good points! Would the following interfaces be acceptable ?
>
> /* This structure needs to be aligned cache line size. */
> struct thread_local_abi {
> int32_t cpu_id; /* Aligned on 32-bit. */
> uint32_t rseq_seqnum; /* Aligned on 32-bit. */
> uint64_t rseq_post_commit_ip; /* Aligned on 64-bit. */
> /* Add new fields at the end. */
> } __attribute__((packed));
>

First of all, DO NOT use __attribute__((packed)). It buggers up the
alignment of the *entire structure*: the alignment of a packed structure
defaults to 1, so gcc will assume the whole structure is misaligned and
generate unaligned access instructions on architectures which need them.
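
A minimal sketch of the alignment effect, assuming a gcc-compatible
compiler in C11 mode. The names abi_packed and abi_natural are made up
for this illustration; the fields are the ones from the quoted proposal.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */
#include <stdint.h>

/* Hypothetical copies of the proposed layout, with and without packed. */
struct abi_packed {
	int32_t  cpu_id;
	uint32_t rseq_seqnum;
	uint64_t rseq_post_commit_ip;
} __attribute__((packed));

struct abi_natural {
	int32_t  cpu_id;
	uint32_t rseq_seqnum;
	uint64_t rseq_post_commit_ip;
};

/* packed drops the alignment of the whole structure to 1 ... */
static_assert(alignof(struct abi_packed) == 1,
	      "packed structure has alignment 1");

/* ... while the natural layout keeps at least 32-bit alignment. */
static_assert(alignof(struct abi_natural) >= alignof(uint32_t),
	      "natural structure is at least 32-bit aligned");

/* The fields already sit back to back, so packed saves no space here. */
static_assert(sizeof(struct abi_packed) == sizeof(struct abi_natural),
	      "packed does not change the size of this layout");
```

Since the two sizes are equal, packed changes nothing about this layout
except the alignment the compiler is allowed to assume.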

Sadly gcc doesn't currently have an __attribute__ to express "error out
on padding" which is what you actually want here.
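
One possible workaround, sketched here under the assumption of a C11
compiler, is to check the layout by hand with static_assert and
offsetof. This is not a gcc attribute, only a compile-time check that
has to be kept in sync with the field list manually.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <stdint.h>

struct thread_local_abi {
	int32_t  cpu_id;
	uint32_t rseq_seqnum;
	uint64_t rseq_post_commit_ip;
	/* Add new fields at the end. */
};

/* Each field must start exactly where the previous one ends. */
static_assert(offsetof(struct thread_local_abi, cpu_id) == 0,
	      "cpu_id is the first field");
static_assert(offsetof(struct thread_local_abi, rseq_seqnum) ==
	      sizeof(int32_t),
	      "no padding before rseq_seqnum");
static_assert(offsetof(struct thread_local_abi, rseq_post_commit_ip) ==
	      sizeof(int32_t) + sizeof(uint32_t),
	      "no padding before rseq_post_commit_ip");

/* The total size must equal the sum of the fields: no tail padding. */
static_assert(sizeof(struct thread_local_abi) ==
	      sizeof(int32_t) + sizeof(uint32_t) + sizeof(uint64_t),
	      "no tail padding");
```

If a later change introduces a hole, the build fails instead of silently
changing the ABI.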

You may, however, want to add an explicit alignment attribute to make
sure it is cache line aligned.
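
A sketch of what that could look like with gcc's aligned attribute. The
64-byte value and the name ASSUMED_CACHE_LINE_SIZE are assumptions made
for this example; the real cache line size depends on the architecture.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */
#include <stdint.h>

/* Assumption for illustration only: a 64-byte cache line. */
#define ASSUMED_CACHE_LINE_SIZE 64

struct thread_local_abi {
	int32_t  cpu_id;
	uint32_t rseq_seqnum;
	uint64_t rseq_post_commit_ip;
	/* Add new fields at the end. */
} __attribute__((aligned(ASSUMED_CACHE_LINE_SIZE)));

/* Every object of this type now starts on a cache line boundary. */
static_assert(alignof(struct thread_local_abi) == ASSUMED_CACHE_LINE_SIZE,
	      "structure is cache line aligned");

/* sizeof is rounded up to a multiple of the alignment. */
static_assert(sizeof(struct thread_local_abi) % ASSUMED_CACHE_LINE_SIZE == 0,
	      "size is a multiple of the cache line size");
```

Note that the attribute also rounds sizeof up to the alignment, so the
structure occupies a full cache line even though the fields only use 16
bytes of it.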

Second, as far as the 32/64 bit issue is concerned, you have to order
the fields so that 32-bit code always accesses the least significant
half of the 64-bit slot. This is probably the best way to do it:

#ifdef __LP64__
# define __FIELD_32_64(field,n) uint64_t field;
#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define __FIELD_32_64(field,n) uint32_t field, _unused ## n;
#else
# define __FIELD_32_64(field,n) uint32_t _unused ## n, field;
#endif

The predefined macros tested here (__LP64__, __BYTE_ORDER__ and
__ORDER_LITTLE_ENDIAN__) are all intrinsic to gcc (and hopefully to
gcc-compatible compilers) so there are no header file dependencies.
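
A sketch of how the macro might be used in the structure from the quoted
proposal. The macro is repeated so the example stands alone. The helper
slot_as_u64() is hypothetical and only models how a 64-bit reader would
see the slot; it assumes the padding word has been zeroed.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Macro as given above. It ends in a semicolon, so uses take none. */
#ifdef __LP64__
# define __FIELD_32_64(field,n) uint64_t field;
#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define __FIELD_32_64(field,n) uint32_t field, _unused ## n;
#else
# define __FIELD_32_64(field,n) uint32_t _unused ## n, field;
#endif

struct thread_local_abi {
	int32_t  cpu_id;
	uint32_t rseq_seqnum;
	__FIELD_32_64(rseq_post_commit_ip, 0)
	/* Add new fields at the end. */
};

/* Same 16-byte layout whether built as 32-bit or 64-bit code. */
static_assert(sizeof(struct thread_local_abi) == 16,
	      "layout is 16 bytes on both 32-bit and 64-bit");

/*
 * Hypothetical helper: read the 8-byte slot at offset 8 as one 64-bit
 * value, the way a 64-bit reader of the same memory would.
 */
static uint64_t slot_as_u64(const struct thread_local_abi *t)
{
	uint64_t v;

	memcpy(&v, (const unsigned char *)t + 8, sizeof(v));
	return v;
}
```

With the structure zero-initialized, a value stored through
rseq_post_commit_ip reads back unchanged through slot_as_u64() on
little-endian and big-endian 32-bit builds as well as on 64-bit builds,
which is the point of placing the padding word according to byte order.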

-hpa