Re: [PATCH 2/2] Add a thread cpu time implementation to vDSO

From: Andrew Lutomirski
Date: Mon Dec 12 2011 - 16:33:41 EST


On Mon, Dec 12, 2011 at 1:27 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> On Monday, 12 December 2011 at 13:19 -0800, Arun Sharma wrote:
>> On 12/12/11 12:13 PM, Eric Dumazet wrote:
>>
>> >> +
>> >> +struct vcpu_data {
>> >> +  struct vpercpu_data vpercpu[NR_CPUS];
>> >> +  unsigned int tsc_khz;
>> >> +  unsigned int tsc_unstable;
>> >> +};
>> >
>> > That's a showstopper.
>> >
>> > Try to compile the thing with NR_CPUS=4096 ?
>> >
>>
>> I get a link time error:
>>
>> ld: section .data..percpu [0000000001ac2000 -> 0000000001ad48ff]
>> overlaps section .vvar [0000000001ac0000 -> 0000000001b0083f]
>>
>> which I consider better than runtime memory corruption :)
>>
>> I could add a BUILD_BUG_ON() that tries to catch this earlier in the
>> compile process.
>>
>> Re: Fixing the build for NR_CPUS > 64
>>
>> How about something along the lines of the following:
>>
>> From: Arun Sharma <asharma@xxxxxx>
>> Date: Mon, 12 Dec 2011 13:13:43 -0800
>> Subject: [PATCH] Handle NR_CPUS > 64
>>
>> ---
>
> This only works if CONFIG_X86_L1_CACHE_SHIFT=6
>
> Some configurations have 128-byte cache lines
>
> But really most modern distros have NR_CPUS > 64
>
>
>

I kind of like the idea of a per-cpu vvar page.

Pros:
- It could be used for very fast and simple userspace TLS.
- Things like gettid() could become vsyscalls.
- Thread time, process time, etc.
- I would use it for my nefarious asynchronous sysret idea.
- The kernel could use the same infrastructure to replace swapgs.
  This might not be a win -- swapgs is probably very highly optimized on
  modern CPUs.
- Other than the core implementation, everything that uses it would be
  very easy to understand.

Cons:
- Uses a page, a pte, and everything higher up the page table hierarchy per cpu.
- Takes up a tlb slot.
- Getting it to work with Xen might be interesting.
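
To put rough numbers on both layouts, here is a minimal userspace sketch,
not kernel code. The fields of struct vpercpu_data are hypothetical (the
patch's real layout is not quoted in this thread), and the 64-byte
alignment and 4 KiB page size are assumptions matching
CONFIG_X86_L1_CACHE_SHIFT=6. static_assert stands in for the
BUILD_BUG_ON() check Arun mentioned, and percpu_vvar_bytes() is a
made-up helper for the "one page per cpu" cost.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE  4096u  /* assumed x86 base page size */
#define CACHE_LINE 64u    /* assumes CONFIG_X86_L1_CACHE_SHIFT=6 */

/* Hypothetical per-cpu record, padded to one cache line. */
struct vpercpu_data {
    _Alignas(CACHE_LINE) unsigned long long last_tsc;
    unsigned long long thread_ns;
};

/* Same shape as the quoted struct vcpu_data, for a given CPU count. */
#define DEFINE_VCPU_DATA(n)                 \
    struct vcpu_data_##n {                  \
        struct vpercpu_data vpercpu[n];     \
        unsigned int tsc_khz;               \
        unsigned int tsc_unstable;          \
    }

DEFINE_VCPU_DATA(32);
DEFINE_VCPU_DATA(4096);

/* Compile-time checks, analogous in spirit to BUILD_BUG_ON(). */
static_assert(sizeof(struct vcpu_data_32) <= PAGE_SIZE,
              "32 CPUs fit in one page");
static_assert(sizeof(struct vcpu_data_4096) > PAGE_SIZE,
              "4096 CPUs overflow a single page");

/* Memory for the alternative: one dedicated vvar page per CPU. */
static inline size_t percpu_vvar_bytes(unsigned int ncpus)
{
    return (size_t)ncpus * PAGE_SIZE;
}
```

Under these assumptions the single array needs about 256 KiB at
NR_CPUS=4096, far more than one page, while a page per cpu costs 16 MiB
at the same count (plus the page table overhead listed above).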

--Andy