Re: atomic RAM ?

From: Michael Schnell
Date: Mon Apr 12 2010 - 05:58:41 EST


On 04/09/2010 05:14 PM, Arnd Bergmann wrote:
>
>> I don't see how (to do FUTEX) a hashlock can be implemented in a way
>> that we stay in user mode when locking it and - if it's already locked -
>> we do a Kernel call for waiting on it being unlocked by another thread.
>> (This is what FUTEX does.)
>>
> You wouldn't. For user space, you can always do a syscall for atomic
> operations, like some architectures do that lack atomic instructions.
>
Of course. This is what pthread_mutex_...() does, if the arch does not
provide the appropriate atomic functions and/or does not have FUTEX. But
this discussion is about fast thread communication, thus avoiding
syscalls is essential to me.
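
To make the point concrete, here is a minimal sketch of the FUTEX
pattern described above: the uncontended lock and unlock stay entirely
in user mode, and the kernel is entered only to sleep or to wake a
sleeper. The names sketch_mutex, sketch_futex, sketch_lock and
sketch_unlock are hypothetical. The sketch assumes Linux, C11 atomics,
and that atomic_int has the same size and layout as the 32-bit word
the futex syscall expects. It is not the glibc implementation.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock word: 0 = unlocked, 1 = locked, 2 = locked with waiters. */
typedef struct { atomic_int state; } sketch_mutex;

/* Thin wrapper around the raw futex syscall (assumes atomic_int == 32-bit int). */
static long sketch_futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sketch_lock(sketch_mutex *m)
{
    int c = 0;

    /* Fast path: an uncontended 0 -> 1 transition needs no syscall. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: mark the lock as contended, then sleep in the kernel. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Returns immediately if the word is no longer 2. */
        sketch_futex(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void sketch_unlock(sketch_mutex *m)
{
    /* Fast path: if nobody was waiting (state was 1), no syscall is needed. */
    if (atomic_exchange(&m->state, 0) == 2)
        sketch_futex(&m->state, FUTEX_WAKE, 1);
}
```

Everything here depends on the atomic compare-and-exchange on an
arbitrary word in user memory, which is exactly the primitive the
architecture has to provide.
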
> You already need that with a non-SMP system anyway. As Alan explained,
> futex is only an optimization for a relatively uninteresting case
> (multi-threaded user applications), you really need to solve this for
> kernel space first, because the kernel is inherently multi-threaded.
>
I don't see why optimizing for speed and especially latency is
uninteresting (with embedded systems like the one I'm planning).

Multi-threaded user applications are exactly the case that is extremely
interesting to me, and that is why I started this discussion. The
non-SMP-Kernel (and non-FUTEX) case is already solved for NIOS
(supposedly by interrupt disabling). An SMP Linux has not been crafted
yet (and for me it has a much lower priority than decent user-space
multithreading, but of course it _is_ a valuable task).
> If you want to have atomics in user space, why not go all the way and
> make a small extension to your cache coherency logic to do load-locked/
> store-conditional as well.
Of course, doing load-locked/store-conditional as custom instructions
was an option I did consider. But as there is no way to access memory
through cache and MMU with custom instructions, I don't see how this
could be done: the way FUTEX currently works, the code can place the
DWORDs to be handled atomically anywhere in user-space memory. And of
course disabling the cache completely is not an option for a task that
aims to improve user-space performance.
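
A minimal sketch of what user code expects from such a primitive,
using C11 atomics as a stand-in for load-locked/store-conditional.
atomic_compare_exchange_weak() is allowed to fail spuriously, so it is
used in a retry loop just as a store-conditional would be. The helper
name sketch_fetch_add is hypothetical. Note that the word is an
ordinary variable at an arbitrary address, reached through the normal
cached and MMU-translated data path.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically add delta to *word, return the old value. */
static int sketch_fetch_add(atomic_int *word, int delta)
{
    /* Plays the role of the load-locked step. */
    int old = atomic_load(word);

    /*
     * Plays the role of the store-conditional step: the store only
     * happens if *word still holds `old`. On failure `old` is refreshed
     * with the current value and the loop retries.
     */
    while (!atomic_compare_exchange_weak(word, &old, old + delta))
        ;

    return old;
}
```
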
> As far as I understand, NIOS-2 does not
> come with a coherency logic normally, so you must already have made some
> extensions on that level in order to make the dcache coherent.
>
The user-space cache coherency is *somehow* provided by the current
(non-SMP) Kernel. I understand that special considerations are only
necessary when the MMU tables are modified. There is no SMP Kernel for
NIOS yet. In fact, I have no idea how cache coherency can/will be
handled in the SMP case.
> Typically, there is one dcache line that you can mark as exclusive-locked,
> which behaves just like exclusive, except that the store-conditional
> instruction only performs a store to an exclusive-locked cache line,
> setting a condition code if the cache line has transitioned to another
> state in the meantime.
>
AFAIK, the current NIOS cache hardware design does not provide
features such as external invalidation of cache lines. So those who
start the NIOS SMP project will need to deal with that by having Altera
enhance the NIOS design and/or by using additional "hardware" outside of
the CPU/cache/MMU block.
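
The exclusive-locked behaviour quoted above can be modelled in a few
lines of plain C. This is only a software toy with hypothetical names
(model_line, model_load_locked, model_snoop, model_store_conditional),
not a description of any real NIOS hardware. It shows why the scheme
depends on the cache line being invalidated from outside: without that
event the store-conditional could never detect interference.

```c
#include <stdbool.h>

/* Toy cache-line states; LINE_EXCL_LOCKED is the extra LL/SC state. */
enum line_state { LINE_INVALID, LINE_SHARED, LINE_EXCLUSIVE, LINE_EXCL_LOCKED };

struct model_line {
    enum line_state state;
    int data;
};

/* Load-locked: read the word and mark the line exclusive-locked. */
static int model_load_locked(struct model_line *l)
{
    l->state = LINE_EXCL_LOCKED;
    return l->data;
}

/* Another CPU touched the line: an external invalidate drops the lock. */
static void model_snoop(struct model_line *l)
{
    l->state = LINE_INVALID;
}

/* Store-conditional: store only if the line is still exclusive-locked. */
static bool model_store_conditional(struct model_line *l, int value)
{
    if (l->state != LINE_EXCL_LOCKED)
        return false;            /* condition code: failed, caller retries */
    l->data = value;
    l->state = LINE_EXCLUSIVE;   /* lock consumed, line stays exclusive */
    return true;
}
```
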

Thanks,
-Michael