[PATCH] include/asm-i386/semaphore.h speedup

Colin Plumb (colin@nyx.net)
Wed, 13 Jan 1999 20:22:20 -0700 (MST)


I was looking at this code and noticed that the inefficiency the comment
laments is not necessary on a 486 and up, which have CMPXCHG and can do
the update atomically without taking the spinlock.

Could someone please check out the following? In particular, is the
lock prefix required for cmpxchg, or is it already implicit?
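
For concreteness, my reading of CMPXCHG with a memory destination can be
modelled in C roughly as below. This is a sketch only; cmpxchg_model() is
just an illustrative name, and on the real CPU this is a single instruction:

/*
 * Rough C model of "cmpxchg reg,mem".  The return value plays the
 * role of ZF.  Not a function on real hardware, of course.
 */
static inline int cmpxchg_model(int *mem, int *eax, int reg)
{
        if (*mem == *eax) {
                *mem = reg;     /* values match: store register, set ZF */
                return 1;
        }
        *eax = *mem;            /* no match: reload %eax, clear ZF */
        return 0;
}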

-- 
	-Colin

--- linux/include/asm-i386/semaphore.h	Wed Jan 13 16:59:27 1999
+++ linux/include/asm-i386/semaphore.h.new	Wed Jan 13 18:39:32 1999
@@ -48,19 +48,33 @@
  * These two _must_ execute atomically wrt each other.
  *
  * This is trivially done with load_locked/store_cond,
- * but on the x86 we need an external synchronizer.
+ * but on the 386 we need an external synchronizer.
+ * On the 486 and up, it can be done with CMPXCHG.
+ * (Which atomically compares memory to %eax, and if they match,
+ * sets ZF and stores the register to memory.  If they don't, it
+ * clears ZF and updates %eax with the actual memory contents.)
  */
 static inline void wake_one_more(struct semaphore * sem)
 {
+#ifdef CONFIG_M386
 	unsigned long flags;
 
 	spin_lock_irqsave(&semaphore_wake_lock, flags);
 	sem->waking++;
 	spin_unlock_irqrestore(&semaphore_wake_lock, flags);
+#else
+	__asm__ __volatile__(
+#ifdef __SMP__
+		"lock ; "
+#endif
+		"incl %0"
+		: "=m" (sem->waking) : "m" (sem->waking));
+#endif
 }
 
 static inline int waking_non_zero(struct semaphore *sem)
 {
+#ifdef CONFIG_M386
 	unsigned long flags;
 	int ret = 0;
 
@@ -71,6 +85,25 @@
 	}
 	spin_unlock_irqrestore(&semaphore_wake_lock, flags);
 	return ret;
+#else
+	int ret;
+	int temp;
+
+	__asm__ __volatile__(
+		"movl %2,%%eax\n"
+"1:\tmovl %%eax,%1\n\t"
+		"decl %1\n\t"
+		"js 2f\n\t"
+#ifdef __SMP__
+		"lock ; "
+#endif
+		"cmpxchg %1,%2\n\t"
+		"jnz 1b\n\t"
+		"movl $1,%%eax\n"
+"2:"
+		: "=a" (ret), "=&r" (temp), "=m" (sem->waking) : "m" (sem->waking));
+	return ret;
+#endif
 }
 
 /*
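
In C terms, the new waking_non_zero() amounts to the loop below. This is a
sketch only, assuming a hypothetical compare_and_swap() helper that stands
in for the locked CMPXCHG (store `new' into the counter and succeed iff it
still held `old'):

static inline int waking_non_zero_sketch(struct semaphore *sem)
{
        int old, new;

        do {
                old = sem->waking;      /* snapshot ("movl %2,%%eax") */
                new = old - 1;          /* try to take one wakeup */
                if (new < 0)
                        return 0;       /* none available ("js 2f") */
                /* raced with another CPU: retry ("jnz 1b") */
        } while (!compare_and_swap(&sem->waking, old, new));
        return 1;                       /* got one ("movl $1,%%eax") */
}

The nice property of CMPXCHG here is that on failure it has already loaded
the current memory contents into %eax, so the assembly version can branch
straight back to 1: without an extra load.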
