Re: [PATCH RFC 0/6] Implement per-processor data areas for i386.

From: Andi Kleen
Date: Sun Aug 27 2006 - 14:33:22 EST


On Sunday 27 August 2006 20:27, Andreas Mohr wrote:
> Hi,
>
> On Sun, Aug 27, 2006 at 08:04:38PM +0200, Andi Kleen wrote:
> >
> > > Something like that had to be done eventually about the inefficient
> > > current_thread_info() mechanism,
> >
> > Inefficient? It's two fast instructions. I won't call that inefficient.
>
> And that AGI stall?

What AGI stall?

[btw AGI stall is an outdated concept on modern x86 CPUs]

> > > I guess it's due to having tried that on an older installation with gcc 3.2,
> > > which probably does less efficient opcode merging of current_thread_info()
> > > requests compared to a current gcc version.
> >
> > gcc normally doesn't merge inline assembly at all.
>
> Depends on use of volatile, right?

No. It can only merge statements it knows something about, and it knows
nothing about what an inline assembly statement does.

> OK, so probably there was no merging of separate requests,
> but opcode intermingling could have played a role.

It seems to make some difference if gcc is able to move the asm
statements around and if they don't have memory clobbers. Memory
clobbers really seem to cause much worse code in the whole function.

But current_thread_info didn't have that.

-Andi
