Re: [beta patch] SSE copy_page() / clear_page()

From: Manfred Spraul (manfred@colorfullife.com)
Date: Tue Feb 20 2001 - 16:16:15 EST


Pavel Machek wrote:
>
> > > > + __asm__ __volatile__(
> > > > + "mov %1, %0\n\t"
> > > > + : "=r" (i)
> > > > + : "r" (kaddr+offset)); /* load tlb entry */
> > > > + for(i=0;i<size;i+=64) {
> > > > + __asm__ __volatile__(
> > > > + "prefetchnta (%1, %0)\n\t"
> > > > + "prefetchnta 32(%1, %0)\n\t"
> > > > + : /* no output */
> > > > + : "r" (i), "r" (kaddr+offset));
> > > > + }
> > > > + }
> > > > left = __copy_to_user(desc->buf, kaddr + offset, size);
> > > > kunmap(page);
> > >
> > > This seems bogus -- you need to handle faults --
> > > i.e. __prefetchnta_to_user() ;-).
> >

Ahm. That's file_read_actor, not file_write_actor ;-)
I'm prefetching the kernel space buffer.

> > It wants wrapping nicely. A generic prefetch and prefetchw does help some other
> > cases (scheduler for one).
> >
> > Does the prefetch instruction fault on PIII/PIV then - the K7 one appears not
> > to be a source of faults
>
> My fault. I was told that prefetch instructions are always
> non-faulting.
>

But there is another problem:

The tlb preloading with a simple 'mov' is not enough:
the Pentium III cpu decodes the 'mov', begins to load the tlb entry -
this will take at least several dozend cpu ticks.

But the cpu continues to decode further instructions. It sees the
'prefetchnta', notices that the tlb entry is not loaded and ignores the
next prefetchnta's (prefetch without tlb is turned into NOP).

--
        Manfred
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Feb 23 2001 - 21:00:23 EST