Re: [PATCH] x86: Run checksumming in parallel across multiple ALUs

From: Neil Horman
Date: Mon Oct 21 2013 - 13:46:24 EST


On Mon, Oct 21, 2013 at 10:31:38AM -0700, Eric Dumazet wrote:
> On Sun, 2013-10-20 at 17:29 -0400, Neil Horman wrote:
> > On Fri, Oct 18, 2013 at 02:15:52PM -0700, Eric Dumazet wrote:
> > > On Fri, 2013-10-18 at 16:11 -0400, Neil Horman wrote:
> > >
> > > > #define BUFSIZ_ORDER 4
> > > > #define BUFSIZ ((2 << BUFSIZ_ORDER) * (1024*1024*2))
> > > > static int __init csum_init_module(void)
> > > > {
> > > > int i;
> > > > __wsum sum = 0;
> > > > struct timespec start, end;
> > > > u64 time;
> > > > struct page *page;
> > > > u32 offset = 0;
> > > >
> > > > page = alloc_pages((GFP_TRANSHUGE & ~__GFP_MOVABLE), BUFSIZ_ORDER);
> > >
> > > Not sure what you are doing here, but it's not correct.
> > >
> > Why not? You asked for a test with 32 hugepages, so I allocated 32 hugepages.
>
> Not really. We cannot allocate 64 Mbytes in a single alloc_pages() call
> on x86. (MAX_ORDER = 11)
>
> You noticed nothing because you did not
> write anything on the 64Mbytes area (and corrupt memory) or
> use CONFIG_DEBUG_PAGEALLOC=y.
>
> Your code read data out of bounds and was lucky, that's all...
>
> You in fact allocated a page of (4096<<4) bytes
>
Gahh! I see what I did: I assumed the order passed to alloc_pages() counted
hugepages, but it still allocates 2^order normally sized pages, and my code
then treated them as huge. Stupid of me...

I'll have results on your version of the test case in just a bit.
Neil
