Re: [v3 0/9] parallelized "struct page" zeroing

From: Michal Hocko
Date: Thu May 11 2017 - 04:05:50 EST


On Wed 10-05-17 11:19:43, David S. Miller wrote:
> From: Michal Hocko <mhocko@xxxxxxxxxx>
> Date: Wed, 10 May 2017 16:57:26 +0200
>
> > Have you measured that? I do not think it would be super hard to
> > measure. I would be quite surprised if this added much if anything at
> > all as the whole struct page should be in the cache line already. We do
> > set reference count and other struct members. Almost nobody should be
> > looking at our page at this time and stealing the cache line. On the
> > other hand a large memcpy will basically wipe everything away from the
> > cpu cache. Or am I missing something?
>
> I guess it might be clearer if you understand what the block
> initializing stores do on sparc64. There are no memory accesses at
> all.
>
> The cpu just zeros out the cache line, that's it.
>
> No L3 cache line is allocated. So this "wipe everything" behavior
> will not happen in the L3.

OK, good to know. My understanding of sparc64 is close to zero.

Anyway, do you agree that doing the struct page initialization along
with the other writes to it shouldn't add measurable overhead compared
to pre-zeroing a larger block of struct pages? We already hold the cache
line exclusively, so doing one 64B write along with a few other stores
should be basically the same.
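
To make the two variants concrete, here is a minimal userspace sketch.
Everything in it is an assumption made for illustration: struct fake_page
is a hypothetical 64B stand-in and not the real struct page layout, and
FAKE_FLAG_RESERVED, init_prezeroed() and init_inline() are made-up names,
not kernel functions.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte stand-in for struct page; NOT the kernel layout. */
struct fake_page {
    uint64_t flags;
    uint32_t refcount;
    uint32_t mapcount;
    uint64_t pad[6];
};

static_assert(sizeof(struct fake_page) == 64, "descriptor should be 64B");

/* Hypothetical flag value, used only so that a field gets written. */
#define FAKE_FLAG_RESERVED 0x1u

/*
 * Variant A: pre-zero the whole array in one pass, then come back and
 * set the individual fields in a second pass.
 */
void init_prezeroed(struct fake_page *pages, size_t n)
{
    memset(pages, 0, n * sizeof(*pages));
    for (size_t i = 0; i < n; i++) {
        pages[i].flags = FAKE_FLAG_RESERVED;
        pages[i].refcount = 1;
    }
}

/*
 * Variant B: zero each descriptor right before the stores that touch
 * it anyway, so each descriptor is visited only once.
 */
void init_inline(struct fake_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        memset(&pages[i], 0, sizeof(pages[i]));
        pages[i].flags = FAKE_FLAG_RESERVED;
        pages[i].refcount = 1;
    }
}
```

Both variants leave the array in the same state; the sketch only shows
the access pattern and says nothing about timing, which would still have
to be measured on the hardware in question.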
--
Michal Hocko
SUSE Labs