Re: [PATCH 2/5] vmevent: Convert from deferred timer to deferred work

From: Anton Vorontsov
Date: Fri Jun 08 2012 - 08:15:34 EST


On Fri, Jun 08, 2012 at 11:03:29AM +0000, leonid.moiseichuk@xxxxxxxxx wrote:
> > -----Original Message-----
> > From: ext Anton Vorontsov [mailto:cbouatmailru@xxxxxxxxx]
> > Sent: 08 June, 2012 13:35
> ...
> > > Context switches, parsing, and activity in userspace happen even when
> > > the memory situation has not changed.
> >
> > Sure, there is some additional overhead. I'm just saying that it is not drastic. It
> > would be like 100 sprintfs + 100 sscanfs + 2 context switches? Well, it is
> > unfortunate... but come on, today's phones are running X11 and Java. :-)
>
> Vmstat generation is not so trivial. Meminfo has even higher overhead. I just checked generation time on an idle device with an open/read test:
> - vmstat: min 30, avg 94, max 2746 us
> - meminfo: min 30, avg 65, max 15961 us
>
> In comparison, /proc/version under the same conditions: min 30, avg 41, max 1505 us
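
For reference, an open/read timing test of that kind could look roughly
like the sketch below. This is not the test that produced the numbers
above, only my guess at the method: time a full open + read-to-EOF + close
cycle and keep min/avg/max. measure_reads() and struct read_stats are
made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result holder; all values are in microseconds. */
struct read_stats {
	double min_us;
	double avg_us;
	double max_us;
};

/* Time one open + read-to-EOF + close cycle. Returns -1 on error. */
static int time_one_read(const char *path, double *elapsed_us)
{
	char buf[4096];
	struct timespec t0, t1;
	ssize_t n;
	int fd;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* data is discarded, only the generation cost matters */
	close(fd);
	if (n < 0)
		return -1;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	*elapsed_us = (t1.tv_sec - t0.tv_sec) * 1e6 +
		      (t1.tv_nsec - t0.tv_nsec) / 1e3;
	return 0;
}

/* Repeat the cycle 'iters' times and collect min/avg/max. */
static int measure_reads(const char *path, int iters, struct read_stats *st)
{
	double us, sum = 0.0;
	int i;

	if (iters <= 0)
		return -1;
	for (i = 0; i < iters; i++) {
		if (time_one_read(path, &us) < 0)
			return -1;
		if (i == 0 || us < st->min_us)
			st->min_us = us;
		if (i == 0 || us > st->max_us)
			st->max_us = us;
		sum += us;
	}
	st->avg_us = sum / iters;
	return 0;
}
```

Calling measure_reads("/proc/vmstat", 1000, &st) and the same for
/proc/meminfo and /proc/version would give figures comparable in kind to
the ones quoted, though the iteration count here is my own choice.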

Hm. I would have expected the average for meminfo to be much worse
than for vmstat (meminfo grabs some locks).

OK, if we consider a 100 ms interval, then this would be about 0.1%
overhead (94 us out of every 100 ms)? Not great, but still better than memcg:

http://lkml.org/lkml/2011/12/21/487

:-)
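
To spell out the arithmetic behind the 0.1% figure: it is just the cost of
one read divided by the polling interval. A minimal sketch, assuming one
read per interval and nothing else counted (no parsing, no context
switches); polling_overhead_pct() is a made-up helper name.

```c
/*
 * Share of wall time spent generating/reading the file, in percent,
 * assuming exactly one read of 'read_cost_us' per 'interval_ms'.
 */
static double polling_overhead_pct(double read_cost_us, double interval_ms)
{
	return read_cost_us / (interval_ms * 1000.0) * 100.0;
}
```

With the quoted averages, polling_overhead_pct(94, 100) gives 0.094 for
vmstat and polling_overhead_pct(65, 100) gives 0.065 for meminfo. A single
worst-case vmstat read (2746 us) would be 2.746% of that one interval.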

Personally? I'm all for saving that 0.1%, so I'm all for vmevent.
But vmevent, for example, is still broken on SMP, since it is costly
to update vm_stat there. And I see no way to fix this.

So, I guess the right approach would be to find ways to not depend on
frequent vm_stat updates (and thus reads).

Userland deferred timers (with infrequent reads from vmstat) plus
"userland vm pressure notifications" look promising for the userland
solution.

For an in-kernel solution it is all the same: a deferred timer that
reads vm_stat occasionally (the no-pressure case) plus in-kernel shrinker
notifications for fast reaction under pressure.
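
The timer half of that scheme can be sketched as a back-off policy: read
rarely while nothing happens, and snap back to the shortest interval when
a pressure notification arrives. This is only an illustration of the idea,
not code from the patch set; the function name, the doubling step and both
bounds are my assumptions.

```c
/* Assumed bounds, for illustration only. */
#define POLL_MIN_MS	100u	/* shortest interval, used under pressure */
#define POLL_MAX_MS	3200u	/* longest interval when the system is idle */

/*
 * Pick the next polling interval.
 * - Under pressure (e.g. a shrinker notification was seen): restart at
 *   the minimum so we react fast.
 * - Otherwise: double the interval, capped at the maximum.
 */
static unsigned int next_poll_interval_ms(unsigned int cur_ms,
					  int under_pressure)
{
	if (under_pressure)
		return POLL_MIN_MS;
	if (cur_ms < POLL_MIN_MS)
		return POLL_MIN_MS;
	if (cur_ms >= POLL_MAX_MS / 2)
		return POLL_MAX_MS;
	return cur_ms * 2;
}
```

Starting from 100 ms with no pressure this goes 200, 400, 800, 1600, 3200
and then stays at 3200 until the next notification resets it to 100.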

> > > In kernel space you can use a sliding timer (increasing interval) + shrinker.
> >
> > Well, w/ Minchan's idea, we can get shrinker notifications into the userland,
> > so the sliding timer thing would be still possible.
>
> Only as a post-shrinker action. Under memory stress or
> close-to-stress conditions, shrinkers are called very often; I saw up to
> 50 times per second.

Well, yes. But in userland you would just poll/select on the shrinker
notification fd, so you won't get more than you can (or want to) process.
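
Roughly what I mean, as a sketch: the consumer sleeps in poll() and, when
it wakes up, drains whatever has accumulated in one go, so a burst of
notifications costs one wakeup rather than one per event. No such
notification fd exists yet, so the fd here is a stand-in (a pipe works for
trying it out), and "one byte per notification" is purely my assumption.
wait_and_drain() is a made-up name.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to 'timeout_ms' for notifications on 'fd', then consume
 * everything pending (up to the buffer size) with a single read.
 *
 * Returns the number of bytes consumed (assumed: one byte per
 * notification), 0 on timeout, -1 on error.
 */
static int wait_and_drain(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	char buf[256];
	ssize_t n;
	int ret;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret <= 0)
		return ret;	/* 0: timeout, -1: poll() failed */
	if (!(pfd.revents & POLLIN))
		return -1;	/* hangup or error without data */

	n = read(fd, buf, sizeof(buf));
	return n < 0 ? -1 : (int)n;
}
```

If the producer fires 50 times while the consumer is busy, the next
wait_and_drain() call returns all 50 at once, and the consumer decides how
often it calls it at all.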

--
Anton Vorontsov
Email: cbouatmailru@xxxxxxxxx