Re: [uml-devel] Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

From: Blaisorblade
Date: Fri Nov 04 2005 - 12:19:08 EST


(Note: I've removed a few CCs since there are too many of us; sorry for any
inconvenience.)

On Friday 04 November 2005 16:50, Rob Landley wrote:
> On Thursday 03 November 2005 21:26, Blaisorblade wrote:
> > > I was hoping that since the file was deleted from disk and is already
> > > getting _some_ special treatment (since it's a longstanding "poor man's
> > > shared memory" hack), that madvise wouldn't flush the data to disk, but
> > > would just zero it out. A bit optimistic on my part, I know. :)
> >
> > I read at some time that this optimization existed but was deemed
> > obsolete and removed.
> >
> > Why obsolete? Because... we have tmpfs! And that's the point. With
> > DONTNEED, we detach references from page tables, but the content is still
> > pinned: it _is_ the "disk"! (And you have TMPDIR on tmpfs, right?)
>
> If I had that kind of control over environment my build would always be
> deployed in (including root access), I wouldn't need UML. :)
Yep, that's right for your case... however, currently the majority of users
use tmpfs (I hope so, for their sake)...
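
As a rough illustration of the DONTNEED point quoted above, here is a minimal
sketch (Python 3.8+ on Linux, chosen only for brevity; the helper name is made
up for this example, and the file lives wherever tempfile puts it, tmpfs or
not). It writes through a shared mapping of an already-deleted file, calls
madvise(MADV_DONTNEED), and reads the data back:

```python
import mmap
import os
import tempfile


def dontneed_keeps_shared_file_data(directory=None):
    """Return the bytes read back from a shared file mapping after MADV_DONTNEED.

    Hypothetical helper for illustration only; it is not taken from UML.
    """
    page = mmap.PAGESIZE
    # Create the backing file and unlink it at once: the "poor man's shared
    # memory" trick. The open descriptor keeps the file alive.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.unlink(path)
        os.ftruncate(fd, page)
        with mmap.mmap(fd, page, flags=mmap.MAP_SHARED) as m:
            m[:5] = b"hello"               # dirty the page through the mapping
            m.madvise(mmap.MADV_DONTNEED)  # drop this process's references to it
            return bytes(m[:5])            # touching it again refaults the page
    finally:
        os.close(fd)
```

On Linux this should hand back the bytes that were written: the advice only
detaches the page-table references, while the page cache (which for tmpfs _is_
the "disk") still holds the content.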

> > I guess you refer to using frag. avoidance on the guest
>
> Yes. Moot point since Linus doesn't want it.
See the latest lwn.net issue (when it becomes available) on this topic. In
short, however, the real point is that we need this kind of support.

> Might be a performance issue if that gets introduced with per-page
> granularity,
I'm aware of this possibility, and in fact I've said "Frag. avoidance will be
nice to use". However, I'm not sure that the system call overhead is that big
compared to flushing the TLB entries...

But for now we don't have this issue: you don't do hotunplug frequently. When
somebody writes the auto-hotunplug management daemon, we could have a
problem with this...

> and how do you avoid giving back pages we're about to re-use?
Jeff's trick is to call the buddy allocator (__get_free_pages()) to get a full
page (it will do any needed work to free memory), so nobody else will use
it, and then madvise() it.

If a better API exists, that will be used.
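
A userspace analogy of that "take the page first, then release its backing"
order, as a sketch only: this is not Jeff's code, the function name is
hypothetical, and since I don't say above which madvise() advice is meant,
MADV_DONTNEED on a private anonymous mapping is assumed here (Python 3.8+ on
Linux):

```python
import mmap


def reserve_then_release():
    """Own one private anonymous page, dirty it, then drop its backing.

    Hypothetical userspace analogy; the guest kernel would use
    __get_free_pages() instead of mmap() to take the page out of circulation.
    """
    page = mmap.PAGESIZE
    m = mmap.mmap(-1, page, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        m[:4] = b"data"                # touching the page makes the kernel back it
        before = bytes(m[:4])
        m.madvise(mmap.MADV_DONTNEED)  # the range stays ours, the backing is dropped
        after = bytes(m[:4])           # Linux refaults a zero-filled page here
        return before, after
    finally:
        m.close()
```

Because the range is still owned by the caller after the madvise() call,
nobody else can be handed that page in the meantime; only its contents are
given up.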

> Oh well, bench it when it happens. (And in any case, it needs a tunable to
> beat the page cache into submission or there's no free memory to give back.
> If there's already such a tuneable, I haven't found it yet.)
I couldn't parse your sentence. The allocation will free memory, just as
happens whenever memory is needed.

However, look at /proc/sys/vm/swappiness, or use Con Kolivas's patches to find
new tunables and policies.
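
For the swappiness tunable mentioned above, a minimal way to check its current
value might look like this (a sketch; the function name is hypothetical, and
the path parameter exists only so that other files under /proc/sys/vm can be
read the same way):

```python
from pathlib import Path


def read_vm_tunable(path="/proc/sys/vm/swappiness"):
    """Return the integer stored in a VM sysctl file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or content that is not a plain integer.
        return None
```

Changing the value needs root, e.g. by writing a new number to the same file.
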
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade




