Re: [patch 1/2] x86_64 page fault NMI-safe

From: Ingo Molnar
Date: Wed Aug 04 2010 - 03:22:15 EST



* Dave Chinner <david@xxxxxxxxxxxxx> wrote:

> On Tue, Aug 03, 2010 at 11:56:11AM -0700, Linus Torvalds wrote:
> > On Tue, Aug 3, 2010 at 10:18 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > FWIW I really utterly detest the whole concept of sub-buffers.
> >
> > I'm not quite sure why. Is it something fundamental, or just an
> > implementation issue?
> >
> > One thing that I think could easily make sense in a _lot_ of buffering
> > areas is the notion of a "continuation" buffer. We know we have cases
> > where we want to attach a lot of data to one particular event, but the
> > buffering itself is inevitably always going to have some limits on
> > atomicity etc. And quite often, the event that _generates_ the data is
> > not necessarily going to have all that data in one contiguous region,
> > and doing a scatter-gather memcpy to get it that way is not good
> > either.
> >
> > At the same time, I do _not_ believe that the kernel ring-buffer code
> > should handle pointers to sub-buffers etc, or worry about iovec-like
> > arrays of smaller ranges. So if _that_ is what you mean by "concept of
> > sub-buffers", then I agree with you.
> >
> > But what I do think might make a lot of sense is to allow buffer
> > fragments, and just teach user space to do de-fragmentation. Where it
> > would be important that the de-fragmentation really is all in user
> > space, and not really ever visible to the ring-buffer implementation
> > itself (and there would not, for example, be any guarantees that the
> > fragments would be contiguous - there could be other events in the
> > buffer in between fragments). Maybe we could even say that fragments
> > might be across different CPU ring-buffers, and user-space needs to
> > sort it out if it wants to (where "sort it out" literally would mean
> > having to sort and re-attach them in the right order, since there
> > wouldn't be any ordering between them).
> >
> > From a kernel perspective, the only thing you need for fragment
> > handling would be to have a buffer entry that just says "I'm fragment
> > number X of event ID Y". Nothing more. Everything else would be up to
> > the parser in user space to work out.
>
> Heh. For a moment there I thought you were describing the way XFS writes
> transactions into its log. Replace "CPU ring-buffers" with "in-core log
> buffers", "userspace parsing" with "log recovery" and "event ID" with
> "transaction ID", and the concept you describe is eerily similar. That
> includes the fact that transactions are not contiguous in the log, can
> interleave fragments between concurrent transaction commits and they can
> span multiple log buffers, too. It works pretty well for scaling concurrent
> writers....

That's certainly a good model when you have to stream into a
persistent-storage transaction log space with multiple writers.

The difference is that with instrumentation we are generally able to make
things per task or per cpu, so there are no real multi-CPU 'concurrent
writers' to deal with.

You don't have that luxury/simplicity when logging to storage, of course!
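
For reference, the fragment scheme quoted above ("I'm fragment number X of
event ID Y") could look roughly like this on the user-space side. This is a
minimal sketch only: struct frag_hdr, struct frag and defrag_event() are
made-up names for illustration, not anything in the existing ring-buffer
code, and the header layout is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fragment header: "I'm fragment number X of event ID Y". */
struct frag_hdr {
	uint64_t event_id;	/* Y: which event this piece belongs to */
	uint32_t frag_no;	/* X: position of this piece within the event */
	uint32_t len;		/* payload bytes carried by this fragment */
};

/* One fragment as collected by a (hypothetical) user-space parser. */
struct frag {
	struct frag_hdr hdr;
	const char *payload;
};

/* Order by event ID first, then by fragment number. */
static int frag_cmp(const void *a, const void *b)
{
	const struct frag *x = a, *y = b;

	if (x->hdr.event_id != y->hdr.event_id)
		return x->hdr.event_id < y->hdr.event_id ? -1 : 1;
	if (x->hdr.frag_no != y->hdr.frag_no)
		return x->hdr.frag_no < y->hdr.frag_no ? -1 : 1;
	return 0;
}

/*
 * Reassemble one event from fragments gathered in arbitrary order,
 * possibly from several per-CPU buffers. Sorts frags[] in place, then
 * concatenates the payloads of event_id into out.
 *
 * Returns the number of bytes written, or -1 if out is too small or the
 * fragment numbers seen are not 0, 1, 2, ... in sequence. A missing
 * trailing fragment is not detected, since the header has no total count.
 */
static long defrag_event(struct frag *frags, size_t n, uint64_t event_id,
			 char *out, size_t out_len)
{
	size_t used = 0;
	uint32_t expect = 0;
	size_t i;

	qsort(frags, n, sizeof(*frags), frag_cmp);

	for (i = 0; i < n; i++) {
		if (frags[i].hdr.event_id != event_id)
			continue;	/* fragment of some other event */
		if (frags[i].hdr.frag_no != expect)
			return -1;	/* gap or duplicate */
		if (frags[i].hdr.len > out_len - used)
			return -1;	/* output buffer too small */
		memcpy(out + used, frags[i].payload, frags[i].hdr.len);
		used += frags[i].hdr.len;
		expect++;
	}
	return (long)used;
}
```

All the sorting and re-attaching happens in the parser; the kernel side
would only ever have to stamp the two ID fields on each entry.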

Thanks,

Ingo