Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

From: Avi Kivity
Date: Thu May 26 2011 - 07:03:46 EST


On 05/26/2011 12:48 PM, Ingo Molnar wrote:
* Ingo Molnar <mingo@xxxxxxx> wrote:

> You are missing the geniality of the tools/kvm/ thread pool! :-)
>
> It could be switched to a worker *process* model rather easily.
> Guest RAM and (a limited amount of) global resources would be
> shared via mmap(SHARED), but otherwise each worker process would
> have its own stack, its own subsystem-specific state, etc.

We get VM exit events in the vcpu threads which after minimal
processing pass much of the work to the thread pool. Most of the
virtio work (which could be a source of vulnerability - ringbuffers
are hard) is done in the worker task context.

It would be possible to further increase isolation there by also
passing the IO/MMIO decoding to the worker thread - but i'm not sure
that's truly needed. Most of the risk is where most of the code is -
and the code is in the worker task which interprets on-disk data,
protocols, etc.

I've suggested in the past to add an "mmiofd" facility to kvm, similar to ioeventfd. This is how it would work:

- userspace configures kvm with an mmio range and a pipe
- guest writes to that range write a packet to the pipe describing the write
- guest reads from that range write a packet to the pipe describing the read; kvm then waits for a reply packet with the result

The advantages would be:
- avoid heavyweight exit; kvm can simply wake up a thread on another core and resume processing
- writes can be pipelined, similar to how PCI writes are posted
- supports process separation

So far no one has posted an implementation but it should be pretty simple.
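
For illustration only, the packet flow described above might look roughly like the userspace sketch below. Nothing here is an existing kvm interface: the mmiofd_* names, the packet layout, and the use of a second pipe for replies (a pipe only carries data one way) are all assumptions made up for this example, and the access length is carried but not acted on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical wire format; not part of any real kvm ABI. */
enum mmiofd_op { MMIOFD_WRITE = 0, MMIOFD_READ = 1 };

struct mmiofd_packet {
	uint64_t addr;	/* guest physical address inside the range */
	uint64_t data;	/* value written, or the result in a reply */
	uint32_t len;	/* access size in bytes (ignored by this sketch) */
	uint32_t op;	/* enum mmiofd_op */
};

static_assert(sizeof(struct mmiofd_packet) == 24, "no padding expected");

/* Guest write: queue a packet and return at once (a posted write). */
static int mmiofd_post_write(int req_fd, uint64_t addr, uint64_t data,
			     uint32_t len)
{
	struct mmiofd_packet pkt = {
		.addr = addr, .data = data, .len = len, .op = MMIOFD_WRITE,
	};

	return write(req_fd, &pkt, sizeof(pkt)) == (ssize_t)sizeof(pkt) ? 0 : -1;
}

/* Guest read, step 1: queue a packet describing the read. */
static int mmiofd_send_read(int req_fd, uint64_t addr, uint32_t len)
{
	struct mmiofd_packet pkt = {
		.addr = addr, .len = len, .op = MMIOFD_READ,
	};

	return write(req_fd, &pkt, sizeof(pkt)) == (ssize_t)sizeof(pkt) ? 0 : -1;
}

/* Guest read, step 2: block until the reply packet arrives. */
static int mmiofd_wait_reply(int reply_fd, uint64_t *data)
{
	struct mmiofd_packet pkt;

	if (read(reply_fd, &pkt, sizeof(pkt)) != (ssize_t)sizeof(pkt))
		return -1;
	*data = pkt.data;
	return 0;
}

/* Device side: handle one packet against a small 64-bit register file. */
static int mmiofd_serve_one(int req_fd, int reply_fd, uint64_t base,
			    uint64_t *regs, size_t nregs)
{
	struct mmiofd_packet pkt;
	size_t idx;

	if (read(req_fd, &pkt, sizeof(pkt)) != (ssize_t)sizeof(pkt))
		return -1;
	if (pkt.addr < base)
		return -1;
	idx = (pkt.addr - base) / sizeof(uint64_t);
	if (idx >= nregs)
		return -1;

	if (pkt.op == MMIOFD_WRITE) {
		regs[idx] = pkt.data;	/* posted: no reply is sent */
		return 0;
	}

	pkt.data = regs[idx];		/* read: echo the packet back with data */
	return write(reply_fd, &pkt, sizeof(pkt)) == (ssize_t)sizeof(pkt) ? 0 : -1;
}
```

In a real setup the device side would run in a separate worker process that holds only the two pipe ends, which is where the process separation would come from; a socketpair could replace the two pipes.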

So we could not only isolate devices from each other, but we could
also protect the highly capable vcpu fd from exploits in devices -
worker threads generally do not need access to the vcpu fd IIRC.

Yes.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
