Re: [PATCH 2/2] KVM: Fix writeback on page boundary that propagate changes in spite of #PF

From: Nadav Amit
Date: Sat Jan 14 2012 - 13:29:08 EST



On Jan 12, 2012, at 12:27 PM, Gleb Natapov wrote:

> On Thu, Jan 12, 2012 at 12:21:00PM +0200, Avi Kivity wrote:
>> On 01/12/2012 12:12 PM, Gleb Natapov wrote:
>>> On Wed, Jan 11, 2012 at 06:53:31PM +0200, Nadav Amit wrote:
>>>> Consider the case in which an instruction emulation writeback is performed on a page boundary.
>>>> In such case, if a #PF occurs on the second page, the write to the first page already occurred and cannot be retracted.
>>>> Therefore, validation of the second page access must be performed prior to writeback.
>>>>
>>>> Signed-off-by: Nadav Amit <nadav.amit@xxxxxxxxx>
>>>> ---
>>>> arch/x86/kvm/x86.c | 13 +++++++++++++
>>>> 1 files changed, 13 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 05fd3d7..7af3d67 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -3626,6 +3626,8 @@ struct read_write_emulator_ops {
>>>>                                 int bytes, void *val);
>>>>          int (*read_write_exit_mmio)(struct kvm_vcpu *vcpu, gpa_t gpa,
>>>>                                      void *val, int bytes);
>>>> +        gpa_t (*read_write_validate)(struct kvm_vcpu *vcpu, gva_t gva,
>>>> +                                     struct x86_exception *exception);
>>>>          bool write;
>>>>  };
>>>>
>>>> @@ -3686,6 +3688,7 @@ static struct read_write_emulator_ops write_emultor = {
>>>>          .read_write_emulate = write_emulate,
>>>>          .read_write_mmio = write_mmio,
>>>>          .read_write_exit_mmio = write_exit_mmio,
>>>> +        .read_write_validate = kvm_mmu_gva_to_gpa_write,
>>>>          .write = true,
>>>>  };
>>>>
>>>> @@ -3750,6 +3753,16 @@ int emulator_read_write(struct x86_emulate_ctxt *ctxt, unsigned long addr,
>>>>                  int rc, now;
>>>>
>>>>                  now = -addr & ~PAGE_MASK;
>>>> +
>>>> +                /* First check there is no page-fault on the next page */
>>>> +                if (ops->read_write_validate &&
>>>> +                    ops->read_write_validate(vcpu, addr+now, exception) ==
>>>> +                    UNMAPPED_GVA) {
>>>> +                        /* #PF on the first page should be reported first */
>>>> +                        ops->read_write_validate(vcpu, addr, exception);
>>>> +                        return X86EMUL_PROPAGATE_FAULT;
>>>> +                }
>>>> +
>>>> +
>>> This undoes optimization that vcpu_mmio_gva_to_gpa() has for handling
>>> mmio.
>>
>> Right. I suggest changing I/O to have two phases: first, translate the
>> virtual address into an array of two physical addresses; check
>> exceptions and report. Then do the actual writes.
>>
>>> Furthermore for common (non faulting) case we will check page
>>> tables twice on each write that crosses page boundary, first time here
>>> and second time in emulator_read_write_onepage().
>>
>> Those should be very uncommon.
>>
> Still it is better to have all the checks in one place like you suggest
> above.

I agree. I'll submit a revised version soon.
It is taking me a while because I keep finding more things that are broken in this emulation code.
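
To make the two-phase idea suggested above concrete (translate both pages and report any exception first, then do the actual writes), here is a minimal userspace sketch in C. It is not KVM code and not the revised patch: the model_* names, the toy guest with an identity mapping, and the per-page writable flag standing in for a page-table walk are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_NPAGES    4u

/* Hypothetical toy guest: a few pages, each either writable or unmapped. */
struct model_guest {
	bool    writable[MODEL_NPAGES];
	uint8_t mem[MODEL_NPAGES * MODEL_PAGE_SIZE];
};

/* One physically contiguous piece of an access that may cross a page. */
struct model_frag {
	size_t gpa;
	size_t len;
};

/* Stand-in for a page-table walk; returns false where a #PF would occur. */
static bool model_translate(const struct model_guest *g, size_t gva,
			    size_t *gpa)
{
	size_t page = gva / MODEL_PAGE_SIZE;

	if (page >= MODEL_NPAGES || !g->writable[page])
		return false;
	*gpa = gva;	/* identity mapping, an assumption of this model */
	return true;
}

/*
 * Two-phase write. Returns 0 on success. On a fault, returns -1, stores
 * the faulting address in *fault_gva, and leaves guest memory untouched.
 */
static int model_write(struct model_guest *g, size_t gva, const void *val,
		       size_t len, size_t *fault_gva)
{
	struct model_frag frag[2];
	const uint8_t *src = val;
	size_t nfrags = 0, done = 0, i;

	/* At most one page boundary may be crossed, so two fragments. */
	assert(len > 0 && len <= MODEL_PAGE_SIZE);

	/* Phase 1: translate every fragment, lowest address first. */
	while (done < len) {
		size_t addr = gva + done;
		size_t room = MODEL_PAGE_SIZE - addr % MODEL_PAGE_SIZE;
		size_t now = (len - done < room) ? len - done : room;

		if (!model_translate(g, addr, &frag[nfrags].gpa)) {
			*fault_gva = addr;
			return -1;
		}
		frag[nfrags++].len = now;
		done += now;
	}

	/* Phase 2: every translation succeeded, so commit the bytes. */
	for (i = 0; i < nfrags; i++) {
		memcpy(&g->mem[frag[i].gpa], src, frag[i].len);
		src += frag[i].len;
	}
	return 0;
}
```

In this model each page is translated exactly once, and a fault on the first page is reported before one on the second simply because fragments are translated in address order. A write that straddles a mapped and an unmapped page fails without modifying the mapped page.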

Regards,
Nadav
