Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE

When installing an emulated MMIO SPTE, do so *after* dropping/zapping the
existing SPTE (if it's shadow-present). While commit a54aa15c6bda3 was
correct that a _guest_ write can never convert a shadow-present SPTE
into an MMIO SPTE, it failed to account for writes to guest memory that
are outside the scope of KVM.

E.g. if host userspace modifies a shadowed gPTE to switch from a memslot
to emulated MMIO and then the guest hits a relevant page fault, KVM will
install the MMIO SPTE without first zapping the shadow-present SPTE.
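
To make the sequence concrete, here is a minimal sketch of the host-side
trigger. All names (hva_of_guest_pt, MMIO_GPA, the gPTE flag bits) are
hypothetical illustrations, not taken from the commit or its reproducer:

#include <stdint.h>

#define MMIO_GPA 0xc0000000ull	/* assumed: GPA covered by no memslot */

/*
 * hva_of_guest_pt: hypothetical host userspace mapping of the guest's
 * page-table page (the guest PT lives in a memslot, so the host can
 * write it directly, without any guest write ever occurring).
 */
static void retarget_gpte_to_mmio(volatile uint64_t *hva_of_guest_pt, int idx)
{
	/*
	 * The guest has already faulted through this gPTE, so KVM holds
	 * a shadow-present SPTE for it.  This host-side store bypasses
	 * KVM's write interception, which only sees _guest_ writes, so
	 * the stale SPTE is neither unsynced nor zapped here.
	 */
	hva_of_guest_pt[idx] = MMIO_GPA | 0x3;	/* present | writable */

	/*
	 * On the guest's next access through this gPTE, the page fault
	 * path resolves it to a no-slot pfn and, before this fix,
	 * installed an MMIO SPTE on top of the still shadow-present SPTE.
	 */
}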

------------[ cut here ]------------
is_shadow_present_pte(*sptep)
WARNING: arch/x86/kvm/mmu/mmu.c:484 at mark_mmio_spte+0xb2/0xc0 [kvm], CPU#0: vmx_ept_stale_r/4292
Modules linked in: kvm_intel kvm irqbypass
CPU: 0 UID: 1000 PID: 4292 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:mark_mmio_spte+0xb2/0xc0 [kvm]
Call Trace:
<TASK>
mmu_set_spte+0x237/0x440 [kvm]
ept_page_fault+0x535/0x7f0 [kvm]
kvm_mmu_do_page_fault+0xee/0x1f0 [kvm]
kvm_mmu_page_fault+0x8d/0x620 [kvm]
vmx_handle_exit+0x18c/0x5a0 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm]
kvm_vcpu_ioctl+0x2d5/0x980 [kvm]
__x64_sys_ioctl+0x8a/0xd0
do_syscall_64+0xb5/0x730
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x47fa3f
</TASK>
---[ end trace 0000000000000000 ]---

Reported-by: Alexander Bulekov <bkov@amazon.com>
Debugged-by: Alexander Bulekov <bkov@amazon.com>
Suggested-by: Fred Griffoul <fgriffo@amazon.co.uk>
Fixes: a54aa15c6bda3 ("KVM: x86/mmu: Handle MMIO SPTEs directly in mmu_set_spte()")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>

authored by Sean Christopherson, committed by Paolo Bonzini
aad885e7 6c6ba548

+8 -6
arch/x86/kvm/mmu/mmu.c
@@ -3044,12 +3044,6 @@
 	bool prefetch = !fault || fault->prefetch;
 	bool write_fault = fault && fault->write;
 
-	if (unlikely(is_noslot_pfn(pfn))) {
-		vcpu->stat.pf_mmio_spte_created++;
-		mark_mmio_spte(vcpu, sptep, gfn, pte_access);
-		return RET_PF_EMULATE;
-	}
-
 	if (is_shadow_present_pte(*sptep)) {
 		if (prefetch && is_last_spte(*sptep, level) &&
 		    pfn == spte_to_pfn(*sptep))
@@ -3065,6 +3071,14 @@
 			flush = true;
 		} else
 			was_rmapped = 1;
+	}
+
+	if (unlikely(is_noslot_pfn(pfn))) {
+		vcpu->stat.pf_mmio_spte_created++;
+		mark_mmio_spte(vcpu, sptep, gfn, pte_access);
+		if (flush)
+			kvm_flush_remote_tlbs_gfn(vcpu->kvm, gfn, level);
+		return RET_PF_EMULATE;
+	}
 
 	wrprot = make_spte(vcpu, sp, slot, pte_access, gfn, pfn, *sptep, prefetch,
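
One detail the reordering makes necessary: the shadow-present branch may
have zapped the old SPTE and set flush before the MMIO path runs, so the
early RET_PF_EMULATE return must still perform the pending TLB flush.
A condensed, illustrative restatement of the fixed flow (not a verbatim
excerpt of mmu_set_spte()):

	/* handle a stale shadow-present SPTE before anything else */
	if (is_shadow_present_pte(*sptep)) {
		/* drop/zap the old SPTE; sets 'flush' when stale TLB
		 * entries for the old translation may still exist */
	}

	/* only now is it safe to overwrite *sptep with an MMIO SPTE */
	if (unlikely(is_noslot_pfn(pfn))) {
		vcpu->stat.pf_mmio_spte_created++;
		mark_mmio_spte(vcpu, sptep, gfn, pte_access);
		if (flush)	/* don't lose the flush owed for the zap */
			kvm_flush_remote_tlbs_gfn(vcpu->kvm, gfn, level);
		return RET_PF_EMULATE;
	}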