Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: SVM: Fix detection of AMD Errata 1096

When the CPU raises #NPF on a guest data access and guest CR4.SMAP=1, it is
possible that the CPU microcode implementing DecodeAssist will fail
to read the bytes of the instruction which caused #NPF. This is AMD erratum
1096, and it happens because the CPU microcode reading instruction bytes
incorrectly attempts to read code as implicit supervisor-mode data
accesses (that is, just like it would read e.g. a TSS), which are
susceptible to SMAP faults. The microcode reads CS:RIP and if it is
a user-mode address according to the page tables, the processor
gives up and returns no instruction bytes. In this case, the
GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
return 0 instead of the correct guest instruction bytes.

Current KVM code attempts to detect and work around this erratum, but it
has multiple issues:

1) It mistakenly checks if guest CR4.SMAP=0 instead of guest CR4.SMAP=1,
which is required for encountering a SMAP fault.

2) It assumes SMAP faults can only occur when guest CPL==3.
However, if guest CR4.SMEP=0, the guest can execute an instruction
which resides in a user-accessible page with CPL<3 privilege. If this
instruction raises a #NPF on its data access, then the CPU DecodeAssist
microcode will still encounter a SMAP violation. Even though no sane
OS will do so (as it's an obvious privilege escalation vulnerability),
we still need to handle this semantically correctly on the KVM side.

Note that (2) *is* a useful optimization, because CR4.SMAP=1 is an easily
triggerable condition and guests usually enable SMAP together with SMEP.
If the vCPU has CR4.SMEP=1, the erratum could indeed be encountered only
at guest CPL==3; otherwise, the CPU would raise a SMEP fault to the guest
instead of #NPF. We keep this condition to avoid false positives in
the detection of the erratum.
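
The difference between the old and the fixed condition can be sketched as
two standalone predicates. This is a minimal illustration and not the kernel
code: the names old_check(), erratum_1096_possible() and check_examples() are
hypothetical, and the CR4 bits and CPL are passed in as plain values instead
of being read from a vCPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pre-fix check: the SMAP test is inverted
 * and SMEP is never consulted. */
bool old_check(bool cr4_smap, bool cr4_smep, int cpl)
{
	(void)cr4_smep;
	return cpl == 3 && !cr4_smap;
}

/* Hypothetical stand-in for the fixed check: SMAP must be set, and CPL==3
 * is only required when SMEP is also set. */
bool erratum_1096_possible(bool cr4_smap, bool cr4_smep, int cpl)
{
	bool is_user = (cpl == 3);

	return cr4_smap && (!cr4_smep || is_user);
}

/* A few cases taken from the reasoning in the commit message. */
void check_examples(void)
{
	/* SMAP=1, SMEP=1, CPL=3: the common case, missed by the old check. */
	assert(erratum_1096_possible(true, true, 3));
	assert(!old_check(true, true, 3));

	/* SMAP=1, SMEP=0, CPL=0: kernel-mode code on a user page. */
	assert(erratum_1096_possible(true, false, 0));

	/* SMAP=1, SMEP=1, CPL=0: would have been a SMEP fault, not #NPF. */
	assert(!erratum_1096_possible(true, true, 0));

	/* SMAP=0: no SMAP fault is possible, yet the old check fired. */
	assert(!erratum_1096_possible(false, false, 3));
	assert(old_check(false, false, 3));
}
```

With SMAP=0 the old check could report the erratum although no SMAP fault
can occur, and with SMAP=1 it never reported it at all.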

In addition, to avoid future confusion and improve code readability,
include details of the erratum in the code and not just in the commit message.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
Cc: Singh Brijesh <brijesh.singh@amd.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Authored by Liran Alon and committed by Paolo Bonzini
118154bd 0c5f81da

+35 -7
arch/x86/kvm/svm.c
···
 
 static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 {
-	bool is_user, smap;
-
-	is_user = svm_get_cpl(vcpu) == 3;
-	smap = !kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
+	unsigned long cr4 = kvm_read_cr4(vcpu);
+	bool smep = cr4 & X86_CR4_SMEP;
+	bool smap = cr4 & X86_CR4_SMAP;
+	bool is_user = svm_get_cpl(vcpu) == 3;
 
 	/*
-	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh
+	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
+	 *
+	 * Errata:
+	 * When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
+	 * possible that CPU microcode implementing DecodeAssist will fail
+	 * to read bytes of instruction which caused #NPF. In this case,
+	 * GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
+	 * return 0 instead of the correct guest instruction bytes.
+	 *
+	 * This happens because CPU microcode reading instruction bytes
+	 * uses a special opcode which attempts to read data using CPL=0
+	 * priviledges. The microcode reads CS:RIP and if it hits a SMAP
+	 * fault, it gives up and returns no instruction bytes.
+	 *
+	 * Detection:
+	 * We reach here in case CPU supports DecodeAssist, raised #NPF and
+	 * returned 0 in GuestIntrBytes field of the VMCB.
+	 * First, errata can only be triggered in case vCPU CR4.SMAP=1.
+	 * Second, if vCPU CR4.SMEP=1, errata could only be triggered
+	 * in case vCPU CPL==3 (Because otherwise guest would have triggered
+	 * a SMEP fault instead of #NPF).
+	 * Otherwise, vCPU CR4.SMEP=0, errata could be triggered by any vCPU CPL.
+	 * As most guests enable SMAP if they have also enabled SMEP, use above
+	 * logic in order to attempt minimize false-positive of detecting errata
+	 * while still preserving all cases semantic correctness.
+	 *
+	 * Workaround:
+	 * To determine what instruction the guest was executing, the hypervisor
+	 * will have to decode the instruction at the instruction pointer.
 	 *
 	 * In non SEV guest, hypervisor will be able to read the guest
 	 * memory to decode the instruction pointer when insn_len is zero
···
 	 * instruction pointer so we will not able to workaround it. Lets
 	 * print the error and request to kill the guest.
 	 */
-	if (is_user && smap) {
+	if (smap && (!smep || is_user)) {
 		if (!sev_guest(vcpu->kvm))
 			return true;
 
-		pr_err_ratelimited("KVM: Guest triggered AMD Erratum 1096\n");
+		pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
 		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
 	}
 
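
The three outcomes implied by the patched branch can be sketched as a small
standalone classifier. This is only an illustration under stated
assumptions: the enum npf_action labels and the functions
classify_zero_len_npf() and check_outcomes() are hypothetical, the real
function returns a bool, and what it returns when the condition does not
hold lies outside the hunk shown above, so NPF_NO_EMULATION is a guess at
that path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels; not part of the kernel source. */
enum npf_action {
	NPF_NO_EMULATION,	/* erratum conditions not met (assumed path) */
	NPF_EMULATE,		/* non-SEV guest: decode from guest memory */
	NPF_KILL_GUEST,		/* SEV guest: memory encrypted, triple fault */
};

/* Mirrors the patched condition, with all vCPU state passed as plain values. */
enum npf_action classify_zero_len_npf(bool smap, bool smep, int cpl,
				      bool sev_guest)
{
	bool is_user = (cpl == 3);

	if (!(smap && (!smep || is_user)))
		return NPF_NO_EMULATION;

	return sev_guest ? NPF_KILL_GUEST : NPF_EMULATE;
}

void check_outcomes(void)
{
	assert(classify_zero_len_npf(true, true, 3, false) == NPF_EMULATE);
	assert(classify_zero_len_npf(true, true, 3, true) == NPF_KILL_GUEST);
	assert(classify_zero_len_npf(false, true, 3, true) == NPF_NO_EMULATION);
}
```

The SEV case cannot fall back to emulation because the hypervisor cannot
read the encrypted guest memory to fetch the instruction bytes itself.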