Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: x86: Introduce EM_ASM_1

Replace fastops with C based stubs. There are a bunch of problems with
the current fastop infrastructure, almost all related to its special
calling convention, which bypasses the normal C-ABI.

There are two immediate problems with this at present:

- it relies on RET preserving EFLAGS, which the C-ABI does not guarantee.

- it circumvents compiler based control-flow-integrity checking
because it's all asm magic.

The first is a problem for some mitigations where the
x86_indirect_return_thunk needs to include non-trivial work that
clobbers EFLAGS (e.g. the Skylake call depth tracking thing).

The second is a problem because it presents a 'naked' indirect call on
kCFI builds, making it a prime target for control flow hijacking.

Additionally, given that a large chunk of virtual machine performance
relies on absolutely avoiding vmexit these days, this emulation stuff
just isn't that critical for performance anymore.

As such, replace the fastop calls with normal C functions using the
'execute' member.
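
As a rough illustration of what "normal C functions using the 'execute'
member" means, here is a minimal, self-contained sketch. The names
ex_ctxt, ex_opcode, EX_I, ex_em_inc, ex_em_dec and ex_dispatch are
hypothetical stand-ins, not the kernel's definitions; the only point is
that each table entry holds an ordinary function pointer with a C-ABI
signature, so the compiler sees (and, on kCFI builds, can type-check)
the indirect call.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified emulator context (assumption). */
struct ex_ctxt {
	unsigned long dst_val;
};

#define EX_CONTINUE 0	/* stand-in for X86EMUL_CONTINUE */

/* Each opcode entry carries a plain C callback in 'execute'. */
struct ex_opcode {
	unsigned int flags;
	int (*execute)(struct ex_ctxt *ctxt);
};

/* Stand-in for the I() table macro: flags plus an execute callback. */
#define EX_I(_f, _e) { .flags = (_f), .execute = (_e) }

static int ex_em_inc(struct ex_ctxt *ctxt)
{
	ctxt->dst_val++;
	return EX_CONTINUE;
}

static int ex_em_dec(struct ex_ctxt *ctxt)
{
	ctxt->dst_val--;
	return EX_CONTINUE;
}

static const struct ex_opcode ex_group[] = {
	EX_I(0, ex_em_inc),
	EX_I(0, ex_em_dec),
};

/* A normal indirect call through a typed function pointer. */
static int ex_dispatch(const struct ex_opcode *op, struct ex_ctxt *ctxt)
{
	if (!op->execute)
		return -1;
	return op->execute(ctxt);
}
```

Calling ex_dispatch(&ex_group[0], &c) increments c.dst_val through the
same kind of typed callback that the I() entries in the diff use.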

As noted by Paolo: this code was performance critical for pre-Westmere
(2010) and only when running big real mode code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://lkml.kernel.org/r/20250714103439.773781574@infradead.org

+58 -13
arch/x86/kvm/emulate.c
···
 		     X86_EFLAGS_PF|X86_EFLAGS_CF)

 #ifdef CONFIG_X86_64
-#define ON64(x) x
+#define ON64(x...) x
 #else
-#define ON64(x)
+#define ON64(x...)
 #endif
+
+#define EM_ASM_START(op) \
+static int em_##op(struct x86_emulate_ctxt *ctxt) \
+{ \
+	unsigned long flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF; \
+	int bytes = 1, ok = 1; \
+	if (!(ctxt->d & ByteOp)) \
+		bytes = ctxt->dst.bytes; \
+	switch (bytes) {
+
+#define __EM_ASM(str) \
+		asm("push %[flags]; popf \n\t" \
+		    "10: " str \
+		    "pushf; pop %[flags] \n\t" \
+		    "11: \n\t" \
+		    : "+a" (ctxt->dst.val), \
+		      "+d" (ctxt->src.val), \
+		      [flags] "+D" (flags), \
+		      "+S" (ok) \
+		    : "c" (ctxt->src2.val))
+
+#define __EM_ASM_1(op, dst) \
+		__EM_ASM(#op " %%" #dst " \n\t")
+
+#define __EM_ASM_1_EX(op, dst) \
+		__EM_ASM(#op " %%" #dst " \n\t" \
+			 _ASM_EXTABLE_TYPE_REG(10b, 11f, EX_TYPE_ZERO_REG, %%esi))
+
+#define __EM_ASM_2(op, dst, src) \
+		__EM_ASM(#op " %%" #src ", %%" #dst " \n\t")
+
+#define EM_ASM_END \
+	} \
+	ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); \
+	return !ok ? emulate_de(ctxt) : X86EMUL_CONTINUE; \
+}
+
+/* 1-operand, using "a" (dst) */
+#define EM_ASM_1(op) \
+	EM_ASM_START(op) \
+	case 1: __EM_ASM_1(op##b, al); break; \
+	case 2: __EM_ASM_1(op##w, ax); break; \
+	case 4: __EM_ASM_1(op##l, eax); break; \
+	ON64(case 8: __EM_ASM_1(op##q, rax); break;) \
+	EM_ASM_END

 /*
  * fastop functions have a special calling convention:
···

 FASTOP2W(imul);

-FASTOP1(not);
-FASTOP1(neg);
-FASTOP1(inc);
-FASTOP1(dec);
+EM_ASM_1(not);
+EM_ASM_1(neg);
+EM_ASM_1(inc);
+EM_ASM_1(dec);

 FASTOP2CL(rol);
 FASTOP2CL(ror);
···
 static const struct opcode group3[] = {
 	F(DstMem | SrcImm | NoWrite, em_test),
 	F(DstMem | SrcImm | NoWrite, em_test),
-	F(DstMem | SrcNone | Lock, em_not),
-	F(DstMem | SrcNone | Lock, em_neg),
+	I(DstMem | SrcNone | Lock, em_not),
+	I(DstMem | SrcNone | Lock, em_neg),
 	F(DstXacc | Src2Mem, em_mul_ex),
 	F(DstXacc | Src2Mem, em_imul_ex),
 	F(DstXacc | Src2Mem, em_div_ex),
···
 };

 static const struct opcode group4[] = {
-	F(ByteOp | DstMem | SrcNone | Lock, em_inc),
-	F(ByteOp | DstMem | SrcNone | Lock, em_dec),
+	I(ByteOp | DstMem | SrcNone | Lock, em_inc),
+	I(ByteOp | DstMem | SrcNone | Lock, em_dec),
 	N, N, N, N, N, N,
 };

 static const struct opcode group5[] = {
-	F(DstMem | SrcNone | Lock, em_inc),
-	F(DstMem | SrcNone | Lock, em_dec),
+	I(DstMem | SrcNone | Lock, em_inc),
+	I(DstMem | SrcNone | Lock, em_dec),
 	I(SrcMem | NearBranch | IsBranch, em_call_near_abs),
 	I(SrcMemFAddr | ImplicitOps | IsBranch, em_call_far),
 	I(SrcMem | NearBranch | IsBranch, em_jmp_abs),
···
 	/* 0x38 - 0x3F */
 	F6ALU(NoWrite, em_cmp), N, N,
 	/* 0x40 - 0x4F */
-	X8(F(DstReg, em_inc)), X8(F(DstReg, em_dec)),
+	X8(I(DstReg, em_inc)), X8(I(DstReg, em_dec)),
 	/* 0x50 - 0x57 */
 	X8(I(SrcReg | Stack, em_push)),
 	/* 0x58 - 0x5F */
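
The shape of the generated em_<op>() functions (pick an operand width,
run the operation, merge only the arithmetic flags back into eflags)
can be tried outside the kernel with a portable sketch. This is not the
patch's code: it swaps the inline asm for plain C arithmetic, and every
sk_/SK_ name below is a hypothetical stand-in. Only NOT (which leaves
the flags alone) and INC (which updates OF/SF/ZF/AF/PF but not CF) are
modelled.

```c
#include <stdint.h>

#define SK_BYTE_OP 0x1u		/* stand-in for ByteOp */
#define SK_CF 0x001ul
#define SK_PF 0x004ul
#define SK_AF 0x010ul
#define SK_ZF 0x040ul
#define SK_SF 0x080ul
#define SK_OF 0x800ul
#define SK_FLAGS_MASK (SK_OF | SK_SF | SK_ZF | SK_AF | SK_PF | SK_CF)

/* Hypothetical, simplified emulator context (assumption). */
struct sk_ctxt {
	unsigned long eflags;
	unsigned int d;
	struct { uint64_t val; int bytes; } dst;
};

/* Flags INC produces for a result of 'bits' width; CF is left untouched. */
static unsigned long sk_inc_flags(unsigned long flags, uint64_t res, int bits)
{
	uint64_t sign = 1ull << (bits - 1);
	unsigned int low = (unsigned int)(res & 0xff), ones = 0;

	for (; low; low >>= 1)
		ones += low & 1;

	flags &= ~(SK_OF | SK_SF | SK_ZF | SK_AF | SK_PF);
	if (res == 0)
		flags |= SK_ZF;
	if (res & sign)
		flags |= SK_SF;
	if (!(ones & 1))
		flags |= SK_PF;		/* even parity of the low byte */
	if ((res & 0xf) == 0)
		flags |= SK_AF;		/* carry out of bit 3 */
	if (res == sign)
		flags |= SK_OF;		/* max positive wrapped to min negative */
	return flags;
}

/* Mirrors EM_ASM_START: capture the flags, choose the operand width. */
#define SK_START(op) \
static int sk_em_##op(struct sk_ctxt *ctxt) \
{ \
	unsigned long flags = ctxt->eflags & SK_FLAGS_MASK; \
	int bytes = 1; \
	if (!(ctxt->d & SK_BYTE_OP)) \
		bytes = ctxt->dst.bytes; \
	switch (bytes) {

/* One width: operate on a value truncated to 'type'. */
#define SK_CASE(n, type, body) \
	case n: { type v = (type)ctxt->dst.val; body; ctxt->dst.val = v; break; }

/* Mirrors EM_ASM_END: merge only the arithmetic flags back. */
#define SK_END \
	} \
	ctxt->eflags = (ctxt->eflags & ~SK_FLAGS_MASK) | (flags & SK_FLAGS_MASK); \
	return 0; \
}

#define SK_OP1(op, body) \
	SK_START(op) \
	SK_CASE(1, uint8_t, body) \
	SK_CASE(2, uint16_t, body) \
	SK_CASE(4, uint32_t, body) \
	SK_CASE(8, uint64_t, body) \
	SK_END

SK_OP1(not, v = ~v)
SK_OP1(inc, (v = v + 1, flags = sk_inc_flags(flags, v, 8 * (int)sizeof(v))))
```

With d = SK_BYTE_OP and dst.val = 0xff, sk_em_inc() wraps the value to 0
and sets ZF, PF and AF while preserving whatever CF held on entry; bits
outside SK_FLAGS_MASK in eflags are never modified.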