Merge tag 'x86_sev_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

+118

Documentation/arch/x86/amd-memory-encryption.rst

···

 More details in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR

+Reverse Map Table (RMP)
+=======================
+
+The RMP is a structure in system memory that is used to ensure a one-to-one
+mapping between system physical addresses and guest physical addresses. Each
+page of memory that is potentially assignable to guests has one entry within
+the RMP.
+
+The RMP table can be either contiguous in memory or a collection of segments
+in memory.
+
+Contiguous RMP
+--------------
+
+Support for this form of the RMP is present when support for SEV-SNP is
+present, which can be determined using the CPUID instruction::
+
+	0x8000001f[eax]:
+		Bit[4] indicates support for SEV-SNP
+
+The location of the RMP is identified to the hardware through two MSRs::
+
+	0xc0010132 (RMP_BASE):
+		System physical address of the first byte of the RMP
+
+	0xc0010133 (RMP_END):
+		System physical address of the last byte of the RMP
+
+Hardware requires that RMP_BASE and (RMP_END + 1) be 8KB aligned, but SEV
+firmware increases the alignment requirement to require a 1MB alignment.
+
+The RMP consists of a 16KB region used for processor bookkeeping followed
+by the RMP entries, which are 16 bytes in size. The size of the RMP
+determines the range of physical memory that the hypervisor can assign to
+SEV-SNP guests. The RMP covers the system physical address from::
+
+	0 to ((RMP_END + 1 - RMP_BASE - 16KB) / 16B) x 4KB.
+
+The current Linux support relies on BIOS to allocate/reserve the memory for
+the RMP and to set RMP_BASE and RMP_END appropriately. Linux uses the MSR
+values to locate the RMP and determine the size of the RMP. The RMP must
+cover all of system memory in order for Linux to enable SEV-SNP.
+
+Segmented RMP
+-------------
+
+Segmented RMP support is a new way of representing the layout of an RMP.
+Initial RMP support required the RMP table to be contiguous in memory.
+RMP accesses from a NUMA node on which the RMP doesn't reside
+can take longer than accesses from a NUMA node on which the RMP resides.
+Segmented RMP support allows the RMP entries to be located on the same
+node as the memory the RMP is covering, potentially reducing latency
+associated with accessing an RMP entry associated with the memory. Each
+RMP segment covers a specific range of system physical addresses.
+
+Support for this form of the RMP can be determined using the CPUID
+instruction::
+
+	0x8000001f[eax]:
+		Bit[23] indicates support for segmented RMP
+
+If supported, segmented RMP attributes can be found using the CPUID
+instruction::
+
+	0x80000025[eax]:
+		Bits[5:0]  minimum supported RMP segment size
+		Bits[11:6] maximum supported RMP segment size
+
+	0x80000025[ebx]:
+		Bits[9:0]  number of cacheable RMP segment definitions
+		Bit[10]    indicates if the number of cacheable RMP segments
+		           is a hard limit
+
+To enable a segmented RMP, a new MSR is available::
+
+	0xc0010136 (RMP_CFG):
+		Bit[0]     indicates if segmented RMP is enabled
+		Bits[13:8] contains the size of memory covered by an RMP
+		           segment (expressed as a power of 2)
+
+The RMP segment size defined in the RMP_CFG MSR applies to all segments
+of the RMP. Therefore each RMP segment covers a specific range of system
+physical addresses. For example, if the RMP_CFG MSR value is 0x2401, then
+the RMP segment coverage value is 0x24 => 36, meaning the size of memory
+covered by an RMP segment is 64GB (1 << 36). So the first RMP segment
+covers physical addresses from 0 to 0xF_FFFF_FFFF, the second RMP segment
+covers physical addresses from 0x10_0000_0000 to 0x1F_FFFF_FFFF, etc.
+
+When a segmented RMP is enabled, RMP_BASE points to the RMP bookkeeping
+area as it does today (16K in size). However, instead of RMP entries
+beginning immediately after the bookkeeping area, there is a 4K RMP
+segment table (RST). Each entry in the RST is 8-bytes in size and represents
+an RMP segment::
+
+	Bits[19:0]  mapped size (in GB)
+	            The mapped size can be less than the defined segment size.
+	            A value of zero, indicates that no RMP exists for the range
+	            of system physical addresses associated with this segment.
+	Bits[51:20] segment physical address
+	            This address is left shift 20-bits (or just masked when
+	            read) to form the physical address of the segment (1MB
+	            alignment).
+
+The RST can hold 512 segment entries but can be limited in size to the number
+of cacheable RMP segments (CPUID 0x80000025_EBX[9:0]) if the number of cacheable
+RMP segments is a hard limit (CPUID 0x80000025_EBX[10]).
+
+The current Linux support relies on BIOS to allocate/reserve the memory for
+the segmented RMP (the bookkeeping area, RST, and all segments), build the RST
+and to set RMP_BASE, RMP_END, and RMP_CFG appropriately. Linux uses the MSR
+values to locate the RMP and determine the size and location of the RMP
+segments. The RMP must cover all of system memory in order for Linux to enable
+SEV-SNP.
+
+More details in the AMD64 APM Vol 2, section "15.36.3 Reverse Map Table",
+docID: 24593.
+
 Secure VM Service Module (SVSM)
 ===============================
+
 SNP provides a feature called Virtual Machine Privilege Levels (VMPL) which
 defines four privilege levels at which guest software can run. The most
 privileged level is 0 and numerically higher numbers have lesser privileges.
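
The coverage formula for a contiguous RMP and the RST entry layout described in the documentation hunk can be sketched in plain user-space C. This is an illustration under stated assumptions, not kernel code: the helper names (`rmp_covered_bytes`, `rst_entry_mapped_gb`, `rst_entry_segment_pa`) are hypothetical, and the segment address decoding follows the "just masked when read" reading of the text.

```c
#include <assert.h>
#include <stdint.h>

#define RMP_BOOKKEEPING_SZ  (16ULL * 1024)  /* 16KB processor bookkeeping area */
#define RMP_ENTRY_SZ        16ULL           /* bytes per RMP entry */
#define RMP_PAGE_SZ         4096ULL         /* each entry covers one 4KB page */

/* Bytes of system physical memory, starting at 0, covered by a contiguous RMP. */
static uint64_t rmp_covered_bytes(uint64_t rmp_base, uint64_t rmp_end)
{
	uint64_t table_sz = rmp_end + 1 - rmp_base;

	/* The table must at least hold the bookkeeping area. */
	assert(table_sz > RMP_BOOKKEEPING_SZ);
	return (table_sz - RMP_BOOKKEEPING_SZ) / RMP_ENTRY_SZ * RMP_PAGE_SZ;
}

/* RST entry, Bits[19:0]: mapped size in GB (0 means no RMP for this segment). */
static uint64_t rst_entry_mapped_gb(uint64_t entry)
{
	return entry & 0xfffffULL;
}

/* RST entry, Bits[51:20]: masking in place yields a 1MB-aligned address. */
static uint64_t rst_entry_segment_pa(uint64_t entry)
{
	return entry & 0x000ffffffff00000ULL;
}
```

With a 16KB bookkeeping area followed by 16MB of entries, the first helper reports 4GB of covered memory, which is the ratio the formula implies: every 16-byte entry accounts for one 4KB page.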

+1

arch/x86/Kconfig

···
 	select ARCH_HAS_CC_PLATFORM
 	select X86_MEM_ENCRYPT
 	select UNACCEPTED_MEMORY
+	select CRYPTO_LIB_AESGCM
 	help
 	  Say yes to enable support for the encryption of system memory.
 	  This requires an AMD processor that supports Secure Memory

+2 -1

arch/x86/boot/compressed/sev.c

···
  * by the guest kernel. As and when a new feature is implemented in the
  * guest kernel, a corresponding bit should be added to the mask.
  */
-#define SNP_FEATURES_PRESENT	MSR_AMD64_SNP_DEBUG_SWAP
+#define SNP_FEATURES_PRESENT	(MSR_AMD64_SNP_DEBUG_SWAP |	\
+				 MSR_AMD64_SNP_SECURE_TSC)

 u64 snp_get_unsupported_features(u64 status)
 {

+3 -1

arch/x86/coco/core.c

···
  * up under SME the trampoline area cannot be encrypted, whereas under SEV
  * the trampoline area must be encrypted.
  */
-
 static bool noinstr amd_cc_platform_has(enum cc_attr attr)
 {
 #ifdef CONFIG_AMD_MEM_ENCRYPT
···

 	case CC_ATTR_GUEST_SEV_SNP:
 		return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+
+	case CC_ATTR_GUEST_SNP_SECURE_TSC:
+		return sev_status & MSR_AMD64_SNP_SECURE_TSC;

 	case CC_ATTR_HOST_SEV_SNP:
 		return cc_flags.host_sev_snp;

+642 -10

arch/x86/coco/sev/core.c

···
 #include <linux/psp-sev.h>
 #include <linux/dmi.h>
 #include <uapi/linux/sev-guest.h>
+#include <crypto/gcm.h>

 #include <asm/init.h>
 #include <asm/cpu_entry_area.h>
···

 /* Secrets page physical address from the CC blob */
 static u64 secrets_pa __ro_after_init;
+
+/*
+ * For Secure TSC guests, the BSP fetches TSC_INFO using SNP guest messaging and
+ * initializes snp_tsc_scale and snp_tsc_offset. These values are replicated
+ * across the APs VMSA fields (TSC_SCALE and TSC_OFFSET).
+ */
+static u64 snp_tsc_scale __ro_after_init;
+static u64 snp_tsc_offset __ro_after_init;
+static u64 snp_tsc_freq_khz __ro_after_init;

 /* #VC handler runtime per-CPU data */
 struct sev_es_runtime_data {
···
 	vmsa->vmpl		= snp_vmpl;
 	vmsa->sev_features	= sev_status >> 2;

+	/* Populate AP's TSC scale/offset to get accurate TSC values. */
+	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)) {
+		vmsa->tsc_scale = snp_tsc_scale;
+		vmsa->tsc_offset = snp_tsc_offset;
+	}
+
 	/* Switch the page over to a VMSA page now that it is initialized */
 	ret = snp_set_vmsa(vmsa, caa, apic_id, true);
 	if (ret) {
···
 	return ES_OK;
 }

+/*
+ * TSC related accesses should not exit to the hypervisor when a guest is
+ * executing with Secure TSC enabled, so special handling is required for
+ * accesses of MSR_IA32_TSC and MSR_AMD64_GUEST_TSC_FREQ.
+ */
+static enum es_result __vc_handle_secure_tsc_msrs(struct pt_regs *regs, bool write)
+{
+	u64 tsc;
+
+	/*
+	 * GUEST_TSC_FREQ should not be intercepted when Secure TSC is enabled.
+	 * Terminate the SNP guest when the interception is enabled.
+	 */
+	if (regs->cx == MSR_AMD64_GUEST_TSC_FREQ)
+		return ES_VMM_ERROR;
+
+	/*
+	 * Writes: Writing to MSR_IA32_TSC can cause subsequent reads of the TSC
+	 *         to return undefined values, so ignore all writes.
+	 *
+	 * Reads: Reads of MSR_IA32_TSC should return the current TSC value, use
+	 *        the value returned by rdtsc_ordered().
+	 */
+	if (write) {
+		WARN_ONCE(1, "TSC MSR writes are verboten!\n");
+		return ES_OK;
+	}
+
+	tsc = rdtsc_ordered();
+	regs->ax = lower_32_bits(tsc);
+	regs->dx = upper_32_bits(tsc);
+
+	return ES_OK;
+}
+
 static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
 {
 	struct pt_regs *regs = ctxt->regs;
···
 	/* Is it a WRMSR? */
 	write = ctxt->insn.opcode.bytes[1] == 0x30;

-	if (regs->cx == MSR_SVSM_CAA)
+	switch (regs->cx) {
+	case MSR_SVSM_CAA:
 		return __vc_handle_msr_caa(regs, write);
+	case MSR_IA32_TSC:
+	case MSR_AMD64_GUEST_TSC_FREQ:
+		if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
+			return __vc_handle_secure_tsc_msrs(regs, write);
+		else
+			break;
+	default:
+		break;
+	}

 	ghcb_set_rcx(ghcb, regs->cx);
 	if (write) {
···
 }
 EXPORT_SYMBOL_GPL(snp_issue_svsm_attest_req);

-int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
-			    struct snp_guest_request_ioctl *rio)
+static int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
+				   struct snp_guest_request_ioctl *rio)
 {
 	struct ghcb_state state;
 	struct es_em_ctxt ctxt;
···

 	return ret;
 }
-EXPORT_SYMBOL_GPL(snp_issue_guest_request);

 static struct platform_device sev_guest_device = {
 	.name		= "sev-guest",
···

 static int __init snp_init_platform_device(void)
 {
-	struct sev_guest_platform_data data;
-
 	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
-		return -ENODEV;
-
-	data.secrets_gpa = secrets_pa;
-	if (platform_device_add_data(&sev_guest_device, &data, sizeof(data)))
 		return -ENODEV;

 	if (platform_device_register(&sev_guest_device))
···
 }
 arch_initcall(sev_sysfs_init);
 #endif // CONFIG_SYSFS
+
+static void free_shared_pages(void *buf, size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+	int ret;
+
+	if (!buf)
+		return;
+
+	ret = set_memory_encrypted((unsigned long)buf, npages);
+	if (ret) {
+		WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n");
+		return;
+	}
+
+	__free_pages(virt_to_page(buf), get_order(sz));
+}
+
+static void *alloc_shared_pages(size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+	struct page *page;
+	int ret;
+
+	page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
+	if (!page)
+		return NULL;
+
+	ret = set_memory_decrypted((unsigned long)page_address(page), npages);
+	if (ret) {
+		pr_err("failed to mark page shared, ret=%d\n", ret);
+		__free_pages(page, get_order(sz));
+		return NULL;
+	}
+
+	return page_address(page);
+}
+
+static u8 *get_vmpck(int id, struct snp_secrets_page *secrets, u32 **seqno)
+{
+	u8 *key = NULL;
+
+	switch (id) {
+	case 0:
+		*seqno = &secrets->os_area.msg_seqno_0;
+		key = secrets->vmpck0;
+		break;
+	case 1:
+		*seqno = &secrets->os_area.msg_seqno_1;
+		key = secrets->vmpck1;
+		break;
+	case 2:
+		*seqno = &secrets->os_area.msg_seqno_2;
+		key = secrets->vmpck2;
+		break;
+	case 3:
+		*seqno = &secrets->os_area.msg_seqno_3;
+		key = secrets->vmpck3;
+		break;
+	default:
+		break;
+	}
+
+	return key;
+}
+
+static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
+{
+	struct aesgcm_ctx *ctx;
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return NULL;
+
+	if (aesgcm_expandkey(ctx, key, keylen, AUTHTAG_LEN)) {
+		pr_err("Crypto context initialization failed\n");
+		kfree(ctx);
+		return NULL;
+	}
+
+	return ctx;
+}
+
+int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id)
+{
+	/* Adjust the default VMPCK key based on the executing VMPL level */
+	if (vmpck_id == -1)
+		vmpck_id = snp_vmpl;
+
+	mdesc->vmpck = get_vmpck(vmpck_id, mdesc->secrets, &mdesc->os_area_msg_seqno);
+	if (!mdesc->vmpck) {
+		pr_err("Invalid VMPCK%d communication key\n", vmpck_id);
+		return -EINVAL;
+	}
+
+	/* Verify that VMPCK is not zero. */
+	if (!memchr_inv(mdesc->vmpck, 0, VMPCK_KEY_LEN)) {
+		pr_err("Empty VMPCK%d communication key\n", vmpck_id);
+		return -EINVAL;
+	}
+
+	mdesc->vmpck_id = vmpck_id;
+
+	mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
+	if (!mdesc->ctx)
+		return -ENOMEM;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snp_msg_init);
+
+struct snp_msg_desc *snp_msg_alloc(void)
+{
+	struct snp_msg_desc *mdesc;
+	void __iomem *mem;
+
+	BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
+
+	mdesc = kzalloc(sizeof(struct snp_msg_desc), GFP_KERNEL);
+	if (!mdesc)
+		return ERR_PTR(-ENOMEM);
+
+	mem = ioremap_encrypted(secrets_pa, PAGE_SIZE);
+	if (!mem)
+		goto e_free_mdesc;
+
+	mdesc->secrets = (__force struct snp_secrets_page *)mem;
+
+	/* Allocate the shared page used for the request and response message. */
+	mdesc->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!mdesc->request)
+		goto e_unmap;
+
+	mdesc->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!mdesc->response)
+		goto e_free_request;
+
+	mdesc->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
+	if (!mdesc->certs_data)
+		goto e_free_response;
+
+	/* initial the input address for guest request */
+	mdesc->input.req_gpa = __pa(mdesc->request);
+	mdesc->input.resp_gpa = __pa(mdesc->response);
+	mdesc->input.data_gpa = __pa(mdesc->certs_data);
+
+	return mdesc;
+
+e_free_response:
+	free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
+e_free_request:
+	free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
+e_unmap:
+	iounmap(mem);
+e_free_mdesc:
+	kfree(mdesc);
+
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(snp_msg_alloc);
+
+void snp_msg_free(struct snp_msg_desc *mdesc)
+{
+	if (!mdesc)
+		return;
+
+	kfree(mdesc->ctx);
+	free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
+	free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
+	free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
+	iounmap((__force void __iomem *)mdesc->secrets);
+
+	memset(mdesc, 0, sizeof(*mdesc));
+	kfree(mdesc);
+}
+EXPORT_SYMBOL_GPL(snp_msg_free);
+
+/* Mutex to serialize the shared buffer access and command handling. */
+static DEFINE_MUTEX(snp_cmd_mutex);
+
+/*
+ * If an error is received from the host or AMD Secure Processor (ASP) there
+ * are two options. Either retry the exact same encrypted request or discontinue
+ * using the VMPCK.
+ *
+ * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
+ * encrypt the requests. The IV for this scheme is the sequence number. GCM
+ * cannot tolerate IV reuse.
+ *
+ * The ASP FW v1.51 only increments the sequence numbers on a successful
+ * guest<->ASP back and forth and only accepts messages at its exact sequence
+ * number.
+ *
+ * So if the sequence number were to be reused the encryption scheme is
+ * vulnerable. If the sequence number were incremented for a fresh IV the ASP
+ * will reject the request.
+ */
+static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
+{
+	pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
+		  mdesc->vmpck_id);
+	memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
+	mdesc->vmpck = NULL;
+}
+
+static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
+{
+	u64 count;
+
+	lockdep_assert_held(&snp_cmd_mutex);
+
+	/* Read the current message sequence counter from secrets pages */
+	count = *mdesc->os_area_msg_seqno;
+
+	return count + 1;
+}
+
+/* Return a non-zero on success */
+static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
+{
+	u64 count = __snp_get_msg_seqno(mdesc);
+
+	/*
+	 * The message sequence counter for the SNP guest request is a 64-bit
+	 * value but the version 2 of GHCB specification defines a 32-bit storage
+	 * for it. If the counter exceeds the 32-bit value then return zero.
+	 * The caller should check the return value, but if the caller happens to
+	 * not check the value and use it, then the firmware treats zero as an
+	 * invalid number and will fail the message request.
+	 */
+	if (count >= UINT_MAX) {
+		pr_err("request message sequence counter overflow\n");
+		return 0;
+	}
+
+	return count;
+}
+
+static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
+{
+	/*
+	 * The counter is also incremented by the PSP, so increment it by 2
+	 * and save in secrets page.
+	 */
+	*mdesc->os_area_msg_seqno += 2;
+}
+
+static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
+{
+	struct snp_guest_msg *resp_msg = &mdesc->secret_response;
+	struct snp_guest_msg *req_msg = &mdesc->secret_request;
+	struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
+	struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
+	struct aesgcm_ctx *ctx = mdesc->ctx;
+	u8 iv[GCM_AES_IV_SIZE] = {};
+
+	pr_debug("response [seqno %lld type %d version %d sz %d]\n",
+		 resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
+		 resp_msg_hdr->msg_sz);
+
+	/* Copy response from shared memory to encrypted memory. */
+	memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
+
+	/* Verify that the sequence counter is incremented by 1 */
+	if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
+		return -EBADMSG;
+
+	/* Verify response message type and version number. */
+	if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
+	    resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
+		return -EBADMSG;
+
+	/*
+	 * If the message size is greater than our buffer length then return
+	 * an error.
+	 */
+	if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > req->resp_sz))
+		return -EBADMSG;
+
+	/* Decrypt the payload */
+	memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
+	if (!aesgcm_decrypt(ctx, req->resp_buf, resp_msg->payload, resp_msg_hdr->msg_sz,
+			    &resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
+		return -EBADMSG;
+
+	return 0;
+}
+
+static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
+{
+	struct snp_guest_msg *msg = &mdesc->secret_request;
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+	struct aesgcm_ctx *ctx = mdesc->ctx;
+	u8 iv[GCM_AES_IV_SIZE] = {};
+
+	memset(msg, 0, sizeof(*msg));
+
+	hdr->algo = SNP_AEAD_AES_256_GCM;
+	hdr->hdr_version = MSG_HDR_VER;
+	hdr->hdr_sz = sizeof(*hdr);
+	hdr->msg_type = req->msg_type;
+	hdr->msg_version = req->msg_version;
+	hdr->msg_seqno = seqno;
+	hdr->msg_vmpck = req->vmpck_id;
+	hdr->msg_sz = req->req_sz;
+
+	/* Verify the sequence number is non-zero */
+	if (!hdr->msg_seqno)
+		return -ENOSR;
+
+	pr_debug("request [seqno %lld type %d version %d sz %d]\n",
+		 hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+
+	if (WARN_ON((req->req_sz + ctx->authsize) > sizeof(msg->payload)))
+		return -EBADMSG;
+
+	memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
+	aesgcm_encrypt(ctx, msg->payload, req->req_buf, req->req_sz, &hdr->algo,
+		       AAD_LEN, iv, hdr->authtag);
+
+	return 0;
+}
+
+static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+				  struct snp_guest_request_ioctl *rio)
+{
+	unsigned long req_start = jiffies;
+	unsigned int override_npages = 0;
+	u64 override_err = 0;
+	int rc;
+
+retry_request:
+	/*
+	 * Call firmware to process the request. In this function the encrypted
+	 * message enters shared memory with the host. So after this call the
+	 * sequence number must be incremented or the VMPCK must be deleted to
+	 * prevent reuse of the IV.
+	 */
+	rc = snp_issue_guest_request(req, &mdesc->input, rio);
+	switch (rc) {
+	case -ENOSPC:
+		/*
+		 * If the extended guest request fails due to having too
+		 * small of a certificate data buffer, retry the same
+		 * guest request without the extended data request in
+		 * order to increment the sequence number and thus avoid
+		 * IV reuse.
+		 */
+		override_npages = mdesc->input.data_npages;
+		req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
+
+		/*
+		 * Override the error to inform callers the given extended
+		 * request buffer size was too small and give the caller the
+		 * required buffer size.
+		 */
+		override_err = SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN);
+
+		/*
+		 * If this call to the firmware succeeds, the sequence number can
+		 * be incremented allowing for continued use of the VMPCK. If
+		 * there is an error reflected in the return value, this value
+		 * is checked further down and the result will be the deletion
+		 * of the VMPCK and the error code being propagated back to the
+		 * user as an ioctl() return code.
+		 */
+		goto retry_request;
+
+	/*
+	 * The host may return SNP_GUEST_VMM_ERR_BUSY if the request has been
+	 * throttled. Retry in the driver to avoid returning and reusing the
+	 * message sequence number on a different message.
+	 */
+	case -EAGAIN:
+		if (jiffies - req_start > SNP_REQ_MAX_RETRY_DURATION) {
+			rc = -ETIMEDOUT;
+			break;
+		}
+		schedule_timeout_killable(SNP_REQ_RETRY_DELAY);
+		goto retry_request;
+	}
+
+	/*
+	 * Increment the message sequence number. There is no harm in doing
+	 * this now because decryption uses the value stored in the response
+	 * structure and any failure will wipe the VMPCK, preventing further
+	 * use anyway.
+	 */
+	snp_inc_msg_seqno(mdesc);
+
+	if (override_err) {
+		rio->exitinfo2 = override_err;
+
+		/*
+		 * If an extended guest request was issued and the supplied certificate
+		 * buffer was not large enough, a standard guest request was issued to
+		 * prevent IV reuse. If the standard request was successful, return -EIO
+		 * back to the caller as would have originally been returned.
+		 */
+		if (!rc && override_err == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
+			rc = -EIO;
+	}
+
+	if (override_npages)
+		mdesc->input.data_npages = override_npages;
+
+	return rc;
+}
+
+int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+			   struct snp_guest_request_ioctl *rio)
+{
+	u64 seqno;
+	int rc;
+
+	guard(mutex)(&snp_cmd_mutex);
+
+	/* Check if the VMPCK is not empty */
+	if (!mdesc->vmpck || !memchr_inv(mdesc->vmpck, 0, VMPCK_KEY_LEN)) {
+		pr_err_ratelimited("VMPCK is disabled\n");
+		return -ENOTTY;
+	}
+
+	/* Get message sequence and verify that its a non-zero */
+	seqno = snp_get_msg_seqno(mdesc);
+	if (!seqno)
+		return -EIO;
+
+	/* Clear shared memory's response for the host to populate. */
+	memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
+
+	/* Encrypt the userspace provided payload in mdesc->secret_request. */
+	rc = enc_payload(mdesc, seqno, req);
+	if (rc)
+		return rc;
+
+	/*
+	 * Write the fully encrypted request to the shared unencrypted
+	 * request page.
+	 */
+	memcpy(mdesc->request, &mdesc->secret_request, sizeof(mdesc->secret_request));
+
+	rc = __handle_guest_request(mdesc, req, rio);
+	if (rc) {
+		if (rc == -EIO &&
+		    rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
+			return rc;
+
+		pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
+			 rc, rio->exitinfo2);
+
+		snp_disable_vmpck(mdesc);
+		return rc;
+	}
+
+	rc = verify_and_dec_payload(mdesc, req);
+	if (rc) {
+		pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
+		snp_disable_vmpck(mdesc);
+		return rc;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snp_send_guest_request);
+
+static int __init snp_get_tsc_info(void)
+{
+	struct snp_guest_request_ioctl *rio;
+	struct snp_tsc_info_resp *tsc_resp;
+	struct snp_tsc_info_req *tsc_req;
+	struct snp_msg_desc *mdesc;
+	struct snp_guest_req *req;
+	int rc = -ENOMEM;
+
+	tsc_req = kzalloc(sizeof(*tsc_req), GFP_KERNEL);
+	if (!tsc_req)
+		return rc;
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover
+	 * the authtag.
+	 */
+	tsc_resp = kzalloc(sizeof(*tsc_resp) + AUTHTAG_LEN, GFP_KERNEL);
+	if (!tsc_resp)
+		goto e_free_tsc_req;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		goto e_free_tsc_resp;
+
+	rio = kzalloc(sizeof(*rio), GFP_KERNEL);
+	if (!rio)
+		goto e_free_req;
+
+	mdesc = snp_msg_alloc();
+	if (IS_ERR_OR_NULL(mdesc))
+		goto e_free_rio;
+
+	rc = snp_msg_init(mdesc, snp_vmpl);
+	if (rc)
+		goto e_free_mdesc;
+
+	req->msg_version = MSG_HDR_VER;
+	req->msg_type = SNP_MSG_TSC_INFO_REQ;
+	req->vmpck_id = snp_vmpl;
+	req->req_buf = tsc_req;
+	req->req_sz = sizeof(*tsc_req);
+	req->resp_buf = (void *)tsc_resp;
+	req->resp_sz = sizeof(*tsc_resp) + AUTHTAG_LEN;
+	req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
+
+	rc = snp_send_guest_request(mdesc, req, rio);
+	if (rc)
+		goto e_request;
+
+	pr_debug("%s: response status 0x%x scale 0x%llx offset 0x%llx factor 0x%x\n",
+		 __func__, tsc_resp->status, tsc_resp->tsc_scale, tsc_resp->tsc_offset,
+		 tsc_resp->tsc_factor);
+
+	if (!tsc_resp->status) {
+		snp_tsc_scale = tsc_resp->tsc_scale;
+		snp_tsc_offset = tsc_resp->tsc_offset;
+	} else {
+		pr_err("Failed to get TSC info, response status 0x%x\n", tsc_resp->status);
+		rc = -EIO;
+	}
+
+e_request:
+	/* The response buffer contains sensitive data, explicitly clear it. */
+	memzero_explicit(tsc_resp, sizeof(*tsc_resp) + AUTHTAG_LEN);
+e_free_mdesc:
+	snp_msg_free(mdesc);
+e_free_rio:
+	kfree(rio);
+e_free_req:
+	kfree(req);
+e_free_tsc_resp:
+	kfree(tsc_resp);
+e_free_tsc_req:
+	kfree(tsc_req);
+
+	return rc;
+}
+
+void __init snp_secure_tsc_prepare(void)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+		return;
+
+	if (snp_get_tsc_info()) {
+		pr_alert("Unable to retrieve Secure TSC info from ASP\n");
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SECURE_TSC);
+	}
+
+	pr_debug("SecureTSC enabled");
+}
+
+static unsigned long securetsc_get_tsc_khz(void)
+{
+	return snp_tsc_freq_khz;
+}
+
+void __init snp_secure_tsc_init(void)
+{
+	unsigned long long tsc_freq_mhz;
+
+	if (!cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+		return;
+
+	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+	rdmsrl(MSR_AMD64_GUEST_TSC_FREQ, tsc_freq_mhz);
+	snp_tsc_freq_khz = (unsigned long)(tsc_freq_mhz * 1000);
+
+	x86_platform.calibrate_cpu = securetsc_get_tsc_khz;
+	x86_platform.calibrate_tsc = securetsc_get_tsc_khz;
+}
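
The sequence number handling in the guest messaging code above is what keeps the AES-GCM IV unique, so a small user-space model may help when reading it. The sketch below is an assumption-laden illustration rather than the kernel implementation: the function names are hypothetical, the stored counter is modelled as a bare `uint32_t`, and no encryption is performed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GCM_IV_SIZE 12  /* 96-bit IV, the size GCM_AES_IV_SIZE stands for */

/* The IV is the seqno followed by zero padding, so the seqno must fit. */
static_assert(sizeof(uint64_t) <= GCM_IV_SIZE, "seqno must fit in the IV");

/* Next request sequence number, or 0 once the 32-bit limit is reached. */
static uint64_t next_request_seqno(uint32_t stored)
{
	uint64_t count = (uint64_t)stored + 1;

	return count >= UINT32_MAX ? 0 : count;
}

/* A response is accepted only if it carries the request seqno plus one. */
static int response_seqno_ok(uint64_t req_seqno, uint64_t resp_seqno)
{
	return resp_seqno == req_seqno + 1;
}

/* After an exchange, also skip the number consumed by the response. */
static void advance_seqno(uint32_t *stored)
{
	*stored += 2;
}

/* Build the IV: seqno bytes first, remaining bytes zero. */
static void build_iv(uint8_t iv[GCM_IV_SIZE], uint64_t seqno)
{
	memset(iv, 0, GCM_IV_SIZE);
	memcpy(iv, &seqno, sizeof(seqno));
}
```

Starting from a stored counter of 0, the request uses 1, the response must carry 2, and the stored counter then becomes 2, so the next request uses 3. Requests and responses therefore never share an IV under the same key, which is the property the VMPCK-disabling logic protects.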

+10

arch/x86/coco/sev/shared.c

···
 	bool rdtscp = (exit_code == SVM_EXIT_RDTSCP);
 	enum es_result ret;

+	/*
+	 * The hypervisor should not be intercepting RDTSC/RDTSCP when Secure
+	 * TSC is enabled. A #VC exception will be generated if the RDTSC/RDTSCP
+	 * instructions are being intercepted. If this should occur and Secure
+	 * TSC is enabled, guest execution should be terminated as the guest
+	 * cannot rely on the TSC value provided by the hypervisor.
+	 */
+	if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
+		return ES_VMM_ERROR;
+
 	ret = sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, 0, 0);
 	if (ret != ES_OK)
 		return ret;
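
Read together with the MSR handling in arch/x86/coco/sev/core.c, the RDTSC check above amounts to a small decision table for TSC-related #VC exits. The following user-space sketch restates that table; the enum, the function names and the `TSC_*` action labels are hypothetical, and the value 0x10 for `MSR_IA32_TSC` is the architectural MSR number rather than something shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_TSC              0x00000010u
#define MSR_AMD64_GUEST_TSC_FREQ  0xc0010134u

enum tsc_action {
	TSC_FORWARD_TO_HV,  /* normal GHCB path, the hypervisor handles it */
	TSC_TERMINATE,      /* corresponds to ES_VMM_ERROR in the diff */
	TSC_IGNORE_WRITE,   /* warn once and report success */
	TSC_READ_LOCAL,     /* return the guest's own RDTSC value */
};

/* An intercepted RDTSC/RDTSCP is fatal once Secure TSC is enabled. */
static enum tsc_action rdtsc_exit_action(bool secure_tsc)
{
	return secure_tsc ? TSC_TERMINATE : TSC_FORWARD_TO_HV;
}

/* Decision for an intercepted RDMSR/WRMSR of one of the two TSC MSRs. */
static enum tsc_action tsc_msr_exit_action(bool secure_tsc, uint32_t msr, bool write)
{
	/* Other MSRs are outside the scope of this sketch. */
	assert(msr == MSR_IA32_TSC || msr == MSR_AMD64_GUEST_TSC_FREQ);

	if (!secure_tsc)
		return TSC_FORWARD_TO_HV;
	if (msr == MSR_AMD64_GUEST_TSC_FREQ)
		return TSC_TERMINATE;
	return write ? TSC_IGNORE_WRITE : TSC_READ_LOCAL;
}

/* RDMSR returns its 64-bit result split across EDX:EAX. */
static void split_msr_value(uint64_t val, uint32_t *eax, uint32_t *edx)
{
	assert(eax && edx);
	*eax = (uint32_t)val;
	*edx = (uint32_t)(val >> 32);
}
```

The common thread is that a Secure TSC guest never consumes a TSC value or frequency supplied by the hypervisor: it either reads the value itself or stops running.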

+2

arch/x86/include/asm/cpufeatures.h

···
 #define X86_FEATURE_V_TSC_AUX		(19*32+ 9) /* Virtual TSC_AUX */
 #define X86_FEATURE_SME_COHERENT	(19*32+10) /* AMD hardware-enforced cache coherency */
 #define X86_FEATURE_DEBUG_SWAP		(19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
+#define X86_FEATURE_RMPREAD		(19*32+21) /* RMPREAD instruction */
+#define X86_FEATURE_SEGMENTED_RMP	(19*32+23) /* Segmented RMP support */
 #define X86_FEATURE_SVSM		(19*32+28) /* "svsm" SVSM present */
 #define X86_FEATURE_HV_INUSE_WR_ALLOWED	(19*32+30) /* Allow Write to in-use hypervisor-owned pages */

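
The `(19*32+23)` style encoding packs a capability word index and a bit position into one number; word 19 holds the EAX output of CPUID leaf 0x8000001f, which is why the new SEGMENTED_RMP bit lines up with Bit[23] in the documentation hunk. A minimal user-space sketch of the decoding follows, with `FEATURE_WORD`, `FEATURE_BIT` and `word19_has_feature` as hypothetical helper names rather than kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Same encoding as the diff: word * 32 + bit. */
#define X86_FEATURE_RMPREAD        (19*32+21)
#define X86_FEATURE_SEGMENTED_RMP  (19*32+23)

#define FEATURE_WORD(f)  ((f) / 32)
#define FEATURE_BIT(f)   ((f) % 32)

static_assert(FEATURE_WORD(X86_FEATURE_SEGMENTED_RMP) == 19, "word 19");
static_assert(FEATURE_BIT(X86_FEATURE_SEGMENTED_RMP) == 23, "CPUID 0x8000001f EAX bit 23");

/* Test a word-19 feature against the raw EAX value of CPUID 0x8000001f. */
static int word19_has_feature(uint32_t cpuid_8000001f_eax, unsigned int feature)
{
	assert(FEATURE_WORD(feature) == 19);
	return (cpuid_8000001f_eax >> FEATURE_BIT(feature)) & 1;
}
```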

+6 -3

arch/x86/include/asm/msr-index.h

@@ -608,6 +608,7 @@
 #define MSR_AMD_PERF_CTL		0xc0010062
 #define MSR_AMD_PERF_STATUS		0xc0010063
 #define MSR_AMD_PSTATE_DEF_BASE		0xc0010064
+#define MSR_AMD64_GUEST_TSC_FREQ	0xc0010134
 #define MSR_AMD64_OSVW_ID_LENGTH	0xc0010140
 #define MSR_AMD64_OSVW_STATUS		0xc0010141
 #define MSR_AMD_PPIN_CTL		0xc00102f0
@@ -644,6 +645,7 @@
 #define MSR_AMD64_IBS_REG_COUNT_MAX	8 /* includes MSR_AMD64_IBSBRTARGET */
 #define MSR_AMD64_SVM_AVIC_DOORBELL	0xc001011b
 #define MSR_AMD64_VM_PAGE_FLUSH		0xc001011e
+#define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 #define MSR_AMD64_SEV_ES_GHCB		0xc0010130
 #define MSR_AMD64_SEV			0xc0010131
 #define MSR_AMD64_SEV_ENABLED_BIT	0
@@ -682,11 +684,12 @@
 #define MSR_AMD64_SNP_SMT_PROT		BIT_ULL(MSR_AMD64_SNP_SMT_PROT_BIT)
 #define MSR_AMD64_SNP_RESV_BIT		18
 #define MSR_AMD64_SNP_RESERVED_MASK	GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT)
-
-#define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
-
 #define MSR_AMD64_RMP_BASE		0xc0010132
 #define MSR_AMD64_RMP_END		0xc0010133
+#define MSR_AMD64_RMP_CFG		0xc0010136
+#define MSR_AMD64_SEG_RMP_ENABLED_BIT	0
+#define MSR_AMD64_SEG_RMP_ENABLED	BIT_ULL(MSR_AMD64_SEG_RMP_ENABLED_BIT)
+#define MSR_AMD64_RMP_SEGMENT_SHIFT(x)	(((x) & GENMASK_ULL(13, 8)) >> 8)

 #define MSR_SVSM_CAA			0xc001f000

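The RMP_CFG field extraction introduced in this file can be sketched in user-space C. This is a minimal sketch, not kernel code: `rmp_cfg_segment_shift` and `rmp_cfg_segment_size` are hypothetical helper names, the bit positions are copied from the macros in the hunk, and reading the shift as log2 of the segment size follows `set_rmp_segment_info()` in arch/x86/virt/svm/sev.c.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of RMP_CFG: segmented RMP enabled (mirrors MSR_AMD64_SEG_RMP_ENABLED). */
#define SEG_RMP_ENABLED		(1ULL << 0)

/* Bits 13:8 of RMP_CFG, shifted down (mirrors MSR_AMD64_RMP_SEGMENT_SHIFT). */
static inline unsigned int rmp_cfg_segment_shift(uint64_t rmp_cfg)
{
	return (unsigned int)((rmp_cfg >> 8) & 0x3f);
}

/* Hypothetical helper: segment size in bytes, or 0 for a contiguous RMP. */
static inline uint64_t rmp_cfg_segment_size(uint64_t rmp_cfg)
{
	if (!(rmp_cfg & SEG_RMP_ENABLED))
		return 0;

	return 1ULL << rmp_cfg_segment_shift(rmp_cfg);
}

/* Made-up register value for illustration: shift 36 with the enable bit set. */
static inline void rmp_cfg_example(void)
{
	uint64_t rmp_cfg = (36ULL << 8) | SEG_RMP_ENABLED;

	assert(rmp_cfg_segment_shift(rmp_cfg) == 36);
	assert(rmp_cfg_segment_size(rmp_cfg) == (1ULL << 36));	/* 64 GB */
}
```

With the enable bit clear, the size helper returns 0, which corresponds to the contiguous path chosen in `snp_probe_rmptable_info()`.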

+1

arch/x86/include/asm/sev-common.h

@@ -206,6 +206,7 @@
 #define GHCB_TERM_NO_SVSM		7	/* SVSM is not advertised in the secrets page */
 #define GHCB_TERM_SVSM_VMPL0		8	/* SVSM is present but has set VMPL to 0 */
 #define GHCB_TERM_SVSM_CAA		9	/* SVSM is present but CAA is not page aligned */
+#define GHCB_TERM_SECURE_TSC		10	/* Secure TSC initialization failed */

 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)


+38 -10

arch/x86/include/asm/sev.h

@@ -14,6 +14,7 @@
 #include <asm/insn.h>
 #include <asm/sev-common.h>
 #include <asm/coco.h>
+#include <asm/set_memory.h>

 #define GHCB_PROTOCOL_MIN	1ULL
 #define GHCB_PROTOCOL_MAX	2ULL
@@ -124,6 +125,9 @@
 #define AAD_LEN		48
 #define MSG_HDR_VER	1

+#define SNP_REQ_MAX_RETRY_DURATION	(60*HZ)
+#define SNP_REQ_RETRY_DELAY		(2*HZ)
+
 /* See SNP spec SNP_GUEST_REQUEST section for the structure */
 enum msg_type {
 	SNP_MSG_TYPE_INVALID = 0,
@@ -141,6 +145,9 @@
 	SNP_MSG_ABSORB_RSP,
 	SNP_MSG_VMRK_REQ,
 	SNP_MSG_VMRK_RSP,
+
+	SNP_MSG_TSC_INFO_REQ = 17,
+	SNP_MSG_TSC_INFO_RSP,

 	SNP_MSG_TYPE_MAX
 };
@@ -170,9 +177,20 @@
 	u8 payload[PAGE_SIZE - sizeof(struct snp_guest_msg_hdr)];
 } __packed;

-struct sev_guest_platform_data {
-	u64 secrets_gpa;
-};
+#define SNP_TSC_INFO_REQ_SZ	128
+
+struct snp_tsc_info_req {
+	u8 rsvd[SNP_TSC_INFO_REQ_SZ];
+} __packed;
+
+struct snp_tsc_info_resp {
+	u32 status;
+	u32 rsvd1;
+	u64 tsc_scale;
+	u64 tsc_offset;
+	u32 tsc_factor;
+	u8 rsvd2[100];
+} __packed;

 struct snp_guest_req {
 	void *req_buf;
@@ -253,6 +271,7 @@

 	u32 *os_area_msg_seqno;
 	u8 *vmpck;
+	int vmpck_id;
 };

 /*
@@ -445,8 +464,6 @@
 bool snp_init(struct boot_params *bp);
 void __noreturn snp_abort(void);
 void snp_dmi_setup(void);
-int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
-			    struct snp_guest_request_ioctl *rio);
 int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call, struct svsm_attest_call *input);
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
@@ -457,6 +474,15 @@
 void set_pte_enc_mask(pte_t *kpte, unsigned long pfn, pgprot_t new_prot);
 void snp_kexec_finish(void);
 void snp_kexec_begin(void);
+
+int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id);
+struct snp_msg_desc *snp_msg_alloc(void);
+void snp_msg_free(struct snp_msg_desc *mdesc);
+int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+			   struct snp_guest_request_ioctl *rio);
+
+void __init snp_secure_tsc_prepare(void);
+void __init snp_secure_tsc_init(void);

 #else	/* !CONFIG_AMD_MEM_ENCRYPT */

@@ -480,11 +506,6 @@
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
 static inline void snp_dmi_setup(void) { }
-static inline int snp_issue_guest_request(struct snp_guest_req *req, struct snp_req_data *input,
-					  struct snp_guest_request_ioctl *rio)
-{
-	return -ENOTTY;
-}
 static inline int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call, struct svsm_attest_call *input)
 {
 	return -ENOTTY;
@@ -498,6 +519,13 @@
 static inline void set_pte_enc_mask(pte_t *kpte, unsigned long pfn, pgprot_t new_prot) { }
 static inline void snp_kexec_finish(void) { }
 static inline void snp_kexec_begin(void) { }
+static inline int snp_msg_init(struct snp_msg_desc *mdesc, int vmpck_id) { return -1; }
+static inline struct snp_msg_desc *snp_msg_alloc(void) { return NULL; }
+static inline void snp_msg_free(struct snp_msg_desc *mdesc) { }
+static inline int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
+					 struct snp_guest_request_ioctl *rio) { return -ENODEV; }
+static inline void __init snp_secure_tsc_prepare(void) { }
+static inline void __init snp_secure_tsc_init(void) { }

 #endif	/* CONFIG_AMD_MEM_ENCRYPT */

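The two TSC_INFO message structures added in this header can be checked for size with a few compile-time assertions. A minimal user-space sketch, assuming GCC/Clang `__attribute__((packed))` as a stand-in for the kernel's `__packed` and `<stdint.h>` types in place of `u8`/`u32`/`u64`. The diff does not state that the firmware ABI requires a 128-byte response; the assertion only confirms that the declared fields add up to that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SNP_TSC_INFO_REQ_SZ	128

struct snp_tsc_info_req {
	uint8_t rsvd[SNP_TSC_INFO_REQ_SZ];
} __attribute__((packed));

struct snp_tsc_info_resp {
	uint32_t status;
	uint32_t rsvd1;
	uint64_t tsc_scale;
	uint64_t tsc_offset;
	uint32_t tsc_factor;
	uint8_t  rsvd2[100];
} __attribute__((packed));

/* 4 + 4 + 8 + 8 + 4 + 100 = 128 bytes, the same size as the request. */
static_assert(sizeof(struct snp_tsc_info_req) == 128, "request size");
static_assert(sizeof(struct snp_tsc_info_resp) == 128, "response size");
static_assert(offsetof(struct snp_tsc_info_resp, tsc_scale) == 8, "tsc_scale offset");
static_assert(offsetof(struct snp_tsc_info_resp, tsc_offset) == 16, "tsc_offset offset");
static_assert(offsetof(struct snp_tsc_info_resp, tsc_factor) == 24, "tsc_factor offset");
```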

+4 -2

arch/x86/include/asm/svm.h

@@ -417,7 +417,9 @@
 	u8 reserved_0x298[80];
 	u32 pkru;
 	u32 tsc_aux;
-	u8 reserved_0x2f0[24];
+	u64 tsc_scale;
+	u64 tsc_offset;
+	u8 reserved_0x300[8];
 	u64 rcx;
 	u64 rdx;
 	u64 rbx;
@@ -564,7 +566,7 @@
 BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x1c0);
 BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x248);
 BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x298);
-BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x2f0);
+BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x300);
 BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x320);
 BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x380);
 BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x3f0);
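The save-area change splits the old 24-byte reserved region at 0x2f0 into two 8-byte fields plus 8 reserved bytes, so every later member keeps its offset. A minimal sketch of that arithmetic, assuming a hypothetical `save_area_fragment` struct in which everything before offset 0x298 is collapsed into padding (the real `sev_es_save_area` has many more members):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down view of sev_es_save_area around the changed fields. */
struct save_area_fragment {
	uint8_t  pad_to_0x298[0x298];	/* stand-in for all earlier members */
	uint8_t  reserved_0x298[80];
	uint32_t pkru;
	uint32_t tsc_aux;
	uint64_t tsc_scale;		/* new field, replaces reserved bytes */
	uint64_t tsc_offset;		/* new field, replaces reserved bytes */
	uint8_t  reserved_0x300[8];	/* what is left of reserved_0x2f0[24] */
	uint64_t rcx;
} __attribute__((packed));

static_assert(offsetof(struct save_area_fragment, tsc_scale) == 0x2f0, "tsc_scale");
static_assert(offsetof(struct save_area_fragment, tsc_offset) == 0x2f8, "tsc_offset");
static_assert(offsetof(struct save_area_fragment, reserved_0x300) == 0x300, "reserved");
static_assert(offsetof(struct save_area_fragment, rcx) == 0x308, "rcx unchanged");
```

This is the same kind of check that the updated `BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0x300)` line performs for the renamed reserved field.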

+7 -2

arch/x86/kernel/cpu/amd.c

@@ -355,10 +355,15 @@
 		/*
 		 * RMP table entry format is not architectural and is defined by the
 		 * per-processor PPR. Restrict SNP support on the known CPU models
-		 * for which the RMP table entry format is currently defined for.
+		 * for which the RMP table entry format is currently defined or for
+		 * processors which support the architecturally defined RMPREAD
+		 * instruction.
 		 */
 		if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
-		    c->x86 >= 0x19 && snp_probe_rmptable_info()) {
+		    (cpu_feature_enabled(X86_FEATURE_ZEN3) ||
+		     cpu_feature_enabled(X86_FEATURE_ZEN4) ||
+		     cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
+		    snp_probe_rmptable_info()) {
 			cc_platform_set(CC_ATTR_HOST_SEV_SNP);
 		} else {
 			setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);

+4

arch/x86/kernel/tsc.c

@@ -30,6 +30,7 @@
 #include <asm/i8259.h>
 #include <asm/topology.h>
 #include <asm/uv/uv.h>
+#include <asm/sev.h>

 unsigned int __read_mostly cpu_khz;	/* TSC clocks / usec, not used here */
 EXPORT_SYMBOL(cpu_khz);
@@ -1515,6 +1516,9 @@
 	/* Don't change UV TSC multi-chassis synchronization */
 	if (is_early_uv_system())
 		return;
+
+	snp_secure_tsc_init();
+
 	if (!determine_cpu_tsc_frequencies(true))
 		return;
 	tsc_enable_sched_clock();

+2

arch/x86/mm/mem_encrypt.c

@@ -94,6 +94,8 @@
 	/* Call into SWIOTLB to update the SWIOTLB DMA buffers */
 	swiotlb_update_mem_attributes();

+	snp_secure_tsc_prepare();
+
 	print_mem_encrypt_feature_info();
 }


+3

arch/x86/mm/mem_encrypt_amd.c

@@ -541,6 +541,9 @@
 	 * kernel mapped.
 	 */
 	snp_update_svsm_ca();
+
+	if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
+		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
 }

 void __init mem_encrypt_free_decrypted_mem(void)

+570 -97

arch/x86/virt/svm/sev.c

··· 18 18 #include <linux/cpumask.h> 19 19 #include <linux/iommu.h> 20 20 #include <linux/amd-iommu.h> 21 + #include <linux/nospec.h> 21 22 22 23 #include <asm/sev.h> 23 24 #include <asm/processor.h> ··· 32 31 #include <asm/iommu.h> 33 32 34 33 /* 35 - * The RMP entry format is not architectural. The format is defined in PPR 36 - * Family 19h Model 01h, Rev B1 processor. 34 + * The RMP entry information as returned by the RMPREAD instruction. 37 35 */ 38 36 struct rmpentry { 37 + u64 gpa; 38 + u8 assigned :1, 39 + rsvd1 :7; 40 + u8 pagesize :1, 41 + hpage_region_status :1, 42 + rsvd2 :6; 43 + u8 immutable :1, 44 + rsvd3 :7; 45 + u8 rsvd4; 46 + u32 asid; 47 + } __packed; 48 + 49 + /* 50 + * The raw RMP entry format is not architectural. The format is defined in PPR 51 + * Family 19h Model 01h, Rev B1 processor. This format represents the actual 52 + * entry in the RMP table memory. The bitfield definitions are used for machines 53 + * without the RMPREAD instruction (Zen3 and Zen4), otherwise the "hi" and "lo" 54 + * fields are only used for dumping the raw data. 55 + */ 56 + struct rmpentry_raw { 39 57 union { 40 58 struct { 41 59 u64 assigned : 1, ··· 78 58 */ 79 59 #define RMPTABLE_CPU_BOOKKEEPING_SZ 0x4000 80 60 61 + /* 62 + * For a non-segmented RMP table, use the maximum physical addressing as the 63 + * segment size in order to always arrive at index 0 in the table. 64 + */ 65 + #define RMPTABLE_NON_SEGMENTED_SHIFT 52 66 + 67 + struct rmp_segment_desc { 68 + struct rmpentry_raw *rmp_entry; 69 + u64 max_index; 70 + u64 size; 71 + }; 72 + 73 + /* 74 + * Segmented RMP Table support. 75 + * - The segment size is used for two purposes: 76 + * - Identify the amount of memory covered by an RMP segment 77 + * - Quickly locate an RMP segment table entry for a physical address 78 + * 79 + * - The RMP segment table contains pointers to an RMP table that covers 80 + * a specific portion of memory. There can be up to 512 8-byte entries, 81 + * one pages worth. 
82 + */ 83 + #define RST_ENTRY_MAPPED_SIZE(x) ((x) & GENMASK_ULL(19, 0)) 84 + #define RST_ENTRY_SEGMENT_BASE(x) ((x) & GENMASK_ULL(51, 20)) 85 + 86 + #define RST_SIZE SZ_4K 87 + static struct rmp_segment_desc **rmp_segment_table __ro_after_init; 88 + static unsigned int rst_max_index __ro_after_init = 512; 89 + 90 + static unsigned int rmp_segment_shift; 91 + static u64 rmp_segment_size; 92 + static u64 rmp_segment_mask; 93 + 94 + #define RST_ENTRY_INDEX(x) ((x) >> rmp_segment_shift) 95 + #define RMP_ENTRY_INDEX(x) ((u64)(PHYS_PFN((x) & rmp_segment_mask))) 96 + 97 + static u64 rmp_cfg; 98 + 81 99 /* Mask to apply to a PFN to get the first PFN of a 2MB page */ 82 100 #define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT) 83 101 84 102 static u64 probed_rmp_base, probed_rmp_size; 85 - static struct rmpentry *rmptable __ro_after_init; 86 - static u64 rmptable_max_pfn __ro_after_init; 87 103 88 104 static LIST_HEAD(snp_leaked_pages_list); 89 105 static DEFINE_SPINLOCK(snp_leaked_pages_list_lock); ··· 172 116 __snp_enable(smp_processor_id()); 173 117 } 174 118 175 - #define RMP_ADDR_MASK GENMASK_ULL(51, 13) 176 - 177 - bool snp_probe_rmptable_info(void) 178 - { 179 - u64 rmp_sz, rmp_base, rmp_end; 180 - 181 - rdmsrl(MSR_AMD64_RMP_BASE, rmp_base); 182 - rdmsrl(MSR_AMD64_RMP_END, rmp_end); 183 - 184 - if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) { 185 - pr_err("Memory for the RMP table has not been reserved by BIOS\n"); 186 - return false; 187 - } 188 - 189 - if (rmp_base > rmp_end) { 190 - pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end); 191 - return false; 192 - } 193 - 194 - rmp_sz = rmp_end - rmp_base + 1; 195 - 196 - probed_rmp_base = rmp_base; 197 - probed_rmp_size = rmp_sz; 198 - 199 - pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n", 200 - rmp_base, rmp_end); 201 - 202 - return true; 203 - } 204 - 205 119 static void __init __snp_fixup_e820_tables(u64 pa) 206 120 { 207 121 if (IS_ALIGNED(pa, 
PMD_SIZE)) ··· 204 178 } 205 179 } 206 180 207 - void __init snp_fixup_e820_tables(void) 181 + static void __init fixup_e820_tables_for_segmented_rmp(void) 182 + { 183 + u64 pa, *rst, size, mapped_size; 184 + unsigned int i; 185 + 186 + __snp_fixup_e820_tables(probed_rmp_base); 187 + 188 + pa = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ; 189 + 190 + __snp_fixup_e820_tables(pa + RST_SIZE); 191 + 192 + rst = early_memremap(pa, RST_SIZE); 193 + if (!rst) 194 + return; 195 + 196 + for (i = 0; i < rst_max_index; i++) { 197 + pa = RST_ENTRY_SEGMENT_BASE(rst[i]); 198 + mapped_size = RST_ENTRY_MAPPED_SIZE(rst[i]); 199 + if (!mapped_size) 200 + continue; 201 + 202 + __snp_fixup_e820_tables(pa); 203 + 204 + /* 205 + * Mapped size in GB. Mapped size is allowed to exceed 206 + * the segment coverage size, but gets reduced to the 207 + * segment coverage size. 208 + */ 209 + mapped_size <<= 30; 210 + if (mapped_size > rmp_segment_size) 211 + mapped_size = rmp_segment_size; 212 + 213 + /* Calculate the RMP segment size (16 bytes/page mapped) */ 214 + size = PHYS_PFN(mapped_size) << 4; 215 + 216 + __snp_fixup_e820_tables(pa + size); 217 + } 218 + 219 + early_memunmap(rst, RST_SIZE); 220 + } 221 + 222 + static void __init fixup_e820_tables_for_contiguous_rmp(void) 208 223 { 209 224 __snp_fixup_e820_tables(probed_rmp_base); 210 225 __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size); 211 226 } 212 227 213 - /* 214 - * Do the necessary preparations which are verified by the firmware as 215 - * described in the SNP_INIT_EX firmware command description in the SNP 216 - * firmware ABI spec. 
217 - */ 218 - static int __init snp_rmptable_init(void) 228 + void __init snp_fixup_e820_tables(void) 219 229 { 220 - u64 max_rmp_pfn, calc_rmp_sz, rmptable_size, rmp_end, val; 221 - void *rmptable_start; 230 + if (rmp_cfg & MSR_AMD64_SEG_RMP_ENABLED) { 231 + fixup_e820_tables_for_segmented_rmp(); 232 + } else { 233 + fixup_e820_tables_for_contiguous_rmp(); 234 + } 235 + } 222 236 223 - if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP)) 224 - return 0; 237 + static bool __init clear_rmptable_bookkeeping(void) 238 + { 239 + void *bk; 225 240 226 - if (!amd_iommu_snp_en) 227 - goto nosnp; 241 + bk = memremap(probed_rmp_base, RMPTABLE_CPU_BOOKKEEPING_SZ, MEMREMAP_WB); 242 + if (!bk) { 243 + pr_err("Failed to map RMP bookkeeping area\n"); 244 + return false; 245 + } 246 + 247 + memset(bk, 0, RMPTABLE_CPU_BOOKKEEPING_SZ); 248 + 249 + memunmap(bk); 250 + 251 + return true; 252 + } 253 + 254 + static bool __init alloc_rmp_segment_desc(u64 segment_pa, u64 segment_size, u64 pa) 255 + { 256 + u64 rst_index, rmp_segment_size_max; 257 + struct rmp_segment_desc *desc; 258 + void *rmp_segment; 259 + 260 + /* Calculate the maximum size an RMP can be (16 bytes/page mapped) */ 261 + rmp_segment_size_max = PHYS_PFN(rmp_segment_size) << 4; 262 + 263 + /* Validate the RMP segment size */ 264 + if (segment_size > rmp_segment_size_max) { 265 + pr_err("Invalid RMP size 0x%llx for configured segment size 0x%llx\n", 266 + segment_size, rmp_segment_size_max); 267 + return false; 268 + } 269 + 270 + /* Validate the RMP segment table index */ 271 + rst_index = RST_ENTRY_INDEX(pa); 272 + if (rst_index >= rst_max_index) { 273 + pr_err("Invalid RMP segment base address 0x%llx for configured segment size 0x%llx\n", 274 + pa, rmp_segment_size); 275 + return false; 276 + } 277 + 278 + if (rmp_segment_table[rst_index]) { 279 + pr_err("RMP segment descriptor already exists at index %llu\n", rst_index); 280 + return false; 281 + } 282 + 283 + rmp_segment = memremap(segment_pa, segment_size, MEMREMAP_WB); 
284 + if (!rmp_segment) { 285 + pr_err("Failed to map RMP segment addr 0x%llx size 0x%llx\n", 286 + segment_pa, segment_size); 287 + return false; 288 + } 289 + 290 + desc = kzalloc(sizeof(*desc), GFP_KERNEL); 291 + if (!desc) { 292 + memunmap(rmp_segment); 293 + return false; 294 + } 295 + 296 + desc->rmp_entry = rmp_segment; 297 + desc->max_index = segment_size / sizeof(*desc->rmp_entry); 298 + desc->size = segment_size; 299 + 300 + rmp_segment_table[rst_index] = desc; 301 + 302 + return true; 303 + } 304 + 305 + static void __init free_rmp_segment_table(void) 306 + { 307 + unsigned int i; 308 + 309 + for (i = 0; i < rst_max_index; i++) { 310 + struct rmp_segment_desc *desc; 311 + 312 + desc = rmp_segment_table[i]; 313 + if (!desc) 314 + continue; 315 + 316 + memunmap(desc->rmp_entry); 317 + 318 + kfree(desc); 319 + } 320 + 321 + free_page((unsigned long)rmp_segment_table); 322 + 323 + rmp_segment_table = NULL; 324 + } 325 + 326 + /* Allocate the table used to index into the RMP segments */ 327 + static bool __init alloc_rmp_segment_table(void) 328 + { 329 + struct page *page; 330 + 331 + page = alloc_page(__GFP_ZERO); 332 + if (!page) 333 + return false; 334 + 335 + rmp_segment_table = page_address(page); 336 + 337 + return true; 338 + } 339 + 340 + static bool __init setup_contiguous_rmptable(void) 341 + { 342 + u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end; 228 343 229 344 if (!probed_rmp_size) 230 - goto nosnp; 345 + return false; 231 346 232 347 rmp_end = probed_rmp_base + probed_rmp_size - 1; 233 348 234 349 /* 235 - * Calculate the amount the memory that must be reserved by the BIOS to 350 + * Calculate the amount of memory that must be reserved by the BIOS to 236 351 * address the whole RAM, including the bookkeeping area. The RMP itself 237 352 * must also be covered. 
238 353 */ ··· 385 218 if (calc_rmp_sz > probed_rmp_size) { 386 219 pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n", 387 220 calc_rmp_sz, probed_rmp_size); 388 - goto nosnp; 221 + return false; 389 222 } 390 223 391 - rmptable_start = memremap(probed_rmp_base, probed_rmp_size, MEMREMAP_WB); 392 - if (!rmptable_start) { 393 - pr_err("Failed to map RMP table\n"); 394 - goto nosnp; 224 + if (!alloc_rmp_segment_table()) 225 + return false; 226 + 227 + /* Map only the RMP entries */ 228 + rmptable_segment = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ; 229 + rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ; 230 + 231 + if (!alloc_rmp_segment_desc(rmptable_segment, rmptable_size, 0)) { 232 + free_rmp_segment_table(); 233 + return false; 395 234 } 235 + 236 + return true; 237 + } 238 + 239 + static bool __init setup_segmented_rmptable(void) 240 + { 241 + u64 rst_pa, *rst, pa, ram_pa_end, ram_pa_max; 242 + unsigned int i, max_index; 243 + 244 + if (!probed_rmp_base) 245 + return false; 246 + 247 + if (!alloc_rmp_segment_table()) 248 + return false; 249 + 250 + rst_pa = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ; 251 + rst = memremap(rst_pa, RST_SIZE, MEMREMAP_WB); 252 + if (!rst) { 253 + pr_err("Failed to map RMP segment table addr 0x%llx\n", rst_pa); 254 + goto e_free; 255 + } 256 + 257 + pr_info("Segmented RMP using %lluGB segments\n", rmp_segment_size >> 30); 258 + 259 + ram_pa_max = max_pfn << PAGE_SHIFT; 260 + 261 + max_index = 0; 262 + ram_pa_end = 0; 263 + for (i = 0; i < rst_max_index; i++) { 264 + u64 rmp_segment, rmp_size, mapped_size; 265 + 266 + mapped_size = RST_ENTRY_MAPPED_SIZE(rst[i]); 267 + if (!mapped_size) 268 + continue; 269 + 270 + max_index = i; 271 + 272 + /* 273 + * Mapped size in GB. Mapped size is allowed to exceed the 274 + * segment coverage size, but gets reduced to the segment 275 + * coverage size. 
276 + */ 277 + mapped_size <<= 30; 278 + if (mapped_size > rmp_segment_size) { 279 + pr_info("RMP segment %u mapped size (0x%llx) reduced to 0x%llx\n", 280 + i, mapped_size, rmp_segment_size); 281 + mapped_size = rmp_segment_size; 282 + } 283 + 284 + rmp_segment = RST_ENTRY_SEGMENT_BASE(rst[i]); 285 + 286 + /* Calculate the RMP segment size (16 bytes/page mapped) */ 287 + rmp_size = PHYS_PFN(mapped_size) << 4; 288 + 289 + pa = (u64)i << rmp_segment_shift; 290 + 291 + /* 292 + * Some segments may be for MMIO mapped above system RAM. These 293 + * segments are used for Trusted I/O. 294 + */ 295 + if (pa < ram_pa_max) 296 + ram_pa_end = pa + mapped_size; 297 + 298 + if (!alloc_rmp_segment_desc(rmp_segment, rmp_size, pa)) 299 + goto e_unmap; 300 + 301 + pr_info("RMP segment %u physical address [0x%llx - 0x%llx] covering [0x%llx - 0x%llx]\n", 302 + i, rmp_segment, rmp_segment + rmp_size - 1, pa, pa + mapped_size - 1); 303 + } 304 + 305 + if (ram_pa_max > ram_pa_end) { 306 + pr_err("Segmented RMP does not cover full system RAM (expected 0x%llx got 0x%llx)\n", 307 + ram_pa_max, ram_pa_end); 308 + goto e_unmap; 309 + } 310 + 311 + /* Adjust the maximum index based on the found segments */ 312 + rst_max_index = max_index + 1; 313 + 314 + memunmap(rst); 315 + 316 + return true; 317 + 318 + e_unmap: 319 + memunmap(rst); 320 + 321 + e_free: 322 + free_rmp_segment_table(); 323 + 324 + return false; 325 + } 326 + 327 + static bool __init setup_rmptable(void) 328 + { 329 + if (rmp_cfg & MSR_AMD64_SEG_RMP_ENABLED) { 330 + return setup_segmented_rmptable(); 331 + } else { 332 + return setup_contiguous_rmptable(); 333 + } 334 + } 335 + 336 + /* 337 + * Do the necessary preparations which are verified by the firmware as 338 + * described in the SNP_INIT_EX firmware command description in the SNP 339 + * firmware ABI spec. 
340 + */ 341 + static int __init snp_rmptable_init(void) 342 + { 343 + unsigned int i; 344 + u64 val; 345 + 346 + if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP)) 347 + return 0; 348 + 349 + if (!amd_iommu_snp_en) 350 + goto nosnp; 351 + 352 + if (!setup_rmptable()) 353 + goto nosnp; 396 354 397 355 /* 398 356 * Check if SEV-SNP is already enabled, this can happen in case of ··· 527 235 if (val & MSR_AMD64_SYSCFG_SNP_EN) 528 236 goto skip_enable; 529 237 530 - memset(rmptable_start, 0, probed_rmp_size); 238 + /* Zero out the RMP bookkeeping area */ 239 + if (!clear_rmptable_bookkeeping()) { 240 + free_rmp_segment_table(); 241 + goto nosnp; 242 + } 243 + 244 + /* Zero out the RMP entries */ 245 + for (i = 0; i < rst_max_index; i++) { 246 + struct rmp_segment_desc *desc; 247 + 248 + desc = rmp_segment_table[i]; 249 + if (!desc) 250 + continue; 251 + 252 + memset(desc->rmp_entry, 0, desc->size); 253 + } 531 254 532 255 /* Flush the caches to ensure that data is written before SNP is enabled. */ 533 256 wbinvd_on_all_cpus(); ··· 553 246 on_each_cpu(snp_enable, NULL, 1); 554 247 555 248 skip_enable: 556 - rmptable_start += RMPTABLE_CPU_BOOKKEEPING_SZ; 557 - rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ; 558 - 559 - rmptable = (struct rmpentry *)rmptable_start; 560 - rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry) - 1; 561 - 562 249 cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL); 563 250 564 251 /* ··· 573 272 */ 574 273 device_initcall(snp_rmptable_init); 575 274 576 - static struct rmpentry *get_rmpentry(u64 pfn) 275 + static void set_rmp_segment_info(unsigned int segment_shift) 577 276 { 578 - if (WARN_ON_ONCE(pfn > rmptable_max_pfn)) 579 - return ERR_PTR(-EFAULT); 580 - 581 - return &rmptable[pfn]; 277 + rmp_segment_shift = segment_shift; 278 + rmp_segment_size = 1ULL << rmp_segment_shift; 279 + rmp_segment_mask = rmp_segment_size - 1; 582 280 } 583 281 584 - static struct rmpentry 
*__snp_lookup_rmpentry(u64 pfn, int *level) 585 - { 586 - struct rmpentry *large_entry, *entry; 282 + #define RMP_ADDR_MASK GENMASK_ULL(51, 13) 587 283 588 - if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP)) 284 + static bool probe_contiguous_rmptable_info(void) 285 + { 286 + u64 rmp_sz, rmp_base, rmp_end; 287 + 288 + rdmsrl(MSR_AMD64_RMP_BASE, rmp_base); 289 + rdmsrl(MSR_AMD64_RMP_END, rmp_end); 290 + 291 + if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) { 292 + pr_err("Memory for the RMP table has not been reserved by BIOS\n"); 293 + return false; 294 + } 295 + 296 + if (rmp_base > rmp_end) { 297 + pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end); 298 + return false; 299 + } 300 + 301 + rmp_sz = rmp_end - rmp_base + 1; 302 + 303 + /* Treat the contiguous RMP table as a single segment */ 304 + rst_max_index = 1; 305 + 306 + set_rmp_segment_info(RMPTABLE_NON_SEGMENTED_SHIFT); 307 + 308 + probed_rmp_base = rmp_base; 309 + probed_rmp_size = rmp_sz; 310 + 311 + pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n", 312 + rmp_base, rmp_end); 313 + 314 + return true; 315 + } 316 + 317 + static bool probe_segmented_rmptable_info(void) 318 + { 319 + unsigned int eax, ebx, segment_shift, segment_shift_min, segment_shift_max; 320 + u64 rmp_base, rmp_end; 321 + 322 + rdmsrl(MSR_AMD64_RMP_BASE, rmp_base); 323 + if (!(rmp_base & RMP_ADDR_MASK)) { 324 + pr_err("Memory for the RMP table has not been reserved by BIOS\n"); 325 + return false; 326 + } 327 + 328 + rdmsrl(MSR_AMD64_RMP_END, rmp_end); 329 + WARN_ONCE(rmp_end & RMP_ADDR_MASK, 330 + "Segmented RMP enabled but RMP_END MSR is non-zero\n"); 331 + 332 + /* Obtain the min and max supported RMP segment size */ 333 + eax = cpuid_eax(0x80000025); 334 + segment_shift_min = eax & GENMASK(5, 0); 335 + segment_shift_max = (eax & GENMASK(11, 6)) >> 6; 336 + 337 + /* Verify the segment size is within the supported limits */ 338 + segment_shift = MSR_AMD64_RMP_SEGMENT_SHIFT(rmp_cfg); 
339 + if (segment_shift > segment_shift_max || segment_shift < segment_shift_min) { 340 + pr_err("RMP segment size (%u) is not within advertised bounds (min=%u, max=%u)\n", 341 + segment_shift, segment_shift_min, segment_shift_max); 342 + return false; 343 + } 344 + 345 + /* Override the max supported RST index if a hardware limit exists */ 346 + ebx = cpuid_ebx(0x80000025); 347 + if (ebx & BIT(10)) 348 + rst_max_index = ebx & GENMASK(9, 0); 349 + 350 + set_rmp_segment_info(segment_shift); 351 + 352 + probed_rmp_base = rmp_base; 353 + probed_rmp_size = 0; 354 + 355 + pr_info("Segmented RMP base table physical range [0x%016llx - 0x%016llx]\n", 356 + rmp_base, rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ + RST_SIZE); 357 + 358 + return true; 359 + } 360 + 361 + bool snp_probe_rmptable_info(void) 362 + { 363 + if (cpu_feature_enabled(X86_FEATURE_SEGMENTED_RMP)) 364 + rdmsrl(MSR_AMD64_RMP_CFG, rmp_cfg); 365 + 366 + if (rmp_cfg & MSR_AMD64_SEG_RMP_ENABLED) 367 + return probe_segmented_rmptable_info(); 368 + else 369 + return probe_contiguous_rmptable_info(); 370 + } 371 + 372 + /* 373 + * About the array_index_nospec() usage below: 374 + * 375 + * This function can get called by exported functions like 376 + * snp_lookup_rmpentry(), which is used by the KVM #PF handler, among 377 + * others, and since the @pfn passed in cannot always be trusted, 378 + * speculation should be stopped as a protective measure. 
379 + */ 380 + static struct rmpentry_raw *get_raw_rmpentry(u64 pfn) 381 + { 382 + u64 paddr, rst_index, segment_index; 383 + struct rmp_segment_desc *desc; 384 + 385 + if (!rmp_segment_table) 589 386 return ERR_PTR(-ENODEV); 590 387 591 - entry = get_rmpentry(pfn); 592 - if (IS_ERR(entry)) 593 - return entry; 388 + paddr = pfn << PAGE_SHIFT; 389 + 390 + rst_index = RST_ENTRY_INDEX(paddr); 391 + if (unlikely(rst_index >= rst_max_index)) 392 + return ERR_PTR(-EFAULT); 393 + 394 + rst_index = array_index_nospec(rst_index, rst_max_index); 395 + 396 + desc = rmp_segment_table[rst_index]; 397 + if (unlikely(!desc)) 398 + return ERR_PTR(-EFAULT); 399 + 400 + segment_index = RMP_ENTRY_INDEX(paddr); 401 + if (unlikely(segment_index >= desc->max_index)) 402 + return ERR_PTR(-EFAULT); 403 + 404 + segment_index = array_index_nospec(segment_index, desc->max_index); 405 + 406 + return desc->rmp_entry + segment_index; 407 + } 408 + 409 + static int get_rmpentry(u64 pfn, struct rmpentry *e) 410 + { 411 + struct rmpentry_raw *e_raw; 412 + 413 + if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) { 414 + int ret; 415 + 416 + /* Binutils version 2.44 supports the RMPREAD mnemonic. */ 417 + asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd" 418 + : "=a" (ret) 419 + : "a" (pfn << PAGE_SHIFT), "c" (e) 420 + : "memory", "cc"); 421 + 422 + return ret; 423 + } 424 + 425 + e_raw = get_raw_rmpentry(pfn); 426 + if (IS_ERR(e_raw)) 427 + return PTR_ERR(e_raw); 428 + 429 + /* 430 + * Map the raw RMP table entry onto the RMPREAD output format. 431 + * The 2MB region status indicator (hpage_region_status field) is not 432 + * calculated, since the overhead could be significant and the field 433 + * is not used. 
434 + */ 435 + memset(e, 0, sizeof(*e)); 436 + e->gpa = e_raw->gpa << PAGE_SHIFT; 437 + e->asid = e_raw->asid; 438 + e->assigned = e_raw->assigned; 439 + e->pagesize = e_raw->pagesize; 440 + e->immutable = e_raw->immutable; 441 + 442 + return 0; 443 + } 444 + 445 + static int __snp_lookup_rmpentry(u64 pfn, struct rmpentry *e, int *level) 446 + { 447 + struct rmpentry e_large; 448 + int ret; 449 + 450 + if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP)) 451 + return -ENODEV; 452 + 453 + ret = get_rmpentry(pfn, e); 454 + if (ret) 455 + return ret; 594 456 595 457 /* 596 458 * Find the authoritative RMP entry for a PFN. This can be either a 4K 597 459 * RMP entry or a special large RMP entry that is authoritative for a 598 460 * whole 2M area. 599 461 */ 600 - large_entry = get_rmpentry(pfn & PFN_PMD_MASK); 601 - if (IS_ERR(large_entry)) 602 - return large_entry; 462 + ret = get_rmpentry(pfn & PFN_PMD_MASK, &e_large); 463 + if (ret) 464 + return ret; 603 465 604 - *level = RMP_TO_PG_LEVEL(large_entry->pagesize); 466 + *level = RMP_TO_PG_LEVEL(e_large.pagesize); 605 467 606 - return entry; 468 + return 0; 607 469 } 608 470 609 471 int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) 610 472 { 611 - struct rmpentry *e; 473 + struct rmpentry e; 474 + int ret; 612 475 613 - e = __snp_lookup_rmpentry(pfn, level); 614 - if (IS_ERR(e)) 615 - return PTR_ERR(e); 476 + ret = __snp_lookup_rmpentry(pfn, &e, level); 477 + if (ret) 478 + return ret; 616 479 617 - *assigned = !!e->assigned; 480 + *assigned = !!e.assigned; 618 481 return 0; 619 482 } 620 483 EXPORT_SYMBOL_GPL(snp_lookup_rmpentry); ··· 791 326 */ 792 327 static void dump_rmpentry(u64 pfn) 793 328 { 329 + struct rmpentry_raw *e_raw; 794 330 u64 pfn_i, pfn_end; 795 - struct rmpentry *e; 796 - int level; 331 + struct rmpentry e; 332 + int level, ret; 797 333 798 - e = __snp_lookup_rmpentry(pfn, &level); 799 - if (IS_ERR(e)) { 800 - pr_err("Failed to read RMP entry for PFN 0x%llx, error %ld\n", 801 - pfn, PTR_ERR(e)); 
334 + ret = __snp_lookup_rmpentry(pfn, &e, &level); 335 + if (ret) { 336 + pr_err("Failed to read RMP entry for PFN 0x%llx, error %d\n", 337 + pfn, ret); 802 338 return; 803 339 } 804 340 805 - if (e->assigned) { 341 + if (e.assigned) { 342 + e_raw = get_raw_rmpentry(pfn); 343 + if (IS_ERR(e_raw)) { 344 + pr_err("Failed to read RMP contents for PFN 0x%llx, error %ld\n", 345 + pfn, PTR_ERR(e_raw)); 346 + return; 347 + } 348 + 806 349 pr_info("PFN 0x%llx, RMP entry: [0x%016llx - 0x%016llx]\n", 807 - pfn, e->lo, e->hi); 350 + pfn, e_raw->lo, e_raw->hi); 808 351 return; 809 352 } 810 353 ··· 831 358 pfn, pfn_i, pfn_end); 832 359 833 360 while (pfn_i < pfn_end) { 834 - e = __snp_lookup_rmpentry(pfn_i, &level); 835 - if (IS_ERR(e)) { 836 - pr_err("Error %ld reading RMP entry for PFN 0x%llx\n", 837 - PTR_ERR(e), pfn_i); 361 + e_raw = get_raw_rmpentry(pfn_i); 362 + if (IS_ERR(e_raw)) { 363 + pr_err("Error %ld reading RMP contents for PFN 0x%llx\n", 364 + PTR_ERR(e_raw), pfn_i); 838 365 pfn_i++; 839 366 continue; 840 367 } 841 368 842 - if (e->lo || e->hi) 843 - pr_info("PFN: 0x%llx, [0x%016llx - 0x%016llx]\n", pfn_i, e->lo, e->hi); 369 + if (e_raw->lo || e_raw->hi) 370 + pr_info("PFN: 0x%llx, [0x%016llx - 0x%016llx]\n", pfn_i, e_raw->lo, e_raw->hi); 844 371 pfn_i++; 845 372 } 846 373 }
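The index arithmetic behind `get_raw_rmpentry()` and the "16 bytes/page mapped" sizing can be sketched outside the kernel. This is a minimal user-space sketch: `rmp_locate`, `rmp_segment_bytes` and `struct rmp_index` are hypothetical names, `PAGE_SHIFT` is assumed to be 12, and the bounds checks and `array_index_nospec()` clamping from the real function are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4 KB pages */
#define RMP_ENTRY_SIZE	16	/* bytes per RMP entry, as stated in the diff */

struct rmp_index {
	uint64_t rst_index;	/* slot in the RMP segment table */
	uint64_t segment_index;	/* entry within that segment */
};

/*
 * Mirrors RST_ENTRY_INDEX() and RMP_ENTRY_INDEX() for a given segment shift.
 * segment_shift must be below 64; the kernel uses 52 for a contiguous RMP.
 */
static inline struct rmp_index rmp_locate(uint64_t pfn, unsigned int segment_shift)
{
	uint64_t paddr = pfn << PAGE_SHIFT;
	uint64_t segment_mask = (1ULL << segment_shift) - 1;
	struct rmp_index idx = {
		.rst_index = paddr >> segment_shift,
		.segment_index = (paddr & segment_mask) >> PAGE_SHIFT,
	};

	return idx;
}

/* Bytes of RMP entries needed to cover mapped_size bytes of memory. */
static inline uint64_t rmp_segment_bytes(uint64_t mapped_size)
{
	return (mapped_size >> PAGE_SHIFT) * RMP_ENTRY_SIZE;
}
```

With 64 GB segments (shift 36), a physical address of 65 GB lands in segment table slot 1 at entry 262144, and each fully mapped segment needs 256 MB of RMP entries. With the non-segmented shift of 52, every address below 2^52 resolves to slot 0 and the entry index equals the PFN, which is how the contiguous table is handled as a single segment.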

-1

drivers/virt/coco/sev-guest/Kconfig

···
 	tristate "AMD SEV Guest driver"
 	default m
 	depends on AMD_MEM_ENCRYPT
-	select CRYPTO_LIB_AESGCM
 	select TSM_REPORTS
 	help
 	  SEV-SNP firmware provides the guest a mechanism to communicate with

+20 -465

drivers/virt/coco/sev-guest/sev-guest.c

···

 #define DEVICE_NAME "sev-guest"

-#define SNP_REQ_MAX_RETRY_DURATION (60*HZ)
-#define SNP_REQ_RETRY_DELAY (2*HZ)
-
 #define SVSM_MAX_RETRIES 3

 struct snp_guest_dev {
···
 module_param(vmpck_id, int, 0444);
 MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");

-/* Mutex to serialize the shared buffer access and command handling. */
-static DEFINE_MUTEX(snp_cmd_mutex);
-
-static bool is_vmpck_empty(struct snp_msg_desc *mdesc)
-{
-	char zero_key[VMPCK_KEY_LEN] = {0};
-
-	if (mdesc->vmpck)
-		return !memcmp(mdesc->vmpck, zero_key, VMPCK_KEY_LEN);
-
-	return true;
-}
-
-/*
- * If an error is received from the host or AMD Secure Processor (ASP) there
- * are two options. Either retry the exact same encrypted request or discontinue
- * using the VMPCK.
- *
- * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
- * encrypt the requests. The IV for this scheme is the sequence number. GCM
- * cannot tolerate IV reuse.
- *
- * The ASP FW v1.51 only increments the sequence numbers on a successful
- * guest<->ASP back and forth and only accepts messages at its exact sequence
- * number.
- *
- * So if the sequence number were to be reused the encryption scheme is
- * vulnerable. If the sequence number were incremented for a fresh IV the ASP
- * will reject the request.
- */
-static void snp_disable_vmpck(struct snp_msg_desc *mdesc)
-{
-	pr_alert("Disabling VMPCK%d communication key to prevent IV reuse.\n",
-		 vmpck_id);
-	memzero_explicit(mdesc->vmpck, VMPCK_KEY_LEN);
-	mdesc->vmpck = NULL;
-}
-
-static inline u64 __snp_get_msg_seqno(struct snp_msg_desc *mdesc)
-{
-	u64 count;
-
-	lockdep_assert_held(&snp_cmd_mutex);
-
-	/* Read the current message sequence counter from secrets pages */
-	count = *mdesc->os_area_msg_seqno;
-
-	return count + 1;
-}
-
-/* Return a non-zero on success */
-static u64 snp_get_msg_seqno(struct snp_msg_desc *mdesc)
-{
-	u64 count = __snp_get_msg_seqno(mdesc);
-
-	/*
-	 * The message sequence counter for the SNP guest request is a 64-bit
-	 * value but the version 2 of GHCB specification defines a 32-bit storage
-	 * for it. If the counter exceeds the 32-bit value then return zero.
-	 * The caller should check the return value, but if the caller happens to
-	 * not check the value and use it, then the firmware treats zero as an
-	 * invalid number and will fail the message request.
-	 */
-	if (count >= UINT_MAX) {
-		pr_err("request message sequence counter overflow\n");
-		return 0;
-	}
-
-	return count;
-}
-
-static void snp_inc_msg_seqno(struct snp_msg_desc *mdesc)
-{
-	/*
-	 * The counter is also incremented by the PSP, so increment it by 2
-	 * and save in secrets page.
-	 */
-	*mdesc->os_area_msg_seqno += 2;
-}
-
 static inline struct snp_guest_dev *to_snp_dev(struct file *file)
 {
 	struct miscdevice *dev = file->private_data;

 	return container_of(dev, struct snp_guest_dev, misc);
-}
-
-static struct aesgcm_ctx *snp_init_crypto(u8 *key, size_t keylen)
-{
-	struct aesgcm_ctx *ctx;
-
-	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
-	if (!ctx)
-		return NULL;
-
-	if (aesgcm_expandkey(ctx, key, keylen, AUTHTAG_LEN)) {
-		pr_err("Crypto context initialization failed\n");
-		kfree(ctx);
-		return NULL;
-	}
-
-	return ctx;
-}
-
-static int verify_and_dec_payload(struct snp_msg_desc *mdesc, struct snp_guest_req *req)
-{
-	struct snp_guest_msg *resp_msg = &mdesc->secret_response;
-	struct snp_guest_msg *req_msg = &mdesc->secret_request;
-	struct snp_guest_msg_hdr *req_msg_hdr = &req_msg->hdr;
-	struct snp_guest_msg_hdr *resp_msg_hdr = &resp_msg->hdr;
-	struct aesgcm_ctx *ctx = mdesc->ctx;
-	u8 iv[GCM_AES_IV_SIZE] = {};
-
-	pr_debug("response [seqno %lld type %d version %d sz %d]\n",
-		 resp_msg_hdr->msg_seqno, resp_msg_hdr->msg_type, resp_msg_hdr->msg_version,
-		 resp_msg_hdr->msg_sz);
-
-	/* Copy response from shared memory to encrypted memory. */
-	memcpy(resp_msg, mdesc->response, sizeof(*resp_msg));
-
-	/* Verify that the sequence counter is incremented by 1 */
-	if (unlikely(resp_msg_hdr->msg_seqno != (req_msg_hdr->msg_seqno + 1)))
-		return -EBADMSG;
-
-	/* Verify response message type and version number. */
-	if (resp_msg_hdr->msg_type != (req_msg_hdr->msg_type + 1) ||
-	    resp_msg_hdr->msg_version != req_msg_hdr->msg_version)
-		return -EBADMSG;
-
-	/*
-	 * If the message size is greater than our buffer length then return
-	 * an error.
-	 */
-	if (unlikely((resp_msg_hdr->msg_sz + ctx->authsize) > req->resp_sz))
-		return -EBADMSG;
-
-	/* Decrypt the payload */
-	memcpy(iv, &resp_msg_hdr->msg_seqno, min(sizeof(iv), sizeof(resp_msg_hdr->msg_seqno)));
-	if (!aesgcm_decrypt(ctx, req->resp_buf, resp_msg->payload, resp_msg_hdr->msg_sz,
-			    &resp_msg_hdr->algo, AAD_LEN, iv, resp_msg_hdr->authtag))
-		return -EBADMSG;
-
-	return 0;
-}
-
-static int enc_payload(struct snp_msg_desc *mdesc, u64 seqno, struct snp_guest_req *req)
-{
-	struct snp_guest_msg *msg = &mdesc->secret_request;
-	struct snp_guest_msg_hdr *hdr = &msg->hdr;
-	struct aesgcm_ctx *ctx = mdesc->ctx;
-	u8 iv[GCM_AES_IV_SIZE] = {};
-
-	memset(msg, 0, sizeof(*msg));
-
-	hdr->algo = SNP_AEAD_AES_256_GCM;
-	hdr->hdr_version = MSG_HDR_VER;
-	hdr->hdr_sz = sizeof(*hdr);
-	hdr->msg_type = req->msg_type;
-	hdr->msg_version = req->msg_version;
-	hdr->msg_seqno = seqno;
-	hdr->msg_vmpck = req->vmpck_id;
-	hdr->msg_sz = req->req_sz;
-
-	/* Verify the sequence number is non-zero */
-	if (!hdr->msg_seqno)
-		return -ENOSR;
-
-	pr_debug("request [seqno %lld type %d version %d sz %d]\n",
-		 hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
-
-	if (WARN_ON((req->req_sz + ctx->authsize) > sizeof(msg->payload)))
-		return -EBADMSG;
-
-	memcpy(iv, &hdr->msg_seqno, min(sizeof(iv), sizeof(hdr->msg_seqno)));
-	aesgcm_encrypt(ctx, msg->payload, req->req_buf, req->req_sz, &hdr->algo,
-		       AAD_LEN, iv, hdr->authtag);
-
-	return 0;
-}
-
-static int __handle_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
-				  struct snp_guest_request_ioctl *rio)
-{
-	unsigned long req_start = jiffies;
-	unsigned int override_npages = 0;
-	u64 override_err = 0;
-	int rc;
-
-retry_request:
-	/*
-	 * Call firmware to process the request. In this function the encrypted
-	 * message enters shared memory with the host. So after this call the
-	 * sequence number must be incremented or the VMPCK must be deleted to
-	 * prevent reuse of the IV.
-	 */
-	rc = snp_issue_guest_request(req, &mdesc->input, rio);
-	switch (rc) {
-	case -ENOSPC:
-		/*
-		 * If the extended guest request fails due to having too
-		 * small of a certificate data buffer, retry the same
-		 * guest request without the extended data request in
-		 * order to increment the sequence number and thus avoid
-		 * IV reuse.
-		 */
-		override_npages = mdesc->input.data_npages;
-		req->exit_code = SVM_VMGEXIT_GUEST_REQUEST;
-
-		/*
-		 * Override the error to inform callers the given extended
-		 * request buffer size was too small and give the caller the
-		 * required buffer size.
-		 */
-		override_err = SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN);
-
-		/*
-		 * If this call to the firmware succeeds, the sequence number can
-		 * be incremented allowing for continued use of the VMPCK. If
-		 * there is an error reflected in the return value, this value
-		 * is checked further down and the result will be the deletion
-		 * of the VMPCK and the error code being propagated back to the
-		 * user as an ioctl() return code.
-		 */
-		goto retry_request;
-
-	/*
-	 * The host may return SNP_GUEST_VMM_ERR_BUSY if the request has been
-	 * throttled. Retry in the driver to avoid returning and reusing the
-	 * message sequence number on a different message.
-	 */
-	case -EAGAIN:
-		if (jiffies - req_start > SNP_REQ_MAX_RETRY_DURATION) {
-			rc = -ETIMEDOUT;
-			break;
-		}
-		schedule_timeout_killable(SNP_REQ_RETRY_DELAY);
-		goto retry_request;
-	}
-
-	/*
-	 * Increment the message sequence number. There is no harm in doing
-	 * this now because decryption uses the value stored in the response
-	 * structure and any failure will wipe the VMPCK, preventing further
-	 * use anyway.
-	 */
-	snp_inc_msg_seqno(mdesc);
-
-	if (override_err) {
-		rio->exitinfo2 = override_err;
-
-		/*
-		 * If an extended guest request was issued and the supplied certificate
-		 * buffer was not large enough, a standard guest request was issued to
-		 * prevent IV reuse. If the standard request was successful, return -EIO
-		 * back to the caller as would have originally been returned.
-		 */
-		if (!rc && override_err == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
-			rc = -EIO;
-	}
-
-	if (override_npages)
-		mdesc->input.data_npages = override_npages;
-
-	return rc;
-}
-
-static int snp_send_guest_request(struct snp_msg_desc *mdesc, struct snp_guest_req *req,
-				  struct snp_guest_request_ioctl *rio)
-{
-	u64 seqno;
-	int rc;
-
-	guard(mutex)(&snp_cmd_mutex);
-
-	/* Check if the VMPCK is not empty */
-	if (is_vmpck_empty(mdesc)) {
-		pr_err_ratelimited("VMPCK is disabled\n");
-		return -ENOTTY;
-	}
-
-	/* Get message sequence and verify that its a non-zero */
-	seqno = snp_get_msg_seqno(mdesc);
-	if (!seqno)
-		return -EIO;
-
-	/* Clear shared memory's response for the host to populate. */
-	memset(mdesc->response, 0, sizeof(struct snp_guest_msg));
-
-	/* Encrypt the userspace provided payload in mdesc->secret_request. */
-	rc = enc_payload(mdesc, seqno, req);
-	if (rc)
-		return rc;
-
-	/*
-	 * Write the fully encrypted request to the shared unencrypted
-	 * request page.
-	 */
-	memcpy(mdesc->request, &mdesc->secret_request,
-	       sizeof(mdesc->secret_request));
-
-	rc = __handle_guest_request(mdesc, req, rio);
-	if (rc) {
-		if (rc == -EIO &&
-		    rio->exitinfo2 == SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN))
-			return rc;
-
-		pr_alert("Detected error from ASP request. rc: %d, exitinfo2: 0x%llx\n",
-			 rc, rio->exitinfo2);
-
-		snp_disable_vmpck(mdesc);
-		return rc;
-	}
-
-	rc = verify_and_dec_payload(mdesc, req);
-	if (rc) {
-		pr_alert("Detected unexpected decode failure from ASP. rc: %d\n", rc);
-		snp_disable_vmpck(mdesc);
-		return rc;
-	}
-
-	return 0;
 }

 struct snp_req_resp {
···

 	req.msg_version = arg->msg_version;
 	req.msg_type = SNP_MSG_REPORT_REQ;
-	req.vmpck_id = vmpck_id;
+	req.vmpck_id = mdesc->vmpck_id;
 	req.req_buf = report_req;
 	req.req_sz = sizeof(*report_req);
 	req.resp_buf = report_resp->data;
···

 	req.msg_version = arg->msg_version;
 	req.msg_type = SNP_MSG_KEY_REQ;
-	req.vmpck_id = vmpck_id;
+	req.vmpck_id = mdesc->vmpck_id;
 	req.req_buf = derived_key_req;
 	req.req_sz = sizeof(*derived_key_req);
 	req.resp_buf = buf;
···

 	req.msg_version = arg->msg_version;
 	req.msg_type = SNP_MSG_REPORT_REQ;
-	req.vmpck_id = vmpck_id;
+	req.vmpck_id = mdesc->vmpck_id;
 	req.req_buf = &report_req->data;
 	req.req_sz = sizeof(report_req->data);
 	req.resp_buf = report_resp->data;
···
 	return ret;
 }

-static void free_shared_pages(void *buf, size_t sz)
-{
-	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-	int ret;
-
-	if (!buf)
-		return;
-
-	ret = set_memory_encrypted((unsigned long)buf, npages);
-	if (ret) {
-		WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n");
-		return;
-	}
-
-	__free_pages(virt_to_page(buf), get_order(sz));
-}
-
-static void *alloc_shared_pages(struct device *dev, size_t sz)
-{
-	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-	struct page *page;
-	int ret;
-
-	page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
-	if (!page)
-		return NULL;
-
-	ret = set_memory_decrypted((unsigned long)page_address(page), npages);
-	if (ret) {
-		dev_err(dev, "failed to mark page shared, ret=%d\n", ret);
-		__free_pages(page, get_order(sz));
-		return NULL;
-	}
-
-	return page_address(page);
-}
-
 static const struct file_operations snp_guest_fops = {
 	.owner = THIS_MODULE,
 	.unlocked_ioctl = snp_guest_ioctl,
 };
-
-static u8 *get_vmpck(int id, struct snp_secrets_page *secrets, u32 **seqno)
-{
-	u8 *key = NULL;
-
-	switch (id) {
-	case 0:
-		*seqno = &secrets->os_area.msg_seqno_0;
-		key = secrets->vmpck0;
-		break;
-	case 1:
-		*seqno = &secrets->os_area.msg_seqno_1;
-		key = secrets->vmpck1;
-		break;
-	case 2:
-		*seqno = &secrets->os_area.msg_seqno_2;
-		key = secrets->vmpck2;
-		break;
-	case 3:
-		*seqno = &secrets->os_area.msg_seqno_3;
-		key = secrets->vmpck3;
-		break;
-	default:
-		break;
-	}
-
-	return key;
-}

 struct snp_msg_report_resp_hdr {
 	u32 status;
···

 static int __init sev_guest_probe(struct platform_device *pdev)
 {
-	struct sev_guest_platform_data *data;
-	struct snp_secrets_page *secrets;
 	struct device *dev = &pdev->dev;
 	struct snp_guest_dev *snp_dev;
 	struct snp_msg_desc *mdesc;
 	struct miscdevice *misc;
-	void __iomem *mapping;
 	int ret;

 	BUILD_BUG_ON(sizeof(struct snp_guest_msg) > PAGE_SIZE);
···
 	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return -ENODEV;

-	if (!dev->platform_data)
-		return -ENODEV;
-
-	data = (struct sev_guest_platform_data *)dev->platform_data;
-	mapping = ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
-	if (!mapping)
-		return -ENODEV;
-
-	secrets = (__force void *)mapping;
-
-	ret = -ENOMEM;
 	snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
 	if (!snp_dev)
-		goto e_unmap;
+		return -ENOMEM;

-	mdesc = devm_kzalloc(&pdev->dev, sizeof(struct snp_msg_desc), GFP_KERNEL);
-	if (!mdesc)
-		goto e_unmap;
+	mdesc = snp_msg_alloc();
+	if (IS_ERR_OR_NULL(mdesc))
+		return -ENOMEM;

-	/* Adjust the default VMPCK key based on the executing VMPL level */
-	if (vmpck_id == -1)
-		vmpck_id = snp_vmpl;
-
-	ret = -EINVAL;
-	mdesc->vmpck = get_vmpck(vmpck_id, secrets, &mdesc->os_area_msg_seqno);
-	if (!mdesc->vmpck) {
-		dev_err(dev, "Invalid VMPCK%d communication key\n", vmpck_id);
-		goto e_unmap;
-	}
-
-	/* Verify that VMPCK is not zero. */
-	if (is_vmpck_empty(mdesc)) {
-		dev_err(dev, "Empty VMPCK%d communication key\n", vmpck_id);
-		goto e_unmap;
-	}
+	ret = snp_msg_init(mdesc, vmpck_id);
+	if (ret)
+		goto e_msg_init;

 	platform_set_drvdata(pdev, snp_dev);
 	snp_dev->dev = dev;
-	mdesc->secrets = secrets;
-
-	/* Allocate the shared page used for the request and response message. */
-	mdesc->request = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
-	if (!mdesc->request)
-		goto e_unmap;
-
-	mdesc->response = alloc_shared_pages(dev, sizeof(struct snp_guest_msg));
-	if (!mdesc->response)
-		goto e_free_request;
-
-	mdesc->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
-	if (!mdesc->certs_data)
-		goto e_free_response;
-
-	ret = -EIO;
-	mdesc->ctx = snp_init_crypto(mdesc->vmpck, VMPCK_KEY_LEN);
-	if (!mdesc->ctx)
-		goto e_free_cert_data;

 	misc = &snp_dev->misc;
 	misc->minor = MISC_DYNAMIC_MINOR;
 	misc->name = DEVICE_NAME;
 	misc->fops = &snp_guest_fops;

-	/* Initialize the input addresses for guest request */
-	mdesc->input.req_gpa = __pa(mdesc->request);
-	mdesc->input.resp_gpa = __pa(mdesc->response);
-	mdesc->input.data_gpa = __pa(mdesc->certs_data);
-
 	/* Set the privlevel_floor attribute based on the vmpck_id */
-	sev_tsm_ops.privlevel_floor = vmpck_id;
+	sev_tsm_ops.privlevel_floor = mdesc->vmpck_id;

 	ret = tsm_register(&sev_tsm_ops, snp_dev);
 	if (ret)
-		goto e_free_cert_data;
+		goto e_msg_init;

 	ret = devm_add_action_or_reset(&pdev->dev, unregister_sev_tsm, NULL);
 	if (ret)
-		goto e_free_cert_data;
+		goto e_msg_init;

 	ret = misc_register(misc);
 	if (ret)
-		goto e_free_ctx;
+		goto e_msg_init;

 	snp_dev->msg_desc = mdesc;
-	dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n", vmpck_id);
+	dev_info(dev, "Initialized SEV guest driver (using VMPCK%d communication key)\n",
+		 mdesc->vmpck_id);
 	return 0;

-e_free_ctx:
-	kfree(mdesc->ctx);
-e_free_cert_data:
-	free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
-e_free_response:
-	free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
-e_free_request:
-	free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
-e_unmap:
-	iounmap(mapping);
+e_msg_init:
+	snp_msg_free(mdesc);
+
 	return ret;
 }

 static void __exit sev_guest_remove(struct platform_device *pdev)
 {
 	struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
-	struct snp_msg_desc *mdesc = snp_dev->msg_desc;

-	free_shared_pages(mdesc->certs_data, SEV_FW_BLOB_MAX_SIZE);
-	free_shared_pages(mdesc->response, sizeof(struct snp_guest_msg));
-	free_shared_pages(mdesc->request, sizeof(struct snp_guest_msg));
-	kfree(mdesc->ctx);
+	snp_msg_free(snp_dev->msg_desc);
 	misc_deregister(&snp_dev->misc);
 }

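The sequence-number rules spelled out in the removed comments (use the stored counter plus one, return zero once the value would no longer fit the 32-bit storage, copy the sequence number into a zeroed IV, then advance the stored counter by two because the PSP also increments it) can be sketched as standalone C. This is a rough userspace illustration under assumptions: the demo_* names and DEMO_IV_SIZE are hypothetical, the 12-byte IV size is assumed for the example, and no encryption is performed.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define DEMO_IV_SIZE 12  /* assumed GCM IV size, hypothetical name */

/*
 * Next sequence number to use, or 0 if the counter would overflow its
 * 32-bit storage. Zero is never a valid sequence number.
 */
static uint64_t demo_next_seqno(const uint32_t *os_area_msg_seqno)
{
	uint64_t count = (uint64_t)*os_area_msg_seqno + 1;

	if (count >= UINT_MAX)
		return 0;

	return count;
}

/* Build the IV: zero-filled, with the sequence number in the leading bytes. */
static void demo_build_iv(uint8_t iv[DEMO_IV_SIZE], uint64_t seqno)
{
	size_t n = sizeof(seqno) < DEMO_IV_SIZE ? sizeof(seqno) : DEMO_IV_SIZE;

	memset(iv, 0, DEMO_IV_SIZE);
	memcpy(iv, &seqno, n);
}

/*
 * After a request has been handed to the host, skip past both the request
 * and the response sequence numbers so the same IV is never reused.
 */
static void demo_commit_seqno(uint32_t *os_area_msg_seqno)
{
	*os_area_msg_seqno += 2;
}
```

Starting from a stored counter of 0, the first request uses sequence number 1, the response is expected to carry 2, and the stored counter becomes 2, so the next request uses 3. Requests therefore take odd numbers and responses even ones.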

+8

include/linux/cc_platform.h

···
 	CC_ATTR_GUEST_SEV_SNP,

 	/**
+	 * @CC_ATTR_GUEST_SNP_SECURE_TSC: SNP Secure TSC is active.
+	 *
+	 * The platform/OS is running as a guest/virtual machine and actively
+	 * using AMD SEV-SNP Secure TSC feature.
+	 */
+	CC_ATTR_GUEST_SNP_SECURE_TSC,
+
+	/**
 	 * @CC_ATTR_HOST_SEV_SNP: AMD SNP enabled on the host.
 	 *
 	 * The host kernel is running with the necessary features
