Linux kernel mirror (for testing): git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

- imh_edac: Add a new EDAC driver for Intel Diamond Rapids and future
incarnations of this memory controller architecture

- amd64_edac: Remove the legacy csrow sysfs interface, which has been
deprecated and unused (we assume) for at least a decade

- Add the capability to fall back to BIOS-provided address translation
functionality (ACPI PRM), which can be used on systems unsupported by
the current AMD address translation library

- The usual fixes, fixlets, cleanups and improvements all over the
place

* tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
RAS/AMD/ATL: Replace bitwise_xor_bits() with hweight16()
EDAC/igen6: Fix error handling in igen6_edac driver
EDAC/imh: Setup 'imh_test' debugfs testing node
EDAC/{skx_comm,imh}: Detect 2-level memory configuration
EDAC/skx_common: Extend the maximum number of DRAM chip row bits
EDAC/{skx_common,imh}: Add EDAC driver for Intel Diamond Rapids servers
EDAC/skx_common: Prepare for skx_set_hi_lo()
EDAC/skx_common: Prepare for skx_get_edac_list()
EDAC/{skx_common,skx,i10nm}: Make skx_register_mci() independent of pci_dev
EDAC/ghes: Replace deprecated strcpy() in ghes_edac_report_mem_error()
EDAC/ie31200: Fix error handling in ie31200_register_mci
RAS/CEC: Replace use of system_wq with system_percpu_wq
EDAC: Remove the legacy EDAC sysfs interface
EDAC/amd64: Remove NUM_CONTROLLERS macro
EDAC/amd64: Generate ctl_name string at runtime
RAS/AMD/ATL: Require PRM support for future systems
ACPI: PRM: Add acpi_prm_handler_available()
RAS/AMD/ATL: Return error codes from helper functions
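
A note on the RAS/AMD/ATL hweight16() entry above: XOR-folding every bit of a value computes its parity, which is the low bit of its population count, so a hand-rolled bit-folding helper can be expressed with the existing hweight16(). A small userspace check of that equivalence (illustrative only; __builtin_popcount stands in for the kernel's hweight16()):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* XOR of all bits of x, i.e. its parity. */
static unsigned int xor_fold16(uint16_t x)
{
    unsigned int p = 0;

    for (int i = 0; i < 16; i++)
        p ^= (x >> i) & 1;
    return p;
}

int main(void)
{
    for (uint32_t x = 0; x <= 0xffff; x++)
        assert(xor_fold16((uint16_t)x) == (__builtin_popcount(x) & 1u));

    puts("XOR-fold of a 16-bit value == hweight16(value) & 1");
    return 0;
}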

+796 -673
+3 -139
Documentation/admin-guide/RAS/main.rst
··· 406 406 |->mc2 407 407 .... 408 408 409 - Under each ``mcX`` directory each ``csrowX`` is again represented by a 410 - ``csrowX``, where ``X`` is the csrow index:: 411 - 412 - .../mc/mc0/ 413 - | 414 - |->csrow0 415 - |->csrow2 416 - |->csrow3 417 - .... 418 - 419 - Notice that there is no csrow1, which indicates that csrow0 is composed 420 - of a single ranked DIMMs. This should also apply in both Channels, in 421 - order to have dual-channel mode be operational. Since both csrow2 and 422 - csrow3 are populated, this indicates a dual ranked set of DIMMs for 423 - channels 0 and 1. 424 - 425 - Within each of the ``mcX`` and ``csrowX`` directories are several EDAC 426 - control and attribute files. 409 + Within each of the ``mcX`` directory are several EDAC control and 410 + attribute files. 427 411 428 412 ``mcX`` directories 429 413 ------------------- ··· 553 569 - Unbuffered-DDR 554 570 555 571 .. [#f5] On some systems, the memory controller doesn't have any logic 556 - to identify the memory module. On such systems, the directory is called ``rankX`` and works on a similar way as the ``csrowX`` directories. 572 + to identify the memory module. On such systems, the directory is called ``rankX``. 557 573 On modern Intel memory controllers, the memory controller identifies the 558 574 memory modules directly. On such systems, the directory is called ``dimmX``. 559 575 560 576 .. [#f6] There are also some ``power`` directories and ``subsystem`` 561 577 symlinks inside the sysfs mapping that are automatically created by 562 578 the sysfs subsystem. Currently, they serve no purpose. 563 - 564 - ``csrowX`` directories 565 - ---------------------- 566 - 567 - When CONFIG_EDAC_LEGACY_SYSFS is enabled, sysfs will contain the ``csrowX`` 568 - directories. As this API doesn't work properly for Rambus, FB-DIMMs and 569 - modern Intel Memory Controllers, this is being deprecated in favor of 570 - ``dimmX`` directories. 571 - 572 - In the ``csrowX`` directories are EDAC control and attribute files for 573 - this ``X`` instance of csrow: 574 - 575 - 576 - - ``ue_count`` - Total Uncorrectable Errors count attribute file 577 - 578 - This attribute file displays the total count of uncorrectable 579 - errors that have occurred on this csrow. If panic_on_ue is set 580 - this counter will not have a chance to increment, since EDAC 581 - will panic the system. 582 - 583 - 584 - - ``ce_count`` - Total Correctable Errors count attribute file 585 - 586 - This attribute file displays the total count of correctable 587 - errors that have occurred on this csrow. This count is very 588 - important to examine. CEs provide early indications that a 589 - DIMM is beginning to fail. This count field should be 590 - monitored for non-zero values and report such information 591 - to the system administrator. 592 - 593 - 594 - - ``size_mb`` - Total memory managed by this csrow attribute file 595 - 596 - This attribute file displays, in count of megabytes, the memory 597 - that this csrow contains. 598 - 599 - 600 - - ``mem_type`` - Memory Type attribute file 601 - 602 - This attribute file will display what type of memory is currently 603 - on this csrow. Normally, either buffered or unbuffered memory. 604 - Examples: 605 - 606 - - Registered-DDR 607 - - Unbuffered-DDR 608 - 609 - 610 - - ``edac_mode`` - EDAC Mode of operation attribute file 611 - 612 - This attribute file will display what type of Error detection 613 - and correction is being utilized. 
614 - 615 - 616 - - ``dev_type`` - Device type attribute file 617 - 618 - This attribute file will display what type of DRAM device is 619 - being utilized on this DIMM. 620 - Examples: 621 - 622 - - x1 623 - - x2 624 - - x4 625 - - x8 626 - 627 - 628 - - ``ch0_ce_count`` - Channel 0 CE Count attribute file 629 - 630 - This attribute file will display the count of CEs on this 631 - DIMM located in channel 0. 632 - 633 - 634 - - ``ch0_ue_count`` - Channel 0 UE Count attribute file 635 - 636 - This attribute file will display the count of UEs on this 637 - DIMM located in channel 0. 638 - 639 - 640 - - ``ch0_dimm_label`` - Channel 0 DIMM Label control file 641 - 642 - 643 - This control file allows this DIMM to have a label assigned 644 - to it. With this label in the module, when errors occur 645 - the output can provide the DIMM label in the system log. 646 - This becomes vital for panic events to isolate the 647 - cause of the UE event. 648 - 649 - DIMM Labels must be assigned after booting, with information 650 - that correctly identifies the physical slot with its 651 - silk screen label. This information is currently very 652 - motherboard specific and determination of this information 653 - must occur in userland at this time. 654 - 655 - 656 - - ``ch1_ce_count`` - Channel 1 CE Count attribute file 657 - 658 - 659 - This attribute file will display the count of CEs on this 660 - DIMM located in channel 1. 661 - 662 - 663 - - ``ch1_ue_count`` - Channel 1 UE Count attribute file 664 - 665 - 666 - This attribute file will display the count of UEs on this 667 - DIMM located in channel 0. 668 - 669 - 670 - - ``ch1_dimm_label`` - Channel 1 DIMM Label control file 671 - 672 - This control file allows this DIMM to have a label assigned 673 - to it. With this label in the module, when errors occur 674 - the output can provide the DIMM label in the system log. 675 - This becomes vital for panic events to isolate the 676 - cause of the UE event. 677 - 678 - DIMM Labels must be assigned after booting, with information 679 - that correctly identifies the physical slot with its 680 - silk screen label. This information is currently very 681 - motherboard specific and determination of this information 682 - must occur in userland at this time. 683 579 684 580 685 581 System Logging
-1
arch/loongarch/configs/loongson3_defconfig
···
 CONFIG_MMC_LOONGSON2=m
 CONFIG_INFINIBAND=m
 CONFIG_EDAC=y
-# CONFIG_EDAC_LEGACY_SYSFS is not set
 CONFIG_EDAC_LOONGSON=y
 CONFIG_RTC_CLASS=y
 CONFIG_RTC_DRV_EFI=y
+6
drivers/acpi/prmt.c
···
     return (struct prm_handler_info *) find_guid_info(guid, GET_HANDLER);
 }
 
+bool acpi_prm_handler_available(const guid_t *guid)
+{
+    return find_prm_handler(guid) && find_prm_module(guid);
+}
+EXPORT_SYMBOL_GPL(acpi_prm_handler_available);
+
 /* In-coming PRM commands */
 
 #define PRM_CMD_RUN_SERVICE 0
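
The new helper gives drivers a way to probe whether firmware actually exports a given PRM handler before depending on it. A minimal sketch of the intended call pattern, assuming the address-translation GUID introduced for the AMD ATL code later in this series (the probe function name is hypothetical):

#include <linux/prmt.h>
#include <linux/uuid.h>

/* GUID of the BIOS normalized-to-system address translation handler
 * (the value defined in drivers/ras/amd/atl in this same series). */
static const guid_t xlat_guid = GUID_INIT(0xE7180659, 0xA65D, 0x451D,
                                          0x92, 0xCD, 0x2B, 0x56, 0xF1,
                                          0x2B, 0xEB, 0xA6);

static int __init xlat_probe(void)
{
    /* Bail out early if the BIOS does not provide the handler. */
    if (!acpi_prm_handler_available(&xlat_guid))
        return -ENODEV;

    return 0;
}
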
+12 -8
drivers/edac/Kconfig
···
 
 if EDAC
 
-config EDAC_LEGACY_SYSFS
-    bool "EDAC legacy sysfs"
-    default y
-    help
-      Enable the compatibility sysfs nodes.
-      Use 'Y' if your edac utilities aren't ported to work with the newer
-      structures.
-
 config EDAC_DEBUG
     bool "Debugging"
     select DEBUG_FS
···
       10nm server Integrated Memory Controllers. If your
       system has non-volatile DIMMs you should also manually
       select CONFIG_ACPI_NFIT.
+
+config EDAC_IMH
+    tristate "Intel Integrated Memory/IO Hub MC"
+    depends on X86_64 && X86_MCE_INTEL && ACPI
+    depends on ACPI_NFIT || !ACPI_NFIT    # if ACPI_NFIT=m, EDAC_IMH can't be y
+    select DMI
+    select ACPI_ADXL
+    help
+      Support for error detection and correction the Intel
+      Integrated Memory/IO Hub Memory Controller. This MC IP is
+      first used on the Diamond Rapids servers but may appear on
+      others in the future.
 
 config EDAC_PND2
     tristate "Intel Pondicherry2"
+3
drivers/edac/Makefile
···
 i10nm_edac-y := i10nm_base.o
 obj-$(CONFIG_EDAC_I10NM) += i10nm_edac.o skx_edac_common.o
 
+imh_edac-y := imh_base.o
+obj-$(CONFIG_EDAC_IMH) += imh_edac.o skx_edac_common.o
+
 obj-$(CONFIG_EDAC_HIGHBANK_MC) += highbank_mc_edac.o
 obj-$(CONFIG_EDAC_HIGHBANK_L2) += highbank_l2_edac.o
 
+15 -46
drivers/edac/amd64_edac.c
··· 3732 3732 pci_dev_put(pvt->F1); 3733 3733 pci_dev_put(pvt->F2); 3734 3734 kfree(pvt->umc); 3735 + kfree(pvt->csels); 3735 3736 } 3736 3737 3737 3738 static struct low_ops umc_ops = { ··· 3767 3766 pvt->stepping = boot_cpu_data.x86_stepping; 3768 3767 pvt->model = boot_cpu_data.x86_model; 3769 3768 pvt->fam = boot_cpu_data.x86; 3769 + char *tmp_name = NULL; 3770 3770 pvt->max_mcs = 2; 3771 3771 3772 3772 /* ··· 3781 3779 3782 3780 switch (pvt->fam) { 3783 3781 case 0xf: 3784 - pvt->ctl_name = (pvt->ext_model >= K8_REV_F) ? 3782 + tmp_name = (pvt->ext_model >= K8_REV_F) ? 3785 3783 "K8 revF or later" : "K8 revE or earlier"; 3786 3784 pvt->f1_id = PCI_DEVICE_ID_AMD_K8_NB_ADDRMAP; 3787 3785 pvt->f2_id = PCI_DEVICE_ID_AMD_K8_NB_MEMCTL; ··· 3790 3788 break; 3791 3789 3792 3790 case 0x10: 3793 - pvt->ctl_name = "F10h"; 3794 3791 pvt->f1_id = PCI_DEVICE_ID_AMD_10H_NB_MAP; 3795 3792 pvt->f2_id = PCI_DEVICE_ID_AMD_10H_NB_DRAM; 3796 3793 pvt->ops->dbam_to_cs = f10_dbam_to_chip_select; ··· 3798 3797 case 0x15: 3799 3798 switch (pvt->model) { 3800 3799 case 0x30: 3801 - pvt->ctl_name = "F15h_M30h"; 3802 3800 pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F1; 3803 3801 pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F2; 3804 3802 break; 3805 3803 case 0x60: 3806 - pvt->ctl_name = "F15h_M60h"; 3807 3804 pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F1; 3808 3805 pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F2; 3809 3806 pvt->ops->dbam_to_cs = f15_m60h_dbam_to_chip_select; ··· 3810 3811 /* Richland is only client */ 3811 3812 return -ENODEV; 3812 3813 default: 3813 - pvt->ctl_name = "F15h"; 3814 3814 pvt->f1_id = PCI_DEVICE_ID_AMD_15H_NB_F1; 3815 3815 pvt->f2_id = PCI_DEVICE_ID_AMD_15H_NB_F2; 3816 3816 pvt->ops->dbam_to_cs = f15_dbam_to_chip_select; ··· 3820 3822 case 0x16: 3821 3823 switch (pvt->model) { 3822 3824 case 0x30: 3823 - pvt->ctl_name = "F16h_M30h"; 3824 3825 pvt->f1_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F1; 3825 3826 pvt->f2_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F2; 3826 3827 break; 3827 3828 default: 3828 - pvt->ctl_name = "F16h"; 3829 3829 pvt->f1_id = PCI_DEVICE_ID_AMD_16H_NB_F1; 3830 3830 pvt->f2_id = PCI_DEVICE_ID_AMD_16H_NB_F2; 3831 3831 break; ··· 3832 3836 3833 3837 case 0x17: 3834 3838 switch (pvt->model) { 3835 - case 0x10 ... 0x2f: 3836 - pvt->ctl_name = "F17h_M10h"; 3837 - break; 3838 3839 case 0x30 ... 0x3f: 3839 - pvt->ctl_name = "F17h_M30h"; 3840 3840 pvt->max_mcs = 8; 3841 3841 break; 3842 - case 0x60 ... 0x6f: 3843 - pvt->ctl_name = "F17h_M60h"; 3844 - break; 3845 - case 0x70 ... 0x7f: 3846 - pvt->ctl_name = "F17h_M70h"; 3847 - break; 3848 3842 default: 3849 - pvt->ctl_name = "F17h"; 3850 3843 break; 3851 3844 } 3852 3845 break; 3853 3846 3854 3847 case 0x18: 3855 - pvt->ctl_name = "F18h"; 3856 3848 break; 3857 3849 3858 3850 case 0x19: 3859 3851 switch (pvt->model) { 3860 3852 case 0x00 ... 0x0f: 3861 - pvt->ctl_name = "F19h"; 3862 3853 pvt->max_mcs = 8; 3863 3854 break; 3864 3855 case 0x10 ... 0x1f: 3865 - pvt->ctl_name = "F19h_M10h"; 3866 3856 pvt->max_mcs = 12; 3867 3857 pvt->flags.zn_regs_v2 = 1; 3868 3858 break; 3869 - case 0x20 ... 0x2f: 3870 - pvt->ctl_name = "F19h_M20h"; 3871 - break; 3872 3859 case 0x30 ... 
0x3f: 3873 3860 if (pvt->F3->device == PCI_DEVICE_ID_AMD_MI200_DF_F3) { 3874 - pvt->ctl_name = "MI200"; 3861 + tmp_name = "MI200"; 3875 3862 pvt->max_mcs = 4; 3876 3863 pvt->dram_type = MEM_HBM2; 3877 3864 pvt->gpu_umc_base = 0x50000; 3878 3865 pvt->ops = &gpu_ops; 3879 3866 } else { 3880 - pvt->ctl_name = "F19h_M30h"; 3881 3867 pvt->max_mcs = 8; 3882 3868 } 3883 3869 break; 3884 - case 0x50 ... 0x5f: 3885 - pvt->ctl_name = "F19h_M50h"; 3886 - break; 3887 3870 case 0x60 ... 0x6f: 3888 - pvt->ctl_name = "F19h_M60h"; 3889 3871 pvt->flags.zn_regs_v2 = 1; 3890 3872 break; 3891 3873 case 0x70 ... 0x7f: 3892 - pvt->ctl_name = "F19h_M70h"; 3893 3874 pvt->max_mcs = 4; 3894 3875 pvt->flags.zn_regs_v2 = 1; 3895 3876 break; 3896 3877 case 0x90 ... 0x9f: 3897 - pvt->ctl_name = "F19h_M90h"; 3898 3878 pvt->max_mcs = 4; 3899 3879 pvt->dram_type = MEM_HBM3; 3900 3880 pvt->gpu_umc_base = 0x90000; 3901 3881 pvt->ops = &gpu_ops; 3902 3882 break; 3903 3883 case 0xa0 ... 0xaf: 3904 - pvt->ctl_name = "F19h_MA0h"; 3905 3884 pvt->max_mcs = 12; 3906 3885 pvt->flags.zn_regs_v2 = 1; 3907 3886 break; ··· 3886 3915 case 0x1A: 3887 3916 switch (pvt->model) { 3888 3917 case 0x00 ... 0x1f: 3889 - pvt->ctl_name = "F1Ah"; 3890 3918 pvt->max_mcs = 12; 3891 3919 pvt->flags.zn_regs_v2 = 1; 3892 3920 break; 3893 3921 case 0x40 ... 0x4f: 3894 - pvt->ctl_name = "F1Ah_M40h"; 3895 3922 pvt->flags.zn_regs_v2 = 1; 3896 3923 break; 3897 3924 case 0x50 ... 0x57: 3898 - pvt->ctl_name = "F1Ah_M50h"; 3925 + case 0xc0 ... 0xc7: 3899 3926 pvt->max_mcs = 16; 3900 3927 pvt->flags.zn_regs_v2 = 1; 3901 3928 break; 3902 3929 case 0x90 ... 0x9f: 3903 - pvt->ctl_name = "F1Ah_M90h"; 3904 - pvt->max_mcs = 8; 3905 - pvt->flags.zn_regs_v2 = 1; 3906 - break; 3907 3930 case 0xa0 ... 0xaf: 3908 - pvt->ctl_name = "F1Ah_MA0h"; 3909 3931 pvt->max_mcs = 8; 3910 - pvt->flags.zn_regs_v2 = 1; 3911 - break; 3912 - case 0xc0 ... 0xc7: 3913 - pvt->ctl_name = "F1Ah_MC0h"; 3914 - pvt->max_mcs = 16; 3915 3932 pvt->flags.zn_regs_v2 = 1; 3916 3933 break; 3917 3934 } ··· 3909 3950 amd64_err("Unsupported family!\n"); 3910 3951 return -ENODEV; 3911 3952 } 3953 + 3954 + if (tmp_name) 3955 + scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), tmp_name); 3956 + else 3957 + scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), "F%02Xh_M%02Xh", 3958 + pvt->fam, pvt->model); 3959 + 3960 + pvt->csels = kcalloc(pvt->max_mcs, sizeof(*pvt->csels), GFP_KERNEL); 3961 + if (!pvt->csels) 3962 + return -ENOMEM; 3912 3963 3913 3964 return 0; 3914 3965 }
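
The controller name is now formatted from the family/model numbers unless an irregular name (the K8 revision strings or "MI200") overrides it. A userspace-flavored sketch of what the fallback produces; the buffer size mirrors MAX_CTL_NAMELEN from the amd64_edac.h change below:

#include <stdio.h>

int main(void)
{
    char ctl_name[19];    /* MAX_CTL_NAMELEN in amd64_edac.h */

    /* Family 0x19, model 0x20 used to be the hard-coded "F19h_M20h";
     * the runtime format string reproduces the same name. */
    snprintf(ctl_name, sizeof(ctl_name), "F%02Xh_M%02Xh", 0x19, 0x20);
    printf("%s\n", ctl_name);    /* prints: F19h_M20h */

    return 0;
}
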
+4 -3
drivers/edac/amd64_edac.h
··· 96 96 /* Hardware limit on ChipSelect rows per MC and processors per system */ 97 97 #define NUM_CHIPSELECTS 8 98 98 #define DRAM_RANGES 8 99 - #define NUM_CONTROLLERS 16 100 99 101 100 #define ON true 102 101 #define OFF false 102 + 103 + #define MAX_CTL_NAMELEN 19 103 104 104 105 /* 105 106 * PCI-defined configuration space registers ··· 347 346 u32 dbam1; /* DRAM Base Address Mapping reg for DCT1 */ 348 347 349 348 /* one for each DCT/UMC */ 350 - struct chip_select csels[NUM_CONTROLLERS]; 349 + struct chip_select *csels; 351 350 352 351 /* DRAM base and limit pairs F1x[78,70,68,60,58,50,48,40] */ 353 352 struct dram_range ranges[DRAM_RANGES]; ··· 363 362 /* x4, x8, or x16 syndromes in use */ 364 363 u8 ecc_sym_sz; 365 364 366 - const char *ctl_name; 365 + char ctl_name[MAX_CTL_NAMELEN]; 367 366 u16 f1_id, f2_id; 368 367 /* Maximum number of memory controllers per die/node. */ 369 368 u8 max_mcs;
-404
drivers/edac/edac_mc_sysfs.c
··· 115 115 [EDAC_S16ECD16ED] = "S16ECD16ED" 116 116 }; 117 117 118 - #ifdef CONFIG_EDAC_LEGACY_SYSFS 119 - /* 120 - * EDAC sysfs CSROW data structures and methods 121 - */ 122 - 123 - #define to_csrow(k) container_of(k, struct csrow_info, dev) 124 - 125 - /* 126 - * We need it to avoid namespace conflicts between the legacy API 127 - * and the per-dimm/per-rank one 128 - */ 129 - #define DEVICE_ATTR_LEGACY(_name, _mode, _show, _store) \ 130 - static struct device_attribute dev_attr_legacy_##_name = __ATTR(_name, _mode, _show, _store) 131 - 132 - struct dev_ch_attribute { 133 - struct device_attribute attr; 134 - unsigned int channel; 135 - }; 136 - 137 - #define DEVICE_CHANNEL(_name, _mode, _show, _store, _var) \ 138 - static struct dev_ch_attribute dev_attr_legacy_##_name = \ 139 - { __ATTR(_name, _mode, _show, _store), (_var) } 140 - 141 - #define to_channel(k) (container_of(k, struct dev_ch_attribute, attr)->channel) 142 - 143 - /* Set of more default csrow<id> attribute show/store functions */ 144 - static ssize_t csrow_ue_count_show(struct device *dev, 145 - struct device_attribute *mattr, char *data) 146 - { 147 - struct csrow_info *csrow = to_csrow(dev); 148 - 149 - return sysfs_emit(data, "%u\n", csrow->ue_count); 150 - } 151 - 152 - static ssize_t csrow_ce_count_show(struct device *dev, 153 - struct device_attribute *mattr, char *data) 154 - { 155 - struct csrow_info *csrow = to_csrow(dev); 156 - 157 - return sysfs_emit(data, "%u\n", csrow->ce_count); 158 - } 159 - 160 - static ssize_t csrow_size_show(struct device *dev, 161 - struct device_attribute *mattr, char *data) 162 - { 163 - struct csrow_info *csrow = to_csrow(dev); 164 - int i; 165 - u32 nr_pages = 0; 166 - 167 - for (i = 0; i < csrow->nr_channels; i++) 168 - nr_pages += csrow->channels[i]->dimm->nr_pages; 169 - return sysfs_emit(data, "%u\n", PAGES_TO_MiB(nr_pages)); 170 - } 171 - 172 - static ssize_t csrow_mem_type_show(struct device *dev, 173 - struct device_attribute *mattr, char *data) 174 - { 175 - struct csrow_info *csrow = to_csrow(dev); 176 - 177 - return sysfs_emit(data, "%s\n", edac_mem_types[csrow->channels[0]->dimm->mtype]); 178 - } 179 - 180 - static ssize_t csrow_dev_type_show(struct device *dev, 181 - struct device_attribute *mattr, char *data) 182 - { 183 - struct csrow_info *csrow = to_csrow(dev); 184 - 185 - return sysfs_emit(data, "%s\n", dev_types[csrow->channels[0]->dimm->dtype]); 186 - } 187 - 188 - static ssize_t csrow_edac_mode_show(struct device *dev, 189 - struct device_attribute *mattr, 190 - char *data) 191 - { 192 - struct csrow_info *csrow = to_csrow(dev); 193 - 194 - return sysfs_emit(data, "%s\n", edac_caps[csrow->channels[0]->dimm->edac_mode]); 195 - } 196 - 197 - /* show/store functions for DIMM Label attributes */ 198 - static ssize_t channel_dimm_label_show(struct device *dev, 199 - struct device_attribute *mattr, 200 - char *data) 201 - { 202 - struct csrow_info *csrow = to_csrow(dev); 203 - unsigned int chan = to_channel(mattr); 204 - struct rank_info *rank = csrow->channels[chan]; 205 - 206 - /* if field has not been initialized, there is nothing to send */ 207 - if (!rank->dimm->label[0]) 208 - return 0; 209 - 210 - return sysfs_emit(data, "%s\n", rank->dimm->label); 211 - } 212 - 213 - static ssize_t channel_dimm_label_store(struct device *dev, 214 - struct device_attribute *mattr, 215 - const char *data, size_t count) 216 - { 217 - struct csrow_info *csrow = to_csrow(dev); 218 - unsigned int chan = to_channel(mattr); 219 - struct rank_info *rank = csrow->channels[chan]; 220 - 
size_t copy_count = count; 221 - 222 - if (count == 0) 223 - return -EINVAL; 224 - 225 - if (data[count - 1] == '\0' || data[count - 1] == '\n') 226 - copy_count -= 1; 227 - 228 - if (copy_count == 0 || copy_count >= sizeof(rank->dimm->label)) 229 - return -EINVAL; 230 - 231 - memcpy(rank->dimm->label, data, copy_count); 232 - rank->dimm->label[copy_count] = '\0'; 233 - 234 - return count; 235 - } 236 - 237 - /* show function for dynamic chX_ce_count attribute */ 238 - static ssize_t channel_ce_count_show(struct device *dev, 239 - struct device_attribute *mattr, char *data) 240 - { 241 - struct csrow_info *csrow = to_csrow(dev); 242 - unsigned int chan = to_channel(mattr); 243 - struct rank_info *rank = csrow->channels[chan]; 244 - 245 - return sysfs_emit(data, "%u\n", rank->ce_count); 246 - } 247 - 248 - /* cwrow<id>/attribute files */ 249 - DEVICE_ATTR_LEGACY(size_mb, S_IRUGO, csrow_size_show, NULL); 250 - DEVICE_ATTR_LEGACY(dev_type, S_IRUGO, csrow_dev_type_show, NULL); 251 - DEVICE_ATTR_LEGACY(mem_type, S_IRUGO, csrow_mem_type_show, NULL); 252 - DEVICE_ATTR_LEGACY(edac_mode, S_IRUGO, csrow_edac_mode_show, NULL); 253 - DEVICE_ATTR_LEGACY(ue_count, S_IRUGO, csrow_ue_count_show, NULL); 254 - DEVICE_ATTR_LEGACY(ce_count, S_IRUGO, csrow_ce_count_show, NULL); 255 - 256 - /* default attributes of the CSROW<id> object */ 257 - static struct attribute *csrow_attrs[] = { 258 - &dev_attr_legacy_dev_type.attr, 259 - &dev_attr_legacy_mem_type.attr, 260 - &dev_attr_legacy_edac_mode.attr, 261 - &dev_attr_legacy_size_mb.attr, 262 - &dev_attr_legacy_ue_count.attr, 263 - &dev_attr_legacy_ce_count.attr, 264 - NULL, 265 - }; 266 - 267 - static const struct attribute_group csrow_attr_grp = { 268 - .attrs = csrow_attrs, 269 - }; 270 - 271 - static const struct attribute_group *csrow_attr_groups[] = { 272 - &csrow_attr_grp, 273 - NULL 274 - }; 275 - 276 - static const struct device_type csrow_attr_type = { 277 - .groups = csrow_attr_groups, 278 - }; 279 - 280 - /* 281 - * possible dynamic channel DIMM Label attribute files 282 - * 283 - */ 284 - DEVICE_CHANNEL(ch0_dimm_label, S_IRUGO | S_IWUSR, 285 - channel_dimm_label_show, channel_dimm_label_store, 0); 286 - DEVICE_CHANNEL(ch1_dimm_label, S_IRUGO | S_IWUSR, 287 - channel_dimm_label_show, channel_dimm_label_store, 1); 288 - DEVICE_CHANNEL(ch2_dimm_label, S_IRUGO | S_IWUSR, 289 - channel_dimm_label_show, channel_dimm_label_store, 2); 290 - DEVICE_CHANNEL(ch3_dimm_label, S_IRUGO | S_IWUSR, 291 - channel_dimm_label_show, channel_dimm_label_store, 3); 292 - DEVICE_CHANNEL(ch4_dimm_label, S_IRUGO | S_IWUSR, 293 - channel_dimm_label_show, channel_dimm_label_store, 4); 294 - DEVICE_CHANNEL(ch5_dimm_label, S_IRUGO | S_IWUSR, 295 - channel_dimm_label_show, channel_dimm_label_store, 5); 296 - DEVICE_CHANNEL(ch6_dimm_label, S_IRUGO | S_IWUSR, 297 - channel_dimm_label_show, channel_dimm_label_store, 6); 298 - DEVICE_CHANNEL(ch7_dimm_label, S_IRUGO | S_IWUSR, 299 - channel_dimm_label_show, channel_dimm_label_store, 7); 300 - DEVICE_CHANNEL(ch8_dimm_label, S_IRUGO | S_IWUSR, 301 - channel_dimm_label_show, channel_dimm_label_store, 8); 302 - DEVICE_CHANNEL(ch9_dimm_label, S_IRUGO | S_IWUSR, 303 - channel_dimm_label_show, channel_dimm_label_store, 9); 304 - DEVICE_CHANNEL(ch10_dimm_label, S_IRUGO | S_IWUSR, 305 - channel_dimm_label_show, channel_dimm_label_store, 10); 306 - DEVICE_CHANNEL(ch11_dimm_label, S_IRUGO | S_IWUSR, 307 - channel_dimm_label_show, channel_dimm_label_store, 11); 308 - DEVICE_CHANNEL(ch12_dimm_label, S_IRUGO | S_IWUSR, 309 - channel_dimm_label_show, 
channel_dimm_label_store, 12); 310 - DEVICE_CHANNEL(ch13_dimm_label, S_IRUGO | S_IWUSR, 311 - channel_dimm_label_show, channel_dimm_label_store, 13); 312 - DEVICE_CHANNEL(ch14_dimm_label, S_IRUGO | S_IWUSR, 313 - channel_dimm_label_show, channel_dimm_label_store, 14); 314 - DEVICE_CHANNEL(ch15_dimm_label, S_IRUGO | S_IWUSR, 315 - channel_dimm_label_show, channel_dimm_label_store, 15); 316 - 317 - /* Total possible dynamic DIMM Label attribute file table */ 318 - static struct attribute *dynamic_csrow_dimm_attr[] = { 319 - &dev_attr_legacy_ch0_dimm_label.attr.attr, 320 - &dev_attr_legacy_ch1_dimm_label.attr.attr, 321 - &dev_attr_legacy_ch2_dimm_label.attr.attr, 322 - &dev_attr_legacy_ch3_dimm_label.attr.attr, 323 - &dev_attr_legacy_ch4_dimm_label.attr.attr, 324 - &dev_attr_legacy_ch5_dimm_label.attr.attr, 325 - &dev_attr_legacy_ch6_dimm_label.attr.attr, 326 - &dev_attr_legacy_ch7_dimm_label.attr.attr, 327 - &dev_attr_legacy_ch8_dimm_label.attr.attr, 328 - &dev_attr_legacy_ch9_dimm_label.attr.attr, 329 - &dev_attr_legacy_ch10_dimm_label.attr.attr, 330 - &dev_attr_legacy_ch11_dimm_label.attr.attr, 331 - &dev_attr_legacy_ch12_dimm_label.attr.attr, 332 - &dev_attr_legacy_ch13_dimm_label.attr.attr, 333 - &dev_attr_legacy_ch14_dimm_label.attr.attr, 334 - &dev_attr_legacy_ch15_dimm_label.attr.attr, 335 - NULL 336 - }; 337 - 338 - /* possible dynamic channel ce_count attribute files */ 339 - DEVICE_CHANNEL(ch0_ce_count, S_IRUGO, 340 - channel_ce_count_show, NULL, 0); 341 - DEVICE_CHANNEL(ch1_ce_count, S_IRUGO, 342 - channel_ce_count_show, NULL, 1); 343 - DEVICE_CHANNEL(ch2_ce_count, S_IRUGO, 344 - channel_ce_count_show, NULL, 2); 345 - DEVICE_CHANNEL(ch3_ce_count, S_IRUGO, 346 - channel_ce_count_show, NULL, 3); 347 - DEVICE_CHANNEL(ch4_ce_count, S_IRUGO, 348 - channel_ce_count_show, NULL, 4); 349 - DEVICE_CHANNEL(ch5_ce_count, S_IRUGO, 350 - channel_ce_count_show, NULL, 5); 351 - DEVICE_CHANNEL(ch6_ce_count, S_IRUGO, 352 - channel_ce_count_show, NULL, 6); 353 - DEVICE_CHANNEL(ch7_ce_count, S_IRUGO, 354 - channel_ce_count_show, NULL, 7); 355 - DEVICE_CHANNEL(ch8_ce_count, S_IRUGO, 356 - channel_ce_count_show, NULL, 8); 357 - DEVICE_CHANNEL(ch9_ce_count, S_IRUGO, 358 - channel_ce_count_show, NULL, 9); 359 - DEVICE_CHANNEL(ch10_ce_count, S_IRUGO, 360 - channel_ce_count_show, NULL, 10); 361 - DEVICE_CHANNEL(ch11_ce_count, S_IRUGO, 362 - channel_ce_count_show, NULL, 11); 363 - DEVICE_CHANNEL(ch12_ce_count, S_IRUGO, 364 - channel_ce_count_show, NULL, 12); 365 - DEVICE_CHANNEL(ch13_ce_count, S_IRUGO, 366 - channel_ce_count_show, NULL, 13); 367 - DEVICE_CHANNEL(ch14_ce_count, S_IRUGO, 368 - channel_ce_count_show, NULL, 14); 369 - DEVICE_CHANNEL(ch15_ce_count, S_IRUGO, 370 - channel_ce_count_show, NULL, 15); 371 - 372 - /* Total possible dynamic ce_count attribute file table */ 373 - static struct attribute *dynamic_csrow_ce_count_attr[] = { 374 - &dev_attr_legacy_ch0_ce_count.attr.attr, 375 - &dev_attr_legacy_ch1_ce_count.attr.attr, 376 - &dev_attr_legacy_ch2_ce_count.attr.attr, 377 - &dev_attr_legacy_ch3_ce_count.attr.attr, 378 - &dev_attr_legacy_ch4_ce_count.attr.attr, 379 - &dev_attr_legacy_ch5_ce_count.attr.attr, 380 - &dev_attr_legacy_ch6_ce_count.attr.attr, 381 - &dev_attr_legacy_ch7_ce_count.attr.attr, 382 - &dev_attr_legacy_ch8_ce_count.attr.attr, 383 - &dev_attr_legacy_ch9_ce_count.attr.attr, 384 - &dev_attr_legacy_ch10_ce_count.attr.attr, 385 - &dev_attr_legacy_ch11_ce_count.attr.attr, 386 - &dev_attr_legacy_ch12_ce_count.attr.attr, 387 - &dev_attr_legacy_ch13_ce_count.attr.attr, 388 - 
&dev_attr_legacy_ch14_ce_count.attr.attr, 389 - &dev_attr_legacy_ch15_ce_count.attr.attr, 390 - NULL 391 - }; 392 - 393 - static umode_t csrow_dev_is_visible(struct kobject *kobj, 394 - struct attribute *attr, int idx) 395 - { 396 - struct device *dev = kobj_to_dev(kobj); 397 - struct csrow_info *csrow = container_of(dev, struct csrow_info, dev); 398 - 399 - if (idx >= csrow->nr_channels) 400 - return 0; 401 - 402 - if (idx >= ARRAY_SIZE(dynamic_csrow_ce_count_attr) - 1) { 403 - WARN_ONCE(1, "idx: %d\n", idx); 404 - return 0; 405 - } 406 - 407 - /* Only expose populated DIMMs */ 408 - if (!csrow->channels[idx]->dimm->nr_pages) 409 - return 0; 410 - 411 - return attr->mode; 412 - } 413 - 414 - 415 - static const struct attribute_group csrow_dev_dimm_group = { 416 - .attrs = dynamic_csrow_dimm_attr, 417 - .is_visible = csrow_dev_is_visible, 418 - }; 419 - 420 - static const struct attribute_group csrow_dev_ce_count_group = { 421 - .attrs = dynamic_csrow_ce_count_attr, 422 - .is_visible = csrow_dev_is_visible, 423 - }; 424 - 425 - static const struct attribute_group *csrow_dev_groups[] = { 426 - &csrow_dev_dimm_group, 427 - &csrow_dev_ce_count_group, 428 - NULL 429 - }; 430 - 431 - static void csrow_release(struct device *dev) 432 - { 433 - /* 434 - * Nothing to do, just unregister sysfs here. The mci 435 - * device owns the data and will also release it. 436 - */ 437 - } 438 - 439 - static inline int nr_pages_per_csrow(struct csrow_info *csrow) 440 - { 441 - int chan, nr_pages = 0; 442 - 443 - for (chan = 0; chan < csrow->nr_channels; chan++) 444 - nr_pages += csrow->channels[chan]->dimm->nr_pages; 445 - 446 - return nr_pages; 447 - } 448 - 449 - /* Create a CSROW object under specified edac_mc_device */ 450 - static int edac_create_csrow_object(struct mem_ctl_info *mci, 451 - struct csrow_info *csrow, int index) 452 - { 453 - int err; 454 - 455 - csrow->dev.type = &csrow_attr_type; 456 - csrow->dev.groups = csrow_dev_groups; 457 - csrow->dev.release = csrow_release; 458 - device_initialize(&csrow->dev); 459 - csrow->dev.parent = &mci->dev; 460 - csrow->mci = mci; 461 - dev_set_name(&csrow->dev, "csrow%d", index); 462 - dev_set_drvdata(&csrow->dev, csrow); 463 - 464 - err = device_add(&csrow->dev); 465 - if (err) { 466 - edac_dbg(1, "failure: create device %s\n", dev_name(&csrow->dev)); 467 - put_device(&csrow->dev); 468 - return err; 469 - } 470 - 471 - edac_dbg(0, "device %s created\n", dev_name(&csrow->dev)); 472 - 473 - return 0; 474 - } 475 - 476 - /* Create a CSROW object under specified edac_mc_device */ 477 - static int edac_create_csrow_objects(struct mem_ctl_info *mci) 478 - { 479 - int err, i; 480 - struct csrow_info *csrow; 481 - 482 - for (i = 0; i < mci->nr_csrows; i++) { 483 - csrow = mci->csrows[i]; 484 - if (!nr_pages_per_csrow(csrow)) 485 - continue; 486 - err = edac_create_csrow_object(mci, mci->csrows[i], i); 487 - if (err < 0) 488 - goto error; 489 - } 490 - return 0; 491 - 492 - error: 493 - for (--i; i >= 0; i--) { 494 - if (device_is_registered(&mci->csrows[i]->dev)) 495 - device_unregister(&mci->csrows[i]->dev); 496 - } 497 - 498 - return err; 499 - } 500 - 501 - static void edac_delete_csrow_objects(struct mem_ctl_info *mci) 502 - { 503 - int i; 504 - 505 - for (i = 0; i < mci->nr_csrows; i++) { 506 - if (device_is_registered(&mci->csrows[i]->dev)) 507 - device_unregister(&mci->csrows[i]->dev); 508 - } 509 - } 510 - 511 - #endif 512 - 513 118 /* 514 119 * Per-dimm (or per-rank) devices 515 120 */ ··· 594 989 goto fail; 595 990 } 596 991 597 - #ifdef 
CONFIG_EDAC_LEGACY_SYSFS 598 - err = edac_create_csrow_objects(mci); 599 - if (err < 0) 600 - goto fail; 601 - #endif 602 - 603 992 edac_create_debugfs_nodes(mci); 604 993 return 0; 605 994 ··· 617 1018 618 1019 #ifdef CONFIG_EDAC_DEBUG 619 1020 edac_debugfs_remove_recursive(mci->debugfs); 620 - #endif 621 - #ifdef CONFIG_EDAC_LEGACY_SYSFS 622 - edac_delete_csrow_objects(mci); 623 1021 #endif 624 1022 625 1023 mci_for_each_dimm(mci, dimm) {
+4 -3
drivers/edac/ghes_edac.c
···
 #include "edac_module.h"
 #include <ras/ras_event.h>
 #include <linux/notifier.h>
+#include <linux/string.h>
 
 #define OTHER_DETAIL_LEN 400
 
···
         p = pvt->msg;
         p += snprintf(p, sizeof(pvt->msg), "%s", cper_mem_err_type_str(etype));
     } else {
-        strcpy(pvt->msg, "unknown error");
+        strscpy(pvt->msg, "unknown error");
     }
 
     /* Error address */
···
         dimm = find_dimm_by_handle(mci, mem_err->mem_dev_handle);
         if (dimm) {
             e->top_layer = dimm->idx;
-            strcpy(e->label, dimm->label);
+            strscpy(e->label, dimm->label);
         }
     }
     if (p > e->location)
         *(p - 1) = '\0';
 
     if (!*e->label)
-        strcpy(e->label, "unknown memory");
+        strscpy(e->label, "unknown memory");
 
     /* All other fields are mapped on e->other_detail */
     p = pvt->other_detail;
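
The strcpy() replacements rely on two properties of strscpy(): the copy is bounded by the destination (the two-argument form infers the size when the destination is a true array) and the result is always NUL-terminated. A small kernel-style sketch of that behavior; the struct and function are illustrative, not from ghes_edac:

#include <linux/errno.h>
#include <linux/printk.h>
#include <linux/string.h>

struct err_msg {
    char label[32];        /* fixed-size destination, like e->label */
};

static void set_label(struct err_msg *e, const char *src)
{
    /* Bounded by sizeof(e->label), always NUL-terminated; returns
     * -E2BIG when the source had to be truncated, which strcpy()
     * would instead turn into a buffer overflow. */
    if (strscpy(e->label, src) == -E2BIG)
        pr_debug("label truncated to \"%s\"\n", e->label);
}
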
+2 -1
drivers/edac/i10nm_base.c
···
             d->imc[i].num_dimms = cfg->ddr_dimm_num;
         }
 
-        rc = skx_register_mci(&d->imc[i], d->imc[i].mdev,
+        rc = skx_register_mci(&d->imc[i], &d->imc[i].mdev->dev,
+                              pci_name(d->imc[i].mdev),
                               "Intel_10nm Socket", EDAC_MOD_STR,
                               i10nm_get_dimm_config, cfg);
         if (rc < 0)
+2
drivers/edac/ie31200_edac.c
···
     ie31200_pvt.priv[mc] = priv;
     return 0;
 fail_unmap:
+    put_device(&priv->dev);
     iounmap(window);
 fail_free:
     edac_mc_free(mci);
···
         mci = priv->mci;
         edac_mc_del_mc(mci->pdev);
         iounmap(priv->window);
+        put_device(&priv->dev);
         edac_mc_free(mci);
     }
 }
+2
drivers/edac/igen6_edac.c
···
     imc->mci = mci;
     return 0;
 fail3:
+    put_device(&imc->dev);
     mci->pvt_info = NULL;
     kfree(mci->ctl_name);
 fail2:
···
         kfree(mci->ctl_name);
         mci->pvt_info = NULL;
         edac_mc_free(mci);
+        put_device(&imc->dev);
         iounmap(imc->window);
     }
 }
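
The ie31200 and igen6 fixes above apply the same driver-core rule: once device_initialize() has run, the embedded kobject owns a reference, so every failure path (and the teardown path) must drop it with put_device() rather than freeing the containing structure directly. A generic sketch of the pattern, with a hypothetical container type:

#include <linux/device.h>
#include <linux/slab.h>

struct my_imc {            /* hypothetical container */
    struct device dev;
    int idx;
};

static void my_imc_release(struct device *dev)
{
    kfree(container_of(dev, struct my_imc, dev));
}

static int my_imc_register(struct my_imc *imc)
{
    int ret;

    imc->dev.release = my_imc_release;    /* must be set before use */
    device_initialize(&imc->dev);         /* reference is live from here on */

    ret = dev_set_name(&imc->dev, "imc%d", imc->idx);
    if (ret)
        goto out_put;

    ret = device_add(&imc->dev);
    if (ret)
        goto out_put;

    return 0;

out_put:
    put_device(&imc->dev);    /* frees via dev.release, never kfree() here */
    return ret;
}
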
+602
drivers/edac/imh_base.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Driver for Intel(R) servers with Integrated Memory/IO Hub-based memory controller. 4 + * Copyright (c) 2025, Intel Corporation. 5 + */ 6 + 7 + #include <linux/kernel.h> 8 + #include <linux/io.h> 9 + #include <asm/cpu_device_id.h> 10 + #include <asm/intel-family.h> 11 + #include <asm/mce.h> 12 + #include <asm/cpu.h> 13 + #include "edac_module.h" 14 + #include "skx_common.h" 15 + 16 + #define IMH_REVISION "v0.0.1" 17 + #define EDAC_MOD_STR "imh_edac" 18 + 19 + /* Debug macros */ 20 + #define imh_printk(level, fmt, arg...) \ 21 + edac_printk(level, "imh", fmt, ##arg) 22 + 23 + /* Configuration Agent(Ubox) */ 24 + #define MMIO_BASE_H(reg) (((u64)GET_BITFIELD(reg, 0, 29)) << 23) 25 + #define SOCKET_ID(reg) GET_BITFIELD(reg, 0, 3) 26 + 27 + /* PUNIT */ 28 + #define DDR_IMC_BITMAP(reg) GET_BITFIELD(reg, 23, 30) 29 + 30 + /* Memory Controller */ 31 + #define ECC_ENABLED(reg) GET_BITFIELD(reg, 2, 2) 32 + #define DIMM_POPULATED(reg) GET_BITFIELD(reg, 15, 15) 33 + 34 + /* System Cache Agent(SCA) */ 35 + #define TOLM(reg) (((u64)GET_BITFIELD(reg, 16, 31)) << 16) 36 + #define TOHM(reg) (((u64)GET_BITFIELD(reg, 16, 51)) << 16) 37 + 38 + /* Home Agent (HA) */ 39 + #define NMCACHING(reg) GET_BITFIELD(reg, 8, 8) 40 + 41 + /** 42 + * struct local_reg - A register as described in the local package view. 43 + * 44 + * @pkg: (input) The package where the register is located. 45 + * @pbase: (input) The IP MMIO base physical address in the local package view. 46 + * @size: (input) The IP MMIO size. 47 + * @offset: (input) The register offset from the IP MMIO base @pbase. 48 + * @width: (input) The register width in byte. 49 + * @vbase: (internal) The IP MMIO base virtual address. 50 + * @val: (output) The register value. 51 + */ 52 + struct local_reg { 53 + int pkg; 54 + u64 pbase; 55 + u32 size; 56 + u32 offset; 57 + u8 width; 58 + void __iomem *vbase; 59 + u64 val; 60 + }; 61 + 62 + #define DEFINE_LOCAL_REG(name, cfg, package, north, ip_name, ip_idx, reg_name) \ 63 + struct local_reg name = { \ 64 + .pkg = package, \ 65 + .pbase = (north ? (cfg)->mmio_base_l_north : \ 66 + (cfg)->mmio_base_l_south) + \ 67 + (cfg)->ip_name##_base + \ 68 + (cfg)->ip_name##_size * (ip_idx), \ 69 + .size = (cfg)->ip_name##_size, \ 70 + .offset = (cfg)->ip_name##_reg_##reg_name##_offset, \ 71 + .width = (cfg)->ip_name##_reg_##reg_name##_width, \ 72 + } 73 + 74 + static u64 readx(void __iomem *addr, u8 width) 75 + { 76 + switch (width) { 77 + case 1: 78 + return readb(addr); 79 + case 2: 80 + return readw(addr); 81 + case 4: 82 + return readl(addr); 83 + case 8: 84 + return readq(addr); 85 + default: 86 + imh_printk(KERN_ERR, "Invalid reg 0x%p width %d\n", addr, width); 87 + return 0; 88 + } 89 + } 90 + 91 + static void __read_local_reg(void *reg) 92 + { 93 + struct local_reg *r = (struct local_reg *)reg; 94 + 95 + r->val = readx(r->vbase + r->offset, r->width); 96 + } 97 + 98 + /* Read a local-view register. */ 99 + static bool read_local_reg(struct local_reg *reg) 100 + { 101 + int cpu; 102 + 103 + /* Get the target CPU in the package @reg->pkg. */ 104 + for_each_online_cpu(cpu) { 105 + if (reg->pkg == topology_physical_package_id(cpu)) 106 + break; 107 + } 108 + 109 + if (cpu >= nr_cpu_ids) 110 + return false; 111 + 112 + reg->vbase = ioremap(reg->pbase, reg->size); 113 + if (!reg->vbase) { 114 + imh_printk(KERN_ERR, "Failed to ioremap 0x%llx\n", reg->pbase); 115 + return false; 116 + } 117 + 118 + /* Get the target CPU to read the register. 
*/ 119 + smp_call_function_single(cpu, __read_local_reg, reg, 1); 120 + iounmap(reg->vbase); 121 + 122 + return true; 123 + } 124 + 125 + /* Get the bitmap of memory controller instances in package @pkg. */ 126 + static u32 get_imc_bitmap(struct res_config *cfg, int pkg, bool north) 127 + { 128 + DEFINE_LOCAL_REG(reg, cfg, pkg, north, pcu, 0, capid3); 129 + 130 + if (!read_local_reg(&reg)) 131 + return 0; 132 + 133 + edac_dbg(2, "Pkg%d %s mc instances bitmap 0x%llx (reg 0x%llx)\n", 134 + pkg, north ? "north" : "south", 135 + DDR_IMC_BITMAP(reg.val), reg.val); 136 + 137 + return DDR_IMC_BITMAP(reg.val); 138 + } 139 + 140 + static void imc_release(struct device *dev) 141 + { 142 + edac_dbg(2, "imc device %s released\n", dev_name(dev)); 143 + kfree(dev); 144 + } 145 + 146 + static int __get_ddr_munits(struct res_config *cfg, struct skx_dev *d, 147 + bool north, int lmc) 148 + { 149 + unsigned long size = cfg->ddr_chan_mmio_sz * cfg->ddr_chan_num; 150 + unsigned long bitmap = get_imc_bitmap(cfg, d->pkg, north); 151 + void __iomem *mbase; 152 + struct device *dev; 153 + int i, rc, pmc; 154 + u64 base; 155 + 156 + for_each_set_bit(i, &bitmap, sizeof(bitmap) * 8) { 157 + base = north ? d->mmio_base_h_north : d->mmio_base_h_south; 158 + base += cfg->ddr_imc_base + size * i; 159 + 160 + edac_dbg(2, "Pkg%d mc%d mmio base 0x%llx size 0x%lx\n", 161 + d->pkg, lmc, base, size); 162 + 163 + /* Set up the imc MMIO. */ 164 + mbase = ioremap(base, size); 165 + if (!mbase) { 166 + imh_printk(KERN_ERR, "Failed to ioremap 0x%llx\n", base); 167 + return -ENOMEM; 168 + } 169 + 170 + d->imc[lmc].mbase = mbase; 171 + d->imc[lmc].lmc = lmc; 172 + 173 + /* Create the imc device instance. */ 174 + dev = kzalloc(sizeof(*dev), GFP_KERNEL); 175 + if (!dev) 176 + return -ENOMEM; 177 + 178 + dev->release = imc_release; 179 + device_initialize(dev); 180 + rc = dev_set_name(dev, "0x%llx", base); 181 + if (rc) { 182 + imh_printk(KERN_ERR, "Failed to set dev name\n"); 183 + put_device(dev); 184 + return rc; 185 + } 186 + 187 + d->imc[lmc].dev = dev; 188 + 189 + /* Set up the imc index mapping. */ 190 + pmc = north ? i : 8 + i; 191 + skx_set_mc_mapping(d, pmc, lmc); 192 + 193 + lmc++; 194 + } 195 + 196 + return lmc; 197 + } 198 + 199 + static bool get_ddr_munits(struct res_config *cfg, struct skx_dev *d) 200 + { 201 + int lmc = __get_ddr_munits(cfg, d, true, 0); 202 + 203 + if (lmc < 0) 204 + return false; 205 + 206 + lmc = __get_ddr_munits(cfg, d, false, lmc); 207 + if (lmc <= 0) 208 + return false; 209 + 210 + return true; 211 + } 212 + 213 + static bool get_socket_id(struct res_config *cfg, struct skx_dev *d) 214 + { 215 + DEFINE_LOCAL_REG(reg, cfg, d->pkg, true, ubox, 0, socket_id); 216 + u8 src_id; 217 + int i; 218 + 219 + if (!read_local_reg(&reg)) 220 + return false; 221 + 222 + src_id = SOCKET_ID(reg.val); 223 + edac_dbg(2, "socket id 0x%x (reg 0x%llx)\n", src_id, reg.val); 224 + 225 + for (i = 0; i < cfg->ddr_imc_num; i++) 226 + d->imc[i].src_id = src_id; 227 + 228 + return true; 229 + } 230 + 231 + /* Get TOLM (Top Of Low Memory) and TOHM (Top Of High Memory) parameters. 
*/ 232 + static bool imh_get_tolm_tohm(struct res_config *cfg, u64 *tolm, u64 *tohm) 233 + { 234 + DEFINE_LOCAL_REG(reg, cfg, 0, true, sca, 0, tolm); 235 + 236 + if (!read_local_reg(&reg)) 237 + return false; 238 + 239 + *tolm = TOLM(reg.val); 240 + edac_dbg(2, "tolm 0x%llx (reg 0x%llx)\n", *tolm, reg.val); 241 + 242 + DEFINE_LOCAL_REG(reg2, cfg, 0, true, sca, 0, tohm); 243 + 244 + if (!read_local_reg(&reg2)) 245 + return false; 246 + 247 + *tohm = TOHM(reg2.val); 248 + edac_dbg(2, "tohm 0x%llx (reg 0x%llx)\n", *tohm, reg2.val); 249 + 250 + return true; 251 + } 252 + 253 + /* Get the system-view MMIO_BASE_H for {north,south}-IMH. */ 254 + static int imh_get_all_mmio_base_h(struct res_config *cfg, struct list_head *edac_list) 255 + { 256 + int i, n = topology_max_packages(), imc_num = cfg->ddr_imc_num + cfg->hbm_imc_num; 257 + struct skx_dev *d; 258 + 259 + for (i = 0; i < n; i++) { 260 + d = kzalloc(struct_size(d, imc, imc_num), GFP_KERNEL); 261 + if (!d) 262 + return -ENOMEM; 263 + 264 + DEFINE_LOCAL_REG(reg, cfg, i, true, ubox, 0, mmio_base); 265 + 266 + /* Get MMIO_BASE_H for the north-IMH. */ 267 + if (!read_local_reg(&reg) || !reg.val) { 268 + kfree(d); 269 + imh_printk(KERN_ERR, "Pkg%d has no north mmio_base_h\n", i); 270 + return -ENODEV; 271 + } 272 + 273 + d->mmio_base_h_north = MMIO_BASE_H(reg.val); 274 + edac_dbg(2, "Pkg%d north mmio_base_h 0x%llx (reg 0x%llx)\n", 275 + i, d->mmio_base_h_north, reg.val); 276 + 277 + /* Get MMIO_BASE_H for the south-IMH (optional). */ 278 + DEFINE_LOCAL_REG(reg2, cfg, i, false, ubox, 0, mmio_base); 279 + 280 + if (read_local_reg(&reg2)) { 281 + d->mmio_base_h_south = MMIO_BASE_H(reg2.val); 282 + edac_dbg(2, "Pkg%d south mmio_base_h 0x%llx (reg 0x%llx)\n", 283 + i, d->mmio_base_h_south, reg2.val); 284 + } 285 + 286 + d->pkg = i; 287 + d->num_imc = imc_num; 288 + skx_init_mc_mapping(d); 289 + list_add_tail(&d->list, edac_list); 290 + } 291 + 292 + return 0; 293 + } 294 + 295 + /* Get the number of per-package memory controllers. */ 296 + static int imh_get_imc_num(struct res_config *cfg) 297 + { 298 + int imc_num = hweight32(get_imc_bitmap(cfg, 0, true)) + 299 + hweight32(get_imc_bitmap(cfg, 0, false)); 300 + 301 + if (!imc_num) { 302 + imh_printk(KERN_ERR, "Invalid mc number\n"); 303 + return -ENODEV; 304 + } 305 + 306 + if (cfg->ddr_imc_num != imc_num) { 307 + /* 308 + * Update the configuration data to reflect the number of 309 + * present DDR memory controllers. 310 + */ 311 + cfg->ddr_imc_num = imc_num; 312 + edac_dbg(2, "Set ddr mc number %d\n", imc_num); 313 + } 314 + 315 + return 0; 316 + } 317 + 318 + /* Get all memory controllers' parameters. 
*/ 319 + static int imh_get_munits(struct res_config *cfg, struct list_head *edac_list) 320 + { 321 + struct skx_imc *imc; 322 + struct skx_dev *d; 323 + u8 mc = 0; 324 + int i; 325 + 326 + list_for_each_entry(d, edac_list, list) { 327 + if (!get_ddr_munits(cfg, d)) { 328 + imh_printk(KERN_ERR, "No mc found\n"); 329 + return -ENODEV; 330 + } 331 + 332 + if (!get_socket_id(cfg, d)) { 333 + imh_printk(KERN_ERR, "Failed to get socket id\n"); 334 + return -ENODEV; 335 + } 336 + 337 + for (i = 0; i < cfg->ddr_imc_num; i++) { 338 + imc = &d->imc[i]; 339 + if (!imc->mbase) 340 + continue; 341 + 342 + imc->chan_mmio_sz = cfg->ddr_chan_mmio_sz; 343 + imc->num_channels = cfg->ddr_chan_num; 344 + imc->num_dimms = cfg->ddr_dimm_num; 345 + imc->mc = mc++; 346 + } 347 + } 348 + 349 + return 0; 350 + } 351 + 352 + static bool check_2lm_enabled(struct res_config *cfg, struct skx_dev *d, int ha_idx) 353 + { 354 + DEFINE_LOCAL_REG(reg, cfg, d->pkg, true, ha, ha_idx, mode); 355 + 356 + if (!read_local_reg(&reg)) 357 + return false; 358 + 359 + if (!NMCACHING(reg.val)) 360 + return false; 361 + 362 + edac_dbg(2, "2-level memory configuration (reg 0x%llx, ha idx %d)\n", reg.val, ha_idx); 363 + return true; 364 + } 365 + 366 + /* Check whether the system has a 2-level memory configuration. */ 367 + static bool imh_2lm_enabled(struct res_config *cfg, struct list_head *head) 368 + { 369 + struct skx_dev *d; 370 + int i; 371 + 372 + list_for_each_entry(d, head, list) { 373 + for (i = 0; i < cfg->ddr_imc_num; i++) 374 + if (check_2lm_enabled(cfg, d, i)) 375 + return true; 376 + } 377 + 378 + return false; 379 + } 380 + 381 + /* Helpers to read memory controller registers */ 382 + static u64 read_imc_reg(struct skx_imc *imc, int chan, u32 offset, u8 width) 383 + { 384 + return readx(imc->mbase + imc->chan_mmio_sz * chan + offset, width); 385 + } 386 + 387 + static u32 read_imc_mcmtr(struct res_config *cfg, struct skx_imc *imc, int chan) 388 + { 389 + return (u32)read_imc_reg(imc, chan, cfg->ddr_reg_mcmtr_offset, cfg->ddr_reg_mcmtr_width); 390 + } 391 + 392 + static u32 read_imc_dimmmtr(struct res_config *cfg, struct skx_imc *imc, int chan, int dimm) 393 + { 394 + return (u32)read_imc_reg(imc, chan, cfg->ddr_reg_dimmmtr_offset + 395 + cfg->ddr_reg_dimmmtr_width * dimm, 396 + cfg->ddr_reg_dimmmtr_width); 397 + } 398 + 399 + static bool ecc_enabled(u32 mcmtr) 400 + { 401 + return (bool)ECC_ENABLED(mcmtr); 402 + } 403 + 404 + static bool dimm_populated(u32 dimmmtr) 405 + { 406 + return (bool)DIMM_POPULATED(dimmmtr); 407 + } 408 + 409 + /* Get each DIMM's configurations of the memory controller @mci. 
*/ 410 + static int imh_get_dimm_config(struct mem_ctl_info *mci, struct res_config *cfg) 411 + { 412 + struct skx_pvt *pvt = mci->pvt_info; 413 + struct skx_imc *imc = pvt->imc; 414 + struct dimm_info *dimm; 415 + u32 mcmtr, dimmmtr; 416 + int i, j, ndimms; 417 + 418 + for (i = 0; i < imc->num_channels; i++) { 419 + if (!imc->mbase) 420 + continue; 421 + 422 + mcmtr = read_imc_mcmtr(cfg, imc, i); 423 + 424 + for (ndimms = 0, j = 0; j < imc->num_dimms; j++) { 425 + dimmmtr = read_imc_dimmmtr(cfg, imc, i, j); 426 + edac_dbg(1, "mcmtr 0x%x dimmmtr 0x%x (mc%d ch%d dimm%d)\n", 427 + mcmtr, dimmmtr, imc->mc, i, j); 428 + 429 + if (!dimm_populated(dimmmtr)) 430 + continue; 431 + 432 + dimm = edac_get_dimm(mci, i, j, 0); 433 + ndimms += skx_get_dimm_info(dimmmtr, 0, 0, dimm, 434 + imc, i, j, cfg); 435 + } 436 + 437 + if (ndimms && !ecc_enabled(mcmtr)) { 438 + imh_printk(KERN_ERR, "ECC is disabled on mc%d ch%d\n", 439 + imc->mc, i); 440 + return -ENODEV; 441 + } 442 + } 443 + 444 + return 0; 445 + } 446 + 447 + /* Register all memory controllers to the EDAC core. */ 448 + static int imh_register_mci(struct res_config *cfg, struct list_head *edac_list) 449 + { 450 + struct skx_imc *imc; 451 + struct skx_dev *d; 452 + int i, rc; 453 + 454 + list_for_each_entry(d, edac_list, list) { 455 + for (i = 0; i < cfg->ddr_imc_num; i++) { 456 + imc = &d->imc[i]; 457 + if (!imc->mbase) 458 + continue; 459 + 460 + rc = skx_register_mci(imc, imc->dev, 461 + dev_name(imc->dev), 462 + "Intel IMH-based Socket", 463 + EDAC_MOD_STR, 464 + imh_get_dimm_config, cfg); 465 + if (rc) 466 + return rc; 467 + } 468 + } 469 + 470 + return 0; 471 + } 472 + 473 + static struct res_config dmr_cfg = { 474 + .type = DMR, 475 + .support_ddr5 = true, 476 + .mmio_base_l_north = 0xf6800000, 477 + .mmio_base_l_south = 0xf6000000, 478 + .ddr_chan_num = 1, 479 + .ddr_dimm_num = 2, 480 + .ddr_imc_base = 0x39b000, 481 + .ddr_chan_mmio_sz = 0x8000, 482 + .ddr_reg_mcmtr_offset = 0x360, 483 + .ddr_reg_mcmtr_width = 4, 484 + .ddr_reg_dimmmtr_offset = 0x370, 485 + .ddr_reg_dimmmtr_width = 4, 486 + .ubox_base = 0x0, 487 + .ubox_size = 0x2000, 488 + .ubox_reg_mmio_base_offset = 0x580, 489 + .ubox_reg_mmio_base_width = 4, 490 + .ubox_reg_socket_id_offset = 0x1080, 491 + .ubox_reg_socket_id_width = 4, 492 + .pcu_base = 0x3000, 493 + .pcu_size = 0x10000, 494 + .pcu_reg_capid3_offset = 0x290, 495 + .pcu_reg_capid3_width = 4, 496 + .sca_base = 0x24c000, 497 + .sca_size = 0x2500, 498 + .sca_reg_tolm_offset = 0x2100, 499 + .sca_reg_tolm_width = 8, 500 + .sca_reg_tohm_offset = 0x2108, 501 + .sca_reg_tohm_width = 8, 502 + .ha_base = 0x3eb000, 503 + .ha_size = 0x1000, 504 + .ha_reg_mode_offset = 0x4a0, 505 + .ha_reg_mode_width = 4, 506 + }; 507 + 508 + static const struct x86_cpu_id imh_cpuids[] = { 509 + X86_MATCH_VFM(INTEL_DIAMONDRAPIDS_X, &dmr_cfg), 510 + {} 511 + }; 512 + MODULE_DEVICE_TABLE(x86cpu, imh_cpuids); 513 + 514 + static struct notifier_block imh_mce_dec = { 515 + .notifier_call = skx_mce_check_error, 516 + .priority = MCE_PRIO_EDAC, 517 + }; 518 + 519 + static int __init imh_init(void) 520 + { 521 + const struct x86_cpu_id *id; 522 + struct list_head *edac_list; 523 + struct res_config *cfg; 524 + const char *owner; 525 + u64 tolm, tohm; 526 + int rc; 527 + 528 + edac_dbg(2, "\n"); 529 + 530 + if (ghes_get_devices()) 531 + return -EBUSY; 532 + 533 + owner = edac_get_owner(); 534 + if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR))) 535 + return -EBUSY; 536 + 537 + if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) 538 + return 
-ENODEV; 539 + 540 + id = x86_match_cpu(imh_cpuids); 541 + if (!id) 542 + return -ENODEV; 543 + cfg = (struct res_config *)id->driver_data; 544 + skx_set_res_cfg(cfg); 545 + 546 + if (!imh_get_tolm_tohm(cfg, &tolm, &tohm)) 547 + return -ENODEV; 548 + 549 + skx_set_hi_lo(tolm, tohm); 550 + 551 + rc = imh_get_imc_num(cfg); 552 + if (rc < 0) 553 + goto fail; 554 + 555 + edac_list = skx_get_edac_list(); 556 + 557 + rc = imh_get_all_mmio_base_h(cfg, edac_list); 558 + if (rc) 559 + goto fail; 560 + 561 + rc = imh_get_munits(cfg, edac_list); 562 + if (rc) 563 + goto fail; 564 + 565 + skx_set_mem_cfg(imh_2lm_enabled(cfg, edac_list)); 566 + 567 + rc = imh_register_mci(cfg, edac_list); 568 + if (rc) 569 + goto fail; 570 + 571 + rc = skx_adxl_get(); 572 + if (rc) 573 + goto fail; 574 + 575 + opstate_init(); 576 + mce_register_decode_chain(&imh_mce_dec); 577 + skx_setup_debug("imh_test"); 578 + 579 + imh_printk(KERN_INFO, "%s\n", IMH_REVISION); 580 + 581 + return 0; 582 + fail: 583 + skx_remove(); 584 + return rc; 585 + } 586 + 587 + static void __exit imh_exit(void) 588 + { 589 + edac_dbg(2, "\n"); 590 + 591 + skx_teardown_debug(); 592 + mce_unregister_decode_chain(&imh_mce_dec); 593 + skx_adxl_put(); 594 + skx_remove(); 595 + } 596 + 597 + module_init(imh_init); 598 + module_exit(imh_exit); 599 + 600 + MODULE_LICENSE("GPL"); 601 + MODULE_AUTHOR("Qiuxu Zhuo"); 602 + MODULE_DESCRIPTION("MC Driver for Intel servers using IMH-based memory controller");
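
Most register access in the new driver goes through DEFINE_LOCAL_REG(), which token-pastes the IP name and register name into the corresponding res_config fields, plus read_local_reg(), which ioremaps the computed location and reads it from a CPU on the target package. For example, the capid3 lookup in get_imc_bitmap() above, with north == true, expands to roughly:

/* DEFINE_LOCAL_REG(reg, cfg, pkg, true, pcu, 0, capid3) becomes: */
struct local_reg reg = {
    .pkg    = pkg,
    .pbase  = cfg->mmio_base_l_north + cfg->pcu_base + cfg->pcu_size * 0,
    .size   = cfg->pcu_size,
    .offset = cfg->pcu_reg_capid3_offset,
    .width  = cfg->pcu_reg_capid3_width,
};
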
+2 -2
drivers/edac/skx_base.c
···
         d->imc[i].src_id = src_id;
         d->imc[i].num_channels = cfg->ddr_chan_num;
         d->imc[i].num_dimms = cfg->ddr_dimm_num;
-
-        rc = skx_register_mci(&d->imc[i], d->imc[i].chan[0].cdev,
+        rc = skx_register_mci(&d->imc[i], &d->imc[i].chan[0].cdev->dev,
+                              pci_name(d->imc[i].chan[0].cdev),
                               "Skylake Socket", EDAC_MOD_STR,
                               skx_get_dimm_config, cfg);
         if (rc < 0)
+25 -8
drivers/edac/skx_common.c
··· 124 124 } 125 125 EXPORT_SYMBOL_GPL(skx_adxl_put); 126 126 127 - static void skx_init_mc_mapping(struct skx_dev *d) 127 + void skx_init_mc_mapping(struct skx_dev *d) 128 128 { 129 129 /* 130 130 * By default, the BIOS presents all memory controllers within each ··· 135 135 for (int i = 0; i < d->num_imc; i++) 136 136 d->imc[i].mc_mapping = i; 137 137 } 138 + EXPORT_SYMBOL_GPL(skx_init_mc_mapping); 138 139 139 140 void skx_set_mc_mapping(struct skx_dev *d, u8 pmc, u8 lmc) 140 141 { ··· 385 384 } 386 385 EXPORT_SYMBOL_GPL(skx_get_all_bus_mappings); 387 386 387 + struct list_head *skx_get_edac_list(void) 388 + { 389 + return &dev_edac_list; 390 + } 391 + EXPORT_SYMBOL_GPL(skx_get_edac_list); 392 + 388 393 int skx_get_hi_lo(unsigned int did, int off[], u64 *tolm, u64 *tohm) 389 394 { 390 395 struct pci_dev *pdev; ··· 431 424 } 432 425 EXPORT_SYMBOL_GPL(skx_get_hi_lo); 433 426 427 + void skx_set_hi_lo(u64 tolm, u64 tohm) 428 + { 429 + skx_tolm = tolm; 430 + skx_tohm = tohm; 431 + } 432 + EXPORT_SYMBOL_GPL(skx_set_hi_lo); 433 + 434 434 static int skx_get_dimm_attr(u32 reg, int lobit, int hibit, int add, 435 435 int minval, int maxval, const char *name) 436 436 { ··· 451 437 } 452 438 453 439 #define numrank(reg) skx_get_dimm_attr(reg, 12, 13, 0, 0, 2, "ranks") 454 - #define numrow(reg) skx_get_dimm_attr(reg, 2, 4, 12, 1, 6, "rows") 440 + #define numrow(reg) skx_get_dimm_attr(reg, 2, 4, 12, 1, 7, "rows") 455 441 #define numcol(reg) skx_get_dimm_attr(reg, 0, 1, 10, 0, 2, "cols") 456 442 457 443 int skx_get_dimm_info(u32 mtr, u32 mcmtr, u32 amap, struct dimm_info *dimm, ··· 559 545 } 560 546 EXPORT_SYMBOL_GPL(skx_get_nvdimm_info); 561 547 562 - int skx_register_mci(struct skx_imc *imc, struct pci_dev *pdev, 563 - const char *ctl_name, const char *mod_str, 564 - get_dimm_config_f get_dimm_config, 548 + int skx_register_mci(struct skx_imc *imc, struct device *dev, 549 + const char *dev_name, const char *ctl_name, 550 + const char *mod_str, get_dimm_config_f get_dimm_config, 565 551 struct res_config *cfg) 566 552 { 567 553 struct mem_ctl_info *mci; ··· 602 588 mci->edac_ctl_cap = EDAC_FLAG_NONE; 603 589 mci->edac_cap = EDAC_FLAG_NONE; 604 590 mci->mod_name = mod_str; 605 - mci->dev_name = pci_name(pdev); 591 + mci->dev_name = dev_name; 606 592 mci->ctl_page_to_phys = NULL; 607 593 608 594 rc = get_dimm_config(mci, cfg); ··· 610 596 goto fail; 611 597 612 598 /* Record ptr to the generic device */ 613 - mci->pdev = &pdev->dev; 599 + mci->pdev = dev; 614 600 615 601 /* Add this new MC control structure to EDAC's list of MCs */ 616 602 if (unlikely(edac_mc_add_mc(mci))) { ··· 824 810 if (d->imc[i].mbase) 825 811 iounmap(d->imc[i].mbase); 826 812 813 + if (d->imc[i].dev) 814 + put_device(d->imc[i].dev); 815 + 827 816 for (j = 0; j < d->imc[i].num_channels; j++) { 828 817 if (d->imc[i].chan[j].cdev) 829 818 pci_dev_put(d->imc[i].chan[j].cdev); ··· 850 833 /* 851 834 * Debug feature. 852 835 * Exercise the address decode logic by writing an address to 853 - * /sys/kernel/debug/edac/{skx,i10nm}_test/addr. 836 + * /sys/kernel/debug/edac/{skx,i10nm,imh}_test/addr. 854 837 */ 855 838 static struct dentry *skx_test; 856 839
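
skx_register_mci() no longer assumes a PCI-backed memory controller: it now takes a plain struct device plus a device-name string, so MMIO-discovered controllers (imh_edac) can register alongside the PCI-discovered ones. The two call styles, as they appear in the callers touched by this series:

/* PCI-backed controller (skx_edac / i10nm_edac): */
rc = skx_register_mci(&d->imc[i], &d->imc[i].mdev->dev,
                      pci_name(d->imc[i].mdev),
                      "Intel_10nm Socket", EDAC_MOD_STR,
                      i10nm_get_dimm_config, cfg);

/* MMIO-only controller (imh_edac), backed by a bare struct device: */
rc = skx_register_mci(imc, imc->dev, dev_name(imc->dev),
                      "Intel IMH-based Socket", EDAC_MOD_STR,
                      imh_get_dimm_config, cfg);
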
+73 -25
drivers/edac/skx_common.h
··· 121 121 * memory controllers on the die. 122 122 */ 123 123 struct skx_dev { 124 - struct list_head list; 124 + /* {skx,i10nm}_edac */ 125 125 u8 bus[4]; 126 126 int seg; 127 127 struct pci_dev *sad_all; 128 128 struct pci_dev *util_all; 129 - struct pci_dev *uracu; /* for i10nm CPU */ 130 - struct pci_dev *pcu_cr3; /* for HBM memory detection */ 129 + struct pci_dev *uracu; 130 + struct pci_dev *pcu_cr3; 131 131 u32 mcroute; 132 + 133 + /* imh_edac */ 134 + /* System-view MMIO base physical addresses. */ 135 + u64 mmio_base_h_north; 136 + u64 mmio_base_h_south; 137 + int pkg; 138 + 132 139 int num_imc; 140 + struct list_head list; 133 141 struct skx_imc { 142 + /* i10nm_edac */ 143 + struct pci_dev *mdev; 144 + 145 + /* imh_edac */ 146 + struct device *dev; 147 + 134 148 struct mem_ctl_info *mci; 135 - struct pci_dev *mdev; /* for i10nm CPU */ 136 - void __iomem *mbase; /* for i10nm CPU */ 137 - int chan_mmio_sz; /* for i10nm CPU */ 149 + void __iomem *mbase; 150 + int chan_mmio_sz; 138 151 int num_channels; /* channels per memory controller */ 139 152 int num_dimms; /* dimms per channel */ 140 153 bool hbm_mc; ··· 191 178 SKX, 192 179 I10NM, 193 180 SPR, 194 - GNR 181 + GNR, 182 + DMR, 195 183 }; 196 184 197 185 enum { ··· 251 237 252 238 struct res_config { 253 239 enum type type; 254 - /* Configuration agent device ID */ 255 - unsigned int decs_did; 256 - /* Default bus number configuration register offset */ 257 - int busno_cfg_offset; 258 240 /* DDR memory controllers per socket */ 259 241 int ddr_imc_num; 260 242 /* DDR channels per DDR memory controller */ ··· 268 258 /* Per HBM channel memory-mapped I/O size */ 269 259 int hbm_chan_mmio_sz; 270 260 bool support_ddr5; 271 - /* SAD device BDF */ 272 - struct pci_bdf sad_all_bdf; 273 - /* PCU device BDF */ 274 - struct pci_bdf pcu_cr3_bdf; 275 - /* UTIL device BDF */ 276 - struct pci_bdf util_all_bdf; 277 - /* URACU device BDF */ 278 - struct pci_bdf uracu_bdf; 279 - /* DDR mdev device BDF */ 280 - struct pci_bdf ddr_mdev_bdf; 281 - /* HBM mdev device BDF */ 282 - struct pci_bdf hbm_mdev_bdf; 283 - int sad_all_offset; 284 261 /* RRL register sets per DDR channel */ 285 262 struct reg_rrl *reg_rrl_ddr; 286 263 /* RRL register sets per HBM channel */ 287 264 struct reg_rrl *reg_rrl_hbm[2]; 265 + union { 266 + /* {skx,i10nm}_edac */ 267 + struct { 268 + /* Configuration agent device ID */ 269 + unsigned int decs_did; 270 + /* Default bus number configuration register offset */ 271 + int busno_cfg_offset; 272 + struct pci_bdf sad_all_bdf; 273 + struct pci_bdf pcu_cr3_bdf; 274 + struct pci_bdf util_all_bdf; 275 + struct pci_bdf uracu_bdf; 276 + struct pci_bdf ddr_mdev_bdf; 277 + struct pci_bdf hbm_mdev_bdf; 278 + int sad_all_offset; 279 + }; 280 + /* imh_edac */ 281 + struct { 282 + /* MMIO base physical address in local package view */ 283 + u64 mmio_base_l_north; 284 + u64 mmio_base_l_south; 285 + u64 ddr_imc_base; 286 + u64 ddr_reg_mcmtr_offset; 287 + u8 ddr_reg_mcmtr_width; 288 + u64 ddr_reg_dimmmtr_offset; 289 + u8 ddr_reg_dimmmtr_width; 290 + u64 ubox_base; 291 + u32 ubox_size; 292 + u32 ubox_reg_mmio_base_offset; 293 + u8 ubox_reg_mmio_base_width; 294 + u32 ubox_reg_socket_id_offset; 295 + u8 ubox_reg_socket_id_width; 296 + u64 pcu_base; 297 + u32 pcu_size; 298 + u32 pcu_reg_capid3_offset; 299 + u8 pcu_reg_capid3_width; 300 + u64 sca_base; 301 + u32 sca_size; 302 + u32 sca_reg_tolm_offset; 303 + u8 sca_reg_tolm_width; 304 + u32 sca_reg_tohm_offset; 305 + u8 sca_reg_tohm_width; 306 + u64 ha_base; 307 + u32 ha_size; 308 + u32 
ha_reg_mode_offset; 309 + u8 ha_reg_mode_width; 310 + }; 311 + }; 288 312 }; 289 313 290 314 typedef int (*get_dimm_config_f)(struct mem_ctl_info *mci, ··· 331 287 void skx_set_decode(skx_decode_f decode, skx_show_retry_log_f show_retry_log); 332 288 void skx_set_mem_cfg(bool mem_cfg_2lm); 333 289 void skx_set_res_cfg(struct res_config *cfg); 290 + void skx_init_mc_mapping(struct skx_dev *d); 334 291 void skx_set_mc_mapping(struct skx_dev *d, u8 pmc, u8 lmc); 335 292 336 293 int skx_get_src_id(struct skx_dev *d, int off, u8 *id); 337 294 338 295 int skx_get_all_bus_mappings(struct res_config *cfg, struct list_head **list); 339 296 297 + struct list_head *skx_get_edac_list(void); 298 + 340 299 int skx_get_hi_lo(unsigned int did, int off[], u64 *tolm, u64 *tohm); 300 + void skx_set_hi_lo(u64 tolm, u64 tohm); 341 301 342 302 int skx_get_dimm_info(u32 mtr, u32 mcmtr, u32 amap, struct dimm_info *dimm, 343 303 struct skx_imc *imc, int chan, int dimmno, ··· 350 302 int skx_get_nvdimm_info(struct dimm_info *dimm, struct skx_imc *imc, 351 303 int chan, int dimmno, const char *mod_str); 352 304 353 - int skx_register_mci(struct skx_imc *imc, struct pci_dev *pdev, 305 + int skx_register_mci(struct skx_imc *imc, struct device *dev, const char *dev_name, 354 306 const char *ctl_name, const char *mod_str, 355 307 get_dimm_config_f get_dimm_config, 356 308 struct res_config *cfg);
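The skx_common.h rework above overlays the PCI-discovery fields used by skx_edac/i10nm_edac and the fixed MMIO register layout used by the new imh_edac driver in one anonymous union, so a single struct res_config table entry can describe either probe style without the struct growing for every new field. A minimal, stand-alone sketch of the same idiom follows; the field names and values are illustrative, not the kernel's layout:

#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for struct res_config: shared fields first, then a
 * union of the probe-method-specific blocks. Only the block matching
 * 'type' is meaningful for a given platform. */
enum probe_type { PROBE_PCI, PROBE_MMIO };

struct fake_res_config {
        enum probe_type type;
        int ddr_imc_num;                        /* shared by both styles */
        union {
                struct {                        /* legacy PCI-device discovery */
                        unsigned int decs_did;
                        int busno_cfg_offset;
                };
                struct {                        /* fixed MMIO register layout */
                        uint64_t mmio_base;
                        uint64_t imc_base;
                };
        };
};

/* A made-up MMIO-probed platform entry. */
static const struct fake_res_config mmio_cfg = {
        .type        = PROBE_MMIO,
        .ddr_imc_num = 8,
        .mmio_base   = 0x200000000ull,
        .imc_base    = 0x10000ull,
};

int main(void)
{
        /* The union keeps the struct compact: both blocks share storage. */
        printf("config size: %zu bytes\n", sizeof(struct fake_res_config));

        if (mmio_cfg.type == PROBE_MMIO)
                printf("first IMC at %#llx\n",
                       (unsigned long long)(mmio_cfg.mmio_base + mmio_cfg.imc_base));
        return 0;
}

Because the two blocks share storage, only the block matching the entry's type is meaningful, which is how the per-platform tables in the drivers would be expected to use it.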
+5 -2
drivers/ras/amd/atl/core.c
··· 194 194 195 195 static int __init amd_atl_init(void) 196 196 { 197 + int ret; 198 + 197 199 if (!x86_match_cpu(amd_atl_cpuids)) 198 200 return -ENODEV; 199 201 ··· 204 202 205 203 check_for_legacy_df_access(); 206 204 207 - if (get_df_system_info()) 208 - return -ENODEV; 205 + ret = get_df_system_info(); 206 + if (ret) 207 + return ret; 209 208 210 209 /* Increment this module's refcount so that it can't be easily unloaded. */ 211 210 __module_get(THIS_MODULE);
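The core.c hunk is a small behavioural change: amd_atl_init() now forwards whatever error get_df_system_info() reported instead of flattening every failure to -ENODEV, so "no such hardware" and "probe failed" stay distinguishable. A tiny userspace sketch of the pattern, with hypothetical helper names:

#include <errno.h>
#include <stdio.h>

/* Hypothetical probe helper: 0 on success, negative errno on failure. */
static int fake_get_system_info(int hw_present, int hw_understood)
{
        if (!hw_present)
                return -ENODEV;
        if (!hw_understood)
                return -EINVAL;
        return 0;
}

static int fake_module_init(int hw_present, int hw_understood)
{
        int ret = fake_get_system_info(hw_present, hw_understood);

        /* Forward the helper's code rather than rewriting it to -ENODEV. */
        if (ret)
                return ret;
        return 0;
}

int main(void)
{
        printf("%d %d %d\n",
               fake_module_init(1, 1),   /* 0 */
               fake_module_init(1, 0),   /* -EINVAL */
               fake_module_init(0, 0));  /* -ENODEV */
        return 0;
}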
+5 -1
drivers/ras/amd/atl/internal.h
··· 138 138 __u8 legacy_ficaa : 1, 139 139 socket_id_shift_quirk : 1, 140 140 heterogeneous : 1, 141 - __reserved_0 : 5; 141 + prm_only : 1, 142 + __reserved_0 : 4; 142 143 }; 143 144 144 145 struct df_config { ··· 283 282 284 283 u64 add_base_and_hole(struct addr_ctx *ctx, u64 addr); 285 284 u64 remove_base_and_hole(struct addr_ctx *ctx, u64 addr); 285 + 286 + /* GUIDs for PRM handlers */ 287 + extern const guid_t norm_to_sys_guid; 286 288 287 289 #ifdef CONFIG_AMD_ATL_PRM 288 290 unsigned long prm_umc_norm_to_sys_addr(u8 socket_id, u64 umc_bank_inst_id, unsigned long addr);
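The new prm_only flag is carved out of the existing reserved bits, so the flags bitfield keeps its overall width and the layout of struct df_config does not shift. A quick stand-alone illustration of that invariant, using userspace types (the kernel uses __u8 bitfields):

#include <stdio.h>

/* Before: three used bits plus five reserved. */
struct flags_old {
        unsigned char legacy_ficaa          : 1,
                      socket_id_shift_quirk : 1,
                      heterogeneous         : 1,
                      reserved              : 5;
};

/* After: a new 1-bit flag is taken from the reserved field. */
struct flags_new {
        unsigned char legacy_ficaa          : 1,
                      socket_id_shift_quirk : 1,
                      heterogeneous         : 1,
                      prm_only              : 1,
                      reserved              : 4;
};

int main(void)
{
        struct flags_new f = { .prm_only = 1 };

        /* Both layouts still occupy a single byte. */
        printf("old: %zu byte(s), new: %zu byte(s), prm_only=%u\n",
               sizeof(struct flags_old), sizeof(struct flags_new), f.prm_only);
        return 0;
}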
-4
drivers/ras/amd/atl/prm.c
··· 29 29 void *out_buf; 30 30 } __packed; 31 31 32 - static const guid_t norm_to_sys_guid = GUID_INIT(0xE7180659, 0xA65D, 0x451D, 33 - 0x92, 0xCD, 0x2B, 0x56, 0xF1, 34 - 0x2B, 0xEB, 0xA6); 35 - 36 32 unsigned long prm_umc_norm_to_sys_addr(u8 socket_id, u64 bank_id, unsigned long addr) 37 33 { 38 34 struct norm_to_sys_param_buf p_buf;
+22 -8
drivers/ras/amd/atl/system.c
··· 12 12 13 13 #include "internal.h" 14 14 15 + #include <linux/prmt.h> 16 + 17 + const guid_t norm_to_sys_guid = GUID_INIT(0xE7180659, 0xA65D, 0x451D, 18 + 0x92, 0xCD, 0x2B, 0x56, 0xF1, 19 + 0x2B, 0xEB, 0xA6); 20 + 15 21 int determine_node_id(struct addr_ctx *ctx, u8 socket_id, u8 die_id) 16 22 { 17 23 u16 socket_id_bits, die_id_bits; ··· 218 212 if (!rev) 219 213 return determine_df_rev_legacy(); 220 214 221 - /* 222 - * Fail out for major revisions other than '4'. 223 - * 224 - * Explicit support should be added for newer systems to avoid issues. 225 - */ 226 215 if (rev == 4) 227 216 return df4_determine_df_rev(reg); 228 217 229 - return -EINVAL; 218 + /* All other systems should have PRM handlers. */ 219 + if (!acpi_prm_handler_available(&norm_to_sys_guid)) { 220 + pr_debug("PRM not available\n"); 221 + return -ENODEV; 222 + } 223 + 224 + df_cfg.flags.prm_only = true; 225 + return 0; 230 226 } 231 227 232 228 static int get_dram_hole_base(void) ··· 296 288 297 289 int get_df_system_info(void) 298 290 { 299 - if (determine_df_rev()) { 291 + int ret; 292 + 293 + ret = determine_df_rev(); 294 + if (ret) { 300 295 pr_warn("Failed to determine DF Revision"); 301 296 df_cfg.rev = UNKNOWN; 302 - return -EINVAL; 297 + return ret; 303 298 } 299 + 300 + if (df_cfg.flags.prm_only) 301 + return 0; 304 302 305 303 apply_node_id_shift(); 306 304
+6 -17
drivers/ras/amd/atl/umc.c
··· 49 49 return i; 50 50 } 51 51 52 - /* XOR the bits in @val. */ 53 - static u16 bitwise_xor_bits(u16 val) 54 - { 55 - u16 tmp = 0; 56 - u8 i; 57 - 58 - for (i = 0; i < 16; i++) 59 - tmp ^= (val >> i) & 0x1; 60 - 61 - return tmp; 62 - } 63 52 64 53 struct xor_bits { 65 54 bool xor_enable; ··· 239 250 if (!addr_hash.bank[i].xor_enable) 240 251 continue; 241 252 242 - temp = bitwise_xor_bits(col & addr_hash.bank[i].col_xor); 243 - temp ^= bitwise_xor_bits(row & addr_hash.bank[i].row_xor); 253 + temp = hweight16(col & addr_hash.bank[i].col_xor) & 1; 254 + temp ^= hweight16(row & addr_hash.bank[i].row_xor) & 1; 244 255 bank ^= temp << i; 245 256 } 246 257 247 258 /* Calculate hash for PC bit. */ 248 259 if (addr_hash.pc.xor_enable) { 249 - temp = bitwise_xor_bits(col & addr_hash.pc.col_xor); 250 - temp ^= bitwise_xor_bits(row & addr_hash.pc.row_xor); 260 + temp = hweight16(col & addr_hash.pc.col_xor) & 1; 261 + temp ^= hweight16(row & addr_hash.pc.row_xor) & 1; 251 262 /* Bits SID[1:0] act as Bank[5:4] for PC hash, so apply them here. */ 252 - temp ^= bitwise_xor_bits((bank | sid << NUM_BANK_BITS) & addr_hash.bank_xor); 263 + temp ^= hweight16((bank | sid << NUM_BANK_BITS) & addr_hash.bank_xor) & 1; 253 264 pc ^= temp; 254 265 } 255 266 ··· 411 422 socket_id, die_id, coh_st_inst_id, addr); 412 423 413 424 ret_addr = prm_umc_norm_to_sys_addr(socket_id, err->ipid, addr); 414 - if (!IS_ERR_VALUE(ret_addr)) 425 + if (!IS_ERR_VALUE(ret_addr) || df_cfg.flags.prm_only) 415 426 return ret_addr; 416 427 417 428 return norm_to_sys_addr(socket_id, die_id, coh_st_inst_id, addr);
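The umc.c change rests on a simple identity: XOR-folding all bits of a value is its parity, which equals the population count modulo 2, so bitwise_xor_bits(v) can be replaced by hweight16(v) & 1. A small userspace check of the equivalence, using __builtin_popcount() as a stand-in for the kernel's hweight16():

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* The removed helper: XOR all 16 bits of @val together. */
static uint16_t bitwise_xor_bits(uint16_t val)
{
        uint16_t tmp = 0;

        for (int i = 0; i < 16; i++)
                tmp ^= (val >> i) & 0x1;
        return tmp;
}

/* The replacement: parity via population count (hweight16(v) & 1 in the kernel). */
static uint16_t parity16(uint16_t val)
{
        return __builtin_popcount(val) & 1;
}

int main(void)
{
        /* Exhaustively check all 65536 possible 16-bit values. */
        for (uint32_t v = 0; v <= 0xffff; v++)
                assert(bitwise_xor_bits((uint16_t)v) == parity16((uint16_t)v));

        printf("bitwise_xor_bits(v) == hweight16(v) & 1 for every 16-bit v\n");
        return 0;
}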
+1 -1
drivers/ras/cec.c
··· 166 166 unsigned long iv; 167 167 168 168 iv = interval * HZ; 169 - mod_delayed_work(system_wq, &cec_work, round_jiffies(iv)); 169 + mod_delayed_work(system_percpu_wq, &cec_work, round_jiffies(iv)); 170 170 } 171 171 172 172 static void cec_work_fn(struct work_struct *work)
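system_percpu_wq is the newer, more explicit name for the default per-CPU system workqueue, so the CEC change is purely mechanical: same work item, same delay, new queue symbol. A minimal kernel-style sketch of the pattern, assuming module context and hypothetical my_poll_* names:

#include <linux/workqueue.h>
#include <linux/timer.h>
#include <linux/jiffies.h>

static void my_poll_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(my_poll_work, my_poll_fn);

static void my_poll_fn(struct work_struct *work)
{
        /* ... periodic housekeeping would go here ... */
}

/* (Re)arm the poller 'secs' seconds from now on the per-CPU system queue. */
static void my_poll_rearm(unsigned long secs)
{
        unsigned long iv = secs * HZ;

        /* mod_delayed_work() replaces any already-pending expiry with the new one. */
        mod_delayed_work(system_percpu_wq, &my_poll_work, round_jiffies(iv));
}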
+2
include/linux/prmt.h
··· 4 4 5 5 #ifdef CONFIG_ACPI_PRMT 6 6 void init_prmt(void); 7 + bool acpi_prm_handler_available(const guid_t *handler_guid); 7 8 int acpi_call_prm_handler(guid_t handler_guid, void *param_buffer); 8 9 #else 9 10 static inline void init_prmt(void) { } 11 + static inline bool acpi_prm_handler_available(const guid_t *handler_guid) { return false; } 10 12 static inline int acpi_call_prm_handler(guid_t handler_guid, void *param_buffer) 11 13 { 12 14 return -EOPNOTSUPP;
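With the new declaration, a PRM consumer can probe for a handler before committing to the firmware path; when CONFIG_ACPI_PRMT is disabled, the inline stub returns false and the caller can fall back (or bail out) cleanly. A hedged sketch of that call pattern follows; the GUID, parameter layout and my_* names are placeholders, not a real handler:

#include <linux/prmt.h>
#include <linux/uuid.h>
#include <linux/errno.h>
#include <linux/types.h>

/* Placeholder GUID and parameter layout for an imaginary PRM handler. */
static const guid_t my_prm_guid =
        GUID_INIT(0x12345678, 0x1234, 0x1234,
                  0x12, 0x34, 0x12, 0x34, 0x12, 0x34, 0x12, 0x34);

struct my_prm_params {
        u64 in;
        u64 out;
};

static int my_firmware_query(u64 value, u64 *result)
{
        struct my_prm_params p = { .in = value };
        int ret;

        /* Probe first: no registered handler means no PRM support here. */
        if (!acpi_prm_handler_available(&my_prm_guid))
                return -ENODEV;

        ret = acpi_call_prm_handler(my_prm_guid, &p);
        if (ret)
                return ret;

        *result = p.out;
        return 0;
}

This mirrors how the ATL code uses the helper: availability is checked once up front, and the PRM call itself is only reached on systems that actually expose the handler.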