Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core / kernfs updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.0-rc1.

The "biggest" thing in here is some scalability improvements for
kernfs for large systems. Other than that, included in here are:

- arch topology and cache info changes that have been reviewed and
discussed a lot.

- potential error path cleanup fixes

- deferred driver probe cleanups

- firmware loader cleanups and tweaks

- documentation updates

- other small things

All of these have been in the linux-next tree for a while with no
reported problems"

* tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (63 commits)
docs: embargoed-hardware-issues: fix invalid AMD contact email
firmware_loader: Replace kmap() with kmap_local_page()
sysfs docs: ABI: Fix typo in comment
kobject: fix Kconfig.debug "its" grammar
kernfs: Fix typo 'the the' in comment
docs: driver-api: firmware: add driver firmware guidelines. (v3)
arch_topology: Fix cache attributes detection in the CPU hotplug path
ACPI: PPTT: Leave the table mapped for the runtime usage
cacheinfo: Use atomic allocation for percpu cache attributes
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist
MAINTAINERS: Change mentions of mpm to olivia
docs: ABI: sysfs-devices-soc: Update Lee Jones' email address
docs: ABI: sysfs-class-pwm: Update Lee Jones' email address
Documentation/process: Add embargoed HW contact for LLVM
Revert "kernfs: Change kernfs_notify_list to llist."
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
...

+720 -376
+1 -1
Documentation/ABI/stable/sysfs-module
···
 Date:		Jun 2005
 Description:
 		If the module source has MODULE_VERSION, this file will contain
-		the checksum of the the source code.
+		the checksum of the source code.
 
 What:		/sys/module/<MODULENAME>/version
 Date:		Jun 2005
+1 -1
Documentation/ABI/testing/sysfs-class-pwm
···
 What:		/sys/class/pwm/pwmchip<N>/pwmX/capture
 Date:		June 2016
 KernelVersion:	4.8
-Contact:	Lee Jones <lee.jones@linaro.org>
+Contact:	Lee Jones <lee@kernel.org>
 Description:
 		Capture information about a PWM signal. The output format is a
 		pair unsigned integers (period and duty cycle), separated by a
+1 -1
Documentation/ABI/testing/sysfs-class-rtrs-client
···
 Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
-Description:	RO, Contains the the name of HCA the connection established on.
+Description:	RO, Contains the name of HCA the connection established on.
 
 What:		/sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_port
 Date:		Feb 2020
+1 -1
Documentation/ABI/testing/sysfs-class-rtrs-server
···
 Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
-Description:	RO, Contains the the name of HCA the connection established on.
+Description:	RO, Contains the name of HCA the connection established on.
 
 What:		/sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_port
 Date:		Feb 2020
+1 -1
Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD
···
 
 		Reads also cause the AC alarm timer status to be reset.
 
-		Another way to reset the the status of the AC alarm timer is to
+		Another way to reset the status of the AC alarm timer is to
 		write (the number) 0 to this file.
 
 		If the status return value indicates that the timer has expired,
+1 -1
Documentation/ABI/testing/sysfs-devices-power
···
 Contact:	Dominik Brodowski <linux@dominikbrodowski.net>
 Description:
 		Reports the runtime PM children usage count of a device, or
-		0 if the the children will be ignored.
+		0 if the children will be ignored.
 
+7 -7
Documentation/ABI/testing/sysfs-devices-soc
···
 What:		/sys/devices/socX
 Date:		January 2012
-contact:	Lee Jones <lee.jones@linaro.org>
+contact:	Lee Jones <lee@kernel.org>
 Description:
 		The /sys/devices/ directory contains a sub-directory for each
 		System-on-Chip (SoC) device on a running platform. Information
···
 
 What:		/sys/devices/socX/machine
 Date:		January 2012
-contact:	Lee Jones <lee.jones@linaro.org>
+contact:	Lee Jones <lee@kernel.org>
 Description:
 		Read-only attribute common to all SoCs. Contains the SoC machine
 		name (e.g. Ux500).
 
 What:		/sys/devices/socX/family
 Date:		January 2012
-contact:	Lee Jones <lee.jones@linaro.org>
+contact:	Lee Jones <lee@kernel.org>
 Description:
 		Read-only attribute common to all SoCs. Contains SoC family name
 		(e.g. DB8500).
···
 
 What:		/sys/devices/socX/soc_id
 Date:		January 2012
-contact:	Lee Jones <lee.jones@linaro.org>
+contact:	Lee Jones <lee@kernel.org>
 Description:
 		Read-only attribute supported by most SoCs. In the case of
 		ST-Ericsson's chips this contains the SoC serial number.
···
 
 What:		/sys/devices/socX/revision
 Date:		January 2012
-contact:	Lee Jones <lee.jones@linaro.org>
+contact:	Lee Jones <lee@kernel.org>
 Description:
 		Read-only attribute supported by most SoCs. Contains the SoC's
 		manufacturing revision number.
 
 What:		/sys/devices/socX/process
 Date:		January 2012
-contact:	Lee Jones <lee.jones@linaro.org>
+contact:	Lee Jones <lee@kernel.org>
 Description:
 		Read-only attribute supported ST-Ericsson's silicon. Contains the
 		the process by which the silicon chip was manufactured.
 
 What:		/sys/bus/soc
 Date:		January 2012
-contact:	Lee Jones <lee.jones@linaro.org>
+contact:	Lee Jones <lee@kernel.org>
 Description:
 		The /sys/bus/soc/ directory contains the usual sub-folders
 		expected under most buses. /sys/bus/soc/devices is of particular
+1 -6
Documentation/ABI/testing/sysfs-devices-system-cpu
···
 		/sys/devices/system/cpu/cpu42/node2 -> ../../node/node2
 
 
-What:		/sys/devices/system/cpu/cpuX/topology/core_id
-		/sys/devices/system/cpu/cpuX/topology/core_siblings
+What:		/sys/devices/system/cpu/cpuX/topology/core_siblings
 		/sys/devices/system/cpu/cpuX/topology/core_siblings_list
 		/sys/devices/system/cpu/cpuX/topology/physical_package_id
 		/sys/devices/system/cpu/cpuX/topology/thread_siblings
···
 		e.g. /sys/devices/system/cpu/cpu42/.
 
 		Briefly, the files above are:
-
-		core_id: the CPU core ID of cpuX. Typically it is the
-		hardware platform's identifier (rather than the kernel's).
-		The actual value is architecture and platform dependent.
 
 		core_siblings: internal kernel map of cpuX's hardware threads
 		within the same physical_package_id.
+1
Documentation/driver-api/firmware/core.rst
···
    direct-fs-lookup
    fallback-mechanisms
    lookup-order
+   firmware-usage-guidelines
 
+44
Documentation/driver-api/firmware/firmware-usage-guidelines.rst
···
+===================
+Firmware Guidelines
+===================
+
+Users switching to a newer kernel should *not* have to install newer
+firmware files to keep their hardware working. At the same time updated
+firmware files must not cause any regressions for users of older kernel
+releases.
+
+Drivers that use firmware from linux-firmware should follow the rules in
+this guide. (Where there is limited control of the firmware,
+i.e. company doesn't support Linux, firmwares sourced from misc places,
+then of course these rules will not apply strictly.)
+
+* Firmware files shall be designed in a way that it allows checking for
+  firmware ABI version changes. It is recommended that firmware files be
+  versioned with at least a major/minor version. It is suggested that
+  the firmware files in linux-firmware be named with some device
+  specific name, and just the major version. The firmware version should
+  be stored in the firmware header, or as an exception, as part of the
+  firmware file name, in order to let the driver detact any non-ABI
+  fixes/changes. The firmware files in linux-firmware should be
+  overwritten with the newest compatible major version. Newer major
+  version firmware shall remain compatible with all kernels that load
+  that major number.
+
+* If the kernel support for the hardware is normally inactive, or the
+  hardware isn't available for public consumption, this can
+  be ignored, until the first kernel release that enables that hardware.
+  This means no major version bumps without the kernel retaining
+  backwards compatibility for the older major versions. Minor version
+  bumps should not introduce new features that newer kernels depend on
+  non-optionally.
+
+* If a security fix needs lockstep firmware and kernel fixes in order to
+  be successful, then all supported major versions in the linux-firmware
+  repo that are required by currently supported stable/LTS kernels,
+  should be updated with the security fix. The kernel patches should
+  detect if the firmware is new enough to declare if the security issue
+  is fixed. All communications around security fixes should point at
+  both the firmware and kernel fixes. If a security fix requires
+  deprecating old major versions, then this should only be done as a
+  last option, and be stated clearly in all communications.
+
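The major/minor rule in the guidelines above can be sketched in plain C. This is only an illustration under stated assumptions: the header layout, the `DRV_FW_*` constants and both helper names are hypothetical and are not taken from this commit or from any real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical firmware header layout, invented for this sketch. */
struct fw_header {
	uint16_t major;	/* ABI version: bumped on incompatible changes */
	uint16_t minor;	/* bumped on compatible fixes and additions */
};

#define DRV_FW_MAJOR		2	/* the only major this driver loads */
#define DRV_FW_MINOR_FEAT_X	3	/* first minor offering optional feature X */

/* The driver accepts any firmware carrying the major it was written for. */
static inline bool fw_abi_compatible(const struct fw_header *hdr)
{
	return hdr->major == DRV_FW_MAJOR;
}

/* Optional features are gated on the minor version and never required. */
static inline bool fw_has_feature_x(const struct fw_header *hdr)
{
	return fw_abi_compatible(hdr) && hdr->minor >= DRV_FW_MINOR_FEAT_X;
}
```

With this shape, a firmware file reporting 2.1 still loads on a kernel that knows about feature X (the feature is simply left off), while a 3.0 file is rejected by a driver written for major 2.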
+4 -1
Documentation/process/embargoed-hardware-issues.rst
···
 an involved disclosed party. The current ambassadors list:
 
   ============= ========================================================
-  AMD		Tom Lendacky <tom.lendacky@amd.com>
+  AMD		Tom Lendacky <thomas.lendacky@amd.com>
   Ampere	Darren Hart <darren@os.amperecomputing.com>
   ARM		Catalin Marinas <catalin.marinas@arm.com>
   IBM Power	Anton Blanchard <anton@linux.ibm.com>
···
 
   Amazon
   Google	Kees Cook <keescook@chromium.org>
+
+  GCC
+  LLVM		Nick Desaulniers <ndesaulniers@google.com>
   ============= ========================================================
 
 If you want your organization to be added to the ambassadors list, please
+1 -1
Documentation/translations/zh_CN/process/embargoed-hardware-issues.rst
···
 
   ============= ========================================================
   ARM
-  AMD		Tom Lendacky <tom.lendacky@amd.com>
+  AMD		Tom Lendacky <thomas.lendacky@amd.com>
   IBM
   Intel		Tony Luck <tony.luck@intel.com>
   Qualcomm	Trilok Soni <tsoni@codeaurora.org>
+1 -1
Documentation/translations/zh_TW/process/embargoed-hardware-issues.rst
···
 
   ============= ========================================================
   ARM
-  AMD		Tom Lendacky <tom.lendacky@amd.com>
+  AMD		Tom Lendacky <thomas.lendacky@amd.com>
   IBM
   Intel		Tony Luck <tony.luck@intel.com>
   Qualcomm	Trilok Soni <tsoni@codeaurora.org>
+2 -2
MAINTAINERS
···
 F:	drivers/media/usb/em28xx/
 
 EMBEDDED LINUX
-M:	Matt Mackall <mpm@selenic.com>
+M:	Olivia Mackall <olivia@selenic.com>
 M:	David Woodhouse <dwmw2@infradead.org>
 L:	linux-embedded@vger.kernel.org
 S:	Maintained
···
 K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
 
 HARDWARE RANDOM NUMBER GENERATOR CORE
-M:	Matt Mackall <mpm@selenic.com>
+M:	Olivia Mackall <olivia@selenic.com>
 M:	Herbert Xu <herbert@gondor.apana.org.au>
 L:	linux-crypto@vger.kernel.org
 S:	Odd fixes
-14
arch/arm64/kernel/topology.c
···
 		return 0;
 
 	for_each_possible_cpu(cpu) {
-		int i, cache_id;
-
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
···
 		cpu_topology[cpu].cluster_id = topology_id;
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
-
-		i = acpi_find_last_cache_level(cpu);
-
-		if (i > 0) {
-			/*
-			 * this is the only part of cpu_topology that has
-			 * a direct relationship with the cache topology
-			 */
-			cache_id = find_acpi_cpu_cache_topology(cpu, i);
-			if (cache_id > 0)
-				cpu_topology[cpu].llc_id = cache_id;
-		}
 	}
 
 	return 0;
+49 -93
drivers/acpi/pptt.c
···
 		pr_debug("found = %p %p\n", found_cache, cpu_node);
 		if (found_cache)
 			update_cache_properties(this_leaf, found_cache,
-						cpu_node, table->revision);
+						ACPI_TO_POINTER(ACPI_PTR_DIFF(cpu_node, table)),
+						table->revision);
 
 		index++;
 	}
···
 	return -ENOENT;
 }
 
+
+static struct acpi_table_header *acpi_get_pptt(void)
+{
+	static struct acpi_table_header *pptt;
+	acpi_status status;
+
+	/*
+	 * PPTT will be used at runtime on every CPU hotplug in path, so we
+	 * don't need to call acpi_put_table() to release the table mapping.
+	 */
+	if (!pptt) {
+		status = acpi_get_table(ACPI_SIG_PPTT, 0, &pptt);
+		if (ACPI_FAILURE(status))
+			acpi_pptt_warn_missing();
+	}
+
+	return pptt;
+}
+
 static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
 {
 	struct acpi_table_header *table;
-	acpi_status status;
 	int retval;
 
-	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
-	if (ACPI_FAILURE(status)) {
-		acpi_pptt_warn_missing();
+	table = acpi_get_pptt();
+	if (!table)
 		return -ENOENT;
-	}
+
 	retval = topology_get_acpi_cpu_tag(table, cpu, level, flag);
 	pr_debug("Topology Setup ACPI CPU %d, level %d ret = %d\n",
 		 cpu, level, retval);
-	acpi_put_table(table);
 
 	return retval;
 }
···
 static int check_acpi_cpu_flag(unsigned int cpu, int rev, u32 flag)
 {
 	struct acpi_table_header *table;
-	acpi_status status;
 	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
 	struct acpi_pptt_processor *cpu_node = NULL;
 	int ret = -ENOENT;
 
-	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
-	if (ACPI_FAILURE(status)) {
-		acpi_pptt_warn_missing();
-		return ret;
-	}
+	table = acpi_get_pptt();
+	if (!table)
+		return -ENOENT;
 
 	if (table->revision >= rev)
 		cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
 
 	if (cpu_node)
 		ret = (cpu_node->flags & flag) != 0;
-
-	acpi_put_table(table);
 
 	return ret;
 }
···
 	u32 acpi_cpu_id;
 	struct acpi_table_header *table;
 	int number_of_levels = 0;
-	acpi_status status;
+
+	table = acpi_get_pptt();
+	if (!table)
+		return -ENOENT;
 
 	pr_debug("Cache Setup find last level CPU=%d\n", cpu);
 
 	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
-	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
-	if (ACPI_FAILURE(status)) {
-		acpi_pptt_warn_missing();
-	} else {
-		number_of_levels = acpi_find_cache_levels(table, acpi_cpu_id);
-		acpi_put_table(table);
-	}
+	number_of_levels = acpi_find_cache_levels(table, acpi_cpu_id);
 	pr_debug("Cache Setup find last level level=%d\n", number_of_levels);
 
 	return number_of_levels;
···
 int cache_setup_acpi(unsigned int cpu)
 {
 	struct acpi_table_header *table;
-	acpi_status status;
+
+	table = acpi_get_pptt();
+	if (!table)
+		return -ENOENT;
 
 	pr_debug("Cache Setup ACPI CPU %d\n", cpu);
 
-	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
-	if (ACPI_FAILURE(status)) {
-		acpi_pptt_warn_missing();
-		return -ENOENT;
-	}
-
 	cache_setup_acpi_cpu(table, cpu);
-	acpi_put_table(table);
 
-	return status;
+	return 0;
 }
 
 /**
···
 int find_acpi_cpu_topology(unsigned int cpu, int level)
 {
 	return find_acpi_cpu_topology_tag(cpu, level, 0);
-}
-
-/**
- * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
- * @cpu: Kernel logical CPU number
- * @level: The cache level for which we would like a unique ID
- *
- * Determine a unique ID for each unified cache in the system
- *
- * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
- * Otherwise returns a value which represents a unique topological feature.
- */
-int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
-{
-	struct acpi_table_header *table;
-	struct acpi_pptt_cache *found_cache;
-	acpi_status status;
-	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
-	struct acpi_pptt_processor *cpu_node = NULL;
-	int ret = -1;
-
-	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
-	if (ACPI_FAILURE(status)) {
-		acpi_pptt_warn_missing();
-		return -ENOENT;
-	}
-
-	found_cache = acpi_find_cache_node(table, acpi_cpu_id,
-					   CACHE_TYPE_UNIFIED,
-					   level,
-					   &cpu_node);
-	if (found_cache)
-		ret = ACPI_PTR_DIFF(cpu_node, table);
-
-	acpi_put_table(table);
-
-	return ret;
 }
 
 /**
···
 int find_acpi_cpu_topology_cluster(unsigned int cpu)
 {
 	struct acpi_table_header *table;
-	acpi_status status;
 	struct acpi_pptt_processor *cpu_node, *cluster_node;
 	u32 acpi_cpu_id;
 	int retval;
 	int is_thread;
 
-	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
-	if (ACPI_FAILURE(status)) {
-		acpi_pptt_warn_missing();
+	table = acpi_get_pptt();
+	if (!table)
 		return -ENOENT;
-	}
 
 	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
 	cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
-	if (cpu_node == NULL || !cpu_node->parent) {
-		retval = -ENOENT;
-		goto put_table;
-	}
+	if (!cpu_node || !cpu_node->parent)
+		return -ENOENT;
 
 	is_thread = cpu_node->flags & ACPI_PPTT_ACPI_PROCESSOR_IS_THREAD;
 	cluster_node = fetch_pptt_node(table, cpu_node->parent);
-	if (cluster_node == NULL) {
-		retval = -ENOENT;
-		goto put_table;
-	}
+	if (!cluster_node)
+		return -ENOENT;
+
 	if (is_thread) {
-		if (!cluster_node->parent) {
-			retval = -ENOENT;
-			goto put_table;
-		}
+		if (!cluster_node->parent)
+			return -ENOENT;
+
 		cluster_node = fetch_pptt_node(table, cluster_node->parent);
-		if (cluster_node == NULL) {
-			retval = -ENOENT;
-			goto put_table;
-		}
+		if (!cluster_node)
+			return -ENOENT;
 	}
 	if (cluster_node->flags & ACPI_PPTT_ACPI_PROCESSOR_ID_VALID)
 		retval = cluster_node->acpi_processor_id;
 	else
 		retval = ACPI_PTR_DIFF(cluster_node, table);
-
-	put_table:
-	acpi_put_table(table);
 
 	return retval;
 }
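The `acpi_get_pptt()` helper added above maps the table once and hands the same pointer to every later caller. A minimal user-space sketch of that "look up once, keep mapped" shape follows. It assumes a stand-in lookup function in place of `acpi_get_table()`, and every name in it (`struct table`, `slow_table_lookup`, `get_table_cached`, `lookup_calls`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a firmware table; the single field is illustrative only. */
struct table {
	int revision;
};

static struct table fake_pptt = { .revision = 2 };
static int lookup_calls;	/* counts how often the slow path runs */

/* Stand-in for the real lookup: returns 0 on success and fills *out. */
static int slow_table_lookup(struct table **out)
{
	lookup_calls++;
	*out = &fake_pptt;
	return 0;
}

/* Same shape as the helper in the diff: map once, reuse afterwards. */
struct table *get_table_cached(void)
{
	static struct table *cached;

	if (!cached) {
		if (slow_table_lookup(&cached) != 0)
			return NULL;	/* cached stays NULL, so a later call retries */
	}

	return cached;
}
```

The trade-off is the one stated in the commit's own comment: the mapping is never released, which is acceptable because the table is needed again on every CPU hotplug. Like the kernel helper, this sketch takes no lock around the static pointer.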
+77 -27
drivers/base/arch_topology.c
···
  */
 
 #include <linux/acpi.h>
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
 #include <linux/device.h>
···
 }
 
 static int __init parse_core(struct device_node *core, int package_id,
-			     int core_id)
+			     int cluster_id, int core_id)
 {
 	char name[20];
 	bool leaf = true;
···
 			cpu = get_cpu_for_node(t);
 			if (cpu >= 0) {
 				cpu_topology[cpu].package_id = package_id;
+				cpu_topology[cpu].cluster_id = cluster_id;
 				cpu_topology[cpu].core_id = core_id;
 				cpu_topology[cpu].thread_id = i;
 			} else if (cpu != -ENODEV) {
···
 		}
 
 		cpu_topology[cpu].package_id = package_id;
+		cpu_topology[cpu].cluster_id = cluster_id;
 		cpu_topology[cpu].core_id = core_id;
 	} else if (leaf && cpu != -ENODEV) {
 		pr_err("%pOF: Can't get CPU for leaf core\n", core);
···
 	return 0;
 }
 
-static int __init parse_cluster(struct device_node *cluster, int depth)
+static int __init parse_cluster(struct device_node *cluster, int package_id,
+				int cluster_id, int depth)
 {
 	char name[20];
 	bool leaf = true;
 	bool has_cores = false;
 	struct device_node *c;
-	static int package_id __initdata;
 	int core_id = 0;
 	int i, ret;
 
···
 		c = of_get_child_by_name(cluster, name);
 		if (c) {
 			leaf = false;
-			ret = parse_cluster(c, depth + 1);
+			ret = parse_cluster(c, package_id, i, depth + 1);
+			if (depth > 0)
+				pr_warn("Topology for clusters of clusters not yet supported\n");
 			of_node_put(c);
 			if (ret != 0)
 				return ret;
···
 			}
 
 			if (leaf) {
-				ret = parse_core(c, package_id, core_id++);
+				ret = parse_core(c, package_id, cluster_id,
+						 core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
···
 	if (leaf && !has_cores)
 		pr_warn("%pOF: empty cluster\n", cluster);
 
-	if (leaf)
-		package_id++;
-
 	return 0;
+}
+
+static int __init parse_socket(struct device_node *socket)
+{
+	char name[20];
+	struct device_node *c;
+	bool has_socket = false;
+	int package_id = 0, ret;
+
+	do {
+		snprintf(name, sizeof(name), "socket%d", package_id);
+		c = of_get_child_by_name(socket, name);
+		if (c) {
+			has_socket = true;
+			ret = parse_cluster(c, package_id, -1, 0);
+			of_node_put(c);
+			if (ret != 0)
+				return ret;
+		}
+		package_id++;
+	} while (c);
+
+	if (!has_socket)
+		ret = parse_cluster(socket, 0, -1, 0);
+
+	return ret;
 }
 
 static int __init parse_dt_topology(void)
···
 	if (!map)
 		goto out;
 
-	ret = parse_cluster(map, 0);
+	ret = parse_socket(map);
 	if (ret != 0)
 		goto out_map;
 
···
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id == -1)
+		if (cpu_topology[cpu].package_id < 0) {
 			ret = -EINVAL;
+			break;
+		}
 
 out_map:
 	of_node_put(map);
···
 		/* not numa in package, lets use the package siblings */
 		core_mask = &cpu_topology[cpu].core_sibling;
 	}
-	if (cpu_topology[cpu].llc_id != -1) {
+
+	if (last_level_cache_is_valid(cpu)) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
 			core_mask = &cpu_topology[cpu].llc_sibling;
 	}
···
 
 const struct cpumask *cpu_clustergroup_mask(int cpu)
 {
+	/*
+	 * Forbid cpu_clustergroup_mask() to span more or the same CPUs as
+	 * cpu_coregroup_mask().
+	 */
+	if (cpumask_subset(cpu_coregroup_mask(cpu),
+			   &cpu_topology[cpu].cluster_sibling))
+		return get_cpu_mask(cpu);
+
 	return &cpu_topology[cpu].cluster_sibling;
 }
 
 void update_siblings_masks(unsigned int cpuid)
 {
 	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
-	int cpu;
+	int cpu, ret;
+
+	ret = detect_cache_attributes(cpuid);
+	if (ret)
+		pr_info("Early cacheinfo failed, ret = %d\n", ret);
 
 	/* update core and thread sibling masks */
 	for_each_online_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
+		if (last_level_cache_is_shared(cpu, cpuid)) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
 		}
···
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
-		if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
-		    cpuid_topo->cluster_id != -1) {
+		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+			continue;
+
+		if (cpuid_topo->cluster_id >= 0) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
-
-		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
-		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
 
 		if (cpuid_topo->core_id != cpu_topo->core_id)
 			continue;
···
 		cpu_topo->core_id = -1;
 		cpu_topo->cluster_id = -1;
 		cpu_topo->package_id = -1;
-		cpu_topo->llc_id = -1;
 
 		clear_cpu_topology(cpu);
 	}
···
 #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void __init init_cpu_topology(void)
 {
-	reset_cpu_topology();
+	int ret;
 
-	/*
-	 * Discard anything that was parsed if we hit an error so we
-	 * don't use partial information.
-	 */
-	if (parse_acpi_topology())
+	reset_cpu_topology();
+	ret = parse_acpi_topology();
+	if (!ret)
+		ret = of_have_populated_dt() && parse_dt_topology();
+
+	if (ret) {
+		/*
+		 * Discard anything that was parsed if we hit an error so we
+		 * don't use partial information.
+		 */
 		reset_cpu_topology();
-	else if (of_have_populated_dt() && parse_dt_topology())
-		reset_cpu_topology();
+		return;
+	}
 }
 #endif
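The new check in `cpu_clustergroup_mask()` is easier to follow with concrete bit patterns. The sketch below models CPU masks as 64-bit integers in user space. `mask64_t`, `mask_subset` and `clustergroup_mask` are hypothetical stand-ins, not kernel APIs; `mask_subset(a, b)` is meant to mirror the argument order of `cpumask_subset()` as it is used in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t mask64_t;	/* one bit per CPU, up to 64 CPUs */

/* True if every CPU set in a is also set in b. */
static inline bool mask_subset(mask64_t a, mask64_t b)
{
	return (a & ~b) == 0;
}

/*
 * If the cluster siblings cover the whole core group (or more), the
 * cluster level adds no information, so fall back to the CPU alone.
 */
static inline mask64_t clustergroup_mask(int cpu, mask64_t core_group,
					 mask64_t cluster_siblings)
{
	if (mask_subset(core_group, cluster_siblings))
		return (mask64_t)1 << cpu;

	return cluster_siblings;
}
```

For example, with a core group of CPUs 0-3 (`0x0F`) and cluster siblings 0-1 (`0x03`) the cluster mask is returned unchanged, whereas a cluster mask equal to or wider than the core group collapses to the single-CPU mask.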
+1
drivers/base/base.h
···
 extern void device_block_probing(void);
 extern void device_unblock_probing(void);
 extern void deferred_probe_extend_timeout(void);
+extern void driver_deferred_probe_trigger(void);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
+93 -52
drivers/base/cacheinfo.c
···
 #include <linux/cpu.h>
 #include <linux/device.h>
 #include <linux/init.h>
-#include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
···
 #define ci_cacheinfo(cpu)	(&per_cpu(ci_cpu_cacheinfo, cpu))
 #define cache_leaves(cpu)	(ci_cacheinfo(cpu)->num_leaves)
 #define per_cpu_cacheinfo(cpu)	(ci_cacheinfo(cpu)->info_list)
+#define per_cpu_cacheinfo_idx(cpu, idx)		\
+				(per_cpu_cacheinfo(cpu) + (idx))
 
 struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 {
 	return ci_cacheinfo(cpu);
 }
 
-#ifdef CONFIG_OF
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 					   struct cacheinfo *sib_leaf)
 {
+	/*
+	 * For non DT/ACPI systems, assume unique level 1 caches,
+	 * system-wide shared caches for all other levels. This will be used
+	 * only if arch specific code has not populated shared_cpu_map
+	 */
+	if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
+		return !(this_leaf->level == 1);
+
+	if ((sib_leaf->attributes & CACHE_ID) &&
+	    (this_leaf->attributes & CACHE_ID))
+		return sib_leaf->id == this_leaf->id;
+
 	return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
+bool last_level_cache_is_valid(unsigned int cpu)
+{
+	struct cacheinfo *llc;
+
+	if (!cache_leaves(cpu))
+		return false;
+
+	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
+
+	return (llc->attributes & CACHE_ID) || !!llc->fw_token;
+
+}
+
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
+{
+	struct cacheinfo *llc_x, *llc_y;
+
+	if (!last_level_cache_is_valid(cpu_x) ||
+	    !last_level_cache_is_valid(cpu_y))
+		return false;
+
+	llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
+	llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
+
+	return cache_leaves_are_shared(llc_x, llc_y);
+}
+
+#ifdef CONFIG_OF
 /* OF properties to query for a given cache type */
 struct cache_type_info {
 	const char *size_prop;
···
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct device *cpu_dev = get_cpu_device(cpu);
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
-	/* skip if fw_token is already populated */
-	if (this_cpu_ci->info_list->fw_token) {
-		return 0;
-	}
-
-	if (!cpu_dev) {
-		pr_err("No cpu device for CPU %d\n", cpu);
-		return -ENODEV;
-	}
-	np = cpu_dev->of_node;
+	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
 		return -ENOENT;
 	}
 
 	while (index < cache_leaves(cpu)) {
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		if (this_leaf->level != 1)
 			np = of_find_next_cache_node(np);
 		else
···
 }
 #else
 static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
-static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
-					   struct cacheinfo *sib_leaf)
-{
-	/*
-	 * For non-DT/ACPI systems, assume unique level 1 caches, system-wide
-	 * shared caches for all other levels. This will be used only if
-	 * arch specific code has not populated shared_cpu_map
-	 */
-	return !(this_leaf->level == 1);
-}
 #endif
 
 int __weak cache_setup_acpi(unsigned int cpu)
···
 }
 
 unsigned int coherency_max_size;
+
+static int cache_setup_properties(unsigned int cpu)
+{
+	int ret = 0;
+
+	if (of_have_populated_dt())
+		ret = cache_setup_of_node(cpu);
+	else if (!acpi_disabled)
+		ret = cache_setup_acpi(cpu);
+
+	return ret;
+}
 
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
···
 	if (this_cpu_ci->cpu_map_populated)
 		return 0;
 
-	if (of_have_populated_dt())
-		ret = cache_setup_of_node(cpu);
-	else if (!acpi_disabled)
-		ret = cache_setup_acpi(cpu);
-
-	if (ret)
-		return ret;
+	/*
+	 * skip setting up cache properties if LLC is valid, just need
+	 * to update the shared cpu_map if the cache attributes were
+	 * populated early before all the cpus are brought online
+	 */
+	if (!last_level_cache_is_valid(cpu)) {
+		ret = cache_setup_properties(cpu);
+		if (ret)
+			return ret;
+	}
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;
 
-		this_leaf = this_cpu_ci->info_list + index;
-		/* skip if shared_cpu_map is already populated */
-		if (!cpumask_empty(&this_leaf->shared_cpu_map))
-			continue;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 
 		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
 		for_each_online_cpu(i) {
···
 
 			if (i == cpu || !sib_cpu_ci->info_list)
 				continue;/* skip if itself or no cacheinfo */
-			sib_leaf = sib_cpu_ci->info_list + index;
+
+			sib_leaf = per_cpu_cacheinfo_idx(i, index);
 			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
 				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
 				cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
···
 
 static void cache_shared_cpu_map_remove(unsigned int cpu)
 {
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	struct cacheinfo *this_leaf, *sib_leaf;
 	unsigned int sibling, index;
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
-			struct cpu_cacheinfo *sib_cpu_ci;
+			struct cpu_cacheinfo *sib_cpu_ci =
+						get_cpu_cacheinfo(sibling);
 
-			if (sibling == cpu) /* skip itself */
-				continue;
+			if (sibling == cpu || !sib_cpu_ci->info_list)
+				continue;/* skip if itself or no cacheinfo */
 
-			sib_cpu_ci = get_cpu_cacheinfo(sibling);
-			if (!sib_cpu_ci->info_list)
-				continue;
-
-			sib_leaf = sib_cpu_ci->info_list + index;
+			sib_leaf = per_cpu_cacheinfo_idx(sibling, index);
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
 		}
···
 	return -ENOENT;
 }
 
-static int detect_cache_attributes(unsigned int cpu)
+int detect_cache_attributes(unsigned int cpu)
 {
 	int ret;
+
+	/* Since early detection of the cacheinfo is allowed via this
+	 * function and this also gets called as CPU hotplug callbacks via
+	 * cacheinfo_cpu_online, the initialisation can be skipped and only
+	 * CPU maps can be updated as the CPU online status would be update
+	 * if called via cacheinfo_cpu_online path.
+	 */
+	if (per_cpu_cacheinfo(cpu))
+		goto update_cpu_map;
 
 	if (init_cache_level(cpu) || !cache_leaves(cpu))
 		return -ENOENT;
 
 	per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu),
-					 sizeof(struct cacheinfo), GFP_KERNEL);
-	if (per_cpu_cacheinfo(cpu) == NULL)
+					 sizeof(struct cacheinfo), GFP_ATOMIC);
+	if (per_cpu_cacheinfo(cpu) == NULL) {
+		cache_leaves(cpu) = 0;
 		return -ENOMEM;
+	}
 
 	/*
 	 * populate_cache_leaves() may completely setup the cache leaves and
···
 	ret = populate_cache_leaves(cpu);
 	if (ret)
 		goto free_ci;
+
+update_cpu_map:
 	/*
 	 * For systems using DT for cache hierarchy, fw_token
 	 * and shared_cpu_map will be set up here only if they are
···
 	int rc;
 	struct device *ci_dev, *parent;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	const struct attribute_group **cache_groups;
 
 	rc = cpu_cache_sysfs_init(cpu);
···
 
 	parent = per_cpu_cache_dev(cpu);
 	for (i = 0; i < cache_leaves(cpu); i++) {
-		this_leaf = this_cpu_ci->info_list + i;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, i);
 		if (this_leaf->disable_sysfs)
 			continue;
 		if (this_leaf->type == CACHE_TYPE_NOCACHE)
+116 -7
drivers/base/core.c
···
 static DEFINE_MUTEX(fwnode_link_lock);
 static bool fw_devlink_is_permissive(void);
 static bool fw_devlink_drv_reg_done;
+static bool fw_devlink_best_effort;

 /**
  * fwnode_link_add - Create a link between two fwnode_handles.
···
 	}
 }

+static bool dev_is_best_effort(struct device *dev)
+{
+	return (fw_devlink_best_effort && dev->can_match) ||
+		(dev->fwnode && (dev->fwnode->flags & FWNODE_FLAG_BEST_EFFORT));
+}
+
 /**
  * device_links_check_suppliers - Check presence of supplier drivers.
  * @dev: Consumer device.
···
 int device_links_check_suppliers(struct device *dev)
 {
 	struct device_link *link;
-	int ret = 0;
+	int ret = 0, fwnode_ret = 0;
 	struct fwnode_handle *sup_fw;

 	/*
···
 		sup_fw = list_first_entry(&dev->fwnode->suppliers,
 					  struct fwnode_link,
 					  c_hook)->supplier;
-		dev_err_probe(dev, -EPROBE_DEFER, "wait for supplier %pfwP\n",
-			      sup_fw);
-		mutex_unlock(&fwnode_link_lock);
-		return -EPROBE_DEFER;
+		if (!dev_is_best_effort(dev)) {
+			fwnode_ret = -EPROBE_DEFER;
+			dev_err_probe(dev, -EPROBE_DEFER,
+				      "wait for supplier %pfwP\n", sup_fw);
+		} else {
+			fwnode_ret = -EAGAIN;
+		}
 	}
 	mutex_unlock(&fwnode_link_lock);
+	if (fwnode_ret == -EPROBE_DEFER)
+		return fwnode_ret;

 	device_links_write_lock();

···

 		if (link->status != DL_STATE_AVAILABLE &&
 		    !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
+
+			if (dev_is_best_effort(dev) &&
+			    link->flags & DL_FLAG_INFERRED &&
+			    !link->supplier->can_match) {
+				ret = -EAGAIN;
+				continue;
+			}
+
 			device_links_missing_supplier(dev);
 			dev_err_probe(dev, -EPROBE_DEFER,
 				      "supplier %s not ready\n",
···
 	dev->links.status = DL_DEV_PROBING;

 	device_links_write_unlock();
-	return ret;
+
+	return ret ? ret : fwnode_ret;
 }

 /**
···
 			 * When DL_FLAG_SYNC_STATE_ONLY is set, it means no
 			 * other DL_MANAGED_LINK_FLAGS have been set. So, it's
 			 * save to drop the managed link completely.
+			 */
+			device_link_drop_managed(link);
+		} else if (dev_is_best_effort(dev) &&
+			   link->flags & DL_FLAG_INFERRED &&
+			   link->status != DL_STATE_CONSUMER_PROBE &&
+			   !link->supplier->can_match) {
+			/*
+			 * When dev_is_best_effort() is true, we ignore device
+			 * links to suppliers that don't have a driver. If the
+			 * consumer device still managed to probe, there's no
+			 * point in maintaining a device link in a weird state
+			 * (consumer probed before supplier). So delete it.
 			 */
 			device_link_drop_managed(link);
 		} else {
···
 }
 early_param("fw_devlink", fw_devlink_setup);

-static bool fw_devlink_strict;
+static bool fw_devlink_strict = true;
 static int __init fw_devlink_strict_setup(char *arg)
 {
 	return strtobool(arg, &fw_devlink_strict);
···
 	class_for_each_device(&devlink_class, NULL, NULL,
 			      fw_devlink_no_driver);
 	device_links_write_unlock();
+}
+
+/**
+ * wait_for_init_devices_probe - Try to probe any device needed for init
+ *
+ * Some devices might need to be probed and bound successfully before the kernel
+ * boot sequence can finish and move on to init/userspace. For example, a
+ * network interface might need to be bound to be able to mount a NFS rootfs.
+ *
+ * With fw_devlink=on by default, some of these devices might be blocked from
+ * probing because they are waiting on a optional supplier that doesn't have a
+ * driver. While fw_devlink will eventually identify such devices and unblock
+ * the probing automatically, it might be too late by the time it unblocks the
+ * probing of devices. For example, the IP4 autoconfig might timeout before
+ * fw_devlink unblocks probing of the network interface.
+ *
+ * This function is available to temporarily try and probe all devices that have
+ * a driver even if some of their suppliers haven't been added or don't have
+ * drivers.
+ *
+ * The drivers can then decide which of the suppliers are optional vs mandatory
+ * and probe the device if possible. By the time this function returns, all such
+ * "best effort" probes are guaranteed to be completed. If a device successfully
+ * probes in this mode, we delete all fw_devlink discovered dependencies of that
+ * device where the supplier hasn't yet probed successfully because they have to
+ * be optional dependencies.
+ *
+ * Any devices that didn't successfully probe go back to being treated as if
+ * this function was never called.
+ *
+ * This also means that some devices that aren't needed for init and could have
+ * waited for their optional supplier to probe (when the supplier's module is
+ * loaded later on) would end up probing prematurely with limited functionality.
+ * So call this function only when boot would fail without it.
+ */
+void __init wait_for_init_devices_probe(void)
+{
+	if (!fw_devlink_flags || fw_devlink_is_permissive())
+		return;
+
+	/*
+	 * Wait for all ongoing probes to finish so that the "best effort" is
+	 * only applied to devices that can't probe otherwise.
+	 */
+	wait_for_device_probe();
+
+	pr_info("Trying to probe devices needed for running init ...\n");
+	fw_devlink_best_effort = true;
+	driver_deferred_probe_trigger();
+
+	/*
+	 * Wait for all "best effort" probes to finish before going back to
+	 * normal enforcement.
+	 */
+	wait_for_device_probe();
+	fw_devlink_best_effort = false;
 }

 static void fw_devlink_unblock_consumers(struct device *dev)
···
 	return child;
 }
 EXPORT_SYMBOL_GPL(device_find_child_by_name);
+
+static int match_any(struct device *dev, void *unused)
+{
+	return 1;
+}
+
+/**
+ * device_find_any_child - device iterator for locating a child device, if any.
+ * @parent: parent struct device
+ *
+ * This is similar to the device_find_child() function above, but it
+ * returns a reference to a child device, if any.
+ *
+ * NOTE: you will need to drop the reference with put_device() after use.
+ */
+struct device *device_find_any_child(struct device *parent)
+{
+	return device_find_child(parent, NULL, match_any);
+}
+EXPORT_SYMBOL_GPL(device_find_any_child);

 int __init devices_init(void)
 {
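The three-way result of device_links_check_suppliers() (0, -EAGAIN, -EPROBE_DEFER) and the way the caller in drivers/base/dd.c consumes it can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct probe_ctx, check_suppliers() and try_probe() are hypothetical names invented here, EPROBE_DEFER is redefined locally because userspace <errno.h> does not provide it, and the real really_probe() does more (for example, it reports driver probe errors as positive values).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value, redefined for this userspace sketch. */
#define EPROBE_DEFER 517

/* Hypothetical, simplified stand-in for the state the kernel inspects. */
struct probe_ctx {
	bool best_effort;         /* models dev_is_best_effort(dev) */
	bool supplier_missing;    /* some supplier is not ready yet */
	int  driver_probe_result; /* what the driver's probe() would return */
};

/* Models device_links_check_suppliers(): 0, -EAGAIN or -EPROBE_DEFER. */
static int check_suppliers(const struct probe_ctx *ctx)
{
	if (!ctx->supplier_missing)
		return 0;
	return ctx->best_effort ? -EAGAIN : -EPROBE_DEFER;
}

/* Models only the supplier-related part of really_probe(). */
static int try_probe(const struct probe_ctx *ctx)
{
	int link_ret = check_suppliers(ctx);
	int ret;

	/* Normal enforcement: do not even call the driver. */
	if (link_ret == -EPROBE_DEFER)
		return link_ret;

	/* "Call" the driver. */
	ret = ctx->driver_probe_result;

	/*
	 * Best effort: a failure may be caused by the missing supplier,
	 * so treat it as a deferral and retry later.
	 */
	if (ret && link_ret == -EAGAIN)
		ret = -EPROBE_DEFER;

	return ret;
}
```

In this model a best-effort device with a missing supplier still gets its probe attempted, and only a failed attempt is converted back into a deferral.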
+23 -36
drivers/base/dd.c
···
  * changes in the midst of a probe, then deferred processing should be triggered
  * again.
  */
-static void driver_deferred_probe_trigger(void)
+void driver_deferred_probe_trigger(void)
 {
 	if (!driver_deferred_probe_enable)
 		return;
···
 }
 DEFINE_SHOW_ATTRIBUTE(deferred_devs);

+#ifdef CONFIG_MODULES
+int driver_deferred_probe_timeout = 10;
+#else
 int driver_deferred_probe_timeout;
+#endif
+
 EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);

 static int __init deferred_probe_timeout_setup(char *str)
···
 }
 __setup("deferred_probe_timeout=", deferred_probe_timeout_setup);

-/**
- * driver_deferred_probe_check_state() - Check deferred probe state
- * @dev: device to check
- *
- * Return:
- * * -ENODEV if initcalls have completed and modules are disabled.
- * * -ETIMEDOUT if the deferred probe timeout was set and has expired
- *   and modules are enabled.
- * * -EPROBE_DEFER in other cases.
- *
- * Drivers or subsystems can opt-in to calling this function instead of directly
- * returning -EPROBE_DEFER.
- */
-int driver_deferred_probe_check_state(struct device *dev)
-{
-	if (!IS_ENABLED(CONFIG_MODULES) && initcalls_done) {
-		dev_warn(dev, "ignoring dependency for device, assuming no driver\n");
-		return -ENODEV;
-	}
-
-	if (!driver_deferred_probe_timeout && initcalls_done) {
-		dev_warn(dev, "deferred probe timeout, ignoring dependency\n");
-		return -ETIMEDOUT;
-	}
-
-	return -EPROBE_DEFER;
-}
-EXPORT_SYMBOL_GPL(driver_deferred_probe_check_state);
-
 static void deferred_probe_timeout_work_func(struct work_struct *work)
 {
 	struct device_private *p;

 	fw_devlink_drivers_done();

-	driver_deferred_probe_timeout = 0;
 	driver_deferred_probe_trigger();
 	flush_work(&deferred_probe_work);

···
 {
 	bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) &&
 			   !drv->suppress_bind_attrs;
-	int ret;
+	int ret, link_ret;

 	if (defer_all_probes) {
 		/*
···
 		return -EPROBE_DEFER;
 	}

-	ret = device_links_check_suppliers(dev);
-	if (ret)
-		return ret;
+	link_ret = device_links_check_suppliers(dev);
+	if (link_ret == -EPROBE_DEFER)
+		return link_ret;

 	pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
 		 drv->bus->name, __func__, drv->name, dev_name(dev));
···

 	ret = call_driver_probe(dev, drv);
 	if (ret) {
+		/*
+		 * If fw_devlink_best_effort is active (denoted by -EAGAIN), the
+		 * device might actually probe properly once some of its missing
+		 * suppliers have probed. So, treat this as if the driver
+		 * returned -EPROBE_DEFER.
+		 */
+		if (link_ret == -EAGAIN)
+			ret = -EPROBE_DEFER;
+
 		/*
 		 * Return probe errors as positive values so that the callers
 		 * can distinguish them from other errors.
···
 static int __driver_attach(struct device *dev, void *data)
 {
 	struct device_driver *drv = data;
+	bool async = false;
 	int ret;

 	/*
···
 		if (!dev->driver && !dev->p->async_driver) {
 			get_device(dev);
 			dev->p->async_driver = drv;
-			async_schedule_dev(__driver_attach_async_helper, dev);
+			async = true;
 		}
 		device_unlock(dev);
+		if (async)
+			async_schedule_dev(__driver_attach_async_helper, dev);
 		return 0;
 	}

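The last hunk of drivers/base/dd.c moves async_schedule_dev() out of the device_lock()/device_unlock() section: the decision is recorded in a local flag while the lock is held, and the call happens after the unlock. The diff itself does not state the motivation. One plausible reading, treated here as an assumption, is that the scheduling call may end up running the helper synchronously, and the helper takes the same device lock. The userspace sketch below illustrates that "decide under the lock, act after unlocking" pattern; struct fake_dev and all function names are hypothetical, and a C11 atomic_flag spin flag stands in for the device mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct device fields that matter here. */
struct fake_dev {
	atomic_flag lock;         /* models the device lock */
	const char *driver;       /* bound driver, if any */
	const char *async_driver; /* driver queued for async attach */
	int scheduled;            /* helper runs that could take the lock */
	int would_deadlock;       /* helper runs that found the lock held */
};

static void fake_dev_init(struct fake_dev *dev)
{
	atomic_flag_clear(&dev->lock);
	dev->driver = NULL;
	dev->async_driver = NULL;
	dev->scheduled = 0;
	dev->would_deadlock = 0;
}

static void dev_lock(struct fake_dev *dev)
{
	while (atomic_flag_test_and_set(&dev->lock))
		; /* spin until the flag was clear */
}

static void dev_unlock(struct fake_dev *dev)
{
	atomic_flag_clear(&dev->lock);
}

/* Models a scheduler that runs the helper synchronously in the caller. */
static void schedule_sync_fallback(struct fake_dev *dev)
{
	if (atomic_flag_test_and_set(&dev->lock)) {
		/* Lock already held by the caller: real code would block. */
		dev->would_deadlock++;
		return;
	}
	dev->scheduled++;
	atomic_flag_clear(&dev->lock);
}

/* Record the decision under the lock, act on it after unlocking. */
static void attach_async(struct fake_dev *dev, const char *drv)
{
	bool async = false;

	dev_lock(dev);
	if (!dev->driver && !dev->async_driver) {
		dev->async_driver = drv;
		async = true;
	}
	dev_unlock(dev);

	if (async)
		schedule_sync_fallback(dev);
}
```

Calling schedule_sync_fallback() from inside the locked region instead would increment would_deadlock in this model, which is the situation the reordering avoids.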
+1
drivers/base/devtmpfs.c
···
 	if (err) {
 		printk(KERN_ERR "devtmpfs: unable to create devtmpfs %i\n", err);
 		unregister_filesystem(&dev_fs_type);
+		thread = NULL;
 		return err;
 	}

+2 -2
drivers/base/firmware_loader/main.c
···

 		/* decompress onto the new allocated page */
 		page = fw_priv->pages[fw_priv->nr_pages - 1];
-		xz_buf.out = kmap(page);
+		xz_buf.out = kmap_local_page(page);
 		xz_buf.out_pos = 0;
 		xz_buf.out_size = PAGE_SIZE;
 		xz_ret = xz_dec_run(xz_dec, &xz_buf);
-		kunmap(page);
+		kunmap_local(xz_buf.out);
 		fw_priv->size += xz_buf.out_pos;
 		/* partial decompression means either end or error */
 		if (xz_buf.out_pos != PAGE_SIZE)
+4 -6
drivers/base/firmware_loader/sysfs.c
···
 			       loff_t offset, size_t count, bool read)
 {
 	while (count) {
-		void *page_data;
 		int page_nr = offset >> PAGE_SHIFT;
 		int page_ofs = offset & (PAGE_SIZE - 1);
 		int page_cnt = min_t(size_t, PAGE_SIZE - page_ofs, count);

-		page_data = kmap(fw_priv->pages[page_nr]);
-
 		if (read)
-			memcpy(buffer, page_data + page_ofs, page_cnt);
+			memcpy_from_page(buffer, fw_priv->pages[page_nr],
+					 page_ofs, page_cnt);
 		else
-			memcpy(page_data + page_ofs, buffer, page_cnt);
+			memcpy_to_page(fw_priv->pages[page_nr], page_ofs,
+				       buffer, page_cnt);

-		kunmap(fw_priv->pages[page_nr]);
 		buffer += page_cnt;
 		offset += page_cnt;
 		count -= page_cnt;
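The loop in this hunk walks a byte range that may straddle page boundaries, splitting it into per-page pieces with page_nr, page_ofs and page_cnt. A userspace sketch of the same arithmetic follows, assuming plain heap or static buffers in place of struct page and ordinary memcpy() in place of memcpy_from_page()/memcpy_to_page(). PG_SHIFT, PG_SIZE, min_size() and paged_rw() are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PG_SHIFT 12                      /* assumed 4 KiB pages */
#define PG_SIZE  ((size_t)1 << PG_SHIFT)

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Copy count bytes between a flat buffer and an array of page-sized
 * chunks, starting at byte offset within the chunked storage.
 * read != 0 copies pages -> buffer, read == 0 copies buffer -> pages.
 */
static void paged_rw(unsigned char **pages, unsigned char *buffer,
		     size_t offset, size_t count, int read)
{
	while (count) {
		size_t page_nr  = offset >> PG_SHIFT;       /* which page */
		size_t page_ofs = offset & (PG_SIZE - 1);   /* offset inside it */
		size_t page_cnt = min_size(PG_SIZE - page_ofs, count);

		if (read)
			memcpy(buffer, pages[page_nr] + page_ofs, page_cnt);
		else
			memcpy(pages[page_nr] + page_ofs, buffer, page_cnt);

		buffer += page_cnt;
		offset += page_cnt;
		count  -= page_cnt;
	}
}
```

Each iteration copies at most up to the end of the current page, so a range that crosses a boundary is handled in two or more passes.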
+2 -2
drivers/base/node.c
···
 	return n;
 }

-static BIN_ATTR_RO(cpumap, 0);
+static BIN_ATTR_RO(cpumap, CPUMAP_FILE_MAX_BYTES);

 static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
 				   struct bin_attribute *attr, char *buf,
···
 	return n;
 }

-static BIN_ATTR_RO(cpulist, 0);
+static BIN_ATTR_RO(cpulist, CPULIST_FILE_MAX_BYTES);

 /**
  * struct node_access_nodes - Access class device to hold user visible
+1 -1
drivers/base/power/domain.c
···
 		mutex_unlock(&gpd_list_lock);
 		dev_dbg(dev, "%s() failed to find PM domain: %ld\n",
 			__func__, PTR_ERR(pd));
-		return driver_deferred_probe_check_state(base_dev);
+		return -ENODEV;
 	}

 	dev_dbg(dev, "adding to PM domain %s\n", pd->name);
+16 -16
drivers/base/topology.c
···
 static DEVICE_ATTR_ADMIN_RO(ppin);

 define_siblings_read_func(thread_siblings, sibling_cpumask);
-static BIN_ATTR_RO(thread_siblings, 0);
-static BIN_ATTR_RO(thread_siblings_list, 0);
+static BIN_ATTR_RO(thread_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(thread_siblings_list, CPULIST_FILE_MAX_BYTES);

 define_siblings_read_func(core_cpus, sibling_cpumask);
-static BIN_ATTR_RO(core_cpus, 0);
-static BIN_ATTR_RO(core_cpus_list, 0);
+static BIN_ATTR_RO(core_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_cpus_list, CPULIST_FILE_MAX_BYTES);

 define_siblings_read_func(core_siblings, core_cpumask);
-static BIN_ATTR_RO(core_siblings, 0);
-static BIN_ATTR_RO(core_siblings_list, 0);
+static BIN_ATTR_RO(core_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_siblings_list, CPULIST_FILE_MAX_BYTES);

 #ifdef TOPOLOGY_CLUSTER_SYSFS
 define_siblings_read_func(cluster_cpus, cluster_cpumask);
-static BIN_ATTR_RO(cluster_cpus, 0);
-static BIN_ATTR_RO(cluster_cpus_list, 0);
+static BIN_ATTR_RO(cluster_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(cluster_cpus_list, CPULIST_FILE_MAX_BYTES);
 #endif

 #ifdef TOPOLOGY_DIE_SYSFS
 define_siblings_read_func(die_cpus, die_cpumask);
-static BIN_ATTR_RO(die_cpus, 0);
-static BIN_ATTR_RO(die_cpus_list, 0);
+static BIN_ATTR_RO(die_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(die_cpus_list, CPULIST_FILE_MAX_BYTES);
 #endif

 define_siblings_read_func(package_cpus, core_cpumask);
-static BIN_ATTR_RO(package_cpus, 0);
-static BIN_ATTR_RO(package_cpus_list, 0);
+static BIN_ATTR_RO(package_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(package_cpus_list, CPULIST_FILE_MAX_BYTES);

 #ifdef TOPOLOGY_BOOK_SYSFS
 define_id_show_func(book_id, "%d");
 static DEVICE_ATTR_RO(book_id);
 define_siblings_read_func(book_siblings, book_cpumask);
-static BIN_ATTR_RO(book_siblings, 0);
-static BIN_ATTR_RO(book_siblings_list, 0);
+static BIN_ATTR_RO(book_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(book_siblings_list, CPULIST_FILE_MAX_BYTES);
 #endif

 #ifdef TOPOLOGY_DRAWER_SYSFS
 define_id_show_func(drawer_id, "%d");
 static DEVICE_ATTR_RO(drawer_id);
 define_siblings_read_func(drawer_siblings, drawer_cpumask);
-static BIN_ATTR_RO(drawer_siblings, 0);
-static BIN_ATTR_RO(drawer_siblings_list, 0);
+static BIN_ATTR_RO(drawer_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(drawer_siblings_list, CPULIST_FILE_MAX_BYTES);
 #endif

 static struct bin_attribute *bin_attrs[] = {
+1 -1
drivers/iommu/of_iommu.c
···
 	 * a proper probe-ordering dependency mechanism in future.
 	 */
 	if (!ops)
-		return driver_deferred_probe_check_state(dev);
+		return -ENODEV;

 	if (!try_module_get(ops->owner))
 		return -ENODEV;
+1 -3
drivers/net/mdio/fwnode_mdio.c
···
 	 * just fall back to poll mode
 	 */
 	if (rc == -EPROBE_DEFER)
-		rc = driver_deferred_probe_check_state(&phy->mdio.dev);
-	if (rc == -EPROBE_DEFER)
-		return rc;
+		rc = -ENODEV;

 	if (rc > 0) {
 		phy->irq = rc;
+2
drivers/of/base.c
···
 		of_property_read_string(of_aliases, "stdout", &name);
 		if (name)
 			of_stdout = of_find_node_opts_by_path(name, &of_stdout_options);
+		if (of_stdout)
+			of_stdout->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
 	}

 	if (!of_aliases)
+1 -1
drivers/pinctrl/devicetree.c
···
 		np_pctldev = of_get_next_parent(np_pctldev);
 		if (!np_pctldev || of_node_is_root(np_pctldev)) {
 			of_node_put(np_pctldev);
-			ret = driver_deferred_probe_check_state(p->dev);
+			ret = -ENODEV;
 			/* keep deferring if modules are enabled */
 			if (IS_ENABLED(CONFIG_MODULES) && !allow_default && ret < 0)
 				ret = -EPROBE_DEFER;
+2 -7
drivers/spi/spi.c
···
 }
 EXPORT_SYMBOL_GPL(spi_slave_abort);

-static int match_true(struct device *dev, void *data)
-{
-	return 1;
-}
-
 static ssize_t slave_show(struct device *dev, struct device_attribute *attr,
 			  char *buf)
 {
···
 						   dev);
 	struct device *child;

-	child = device_find_child(&ctlr->dev, NULL, match_true);
+	child = device_find_any_child(&ctlr->dev);
 	return sprintf(buf, "%s\n",
 		       child ? to_spi_device(child)->modalias : NULL);
 }
···
 	if (rc != 1 || !name[0])
 		return -EINVAL;

-	child = device_find_child(&ctlr->dev, NULL, match_true);
+	child = device_find_any_child(&ctlr->dev);
 	if (child) {
 		/* Remove registered slave */
 		device_unregister(child);
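The spi.c change drops a private match_true() callback in favour of the new device_find_any_child() helper, which is device_find_child() called with a match function that accepts everything. The shape of that helper can be sketched in userspace as below. struct node, find_child() and find_any_child() are hypothetical stand-ins, and the sketch deliberately omits the reference counting that the kernel helper performs (the caller of the real function must call put_device() on the result).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a device with a fixed-size list of children. */
struct node {
	const char *name;
	struct node *children[MAX_CHILDREN];
	size_t nr_children;
};

typedef int (*match_fn)(struct node *child, void *data);

/* Return the first child for which match() returns non-zero, else NULL. */
static struct node *find_child(struct node *parent, void *data, match_fn match)
{
	size_t i;

	if (!parent)
		return NULL;

	for (i = 0; i < parent->nr_children; i++)
		if (match(parent->children[i], data))
			return parent->children[i];

	return NULL;
}

/* Accept every child; both arguments are intentionally unused. */
static int match_any(struct node *child, void *unused)
{
	(void)child;
	(void)unused;
	return 1;
}

/* Convenience wrapper: "give me any child, if there is one". */
static struct node *find_any_child(struct node *parent)
{
	return find_child(parent, NULL, match_any);
}
```

Moving the trivial callback into one shared helper means each caller no longer needs its own always-true match function.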
+5 -2
fs/kernfs/dir.c
···
 {
 	struct kernfs_node *pos;

+	/* Short-circuit if non-root @kn has already finished removal. */
+	if (!kn)
+		return;
+
 	lockdep_assert_held_write(&kernfs_root(kn)->kernfs_rwsem);

 	/*
-	 * Short-circuit if non-root @kn has already finished removal.
 	 * This is for kernfs_remove_self() which plays with active ref
 	 * after removal.
 	 */
-	if (!kn || (kn->parent && RB_EMPTY_NODE(&kn->rb)))
+	if (kn->parent && RB_EMPTY_NODE(&kn->rb))
 		return;

 	pr_debug("kernfs %s: removing\n", kn->name);
+134 -71
fs/kernfs/file.c
···

 #include "kernfs-internal.h"

-/*
- * There's one kernfs_open_file for each open file and one kernfs_open_node
- * for each kernfs_node with one or more open files.
- *
- * kernfs_node->attr.open points to kernfs_open_node. attr.open is
- * protected by kernfs_open_node_lock.
- *
- * filp->private_data points to seq_file whose ->private points to
- * kernfs_open_file. kernfs_open_files are chained at
- * kernfs_open_node->files, which is protected by kernfs_open_file_mutex.
- */
-static DEFINE_SPINLOCK(kernfs_open_node_lock);
-static DEFINE_MUTEX(kernfs_open_file_mutex);
-
 struct kernfs_open_node {
+	struct rcu_head rcu_head;
 	atomic_t event;
 	wait_queue_head_t poll;
 	struct list_head files; /* goes through kernfs_open_file.list */
···

 static DEFINE_SPINLOCK(kernfs_notify_lock);
 static struct kernfs_node *kernfs_notify_list = KERNFS_NOTIFY_EOL;
+
+static inline struct mutex *kernfs_open_file_mutex_ptr(struct kernfs_node *kn)
+{
+	int idx = hash_ptr(kn, NR_KERNFS_LOCK_BITS);
+
+	return &kernfs_locks->open_file_mutex[idx];
+}
+
+static inline struct mutex *kernfs_open_file_mutex_lock(struct kernfs_node *kn)
+{
+	struct mutex *lock;
+
+	lock = kernfs_open_file_mutex_ptr(kn);
+
+	mutex_lock(lock);
+
+	return lock;
+}
+
+/**
+ * kernfs_deref_open_node - Get kernfs_open_node corresponding to @kn.
+ *
+ * @of: associated kernfs_open_file instance.
+ * @kn: target kernfs_node.
+ *
+ * Fetch and return ->attr.open of @kn if @of->list is non empty.
+ * If @of->list is not empty we can safely assume that @of is on
+ * @kn->attr.open->files list and this guarantees that @kn->attr.open
+ * will not vanish i.e. dereferencing outside RCU read-side critical
+ * section is safe here.
+ *
+ * The caller needs to make sure that @of->list is not empty.
+ */
+static struct kernfs_open_node *
+kernfs_deref_open_node(struct kernfs_open_file *of, struct kernfs_node *kn)
+{
+	struct kernfs_open_node *on;
+
+	on = rcu_dereference_check(kn->attr.open, !list_empty(&of->list));
+
+	return on;
+}
+
+/**
+ * kernfs_deref_open_node_protected - Get kernfs_open_node corresponding to @kn
+ *
+ * @kn: target kernfs_node.
+ *
+ * Fetch and return ->attr.open of @kn when caller holds the
+ * kernfs_open_file_mutex_ptr(kn).
+ *
+ * Update of ->attr.open happens under kernfs_open_file_mutex_ptr(kn). So when
+ * the caller guarantees that this mutex is being held, other updaters can't
+ * change ->attr.open and this means that we can safely deref ->attr.open
+ * outside RCU read-side critical section.
+ *
+ * The caller needs to make sure that kernfs_open_file_mutex is held.
+ */
+static struct kernfs_open_node *
+kernfs_deref_open_node_protected(struct kernfs_node *kn)
+{
+	return rcu_dereference_protected(kn->attr.open,
+				lockdep_is_held(kernfs_open_file_mutex_ptr(kn)));
+}

 static struct kernfs_open_file *kernfs_of(struct file *file)
 {
···
 static int kernfs_seq_show(struct seq_file *sf, void *v)
 {
 	struct kernfs_open_file *of = sf->private;
+	struct kernfs_open_node *on = kernfs_deref_open_node(of, of->kn);

-	of->event = atomic_read(&of->kn->attr.open->event);
+	if (!on)
+		return -EINVAL;
+
+	of->event = atomic_read(&on->event);

 	return of->kn->attr.ops->seq_show(sf, v);
 }
···
 	struct kernfs_open_file *of = kernfs_of(iocb->ki_filp);
 	ssize_t len = min_t(size_t, iov_iter_count(iter), PAGE_SIZE);
 	const struct kernfs_ops *ops;
+	struct kernfs_open_node *on;
 	char *buf;

 	buf = of->prealloc_buf;
···
 		goto out_free;
 	}

-	of->event = atomic_read(&of->kn->attr.open->event);
+	on = kernfs_deref_open_node(of, of->kn);
+	if (!on) {
+		len = -EINVAL;
+		mutex_unlock(&of->mutex);
+		goto out_free;
+	}
+
+	of->event = atomic_read(&on->event);
+
 	ops = kernfs_ops(of->kn);
 	if (ops->read)
 		len = ops->read(of, buf, len, iocb->ki_pos);
···
  * There is no easy way for us to know if userspace is only doing a partial
  * write, so we don't support them. We expect the entire buffer to come on
  * the first write. Hint: if you're writing a value, first read the file,
- * modify only the the value you're changing, then write entire buffer
+ * modify only the value you're changing, then write entire buffer
  * back.
  */
 static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter)
···
 	 * It is not possible to successfully wrap close.
 	 * So error if someone is trying to use close.
 	 */
-	rc = -EINVAL;
 	if (vma->vm_ops && vma->vm_ops->close)
 		goto out_put;

···
 				struct kernfs_open_file *of)
 {
 	struct kernfs_open_node *on, *new_on = NULL;
+	struct mutex *mutex = NULL;

- retry:
-	mutex_lock(&kernfs_open_file_mutex);
-	spin_lock_irq(&kernfs_open_node_lock);
-
-	if (!kn->attr.open && new_on) {
-		kn->attr.open = new_on;
-		new_on = NULL;
-	}
-
-	on = kn->attr.open;
-	if (on)
-		list_add_tail(&of->list, &on->files);
-
-	spin_unlock_irq(&kernfs_open_node_lock);
-	mutex_unlock(&kernfs_open_file_mutex);
+	mutex = kernfs_open_file_mutex_lock(kn);
+	on = kernfs_deref_open_node_protected(kn);

 	if (on) {
-		kfree(new_on);
+		list_add_tail(&of->list, &on->files);
+		mutex_unlock(mutex);
 		return 0;
+	} else {
+		/* not there, initialize a new one */
+		new_on = kmalloc(sizeof(*new_on), GFP_KERNEL);
+		if (!new_on) {
+			mutex_unlock(mutex);
+			return -ENOMEM;
+		}
+		atomic_set(&new_on->event, 1);
+		init_waitqueue_head(&new_on->poll);
+		INIT_LIST_HEAD(&new_on->files);
+		list_add_tail(&of->list, &new_on->files);
+		rcu_assign_pointer(kn->attr.open, new_on);
 	}
+	mutex_unlock(mutex);

-	/* not there, initialize a new one and retry */
-	new_on = kmalloc(sizeof(*new_on), GFP_KERNEL);
-	if (!new_on)
-		return -ENOMEM;
-
-	atomic_set(&new_on->event, 1);
-	init_waitqueue_head(&new_on->poll);
-	INIT_LIST_HEAD(&new_on->files);
-	goto retry;
+	return 0;
 }

 /**
···
 static void kernfs_unlink_open_file(struct kernfs_node *kn,
 				    struct kernfs_open_file *of)
 {
-	struct kernfs_open_node *on = kn->attr.open;
-	unsigned long flags;
+	struct kernfs_open_node *on;
+	struct mutex *mutex = NULL;

-	mutex_lock(&kernfs_open_file_mutex);
-	spin_lock_irqsave(&kernfs_open_node_lock, flags);
+	mutex = kernfs_open_file_mutex_lock(kn);
+
+	on = kernfs_deref_open_node_protected(kn);
+	if (!on) {
+		mutex_unlock(mutex);
+		return;
+	}

 	if (of)
 		list_del(&of->list);

-	if (list_empty(&on->files))
-		kn->attr.open = NULL;
-	else
-		on = NULL;
+	if (list_empty(&on->files)) {
+		rcu_assign_pointer(kn->attr.open, NULL);
+		kfree_rcu(on, rcu_head);
+	}

-	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
-	mutex_unlock(&kernfs_open_file_mutex);
-
-	kfree(on);
+	mutex_unlock(mutex);
 }

 static int kernfs_fop_open(struct inode *inode, struct file *file)
···
 	/*
 	 * @of is guaranteed to have no other file operations in flight and
 	 * we just want to synchronize release and drain paths.
-	 * @kernfs_open_file_mutex is enough. @of->mutex can't be used
+	 * @kernfs_open_file_mutex_ptr(kn) is enough. @of->mutex can't be used
 	 * here because drain path may be called from places which can
 	 * cause circular dependency.
 	 */
-	lockdep_assert_held(&kernfs_open_file_mutex);
+	lockdep_assert_held(kernfs_open_file_mutex_ptr(kn));

 	if (!of->released) {
 		/*
···
 {
 	struct kernfs_node *kn = inode->i_private;
 	struct kernfs_open_file *of = kernfs_of(filp);
+	struct mutex *mutex = NULL;

 	if (kn->flags & KERNFS_HAS_RELEASE) {
-		mutex_lock(&kernfs_open_file_mutex);
+		mutex = kernfs_open_file_mutex_lock(kn);
 		kernfs_release_file(kn, of);
-		mutex_unlock(&kernfs_open_file_mutex);
+		mutex_unlock(mutex);
 	}

 	kernfs_unlink_open_file(kn, of);
···
 {
 	struct kernfs_open_node *on;
 	struct kernfs_open_file *of;
+	struct mutex *mutex = NULL;

 	if (!(kn->flags & (KERNFS_HAS_MMAP | KERNFS_HAS_RELEASE)))
 		return;
···
 	 * ->attr.open at this point of time. This check allows early bail out
 	 * if ->attr.open is already NULL. kernfs_unlink_open_file makes
 	 * ->attr.open NULL only while holding kernfs_open_file_mutex so below
-	 * check under kernfs_open_file_mutex will ensure bailing out if
+	 * check under kernfs_open_file_mutex_ptr(kn) will ensure bailing out if
 	 * ->attr.open became NULL while waiting for the mutex.
 	 */
-	if (!kn->attr.open)
+	if (!rcu_access_pointer(kn->attr.open))
 		return;

-	mutex_lock(&kernfs_open_file_mutex);
-	if (!kn->attr.open) {
-		mutex_unlock(&kernfs_open_file_mutex);
+	mutex = kernfs_open_file_mutex_lock(kn);
+	on = kernfs_deref_open_node_protected(kn);
+	if (!on) {
+		mutex_unlock(mutex);
 		return;
 	}
-
-	on = kn->attr.open;

 	list_for_each_entry(of, &on->files, list) {
 		struct inode *inode = file_inode(of->file);
···
 		kernfs_release_file(kn, of);
 	}

-	mutex_unlock(&kernfs_open_file_mutex);
+	mutex_unlock(mutex);
 }

 /*
···
 __poll_t kernfs_generic_poll(struct kernfs_open_file *of, poll_table *wait)
 {
 	struct kernfs_node *kn = kernfs_dentry_node(of->file->f_path.dentry);
-	struct kernfs_open_node *on = kn->attr.open;
+	struct kernfs_open_node *on = kernfs_deref_open_node(of, kn);
+
+	if (!on)
+		return EPOLLERR;

 	poll_wait(of->file, &on->poll, wait);

···
 		return;

 	/* kick poll immediately */
-	spin_lock_irqsave(&kernfs_open_node_lock, flags);
-	on = kn->attr.open;
+	rcu_read_lock();
+	on = rcu_dereference(kn->attr.open);
 	if (on) {
 		atomic_inc(&on->event);
 		wake_up_interruptible(&on->poll);
 	}
-	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
+	rcu_read_unlock();

 	/* schedule work to kick fsnotify */
 	spin_lock_irqsave(&kernfs_notify_lock, flags);
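A central piece of the kernfs scalability work in this file is replacing the single global kernfs_open_file_mutex with an array of mutexes indexed by a hash of the kernfs_node pointer, so that operations on unrelated nodes usually take different locks. A minimal userspace sketch of that hashed-lock idea follows. Everything in it is an assumption made for illustration: LOCK_BITS stands in for NR_KERNFS_LOCK_BITS, bucket_of() is a generic multiplicative hash rather than the kernel's hash_ptr(), and a C11 atomic_flag spin flag replaces struct mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_BITS 4                 /* hypothetical; 2^4 = 16 buckets */
#define NR_LOCKS  (1u << LOCK_BITS)

/* One lock per bucket; the kernel code uses struct mutex here. */
static atomic_flag lock_table[NR_LOCKS];

/* Mirrors the role of kernfs_mutex_init(): put every lock in a known state. */
static void lock_table_init(void)
{
	unsigned int i;

	for (i = 0; i < NR_LOCKS; i++)
		atomic_flag_clear(&lock_table[i]);
}

/* Map an object address to a bucket index in [0, NR_LOCKS). */
static unsigned int bucket_of(const void *ptr)
{
	uint64_t v = (uint64_t)(uintptr_t)ptr;

	/* Multiplicative hashing: keep the top LOCK_BITS bits of the product. */
	return (unsigned int)((v * 0x9E3779B97F4A7C15ull) >> (64 - LOCK_BITS));
}

/* Acquire the lock covering obj and return it so the caller can unlock. */
static atomic_flag *lock_for(const void *obj)
{
	atomic_flag *l = &lock_table[bucket_of(obj)];

	while (atomic_flag_test_and_set(l))
		; /* spin until the previous value was clear */

	return l;
}

static void unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}
```

As in the diff, the lock helper returns the lock it took, so the caller unlocks exactly that object without hashing again. Two different objects can still land in the same bucket, which costs some contention but does not affect correctness.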
+4
fs/kernfs/kernfs-internal.h
···
  */
 extern const struct inode_operations kernfs_symlink_iops;
 
+/*
+ * kernfs locks
+ */
+extern struct kernfs_global_locks *kernfs_locks;
 #endif	/* __KERNFS_INTERNAL_H */
+19
fs/kernfs/mount.c
···
 #include "kernfs-internal.h"
 
 struct kmem_cache *kernfs_node_cache, *kernfs_iattrs_cache;
+struct kernfs_global_locks *kernfs_locks;
 
 static int kernfs_sop_show_options(struct seq_file *sf, struct dentry *dentry)
 {
···
 	kfree(info);
 }
 
+static void __init kernfs_mutex_init(void)
+{
+	int count;
+
+	for (count = 0; count < NR_KERNFS_LOCKS; count++)
+		mutex_init(&kernfs_locks->open_file_mutex[count]);
+}
+
+static void __init kernfs_lock_init(void)
+{
+	kernfs_locks = kmalloc(sizeof(struct kernfs_global_locks), GFP_KERNEL);
+	WARN_ON(!kernfs_locks);
+
+	kernfs_mutex_init();
+}
+
 void __init kernfs_init(void)
 {
 	kernfs_node_cache = kmem_cache_create("kernfs_node_cache",
···
 	kernfs_iattrs_cache = kmem_cache_create("kernfs_iattrs_cache",
 					      sizeof(struct kernfs_iattrs),
 					      0, SLAB_PANIC, NULL);
+
+	kernfs_lock_init();
 }
-5
include/linux/acpi.h
···
 int find_acpi_cpu_topology_cluster(unsigned int cpu);
 int find_acpi_cpu_topology_package(unsigned int cpu);
 int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
-int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
 #else
 static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
 {
···
 	return -EINVAL;
 }
 static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
-{
-	return -EINVAL;
-}
-static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
 {
 	return -EINVAL;
 }
-1
include/linux/arch_topology.h
···
 	int core_id;
 	int cluster_id;
 	int package_id;
-	int llc_id;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 	cpumask_t cluster_sibling;
+3
include/linux/cacheinfo.h
···
 int init_cache_level(unsigned int cpu);
 int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
+bool last_level_cache_is_valid(unsigned int cpu);
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
+18
include/linux/cpumask.h
···
 	[0] =  1UL							\
 } }
 
+/*
+ * Provide a valid theoretical max size for cpumap and cpulist sysfs files
+ * to avoid breaking userspace which may allocate a buffer based on the size
+ * reported by e.g. fstat.
+ *
+ * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
+ *
+ * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
+ * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
+ * cover a worst-case of every other cpu being on one of two nodes for a
+ * very large NR_CPUS.
+ *
+ *  Use PAGE_SIZE as a minimum for smaller configurations.
+ */
+#define CPUMAP_FILE_MAX_BYTES  ((((NR_CPUS * 9)/32 - 1) > PAGE_SIZE) \
+					? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
+#define CPULIST_FILE_MAX_BYTES  (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
+
 #endif /* __LINUX_CPUMASK_H */
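The size arithmetic described in the cpumask.h comment above can be checked with a small userspace sketch. This is only an illustration: the function names are hypothetical, and `nr_cpus` and `page_size` are runtime stand-ins for the compile-time `NR_CPUS` and `PAGE_SIZE`. The sketch deliberately uses signed `long` arithmetic so that very small CPU counts cannot wrap around when 1 is subtracted; that is a choice made here, not something taken from the patch.

```c
#include <assert.h>

/*
 * Sketch of CPUMAP_FILE_MAX_BYTES for a runtime CPU count.
 * Hypothetical helper, not a kernel function.
 * Signed arithmetic is used on purpose so (nr_cpus * 9) / 32 - 1
 * can go negative for tiny nr_cpus without wrapping.
 */
long cpumap_file_max_bytes(long nr_cpus, long page_size)
{
	long exact;

	assert(nr_cpus > 0 && page_size > 0);
	exact = (nr_cpus * 9) / 32 - 1;
	return exact > page_size ? exact : page_size;
}

/*
 * Sketch of CPULIST_FILE_MAX_BYTES: up to 7 characters per listed CPU,
 * halved for the "every other CPU" worst case, with page_size as floor.
 */
long cpulist_file_max_bytes(long nr_cpus, long page_size)
{
	long worst;

	assert(nr_cpus > 0 && page_size > 0);
	worst = (nr_cpus * 7) / 2;
	return worst > page_size ? worst : page_size;
}
```

With a 4096-byte page, 8192 CPUs still fit the cpumap file in one page, while the cpulist bound grows to 28672 bytes; at 16384 CPUs the cpumap bound becomes 4607 bytes.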
+2
include/linux/device.h
···
 				int (*match)(struct device *dev, void *data));
 struct device *device_find_child_by_name(struct device *parent,
 					  const char *name);
+struct device *device_find_any_child(struct device *parent);
+
 int device_rename(struct device *dev, const char *new_name);
 int device_move(struct device *dev, struct device *new_parent,
 		enum dpm_order dpm_order);
+1 -1
include/linux/device/driver.h
···
 				  struct bus_type *bus);
 extern int driver_probe_done(void);
 extern void wait_for_device_probe(void);
+void __init wait_for_init_devices_probe(void);
 
 /* sysfs interface for exporting driver attributes */
 
···
 
 extern int driver_deferred_probe_timeout;
 void driver_deferred_probe_add(struct device *dev);
-int driver_deferred_probe_check_state(struct device *dev);
 void driver_init(void);
 
 /**
+6 -2
include/linux/firmware/trusted_foundations.h
···
 
 static inline void of_register_trusted_foundations(void)
 {
+	struct device_node *np = of_find_compatible_node(NULL, NULL, "tlm,trusted-foundations");
+
+	if (!np)
+		return;
+	of_node_put(np);
 	/*
 	 * If we find the target should enable TF but does not support it,
 	 * fail as the system won't be able to do much anyway
 	 */
-	if (of_find_compatible_node(NULL, NULL, "tlm,trusted-foundations"))
-		register_trusted_foundations(NULL);
+	register_trusted_foundations(NULL);
 }
 
 static inline bool trusted_foundations_registered(void)
+4
include/linux/fwnode.h
···
  *			     driver needs its child devices to be bound with
  *			     their respective drivers as soon as they are
  *			     added.
+ * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
+ *		suppliers. Only enforce ordering with suppliers that have
+ *		drivers.
  */
 #define FWNODE_FLAG_LINKS_ADDED			BIT(0)
 #define FWNODE_FLAG_NOT_DEVICE			BIT(1)
 #define FWNODE_FLAG_INITIALIZED			BIT(2)
 #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD	BIT(3)
+#define FWNODE_FLAG_BEST_EFFORT			BIT(4)
 
 struct fwnode_handle {
 	struct fwnode_handle *secondary;
+58 -1
include/linux/kernfs.h
···
 #include <linux/uidgid.h>
 #include <linux/wait.h>
 #include <linux/rwsem.h>
+#include <linux/cache.h>
 
 struct file;
 struct dentry;
···
 struct kernfs_fs_context;
 struct kernfs_open_node;
 struct kernfs_iattrs;
+
+/*
+ * NR_KERNFS_LOCK_BITS determines size (NR_KERNFS_LOCKS) of hash
+ * table of locks.
+ * Having a small hash table would impact scalability, since
+ * more and more kernfs_node objects will end up using same lock
+ * and having a very large hash table would waste memory.
+ *
+ * At the moment size of hash table of locks is being set based on
+ * the number of CPUs as follows:
+ *
+ * NR_CPU      NR_KERNFS_LOCK_BITS      NR_KERNFS_LOCKS
+ *   1                  1                       2
+ *  2-3                 2                       4
+ *  4-7                 4                       16
+ *  8-15                6                       64
+ *  16-31               8                       256
+ *  32 and more         10                      1024
+ *
+ * The above relation between NR_CPU and number of locks is based
+ * on some internal experimentation which involved booting qemu
+ * with different values of smp, performing some sysfs operations
+ * on all CPUs and observing how increase in number of locks impacts
+ * completion time of these sysfs operations on each CPU.
+ */
+#ifdef CONFIG_SMP
+#define NR_KERNFS_LOCK_BITS (2 * (ilog2(NR_CPUS < 32 ? NR_CPUS : 32)))
+#else
+#define NR_KERNFS_LOCK_BITS     1
+#endif
+
+#define NR_KERNFS_LOCKS     (1 << NR_KERNFS_LOCK_BITS)
+
+/*
+ * There's one kernfs_open_file for each open file and one kernfs_open_node
+ * for each kernfs_node with one or more open files.
+ *
+ * filp->private_data points to seq_file whose ->private points to
+ * kernfs_open_file.
+ *
+ * kernfs_open_files are chained at kernfs_open_node->files, which is
+ * protected by kernfs_global_locks.open_file_mutex[i].
+ *
+ * To reduce possible contention in sysfs access, arising due to single
+ * locks, use an array of locks (e.g. open_file_mutex) and use kernfs_node
+ * object address as hash keys to get the index of these locks.
+ *
+ * Hashed mutexes are safe to use here because operations using these don't
+ * rely on global exclusion.
+ *
+ * In future we intend to replace other global locks with hashed ones as well.
+ * kernfs_global_locks acts as a holder for all such hash tables.
+ */
+struct kernfs_global_locks {
+	struct mutex open_file_mutex[NR_KERNFS_LOCKS];
+};
 
 enum kernfs_node_type {
 	KERNFS_DIR		= 0x0001,
···
 
 struct kernfs_elem_attr {
 	const struct kernfs_ops	*ops;
-	struct kernfs_open_node	*open;
+	struct kernfs_open_node __rcu	*open;
 	loff_t			size;
 	struct kernfs_node	*notify_next;	/* for kernfs_notify() */
 };
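The relation between CPU count and lock-table size in the kernfs.h comment above, and the idea of using a node's address as the hash key, can be sketched in userspace as follows. This is a rough illustration under stated assumptions: `ilog2_floor`, `kernfs_lock_bits`, `kernfs_nr_locks` and `lock_index` are hypothetical names, only the `CONFIG_SMP` branch of the macro is mirrored, and the multiplicative pointer hash is a stand-in chosen for this sketch because the helper that maps a node to a lock is not shown in this part of the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of log2(n) for n >= 1; userspace stand-in for the kernel's ilog2(). */
unsigned int ilog2_floor(unsigned int n)
{
	unsigned int bits = 0;

	assert(n >= 1);
	while (n >>= 1)
		bits++;
	return bits;
}

/* Mirrors the CONFIG_SMP definition of NR_KERNFS_LOCK_BITS at run time. */
unsigned int kernfs_lock_bits(unsigned int nr_cpus)
{
	return 2 * ilog2_floor(nr_cpus < 32 ? nr_cpus : 32);
}

/* Mirrors NR_KERNFS_LOCKS: the table holds 2^bits locks. */
unsigned int kernfs_nr_locks(unsigned int nr_cpus)
{
	return 1u << kernfs_lock_bits(nr_cpus);
}

/*
 * Assumed hash for illustration only: multiply the address by a 64-bit
 * golden-ratio constant and keep the top 'bits' bits as the lock index.
 */
unsigned int lock_index(const void *node, unsigned int bits)
{
	uint64_t key = (uint64_t)(uintptr_t)node;

	assert(bits < 64);
	if (bits == 0)
		return 0;
	return (unsigned int)((key * 0x9E3779B97F4A7C15ull) >> (64 - bits));
}
```

Two nodes that hash to the same index simply share a mutex. As the comment says, that is acceptable because the operations guarded by these mutexes do not rely on global exclusion; a larger table only lowers the chance of unrelated nodes contending for the same lock.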
+1 -1
lib/Kconfig.debug
···
 	help
 	  kobjects are reference counted objects.  This means that their
 	  last reference count put is not predictable, and the kobject can
-	  live on past the point at which a driver decides to drop it's
+	  live on past the point at which a driver decides to drop its
 	  initial reference to the kobject gained on allocation.  An
 	  example of this would be a struct device which has just been
 	  unregistered.
+6
net/ipv4/ipconfig.c
···
 static int __init wait_for_devices(void)
 {
 	int i;
+	bool try_init_devs = true;
 
 	for (i = 0; i < DEVICE_WAIT_MAX; i++) {
 		struct net_device *dev;
···
 		rtnl_unlock();
 		if (found)
 			return 0;
+		if (try_init_devs &&
+		    (ROOT_DEV == Root_NFS || ROOT_DEV == Root_CIFS)) {
+			try_init_devs = false;
+			wait_for_init_devices_probe();
+		}
 		ssleep(1);
 	}
 	return -ENODEV;