Merge tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
"By the number of commits, cpufreq is the leading party (again) and the
most visible change there is the removal of the omap-cpufreq driver
that has not been used for a long time (good riddance). There are also
quite a few changes in the cppc_cpufreq driver, mostly related to
fixing its frequency invariance engine in the case when the CPPC
registers used by it are not in PCC. In addition to that, support for
AM62L3 is added to the ti-cpufreq driver and the cpufreq-dt-platdev
list is updated for some platforms. The remaining cpufreq changes are
assorted fixes and cleanups.

Next up is cpuidle and the changes there are dominated by intel_idle
driver updates, mostly related to the new command line facility
allowing users to adjust the list of C-states used by the driver.
There are also a few updates of cpuidle governors, including two menu
governor fixes and some refinements of the teo governor, and a
MAINTAINERS update adding Christian Loehle as a cpuidle reviewer.
[Thanks for stepping up Christian!]

The most significant update related to system suspend and hibernation
is the one to stop freezing the PM runtime workqueue during system PM
transitions which allows some deadlocks to be avoided. There is also a
fix for possible concurrent bit field updates in the core device
suspend code and a few other minor fixes.

Apart from the above, several drivers are updated to discard the
return value of pm_runtime_put() which is going to be converted to a
void function as soon as everybody stops using its return value, PL4
support for Ice Lake is added to the Intel RAPL power capping driver,
and there are assorted cleanups, documentation fixes, and some
cpupower utility improvements.

Specifics:

- Remove the unused omap-cpufreq driver (Andreas Kemnade)

- Optimize error handling code in cpufreq_boost_trigger_state() and
make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
supports boost (Lifeng Zheng)

- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
Dhruva Gole, and Konrad Dybcio)

- Minor improvements to the cpufreq and cpumask rust implementation
(Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)

- Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)

- Update arch_freq_scale in the CPPC cpufreq driver's frequency
invariance engine (FIE) in scheduler ticks if the related CPPC
registers are not in PCC (Jie Zhan)

- Assorted minor cleanups and improvements in ARM cpufreq drivers
(Juan Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)

- Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
Gupta)

- Make the scaling_setspeed cpufreq sysfs attribute return the actual
requested frequency to avoid confusion (Pengjie Zhang)

- Simplify the idle CPU time granularity test in the ondemand cpufreq
governor (Frederic Weisbecker)

- Enable asym capacity in intel_pstate only when CPU SMT is not
possible (Yaxiong Tian)

- Update the description of rate_limit_us default value in cpufreq
documentation (Yaxiong Tian)

- Add a command line option to adjust the C-states table in the
intel_idle driver, remove the 'preferred_cstates' module parameter
from it, add C-states validation to it and clean it up (Artem
Bityutskiy)

- Make the menu cpuidle governor always check the time till the
closest timer event when the scheduler tick has been stopped to
prevent it from mistakenly selecting the deepest available idle
state (Rafael Wysocki)

- Update the teo cpuidle governor to avoid making suboptimal
decisions in certain corner cases and generally improve idle state
selection accuracy (Rafael Wysocki)

- Remove an unlikely() annotation on the early-return condition in
menu_select() that leads to branch misprediction 100% of the time
on systems with only 1 idle state enabled, like ARM64 servers
(Breno Leitao)

- Add Christian Loehle to MAINTAINERS as a cpuidle reviewer
(Christian Loehle)

- Stop flagging the PM runtime workqueue as freezable to avoid system
suspend and resume deadlocks in subsystems that assume asynchronous
runtime PM to work during system-wide PM transitions (Rafael
Wysocki)

- Drop redundant NULL pointer checks before acomp_request_free() from
the hibernation code handling image saving (Rafael Wysocki)

- Update wakeup_sources_walk_start() to handle empty lists of wakeup
sources as appropriate (Samuel Wu)

- Make dev_pm_clear_wake_irq() check the power.wakeirq value under
power.lock to avoid race conditions (Gui-Dong Han)

- Avoid bit field races related to power.work_in_progress in the core
device suspend code (Xuewen Yan)

- Make several drivers discard pm_runtime_put() return value in
preparation for converting that function to a void one (Rafael
Wysocki)

- Add PL4 support for Ice Lake to the Intel RAPL power capping driver
(Daniel Tang)

- Replace sprintf() with sysfs_emit() in power capping sysfs show
functions (Sumeet Pawnikar)

- Make dev_pm_opp_get_level() return value match the documentation
after a previous update of the latter (Aleks Todorov)

- Use scoped for each OF child loop in the OPP code (Krzysztof
Kozlowski)

- Fix a bug in an example code snippet and correct typos in the
energy model management documentation (Patrick Little)

- Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
* idle_monitor: Fix incorrect value logged after stop
* Fix inverted APERF capability check
* Use strcspn() to strip trailing newline
* Reset errno before strtoull()
* Show C0 in idle-info dump

- Improve cpupower installation procedure by making the systemd step
optional and allowing users to disable the installation of
systemd's unit file (João Marcos Costa)"
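
Two of the cpupower fixes listed above (stripping a trailing newline with strcspn() and resetting errno before strtoull()) are standard C idioms. The following is a minimal userspace sketch of both, not the cpupower code itself; the helper names strip_newline() and parse_u64() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Cut the string at the first newline; a no-op if there is none. */
static void strip_newline(char *s)
{
	s[strcspn(s, "\n")] = '\0';
}

/* Parse a decimal unsigned value; returns 0 or a negative errno code. */
static int parse_u64(const char *s, unsigned long long *out)
{
	char *end;

	errno = 0;	/* a stale ERANGE from an earlier call must not leak in */
	*out = strtoull(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s || *end != '\0')
		return -EINVAL;
	return 0;
}
```

Without the errno reset, a previous failed conversion could make a valid value look like an overflow, because strtoull() only sets errno on failure and never clears it.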

* tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
PM: sleep: core: Avoid bit field races related to work_in_progress
PM: sleep: wakeirq: harden dev_pm_clear_wake_irq() against races
cpufreq: Documentation: Update description of rate_limit_us default value
cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possible
PM: wakeup: Handle empty list in wakeup_sources_walk_start()
PM: EM: Documentation: Fix bug in example code snippet
Documentation: Fix typos in energy model documentation
cpuidle: governors: teo: Refine intercepts-based idle state lookup
cpuidle: governors: teo: Adjust the classification of wakeup events
cpufreq: ondemand: Simplify idle cputime granularity test
cpufreq: userspace: make scaling_setspeed return the actual requested frequency
PM: hibernate: Drop NULL pointer checks before acomp_request_free()
cpufreq: CPPC: Add generic helpers for sysfs show/store
cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
cpufreq: ti-cpufreq: add support for AM62L3 SoC
cpufreq: dt-platdev: Add ti,am62l3 to blocklist
cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
cpufreq: scmi: correct SCMI explanation
cpufreq: dt-platdev: Block the driver from probing on more QC platforms
rust: cpumask: rename methods of Cpumask for clarity and consistency
...

Linus Torvalds 4 months ago 9b1b3dcd d84e1733

+637 -550

66 changed files

+1 -1

Documentation/admin-guide/pm/cpufreq.rst

···
 ``rate_limit_us``
 	Minimum time (in microseconds) that has to pass between two consecutive
 	runs of governor computations (default: 1.5 times the scaling driver's
-	transition latency or the maximum 2ms).
+	transition latency or 1ms if the driver does not provide a latency value).
 
 	The purpose of this tunable is to reduce the scheduler context overhead
 	of the governor which might be excessive without it.
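
The default described in the updated text can be written out as a small calculation. This is a sketch of the documented rule only, not the kernel's implementation; default_rate_limit_us() and the two constants are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC		1000u
#define FALLBACK_RATE_LIMIT_US	1000u	/* 1 ms, as stated in the documentation */

/* Hypothetical helper: the rate_limit_us default as the documentation words it. */
static uint32_t default_rate_limit_us(uint32_t transition_latency_ns)
{
	uint32_t latency_us = transition_latency_ns / NSEC_PER_USEC;

	if (latency_us == 0)	/* driver provided no usable latency value */
		return FALLBACK_RATE_LIMIT_US;

	return latency_us + (latency_us >> 1);	/* 1.5 times the latency */
}
```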

Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.yaml

···
       - description: v2 of CPUFREQ HW (EPSS)
         items:
           - enum:
+              - qcom,milos-cpufreq-epss
               - qcom,qcs8300-cpufreq-epss
               - qcom,qdu1000-cpufreq-epss
               - qcom,sa8255p-cpufreq-epss
···
         compatible:
           contains:
             enum:
+              - qcom,milos-cpufreq-epss
               - qcom,qcs8300-cpufreq-epss
               - qcom,sc7280-cpufreq-epss
               - qcom,sm8250-cpufreq-epss

+9 -9

Documentation/power/energy-model.rst

···
 The source of the information about the power consumed by devices can vary greatly
 from one platform to another. These power costs can be estimated using
 devicetree data in some cases. In others, the firmware will know better.
-Alternatively, userspace might be best positioned. And so on. In order to avoid
-each and every client subsystem to re-implement support for each and every
+Alternatively, userspace might be best positioned. In order to avoid
+having each and every client subsystem re-implement support for each and every
 possible source of information on its own, the EM framework intervenes as an
 abstraction layer which standardizes the format of power cost tables in the
 kernel, hence enabling to avoid redundant work.
···
 Documentation/driver-api/thermal/power_allocator.rst.
 Kernel subsystems might implement automatic detection to check whether EM
 registered devices have inconsistent scale (based on EM internal flag).
-Important thing to keep in mind is that when the power values are expressed in
+An important thing to keep in mind is that when the power values are expressed in
 an 'abstract scale' deriving real energy in micro-Joules would not be possible.
 
 The figure below depicts an example of drivers (Arm-specific here, but the
···
 should call EM API to free it safely when it's no longer needed. The EM
 framework will handle the clean-up when it's possible.
 
-The kernel code which want to modify the EM values is protected from concurrent
+The kernel code which wants to modify the EM values is protected from concurrent
 access using a mutex. Therefore, the device driver code must run in sleeping
 context when it tries to modify the EM.
 
···
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 The 'advanced' EM gets its name due to the fact that the driver is allowed
-to provide more precised power model. It's not limited to some implemented math
+to provide a more precise power model. It's not limited to some implemented math
 formula in the framework (like it is in 'simple' EM case). It can better reflect
 the real power measurements performed for each performance state. Thus, this
 registration method should be preferred in case considering EM static power
···
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 The 'simple' EM is registered using the framework helper function
-cpufreq_register_em_with_opp(). It implements a power model which is tight to
+cpufreq_register_em_with_opp(). It implements a power model which is tied to a
 math formula::
 
 	Power = C * V^2 * f
···
 states in ascending order.
 This function must be called in the RCU read lock section (after the
 rcu_read_lock()). When the EM table is not needed anymore there is a need to
-call rcu_real_unlock(). In this way the EM safely uses the RCU read section
+call rcu_read_unlock(). In this way the EM safely uses the RCU read section
 and protects the users. It also allows the EM framework to manage the memory
 and free it. More details how to use it can be found in Section 3.2 in the
 example driver.
···
   05
   06		/* Use the 'foo' protocol to ceil the frequency */
   07		freq = foo_get_freq_ceil(dev, *KHz);
-  08		if (freq < 0);
+  08		if (freq < 0)
   09			return freq;
   10
   11		/* Estimate the power cost for the dev at the relevant freq. */
   12		power = foo_estimate_power(dev, freq);
-  13		if (power < 0);
+  13		if (power < 0)
   14			return power;
   15
   16		/* Return the values to the EM framework */
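
The last hunk above removes a stray semicolon after two if conditions in the example driver. A minimal sketch of why that matters: in C, "if (freq < 0);" has an empty body, so the return that follows runs on every call. The function names below are hypothetical, and freq * 2 merely stands in for a real power estimate.

```c
#include <assert.h>

/* Equivalent of the old snippet: the condition guards nothing. */
static long est_power_buggy(long freq)
{
	if (freq < 0) {
		/* empty body: this is all that "if (freq < 0);" controls */
	}
	return freq;	/* runs unconditionally, so no estimate is ever made */
}

/* Corrected form: only error codes are returned early. */
static long est_power_fixed(long freq)
{
	if (freq < 0)
		return freq;	/* propagate the error code */
	return freq * 2;	/* stand-in for a real power estimate */
}
```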

+3 -4

Documentation/power/runtime_pm.rst

···
   * During system suspend pm_runtime_get_noresume() is called for every device
     right before executing the subsystem-level .prepare() callback for it and
     pm_runtime_barrier() is called for every device right before executing the
-    subsystem-level .suspend() callback for it. In addition to that the PM core
-    calls __pm_runtime_disable() with 'false' as the second argument for every
-    device right before executing the subsystem-level .suspend_late() callback
-    for it.
+    subsystem-level .suspend() callback for it. In addition to that, the PM
+    core disables runtime PM for every device right before executing the
+    subsystem-level .suspend_late() callback for it.
 
   * During system resume pm_runtime_enable() and pm_runtime_put() are called for
     every device right after executing the subsystem-level .resume_early()

+4 -4

Documentation/scheduler/sched-energy.rst

···
 
 
 From these calculations, the Case 1 has the lowest total energy. So CPU 1
-is be the best candidate from an energy-efficiency standpoint.
+is the best candidate from an energy-efficiency standpoint.
 
 Big CPUs are generally more power hungry than the little ones and are thus used
 mainly when a task doesn't fit the littles. However, little CPUs aren't always
···
 of the little CPUs can be less energy-efficient than the lowest OPPs of the
 bigs, for example. So, if the little CPUs happen to have enough utilization at
 a specific point in time, a small task waking up at that moment could be better
-of executing on the big side in order to save energy, even though it would fit
+off executing on the big side in order to save energy, even though it would fit
 on the little side.
 
 And even in the case where all OPPs of the big CPUs are less energy-efficient
···
 throughput. In order to avoid hurting performance with EAS, CPUs are flagged as
 'over-utilized' as soon as they are used at more than 80% of their compute
 capacity. As long as no CPUs are over-utilized in a root domain, load balancing
-is disabled and EAS overridess the wake-up balancing code. EAS is likely to load
+is disabled and EAS overrides the wake-up balancing code. EAS is likely to load
 the most energy efficient CPUs of the system more than the others if that can be
 done without harming throughput. So, the load-balancer is disabled to prevent
 it from breaking the energy-efficient task placement found by EAS. It is safe to
···
 6.5 Scale-invariant utilization signals
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-In order to make accurate prediction across CPUs and for all performance
+In order to make accurate predictions across CPUs and for all performance
 states, EAS needs frequency-invariant and CPU-invariant PELT signals. These can
 be obtained using the architecture-defined arch_scale{cpu,freq}_capacity()
 callbacks.

+1 -1

MAINTAINERS

···
 CPU IDLE TIME MANAGEMENT FRAMEWORK
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
 M:	Daniel Lezcano <daniel.lezcano@linaro.org>
+R:	Christian Loehle <christian.loehle@arm.com>
 L:	linux-pm@vger.kernel.org
 S:	Maintained
 B:	https://bugzilla.kernel.org
···
 L:	linux-omap@vger.kernel.org
 S:	Maintained
 F:	arch/arm/*omap*/*pm*
-F:	drivers/cpufreq/omap-cpufreq.c
 
 OMAP POWERDOMAIN SOC ADAPTATION LAYER SUPPORT
 M:	Paul Walmsley <paul@pwsan.com>

+27 -21

drivers/acpi/cppc_acpi.c

···
 EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
 
 /**
+ * cppc_perf_ctrs_in_pcc_cpu - Check if any perf counters of a CPU are in PCC.
+ * @cpu: CPU on which to check perf counters.
+ *
+ * Return: true if any of the counters are in PCC regions, false otherwise
+ */
+bool cppc_perf_ctrs_in_pcc_cpu(unsigned int cpu)
+{
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+	struct cpc_register_resource *ref_perf_reg;
+
+	/*
+	 * If reference perf register is not supported then we should use the
+	 * nominal perf value
+	 */
+	ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
+	if (!CPC_SUPPORTED(ref_perf_reg))
+		ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
+
+	return CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
+		CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
+		CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]) ||
+		CPC_IN_PCC(ref_perf_reg);
+}
+EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc_cpu);
+
+/**
  * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
  *
  * CPPC has flexibility about how CPU performance counters are accessed.
···
 	int cpu;
 
 	for_each_online_cpu(cpu) {
-		struct cpc_register_resource *ref_perf_reg;
-		struct cpc_desc *cpc_desc;
-
-		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
-
-		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
-		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
-		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
-			return true;
-
-
-		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
-
-		/*
-		 * If reference perf register is not supported then we should
-		 * use the nominal perf value
-		 */
-		if (!CPC_SUPPORTED(ref_perf_reg))
-			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
-
-		if (CPC_IN_PCC(ref_perf_reg))
+		if (cppc_perf_ctrs_in_pcc_cpu(cpu))
 			return true;
 	}
 
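
The refactoring above splits one loop into a per-CPU predicate plus an "any CPU" wrapper that calls it. A simplified userspace model of that shape follows. The types and function names (struct reg, struct cpu_regs, ctrs_in_pcc_cpu(), ctrs_in_pcc_any()) are hypothetical stand-ins, and plain arrays replace the kernel's per-CPU data and for_each_online_cpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { DELIVERED_CTR, REFERENCE_CTR, CTR_WRAP_TIME,
       REFERENCE_PERF, NOMINAL_PERF, NR_REGS };

struct reg { bool supported; bool in_pcc; };
struct cpu_regs { struct reg regs[NR_REGS]; };

/* Per-CPU check: is any counter register of this CPU in PCC? */
static bool ctrs_in_pcc_cpu(const struct cpu_regs *c)
{
	const struct reg *ref = &c->regs[REFERENCE_PERF];

	/* Fall back to nominal perf if reference perf is unsupported. */
	if (!ref->supported)
		ref = &c->regs[NOMINAL_PERF];

	return c->regs[DELIVERED_CTR].in_pcc ||
	       c->regs[REFERENCE_CTR].in_pcc ||
	       c->regs[CTR_WRAP_TIME].in_pcc ||
	       ref->in_pcc;
}

/* System-wide check, now a thin loop over the per-CPU predicate. */
static bool ctrs_in_pcc_any(const struct cpu_regs *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (ctrs_in_pcc_cpu(&cpus[i]))
			return true;
	return false;
}
```

Exposing the per-CPU form lets a caller make the decision for one CPU at a time instead of for the whole system.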

+4 -3

drivers/base/power/main.c

···
 		goto Complete;
 
 	/*
-	 * Disable runtime PM for the device without checking if there is a
-	 * pending resume request for it.
+	 * After this point, any runtime PM operations targeting the device
+	 * will fail until the corresponding pm_runtime_enable() call in
+	 * device_resume_early().
 	 */
-	__pm_runtime_disable(dev, false);
+	pm_runtime_disable(dev);
 
 	if (dev->power.syscore)
 		goto Skip;

+7 -4

drivers/base/power/wakeirq.c

···
  */
 void dev_pm_clear_wake_irq(struct device *dev)
 {
-	struct wake_irq *wirq = dev->power.wakeirq;
+	struct wake_irq *wirq;
 	unsigned long flags;
 
-	if (!wirq)
-		return;
-
 	spin_lock_irqsave(&dev->power.lock, flags);
+	wirq = dev->power.wakeirq;
+	if (!wirq) {
+		spin_unlock_irqrestore(&dev->power.lock, flags);
+		return;
+	}
+
 	device_wakeup_detach_irq(dev);
 	dev->power.wakeirq = NULL;
 	spin_unlock_irqrestore(&dev->power.lock, flags);
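
The change above moves the read of power.wakeirq inside the locked region, so the check and the clear act on the same value. A userspace sketch of that pattern follows, using a C11 atomic_flag as a stand-in spinlock. struct device_pm, pm_lock(), pm_unlock() and clear_wake_irq() are hypothetical names, and the sketch returns the detached pointer only so the behavior can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct wake_irq { int irq; };

struct device_pm {
	atomic_flag lock;		/* stand-in for the kernel spinlock */
	struct wake_irq *wakeirq;
};

static void pm_lock(struct device_pm *pm)
{
	while (atomic_flag_test_and_set_explicit(&pm->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void pm_unlock(struct device_pm *pm)
{
	atomic_flag_clear_explicit(&pm->lock, memory_order_release);
}

/* Detach the wake IRQ; returns what was detached, or NULL if none was set. */
static struct wake_irq *clear_wake_irq(struct device_pm *pm)
{
	struct wake_irq *wirq;

	pm_lock(pm);
	wirq = pm->wakeirq;	/* read only while holding the lock */
	if (wirq)
		pm->wakeirq = NULL;
	pm_unlock(pm);

	return wirq;
}
```

Reading the pointer before taking the lock, as the old code did, leaves a window in which another thread can change it between the check and the clear.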

+1 -3

drivers/base/power/wakeup.c

···
  */
 struct wakeup_source *wakeup_sources_walk_start(void)
 {
-	struct list_head *ws_head = &wakeup_sources;
-
-	return list_entry_rcu(ws_head->next, struct wakeup_source, entry);
+	return list_first_or_null_rcu(&wakeup_sources, struct wakeup_source, entry);
 }
 EXPORT_SYMBOL_GPL(wakeup_sources_walk_start);
 
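
The fix above matters because an empty circular list has head->next pointing back at the head itself, so converting that pointer to the containing entry yields a bogus object. A minimal non-RCU sketch of the "first or NULL" idea follows; struct source, entry_of() and first_or_null() are hypothetical, and the RCU read-side details of the kernel macro are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

struct source {
	int id;
	struct list_head entry;
};

/* Map a list node back to the struct source that embeds it. */
#define entry_of(ptr) \
	((struct source *)((char *)(ptr) - offsetof(struct source, entry)))

/* Return the first element, or NULL when the list is empty. */
static struct source *first_or_null(struct list_head *head)
{
	struct list_head *first = head->next;

	return first != head ? entry_of(first) : NULL;
}
```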

-5

drivers/cpufreq/Kconfig.arm

···
 	  The driver implements the cpufreq interface for this HW engine.
 	  Say Y if you want to support CPUFreq HW.
 
-config ARM_OMAP2PLUS_CPUFREQ
-	bool "TI OMAP2+"
-	depends on ARCH_OMAP2PLUS || COMPILE_TEST
-	default ARCH_OMAP2PLUS
-
 config ARM_QCOM_CPUFREQ_NVMEM
 	tristate "Qualcomm nvmem based CPUFreq"
 	depends on ARCH_QCOM || COMPILE_TEST

-1

drivers/cpufreq/Makefile

···
 obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ)	+= mediatek-cpufreq.o
 obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ_HW)	+= mediatek-cpufreq-hw.o
 obj-$(CONFIG_MACH_MVEBU_V7)	+= mvebu-cpufreq.o
-obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ)	+= omap-cpufreq.o
 obj-$(CONFIG_ARM_PXA2xx_CPUFREQ)	+= pxa2xx-cpufreq.o
 obj-$(CONFIG_PXA3xx)	+= pxa3xx-cpufreq.o
 obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW)	+= qcom-cpufreq-hw.o

+13

drivers/cpufreq/amd-pstate.c

···
 	WRITE_ONCE(cpudata->max_limit_freq, policy->max);
 
 	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+		/*
+		 * For performance policy, set MinPerf to nominal_perf rather than
+		 * highest_perf or lowest_nonlinear_perf.
+		 *
+		 * Per commit 0c411b39e4f4c, using highest_perf was observed
+		 * to cause frequency throttling on power-limited platforms, leading to
+		 * performance regressions. Using lowest_nonlinear_perf would limit
+		 * performance too much for HPC workloads requiring high frequency
+		 * operation and minimal wakeup latency from idle states.
+		 *
+		 * nominal_perf therefore provides a balance by avoiding throttling
+		 * while still maintaining enough performance for HPC workloads.
+		 */
 		perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
 		WRITE_ONCE(cpudata->min_limit_freq, min(cpudata->nominal_freq, cpudata->max_limit_freq));
 	} else {

+92 -78

drivers/cpufreq/cppc_cpufreq.c

···
 				 struct cppc_perf_fb_ctrs *fb_ctrs_t1);
 
 /**
- * cppc_scale_freq_workfn - CPPC arch_freq_scale updater for frequency invariance
- * @work: The work item.
+ * __cppc_scale_freq_tick - CPPC arch_freq_scale updater for frequency invariance
+ * @cppc_fi: per-cpu CPPC FIE data.
  *
- * The CPPC driver register itself with the topology core to provide its own
+ * The CPPC driver registers itself with the topology core to provide its own
  * implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which
  * gets called by the scheduler on every tick.
  *
  * Note that the arch specific counters have higher priority than CPPC counters,
  * if available, though the CPPC driver doesn't need to have any special
  * handling for that.
- *
- * On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we
- * reach here from hard-irq context), which then schedules a normal work item
- * and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable
- * based on the counter updates since the last tick.
  */
-static void cppc_scale_freq_workfn(struct kthread_work *work)
+static void __cppc_scale_freq_tick(struct cppc_freq_invariance *cppc_fi)
 {
-	struct cppc_freq_invariance *cppc_fi;
 	struct cppc_perf_fb_ctrs fb_ctrs = {0};
 	struct cppc_cpudata *cpu_data;
 	unsigned long local_freq_scale;
 	u64 perf;
 
-	cppc_fi = container_of(work, struct cppc_freq_invariance, work);
 	cpu_data = cppc_fi->cpu_data;
 
 	if (cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs)) {
···
 	per_cpu(arch_freq_scale, cppc_fi->cpu) = local_freq_scale;
 }
 
+static void cppc_scale_freq_tick(void)
+{
+	__cppc_scale_freq_tick(&per_cpu(cppc_freq_inv, smp_processor_id()));
+}
+
+static struct scale_freq_data cppc_sftd = {
+	.source = SCALE_FREQ_SOURCE_CPPC,
+	.set_freq_scale = cppc_scale_freq_tick,
+};
+
+static void cppc_scale_freq_workfn(struct kthread_work *work)
+{
+	struct cppc_freq_invariance *cppc_fi;
+
+	cppc_fi = container_of(work, struct cppc_freq_invariance, work);
+	__cppc_scale_freq_tick(cppc_fi);
+}
+
 static void cppc_irq_work(struct irq_work *irq_work)
 {
 	struct cppc_freq_invariance *cppc_fi;
···
 	kthread_queue_work(kworker_fie, &cppc_fi->work);
 }
 
-static void cppc_scale_freq_tick(void)
+/*
+ * Reading perf counters may sleep if the CPC regs are in PCC. Thus, we
+ * schedule an irq work in scale_freq_tick (since we reach here from hard-irq
+ * context), which then schedules a normal work item cppc_scale_freq_workfn()
+ * that updates the per_cpu arch_freq_scale variable based on the counter
+ * updates since the last tick.
+ */
+static void cppc_scale_freq_tick_pcc(void)
 {
 	struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, smp_processor_id());
 
···
 	irq_work_queue(&cppc_fi->irq_work);
 }
 
-static struct scale_freq_data cppc_sftd = {
+static struct scale_freq_data cppc_sftd_pcc = {
 	.source = SCALE_FREQ_SOURCE_CPPC,
-	.set_freq_scale = cppc_scale_freq_tick,
+	.set_freq_scale = cppc_scale_freq_tick_pcc,
 };
 
 static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 {
+	struct scale_freq_data *sftd = &cppc_sftd;
 	struct cppc_freq_invariance *cppc_fi;
 	int cpu, ret;
 
···
 		cppc_fi = &per_cpu(cppc_freq_inv, cpu);
 		cppc_fi->cpu = cpu;
 		cppc_fi->cpu_data = policy->driver_data;
-		kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
-		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
+		if (cppc_perf_ctrs_in_pcc_cpu(cpu)) {
+			kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
+			init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
+			sftd = &cppc_sftd_pcc;
+		}
 
 		ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs);
 
···
 	}
 
 	/* Register for freq-invariance */
-	topology_set_scale_freq_source(&cppc_sftd, policy->cpus);
+	topology_set_scale_freq_source(sftd, policy->cpus);
 }
 
 /*
···
 	topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, policy->related_cpus);
 
 	for_each_cpu(cpu, policy->related_cpus) {
+		if (!cppc_perf_ctrs_in_pcc_cpu(cpu))
+			continue;
 		cppc_fi = &per_cpu(cppc_freq_inv, cpu);
 		irq_work_sync(&cppc_fi->irq_work);
 		kthread_cancel_work_sync(&cppc_fi->work);
 	}
 }
 
-static void __init cppc_freq_invariance_init(void)
+static void cppc_fie_kworker_init(void)
 {
 	struct sched_attr attr = {
 		.size		= sizeof(struct sched_attr),
···
 	};
 	int ret;
 
-	if (fie_disabled != FIE_ENABLED && fie_disabled != FIE_DISABLED) {
-		fie_disabled = FIE_ENABLED;
-		if (cppc_perf_ctrs_in_pcc()) {
-			pr_info("FIE not enabled on systems with registers in PCC\n");
-			fie_disabled = FIE_DISABLED;
-		}
-	}
-
-	if (fie_disabled)
-		return;
-
 	kworker_fie = kthread_run_worker(0, "cppc_fie");
 	if (IS_ERR(kworker_fie)) {
 		pr_warn("%s: failed to create kworker_fie: %ld\n", __func__,
 			PTR_ERR(kworker_fie));
 		fie_disabled = FIE_DISABLED;
+		kworker_fie = NULL;
 		return;
 	}
 
···
 			ret);
 		kthread_destroy_worker(kworker_fie);
 		fie_disabled = FIE_DISABLED;
+		kworker_fie = NULL;
 	}
+}
+
+static void __init cppc_freq_invariance_init(void)
+{
+	bool perf_ctrs_in_pcc = cppc_perf_ctrs_in_pcc();
+
+	if (fie_disabled == FIE_UNSET) {
+		if (perf_ctrs_in_pcc) {
+			pr_info("FIE not enabled on systems with registers in PCC\n");
+			fie_disabled = FIE_DISABLED;
+		} else {
+			fie_disabled = FIE_ENABLED;
+		}
+	}
+
+	if (fie_disabled || !perf_ctrs_in_pcc)
+		return;
+
+	cppc_fie_kworker_init();
 }
 
 static void cppc_freq_invariance_exit(void)
 {
-	if (fie_disabled)
-		return;
-
-	kthread_destroy_worker(kworker_fie);
+	if (kworker_fie)
+		kthread_destroy_worker(kworker_fie);
 }
 
 #else
···
 	return count;
 }
 
-static ssize_t show_auto_act_window(struct cpufreq_policy *policy, char *buf)
+static ssize_t cppc_cpufreq_sysfs_show_u64(unsigned int cpu,
+					   int (*get_func)(int, u64 *),
+					   char *buf)
 {
 	u64 val;
-	int ret;
+	int ret = get_func((int)cpu, &val);
 
-	ret = cppc_get_auto_act_window(policy->cpu, &val);
-
-	/* show "<unsupported>" when this register is not supported by cpc */
 	if (ret == -EOPNOTSUPP)
 		return sysfs_emit(buf, "<unsupported>\n");
 
···
 	return sysfs_emit(buf, "%llu\n", val);
 }
 
-static ssize_t store_auto_act_window(struct cpufreq_policy *policy,
-				     const char *buf, size_t count)
-{
-	u64 usec;
-	int ret;
-
-	ret = kstrtou64(buf, 0, &usec);
-	if (ret)
-		return ret;
-
-	ret = cppc_set_auto_act_window(policy->cpu, usec);
-	if (ret)
-		return ret;
-
-	return count;
-}
-
-static ssize_t show_energy_performance_preference_val(struct cpufreq_policy *policy, char *buf)
-{
-	u64 val;
-	int ret;
-
-	ret = cppc_get_epp_perf(policy->cpu, &val);
-
-	/* show "<unsupported>" when this register is not supported by cpc */
-	if (ret == -EOPNOTSUPP)
-		return sysfs_emit(buf, "<unsupported>\n");
-
-	if (ret)
-		return ret;
-
-	return sysfs_emit(buf, "%llu\n", val);
-}
-
-static ssize_t store_energy_performance_preference_val(struct cpufreq_policy *policy,
-						       const char *buf, size_t count)
+static ssize_t cppc_cpufreq_sysfs_store_u64(unsigned int cpu,
+					    int (*set_func)(int, u64),
+					    const char *buf, size_t count)
 {
 	u64 val;
 	int ret;
···
 	if (ret)
 		return ret;
 
-	ret = cppc_set_epp(policy->cpu, val);
-	if (ret)
-		return ret;
+	ret = set_func((int)cpu, val);
 
-	return count;
+	return ret ? ret : count;
 }
+
+#define CPPC_CPUFREQ_ATTR_RW_U64(_name, _get_func, _set_func)		\
+static ssize_t show_##_name(struct cpufreq_policy *policy, char *buf)	\
+{									\
+	return cppc_cpufreq_sysfs_show_u64(policy->cpu, _get_func, buf);\
+}									\
+static ssize_t store_##_name(struct cpufreq_policy *policy,		\
+			     const char *buf, size_t count)		\
+{									\
+	return cppc_cpufreq_sysfs_store_u64(policy->cpu, _set_func,	\
+					    buf, count);		\
+}
+
+CPPC_CPUFREQ_ATTR_RW_U64(auto_act_window, cppc_get_auto_act_window,
+			 cppc_set_auto_act_window)
+
+CPPC_CPUFREQ_ATTR_RW_U64(energy_performance_preference_val,
+			 cppc_get_epp_perf, cppc_set_epp)
 
 cpufreq_freq_attr_ro(freqdomain_cpus);
 cpufreq_freq_attr_rw(auto_select);
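
The sysfs part of this diff replaces two near-identical show/store pairs with generic helpers that take a getter or setter as a function pointer, plus a macro that stamps out the thin wrappers. The userspace sketch below models that structure under stated assumptions: snprintf() stands in for sysfs_emit(), strtoull() for kstrtou64(), and every name in it (show_u64(), store_u64(), ATTR_RW_U64, the fake getters and setter) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static uint64_t fake_window;	/* pretend hardware register */

static int get_window(int cpu, uint64_t *val) { (void)cpu; *val = fake_window; return 0; }
static int set_window(int cpu, uint64_t val) { (void)cpu; fake_window = val; return 0; }
static int get_unsupported(int cpu, uint64_t *val) { (void)cpu; (void)val; return -EOPNOTSUPP; }

/* Generic "show": one copy of the error handling for every u64 attribute. */
static long show_u64(int cpu, int (*get)(int, uint64_t *), char *buf, size_t len)
{
	uint64_t val;
	int ret = get(cpu, &val);

	if (ret == -EOPNOTSUPP)
		return snprintf(buf, len, "<unsupported>\n");
	if (ret)
		return ret;
	return snprintf(buf, len, "%llu\n", (unsigned long long)val);
}

/* Generic "store": parse once, then hand the value to the setter. */
static long store_u64(int cpu, int (*set)(int, uint64_t), const char *buf, size_t count)
{
	char *end;
	unsigned long long val;
	int ret;

	errno = 0;
	val = strtoull(buf, &end, 0);
	if (errno || end == buf)
		return -EINVAL;

	ret = set(cpu, val);
	return ret ? ret : (long)count;
}

/* Generate the per-attribute wrappers from the two generic helpers. */
#define ATTR_RW_U64(name, get, set)					\
static long show_##name(int cpu, char *buf, size_t len)		\
{									\
	return show_u64(cpu, get, buf, len);				\
}									\
static long store_##name(int cpu, const char *buf, size_t count)	\
{									\
	return store_u64(cpu, set, buf, count);			\
}

ATTR_RW_U64(window, get_window, set_window)
```

Adding another attribute then costs one macro invocation rather than two hand-written functions.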

drivers/cpufreq/cpufreq-dt-platdev.c

···
 	{ .compatible = "nvidia,tegra30", },
 	{ .compatible = "nvidia,tegra114", },
 	{ .compatible = "nvidia,tegra124", },
+	{ .compatible = "nvidia,tegra186", },
+	{ .compatible = "nvidia,tegra194", },
 	{ .compatible = "nvidia,tegra210", },
 	{ .compatible = "nvidia,tegra234", },

···
 	{ .compatible = "qcom,sdm845", },
 	{ .compatible = "qcom,sdx75", },
 	{ .compatible = "qcom,sm6115", },
+	{ .compatible = "qcom,sm6125", },
+	{ .compatible = "qcom,sm6150", },
 	{ .compatible = "qcom,sm6350", },
 	{ .compatible = "qcom,sm6375", },
+	{ .compatible = "qcom,sm7125", },
 	{ .compatible = "qcom,sm7225", },
 	{ .compatible = "qcom,sm7325", },
 	{ .compatible = "qcom,sm8150", },
···
 	{ .compatible = "ti,am625", },
 	{ .compatible = "ti,am62a7", },
 	{ .compatible = "ti,am62d2", },
+	{ .compatible = "ti,am62l3", },
 	{ .compatible = "ti,am62p5", },

 	{ .compatible = "qcom,ipq5332", },

+6 -7

drivers/cpufreq/cpufreq.c

···
 {
 	struct cpufreq_policy *policy;
 	unsigned long flags;
-	int ret = 0;
+	int ret = -EOPNOTSUPP;

 	/*
 	 * Don't compare 'cpufreq_driver->boost_enabled' with 'state' here to
···
 			continue;

 		ret = policy_set_boost(policy, state);
-		if (ret)
-			goto err_reset_state;
+		if (unlikely(ret))
+			break;
 	}
+
 	cpus_read_unlock();

-	return 0;
-
-err_reset_state:
-	cpus_read_unlock();
+	if (likely(!ret))
+		return 0;

 	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver->boost_enabled = !state;
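One visible effect of initializing `ret` to `-EOPNOTSUPP` is that the function no longer reports success when the loop never calls `policy_set_boost()`. The sketch below models that control flow in userspace. `struct demo_policy` and `demo_trigger_boost()` are hypothetical, and two details are assumptions because they fall outside the hunks shown: the flag is taken to be set optimistically before the loop, and the error code is taken to be returned after the rollback.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a policy: whether it supports boost and what
 * its set-boost operation would return. */
struct demo_policy {
	bool boost_supported;
	int set_boost_result;
};

int demo_trigger_boost(const struct demo_policy *policies, size_t n,
		       bool *boost_enabled, bool state)
{
	int ret = -EOPNOTSUPP;	/* stays in place if no policy is touched */
	size_t i;

	assert(boost_enabled);
	assert(policies || n == 0);

	*boost_enabled = state;	/* assumed optimistic update */

	for (i = 0; i < n; i++) {
		if (!policies[i].boost_supported)
			continue;

		ret = policies[i].set_boost_result;
		if (ret)
			break;
	}

	if (!ret)
		return 0;

	*boost_enabled = !state;	/* roll back, as in the diff */
	return ret;
}
```

With the previous `int ret = 0`, the all-unsupported case would have fallen through to `return 0`.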

+1 -6

drivers/cpufreq/cpufreq_ondemand.c

···
 static int od_init(struct dbs_data *dbs_data)
 {
 	struct od_dbs_tuners *tuners;
-	u64 idle_time;
-	int cpu;

 	tuners = kzalloc(sizeof(*tuners), GFP_KERNEL);
 	if (!tuners)
 		return -ENOMEM;

-	cpu = get_cpu();
-	idle_time = get_cpu_idle_time_us(cpu, NULL);
-	put_cpu();
-	if (idle_time != -1ULL) {
+	if (tick_nohz_is_active()) {
 		/* Idle micro accounting is supported. Use finer thresholds */
 		dbs_data->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
 	} else {

+3 -1

drivers/cpufreq/cpufreq_userspace.c

···

 static ssize_t show_speed(struct cpufreq_policy *policy, char *buf)
 {
-	return sprintf(buf, "%u\n", policy->cur);
+	struct userspace_policy *userspace = policy->governor_data;
+
+	return sprintf(buf, "%u\n", userspace->setspeed);
 }

 static int cpufreq_userspace_policy_init(struct cpufreq_policy *policy)

+1 -1

drivers/cpufreq/intel_pstate.c

···
 	 * the capacity of SMT threads is not deterministic even approximately,
 	 * do not do that when SMT is in use.
 	 */
-	if (hwp_is_hybrid && !sched_smt_active() && arch_enable_hybrid_capacity_scale()) {
+	if (hwp_is_hybrid && !cpu_smt_possible() && arch_enable_hybrid_capacity_scale()) {
 		hybrid_refresh_cpu_capacity_scaling();
 		/*
 		 * Disabling ITMT causes sched domains to be rebuilt to disable asym

-195

drivers/cpufreq/omap-cpufreq.c

··· 1 - // SPDX-License-Identifier: GPL-2.0-only 2 - /* 3 - * CPU frequency scaling for OMAP using OPP information 4 - * 5 - * Copyright (C) 2005 Nokia Corporation 6 - * Written by Tony Lindgren <tony@atomide.com> 7 - * 8 - * Based on cpu-sa1110.c, Copyright (C) 2001 Russell King 9 - * 10 - * Copyright (C) 2007-2011 Texas Instruments, Inc. 11 - * - OMAP3/4 support by Rajendra Nayak, Santosh Shilimkar 12 - */ 13 - 14 - #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 15 - 16 - #include <linux/types.h> 17 - #include <linux/kernel.h> 18 - #include <linux/sched.h> 19 - #include <linux/cpufreq.h> 20 - #include <linux/delay.h> 21 - #include <linux/init.h> 22 - #include <linux/err.h> 23 - #include <linux/clk.h> 24 - #include <linux/io.h> 25 - #include <linux/pm_opp.h> 26 - #include <linux/cpu.h> 27 - #include <linux/module.h> 28 - #include <linux/platform_device.h> 29 - #include <linux/regulator/consumer.h> 30 - 31 - /* OPP tolerance in percentage */ 32 - #define OPP_TOLERANCE 4 33 - 34 - static struct cpufreq_frequency_table *freq_table; 35 - static atomic_t freq_table_users = ATOMIC_INIT(0); 36 - static struct device *mpu_dev; 37 - static struct regulator *mpu_reg; 38 - 39 - static int omap_target(struct cpufreq_policy *policy, unsigned int index) 40 - { 41 - int r, ret; 42 - struct dev_pm_opp *opp; 43 - unsigned long freq, volt = 0, volt_old = 0, tol = 0; 44 - unsigned int old_freq, new_freq; 45 - 46 - old_freq = policy->cur; 47 - new_freq = freq_table[index].frequency; 48 - 49 - freq = new_freq * 1000; 50 - ret = clk_round_rate(policy->clk, freq); 51 - if (ret < 0) { 52 - dev_warn(mpu_dev, 53 - "CPUfreq: Cannot find matching frequency for %lu\n", 54 - freq); 55 - return ret; 56 - } 57 - freq = ret; 58 - 59 - if (mpu_reg) { 60 - opp = dev_pm_opp_find_freq_ceil(mpu_dev, &freq); 61 - if (IS_ERR(opp)) { 62 - dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n", 63 - __func__, new_freq); 64 - return -EINVAL; 65 - } 66 - volt = dev_pm_opp_get_voltage(opp); 67 - 
dev_pm_opp_put(opp); 68 - tol = volt * OPP_TOLERANCE / 100; 69 - volt_old = regulator_get_voltage(mpu_reg); 70 - } 71 - 72 - dev_dbg(mpu_dev, "cpufreq-omap: %u MHz, %ld mV --> %u MHz, %ld mV\n", 73 - old_freq / 1000, volt_old ? volt_old / 1000 : -1, 74 - new_freq / 1000, volt ? volt / 1000 : -1); 75 - 76 - /* scaling up? scale voltage before frequency */ 77 - if (mpu_reg && (new_freq > old_freq)) { 78 - r = regulator_set_voltage(mpu_reg, volt - tol, volt + tol); 79 - if (r < 0) { 80 - dev_warn(mpu_dev, "%s: unable to scale voltage up.\n", 81 - __func__); 82 - return r; 83 - } 84 - } 85 - 86 - ret = clk_set_rate(policy->clk, new_freq * 1000); 87 - 88 - /* scaling down? scale voltage after frequency */ 89 - if (mpu_reg && (new_freq < old_freq)) { 90 - r = regulator_set_voltage(mpu_reg, volt - tol, volt + tol); 91 - if (r < 0) { 92 - dev_warn(mpu_dev, "%s: unable to scale voltage down.\n", 93 - __func__); 94 - clk_set_rate(policy->clk, old_freq * 1000); 95 - return r; 96 - } 97 - } 98 - 99 - return ret; 100 - } 101 - 102 - static inline void freq_table_free(void) 103 - { 104 - if (atomic_dec_and_test(&freq_table_users)) 105 - dev_pm_opp_free_cpufreq_table(mpu_dev, &freq_table); 106 - } 107 - 108 - static int omap_cpu_init(struct cpufreq_policy *policy) 109 - { 110 - int result; 111 - 112 - policy->clk = clk_get(NULL, "cpufreq_ck"); 113 - if (IS_ERR(policy->clk)) 114 - return PTR_ERR(policy->clk); 115 - 116 - if (!freq_table) { 117 - result = dev_pm_opp_init_cpufreq_table(mpu_dev, &freq_table); 118 - if (result) { 119 - dev_err(mpu_dev, 120 - "%s: cpu%d: failed creating freq table[%d]\n", 121 - __func__, policy->cpu, result); 122 - clk_put(policy->clk); 123 - return result; 124 - } 125 - } 126 - 127 - atomic_inc_return(&freq_table_users); 128 - 129 - /* FIXME: what's the actual transition time? 
*/ 130 - cpufreq_generic_init(policy, freq_table, 300 * 1000); 131 - 132 - return 0; 133 - } 134 - 135 - static void omap_cpu_exit(struct cpufreq_policy *policy) 136 - { 137 - freq_table_free(); 138 - clk_put(policy->clk); 139 - } 140 - 141 - static struct cpufreq_driver omap_driver = { 142 - .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK, 143 - .verify = cpufreq_generic_frequency_table_verify, 144 - .target_index = omap_target, 145 - .get = cpufreq_generic_get, 146 - .init = omap_cpu_init, 147 - .exit = omap_cpu_exit, 148 - .register_em = cpufreq_register_em_with_opp, 149 - .name = "omap", 150 - }; 151 - 152 - static int omap_cpufreq_probe(struct platform_device *pdev) 153 - { 154 - mpu_dev = get_cpu_device(0); 155 - if (!mpu_dev) { 156 - pr_warn("%s: unable to get the MPU device\n", __func__); 157 - return -EINVAL; 158 - } 159 - 160 - mpu_reg = regulator_get(mpu_dev, "vcc"); 161 - if (IS_ERR(mpu_reg)) { 162 - pr_warn("%s: unable to get MPU regulator\n", __func__); 163 - mpu_reg = NULL; 164 - } else { 165 - /* 166 - * Ensure physical regulator is present. 167 - * (e.g. could be dummy regulator.) 168 - */ 169 - if (regulator_get_voltage(mpu_reg) < 0) { 170 - pr_warn("%s: physical regulator not present for MPU\n", 171 - __func__); 172 - regulator_put(mpu_reg); 173 - mpu_reg = NULL; 174 - } 175 - } 176 - 177 - return cpufreq_register_driver(&omap_driver); 178 - } 179 - 180 - static void omap_cpufreq_remove(struct platform_device *pdev) 181 - { 182 - cpufreq_unregister_driver(&omap_driver); 183 - } 184 - 185 - static struct platform_driver omap_cpufreq_platdrv = { 186 - .driver = { 187 - .name = "omap-cpufreq", 188 - }, 189 - .probe = omap_cpufreq_probe, 190 - .remove = omap_cpufreq_remove, 191 - }; 192 - module_platform_driver(omap_cpufreq_platdrv); 193 - 194 - MODULE_DESCRIPTION("cpufreq driver for OMAP SoCs"); 195 - MODULE_LICENSE("GPL");

+2 -3

drivers/cpufreq/rcpufreq_dt.rs

···
 //! Rust based implementation of the cpufreq-dt driver.

 use kernel::{
-    c_str,
     clk::Clk,
     cpu, cpufreq,
     cpumask::CpumaskVar,
···

 #[vtable]
 impl cpufreq::Driver for CPUFreqDTDriver {
-    const NAME: &'static CStr = c_str!("cpufreq-dt");
+    const NAME: &'static CStr = c"cpufreq-dt";
     const FLAGS: u16 = cpufreq::flags::NEED_INITIAL_FREQ_CHECK | cpufreq::flags::IS_COOLING_DEV;
     const BOOST_ENABLED: bool = true;

···
     OF_TABLE,
     MODULE_OF_TABLE,
     <CPUFreqDTDriver as platform::Driver>::IdInfo,
-    [(of::DeviceId::new(c_str!("operating-points-v2")), ())]
+    [(of::DeviceId::new(c"operating-points-v2"), ())]
 );

 impl platform::Driver for CPUFreqDTDriver {

+2 -1

drivers/cpufreq/scmi-cpufreq.c

···
 // SPDX-License-Identifier: GPL-2.0
 /*
- * System Control and Power Interface (SCMI) based CPUFreq Interface driver
+ * System Control and Management Interface (SCMI) based CPUFreq Interface driver
  *
  * Copyright (C) 2018-2021 ARM Ltd.
  * Sudeep Holla <sudeep.holla@arm.com>
···
 		return -EINVAL;
 	}

+	of_node_put(domain_id.np);
 	return domain_id.args[0];
 }


+33 -1

drivers/cpufreq/ti-cpufreq.c

···
 #define AM62A7_SUPPORT_R_MPU_OPP		BIT(1)
 #define AM62A7_SUPPORT_V_MPU_OPP		BIT(2)

+#define AM62L3_EFUSE_E_MPU_OPP			5
+#define AM62L3_EFUSE_O_MPU_OPP			15
+
+#define AM62L3_SUPPORT_E_MPU_OPP		BIT(0)
+#define AM62L3_SUPPORT_O_MPU_OPP		BIT(1)
+
 #define AM62P5_EFUSE_O_MPU_OPP			15
 #define AM62P5_EFUSE_S_MPU_OPP			19
 #define AM62P5_EFUSE_T_MPU_OPP			20
···
 	return calculated_efuse;
 }

+static unsigned long am62l3_efuse_xlate(struct ti_cpufreq_data *opp_data,
+					unsigned long efuse)
+{
+	unsigned long calculated_efuse = AM62L3_SUPPORT_E_MPU_OPP;
+
+	switch (efuse) {
+	case AM62L3_EFUSE_O_MPU_OPP:
+		calculated_efuse |= AM62L3_SUPPORT_O_MPU_OPP;
+		fallthrough;
+	case AM62L3_EFUSE_E_MPU_OPP:
+		calculated_efuse |= AM62L3_SUPPORT_E_MPU_OPP;
+	}
+
+	return calculated_efuse;
+}
+
 static struct ti_cpufreq_soc_data am3x_soc_data = {
 	.efuse_xlate = amx3_efuse_xlate,
 	.efuse_fallback = AM33XX_800M_ARM_MPU_MAX_FREQ,
···
 static const struct soc_device_attribute k3_cpufreq_soc[] = {
 	{ .family = "AM62X", },
 	{ .family = "AM62AX", },
-	{ .family = "AM62PX", },
 	{ .family = "AM62DX", },
+	{ .family = "AM62LX", },
+	{ .family = "AM62PX", },
 	{ /* sentinel */ }
 };

···

 static struct ti_cpufreq_soc_data am62a7_soc_data = {
 	.efuse_xlate = am62a7_efuse_xlate,
+	.efuse_offset = 0x0,
+	.efuse_mask = 0x07c0,
+	.efuse_shift = 0x6,
+	.multi_regulator = false,
+};
+
+static struct ti_cpufreq_soc_data am62l3_soc_data = {
+	.efuse_xlate = am62l3_efuse_xlate,
 	.efuse_offset = 0x0,
 	.efuse_mask = 0x07c0,
 	.efuse_shift = 0x6,
···
 	{ .compatible = "ti,am625", .data = &am625_soc_data, },
 	{ .compatible = "ti,am62a7", .data = &am62a7_soc_data, },
 	{ .compatible = "ti,am62d2", .data = &am62a7_soc_data, },
+	{ .compatible = "ti,am62l3", .data = &am62l3_soc_data, },
 	{ .compatible = "ti,am62p5", .data = &am62p5_soc_data, },
 	/* legacy */
 	{ .compatible = "ti,omap3430", .data = &omap34xx_soc_data, },
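The new `am62l3_efuse_xlate()` turns an eFuse speed-grade code into a bitmask of supported OPP classes, using a fall-through `switch` so that the faster grade also includes the slower one. A minimal userspace sketch of the same mapping follows. The `DEMO_*` macros and `demo_am62l3_efuse_xlate()` are hypothetical stand-ins that reuse the constant values shown in the diff, and the unused `opp_data` argument is dropped.

```c
/* Hypothetical userspace equivalents of the kernel macros in the diff. */
#define DEMO_BIT(n)		(1UL << (n))

#define DEMO_EFUSE_E_MPU_OPP	5
#define DEMO_EFUSE_O_MPU_OPP	15

#define DEMO_SUPPORT_E_MPU_OPP	DEMO_BIT(0)
#define DEMO_SUPPORT_O_MPU_OPP	DEMO_BIT(1)

unsigned long demo_am62l3_efuse_xlate(unsigned long efuse)
{
	/* The E grade is always supported, whatever the eFuse value is. */
	unsigned long calculated_efuse = DEMO_SUPPORT_E_MPU_OPP;

	switch (efuse) {
	case DEMO_EFUSE_O_MPU_OPP:
		calculated_efuse |= DEMO_SUPPORT_O_MPU_OPP;
		/* fall through */
	case DEMO_EFUSE_E_MPU_OPP:
		calculated_efuse |= DEMO_SUPPORT_E_MPU_OPP;
		break;
	default:
		break;
	}

	return calculated_efuse;
}
```

Because the result starts out as the E mask, an unrecognized eFuse value maps to the E grade only, and the `case` for the E grade does not change the result.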

+12 -12

drivers/cpuidle/governors/menu.c

············

+79 -19

drivers/cpuidle/governors/teo.c

··· 48 48 * in accordance with what happened last time. 49 49 * 50 50 * The "hits" metric reflects the relative frequency of situations in which the 51 - * sleep length and the idle duration measured after CPU wakeup fall into the 52 - * same bin (that is, the CPU appears to wake up "on time" relative to the sleep 53 - * length). In turn, the "intercepts" metric reflects the relative frequency of 54 - * non-timer wakeup events for which the measured idle duration falls into a bin 55 - * that corresponds to an idle state shallower than the one whose bin is fallen 56 - * into by the sleep length (these events are also referred to as "intercepts" 51 + * sleep length and the idle duration measured after CPU wakeup are close enough 52 + * (that is, the CPU appears to wake up "on time" relative to the sleep length). 53 + * In turn, the "intercepts" metric reflects the relative frequency of non-timer 54 + * wakeup events for which the measured idle duration is significantly different 55 + * from the sleep length (these events are also referred to as "intercepts" 57 56 * below). 58 57 * 59 58 * The governor also counts "intercepts" with the measured idle duration below ··· 74 75 * than the candidate one (it represents the cases in which the CPU was 75 76 * likely woken up by a non-timer wakeup source). 76 77 * 78 + * Also find the idle state with the maximum intercepts metric (if there are 79 + * multiple states with the maximum intercepts metric, choose the one with 80 + * the highest index). 81 + * 77 82 * 2. If the second sum computed in step 1 is greater than a half of the sum of 78 83 * both metrics for the candidate state bin and all subsequent bins (if any), 79 84 * a shallower idle state is likely to be more suitable, so look for it. 80 85 * 81 86 * - Traverse the enabled idle states shallower than the candidate one in the 82 - * descending order. 87 + * descending order, starting at the state with the maximum intercepts 88 + * metric found in step 1. 
83 89 * 84 90 * - For each of them compute the sum of the "intercepts" metrics over all 85 91 * of the idle states between it and the candidate one (including the ··· 171 167 */ 172 168 static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) 173 169 { 170 + s64 lat_ns = drv->states[dev->last_state_idx].exit_latency_ns; 174 171 struct teo_cpu *cpu_data = this_cpu_ptr(&teo_cpus); 175 172 int i, idx_timer = 0, idx_duration = 0; 176 173 s64 target_residency_ns, measured_ns; ··· 187 182 */ 188 183 measured_ns = S64_MAX; 189 184 } else { 190 - s64 lat_ns = drv->states[dev->last_state_idx].exit_latency_ns; 191 - 192 185 measured_ns = dev->last_residency_ns; 193 186 /* 194 187 * The delay between the wakeup and the first instruction ··· 242 239 cpu_data->state_bins[drv->state_count-1].hits += PULSE; 243 240 return; 244 241 } 242 + /* 243 + * If intercepts within the tick period range are not frequent 244 + * enough, count this wakeup as a hit, since it is likely that 245 + * the tick has woken up the CPU because an expected intercept 246 + * was not there. Otherwise, one of the intercepts may have 247 + * been incidentally preceded by the tick wakeup. 248 + */ 249 + if (3 * cpu_data->tick_intercepts < 2 * total) { 250 + cpu_data->state_bins[idx_timer].hits += PULSE; 251 + return; 252 + } 245 253 } 246 254 247 255 /* 248 - * If the measured idle duration falls into the same bin as the sleep 249 - * length, this is a "hit", so update the "hits" metric for that bin. 256 + * If the measured idle duration (adjusted for the entered state exit 257 + * latency) falls into the same bin as the sleep length and the latter 258 + * is less than the "raw" measured idle duration (so the wakeup appears 259 + * to have occurred after the anticipated timer event), this is a "hit", 260 + * so update the "hits" metric for that bin. 261 + * 250 262 * Otherwise, update the "intercepts" metric for the bin fallen into by 251 263 * the measured idle duration. 
252 264 */ 253 - if (idx_timer == idx_duration) { 265 + if (idx_timer == idx_duration && 266 + cpu_data->sleep_length_ns - measured_ns < lat_ns / 2) { 254 267 cpu_data->state_bins[idx_timer].hits += PULSE; 255 268 } else { 256 269 cpu_data->state_bins[idx_duration].intercepts += PULSE; ··· 313 294 ktime_t delta_tick = TICK_NSEC / 2; 314 295 unsigned int idx_intercept_sum = 0; 315 296 unsigned int intercept_sum = 0; 297 + unsigned int intercept_max = 0; 316 298 unsigned int idx_hit_sum = 0; 317 299 unsigned int hit_sum = 0; 300 + int intercept_max_idx = -1; 318 301 int constraint_idx = 0; 319 302 int idx0 = 0, idx = -1; 320 303 s64 duration_ns; ··· 347 326 if (!dev->states_usage[0].disable) 348 327 idx = 0; 349 328 350 - /* Compute the sums of metrics for early wakeup pattern detection. */ 329 + /* 330 + * Compute the sums of metrics for early wakeup pattern detection and 331 + * look for the state bin with the maximum intercepts metric below the 332 + * deepest enabled one (if there are multiple states with the maximum 333 + * intercepts metric, choose the one with the highest index). 334 + */ 351 335 for (i = 1; i < drv->state_count; i++) { 352 336 struct teo_bin *prev_bin = &cpu_data->state_bins[i-1]; 337 + unsigned int prev_intercepts = prev_bin->intercepts; 353 338 struct cpuidle_state *s = &drv->states[i]; 354 339 355 340 /* 356 341 * Update the sums of idle state metrics for all of the states 357 342 * shallower than the current one. 358 343 */ 359 - intercept_sum += prev_bin->intercepts; 360 344 hit_sum += prev_bin->hits; 345 + intercept_sum += prev_intercepts; 346 + /* 347 + * Check if this is the bin with the maximum number of 348 + * intercepts so far and in that case update the index of 349 + * the state with the maximum intercepts metric. 
350 + */ 351 + if (prev_intercepts >= intercept_max) { 352 + intercept_max = prev_intercepts; 353 + intercept_max_idx = i - 1; 354 + } 361 355 362 356 if (dev->states_usage[i].disable) 363 357 continue; ··· 424 388 while (min_idx < idx && 425 389 drv->states[min_idx].target_residency_ns < TICK_NSEC) 426 390 min_idx++; 391 + 392 + /* 393 + * Avoid selecting a state with a lower index, but with 394 + * the same target residency as the current candidate 395 + * one. 396 + */ 397 + if (drv->states[min_idx].target_residency_ns == 398 + drv->states[idx].target_residency_ns) 399 + goto constraint; 427 400 } 428 401 429 402 /* 430 - * Look for the deepest idle state whose target residency had 431 - * not exceeded the idle duration in over a half of the relevant 432 - * cases in the past. 403 + * If the minimum state index is greater than or equal to the 404 + * index of the state with the maximum intercepts metric and 405 + * the corresponding state is enabled, there is no need to look 406 + * at the deeper states. 407 + */ 408 + if (min_idx >= intercept_max_idx && 409 + !dev->states_usage[min_idx].disable) { 410 + idx = min_idx; 411 + goto constraint; 412 + } 413 + 414 + /* 415 + * Look for the deepest enabled idle state, at most as deep as 416 + * the one with the maximum intercepts metric, whose target 417 + * residency had not been greater than the idle duration in over 418 + * a half of the relevant cases in the past. 433 419 * 434 420 * Take the possible duration limitation present if the tick 435 421 * has been stopped already into account. ··· 463 405 continue; 464 406 465 407 idx = i; 466 - if (2 * intercept_sum > idx_intercept_sum) 408 + if (2 * intercept_sum > idx_intercept_sum && 409 + i <= intercept_max_idx) 467 410 break; 468 411 } 469 412 } 470 413 414 + constraint: 471 415 /* 472 416 * If there is a latency constraint, it may be necessary to select an 473 417 * idle state shallower than the current candidate one. 
··· 524 464 * total wakeup events, do not stop the tick. 525 465 */ 526 466 if (drv->states[idx].target_residency_ns < TICK_NSEC && 527 - cpu_data->tick_intercepts > cpu_data->total / 2 + cpu_data->total / 8) 467 + 3 * cpu_data->tick_intercepts >= 2 * cpu_data->total) 528 468 duration_ns = TICK_NSEC / 2; 529 469 530 470 end:
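Two small pieces of the teo changes above can be checked in isolation. The sketch below, with hypothetical `demo_*` names, compares the removed tick-intercepts test (more than 5/8 of all wakeups, in integer arithmetic) with the added one (at least 2/3 of them), and shows how a `>=` comparison makes the search for the maximum intercepts bin prefer the highest index on ties. It is a simplified model, not the governor code.

```c
#include <stdbool.h>

/* Removed condition: tick intercepts exceed total/2 + total/8. */
bool demo_old_tick_rule(unsigned int tick_intercepts, unsigned int total)
{
	return tick_intercepts > total / 2 + total / 8;
}

/* Added condition: tick intercepts are at least 2/3 of the total. */
bool demo_new_tick_rule(unsigned int tick_intercepts, unsigned int total)
{
	return 3 * tick_intercepts >= 2 * total;
}

/*
 * Index of the bin with the largest intercepts value. Using ">=" means
 * a later bin with an equal value replaces the earlier one, so ties go
 * to the highest index. Returns -1 for an empty array.
 */
int demo_intercept_max_idx(const unsigned int *intercepts, int n)
{
	unsigned int intercept_max = 0;
	int intercept_max_idx = -1;
	int i;

	for (i = 0; i < n; i++) {
		if (intercepts[i] >= intercept_max) {
			intercept_max = intercepts[i];
			intercept_max_idx = i;
		}
	}

	return intercept_max_idx;
}
```

For example, with 800 total wakeups, 520 tick intercepts satisfy the old rule but not the new one, which needs at least 534.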

+1 -5

drivers/gpu/drm/arm/malidp_crtc.c

···
 									 crtc);
 	struct malidp_drm *malidp = crtc_to_malidp_device(crtc);
 	struct malidp_hw_device *hwdev = malidp->dev;
-	int err;

 	/* always disable planes on the CRTC that is being turned off */
 	drm_atomic_helper_disable_planes_on_crtc(old_state, false);
···

 	clk_disable_unprepare(hwdev->pxlclk);

-	err = pm_runtime_put(crtc->dev->dev);
-	if (err < 0) {
-		DRM_DEBUG_DRIVER("Failed to disable runtime power management: %d\n", err);
-	}
+	pm_runtime_put(crtc->dev->dev);
 }

 static const struct gamma_curve_segment {

+1 -3

drivers/gpu/drm/bridge/imx/imx8qm-ldb.c

···
 	clk_disable_unprepare(imx8qm_ldb->clk_bypass);
 	clk_disable_unprepare(imx8qm_ldb->clk_pixel);

-	ret = pm_runtime_put(dev);
-	if (ret < 0)
-		DRM_DEV_ERROR(dev, "failed to put runtime PM: %d\n", ret);
+	pm_runtime_put(dev);
 }

 static const u32 imx8qm_ldb_bus_output_fmts[] = {

+1 -3

drivers/gpu/drm/bridge/imx/imx8qxp-ldb.c

···
 	if (is_split && companion)
 		companion->funcs->atomic_disable(companion, state);

-	ret = pm_runtime_put(dev);
-	if (ret < 0)
-		DRM_DEV_ERROR(dev, "failed to put runtime PM: %d\n", ret);
+	pm_runtime_put(dev);
 }

 static const u32 imx8qxp_ldb_bus_output_fmts[] = {

+1 -4

drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c

···
 {
 	struct imx8qxp_pc_channel *ch = bridge->driver_private;
 	struct imx8qxp_pc *pc = ch->pc;
-	int ret;

-	ret = pm_runtime_put(pc->dev);
-	if (ret < 0)
-		DRM_DEV_ERROR(pc->dev, "failed to put runtime PM: %d\n", ret);
+	pm_runtime_put(pc->dev);
 }

 static const u32 imx8qxp_pc_bus_output_fmts[] = {

+1 -4

drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c

···
 				      struct drm_atomic_state *state)
 {
 	struct imx8qxp_pxl2dpi *p2d = bridge->driver_private;
-	int ret;

-	ret = pm_runtime_put(p2d->dev);
-	if (ret < 0)
-		DRM_DEV_ERROR(p2d->dev, "failed to put runtime PM: %d\n", ret);
+	pm_runtime_put(p2d->dev);

 	if (p2d->companion)
 		p2d->companion->funcs->atomic_disable(p2d->companion, state);

+2 -2

drivers/gpu/drm/imagination/pvr_power.h

···
 	return pm_runtime_resume_and_get(drm_dev->dev);
 }

-static __always_inline int
+static __always_inline void
 pvr_power_put(struct pvr_device *pvr_dev)
 {
 	struct drm_device *drm_dev = from_pvr_device(pvr_dev);

-	return pm_runtime_put(drm_dev->dev);
+	pm_runtime_put(drm_dev->dev);
 }

 int pvr_power_domains_init(struct pvr_device *pvr_dev);

+3 -9

drivers/gpu/drm/imx/dc/dc-crtc.c

···
 				drm_atomic_get_new_crtc_state(state, crtc);
 	struct dc_drm_device *dc_drm = to_dc_drm_device(crtc->dev);
 	struct dc_crtc *dc_crtc = to_dc_crtc(crtc);
-	int idx, ret;
+	int idx;

 	if (!drm_dev_enter(crtc->dev, &idx))
 		goto out;
···
 	dc_fg_disable_clock(dc_crtc->fg);

 	/* request pixel engine power-off as plane is off too */
-	ret = pm_runtime_put(dc_drm->pe->dev);
-	if (ret)
-		dc_crtc_err(crtc, "failed to put DC pixel engine RPM: %d\n",
-			    ret);
+	pm_runtime_put(dc_drm->pe->dev);

 	/* request display engine power-off when CRTC is disabled */
-	ret = pm_runtime_put(dc_crtc->de->dev);
-	if (ret < 0)
-		dc_crtc_err(crtc, "failed to put DC display engine RPM: %d\n",
-			    ret);
+	pm_runtime_put(dc_crtc->de->dev);

 	drm_dev_exit(idx);


+1 -4

drivers/gpu/drm/vc4/vc4_hdmi.c

···
 	struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
 	struct drm_device *drm = vc4_hdmi->connector.dev;
 	unsigned long flags;
-	int ret;
 	int idx;

 	mutex_lock(&vc4_hdmi->mutex);
···
 	clk_disable_unprepare(vc4_hdmi->pixel_bvb_clock);
 	clk_disable_unprepare(vc4_hdmi->pixel_clock);

-	ret = pm_runtime_put(&vc4_hdmi->pdev->dev);
-	if (ret < 0)
-		drm_err(drm, "Failed to release power domain: %d\n", ret);
+	pm_runtime_put(&vc4_hdmi->pdev->dev);

 	drm_dev_exit(idx);


+2 -10

drivers/gpu/drm/vc4/vc4_vec.c

···
 {
 	struct drm_device *drm = encoder->dev;
 	struct vc4_vec *vec = encoder_to_vc4_vec(encoder);
-	int idx, ret;
+	int idx;

 	if (!drm_dev_enter(drm, &idx))
 		return;
···

 	clk_disable_unprepare(vec->clock);

-	ret = pm_runtime_put(&vec->pdev->dev);
-	if (ret < 0) {
-		drm_err(drm, "Failed to release power domain: %d\n", ret);
-		goto err_dev_exit;
-	}
+	pm_runtime_put(&vec->pdev->dev);

-	drm_dev_exit(idx);
-	return;
-
-err_dev_exit:
 	drm_dev_exit(idx);
 }


+1 -3

drivers/hwspinlock/omap_hwspinlock.c

···
 	 * runtime PM will make sure the clock of this module is
 	 * enabled again iff at least one lock is requested
 	 */
-	ret = pm_runtime_put(&pdev->dev);
-	if (ret < 0)
-		return ret;
+	pm_runtime_put(&pdev->dev);

 	/* one of the four lsb's must be set, and nothing else */
 	if (hweight_long(i & 0xf) != 1 || i > 8)

+4 -8

drivers/hwtracing/coresight/coresight-cpu-debug.c

···
 	return ret;
 }

-static int debug_disable_func(void)
+static void debug_disable_func(void)
 {
 	struct debug_drvdata *drvdata;
-	int cpu, ret, err = 0;
+	int cpu;

 	/*
 	 * Disable debug power domains, records the error and keep
···
 		if (!drvdata)
 			continue;

-		ret = pm_runtime_put(drvdata->dev);
-		if (ret < 0)
-			err = ret;
+		pm_runtime_put(drvdata->dev);
 	}
-
-	return err;
 }

 static ssize_t debug_func_knob_write(struct file *f,
···
 	if (val)
 		ret = debug_enable_func();
 	else
-		ret = debug_disable_func();
+		debug_disable_func();

 	if (ret) {
 		pr_err("%s: unable to %s debug function: %d\n",

+225 -43

drivers/idle/intel_idle.c

··· 45 45 #include <linux/kernel.h> 46 46 #include <linux/cpuidle.h> 47 47 #include <linux/tick.h> 48 + #include <linux/time64.h> 48 49 #include <trace/events/power.h> 49 50 #include <linux/sched.h> 50 51 #include <linux/sched/smt.h> ··· 64 63 #include <asm/fpu/api.h> 65 64 #include <asm/smp.h> 66 65 67 - #define INTEL_IDLE_VERSION "0.5.1" 68 - 69 66 static struct cpuidle_driver intel_idle_driver = { 70 67 .name = "intel_idle", 71 68 .owner = THIS_MODULE, ··· 71 72 /* intel_idle.max_cstate=0 disables driver */ 72 73 static int max_cstate = CPUIDLE_STATE_MAX - 1; 73 74 static unsigned int disabled_states_mask __read_mostly; 74 - static unsigned int preferred_states_mask __read_mostly; 75 75 static bool force_irq_on __read_mostly; 76 76 static bool ibrs_off __read_mostly; 77 + 78 + /* The maximum allowed length for the 'table' module parameter */ 79 + #define MAX_CMDLINE_TABLE_LEN 256 80 + /* Maximum allowed C-state latency */ 81 + #define MAX_CMDLINE_LATENCY_US (5 * USEC_PER_MSEC) 82 + /* Maximum allowed C-state target residency */ 83 + #define MAX_CMDLINE_RESIDENCY_US (100 * USEC_PER_MSEC) 84 + 85 + static char cmdline_table_str[MAX_CMDLINE_TABLE_LEN] __read_mostly; 77 86 78 87 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices; 79 88 ··· 113 106 114 107 static const struct idle_cpu *icpu __initdata; 115 108 static struct cpuidle_state *cpuidle_state_table __initdata; 109 + 110 + /* C-states data from the 'intel_idle.table' cmdline parameter */ 111 + static struct cpuidle_state cmdline_states[CPUIDLE_STATE_MAX] __initdata; 116 112 117 113 static unsigned int mwait_substates __initdata; 118 114 ··· 2062 2052 } 2063 2053 2064 2054 /** 2065 - * adl_idle_state_table_update - Adjust AlderLake idle states table. 2066 - */ 2067 - static void __init adl_idle_state_table_update(void) 2068 - { 2069 - /* Check if user prefers C1 over C1E. 
*/
-	if (preferred_states_mask & BIT(1) && !(preferred_states_mask & BIT(2))) {
-		cpuidle_state_table[0].flags &= ~CPUIDLE_FLAG_UNUSABLE;
-		cpuidle_state_table[1].flags |= CPUIDLE_FLAG_UNUSABLE;
-
-		/* Disable C1E by clearing the "C1E promotion" bit. */
-		c1e_promotion = C1E_PROMOTION_DISABLE;
-		return;
-	}
-
-	/* Make sure C1E is enabled by default */
-	c1e_promotion = C1E_PROMOTION_ENABLE;
-}
-
-/**
  * spr_idle_state_table_update - Adjust Sapphire Rapids idle states table.
  */
 static void __init spr_idle_state_table_update(void)
···
 	case INTEL_SAPPHIRERAPIDS_X:
 	case INTEL_EMERALDRAPIDS_X:
 		spr_idle_state_table_update();
-		break;
-	case INTEL_ALDERLAKE:
-	case INTEL_ALDERLAKE_L:
-	case INTEL_ATOM_GRACEMONT:
-		adl_idle_state_table_update();
 		break;
 	case INTEL_ATOM_SILVERMONT:
 	case INTEL_ATOM_AIRMONT:
···
 	put_device(sysfs_root);
 }
 
+/**
+ * get_cmdline_field - Get the current field from a cmdline string.
+ * @args: The cmdline string to get the current field from.
+ * @field: Pointer to the current field upon return.
+ * @sep: The fields separator character.
+ *
+ * Examples:
+ *   Input: args="C1:1:1,C1E:2:10", sep=':'
+ *   Output: field="C1", return "1:1,C1E:2:10"
+ *   Input: args="C1:1:1,C1E:2:10", sep=','
+ *   Output: field="C1:1:1", return "C1E:2:10"
+ *   Ipnut: args="::", sep=':'
+ *   Output: field="", return ":"
+ *
+ * Return: The continuation of the cmdline string after the field or NULL.
+ */
+static char *get_cmdline_field(char *args, char **field, char sep)
+{
+	unsigned int i;
+
+	for (i = 0; args[i] && !isspace(args[i]); i++) {
+		if (args[i] == sep)
+			break;
+	}
+
+	*field = args;
+
+	if (args[i] != sep)
+		return NULL;
+
+	args[i] = '\0';
+	return args + i + 1;
+}
+
+/**
+ * validate_cmdline_cstate - Validate a C-state from cmdline.
+ * @state: The C-state to validate.
+ * @prev_state: The previous C-state in the table or NULL.
+ *
+ * Return: 0 if the C-state is valid or -EINVAL otherwise.
+ */
+static int validate_cmdline_cstate(struct cpuidle_state *state,
+				   struct cpuidle_state *prev_state)
+{
+	if (state->exit_latency == 0)
+		/* Exit latency 0 can only be used for the POLL state */
+		return -EINVAL;
+
+	if (state->exit_latency > MAX_CMDLINE_LATENCY_US)
+		return -EINVAL;
+
+	if (state->target_residency > MAX_CMDLINE_RESIDENCY_US)
+		return -EINVAL;
+
+	if (state->target_residency < state->exit_latency)
+		return -EINVAL;
+
+	if (!prev_state)
+		return 0;
+
+	if (state->exit_latency <= prev_state->exit_latency)
+		return -EINVAL;
+
+	if (state->target_residency <= prev_state->target_residency)
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
+ * cmdline_table_adjust - Adjust the C-states table with data from cmdline.
+ * @drv: cpuidle driver (assumed to point to intel_idle_driver).
+ *
+ * Adjust the C-states table with data from the 'intel_idle.table' module
+ * parameter (if specified).
+ */
+static void __init cmdline_table_adjust(struct cpuidle_driver *drv)
+{
+	char *args = cmdline_table_str;
+	struct cpuidle_state *state;
+	int i;
+
+	if (args[0] == '\0')
+		/* The 'intel_idle.table' module parameter was not specified */
+		return;
+
+	/* Create a copy of the C-states table */
+	for (i = 0; i < drv->state_count; i++)
+		cmdline_states[i] = drv->states[i];
+
+	/*
+	 * Adjust the C-states table copy with data from the 'intel_idle.table'
+	 * module parameter.
+	 */
+	while (args) {
+		char *fields, *name, *val;
+
+		/*
+		 * Get the next C-state definition, which is expected to be
+		 * '<name>:<latency_us>:<target_residency_us>'. Treat "empty"
+		 * fields as unchanged. For example,
+		 * '<name>::<target_residency_us>' leaves the latency unchanged.
+		 */
+		args = get_cmdline_field(args, &fields, ',');
+
+		/* name */
+		fields = get_cmdline_field(fields, &name, ':');
+		if (!fields)
+			goto error;
+
+		if (!strcmp(name, "POLL")) {
+			pr_err("Cannot adjust POLL\n");
+			continue;
+		}
+
+		/* Find the C-state by its name */
+		state = NULL;
+		for (i = 0; i < drv->state_count; i++) {
+			if (!strcmp(name, drv->states[i].name)) {
+				state = &cmdline_states[i];
+				break;
+			}
+		}
+
+		if (!state) {
+			pr_err("C-state '%s' was not found\n", name);
+			continue;
+		}
+
+		/* Latency */
+		fields = get_cmdline_field(fields, &val, ':');
+		if (!fields)
+			goto error;
+
+		if (*val) {
+			if (kstrtouint(val, 0, &state->exit_latency))
+				goto error;
+		}
+
+		/* Target residency */
+		fields = get_cmdline_field(fields, &val, ':');
+
+		if (*val) {
+			if (kstrtouint(val, 0, &state->target_residency))
+				goto error;
+		}
+
+		/*
+		 * Allow for 3 more fields, but ignore them. Helps to make
+		 * possible future extensions of the cmdline format backward
+		 * compatible.
+		 */
+		for (i = 0; fields && i < 3; i++) {
+			fields = get_cmdline_field(fields, &val, ':');
+			if (!fields)
+				break;
+		}
+
+		if (fields) {
+			pr_err("Too many fields for C-state '%s'\n", state->name);
+			goto error;
+		}
+
+		pr_info("C-state from cmdline: name=%s, latency=%u, residency=%u\n",
+			state->name, state->exit_latency, state->target_residency);
+	}
+
+	/* Validate the adjusted C-states, start with index 1 to skip POLL */
+	for (i = 1; i < drv->state_count; i++) {
+		struct cpuidle_state *prev_state;
+
+		state = &cmdline_states[i];
+		prev_state = &cmdline_states[i - 1];
+
+		if (validate_cmdline_cstate(state, prev_state)) {
+			pr_err("C-state '%s' validation failed\n", state->name);
+			goto error;
+		}
+	}
+
+	/* Copy the adjusted C-states table back */
+	for (i = 1; i < drv->state_count; i++)
+		drv->states[i] = cmdline_states[i];
+
+	pr_info("Adjusted C-states with data from 'intel_idle.table'\n");
+	return;
+
+error:
+	pr_info("Failed to adjust C-states with data from 'intel_idle.table'\n");
+}
+
 static int __init intel_idle_init(void)
 {
 	const struct x86_cpu_id *id;
···
 		return -ENODEV;
 	}
 
-	pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n",
-		 boot_cpu_data.x86_model);
-
 	intel_idle_cpuidle_devices = alloc_percpu(struct cpuidle_device);
 	if (!intel_idle_cpuidle_devices)
 		return -ENOMEM;
 
+	intel_idle_cpuidle_driver_init(&intel_idle_driver);
+	cmdline_table_adjust(&intel_idle_driver);
+
 	retval = intel_idle_sysfs_init();
 	if (retval)
 		pr_warn("failed to initialized sysfs");
-
-	intel_idle_cpuidle_driver_init(&intel_idle_driver);
 
 	retval = cpuidle_register_driver(&intel_idle_driver);
 	if (retval) {
···
 module_param_named(states_off, disabled_states_mask, uint, 0444);
 MODULE_PARM_DESC(states_off, "Mask of disabled idle states");
 /*
- * Some platforms come with mutually exclusive C-states, so that if one is
- * enabled, the other C-states must not be used. Example: C1 and C1E on
- * Sapphire Rapids platform. This parameter allows for selecting the
- * preferred C-states among the groups of mutually exclusive C-states - the
- * selected C-states will be registered, the other C-states from the mutually
- * exclusive group won't be registered. If the platform has no mutually
- * exclusive C-states, this parameter has no effect.
- */
-module_param_named(preferred_cstates, preferred_states_mask, uint, 0444);
-MODULE_PARM_DESC(preferred_cstates, "Mask of preferred idle states");
-/*
  * Debugging option that forces the driver to enter all C-states with
  * interrupts enabled. Does not apply to C-states with
  * 'CPUIDLE_FLAG_INIT_XSTATE' and 'CPUIDLE_FLAG_IBRS' flags.
···
  */
 module_param(ibrs_off, bool, 0444);
 MODULE_PARM_DESC(ibrs_off, "Disable IBRS when idle");
+
+/*
+ * Define the C-states table from a user input string. Expected format is
+ * 'name:latency:residency', where:
+ * - name: The C-state name.
+ * - latency: The C-state exit latency in us.
+ * - residency: The C-state target residency in us.
+ *
+ * Multiple C-states can be defined by separating them with commas:
+ * 'name1:latency1:residency1,name2:latency2:residency2'
+ *
+ * Example: intel_idle.table=C1:1:1,C1E:5:10,C6:100:600
+ *
+ * To leave latency or residency unchanged, use an empty field, for example:
+ * 'C1:1:1,C1E::10' - leaves C1E latency unchanged.
+ */
+module_param_string(table, cmdline_table_str, MAX_CMDLINE_TABLE_LEN, 0444);
+MODULE_PARM_DESC(table, "Build the C-states table from a user input string");
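
The 'name:latency:residency' format handled by the intel_idle hunks above can be tried out in user space. The following is only a sketch under stated assumptions: split_field() is a port of get_cmdline_field() from the diff, while parse_uint(), parse_entry() and struct cstate_override are hypothetical names introduced here, strtoul() merely approximates kstrtouint(), and the C-state lookup, the extra-field handling and the validation step are left out.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* User-space port of get_cmdline_field(): terminate the current field at
 * the first 'sep' and return the rest, or NULL if no separator follows. */
static char *split_field(char *args, char **field, char sep)
{
	size_t i;

	for (i = 0; args[i] && !isspace((unsigned char)args[i]); i++) {
		if (args[i] == sep)
			break;
	}

	*field = args;

	if (args[i] != sep)
		return NULL;

	args[i] = '\0';
	return args + i + 1;
}

/* Hypothetical result type: which values an entry overrides. */
struct cstate_override {
	const char *name;
	int has_latency;
	int has_residency;
	unsigned int latency_us;
	unsigned int residency_us;
};

/* Approximation of kstrtouint(val, 0, ...) for user space. */
static int parse_uint(const char *s, unsigned int *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &end, 0);
	if (end == s || *end != '\0' || errno == ERANGE || v > UINT_MAX)
		return -1;

	*out = (unsigned int)v;
	return 0;
}

/* Parse one '<name>:<latency_us>:<residency_us>' entry in place.
 * Empty numeric fields mean "leave unchanged". Returns 0 or -1. */
static int parse_entry(char *entry, struct cstate_override *out)
{
	char *rest, *name, *val;

	memset(out, 0, sizeof(*out));

	rest = split_field(entry, &name, ':');
	if (!rest)
		return -1;
	out->name = name;

	rest = split_field(rest, &val, ':');
	if (!rest)
		return -1;
	if (*val) {
		if (parse_uint(val, &out->latency_us))
			return -1;
		out->has_latency = 1;
	}

	/* Fields after the residency are not examined in this sketch. */
	split_field(rest, &val, ':');
	if (*val) {
		if (parse_uint(val, &out->residency_us))
			return -1;
		out->has_residency = 1;
	}

	return 0;
}
```

Calling split_field() with ',' on a writable copy of "C1:1:1,C1E::10" yields the entry "C1:1:1" and the remainder "C1E::10"; parse_entry() on the remainder reports a residency of 10 and no latency override, which matches the "empty field leaves the value unchanged" rule described in the module parameter comment.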

+3 -1

drivers/media/i2c/ccs/ccs-core.c

···
 	struct ccs_sensor *sensor = to_ccs_sensor(subdev);
 	struct i2c_client *client = v4l2_get_subdevdata(&sensor->src->sd);
 
-	return pm_runtime_put(&client->dev);
+	pm_runtime_put(&client->dev);
+
+	return 0;
 }
 
 static int ccs_enum_mbus_code(struct v4l2_subdev *subdev,

+1 -1

drivers/opp/core.c

···
 {
 	if (IS_ERR_OR_NULL(opp) || !opp->available) {
 		pr_err("%s: Invalid parameters\n", __func__);
-		return 0;
+		return U32_MAX;
 	}
 
 	return opp->level;

+1 -3

drivers/opp/of.c

···
 /* Initializes OPP tables based on new bindings */
 static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table)
 {
-	struct device_node *np;
 	int ret, count = 0;
 	struct dev_pm_opp *opp;
 
···
 	}
 
 	/* We have opp-table node now, iterate over it and add OPPs */
-	for_each_available_child_of_node(opp_table->np, np) {
+	for_each_available_child_of_node_scoped(opp_table->np, np) {
 		opp = _opp_add_static_v2(opp_table, dev, np);
 		if (IS_ERR(opp)) {
 			ret = PTR_ERR(opp);
 			dev_err(dev, "%s: Failed to add OPP, %d\n", __func__,
 				ret);
-			of_node_put(np);
 			goto remove_static_opp;
 		} else if (opp) {
 			count++;

+3 -1

drivers/platform/chrome/cros_hps_i2c.c

···
 					       struct hps_drvdata, misc_device);
 	struct device *dev = &hps->client->dev;
 
-	return pm_runtime_put(dev);
+	pm_runtime_put(dev);
+
+	return 0;
 }
 
 static const struct file_operations hps_fops = {

drivers/powercap/intel_rapl_msr.c

···
 
 /* List of verified CPUs. */
 static const struct x86_cpu_id pl4_support_ids[] = {
+	X86_MATCH_VFM(INTEL_ICELAKE_L, NULL),
 	X86_MATCH_VFM(INTEL_TIGERLAKE_L, NULL),
 	X86_MATCH_VFM(INTEL_ALDERLAKE, NULL),
 	X86_MATCH_VFM(INTEL_ALDERLAKE_L, NULL),

+6 -7

drivers/powercap/powercap_sys.c

···
 	\
 	if (power_zone->ops->get_##_attr) { \
 		if (!power_zone->ops->get_##_attr(power_zone, &value)) \
-			len = sprintf(buf, "%lld\n", value); \
+			len = sysfs_emit(buf, "%lld\n", value); \
 	} \
 	\
 	return len; \
···
 	pconst = &power_zone->constraints[id]; \
 	if (pconst && pconst->ops && pconst->ops->get_##_attr) { \
 		if (!pconst->ops->get_##_attr(power_zone, id, &value)) \
-			len = sprintf(buf, "%lld\n", value); \
+			len = sysfs_emit(buf, "%lld\n", value); \
 	} \
 	\
 	return len; \
···
 	if (pconst && pconst->ops && pconst->ops->get_name) {
 		name = pconst->ops->get_name(power_zone, id);
 		if (name) {
-			sprintf(buf, "%.*s\n", POWERCAP_CONSTRAINT_NAME_LEN - 1,
-				name);
-			len = strlen(buf);
+			len = sysfs_emit(buf, "%.*s\n",
+					 POWERCAP_CONSTRAINT_NAME_LEN - 1, name);
 		}
 	}
 
···
 {
 	struct powercap_zone *power_zone = to_powercap_zone(dev);
 
-	return sprintf(buf, "%s\n", power_zone->name);
+	return sysfs_emit(buf, "%s\n", power_zone->name);
 }
 
 static DEVICE_ATTR_RO(name);
···
 		mode = false;
 	}
 
-	return sprintf(buf, "%d\n", mode);
+	return sysfs_emit(buf, "%d\n", mode);
 }
 
 static ssize_t enabled_store(struct device *dev,

+2 -2

drivers/ufs/core/ufshcd-priv.h

···
 	return pm_runtime_resume(&hba->ufs_device_wlun->sdev_gendev);
 }
 
-static inline int ufshcd_rpm_put(struct ufs_hba *hba)
+static inline void ufshcd_rpm_put(struct ufs_hba *hba)
 {
-	return pm_runtime_put(&hba->ufs_device_wlun->sdev_gendev);
+	pm_runtime_put(&hba->ufs_device_wlun->sdev_gendev);
 }
 
 /**

+3 -5

drivers/usb/core/driver.c

···
 void usb_autopm_put_interface_async(struct usb_interface *intf)
 {
 	struct usb_device *udev = interface_to_usbdev(intf);
-	int status;
 
 	usb_mark_last_busy(udev);
-	status = pm_runtime_put(&intf->dev);
-	dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
-			__func__, atomic_read(&intf->dev.power.usage_count),
-			status);
+	pm_runtime_put(&intf->dev);
+	dev_vdbg(&intf->dev, "%s: cnt %d\n",
+			__func__, atomic_read(&intf->dev.power.usage_count));
 }
 EXPORT_SYMBOL_GPL(usb_autopm_put_interface_async);
 

+1 -3

drivers/watchdog/rzg2l_wdt.c

···
 	if (ret)
 		return ret;
 
-	ret = pm_runtime_put(wdev->parent);
-	if (ret < 0)
-		return ret;
+	pm_runtime_put(wdev->parent);
 
 	return 0;
 }

+2 -6

drivers/watchdog/rzv2h_wdt.c

···
 	if (priv->of_data->wdtdcr)
 		rzt2h_wdt_wdtdcr_count_stop(priv);
 
-	ret = pm_runtime_put(wdev->parent);
-	if (ret < 0)
-		return ret;
+	pm_runtime_put(wdev->parent);
 
 	return 0;
 }
···
 
 	rzt2h_wdt_wdtdcr_count_stop(priv);
 
-	ret = pm_runtime_put(&pdev->dev);
-	if (ret < 0)
-		return ret;
+	pm_runtime_put(&pdev->dev);
 
 	return 0;
 }

include/acpi/cppc_acpi.h

···
 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 extern int cppc_set_enable(int cpu, bool enable);
 extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
+extern bool cppc_perf_ctrs_in_pcc_cpu(unsigned int cpu);
 extern bool cppc_perf_ctrs_in_pcc(void);
 extern unsigned int cppc_perf_to_khz(struct cppc_perf_caps *caps, unsigned int perf);
 extern unsigned int cppc_khz_to_perf(struct cppc_perf_caps *caps, unsigned int freq);
···
 static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
 {
 	return -EOPNOTSUPP;
+}
+static inline bool cppc_perf_ctrs_in_pcc_cpu(unsigned int cpu)
+{
+	return false;
 }
 static inline bool cppc_perf_ctrs_in_pcc(void)
 {

+1 -1

include/linux/irq.h

···
 
 extern int irq_chip_compose_msi_msg(struct irq_data *data, struct msi_msg *msg);
 extern int irq_chip_pm_get(struct irq_data *data);
-extern int irq_chip_pm_put(struct irq_data *data);
+extern void irq_chip_pm_put(struct irq_data *data);
 #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
 extern void handle_fasteoi_ack_irq(struct irq_desc *desc);
 extern void handle_fasteoi_mask_irq(struct irq_desc *desc);

+1 -1

include/linux/pm.h

···
 	struct list_head entry;
 	struct completion completion;
 	struct wakeup_source *wakeup;
+	bool work_in_progress; /* Owned by the PM core */
 	bool wakeup_path:1;
 	bool syscore:1;
 	bool no_pm_callbacks:1; /* Owned by the PM core */
-	bool work_in_progress:1; /* Owned by the PM core */
 	bool smart_suspend:1; /* Owned by the PM core */
 	bool must_resume:1; /* Owned by the PM core */
 	bool may_skip_resume:1; /* Set by subsystems */
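
The pm.h hunk above turns work_in_progress from a one-bit bit-field into a plain bool, which lines up with the "concurrent bit field updates" fix mentioned in the pull message. A minimal sketch of the underlying C rule follows: adjacent bit-fields form a single memory location, so a store to one is a read-modify-write of storage shared with its neighbours, while a non-bit-field member is a memory location of its own. The struct and function names are hypothetical and not taken from the kernel, and the exact bit-field layout is implementation-defined.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "before" layout: both flags are adjacent bit-fields and
 * therefore share one memory location. Two contexts updating different
 * flags without a common lock can overwrite each other's update. */
struct flags_before {
	bool wakeup_path:1;
	bool work_in_progress:1;
};

/* Hypothetical "after" layout: work_in_progress is an ordinary member,
 * so storing to it cannot touch the storage of the remaining bit-field. */
struct flags_after {
	bool work_in_progress;
	bool wakeup_path:1;
};

/* True when the struct is large enough to hold work_in_progress plus at
 * least one more byte for the bit-field, i.e. the two do not overlap. */
static bool work_in_progress_is_separate(void)
{
	return sizeof(struct flags_after) >
	       offsetof(struct flags_after, work_in_progress) + sizeof(bool) - 1;
}
```

Note that offsetof() can be applied to work_in_progress only in the second layout; taking the offset or the address of a bit-field member is not allowed in C, which is another sign that the bit-field does not have storage of its own.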

include/linux/tick.h

···
 
 #ifdef CONFIG_NO_HZ_COMMON
 extern bool tick_nohz_enabled;
+extern bool tick_nohz_is_active(void);
 extern bool tick_nohz_tick_stopped(void);
 extern bool tick_nohz_tick_stopped_cpu(int cpu);
 extern void tick_nohz_idle_stop_tick(void);
···
 extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
 #else /* !CONFIG_NO_HZ_COMMON */
 #define tick_nohz_enabled (0)
+static inline bool tick_nohz_is_active(void) { return false; }
 static inline int tick_nohz_tick_stopped(void) { return 0; }
 static inline int tick_nohz_tick_stopped_cpu(int cpu) { return 0; }
 static inline void tick_nohz_idle_stop_tick(void) { }

+11 -11

kernel/irq/chip.c

···
 		irq_state_set_disabled(desc);
 		if (is_chained) {
 			desc->action = NULL;
-			WARN_ON(irq_chip_pm_put(irq_desc_get_irq_data(desc)));
+			irq_chip_pm_put(irq_desc_get_irq_data(desc));
 		}
 		desc->depth = 1;
 	}
···
 }
 
 /**
- * irq_chip_pm_put - Disable power for an IRQ chip
+ * irq_chip_pm_put - Drop a PM reference on an IRQ chip
  * @data: Pointer to interrupt specific data
  *
- * Disable the power to the IRQ chip referenced by the interrupt data
- * structure, belongs. Note that power will only be disabled, once this
- * function has been called for all IRQs that have called irq_chip_pm_get().
+ * Drop a power management reference, acquired via irq_chip_pm_get(), on the IRQ
+ * chip represented by the interrupt data structure.
+ *
+ * Note that this will not disable power to the IRQ chip until this function
+ * has been called for all IRQs that have called irq_chip_pm_get() and it may
+ * not disable power at all (if user space prevents that, for example).
  */
-int irq_chip_pm_put(struct irq_data *data)
+void irq_chip_pm_put(struct irq_data *data)
 {
 	struct device *dev = irq_get_pm_device(data);
-	int retval = 0;
 
-	if (IS_ENABLED(CONFIG_PM) && dev)
-		retval = pm_runtime_put(dev);
-
-	return (retval < 0) ? retval : 0;
+	if (dev)
+		pm_runtime_put(dev);
 }

+1 -1

kernel/power/main.c

···
 
 static int __init pm_start_workqueues(void)
 {
-	pm_wq = alloc_workqueue("pm", WQ_FREEZABLE | WQ_UNBOUND, 0);
+	pm_wq = alloc_workqueue("pm", WQ_UNBOUND, 0);
 	if (!pm_wq)
 		return -ENOMEM;
 

+4 -4

kernel/power/swap.c

···
 		for (thr = 0; thr < nr_threads; thr++) {
 			if (data[thr].thr)
 				kthread_stop(data[thr].thr);
-			if (data[thr].cr)
-				acomp_request_free(data[thr].cr);
+
+			acomp_request_free(data[thr].cr);
 
 			if (!IS_ERR_OR_NULL(data[thr].cc))
 				crypto_free_acomp(data[thr].cc);
···
 		for (thr = 0; thr < nr_threads; thr++) {
 			if (data[thr].thr)
 				kthread_stop(data[thr].thr);
-			if (data[thr].cr)
-				acomp_request_free(data[thr].cr);
+
+			acomp_request_free(data[thr].cr);
 
 			if (!IS_ERR_OR_NULL(data[thr].cc))
 				crypto_free_acomp(data[thr].cc);

+1 -1

kernel/time/hrtimer.c

···
 	cpumask_var_t mask;
 	int cpu;
 
-	if (!hrtimer_hres_active(cpu_base) && !tick_nohz_active)
+	if (!hrtimer_hres_active(cpu_base) && !tick_nohz_is_active())
 		goto out_timerfd;
 
 	if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) {

-2

kernel/time/tick-internal.h

···
 #endif
 
 #ifdef CONFIG_NO_HZ_COMMON
-extern unsigned long tick_nohz_active;
 extern void timers_update_nohz(void);
 extern u64 get_jiffies_update(unsigned long *basej);
 # ifdef CONFIG_SMP
···
 # endif
 #else /* CONFIG_NO_HZ_COMMON */
 static inline void timers_update_nohz(void) { }
-#define tick_nohz_active (0)
 #endif
 
 DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);

+7 -1

kernel/time/tick-sched.c

···
  * NO HZ enabled ?
  */
 bool tick_nohz_enabled __read_mostly = true;
-unsigned long tick_nohz_active __read_mostly;
+static unsigned long tick_nohz_active __read_mostly;
 /*
  * Enable / Disable tickless mode
  */
···
 }
 
 __setup("nohz=", setup_tick_nohz);
+
+bool tick_nohz_is_active(void)
+{
+	return tick_nohz_active;
+}
+EXPORT_SYMBOL_GPL(tick_nohz_is_active);
 
 bool tick_nohz_tick_stopped(void)
 {

+1 -1

kernel/time/timer.c

···
 
 static void timers_update_migration(void)
 {
-	if (sysctl_timer_migration && tick_nohz_active)
+	if (sysctl_timer_migration && tick_nohz_is_active())
 		static_branch_enable(&timers_migration_enabled);
 	else
 		static_branch_disable(&timers_migration_enabled);

+2 -1

rust/helpers/cpufreq.c

···
 #include <linux/cpufreq.h>
 
 #ifdef CONFIG_CPU_FREQ
-void rust_helper_cpufreq_register_em_with_opp(struct cpufreq_policy *policy)
+__rust_helper void
+rust_helper_cpufreq_register_em_with_opp(struct cpufreq_policy *policy)
 {
 	cpufreq_register_em_with_opp(policy);
 }

+3 -2

rust/kernel/cpufreq.rs

···
 /// ```
 /// use kernel::{
 ///     cpufreq,
-///     c_str,
 ///     device::{Core, Device},
 ///     macros::vtable,
 ///     of, platform,
···
 ///
 /// #[vtable]
 /// impl cpufreq::Driver for SampleDriver {
-///     const NAME: &'static CStr = c_str!("cpufreq-sample");
+///     const NAME: &'static CStr = c"cpufreq-sample";
 ///     const FLAGS: u16 = cpufreq::flags::NEED_INITIAL_FREQ_CHECK | cpufreq::flags::IS_COOLING_DEV;
 ///     const BOOST_ENABLED: bool = true;
 ///
···
         ..pin_init::zeroed()
     };
 
+    // Always inline to optimize out error path of `build_assert`.
+    #[inline(always)]
     const fn copy_name(name: &'static CStr) -> [c_char; CPUFREQ_NAME_LEN] {
         let src = name.to_bytes_with_nul();
         let mut dst = [0; CPUFREQ_NAME_LEN];

+5 -5

rust/kernel/cpumask.rs

···
 /// fn set_clear_cpu(ptr: *mut bindings::cpumask, set_cpu: CpuId, clear_cpu: CpuId) {
 ///     // SAFETY: The `ptr` is valid for writing and remains valid for the lifetime of the
 ///     // returned reference.
-///     let mask = unsafe { Cpumask::as_mut_ref(ptr) };
+///     let mask = unsafe { Cpumask::from_raw_mut(ptr) };
 ///
 ///     mask.set(set_cpu);
 ///     mask.clear(clear_cpu);
···
 pub struct Cpumask(Opaque<bindings::cpumask>);
 
 impl Cpumask {
-    /// Creates a mutable reference to an existing `struct cpumask` pointer.
+    /// Creates a mutable reference from an existing `struct cpumask` pointer.
     ///
     /// # Safety
     ///
     /// The caller must ensure that `ptr` is valid for writing and remains valid for the lifetime
     /// of the returned reference.
-    pub unsafe fn as_mut_ref<'a>(ptr: *mut bindings::cpumask) -> &'a mut Self {
+    pub unsafe fn from_raw_mut<'a>(ptr: *mut bindings::cpumask) -> &'a mut Self {
         // SAFETY: Guaranteed by the safety requirements of the function.
         //
         // INVARIANT: The caller ensures that `ptr` is valid for writing and remains valid for the
···
         unsafe { &mut *ptr.cast() }
     }
 
-    /// Creates a reference to an existing `struct cpumask` pointer.
+    /// Creates a reference from an existing `struct cpumask` pointer.
     ///
     /// # Safety
     ///
     /// The caller must ensure that `ptr` is valid for reading and remains valid for the lifetime
     /// of the returned reference.
-    pub unsafe fn as_ref<'a>(ptr: *const bindings::cpumask) -> &'a Self {
+    pub unsafe fn from_raw<'a>(ptr: *const bindings::cpumask) -> &'a Self {
         // SAFETY: Guaranteed by the safety requirements of the function.
         //
         // INVARIANT: The caller ensures that `ptr` is valid for reading and remains valid for the

+12 -5

tools/power/cpupower/Makefile

···
 	$(INSTALL_DATA) lib/cpuidle.h $(DESTDIR)${includedir}/cpuidle.h
 	$(INSTALL_DATA) lib/powercap.h $(DESTDIR)${includedir}/powercap.h
 
-install-tools: $(OUTPUT)cpupower
+# SYSTEMD=false disables installation of the systemd unit file
+SYSTEMD ?= true
+
+install-systemd:
+	$(INSTALL) -d $(DESTDIR)${unitdir}
+	sed 's|___CDIR___|${confdir}|; s|___LDIR___|${libexecdir}|' cpupower.service.in > '$(DESTDIR)${unitdir}/cpupower.service'
+	$(SETPERM_DATA) '$(DESTDIR)${unitdir}/cpupower.service'
+
+INSTALL_SYSTEMD := $(if $(filter true,$(strip $(SYSTEMD))),install-systemd)
+
+install-tools: $(OUTPUT)cpupower $(INSTALL_SYSTEMD)
 	$(INSTALL) -d $(DESTDIR)${bindir}
 	$(INSTALL_PROGRAM) $(OUTPUT)cpupower $(DESTDIR)${bindir}
 	$(INSTALL) -d $(DESTDIR)${bash_completion_dir}
···
 	$(INSTALL_DATA) cpupower-service.conf '$(DESTDIR)${confdir}'
 	$(INSTALL) -d $(DESTDIR)${libexecdir}
 	$(INSTALL_PROGRAM) cpupower.sh '$(DESTDIR)${libexecdir}/cpupower'
-	$(INSTALL) -d $(DESTDIR)${unitdir}
-	sed 's|___CDIR___|${confdir}|; s|___LDIR___|${libexecdir}|' cpupower.service.in > '$(DESTDIR)${unitdir}/cpupower.service'
-	$(SETPERM_DATA) '$(DESTDIR)${unitdir}/cpupower.service'
 
 install-man:
 	$(INSTALL_DATA) -D man/cpupower.1 $(DESTDIR)${mandir}/man1/cpupower.1
···
 	@echo ' uninstall - Remove previously installed files from the dir defined by "DESTDIR"'
 	@echo ' cmdline or Makefile config block option (default: "")'
 
-.PHONY: all utils libcpupower update-po create-gmo install-lib install-tools install-man install-gmo install uninstall clean help
+.PHONY: all utils libcpupower update-po create-gmo install-lib install-systemd install-tools install-man install-gmo install uninstall clean help

+3 -4

tools/power/cpupower/lib/cpuidle.c

···
 	if (len == 0)
 		return 0;
 
+	errno = 0;
 	value = strtoull(linebuf, &endp, 0);
 
 	if (endp == linebuf || errno == ERANGE)
···
 	if (result == NULL)
 		return NULL;
 
-	if (result[strlen(result) - 1] == '\n')
-		result[strlen(result) - 1] = '\0';
+	result[strcspn(result, "\n")] = '\0';
 
 	return result;
 }
···
 	if (result == NULL)
 		return NULL;
 
-	if (result[strlen(result) - 1] == '\n')
-		result[strlen(result) - 1] = '\0';
+	result[strcspn(result, "\n")] = '\0';
 
 	return result;
 }
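
The two idioms adopted in the cpuidle.c hunks above can be exercised in isolation. This is a sketch only: strip_newline() and parse_ull() are hypothetical helper names, not functions from the cpupower library, and the surrounding sysfs reading code is omitted.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Cut the string at the first newline, if there is one. Unlike indexing
 * with strlen(s) - 1, this is also safe when the string is empty. */
static void strip_newline(char *s)
{
	s[strcspn(s, "\n")] = '\0';
}

/* Parse an unsigned value. Returns 0 on success and -1 on failure. */
static int parse_ull(const char *s, unsigned long long *out)
{
	char *endp;
	unsigned long long value;

	/* strtoull() does not reset errno on success, so a stale ERANGE
	 * from an earlier call would otherwise look like a failure. */
	errno = 0;
	value = strtoull(s, &endp, 0);

	if (endp == s || errno == ERANGE)
		return -1;

	*out = value;
	return 0;
}
```

With errno deliberately left at ERANGE before the call, parse_ull("1234\n", &v) still succeeds, which is the situation the added `errno = 0;` line guards against.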

+1 -1

tools/power/cpupower/utils/cpufreq-info.c

···
 {
 	unsigned long freq;
 
-	if (cpupower_cpu_info.caps & CPUPOWER_CAP_APERF)
+	if (!(cpupower_cpu_info.caps & CPUPOWER_CAP_APERF))
 		return -EINVAL;
 
 	freq = cpufreq_get_freq_hardware(cpu);

+1 -1

tools/power/cpupower/utils/cpuidle-info.c

···
 	printf(_("max_cstate: C%u\n"), cstates-1);
 	printf(_("maximum allowed latency: %lu usec\n"), max_allowed_cstate);
 	printf(_("states:\t\n"));
-	for (cstate = 1; cstate < cstates; cstate++) {
+	for (cstate = 0; cstate < cstates; cstate++) {
 		printf(_(" C%d: "
 			 "type[C%d] "), cstate, cstate);
 		printf(_("promotion[--] demotion[--] "));

+1 -1

tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c

···
 			current_count[cpu][state] =
 				cpuidle_state_time(cpu, state);
 			dprint("CPU %d - State: %d - Val: %llu\n",
-			       cpu, state, previous_count[cpu][state]);
+			       cpu, state, current_count[cpu][state]);
 		}
 	}
 	return 0;
