Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'pm-5.9-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull one more power management update from Rafael Wysocki:
"Modify the intel_pstate driver to allow it to work in the passive mode
with hardware-managed P-states (HWP) enabled"

* tag 'pm-5.9-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Implement passive mode with HWP enabled

+229 -113
+43 -46
Documentation/admin-guide/pm/intel_pstate.rst
··· 54 54 Operation Modes 55 55 =============== 56 56 57 - ``intel_pstate`` can operate in three different modes: in the active mode with 58 - or without hardware-managed P-states support and in the passive mode. Which of 59 - them will be in effect depends on what kernel command line options are used and 60 - on the capabilities of the processor. 57 + ``intel_pstate`` can operate in two different modes, active or passive. In the 58 + active mode, it uses its own internal performance scaling governor algorithm or 59 + allows the hardware to do preformance scaling by itself, while in the passive 60 + mode it responds to requests made by a generic ``CPUFreq`` governor implementing 61 + a certain performance scaling algorithm. Which of them will be in effect 62 + depends on what kernel command line options are used and on the capabilities of 63 + the processor. 61 64 62 65 Active Mode 63 66 ----------- ··· 197 194 hardware-managed P-states (HWP) support. It is always used if the 198 195 ``intel_pstate=passive`` argument is passed to the kernel in the command line 199 196 regardless of whether or not the given processor supports HWP. [Note that the 200 - ``intel_pstate=no_hwp`` setting implies ``intel_pstate=passive`` if it is used 201 - without ``intel_pstate=active``.] Like in the active mode without HWP support, 202 - in this mode ``intel_pstate`` may refuse to work with processors that are not 203 - recognized by it. 197 + ``intel_pstate=no_hwp`` setting causes the driver to start in the passive mode 198 + if it is not combined with ``intel_pstate=active``.] Like in the active mode 199 + without HWP support, in this mode ``intel_pstate`` may refuse to work with 200 + processors that are not recognized by it if HWP is prevented from being enabled 201 + through the kernel command line. 204 202 205 203 If the driver works in this mode, the ``scaling_driver`` policy attribute in 206 204 ``sysfs`` for all ``CPUFreq`` policies contains the string "intel_cpufreq". 
··· 322 318 323 319 For this reason, there is a list of supported processors in ``intel_pstate`` and 324 320 the driver initialization will fail if the detected processor is not in that 325 - list, unless it supports the `HWP feature <Active Mode_>`_. [The interface to 326 - obtain all of the information listed above is the same for all of the processors 327 - supporting the HWP feature, which is why they all are supported by 328 - ``intel_pstate``.] 321 + list, unless it supports the HWP feature. [The interface to obtain all of the 322 + information listed above is the same for all of the processors supporting the 323 + HWP feature, which is why ``intel_pstate`` works with all of them.] 329 324 330 325 331 326 User Space Interface in ``sysfs`` ··· 428 425 as well as the per-policy ones) are then reset to their default 429 426 values, possibly depending on the target operation mode.] 430 427 431 - That only is supported in some configurations, though (for example, if 432 - the `HWP feature is enabled in the processor <Active Mode With HWP_>`_, 433 - the operation mode of the driver cannot be changed), and if it is not 434 - supported in the current configuration, writes to this attribute will 435 - fail with an appropriate error. 436 - 437 428 ``energy_efficiency`` 438 - This attribute is only present on platforms, which have CPUs matching 439 - Kaby Lake or Coffee Lake desktop CPU model. By default 440 - energy efficiency optimizations are disabled on these CPU models in HWP 441 - mode by this driver. Enabling energy efficiency may limit maximum 442 - operating frequency in both HWP and non HWP mode. In non HWP mode, 443 - optimizations are done only in the turbo frequency range. In HWP mode, 444 - optimizations are done in the entire frequency range. Setting this 445 - attribute to "1" enables energy efficiency optimizations and setting 446 - to "0" disables energy efficiency optimizations. 
429 + This attribute is only present on platforms with CPUs matching the Kaby 430 + Lake or Coffee Lake desktop CPU model. By default, energy-efficiency 431 + optimizations are disabled on these CPU models if HWP is enabled. 432 + Enabling energy-efficiency optimizations may limit maximum operating 433 + frequency with or without the HWP feature. With HWP enabled, the 434 + optimizations are done only in the turbo frequency range. Without it, 435 + they are done in the entire available frequency range. Setting this 436 + attribute to "1" enables the energy-efficiency optimizations and setting 437 + to "0" disables them. 447 438 448 439 Interpretation of Policy Attributes 449 440 ----------------------------------- ··· 481 484 policy for the time interval between the last two invocations of the 482 485 driver's utilization update callback by the CPU scheduler for that CPU. 483 486 484 - One more policy attribute is present if the `HWP feature is enabled in the 485 - processor <Active Mode With HWP_>`_: 487 + One more policy attribute is present if the HWP feature is enabled in the 488 + processor: 486 489 487 490 ``base_frequency`` 488 491 Shows the base frequency of the CPU. Any frequency above this will be ··· 523 526 524 527 3. The global and per-policy limits can be set independently. 525 528 526 - If the `HWP feature is enabled in the processor <Active Mode With HWP_>`_, the 527 - resulting effective values are written into its registers whenever the limits 528 - change in order to request its internal P-state selection logic to always set 529 - P-states within these limits. 
Otherwise, the limits are taken into account by 530 - scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver 529 + In the `active mode with the HWP feature enabled <Active Mode With HWP_>`_, the 530 + resulting effective values are written into hardware registers whenever the 531 + limits change in order to request its internal P-state selection logic to always 532 + set P-states within these limits. Otherwise, the limits are taken into account 533 + by scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver 531 534 every time before setting a new P-state for a CPU. 532 535 533 536 Additionally, if the ``intel_pstate=per_cpu_perf_limits`` command line argument ··· 538 541 Energy vs Performance Hints 539 542 --------------------------- 540 543 541 - If ``intel_pstate`` works in the `active mode with the HWP feature enabled 542 - <Active Mode With HWP_>`_ in the processor, additional attributes are present 543 - in every ``CPUFreq`` policy directory in ``sysfs``. They are intended to allow 544 - user space to help ``intel_pstate`` to adjust the processor's internal P-state 545 - selection logic by focusing it on performance or on energy-efficiency, or 546 - somewhere between the two extremes: 544 + If the hardware-managed P-states (HWP) is enabled in the processor, additional 545 + attributes, intended to allow user space to help ``intel_pstate`` to adjust the 546 + processor's internal P-state selection logic by focusing it on performance or on 547 + energy-efficiency, or somewhere between the two extremes, are present in every 548 + ``CPUFreq`` policy directory in ``sysfs``. They are : 547 549 548 550 ``energy_performance_preference`` 549 551 Current value of the energy vs performance hint for the given policy ··· 646 650 Do not register ``intel_pstate`` as the scaling driver even if the 647 651 processor is supported by it. 
648 652 653 + ``active`` 654 + Register ``intel_pstate`` in the `active mode <Active Mode_>`_ to start 655 + with. 656 + 649 657 ``passive`` 650 658 Register ``intel_pstate`` in the `passive mode <Passive Mode_>`_ to 651 659 start with. 652 - 653 - This option implies the ``no_hwp`` one described below. 654 660 655 661 ``force`` 656 662 Register ``intel_pstate`` as the scaling driver instead of ··· 668 670 driver is used instead of ``acpi-cpufreq``. 669 671 670 672 ``no_hwp`` 671 - Do not enable the `hardware-managed P-states (HWP) feature 672 - <Active Mode With HWP_>`_ even if it is supported by the processor. 673 + Do not enable the hardware-managed P-states (HWP) feature even if it is 674 + supported by the processor. 673 675 674 676 ``hwp_only`` 675 677 Register ``intel_pstate`` as the scaling driver only if the 676 - `hardware-managed P-states (HWP) feature <Active Mode With HWP_>`_ is 677 - supported by the processor. 678 + hardware-managed P-states (HWP) feature is supported by the processor. 678 679 679 680 ``support_acpi_ppc`` 680 681 Take ACPI ``_PPC`` performance limits into account.
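The documentation hunks above say that the ``scaling_driver`` policy attribute contains "intel_cpufreq" when the driver runs in the passive mode. A minimal user-space sketch of that check follows. The helper names `attr_equals` and `classify_scaling_driver` are hypothetical, and the active-mode string "intel_pstate" is taken from the same kernel document rather than from the hunks shown here.

```c
#include <assert.h>
#include <string.h>

/* Modes distinguishable from the scaling_driver attribute alone. */
enum pstate_mode { MODE_ACTIVE, MODE_PASSIVE, MODE_OTHER };

/* Return nonzero if s, read up to its first newline, equals want. */
static int attr_equals(const char *s, const char *want)
{
	size_t len = strcspn(s, "\n");	/* sysfs values normally end in '\n' */

	return len == strlen(want) && strncmp(s, want, len) == 0;
}

/*
 * Classify the text read from a policy's scaling_driver attribute.
 * "intel_cpufreq" is the passive-mode string quoted in the documentation;
 * "intel_pstate" is assumed to be the active-mode one. Anything else
 * means some other scaling driver is registered.
 */
static enum pstate_mode classify_scaling_driver(const char *value)
{
	assert(value != NULL);

	if (attr_equals(value, "intel_cpufreq"))
		return MODE_PASSIVE;
	if (attr_equals(value, "intel_pstate"))
		return MODE_ACTIVE;
	return MODE_OTHER;
}
```

A caller would typically read the string from a policy directory such as `/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver` and pass it in unchanged. Note that after this merge the result says nothing about whether HWP is enabled, because the passive mode no longer implies ``no_hwp``.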
+2 -4
drivers/cpufreq/cpufreq.c
···
 static unsigned int __cpufreq_get(struct cpufreq_policy *policy);
 static int cpufreq_init_governor(struct cpufreq_policy *policy);
 static void cpufreq_exit_governor(struct cpufreq_policy *policy);
-static int cpufreq_start_governor(struct cpufreq_policy *policy);
-static void cpufreq_stop_governor(struct cpufreq_policy *policy);
 static void cpufreq_governor_limits(struct cpufreq_policy *policy);
 static int cpufreq_set_policy(struct cpufreq_policy *policy,
 			      struct cpufreq_governor *new_gov,
···
 	module_put(policy->governor->owner);
 }

-static int cpufreq_start_governor(struct cpufreq_policy *policy)
+int cpufreq_start_governor(struct cpufreq_policy *policy)
 {
 	int ret;

···
 	return 0;
 }

-static void cpufreq_stop_governor(struct cpufreq_policy *policy)
+void cpufreq_stop_governor(struct cpufreq_policy *policy)
 {
 	if (cpufreq_suspended || !policy->governor)
 		return;
+182 -63
drivers/cpufreq/intel_pstate.c
··· 36 36 #define INTEL_PSTATE_SAMPLING_INTERVAL (10 * NSEC_PER_MSEC) 37 37 38 38 #define INTEL_CPUFREQ_TRANSITION_LATENCY 20000 39 + #define INTEL_CPUFREQ_TRANSITION_DELAY_HWP 5000 39 40 #define INTEL_CPUFREQ_TRANSITION_DELAY 500 40 41 41 42 #ifdef CONFIG_ACPI ··· 221 220 * preference/bias 222 221 * @epp_saved: Saved EPP/EPB during system suspend or CPU offline 223 222 * operation 223 + * @epp_cached Cached HWP energy-performance preference value 224 224 * @hwp_req_cached: Cached value of the last HWP Request MSR 225 225 * @hwp_cap_cached: Cached value of the last HWP Capabilities MSR 226 226 * @last_io_update: Last time when IO wake flag was set ··· 259 257 s16 epp_policy; 260 258 s16 epp_default; 261 259 s16 epp_saved; 260 + s16 epp_cached; 262 261 u64 hwp_req_cached; 263 262 u64 hwp_cap_cached; 264 263 u64 last_io_update; ··· 642 639 return index; 643 640 } 644 641 642 + static int intel_pstate_set_epp(struct cpudata *cpu, u32 epp) 643 + { 644 + /* 645 + * Use the cached HWP Request MSR value, because in the active mode the 646 + * register itself may be updated by intel_pstate_hwp_boost_up() or 647 + * intel_pstate_hwp_boost_down() at any time. 648 + */ 649 + u64 value = READ_ONCE(cpu->hwp_req_cached); 650 + 651 + value &= ~GENMASK_ULL(31, 24); 652 + value |= (u64)epp << 24; 653 + /* 654 + * The only other updater of hwp_req_cached in the active mode, 655 + * intel_pstate_hwp_set(), is called under the same lock as this 656 + * function, so it cannot run in parallel with the update below. 
657 + */ 658 + WRITE_ONCE(cpu->hwp_req_cached, value); 659 + return wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value); 660 + } 661 + 645 662 static int intel_pstate_set_energy_pref_index(struct cpudata *cpu_data, 646 663 int pref_index, bool use_raw, 647 664 u32 raw_epp) ··· 673 650 epp = cpu_data->epp_default; 674 651 675 652 if (boot_cpu_has(X86_FEATURE_HWP_EPP)) { 676 - /* 677 - * Use the cached HWP Request MSR value, because the register 678 - * itself may be updated by intel_pstate_hwp_boost_up() or 679 - * intel_pstate_hwp_boost_down() at any time. 680 - */ 681 - u64 value = READ_ONCE(cpu_data->hwp_req_cached); 682 - 683 - value &= ~GENMASK_ULL(31, 24); 684 - 685 653 if (use_raw) 686 654 epp = raw_epp; 687 655 else if (epp == -EINVAL) 688 656 epp = epp_values[pref_index - 1]; 689 657 690 - value |= (u64)epp << 24; 691 - /* 692 - * The only other updater of hwp_req_cached in the active mode, 693 - * intel_pstate_hwp_set(), is called under the same lock as this 694 - * function, so it cannot run in parallel with the update below. 695 - */ 696 - WRITE_ONCE(cpu_data->hwp_req_cached, value); 697 - ret = wrmsrl_on_cpu(cpu_data->cpu, MSR_HWP_REQUEST, value); 658 + ret = intel_pstate_set_epp(cpu_data, epp); 698 659 } else { 699 660 if (epp == -EINVAL) 700 661 epp = (pref_index - 1) << 2; ··· 704 697 705 698 cpufreq_freq_attr_ro(energy_performance_available_preferences); 706 699 700 + static struct cpufreq_driver intel_pstate; 701 + 707 702 static ssize_t store_energy_performance_preference( 708 703 struct cpufreq_policy *policy, const char *buf, size_t count) 709 704 { 710 - struct cpudata *cpu_data = all_cpu_data[policy->cpu]; 705 + struct cpudata *cpu = all_cpu_data[policy->cpu]; 711 706 char str_preference[21]; 712 707 bool raw = false; 713 708 ssize_t ret; ··· 734 725 raw = true; 735 726 } 736 727 728 + /* 729 + * This function runs with the policy R/W semaphore held, which 730 + * guarantees that the driver pointer will not change while it is 731 + * running. 
732 + */ 733 + if (!intel_pstate_driver) 734 + return -EAGAIN; 735 + 737 736 mutex_lock(&intel_pstate_limits_lock); 738 737 739 - ret = intel_pstate_set_energy_pref_index(cpu_data, ret, raw, epp); 740 - if (!ret) 741 - ret = count; 738 + if (intel_pstate_driver == &intel_pstate) { 739 + ret = intel_pstate_set_energy_pref_index(cpu, ret, raw, epp); 740 + } else { 741 + /* 742 + * In the passive mode the governor needs to be stopped on the 743 + * target CPU before the EPP update and restarted after it, 744 + * which is super-heavy-weight, so make sure it is worth doing 745 + * upfront. 746 + */ 747 + if (!raw) 748 + epp = ret ? epp_values[ret - 1] : cpu->epp_default; 749 + 750 + if (cpu->epp_cached != epp) { 751 + int err; 752 + 753 + cpufreq_stop_governor(policy); 754 + ret = intel_pstate_set_epp(cpu, epp); 755 + err = cpufreq_start_governor(policy); 756 + if (!ret) { 757 + cpu->epp_cached = epp; 758 + ret = err; 759 + } 760 + } 761 + } 742 762 743 763 mutex_unlock(&intel_pstate_limits_lock); 744 764 745 - return ret; 765 + return ret ?: count; 746 766 } 747 767 748 768 static ssize_t show_energy_performance_preference( ··· 1183 1145 return count; 1184 1146 } 1185 1147 1186 - static struct cpufreq_driver intel_pstate; 1187 - 1188 1148 static void update_qos_request(enum freq_qos_req_type type) 1189 1149 { 1190 1150 int max_state, turbo_max, freq, i, perf_pct; ··· 1366 1330 1367 1331 static const struct x86_cpu_id intel_pstate_cpu_ee_disable_ids[]; 1368 1332 1333 + static struct kobject *intel_pstate_kobject; 1334 + 1369 1335 static void __init intel_pstate_sysfs_expose_params(void) 1370 1336 { 1371 - struct kobject *intel_pstate_kobject; 1372 1337 int rc; 1373 1338 1374 1339 intel_pstate_kobject = kobject_create_and_add("intel_pstate", ··· 1394 1357 rc = sysfs_create_file(intel_pstate_kobject, &min_perf_pct.attr); 1395 1358 WARN_ON(rc); 1396 1359 1397 - if (hwp_active) { 1398 - rc = sysfs_create_file(intel_pstate_kobject, 1399 - &hwp_dynamic_boost.attr); 1400 - 
WARN_ON(rc); 1401 - } 1402 - 1403 1360 if (x86_match_cpu(intel_pstate_cpu_ee_disable_ids)) { 1404 1361 rc = sysfs_create_file(intel_pstate_kobject, &energy_efficiency.attr); 1405 1362 WARN_ON(rc); 1406 1363 } 1407 1364 } 1365 + 1366 + static void intel_pstate_sysfs_expose_hwp_dynamic_boost(void) 1367 + { 1368 + int rc; 1369 + 1370 + if (!hwp_active) 1371 + return; 1372 + 1373 + rc = sysfs_create_file(intel_pstate_kobject, &hwp_dynamic_boost.attr); 1374 + WARN_ON_ONCE(rc); 1375 + } 1376 + 1377 + static void intel_pstate_sysfs_hide_hwp_dynamic_boost(void) 1378 + { 1379 + if (!hwp_active) 1380 + return; 1381 + 1382 + sysfs_remove_file(intel_pstate_kobject, &hwp_dynamic_boost.attr); 1383 + } 1384 + 1408 1385 /************************** sysfs end ************************/ 1409 1386 1410 1387 static void intel_pstate_hwp_enable(struct cpudata *cpudata) ··· 2298 2247 2299 2248 static void intel_cpufreq_stop_cpu(struct cpufreq_policy *policy) 2300 2249 { 2301 - intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]); 2250 + if (hwp_active) 2251 + intel_pstate_hwp_force_min_perf(policy->cpu); 2252 + else 2253 + intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]); 2302 2254 } 2303 2255 2304 2256 static void intel_pstate_stop_cpu(struct cpufreq_policy *policy) ··· 2309 2255 pr_debug("CPU %d exiting\n", policy->cpu); 2310 2256 2311 2257 intel_pstate_clear_update_util_hook(policy->cpu); 2312 - if (hwp_active) { 2258 + if (hwp_active) 2313 2259 intel_pstate_hwp_save_state(policy); 2314 - intel_pstate_hwp_force_min_perf(policy->cpu); 2315 - } else { 2316 - intel_cpufreq_stop_cpu(policy); 2317 - } 2260 + 2261 + intel_cpufreq_stop_cpu(policy); 2318 2262 } 2319 2263 2320 2264 static int intel_pstate_cpu_exit(struct cpufreq_policy *policy) ··· 2442 2390 fp_toint(cpu->iowait_boost * 100)); 2443 2391 } 2444 2392 2393 + static void intel_cpufreq_adjust_hwp(struct cpudata *cpu, u32 target_pstate, 2394 + bool fast_switch) 2395 + { 2396 + u64 prev = READ_ONCE(cpu->hwp_req_cached), value 
= prev; 2397 + 2398 + value &= ~HWP_MIN_PERF(~0L); 2399 + value |= HWP_MIN_PERF(target_pstate); 2400 + 2401 + /* 2402 + * The entire MSR needs to be updated in order to update the HWP min 2403 + * field in it, so opportunistically update the max too if needed. 2404 + */ 2405 + value &= ~HWP_MAX_PERF(~0L); 2406 + value |= HWP_MAX_PERF(cpu->max_perf_ratio); 2407 + 2408 + if (value == prev) 2409 + return; 2410 + 2411 + WRITE_ONCE(cpu->hwp_req_cached, value); 2412 + if (fast_switch) 2413 + wrmsrl(MSR_HWP_REQUEST, value); 2414 + else 2415 + wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value); 2416 + } 2417 + 2418 + static void intel_cpufreq_adjust_perf_ctl(struct cpudata *cpu, 2419 + u32 target_pstate, bool fast_switch) 2420 + { 2421 + if (fast_switch) 2422 + wrmsrl(MSR_IA32_PERF_CTL, 2423 + pstate_funcs.get_val(cpu, target_pstate)); 2424 + else 2425 + wrmsrl_on_cpu(cpu->cpu, MSR_IA32_PERF_CTL, 2426 + pstate_funcs.get_val(cpu, target_pstate)); 2427 + } 2428 + 2429 + static int intel_cpufreq_update_pstate(struct cpudata *cpu, int target_pstate, 2430 + bool fast_switch) 2431 + { 2432 + int old_pstate = cpu->pstate.current_pstate; 2433 + 2434 + target_pstate = intel_pstate_prepare_request(cpu, target_pstate); 2435 + if (target_pstate != old_pstate) { 2436 + cpu->pstate.current_pstate = target_pstate; 2437 + if (hwp_active) 2438 + intel_cpufreq_adjust_hwp(cpu, target_pstate, 2439 + fast_switch); 2440 + else 2441 + intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, 2442 + fast_switch); 2443 + } 2444 + 2445 + intel_cpufreq_trace(cpu, fast_switch ? 
INTEL_PSTATE_TRACE_FAST_SWITCH : 2446 + INTEL_PSTATE_TRACE_TARGET, old_pstate); 2447 + 2448 + return target_pstate; 2449 + } 2450 + 2445 2451 static int intel_cpufreq_target(struct cpufreq_policy *policy, 2446 2452 unsigned int target_freq, 2447 2453 unsigned int relation) 2448 2454 { 2449 2455 struct cpudata *cpu = all_cpu_data[policy->cpu]; 2450 2456 struct cpufreq_freqs freqs; 2451 - int target_pstate, old_pstate; 2457 + int target_pstate; 2452 2458 2453 2459 update_turbo_state(); 2454 2460 ··· 2514 2404 freqs.new = target_freq; 2515 2405 2516 2406 cpufreq_freq_transition_begin(policy, &freqs); 2407 + 2517 2408 switch (relation) { 2518 2409 case CPUFREQ_RELATION_L: 2519 2410 target_pstate = DIV_ROUND_UP(freqs.new, cpu->pstate.scaling); ··· 2526 2415 target_pstate = DIV_ROUND_CLOSEST(freqs.new, cpu->pstate.scaling); 2527 2416 break; 2528 2417 } 2529 - target_pstate = intel_pstate_prepare_request(cpu, target_pstate); 2530 - old_pstate = cpu->pstate.current_pstate; 2531 - if (target_pstate != cpu->pstate.current_pstate) { 2532 - cpu->pstate.current_pstate = target_pstate; 2533 - wrmsrl_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, 2534 - pstate_funcs.get_val(cpu, target_pstate)); 2535 - } 2418 + 2419 + target_pstate = intel_cpufreq_update_pstate(cpu, target_pstate, false); 2420 + 2536 2421 freqs.new = target_pstate * cpu->pstate.scaling; 2537 - intel_cpufreq_trace(cpu, INTEL_PSTATE_TRACE_TARGET, old_pstate); 2422 + 2538 2423 cpufreq_freq_transition_end(policy, &freqs, false); 2539 2424 2540 2425 return 0; ··· 2540 2433 unsigned int target_freq) 2541 2434 { 2542 2435 struct cpudata *cpu = all_cpu_data[policy->cpu]; 2543 - int target_pstate, old_pstate; 2436 + int target_pstate; 2544 2437 2545 2438 update_turbo_state(); 2546 2439 2547 2440 target_pstate = DIV_ROUND_UP(target_freq, cpu->pstate.scaling); 2548 - target_pstate = intel_pstate_prepare_request(cpu, target_pstate); 2549 - old_pstate = cpu->pstate.current_pstate; 2550 - intel_pstate_update_pstate(cpu, 
target_pstate); 2551 - intel_cpufreq_trace(cpu, INTEL_PSTATE_TRACE_FAST_SWITCH, old_pstate); 2441 + 2442 + target_pstate = intel_cpufreq_update_pstate(cpu, target_pstate, true); 2443 + 2552 2444 return target_pstate * cpu->pstate.scaling; 2553 2445 } 2554 2446 ··· 2567 2461 return ret; 2568 2462 2569 2463 policy->cpuinfo.transition_latency = INTEL_CPUFREQ_TRANSITION_LATENCY; 2570 - policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY; 2571 2464 /* This reflects the intel_pstate_get_cpu_pstates() setting. */ 2572 2465 policy->cur = policy->cpuinfo.min_freq; 2573 2466 ··· 2578 2473 2579 2474 cpu = all_cpu_data[policy->cpu]; 2580 2475 2581 - if (hwp_active) 2476 + if (hwp_active) { 2477 + u64 value; 2478 + 2582 2479 intel_pstate_get_hwp_max(policy->cpu, &turbo_max, &max_state); 2583 - else 2480 + policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY_HWP; 2481 + rdmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, &value); 2482 + WRITE_ONCE(cpu->hwp_req_cached, value); 2483 + cpu->epp_cached = (value & GENMASK_ULL(31, 24)) >> 24; 2484 + } else { 2584 2485 turbo_max = cpu->pstate.turbo_pstate; 2486 + policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY; 2487 + } 2585 2488 2586 2489 min_freq = DIV_ROUND_UP(turbo_max * global.min_perf_pct, 100); 2587 2490 min_freq *= cpu->pstate.scaling; ··· 2666 2553 } 2667 2554 } 2668 2555 put_online_cpus(); 2556 + 2557 + if (intel_pstate_driver == &intel_pstate) 2558 + intel_pstate_sysfs_hide_hwp_dynamic_boost(); 2559 + 2669 2560 intel_pstate_driver = NULL; 2670 2561 } 2671 2562 2672 2563 static int intel_pstate_register_driver(struct cpufreq_driver *driver) 2673 2564 { 2674 2565 int ret; 2566 + 2567 + if (driver == &intel_pstate) 2568 + intel_pstate_sysfs_expose_hwp_dynamic_boost(); 2675 2569 2676 2570 memset(&global, 0, sizeof(global)); 2677 2571 global.max_perf_pct = 100; ··· 2697 2577 2698 2578 static int intel_pstate_unregister_driver(void) 2699 2579 { 2700 - if (hwp_active) 2701 - return -EBUSY; 2702 - 2703 2580 
cpufreq_unregister_driver(intel_pstate_driver); 2704 2581 intel_pstate_driver_cleanup(); 2705 2582 ··· 2952 2835 hwp_active++; 2953 2836 hwp_mode_bdw = id->driver_data; 2954 2837 intel_pstate.attr = hwp_cpufreq_attrs; 2955 - default_driver = &intel_pstate; 2838 + intel_cpufreq.attr = hwp_cpufreq_attrs; 2839 + if (!default_driver) 2840 + default_driver = &intel_pstate; 2841 + 2956 2842 goto hwp_cpu_matched; 2957 2843 } 2958 2844 } else { ··· 3026 2906 if (!str) 3027 2907 return -EINVAL; 3028 2908 3029 - if (!strcmp(str, "disable")) { 2909 + if (!strcmp(str, "disable")) 3030 2910 no_load = 1; 3031 - } else if (!strcmp(str, "active")) { 2911 + else if (!strcmp(str, "active")) 3032 2912 default_driver = &intel_pstate; 3033 - } else if (!strcmp(str, "passive")) { 2913 + else if (!strcmp(str, "passive")) 3034 2914 default_driver = &intel_cpufreq; 3035 - no_hwp = 1; 3036 - } 2915 + 3037 2916 if (!strcmp(str, "no_hwp")) { 3038 2917 pr_info("HWP disabled\n"); 3039 2918 no_hwp = 1;
+2
include/linux/cpufreq.h
···
 unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy);
 int cpufreq_register_governor(struct cpufreq_governor *governor);
 void cpufreq_unregister_governor(struct cpufreq_governor *governor);
+int cpufreq_start_governor(struct cpufreq_policy *policy);
+void cpufreq_stop_governor(struct cpufreq_policy *policy);

 #define cpufreq_governor_init(__governor) \
 static int __init __governor##_init(void) \
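The two declarations exported here are what allow `store_energy_performance_preference()` in intel_pstate.c to stop the governor around an EPP update in the passive mode and restart it afterwards. The sketch below models only the error handling of that sequence with stub functions. `struct fake_policy` and every function in it are hypothetical stand-ins, not kernel APIs, and reporting success when the value is unchanged is a simplification made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the policy and per-CPU state. */
struct fake_policy {
	int governor_running;
	int epp_cached;
	int write_result;	/* what the simulated MSR write returns */
	int start_result;	/* what the simulated governor start returns */
	int restarts;		/* how many times start was attempted */
};

static void stop_governor(struct fake_policy *p)
{
	p->governor_running = 0;
}

static int start_governor(struct fake_policy *p)
{
	p->restarts++;
	if (p->start_result == 0)
		p->governor_running = 1;
	return p->start_result;
}

static int write_epp(struct fake_policy *p, int epp)
{
	(void)epp;		/* the stub only reports a preset result */
	return p->write_result;
}

/*
 * Stop, update, restart. The governor is restarted even if the write
 * failed. A write error takes precedence over a restart error, and the
 * cache is updated only after a successful write.
 */
static int update_epp_passive(struct fake_policy *p, int epp)
{
	int ret, err;

	assert(p != NULL);
	if (p->epp_cached == epp)
		return 0;	/* skip the heavy-weight sequence */

	stop_governor(p);
	ret = write_epp(p, epp);
	err = start_governor(p);
	if (!ret) {
		p->epp_cached = epp;
		ret = err;
	}
	return ret;
}
```

Checking the cached value first matters because, as the comment in the diff puts it, stopping and restarting the governor is "super-heavy-weight".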