Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'pm-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"Update the documentation of cpuidle governors that does not match the
code any more after previous functional changes (Rafael Wysocki) and
fix up the cpufreq Kconfig file broken inadvertently by a previous
update (Viresh Kumar)"

* tag 'pm-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Move endif to the end of Kconfig file
cpuidle: teo: Update documentation after previous changes
cpuidle: menu: Update documentation after previous changes

+80 -85
+30 -40
Documentation/admin-guide/pm/cpuidle.rst
···
 the CPU will ask the processor hardware to enter), it attempts to predict the
 idle duration and uses the predicted value for idle state selection.
 
-It first obtains the time until the closest timer event with the assumption
-that the scheduler tick will be stopped. That time, referred to as the *sleep
-length* in what follows, is the upper bound on the time before the next CPU
-wakeup. It is used to determine the sleep length range, which in turn is needed
-to get the sleep length correction factor.
-
-The ``menu`` governor maintains two arrays of sleep length correction factors.
-One of them is used when tasks previously running on the given CPU are waiting
-for some I/O operations to complete and the other one is used when that is not
-the case. Each array contains several correction factor values that correspond
-to different sleep length ranges organized so that each range represented in the
-array is approximately 10 times wider than the previous one.
-
-The correction factor for the given sleep length range (determined before
-selecting the idle state for the CPU) is updated after the CPU has been woken
-up and the closer the sleep length is to the observed idle duration, the closer
-to 1 the correction factor becomes (it must fall between 0 and 1 inclusive).
-The sleep length is multiplied by the correction factor for the range that it
-falls into to obtain the first approximation of the predicted idle duration.
-
-Next, the governor uses a simple pattern recognition algorithm to refine its
+It first uses a simple pattern recognition algorithm to obtain a preliminary
 idle duration prediction. Namely, it saves the last 8 observed idle duration
 values and, when predicting the idle duration next time, it computes the average
 and variance of them. If the variance is small (smaller than 400 square
···
 taken as the "typical interval" value and so on, until either the "typical
 interval" is determined or too many data points are disregarded, in which case
 the "typical interval" is assumed to equal "infinity" (the maximum unsigned
-integer value). The "typical interval" computed this way is compared with the
-sleep length multiplied by the correction factor and the minimum of the two is
-taken as the predicted idle duration.
+integer value).
 
-Then, the governor computes an extra latency limit to help "interactive"
-workloads. It uses the observation that if the exit latency of the selected
-idle state is comparable with the predicted idle duration, the total time spent
-in that state probably will be very short and the amount of energy to save by
-entering it will be relatively small, so likely it is better to avoid the
-overhead related to entering that state and exiting it. Thus selecting a
-shallower state is likely to be a better option then. The first approximation
-of the extra latency limit is the predicted idle duration itself which
-additionally is divided by a value depending on the number of tasks that
-previously ran on the given CPU and now they are waiting for I/O operations to
-complete. The result of that division is compared with the latency limit coming
-from the power management quality of service, or `PM QoS <cpu-pm-qos_>`_,
-framework and the minimum of the two is taken as the limit for the idle states'
-exit latency.
+If the "typical interval" computed this way is long enough, the governor obtains
+the time until the closest timer event with the assumption that the scheduler
+tick will be stopped. That time, referred to as the *sleep length* in what follows,
+is the upper bound on the time before the next CPU wakeup. It is used to determine
+the sleep length range, which in turn is needed to get the sleep length correction
+factor.
+
+The ``menu`` governor maintains an array containing several correction factor
+values that correspond to different sleep length ranges organized so that each
+range represented in the array is approximately 10 times wider than the previous
+one.
+
+The correction factor for the given sleep length range (determined before
+selecting the idle state for the CPU) is updated after the CPU has been woken
+up and the closer the sleep length is to the observed idle duration, the closer
+to 1 the correction factor becomes (it must fall between 0 and 1 inclusive).
+The sleep length is multiplied by the correction factor for the range that it
+falls into to obtain an approximation of the predicted idle duration that is
+compared to the "typical interval" determined previously and the minimum of
+the two is taken as the idle duration prediction.
+
+If the "typical interval" value is small, which means that the CPU is likely
+to be woken up soon enough, the sleep length computation is skipped as it may
+be costly and the idle duration is simply predicted to equal the "typical
+interval" value.
 
 Now, the governor is ready to walk the list of idle states and choose one of
 them. For this purpose, it compares the target residency of each state with
-the predicted idle duration and the exit latency of it with the computed latency
-limit. It selects the state with the target residency closest to the predicted
+the predicted idle duration and the exit latency of it with the latency
+limit coming from the power management quality of service, or `PM QoS <cpu-pm-qos_>`_,
+framework. It selects the state with the target residency closest to the predicted
 idle duration, but still below it, and exit latency that does not exceed the
 limit.
 
+2 -2
drivers/cpufreq/Kconfig
···
 	  This adds the CPUFreq driver support for Freescale QorIQ SoCs
 	  which are capable of changing the CPU's frequency dynamically.
 
-endif
-
 config ACPI_CPPC_CPUFREQ
 	tristate "CPUFreq driver based on the ACPI CPPC spec"
 	depends on ACPI_PROCESSOR
···
 	  by using CPPC delivered and reference performance counters.
 
 	  If in doubt, say N.
+
+endif
 
 endmenu
+48 -43
drivers/cpuidle/governors/teo.c
···
  * DOC: teo-description
  *
  * The idea of this governor is based on the observation that on many systems
- * timer events are two or more orders of magnitude more frequent than any
- * other interrupts, so they are likely to be the most significant cause of CPU
- * wakeups from idle states. Moreover, information about what happened in the
- * (relatively recent) past can be used to estimate whether or not the deepest
- * idle state with target residency within the (known) time till the closest
- * timer event, referred to as the sleep length, is likely to be suitable for
- * the upcoming CPU idle period and, if not, then which of the shallower idle
- * states to choose instead of it.
+ * timer interrupts are two or more orders of magnitude more frequent than any
+ * other interrupt types, so they are likely to dominate CPU wakeup patterns.
+ * Moreover, in principle, the time when the next timer event is going to occur
+ * can be determined at the idle state selection time, although doing that may
+ * be costly, so it can be regarded as the most reliable source of information
+ * for idle state selection.
  *
- * Of course, non-timer wakeup sources are more important in some use cases
- * which can be covered by taking a few most recent idle time intervals of the
- * CPU into account. However, even in that context it is not necessary to
- * consider idle duration values greater than the sleep length, because the
- * closest timer will ultimately wake up the CPU anyway unless it is woken up
- * earlier.
+ * Of course, non-timer wakeup sources are more important in some use cases,
+ * but even then it is generally unnecessary to consider idle duration values
+ * greater than the time till the next timer event, referred to as the sleep
+ * length in what follows, because the closest timer will ultimately wake up the
+ * CPU anyway unless it is woken up earlier.
  *
- * Thus this governor estimates whether or not the prospective idle duration of
- * a CPU is likely to be significantly shorter than the sleep length and selects
- * an idle state for it accordingly.
+ * However, since obtaining the sleep length may be costly, the governor first
+ * checks if it can select a shallow idle state using wakeup pattern information
+ * from recent times, in which case it can do without knowing the sleep length
+ * at all. For this purpose, it counts CPU wakeup events and looks for an idle
+ * state whose target residency has not exceeded the idle duration (measured
+ * after wakeup) in the majority of relevant recent cases. If the target
+ * residency of that state is small enough, it may be used right away and the
+ * sleep length need not be determined.
  *
  * The computations carried out by this governor are based on using bins whose
  * boundaries are aligned with the target residency parameter values of the CPU
···
  * idle state 2, the third bin spans from the target residency of idle state 2
  * up to, but not including, the target residency of idle state 3 and so on.
  * The last bin spans from the target residency of the deepest idle state
- * supplied by the driver to infinity.
+ * supplied by the driver to the scheduler tick period length or to infinity if
+ * the tick period length is less than the target residency of that state. In
+ * the latter case, the governor also counts events with the measured idle
+ * duration between the tick period length and the target residency of the
+ * deepest idle state.
  *
  * Two metrics called "hits" and "intercepts" are associated with each bin.
  * They are updated every time before selecting an idle state for the given CPU
···
  * sleep length and the idle duration measured after CPU wakeup fall into the
  * same bin (that is, the CPU appears to wake up "on time" relative to the sleep
  * length). In turn, the "intercepts" metric reflects the relative frequency of
- * situations in which the measured idle duration is so much shorter than the
- * sleep length that the bin it falls into corresponds to an idle state
- * shallower than the one whose bin is fallen into by the sleep length (these
- * situations are referred to as "intercepts" below).
+ * non-timer wakeup events for which the measured idle duration falls into a bin
+ * that corresponds to an idle state shallower than the one whose bin is fallen
+ * into by the sleep length (these events are also referred to as "intercepts"
+ * below).
  *
  * In order to select an idle state for a CPU, the governor takes the following
  * steps (modulo the possible latency constraint that must be taken into account
  * too):
  *
- * 1. Find the deepest CPU idle state whose target residency does not exceed
- *    the current sleep length (the candidate idle state) and compute 2 sums as
- *    follows:
+ * 1. Find the deepest enabled CPU idle state (the candidate idle state) and
+ *    compute 2 sums as follows:
  *
- *    - The sum of the "hits" and "intercepts" metrics for the candidate state
- *      and all of the deeper idle states (it represents the cases in which the
- *      CPU was idle long enough to avoid being intercepted if the sleep length
- *      had been equal to the current one).
+ *    - The sum of the "hits" metric for all of the idle states shallower than
+ *      the candidate one (it represents the cases in which the CPU was likely
+ *      woken up by a timer).
  *
- *    - The sum of the "intercepts" metrics for all of the idle states shallower
- *      than the candidate one (it represents the cases in which the CPU was not
- *      idle long enough to avoid being intercepted if the sleep length had been
- *      equal to the current one).
+ *    - The sum of the "intercepts" metric for all of the idle states shallower
+ *      than the candidate one (it represents the cases in which the CPU was
+ *      likely woken up by a non-timer wakeup source).
  *
- * 2. If the second sum is greater than the first one the CPU is likely to wake
- *    up early, so look for an alternative idle state to select.
+ * 2. If the second sum computed in step 1 is greater than a half of the sum of
+ *    both metrics for the candidate state bin and all subsequent bins (if any),
+ *    a shallower idle state is likely to be more suitable, so look for it.
  *
- *    - Traverse the idle states shallower than the candidate one in the
+ *    - Traverse the enabled idle states shallower than the candidate one in the
  *      descending order.
  *
  *    - For each of them compute the sum of the "intercepts" metrics over all
  *      of the idle states between it and the candidate one (including the
  *      former and excluding the latter).
  *
- *    - If each of these sums that needs to be taken into account (because the
- *      check related to it has indicated that the CPU is likely to wake up
- *      early) is greater than a half of the corresponding sum computed in step
- *      1 (which means that the target residency of the state in question had
- *      not exceeded the idle duration in over a half of the relevant cases),
- *      select the given idle state instead of the candidate one.
+ *    - If this sum is greater than a half of the second sum computed in step 1,
+ *      use the given idle state as the new candidate one.
  *
- * 3. By default, select the candidate state.
+ * 3. If the current candidate state is state 0 or its target residency is short
+ *    enough, return it and prevent the scheduler tick from being stopped.
+ *
+ * 4. Obtain the sleep length value and check if it is below the target
+ *    residency of the current candidate state, in which case a new shallower
+ *    candidate state needs to be found, so look for it.
  */
 
 #include <linux/cpuidle.h>
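The four selection steps in the updated teo comment can be sketched in plain C as below. This is a minimal userspace illustration following the wording of the comment, not the code in teo.c: the struct and function names (sketch_state, sketch_bin, sketch_pick_candidate, sketch_select) and the short_residency threshold are hypothetical, state 0 is assumed to be always usable, the latency constraint is ignored, and the sleep length is passed in as a value even though the governor would only obtain it when step 3 does not return. The "hits" sum from step 1 is left out because the comparison in step 2, as the comment words it, does not use it directly.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_state {
	unsigned int target_residency;
	bool enabled;
};

struct sketch_bin {
	unsigned int hits;
	unsigned int intercepts;
};

/* Steps 1 and 2: start from the deepest enabled state, maybe go shallower. */
static int sketch_pick_candidate(const struct sketch_state *states,
				 const struct sketch_bin *bins, int nstates)
{
	unsigned int shallower_intercepts = 0, deeper_total = 0, sum = 0;
	int candidate = 0, i;

	assert(nstates > 0);

	/* Step 1: deepest enabled state (state 0 is the fallback). */
	for (i = nstates - 1; i > 0; i--) {
		if (states[i].enabled) {
			candidate = i;
			break;
		}
	}
	for (i = 0; i < candidate; i++)
		shallower_intercepts += bins[i].intercepts;
	for (i = candidate; i < nstates; i++)
		deeper_total += bins[i].hits + bins[i].intercepts;

	/* Step 2: intercepts must outweigh half of the deeper bins' total. */
	if (2 * shallower_intercepts <= deeper_total)
		return candidate;

	for (i = candidate - 1; i >= 0; i--) {
		sum += bins[i].intercepts; /* bins from state i up to the candidate */
		if (states[i].enabled && 2 * sum > shallower_intercepts)
			return i;
	}
	return candidate;
}

/* Steps 3 and 4: return early for shallow states, else check sleep length. */
static int sketch_select(const struct sketch_state *states,
			 const struct sketch_bin *bins, int nstates,
			 unsigned int short_residency, unsigned int sleep_length)
{
	int idx = sketch_pick_candidate(states, bins, nstates);

	/* Step 3: per the comment, the tick is not stopped in this case. */
	if (idx == 0 || states[idx].target_residency <= short_residency)
		return idx;

	/* Step 4: move to shallower enabled states that fit the sleep length. */
	while (idx > 0 && states[idx].target_residency > sleep_length) {
		do {
			idx--;
		} while (idx > 0 && !states[idx].enabled);
	}
	return idx;
}
```

With four states whose target residencies are 0, 10, 100 and 1000, a history dominated by intercepts in the second bin makes the sketch settle on state 1 without consulting the sleep length, which is the shortcut the new comment describes.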