Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

lib/cpumask: add FORCE_NR_CPUS config option

The size of cpumasks is hard-limited by the compile-time parameter NR_CPUS,
but the actual number of CPUs is determined at boot time, when the kernel
parses ACPI/DT tables, and is stored in nr_cpu_ids. In many practical cases,
the number of CPUs for a target is known at compile time and can be
provided with NR_CPUS.

In that case, the compiler may be instructed to treat NR_CPUS as the actual
number of CPUs rather than an upper limit. This allows it to optimize many
cpumask routines and significantly shrink the kernel image.
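
To illustrate why a compile-time bound helps, here is a minimal user-space
sketch, not kernel code. DEMO_FORCE_NR_CPUS and demo_mask_weight() are
hypothetical names invented for this example; only NR_CPUS and nr_cpu_ids
come from the patch. When nr_cpu_ids is a macro, the loop bound is a
constant that the compiler can unroll or fold; when it is a variable, the
bound has to be loaded at run time.

```c
#define NR_CPUS 4

/* Hypothetical stand-in for CONFIG_FORCE_NR_CPUS; remove to get the
 * runtime-variable flavour of nr_cpu_ids. */
#define DEMO_FORCE_NR_CPUS 1

#ifdef DEMO_FORCE_NR_CPUS
/* Compile-time constant: loop bounds below are known to the compiler. */
#define nr_cpu_ids ((unsigned int)NR_CPUS)
#else
/* Runtime variable: every user has to load it from memory. */
static unsigned int nr_cpu_ids = NR_CPUS;
#endif

/* Count the bits set among the first nr_cpu_ids bits of a one-word mask.
 * This is an illustrative helper, not the kernel's cpumask_weight(). */
static unsigned int demo_mask_weight(unsigned long mask)
{
	unsigned int cpu, weight = 0;

	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
		if (mask & (1UL << cpu))
			weight++;

	return weight;
}
```

With the constant bound, bits above NR_CPUS are never examined, so a mask
with extra high bits set still yields a weight of at most NR_CPUS.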

This patch adds the FORCE_NR_CPUS option to teach the compiler to rely on
NR_CPUS and enable the corresponding optimizations.

If FORCE_NR_CPUS=y, the kernel will not set nr_cpu_ids at boot, but will
only check that the actual number of possible CPUs is equal to NR_CPUS, and
WARN if that does not hold.
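
The check-instead-of-assign behaviour can be mimicked in user space with
the sketch below. It mirrors the shape of set_nr_cpu_ids() from this patch,
but DEMO_FORCE_NR_CPUS, DEMO_WARN_ON(), demo_warn_count and
demo_set_nr_cpu_ids() are hypothetical stand-ins: the real WARN_ON() prints
a backtrace to the kernel log rather than bumping a counter.

```c
#include <stdio.h>

#define NR_CPUS 4

/* Hypothetical stand-in for CONFIG_FORCE_NR_CPUS. */
#define DEMO_FORCE_NR_CPUS 1

/* Number of warnings raised so far (exists only for this demo). */
static unsigned int demo_warn_count;

/* User-space approximation of WARN_ON(): report the condition and continue. */
#define DEMO_WARN_ON(cond)						\
	do {								\
		if (cond) {						\
			demo_warn_count++;				\
			fprintf(stderr, "WARNING: %s\n", #cond);	\
		}							\
	} while (0)

#if (NR_CPUS == 1) || defined(DEMO_FORCE_NR_CPUS)
#define nr_cpu_ids ((unsigned int)NR_CPUS)
#else
static unsigned int nr_cpu_ids = NR_CPUS;
#endif

/* With the forced value, a boot-time count can only be validated,
 * never stored. */
static void demo_set_nr_cpu_ids(unsigned int nr)
{
#if (NR_CPUS == 1) || defined(DEMO_FORCE_NR_CPUS)
	DEMO_WARN_ON(nr != nr_cpu_ids);
#else
	nr_cpu_ids = nr;
#endif
}
```

Calling demo_set_nr_cpu_ids(4) is silent here, while demo_set_nr_cpu_ids(2)
raises one warning and leaves nr_cpu_ids at NR_CPUS.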

The new option is especially useful in embedded applications because
kernel configurations are unique for each SoC, the number of CPUs is
constant and well known, and memory constraints are typically tighter.

For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)

Signed-off-by: Yury Norov <yury.norov@gmail.com>

+17 -4
+7 -3
include/linux/cpumask.h
@@ -35,16 +35,20 @@
  */
 #define cpumask_pr_args(maskp) nr_cpu_ids, cpumask_bits(maskp)

-#if NR_CPUS == 1
-#define nr_cpu_ids 1U
+#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
+#define nr_cpu_ids ((unsigned int)NR_CPUS)
 #else
 extern unsigned int nr_cpu_ids;
+#endif

 static inline void set_nr_cpu_ids(unsigned int nr)
 {
+#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
+	WARN_ON(nr != nr_cpu_ids);
+#else
 	nr_cpu_ids = nr;
-}
 #endif
+}

 /* Deprecated. Always use nr_cpu_ids. */
 #define nr_cpumask_bits nr_cpu_ids
+1 -1
kernel/smp.c
@@ -1088,7 +1088,7 @@

 early_param("maxcpus", maxcpus);

-#if (NR_CPUS > 1)
+#if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
 /* Setup number of possible processor ids */
 unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
 EXPORT_SYMBOL(nr_cpu_ids);
+9
lib/Kconfig
@@ -527,6 +527,15 @@
 	  them on the stack.  This is a bit more expensive, but avoids
 	  stack overflow.

+config FORCE_NR_CPUS
+	bool "NR_CPUS is set to an actual number of CPUs"
+	depends on SMP
+	help
+	  Say Yes if you have NR_CPUS set to an actual number of possible
+	  CPUs in your system, not to a default value. This forces the core
+	  code to rely on compile-time value and optimize kernel routines
+	  better.
+
 config CPU_RMAP
 	bool
 	depends on SMP