Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf pmu: Make the loading of formats lazy

The sysfs format files are loaded eagerly in a PMU. Add a flag so that
we create the format but only load the contents when necessary.
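
As a rough illustration of the lazy-loading idea described above (a minimal sketch, not the perf implementation): an entry is created knowing only its path, and the file is opened the first time its contents are needed. The names `struct lazy_format`, `lazy_format__load()`, `lazy_format__contents()` and the `lazy_format_opens` counter are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a format entry; not the perf definition. */
struct lazy_format {
	const char *path;   /* file that holds the contents */
	char contents[64];  /* filled in on first use */
	bool loaded;        /* true once the file has been read */
};

/* Number of successful opens, kept only to make the laziness visible. */
int lazy_format_opens;

/* Read the file at most once; later calls return immediately. */
void lazy_format__load(struct lazy_format *format)
{
	FILE *file;
	size_t len;

	if (format->loaded)
		return;

	file = fopen(format->path, "r");
	if (!file)
		return;
	lazy_format_opens++;

	len = fread(format->contents, 1, sizeof(format->contents) - 1, file);
	format->contents[len] = '\0';
	fclose(file);
	format->loaded = true;
}

/* Every accessor goes through the loader before touching the contents. */
const char *lazy_format__contents(struct lazy_format *format)
{
	lazy_format__load(format);
	return format->contents;
}
```

Entries whose contents are never requested cost no open at all in this sketch, which is the effect the openat reduction quoted in this message relies on.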

Reduce the size of the value in struct perf_pmu_format and avoid holes
so there is no additional space requirement.
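
The layout argument can be checked with a small sketch. The two structs below are simplified stand-ins written for this example (`struct list_node`, `FORMAT_BITS`, `format_before`, `format_after` and `format_layout_growth()` are assumed names, not perf definitions); they only mirror the member order before and after the change. On a typical LP64 target both should come to 40 bytes: the old order leaves a 4-byte hole after the `int`, while the new order packs the 16-bit value and the flag into the tail.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: a 64-bit wide bitmap of config bits. */
#define FORMAT_BITS 64
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
#define FORMAT_LONGS ((FORMAT_BITS + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Minimal doubly linked list node, standing in for struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Member order before the change: an int followed by a wider member. */
struct format_before {
	char *name;
	int value;
	unsigned long bits[FORMAT_LONGS];
	struct list_node list;
};

/* Member order after the change: widest members first, small ones last. */
struct format_after {
	struct list_node list;
	unsigned long bits[FORMAT_LONGS];
	char *name;
	uint16_t value;
	bool loaded;
};

/* Bytes added by the new layout; zero or less means no growth. */
long format_layout_growth(void)
{
	return (long)sizeof(struct format_after) -
	       (long)sizeof(struct format_before);
}
```

Running `pahole` on the real object file is the more direct way to confirm the layout; this sketch only shows why adding a flag need not grow the struct.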

For "perf stat -e cycles true" this reduces the number of openat calls
from 648 to 573 (about 12%). The benchmark pmu scan speed is improved
by roughly 5%.

Before:

$ perf bench internals pmu-scan
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 1061.100 usec (+- 9.965 usec)
Average PMU scanning took: 4725.300 usec (+- 260.599 usec)

After:

$ perf bench internals pmu-scan
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 989.170 usec (+- 6.873 usec)
Average PMU scanning took: 4520.960 usec (+- 251.272 usec)
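
The percentages quoted in this message can be reproduced from the raw figures with a one-line helper. `reduction_pct()` is a hypothetical name used only for this sketch.

```c
#include <assert.h>

/* Percentage by which `after` is smaller than `before`. */
double reduction_pct(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

With the numbers above, `reduction_pct(648, 573)` is about 11.6 for the openat calls, `reduction_pct(1061.100, 989.170)` is about 6.8 for the core PMU scan, and `reduction_pct(4725.300, 4520.960)` is about 4.3 for the full PMU scan, which brackets the "roughly 5%" figure.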

Committer testing:

On an AMD Ryzen 5950x:

Before:

$ perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 563.466 usec (+- 1.008 usec)
Average PMU scanning took: 1619.174 usec (+- 23.627 usec)
$ perf stat -r5 perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 583.401 usec (+- 2.098 usec)
Average PMU scanning took: 1677.352 usec (+- 24.636 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 553.254 usec (+- 0.825 usec)
Average PMU scanning took: 1635.655 usec (+- 24.312 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 557.733 usec (+- 0.980 usec)
Average PMU scanning took: 1600.659 usec (+- 23.344 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 554.906 usec (+- 0.774 usec)
Average PMU scanning took: 1595.338 usec (+- 23.288 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 551.798 usec (+- 0.967 usec)
Average PMU scanning took: 1623.213 usec (+- 23.998 usec)

Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

      3276.82 msec task-clock:u              # 0.990 CPUs utilized          ( +- 0.82% )
            0      context-switches:u        # 0.000 /sec
            0      cpu-migrations:u          # 0.000 /sec
         1008      page-faults:u             # 307.615 /sec                 ( +- 0.04% )
  12049614778      cycles:u                  # 3.677 GHz                    ( +- 0.07% ) (83.34%)
    117507478      stalled-cycles-frontend:u # 0.98% frontend cycles idle   ( +- 0.33% ) (83.32%)
     27106761      stalled-cycles-backend:u  # 0.22% backend cycles idle    ( +- 9.55% ) (83.36%)
  33294953848      instructions:u            # 2.76 insn per cycle
                                             # 0.00 stalled cycles per insn ( +- 0.03% ) (83.31%)
   6849825049      branches:u                # 2.090 G/sec                  ( +- 0.03% ) (83.37%)
     71533903      branch-misses:u           # 1.04% of all branches        ( +- 0.20% ) (83.30%)

3.3088 +- 0.0302 seconds time elapsed ( +- 0.91% )

$

After:

$ perf stat -r5 perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 550.702 usec (+- 0.958 usec)
Average PMU scanning took: 1566.577 usec (+- 22.747 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 548.315 usec (+- 0.555 usec)
Average PMU scanning took: 1565.499 usec (+- 22.760 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 548.073 usec (+- 0.555 usec)
Average PMU scanning took: 1586.097 usec (+- 23.299 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 561.184 usec (+- 2.709 usec)
Average PMU scanning took: 1567.153 usec (+- 22.548 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 546.987 usec (+- 0.553 usec)
Average PMU scanning took: 1562.814 usec (+- 22.729 usec)

Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

      3170.86 msec task-clock:u              # 0.992 CPUs utilized          ( +- 0.22% )
            0      context-switches:u        # 0.000 /sec
            0      cpu-migrations:u          # 0.000 /sec
         1010      page-faults:u             # 318.526 /sec                 ( +- 0.04% )
  11890047674      cycles:u                  # 3.750 GHz                    ( +- 0.14% ) (83.27%)
    119090499      stalled-cycles-frontend:u # 1.00% frontend cycles idle   ( +- 0.46% ) (83.40%)
     32502449      stalled-cycles-backend:u  # 0.27% backend cycles idle    ( +- 8.32% ) (83.30%)
  33119141261      instructions:u            # 2.79 insn per cycle
                                             # 0.00 stalled cycles per insn ( +- 0.01% ) (83.37%)
   6812816561      branches:u                # 2.149 G/sec                  ( +- 0.01% ) (83.29%)
     70157855      branch-misses:u           # 1.03% of all branches        ( +- 0.28% ) (83.38%)

3.19710 +- 0.00826 seconds time elapsed ( +- 0.26% )

$

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Authored by Ian Rogers and committed by Arnaldo Carvalho de Melo
50402641 9e1f1693

+106 -65
+1 -1
tools/perf/tests/pmu.c
···
 	}
 
 	pmu->name = strdup("perf-pmu-test");
-	ret = perf_pmu__format_parse(pmu, fd);
+	ret = perf_pmu__format_parse(pmu, fd, /*eager_load=*/true);
 	if (ret)
 		goto out;
 
+96 -48
tools/perf/util/pmu.c
···
  * value=PERF_PMU_FORMAT_VALUE_CONFIG and bits 0 to 7 will be set.
  */
 struct perf_pmu_format {
+	/** @list: Element on list within struct perf_pmu. */
+	struct list_head list;
+	/** @bits: Which config bits are set by this format value. */
+	DECLARE_BITMAP(bits, PERF_PMU_FORMAT_BITS);
 	/** @name: The modifier/file name. */
 	char *name;
 	/**
···
 	 * are from PERF_PMU_FORMAT_VALUE_CONFIG to
 	 * PERF_PMU_FORMAT_VALUE_CONFIG_END.
 	 */
-	int value;
-	/** @bits: Which config bits are set by this format value. */
-	DECLARE_BITMAP(bits, PERF_PMU_FORMAT_BITS);
-	/** @list: Element on list within struct perf_pmu. */
-	struct list_head list;
+	u16 value;
+	/** @loaded: Has the contents been loaded/parsed. */
+	bool loaded;
 };
+
+static struct perf_pmu_format *perf_pmu__new_format(struct list_head *list, char *name)
+{
+	struct perf_pmu_format *format;
+
+	format = zalloc(sizeof(*format));
+	if (!format)
+		return NULL;
+
+	format->name = strdup(name);
+	if (!format->name) {
+		free(format);
+		return NULL;
+	}
+	list_add_tail(&format->list, list);
+	return format;
+}
+
+/* Called at the end of parsing a format. */
+void perf_pmu_format__set_value(void *vformat, int config, unsigned long *bits)
+{
+	struct perf_pmu_format *format = vformat;
+
+	format->value = config;
+	memcpy(format->bits, bits, sizeof(format->bits));
+}
+
+static void __perf_pmu_format__load(struct perf_pmu_format *format, FILE *file)
+{
+	void *scanner;
+	int ret;
+
+	ret = perf_pmu_lex_init(&scanner);
+	if (ret)
+		return;
+
+	perf_pmu_set_in(file, scanner);
+	ret = perf_pmu_parse(format, scanner);
+	perf_pmu_lex_destroy(scanner);
+	format->loaded = true;
+}
+
+static void perf_pmu_format__load(struct perf_pmu *pmu, struct perf_pmu_format *format)
+{
+	char path[PATH_MAX];
+	FILE *file = NULL;
+
+	if (format->loaded)
+		return;
+
+	if (!perf_pmu__pathname_scnprintf(path, sizeof(path), pmu->name, "format"))
+		return;
+
+	assert(strlen(path) + strlen(format->name) + 2 < sizeof(path));
+	strcat(path, "/");
+	strcat(path, format->name);
+
+	file = fopen(path, "r");
+	if (!file)
+		return;
+	__perf_pmu_format__load(format, file);
+	fclose(file);
+}
 
 /*
  * Parse & process all the sysfs attributes located under
  * the directory specified in 'dir' parameter.
  */
-int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd)
+int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd, bool eager_load)
 {
 	struct dirent *evt_ent;
 	DIR *format_dir;
···
 	if (!format_dir)
 		return -EINVAL;
 
-	while (!ret && (evt_ent = readdir(format_dir))) {
+	while ((evt_ent = readdir(format_dir)) != NULL) {
+		struct perf_pmu_format *format;
 		char *name = evt_ent->d_name;
-		int fd;
-		void *scanner;
-		FILE *file;
 
 		if (!strcmp(name, ".") || !strcmp(name, ".."))
 			continue;
 
-
-		ret = -EINVAL;
-		fd = openat(dirfd, name, O_RDONLY);
-		if (fd < 0)
-			break;
-
-		file = fdopen(fd, "r");
-		if (!file) {
-			close(fd);
+		format = perf_pmu__new_format(&pmu->format, name);
+		if (!format) {
+			ret = -ENOMEM;
 			break;
 		}
 
-		ret = perf_pmu_lex_init(&scanner);
-		if (ret) {
+		if (eager_load) {
+			FILE *file;
+			int fd = openat(dirfd, name, O_RDONLY);
+
+			if (fd < 0) {
+				ret = -errno;
+				break;
+			}
+			file = fdopen(fd, "r");
+			if (!file) {
+				close(fd);
+				break;
+			}
+			__perf_pmu_format__load(format, file);
 			fclose(file);
-			break;
 		}
-
-		perf_pmu_set_in(file, scanner);
-		ret = perf_pmu_parse(&pmu->format, name, scanner);
-		perf_pmu_lex_destroy(scanner);
-		fclose(file);
 	}
 
 	closedir(format_dir);
···
 		return 0;
 
 	/* it'll close the fd */
-	if (perf_pmu__format_parse(pmu, fd))
+	if (perf_pmu__format_parse(pmu, fd, /*eager_load=*/false))
 		return -1;
 
 	return 0;
···
 	if (pmu == &perf_pmu__fake)
 		return;
 
-	list_for_each_entry(format, &pmu->format, list)
+	list_for_each_entry(format, &pmu->format, list) {
+		perf_pmu_format__load(pmu, format);
 		if (format->value >= PERF_PMU_FORMAT_VALUE_CONFIG_END) {
 			pr_warning("WARNING: '%s' format '%s' requires 'perf_event_attr::config%d'"
 				   "which is not supported by this version of perf!\n",
 				   pmu->name, format->name, format->value);
 			return;
 		}
+	}
 }
 
 bool evsel__is_aux_event(const struct evsel *evsel)
···
 	if (!format)
 		return -1;
 
+	perf_pmu_format__load(pmu, format);
 	return format->value;
 }
 
···
 		free(pmu_term);
 		return -EINVAL;
 	}
-
+	perf_pmu_format__load(pmu, format);
 	switch (format->value) {
 	case PERF_PMU_FORMAT_VALUE_CONFIG:
 		vp = &attr->config;
···
 
 	return NULL;
 }
-
-int perf_pmu__new_format(struct list_head *list, char *name,
-			 int config, unsigned long *bits)
-{
-	struct perf_pmu_format *format;
-
-	format = zalloc(sizeof(*format));
-	if (!format)
-		return -ENOMEM;
-
-	format->name = strdup(name);
-	format->value = config;
-	memcpy(format->bits, bits, sizeof(format->bits));
-
-	list_add_tail(&format->list, list);
-	return 0;
-}
-
 static void perf_pmu__del_formats(struct list_head *formats)
 {
 	struct perf_pmu_format *fmt, *tmp;
+2 -3
tools/perf/util/pmu.h
···
 				   struct perf_pmu_info *info);
 struct perf_pmu_alias *perf_pmu__find_alias(struct perf_pmu *pmu, const char *event);
 
-int perf_pmu__new_format(struct list_head *list, char *name,
-			 int config, unsigned long *bits);
-int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd);
+int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd, bool eager_load);
+void perf_pmu_format__set_value(void *format, int config, unsigned long *bits);
 bool perf_pmu__has_format(const struct perf_pmu *pmu, const char *name);
 
 bool is_pmu_core(const char *name);
+7 -13
tools/perf/util/pmu.y
···
 %define api.pure full
-%parse-param {struct list_head *format}
-%parse-param {char *name}
+%parse-param {void *format}
 %parse-param {void *scanner}
 %lex-param {void* scanner}
 
···
 		YYABORT; \
 } while (0)
 
-static void perf_pmu_error(struct list_head *list, char *name, void *scanner, char const *msg);
+static void perf_pmu_error(void *format, void *scanner, const char *msg);
 
 static void perf_pmu__set_format(unsigned long *bits, long from, long to)
 {
···
 format_term:
 PP_CONFIG ':' bits
 {
-	ABORT_ON(perf_pmu__new_format(format, name,
-				      PERF_PMU_FORMAT_VALUE_CONFIG,
-				      $3));
+	perf_pmu_format__set_value(format, PERF_PMU_FORMAT_VALUE_CONFIG, $3);
 }
 |
 PP_CONFIG PP_VALUE ':' bits
 {
-	ABORT_ON(perf_pmu__new_format(format, name,
-				      $2,
-				      $4));
+	perf_pmu_format__set_value(format, $2, $4);
 }
 
 bits:
···
 
 %%
 
-static void perf_pmu_error(struct list_head *list __maybe_unused,
-		char *name __maybe_unused,
-		void *scanner __maybe_unused,
-		char const *msg __maybe_unused)
+static void perf_pmu_error(void *format __maybe_unused,
+			   void *scanner __maybe_unused,
+			   const char *msg __maybe_unused)
 {
 }