Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
"This tree includes an x86 PMU scheduling fix, but most changes are
late breaking tooling fixes and updates:

User visible fixes:

- Create config.detected into OUTPUT directory, fixing parallel
builds sharing the same source directory (Aaro Kiskinen)

- Allow to specify custom linker command, fixing some MIPS64 builds.
(Aaro Kiskinen)

- Fix to show proper convergence stats in 'perf bench numa' (Srikar
Dronamraju)

User visible changes:

- Validate syscall list passed via -e argument to 'perf trace'.
(Arnaldo Carvalho de Melo)

- Introduce 'perf stat --per-thread' (Jiri Olsa)

- Check access permission for --kallsyms and --vmlinux (Li Zhang)

- Move toggling event logic from 'perf top' and into hists browser,
allowing freeze/unfreeze with event lists with more than one entry
(Namhyung Kim)

- Add missing newlines when dumping PERF_RECORD_FINISHED_ROUND and
showing the Aggregated stats in 'perf report -D' (Adrian Hunter)

Infrastructure fixes:

- Add missing break for PERF_RECORD_ITRACE_START, which caused those
events samples to be parsed as well as PERF_RECORD_LOST_SAMPLES.
ITRACE_START only appears when Intel PT or BTS are present, so..
(Jiri Olsa)

- Call the perf_session destructor when bailing out in the inject,
kmem, report, kvm and mem tools (Taeung Song)

Infrastructure changes:

- Move stuff out of 'perf stat' and into the lib for further use
(Jiri Olsa)

- Reference count the cpu_map and thread_map classes (Jiri Olsa)

- Set evsel->{cpus,threads} from the evlist, if not set, allowing the
generalization of some 'perf stat' functions that previously were
accessing private static evlist variable (Jiri Olsa)

- Delete an unnecessary check before the calling free_event_desc()
(Markus Elfring)

- Allow auxtrace data alignment (Adrian Hunter)

- Allow events with dot (Andi Kleen)

- Fix failure to 'perf probe' events on arm (He Kuang)

- Add testing for Makefile.perf (Jiri Olsa)

- Add test for make install with prefix (Jiri Olsa)

- Fix single target build dependency check (Jiri Olsa)

- Access thread_map entries via accessors, prep patch to hold more
info per entry, for ongoing 'perf stat --per-thread' work (Jiri
Olsa)

- Use __weak definition from compiler.h (Sukadev Bhattiprolu)

- Split perf_pmu__new_alias() (Sukadev Bhattiprolu)"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
perf tools: Allow to specify custom linker command
perf tools: Create config.detected into OUTPUT directory
perf mem: Fill in the missing session freeing after an error occurs
perf kvm: Fill in the missing session freeing after an error occurs
perf report: Fill in the missing session freeing after an error occurs
perf kmem: Fill in the missing session freeing after an error occurs
perf inject: Fill in the missing session freeing after an error occurs
perf tools: Add missing break for PERF_RECORD_ITRACE_START
perf/x86: Fix 'active_events' imbalance
perf symbols: Check access permission when reading symbol files
perf stat: Introduce --per-thread option
perf stat: Introduce print_counters function
perf stat: Using init_stats instead of memset
perf stat: Rename print_interval to process_interval
perf stat: Remove perf_evsel__read_cb function
perf stat: Move perf_stat initialization counter process code
perf stat: Move zero_per_pkg into counter process code
perf stat: Separate counters reading and processing
perf stat: Introduce read_counters function
perf stat: Introduce perf_evsel__read function
...
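
The "missing break" fix listed above is easier to follow in isolation. The following is a minimal sketch and not the perf code itself: the enum values, the struct, and both function names are hypothetical stand-ins for the PERF_RECORD_* handling. It only shows how a case without a break also runs the case that follows it.

```c
/* Hypothetical record types standing in for PERF_RECORD_* values. */
enum record_type { REC_ITRACE_START, REC_LOST_SAMPLES, REC_OTHER };

/* Counts how often each parsing branch ran. */
struct parse_counts {
	int itrace_start;
	int lost_samples;
};

/* Buggy variant: REC_ITRACE_START also runs the REC_LOST_SAMPLES branch. */
static void parse_buggy(enum record_type type, struct parse_counts *c)
{
	switch (type) {
	case REC_ITRACE_START:
		c->itrace_start++;
		/* fall through - this is the missing break */
	case REC_LOST_SAMPLES:
		c->lost_samples++;
		break;
	default:
		break;
	}
}

/* Fixed variant: each record type is handled by its own branch only. */
static void parse_fixed(enum record_type type, struct parse_counts *c)
{
	switch (type) {
	case REC_ITRACE_START:
		c->itrace_start++;
		break;
	case REC_LOST_SAMPLES:
		c->lost_samples++;
		break;
	default:
		break;
	}
}
```

Feeding one REC_ITRACE_START record to each variant bumps both counters in the buggy version and only the first counter in the fixed one.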

+886 -423
+13 -23
arch/x86/kernel/cpu/perf_event.c
@@ -357,34 +357,24 @@
  */
 int x86_add_exclusive(unsigned int what)
 {
-	int ret = -EBUSY, i;
+	int i;
 
-	if (atomic_inc_not_zero(&x86_pmu.lbr_exclusive[what]))
-		return 0;
-
-	mutex_lock(&pmc_reserve_mutex);
-	for (i = 0; i < ARRAY_SIZE(x86_pmu.lbr_exclusive); i++) {
-		if (i != what && atomic_read(&x86_pmu.lbr_exclusive[i]))
-			goto out;
+	if (!atomic_inc_not_zero(&x86_pmu.lbr_exclusive[what])) {
+		mutex_lock(&pmc_reserve_mutex);
+		for (i = 0; i < ARRAY_SIZE(x86_pmu.lbr_exclusive); i++) {
+			if (i != what && atomic_read(&x86_pmu.lbr_exclusive[i]))
+				goto fail_unlock;
+		}
+		atomic_inc(&x86_pmu.lbr_exclusive[what]);
+		mutex_unlock(&pmc_reserve_mutex);
 	}
 
-	atomic_inc(&x86_pmu.lbr_exclusive[what]);
-	ret = 0;
+	atomic_inc(&active_events);
+	return 0;
 
-out:
+fail_unlock:
 	mutex_unlock(&pmc_reserve_mutex);
-
-	/*
-	 * Assuming that all exclusive events will share the PMI handler
-	 * (which checks active_events for whether there is work to do),
-	 * we can bump active_events counter right here, except for
-	 * x86_lbr_exclusive_lbr events that go through x86_pmu_event_init()
-	 * path, which already bumps active_events for them.
-	 */
-	if (!ret && what != x86_lbr_exclusive_lbr)
-		atomic_inc(&active_events);
-
-	return ret;
+	return -EBUSY;
 }
 
 void x86_del_exclusive(unsigned int what)
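
The shape of the reworked function (lock-free fast path, mutex-protected slow path, one unconditional active-events increment on every success) can be sketched in user space. This is an illustration under assumptions, not kernel code: it uses C11 atomics and a pthread mutex in place of the kernel primitives, the names `excl`, `active`, `NR_EXCL`, `add_exclusive` and `del_exclusive` are hypothetical, and the symmetric release function is assumed rather than taken from the diff above.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

#define NR_EXCL 3	/* hypothetical number of exclusive classes */

static atomic_int excl[NR_EXCL];	/* per-class reservation counts */
static atomic_int active;		/* stand-in for active_events */
static pthread_mutex_t reserve_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Increment *v unless it is zero; returns 1 if the increment happened. */
static int inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return 1;
	}
	return 0;
}

static int add_exclusive(unsigned int what)
{
	/* Fast path: the class is already reserved, just take another ref. */
	if (!inc_not_zero(&excl[what])) {
		/* Slow path: the first user must check the other classes. */
		pthread_mutex_lock(&reserve_mutex);
		for (unsigned int i = 0; i < NR_EXCL; i++) {
			if (i != what && atomic_load(&excl[i])) {
				pthread_mutex_unlock(&reserve_mutex);
				return -EBUSY;
			}
		}
		atomic_fetch_add(&excl[what], 1);
		pthread_mutex_unlock(&reserve_mutex);
	}

	/* Every successful reservation bumps the active counter exactly once. */
	atomic_fetch_add(&active, 1);
	return 0;
}

/* Assumed symmetric release: undo both increments. */
static void del_exclusive(unsigned int what)
{
	atomic_fetch_sub(&excl[what], 1);
	atomic_fetch_sub(&active, 1);
}
```

Because the increment no longer depends on which path or which class was taken, each successful `add_exclusive()` is matched by exactly one decrement in `del_exclusive()`, which is the balance the commit title refers to.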
+1 -1
tools/build/Makefile.build
@@ -25,7 +25,7 @@
 include $(build-dir)/Build.include
 
 # do not force detected configuration
--include .config-detected
+-include $(OUTPUT).config-detected
 
 # Init all relevant variables used in build files so
 # 1) they have correct type
+4
tools/perf/Documentation/perf-stat.txt
@@ -144,6 +144,10 @@
 use --per-core in addition to -a. (system-wide). The output includes the
 core number and the number of online logical processors on that physical processor.
 
+--per-thread::
+Aggregate counts per monitored threads, when monitoring threads (-t option)
+or processes (-p option).
+
 -D msecs::
 --delay msecs::
 After starting the program, wait msecs before measuring. This is useful to
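
Per-thread aggregation, as documented here, amounts to summing each thread's readings across all CPUs. The sketch below shows that reduction on a flat table. The struct layout, the row-major (cpu, thread) indexing, and the names `counts_table`, `counts_at` and `sum_thread` are assumptions made for illustration, not perf's own data structures.

```c
#include <stdint.h>

/* One counter reading: raw value plus enabled and running times. */
struct counts_values {
	uint64_t val, ena, run;
};

/* Hypothetical flat table of readings, indexed by (cpu, thread). */
struct counts_table {
	int ncpus;
	int nthreads;
	struct counts_values *values;	/* ncpus * nthreads entries */
};

/* Accessor for the reading of one (cpu, thread) pair. */
static struct counts_values *counts_at(struct counts_table *t,
				       int cpu, int thread)
{
	return &t->values[cpu * t->nthreads + thread];
}

/* Sum one thread's readings across every CPU. */
static struct counts_values sum_thread(struct counts_table *t, int thread)
{
	struct counts_values sum = { 0, 0, 0 };

	for (int cpu = 0; cpu < t->ncpus; cpu++) {
		struct counts_values *v = counts_at(t, cpu, thread);

		sum.val += v->val;
		sum.ena += v->ena;
		sum.run += v->run;
	}
	return sum;
}
```

Going through an accessor rather than indexing an array directly keeps callers independent of the storage layout, so an extra dimension can be added without touching them.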
+2 -2
tools/perf/Makefile
@@ -83,8 +83,8 @@
 #
 # All other targets get passed through:
 #
-%:
+%: FORCE
 	$(print_msg)
 	$(make)
 
-.PHONY: tags TAGS
+.PHONY: tags TAGS FORCE Makefile
+2 -2
tools/perf/Makefile.perf
@@ -110,7 +110,7 @@
 	$(Q)touch $(OUTPUT)PERF-VERSION-FILE
 
 CC = $(CROSS_COMPILE)gcc
-LD = $(CROSS_COMPILE)ld
+LD ?= $(CROSS_COMPILE)ld
 AR = $(CROSS_COMPILE)ar
 PKG_CONFIG = $(CROSS_COMPILE)pkg-config
 
@@ -545,7 +545,7 @@
 clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean config-clean
 	$(call QUIET_CLEAN, core-objs) $(RM) $(LIB_FILE) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
 	$(Q)find . -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
-	$(Q)$(RM) .config-detected
+	$(Q)$(RM) $(OUTPUT).config-detected
 	$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32
 	$(call QUIET_CLEAN, core-gen) $(RM) *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex*
 	$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
+4 -3
tools/perf/builtin-inject.c
@@ -630,12 +630,13 @@
 	if (inject.session == NULL)
 		return -1;
 
-	if (symbol__init(&inject.session->header.env) < 0)
-		return -1;
+	ret = symbol__init(&inject.session->header.env);
+	if (ret < 0)
+		goto out_delete;
 
 	ret = __cmd_inject(&inject);
 
+out_delete:
 	perf_session__delete(inject.session);
-
 	return ret;
 }
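
The inject, kmem, kvm, mem and report fixes all apply the same single-exit cleanup idiom: once the session exists, every error path jumps to one label that deletes it. A minimal sketch of that idiom follows. The `session` type, the `live_sessions` counter and `run_tool()` are hypothetical, and the `init_result` parameter merely simulates the return value of an initialisation step such as symbol__init().

```c
#include <stdlib.h>

/* Hypothetical session object standing in for struct perf_session. */
struct session {
	int id;
};

static int live_sessions;	/* sessions created but not yet deleted */

static struct session *session_new(void)
{
	struct session *s = calloc(1, sizeof(*s));

	if (s)
		live_sessions++;
	return s;
}

static void session_delete(struct session *s)
{
	free(s);
	live_sessions--;
}

/* init_result simulates the outcome of an initialisation step. */
static int run_tool(int init_result)
{
	struct session *s = session_new();
	int ret;

	if (s == NULL)
		return -1;	/* nothing to clean up yet */

	ret = init_result;
	if (ret < 0)
		goto out_delete;	/* error: still release the session */

	ret = 0;		/* the real work would go here */

out_delete:
	session_delete(s);
	return ret;
}
```

With an early `return` in place of the `goto`, the failing call would leave `live_sessions` at 1, which is the leak these patches remove.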
+2 -2
tools/perf/builtin-kmem.c
@@ -1916,7 +1916,7 @@
 		if (!perf_evlist__find_tracepoint_by_name(session->evlist,
 							  "kmem:kmalloc")) {
 			pr_err(errmsg, "slab", "slab");
-			return -1;
+			goto out_delete;
 		}
 	}
 
@@ -1927,7 +1927,7 @@
 							     "kmem:mm_page_alloc");
 		if (evsel == NULL) {
 			pr_err(errmsg, "page", "page");
-			return -1;
+			goto out_delete;
 		}
 
 		kmem_page_size = pevent_get_page_size(evsel->tp_format->pevent);
+10 -4
tools/perf/builtin-kvm.c
@@ -1061,8 +1061,10 @@
 
 	symbol__init(&kvm->session->header.env);
 
-	if (!perf_session__has_traces(kvm->session, "kvm record"))
-		return -EINVAL;
+	if (!perf_session__has_traces(kvm->session, "kvm record")) {
+		ret = -EINVAL;
+		goto out_delete;
+	}
 
 	/*
 	 * Do not use 'isa' recorded in kvm_exit tracepoint since it is not
@@ -1070,9 +1072,13 @@
 	 */
 	ret = cpu_isa_config(kvm);
 	if (ret < 0)
-		return ret;
+		goto out_delete;
 
-	return perf_session__process_events(kvm->session);
+	ret = perf_session__process_events(kvm->session);
+
+out_delete:
+	perf_session__delete(kvm->session);
+	return ret;
 }
 
 static int parse_target_str(struct perf_kvm_stat *kvm)
+6 -10
tools/perf/builtin-mem.c
@@ -124,7 +124,6 @@
 		.mode = PERF_DATA_MODE_READ,
 		.force = mem->force,
 	};
-	int err = -EINVAL;
 	int ret;
 	struct perf_session *session = perf_session__new(&file, false,
 							 &mem->tool);
@@ -135,24 +134,21 @@
 	if (mem->cpu_list) {
 		ret = perf_session__cpu_bitmap(session, mem->cpu_list,
 					       mem->cpu_bitmap);
-		if (ret)
+		if (ret < 0)
 			goto out_delete;
 	}
 
-	if (symbol__init(&session->header.env) < 0)
-		return -1;
+	ret = symbol__init(&session->header.env);
+	if (ret < 0)
+		goto out_delete;
 
 	printf("# PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL\n");
 
-	err = perf_session__process_events(session);
-	if (err)
-		return err;
-
-	return 0;
+	ret = perf_session__process_events(session);
 
 out_delete:
 	perf_session__delete(session);
-	return err;
+	return ret;
 }
 
 static int report_events(int argc, const char **argv, struct perf_mem *mem)
+15 -2
tools/perf/builtin-report.c
@@ -742,6 +742,17 @@
 
 	argc = parse_options(argc, argv, options, report_usage, 0);
 
+	if (symbol_conf.vmlinux_name &&
+	    access(symbol_conf.vmlinux_name, R_OK)) {
+		pr_err("Invalid file: %s\n", symbol_conf.vmlinux_name);
+		return -EINVAL;
+	}
+	if (symbol_conf.kallsyms_name &&
+	    access(symbol_conf.kallsyms_name, R_OK)) {
+		pr_err("Invalid file: %s\n", symbol_conf.kallsyms_name);
+		return -EINVAL;
+	}
+
 	if (report.use_stdio)
 		use_browser = 0;
 	else if (report.use_tui)
@@ -828,8 +839,10 @@
 	if (report.header || report.header_only) {
 		perf_session__fprintf_info(session, stdout,
 					   report.show_full_info);
-		if (report.header_only)
-			return 0;
+		if (report.header_only) {
+			ret = 0;
+			goto error;
+		}
 	} else if (use_browser == 0) {
 		fputs("# To display the perf.data header info, please use --header/--header-only options.\n#\n",
 		      stdout);
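
The permission check added for --vmlinux and --kallsyms relies on POSIX access() with R_OK, which returns 0 when the calling process may read the path. A standalone sketch of the same check is below. The helper name `check_readable` is hypothetical, and it reports through stderr rather than perf's pr_err().

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Return 0 if no path was given or the path is readable,
 * -EINVAL (after printing a message) otherwise.
 */
static int check_readable(const char *path)
{
	if (path && access(path, R_OK)) {
		fprintf(stderr, "Invalid file: %s\n", path);
		return -EINVAL;
	}
	return 0;
}
```

Checking up front lets the tool fail with a clear message before any session setup, instead of failing later when the symbol file is opened.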
+208 -206
tools/perf/builtin-stat.c
··· 67 67 #define CNTR_NOT_SUPPORTED "<not supported>" 68 68 #define CNTR_NOT_COUNTED "<not counted>" 69 69 70 - static void print_stat(int argc, const char **argv); 71 - static void print_counter_aggr(struct perf_evsel *counter, char *prefix); 72 - static void print_counter(struct perf_evsel *counter, char *prefix); 73 - static void print_aggr(char *prefix); 70 + static void print_counters(struct timespec *ts, int argc, const char **argv); 74 71 75 72 /* Default events used for perf stat -T */ 76 73 static const char *transaction_attrs = { ··· 138 141 } 139 142 } 140 143 141 - static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel) 144 + static void perf_stat__reset_stats(void) 142 145 { 143 - return (evsel->cpus && !target.cpu_list) ? evsel->cpus : evsel_list->cpus; 144 - } 145 - 146 - static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel) 147 - { 148 - return perf_evsel__cpus(evsel)->nr; 149 - } 150 - 151 - static void perf_evsel__reset_stat_priv(struct perf_evsel *evsel) 152 - { 153 - int i; 154 - struct perf_stat *ps = evsel->priv; 155 - 156 - for (i = 0; i < 3; i++) 157 - init_stats(&ps->res_stats[i]); 158 - 159 - perf_stat_evsel_id_init(evsel); 160 - } 161 - 162 - static int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel) 163 - { 164 - evsel->priv = zalloc(sizeof(struct perf_stat)); 165 - if (evsel->priv == NULL) 166 - return -ENOMEM; 167 - perf_evsel__reset_stat_priv(evsel); 168 - return 0; 169 - } 170 - 171 - static void perf_evsel__free_stat_priv(struct perf_evsel *evsel) 172 - { 173 - zfree(&evsel->priv); 174 - } 175 - 176 - static int perf_evsel__alloc_prev_raw_counts(struct perf_evsel *evsel) 177 - { 178 - struct perf_counts *counts; 179 - 180 - counts = perf_counts__new(perf_evsel__nr_cpus(evsel)); 181 - if (counts) 182 - evsel->prev_raw_counts = counts; 183 - 184 - return counts ? 
0 : -ENOMEM; 185 - } 186 - 187 - static void perf_evsel__free_prev_raw_counts(struct perf_evsel *evsel) 188 - { 189 - perf_counts__delete(evsel->prev_raw_counts); 190 - evsel->prev_raw_counts = NULL; 191 - } 192 - 193 - static void perf_evlist__free_stats(struct perf_evlist *evlist) 194 - { 195 - struct perf_evsel *evsel; 196 - 197 - evlist__for_each(evlist, evsel) { 198 - perf_evsel__free_stat_priv(evsel); 199 - perf_evsel__free_counts(evsel); 200 - perf_evsel__free_prev_raw_counts(evsel); 201 - } 202 - } 203 - 204 - static int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw) 205 - { 206 - struct perf_evsel *evsel; 207 - 208 - evlist__for_each(evlist, evsel) { 209 - if (perf_evsel__alloc_stat_priv(evsel) < 0 || 210 - perf_evsel__alloc_counts(evsel, perf_evsel__nr_cpus(evsel)) < 0 || 211 - (alloc_raw && perf_evsel__alloc_prev_raw_counts(evsel) < 0)) 212 - goto out_free; 213 - } 214 - 215 - return 0; 216 - 217 - out_free: 218 - perf_evlist__free_stats(evlist); 219 - return -1; 220 - } 221 - 222 - static void perf_stat__reset_stats(struct perf_evlist *evlist) 223 - { 224 - struct perf_evsel *evsel; 225 - 226 - evlist__for_each(evlist, evsel) { 227 - perf_evsel__reset_stat_priv(evsel); 228 - perf_evsel__reset_counts(evsel, perf_evsel__nr_cpus(evsel)); 229 - } 230 - 146 + perf_evlist__reset_stats(evsel_list); 231 147 perf_stat__reset_shadow_stats(); 232 148 } 233 149 ··· 214 304 return 0; 215 305 } 216 306 217 - static int read_cb(struct perf_evsel *evsel, int cpu, int thread __maybe_unused, 218 - struct perf_counts_values *count) 307 + static int 308 + process_counter_values(struct perf_evsel *evsel, int cpu, int thread, 309 + struct perf_counts_values *count) 219 310 { 220 311 struct perf_counts_values *aggr = &evsel->counts->aggr; 221 312 static struct perf_counts_values zero; ··· 231 320 count = &zero; 232 321 233 322 switch (aggr_mode) { 323 + case AGGR_THREAD: 234 324 case AGGR_CORE: 235 325 case AGGR_SOCKET: 236 326 case AGGR_NONE: 237 327 if 
(!evsel->snapshot) 238 - perf_evsel__compute_deltas(evsel, cpu, count); 328 + perf_evsel__compute_deltas(evsel, cpu, thread, count); 239 329 perf_counts_values__scale(count, scale, NULL); 240 - evsel->counts->cpu[cpu] = *count; 241 330 if (aggr_mode == AGGR_NONE) 242 331 perf_stat__update_shadow_stats(evsel, count->values, cpu); 243 332 break; ··· 254 343 return 0; 255 344 } 256 345 257 - static int read_counter(struct perf_evsel *counter); 346 + static int process_counter_maps(struct perf_evsel *counter) 347 + { 348 + int nthreads = thread_map__nr(counter->threads); 349 + int ncpus = perf_evsel__nr_cpus(counter); 350 + int cpu, thread; 258 351 259 - /* 260 - * Read out the results of a single counter: 261 - * aggregate counts across CPUs in system-wide mode 262 - */ 263 - static int read_counter_aggr(struct perf_evsel *counter) 352 + if (counter->system_wide) 353 + nthreads = 1; 354 + 355 + for (thread = 0; thread < nthreads; thread++) { 356 + for (cpu = 0; cpu < ncpus; cpu++) { 357 + if (process_counter_values(counter, cpu, thread, 358 + perf_counts(counter->counts, cpu, thread))) 359 + return -1; 360 + } 361 + } 362 + 363 + return 0; 364 + } 365 + 366 + static int process_counter(struct perf_evsel *counter) 264 367 { 265 368 struct perf_counts_values *aggr = &counter->counts->aggr; 266 369 struct perf_stat *ps = counter->priv; 267 370 u64 *count = counter->counts->aggr.values; 268 - int i; 371 + int i, ret; 269 372 270 373 aggr->val = aggr->ena = aggr->run = 0; 374 + init_stats(ps->res_stats); 271 375 272 - if (read_counter(counter)) 273 - return -1; 376 + if (counter->per_pkg) 377 + zero_per_pkg(counter); 378 + 379 + ret = process_counter_maps(counter); 380 + if (ret) 381 + return ret; 382 + 383 + if (aggr_mode != AGGR_GLOBAL) 384 + return 0; 274 385 275 386 if (!counter->snapshot) 276 - perf_evsel__compute_deltas(counter, -1, aggr); 387 + perf_evsel__compute_deltas(counter, -1, -1, aggr); 277 388 perf_counts_values__scale(aggr, scale, 
&counter->counts->scaled); 278 389 279 390 for (i = 0; i < 3; i++) ··· 330 397 if (counter->system_wide) 331 398 nthreads = 1; 332 399 333 - if (counter->per_pkg) 334 - zero_per_pkg(counter); 335 - 336 400 for (thread = 0; thread < nthreads; thread++) { 337 401 for (cpu = 0; cpu < ncpus; cpu++) { 338 - if (perf_evsel__read_cb(counter, cpu, thread, read_cb)) 402 + struct perf_counts_values *count; 403 + 404 + count = perf_counts(counter->counts, cpu, thread); 405 + if (perf_evsel__read(counter, cpu, thread, count)) 339 406 return -1; 340 407 } 341 408 } ··· 343 410 return 0; 344 411 } 345 412 346 - static void print_interval(void) 413 + static void read_counters(bool close) 347 414 { 348 - static int num_print_interval; 349 415 struct perf_evsel *counter; 350 - struct perf_stat *ps; 351 - struct timespec ts, rs; 352 - char prefix[64]; 353 416 354 - if (aggr_mode == AGGR_GLOBAL) { 355 - evlist__for_each(evsel_list, counter) { 356 - ps = counter->priv; 357 - memset(ps->res_stats, 0, sizeof(ps->res_stats)); 358 - read_counter_aggr(counter); 359 - } 360 - } else { 361 - evlist__for_each(evsel_list, counter) { 362 - ps = counter->priv; 363 - memset(ps->res_stats, 0, sizeof(ps->res_stats)); 364 - read_counter(counter); 417 + evlist__for_each(evsel_list, counter) { 418 + if (read_counter(counter)) 419 + pr_warning("failed to read counter %s\n", counter->name); 420 + 421 + if (process_counter(counter)) 422 + pr_warning("failed to process counter %s\n", counter->name); 423 + 424 + if (close) { 425 + perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter), 426 + thread_map__nr(evsel_list->threads)); 365 427 } 366 428 } 429 + } 430 + 431 + static void process_interval(void) 432 + { 433 + struct timespec ts, rs; 434 + 435 + read_counters(false); 367 436 368 437 clock_gettime(CLOCK_MONOTONIC, &ts); 369 438 diff_timespec(&rs, &ts, &ref_time); 370 - sprintf(prefix, "%6lu.%09lu%s", rs.tv_sec, rs.tv_nsec, csv_sep); 371 439 372 - if (num_print_interval == 0 && !csv_output) { 373 - 
switch (aggr_mode) { 374 - case AGGR_SOCKET: 375 - fprintf(output, "# time socket cpus counts %*s events\n", unit_width, "unit"); 376 - break; 377 - case AGGR_CORE: 378 - fprintf(output, "# time core cpus counts %*s events\n", unit_width, "unit"); 379 - break; 380 - case AGGR_NONE: 381 - fprintf(output, "# time CPU counts %*s events\n", unit_width, "unit"); 382 - break; 383 - case AGGR_GLOBAL: 384 - default: 385 - fprintf(output, "# time counts %*s events\n", unit_width, "unit"); 386 - } 387 - } 388 - 389 - if (++num_print_interval == 25) 390 - num_print_interval = 0; 391 - 392 - switch (aggr_mode) { 393 - case AGGR_CORE: 394 - case AGGR_SOCKET: 395 - print_aggr(prefix); 396 - break; 397 - case AGGR_NONE: 398 - evlist__for_each(evsel_list, counter) 399 - print_counter(counter, prefix); 400 - break; 401 - case AGGR_GLOBAL: 402 - default: 403 - evlist__for_each(evsel_list, counter) 404 - print_counter_aggr(counter, prefix); 405 - } 406 - 407 - fflush(output); 440 + print_counters(&rs, 0, NULL); 408 441 } 409 442 410 443 static void handle_initial_delay(void) ··· 485 586 if (interval) { 486 587 while (!waitpid(child_pid, &status, WNOHANG)) { 487 588 nanosleep(&ts, NULL); 488 - print_interval(); 589 + process_interval(); 489 590 } 490 591 } 491 592 wait(&status); ··· 503 604 while (!done) { 504 605 nanosleep(&ts, NULL); 505 606 if (interval) 506 - print_interval(); 607 + process_interval(); 507 608 } 508 609 } 509 610 ··· 511 612 512 613 update_stats(&walltime_nsecs_stats, t1 - t0); 513 614 514 - if (aggr_mode == AGGR_GLOBAL) { 515 - evlist__for_each(evsel_list, counter) { 516 - read_counter_aggr(counter); 517 - perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter), 518 - thread_map__nr(evsel_list->threads)); 519 - } 520 - } else { 521 - evlist__for_each(evsel_list, counter) { 522 - read_counter(counter); 523 - perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter), 1); 524 - } 525 - } 615 + read_counters(true); 526 616 527 617 return WEXITSTATUS(status); 528 
618 } ··· 602 714 fprintf(output, "CPU%*d%s", 603 715 csv_output ? 0 : -4, 604 716 perf_evsel__cpus(evsel)->map[id], csv_sep); 717 + break; 718 + case AGGR_THREAD: 719 + fprintf(output, "%*s-%*d%s", 720 + csv_output ? 0 : 16, 721 + thread_map__comm(evsel->threads, id), 722 + csv_output ? 0 : -8, 723 + thread_map__pid(evsel->threads, id), 724 + csv_sep); 605 725 break; 606 726 case AGGR_GLOBAL: 607 727 default: ··· 711 815 s2 = aggr_get_id(evsel_list->cpus, cpu2); 712 816 if (s2 != id) 713 817 continue; 714 - val += counter->counts->cpu[cpu].val; 715 - ena += counter->counts->cpu[cpu].ena; 716 - run += counter->counts->cpu[cpu].run; 818 + val += perf_counts(counter->counts, cpu, 0)->val; 819 + ena += perf_counts(counter->counts, cpu, 0)->ena; 820 + run += perf_counts(counter->counts, cpu, 0)->run; 717 821 nr++; 718 822 } 719 823 if (prefix) ··· 756 860 print_running(run, ena); 757 861 fputc('\n', output); 758 862 } 863 + } 864 + } 865 + 866 + static void print_aggr_thread(struct perf_evsel *counter, char *prefix) 867 + { 868 + int nthreads = thread_map__nr(counter->threads); 869 + int ncpus = cpu_map__nr(counter->cpus); 870 + int cpu, thread; 871 + double uval; 872 + 873 + for (thread = 0; thread < nthreads; thread++) { 874 + u64 ena = 0, run = 0, val = 0; 875 + 876 + for (cpu = 0; cpu < ncpus; cpu++) { 877 + val += perf_counts(counter->counts, cpu, thread)->val; 878 + ena += perf_counts(counter->counts, cpu, thread)->ena; 879 + run += perf_counts(counter->counts, cpu, thread)->run; 880 + } 881 + 882 + if (prefix) 883 + fprintf(output, "%s", prefix); 884 + 885 + uval = val * counter->scale; 886 + 887 + if (nsec_counter(counter)) 888 + nsec_printout(thread, 0, counter, uval); 889 + else 890 + abs_printout(thread, 0, counter, uval); 891 + 892 + if (!csv_output) 893 + print_noise(counter, 1.0); 894 + 895 + print_running(run, ena); 896 + fputc('\n', output); 759 897 } 760 898 } 761 899 ··· 855 925 int cpu; 856 926 857 927 for (cpu = 0; cpu < 
perf_evsel__nr_cpus(counter); cpu++) { 858 - val = counter->counts->cpu[cpu].val; 859 - ena = counter->counts->cpu[cpu].ena; 860 - run = counter->counts->cpu[cpu].run; 928 + val = perf_counts(counter->counts, cpu, 0)->val; 929 + ena = perf_counts(counter->counts, cpu, 0)->ena; 930 + run = perf_counts(counter->counts, cpu, 0)->run; 861 931 862 932 if (prefix) 863 933 fprintf(output, "%s", prefix); ··· 902 972 } 903 973 } 904 974 905 - static void print_stat(int argc, const char **argv) 975 + static void print_interval(char *prefix, struct timespec *ts) 906 976 { 907 - struct perf_evsel *counter; 977 + static int num_print_interval; 978 + 979 + sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); 980 + 981 + if (num_print_interval == 0 && !csv_output) { 982 + switch (aggr_mode) { 983 + case AGGR_SOCKET: 984 + fprintf(output, "# time socket cpus counts %*s events\n", unit_width, "unit"); 985 + break; 986 + case AGGR_CORE: 987 + fprintf(output, "# time core cpus counts %*s events\n", unit_width, "unit"); 988 + break; 989 + case AGGR_NONE: 990 + fprintf(output, "# time CPU counts %*s events\n", unit_width, "unit"); 991 + break; 992 + case AGGR_THREAD: 993 + fprintf(output, "# time comm-pid counts %*s events\n", unit_width, "unit"); 994 + break; 995 + case AGGR_GLOBAL: 996 + default: 997 + fprintf(output, "# time counts %*s events\n", unit_width, "unit"); 998 + } 999 + } 1000 + 1001 + if (++num_print_interval == 25) 1002 + num_print_interval = 0; 1003 + } 1004 + 1005 + static void print_header(int argc, const char **argv) 1006 + { 908 1007 int i; 909 1008 910 1009 fflush(stdout); ··· 959 1000 fprintf(output, " (%d runs)", run_count); 960 1001 fprintf(output, ":\n\n"); 961 1002 } 1003 + } 1004 + 1005 + static void print_footer(void) 1006 + { 1007 + if (!null_run) 1008 + fprintf(output, "\n"); 1009 + fprintf(output, " %17.9f seconds time elapsed", 1010 + avg_stats(&walltime_nsecs_stats)/1e9); 1011 + if (run_count > 1) { 1012 + fprintf(output, " "); 1013 + 
print_noise_pct(stddev_stats(&walltime_nsecs_stats), 1014 + avg_stats(&walltime_nsecs_stats)); 1015 + } 1016 + fprintf(output, "\n\n"); 1017 + } 1018 + 1019 + static void print_counters(struct timespec *ts, int argc, const char **argv) 1020 + { 1021 + struct perf_evsel *counter; 1022 + char buf[64], *prefix = NULL; 1023 + 1024 + if (interval) 1025 + print_interval(prefix = buf, ts); 1026 + else 1027 + print_header(argc, argv); 962 1028 963 1029 switch (aggr_mode) { 964 1030 case AGGR_CORE: 965 1031 case AGGR_SOCKET: 966 - print_aggr(NULL); 1032 + print_aggr(prefix); 1033 + break; 1034 + case AGGR_THREAD: 1035 + evlist__for_each(evsel_list, counter) 1036 + print_aggr_thread(counter, prefix); 967 1037 break; 968 1038 case AGGR_GLOBAL: 969 1039 evlist__for_each(evsel_list, counter) 970 - print_counter_aggr(counter, NULL); 1040 + print_counter_aggr(counter, prefix); 971 1041 break; 972 1042 case AGGR_NONE: 973 1043 evlist__for_each(evsel_list, counter) 974 - print_counter(counter, NULL); 1044 + print_counter(counter, prefix); 975 1045 break; 976 1046 default: 977 1047 break; 978 1048 } 979 1049 980 - if (!csv_output) { 981 - if (!null_run) 982 - fprintf(output, "\n"); 983 - fprintf(output, " %17.9f seconds time elapsed", 984 - avg_stats(&walltime_nsecs_stats)/1e9); 985 - if (run_count > 1) { 986 - fprintf(output, " "); 987 - print_noise_pct(stddev_stats(&walltime_nsecs_stats), 988 - avg_stats(&walltime_nsecs_stats)); 989 - } 990 - fprintf(output, "\n\n"); 991 - } 1050 + if (!interval && !csv_output) 1051 + print_footer(); 1052 + 1053 + fflush(output); 992 1054 } 993 1055 994 1056 static volatile int signr = -1; ··· 1081 1101 break; 1082 1102 case AGGR_NONE: 1083 1103 case AGGR_GLOBAL: 1104 + case AGGR_THREAD: 1084 1105 default: 1085 1106 break; 1086 1107 } ··· 1306 1325 "aggregate counts per processor socket", AGGR_SOCKET), 1307 1326 OPT_SET_UINT(0, "per-core", &aggr_mode, 1308 1327 "aggregate counts per physical processor core", AGGR_CORE), 1328 + OPT_SET_UINT(0, 
"per-thread", &aggr_mode, 1329 + "aggregate counts per thread", AGGR_THREAD), 1309 1330 OPT_UINTEGER('D', "delay", &initial_delay, 1310 1331 "ms to wait before starting measurement after program start"), 1311 1332 OPT_END() ··· 1399 1416 run_count = 1; 1400 1417 } 1401 1418 1402 - /* no_aggr, cgroup are for system-wide only */ 1403 - if ((aggr_mode != AGGR_GLOBAL || nr_cgroups) && 1419 + if ((aggr_mode == AGGR_THREAD) && !target__has_task(&target)) { 1420 + fprintf(stderr, "The --per-thread option is only available " 1421 + "when monitoring via -p -t options.\n"); 1422 + parse_options_usage(NULL, options, "p", 1); 1423 + parse_options_usage(NULL, options, "t", 1); 1424 + goto out; 1425 + } 1426 + 1427 + /* 1428 + * no_aggr, cgroup are for system-wide only 1429 + * --per-thread is aggregated per thread, we dont mix it with cpu mode 1430 + */ 1431 + if (((aggr_mode != AGGR_GLOBAL && aggr_mode != AGGR_THREAD) || nr_cgroups) && 1404 1432 !target__has_cpu(&target)) { 1405 1433 fprintf(stderr, "both cgroup and no-aggregation " 1406 1434 "modes only available in system-wide mode\n"); ··· 1439 1445 } 1440 1446 goto out; 1441 1447 } 1448 + 1449 + /* 1450 + * Initialize thread_map with comm names, 1451 + * so we could print it out on output. 1452 + */ 1453 + if (aggr_mode == AGGR_THREAD) 1454 + thread_map__read_comms(evsel_list->threads); 1455 + 1442 1456 if (interval && interval < 100) { 1443 1457 pr_err("print interval must be >= 100ms\n"); 1444 1458 parse_options_usage(stat_usage, options, "I", 1); ··· 1480 1478 1481 1479 status = run_perf_stat(argc, argv); 1482 1480 if (forever && status != -1) { 1483 - print_stat(argc, argv); 1484 - perf_stat__reset_stats(evsel_list); 1481 + print_counters(NULL, argc, argv); 1482 + perf_stat__reset_stats(); 1485 1483 } 1486 1484 } 1487 1485 1488 1486 if (!forever && status != -1 && !interval) 1489 - print_stat(argc, argv); 1487 + print_counters(NULL, argc, argv); 1490 1488 1491 1489 perf_evlist__free_stats(evsel_list); 1492 1490 out:
+3 -21
tools/perf/builtin-top.c
@@ -586,27 +586,9 @@
 		hists->uid_filter_str = top->record_opts.target.uid_str;
 	}
 
-	while (true) {
-		int key = perf_evlist__tui_browse_hists(top->evlist, help, &hbt,
-							top->min_percent,
-							&top->session->header.env);
-
-		if (key != 'f')
-			break;
-
-		perf_evlist__toggle_enable(top->evlist);
-		/*
-		 * No need to refresh, resort/decay histogram entries
-		 * if we are not collecting samples:
-		 */
-		if (top->evlist->enabled) {
-			hbt.refresh = top->delay_secs;
-			help = "Press 'f' to disable the events or 'h' to see other hotkeys";
-		} else {
-			help = "Press 'f' again to re-enable the events";
-			hbt.refresh = 0;
-		}
-	}
+	perf_evlist__tui_browse_hists(top->evlist, help, &hbt,
+				      top->min_percent,
+				      &top->session->header.env);
 
 	done = 1;
 	return NULL;
+34 -2
tools/perf/builtin-trace.c
@@ -1617,6 +1617,34 @@
 	return syscall__set_arg_fmts(sc);
 }
 
+static int trace__validate_ev_qualifier(struct trace *trace)
+{
+	int err = 0;
+	struct str_node *pos;
+
+	strlist__for_each(pos, trace->ev_qualifier) {
+		const char *sc = pos->s;
+
+		if (audit_name_to_syscall(sc, trace->audit.machine) < 0) {
+			if (err == 0) {
+				fputs("Error:\tInvalid syscall ", trace->output);
+				err = -EINVAL;
+			} else {
+				fputs(", ", trace->output);
+			}
+
+			fputs(sc, trace->output);
+		}
+	}
+
+	if (err < 0) {
+		fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
+		      "\nHint:\tand: 'man syscalls'\n", trace->output);
+	}
+
+	return err;
+}
+
 /*
  * args is to be interpreted as a series of longs but we need to handle
  * 8-byte unaligned accesses. args points to raw_data within the event
@@ -2325,7 +2353,7 @@
 	 */
 	if (trace->filter_pids.nr > 0)
 		err = perf_evlist__set_filter_pids(evlist, trace->filter_pids.nr, trace->filter_pids.entries);
-	else if (evlist->threads->map[0] == -1)
+	else if (thread_map__pid(evlist->threads, 0) == -1)
 		err = perf_evlist__set_filter_pid(evlist, getpid());
 
 	if (err < 0) {
@@ -2343,7 +2371,7 @@
 	if (forks)
 		perf_evlist__start_workload(evlist);
 
-	trace->multiple_threads = evlist->threads->map[0] == -1 ||
+	trace->multiple_threads = thread_map__pid(evlist->threads, 0) == -1 ||
 				  evlist->threads->nr > 1 ||
 				  perf_evlist__first(evlist)->attr.inherit;
 again:
@@ -2862,6 +2890,10 @@
 			err = -ENOMEM;
 			goto out_close;
 		}
+
+		err = trace__validate_ev_qualifier(&trace);
+		if (err)
+			goto out_close;
 	}
 
 	err = target__validate(&trace.opts.target);
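
The new validation walks the whole list and reports every invalid name in a single message instead of stopping at the first one. The sketch below reproduces that accumulation pattern in a self-contained form. It is only an approximation: a small hard-coded table (`known_names`) stands in for the audit_name_to_syscall() lookup, and `validate_names` is a hypothetical name.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the real syscall name lookup. */
static const char *known_names[] = { "open", "read", "write", "close" };

static int name_is_known(const char *name)
{
	size_t n = sizeof(known_names) / sizeof(known_names[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(known_names[i], name) == 0)
			return 1;
	}
	return 0;
}

/*
 * Check every name; print all invalid ones on one line.
 * Returns 0 if all names are valid, -EINVAL otherwise.
 */
static int validate_names(const char **names, size_t count, FILE *out)
{
	int err = 0;

	for (size_t i = 0; i < count; i++) {
		if (name_is_known(names[i]))
			continue;

		if (err == 0) {
			/* First failure: start the message and latch the error. */
			fputs("Error:\tInvalid syscall ", out);
			err = -EINVAL;
		} else {
			fputs(", ", out);
		}
		fputs(names[i], out);
	}

	if (err < 0)
		fputs("\n", out);

	return err;
}
```

Using `err` both as the return value and as the "header already printed" flag keeps the loop free of extra state.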
+3 -3
tools/perf/config/Makefile
@@ -11,9 +11,9 @@
 obj-perf := $(abspath $(obj-perf))/
 endif
 
-$(shell echo -n > .config-detected)
-detected = $(shell echo "$(1)=y" >> .config-detected)
-detected_var = $(shell echo "$(1)=$($(1))" >> .config-detected)
+$(shell echo -n > $(OUTPUT).config-detected)
+detected = $(shell echo "$(1)=y" >> $(OUTPUT).config-detected)
+detected_var = $(shell echo "$(1)=$($(1))" >> $(OUTPUT).config-detected)
 
 CFLAGS := $(EXTRA_CFLAGS) $(EXTRA_WARNINGS)
 
+1
tools/perf/tests/Build
···
 perf-y += sample-parsing.o
 perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
+perf-y += thread-map.o
 
 perf-$(CONFIG_X86) += perf-time-to-tsc.o
 
+4
tools/perf/tests/builtin-test.c
···
 		.func = test__kmod_path__parse,
 	},
 	{
+		.desc = "Test thread map",
+		.func = test__thread_map,
+	},
+	{
 		.func = NULL,
 	},
 };
+2 -2
tools/perf/tests/code-reading.c
···
 	if (evlist) {
 		perf_evlist__delete(evlist);
 	} else {
-		cpu_map__delete(cpus);
-		thread_map__delete(threads);
+		cpu_map__put(cpus);
+		thread_map__put(threads);
 	}
 	machines__destroy_kernel_maps(&machines);
 	machine__delete_threads(machine);
+2 -2
tools/perf/tests/keep-tracking.c
···
 		perf_evlist__disable(evlist);
 		perf_evlist__delete(evlist);
 	} else {
-		cpu_map__delete(cpus);
-		thread_map__delete(threads);
+		cpu_map__put(cpus);
+		thread_map__put(threads);
 	}
 
 	return err;
+28 -3
tools/perf/tests/make
···
+ifndef MK
+ifeq ($(MAKECMDGOALS),)
+# no target specified, trigger the whole suite
+all:
+	@echo "Testing Makefile"; $(MAKE) -sf tests/make MK=Makefile
+	@echo "Testing Makefile.perf"; $(MAKE) -sf tests/make MK=Makefile.perf
+else
+# run only specific test over 'Makefile'
+%:
+	@echo "Testing Makefile"; $(MAKE) -sf tests/make MK=Makefile $@
+endif
+else
 PERF := .
-MK := Makefile
 
 include config/Makefile.arch
 
···
 make_install_html := install-html
 make_install_info := install-info
 make_install_pdf := install-pdf
+make_install_prefix := install prefix=/tmp/krava
 make_static := LDFLAGS=-static
 
 # all the NO_* variable combined
···
 
 # $(run) contains all available tests
 run := make_pure
+# Targets 'clean all' can be run together only through top level
+# Makefile because we detect clean target in Makefile.perf and
+# disable features detection
+ifeq ($(MK),Makefile)
 run += make_clean_all
+endif
 run += make_python_perf_so
 run += make_debug
 run += make_no_libperl
···
 run += make_util_pmu_bison_o
 run += make_install
 run += make_install_bin
+run += make_install_prefix
 # FIXME 'install-*' commented out till they're fixed
 # run += make_install_doc
 # run += make_install_man
···
 test_make_install_bin := $(call test_dest_files,$(installed_files_bin))
 test_make_install_bin_O := $(call test_dest_files,$(installed_files_bin))
 
+# We prefix all installed files for make_install_prefix
+# with '/tmp/krava' to match installed/prefix-ed files.
+installed_files_all_prefix := $(addprefix /tmp/krava/,$(installed_files_all))
+test_make_install_prefix := $(call test_dest_files,$(installed_files_all_prefix))
+test_make_install_prefix_O := $(call test_dest_files,$(installed_files_all_prefix))
+
 # FIXME nothing gets installed
 test_make_install_man := test -f $$TMP_DEST/share/man/man1/perf.1
 test_make_install_man_O := $(test_make_install_man)
···
 	( eval $$cmd ) >> $@ 2>&1
 
 make_kernelsrc:
-	@echo " - make -C <kernelsrc> tools/perf"
+	@echo "- make -C <kernelsrc> tools/perf"
 	$(call clean); \
 	(make -C ../.. tools/perf) > $@ 2>&1 && \
 	test -x perf && rm -f $@ || (cat $@ ; false)
 
 make_kernelsrc_tools:
-	@echo " - make -C <kernelsrc>/tools perf"
+	@echo "- make -C <kernelsrc>/tools perf"
 	$(call clean); \
 	(make -C ../../tools perf) > $@ 2>&1 && \
 	test -x perf && rm -f $@ || (cat $@ ; false)
···
 	@echo OK
 
 .PHONY: all $(run) $(run_O) tarpkg clean
+endif # ifndef MK
+2 -2
tools/perf/tests/mmap-basic.c
···
 	cpus = NULL;
 	threads = NULL;
 out_free_cpus:
-	cpu_map__delete(cpus);
+	cpu_map__put(cpus);
 out_free_threads:
-	thread_map__delete(threads);
+	thread_map__put(threads);
 	return err;
 }
+1 -1
tools/perf/tests/mmap-thread-lookup.c
···
 					      perf_event__process,
 					      machine, 0, 500);
 
-	thread_map__delete(map);
+	thread_map__put(map);
 	return err;
 }
 
+4 -4
tools/perf/tests/openat-syscall-all-cpus.c
···
 	 * we use the auto allocation it will allocate just for 1 cpu,
 	 * as we start by cpu 0.
 	 */
-	if (perf_evsel__alloc_counts(evsel, cpus->nr) < 0) {
+	if (perf_evsel__alloc_counts(evsel, cpus->nr, 1) < 0) {
 		pr_debug("perf_evsel__alloc_counts(ncpus=%d)\n", cpus->nr);
 		goto out_close_fd;
 	}
···
 		}
 
 		expected = nr_openat_calls + cpu;
-		if (evsel->counts->cpu[cpu].val != expected) {
+		if (perf_counts(evsel->counts, cpu, 0)->val != expected) {
 			pr_debug("perf_evsel__read_on_cpu: expected to intercept %d calls on cpu %d, got %" PRIu64 "\n",
-				 expected, cpus->map[cpu], evsel->counts->cpu[cpu].val);
+				 expected, cpus->map[cpu], perf_counts(evsel->counts, cpu, 0)->val);
 			err = -1;
 		}
 	}
···
 out_evsel_delete:
 	perf_evsel__delete(evsel);
 out_thread_map_delete:
-	thread_map__delete(threads);
+	thread_map__put(threads);
 	return err;
 }
+1 -1
tools/perf/tests/openat-syscall-tp-fields.c
···
 
 	perf_evsel__config(evsel, &opts);
 
-	evlist->threads->map[0] = getpid();
+	thread_map__set_pid(evlist->threads, 0, getpid());
 
 	err = perf_evlist__open(evlist);
 	if (err < 0) {
+3 -3
tools/perf/tests/openat-syscall.c
···
 		goto out_close_fd;
 	}
 
-	if (evsel->counts->cpu[0].val != nr_openat_calls) {
+	if (perf_counts(evsel->counts, 0, 0)->val != nr_openat_calls) {
 		pr_debug("perf_evsel__read_on_cpu: expected to intercept %d calls, got %" PRIu64 "\n",
-			 nr_openat_calls, evsel->counts->cpu[0].val);
+			 nr_openat_calls, perf_counts(evsel->counts, 0, 0)->val);
 		goto out_close_fd;
 	}
 
···
 out_evsel_delete:
 	perf_evsel__delete(evsel);
 out_thread_map_delete:
-	thread_map__delete(threads);
+	thread_map__put(threads);
 	return err;
 }
+2 -2
tools/perf/tests/switch-tracking.c
···
 		perf_evlist__disable(evlist);
 		perf_evlist__delete(evlist);
 	} else {
-		cpu_map__delete(cpus);
-		thread_map__delete(threads);
+		cpu_map__put(cpus);
+		thread_map__put(threads);
 	}
 
 	return err;
+1
tools/perf/tests/tests.h
···
 int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__kmod_path__parse(void);
+int test__thread_map(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
+38
tools/perf/tests/thread-map.c
···
+#include <sys/types.h>
+#include <unistd.h>
+#include "tests.h"
+#include "thread_map.h"
+#include "debug.h"
+
+int test__thread_map(void)
+{
+	struct thread_map *map;
+
+	/* test map on current pid */
+	map = thread_map__new_by_pid(getpid());
+	TEST_ASSERT_VAL("failed to alloc map", map);
+
+	thread_map__read_comms(map);
+
+	TEST_ASSERT_VAL("wrong nr", map->nr == 1);
+	TEST_ASSERT_VAL("wrong pid",
+			thread_map__pid(map, 0) == getpid());
+	TEST_ASSERT_VAL("wrong comm",
+			thread_map__comm(map, 0) &&
+			!strcmp(thread_map__comm(map, 0), "perf"));
+	thread_map__put(map);
+
+	/* test dummy pid */
+	map = thread_map__new_dummy();
+	TEST_ASSERT_VAL("failed to alloc map", map);
+
+	thread_map__read_comms(map);
+
+	TEST_ASSERT_VAL("wrong nr", map->nr == 1);
+	TEST_ASSERT_VAL("wrong pid", thread_map__pid(map, 0) == -1);
+	TEST_ASSERT_VAL("wrong comm",
+			thread_map__comm(map, 0) &&
+			!strcmp(thread_map__comm(map, 0), "dummy"));
+	thread_map__put(map);
+	return 0;
+}
+17 -2
tools/perf/ui/browsers/hists.c
···
 		case CTRL('c'):
 			goto out_free_stack;
 		case 'f':
-			if (!is_report_browser(hbt))
-				goto out_free_stack;
+			if (!is_report_browser(hbt)) {
+				struct perf_top *top = hbt->arg;
+
+				perf_evlist__toggle_enable(top->evlist);
+				/*
+				 * No need to refresh, resort/decay histogram
+				 * entries if we are not collecting samples:
+				 */
+				if (top->evlist->enabled) {
+					helpline = "Press 'f' to disable the events or 'h' to see other hotkeys";
+					hbt->refresh = delay_secs;
+				} else {
+					helpline = "Press 'f' again to re-enable the events";
+					hbt->refresh = 0;
+				}
+				continue;
+			}
 			/* Fall thru */
 		default:
 			helpline = "Press '?' for help on key bindings";
+9 -2
tools/perf/util/auxtrace.c
···
 	if (per_cpu) {
 		mp->cpu = evlist->cpus->map[idx];
 		if (evlist->threads)
-			mp->tid = evlist->threads->map[0];
+			mp->tid = thread_map__pid(evlist->threads, 0);
 		else
 			mp->tid = -1;
 	} else {
 		mp->cpu = -1;
-		mp->tid = evlist->threads->map[idx];
+		mp->tid = thread_map__pid(evlist->threads, idx);
 	}
 }
 
···
 		data1 = &data[head_off - len1];
 		len2 = 0;
 		data2 = NULL;
+	}
+
+	if (itr->alignment) {
+		unsigned int unwanted = len1 % itr->alignment;
+
+		len1 -= unwanted;
+		size -= unwanted;
 	}
 
 	/* padding must be written by fn() e.g. record__process_auxtrace() */
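The second hunk above trims a length down to a multiple of `itr->alignment`, treating an alignment of 0 as "no constraint". The arithmetic can be sketched in isolation as below; `trim_to_alignment()` is a hypothetical helper written for this note, not a function in perf.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round len down to a multiple of alignment.
 * An alignment of 0 means there is no constraint, so len is returned as-is.
 */
static size_t trim_to_alignment(size_t len, unsigned int alignment)
{
	if (alignment) {
		size_t unwanted = len % alignment;

		len -= unwanted;
	}
	return len;
}
```

In the hunk the same `unwanted` amount is also subtracted from `size`, so both values stay consistent with each other.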
+1
tools/perf/util/auxtrace.h
···
 		      const char *str);
 	u64 (*reference)(struct auxtrace_record *itr);
 	int (*read_finish)(struct auxtrace_record *itr, int idx);
+	unsigned int alignment;
 };
 
 #ifdef HAVE_AUXTRACE_SUPPORT
+4
tools/perf/util/cloexec.c
···
 
 static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
 
+#ifdef __GLIBC_PREREQ
+#if !__GLIBC_PREREQ(2, 6)
 int __weak sched_getcpu(void)
 {
 	errno = ENOSYS;
 	return -1;
 }
+#endif
+#endif
 
 static int perf_flag_probe(void)
 {
+24 -2
tools/perf/util/cpumap.c
···
 #include <assert.h>
 #include <stdio.h>
 #include <stdlib.h>
+#include "asm/bug.h"
 
 static struct cpu_map *cpu_map__default_new(void)
 {
···
 			cpus->map[i] = i;
 
 		cpus->nr = nr_cpus;
+		atomic_set(&cpus->refcnt, 1);
 	}
 
 	return cpus;
···
 	if (cpus != NULL) {
 		cpus->nr = nr_cpus;
 		memcpy(cpus->map, tmp_cpus, payload_size);
+		atomic_set(&cpus->refcnt, 1);
 	}
 
 	return cpus;
···
 	if (cpus != NULL) {
 		cpus->nr = 1;
 		cpus->map[0] = -1;
+		atomic_set(&cpus->refcnt, 1);
 	}
 
 	return cpus;
 }
 
-void cpu_map__delete(struct cpu_map *map)
+static void cpu_map__delete(struct cpu_map *map)
 {
-	free(map);
+	if (map) {
+		WARN_ONCE(atomic_read(&map->refcnt) != 0,
+			  "cpu_map refcnt unbalanced\n");
+		free(map);
+	}
+}
+
+struct cpu_map *cpu_map__get(struct cpu_map *map)
+{
+	if (map)
+		atomic_inc(&map->refcnt);
+	return map;
+}
+
+void cpu_map__put(struct cpu_map *map)
+{
+	if (map && atomic_dec_and_test(&map->refcnt))
+		cpu_map__delete(map);
 }
 
 int cpu_map__get_socket(struct cpu_map *map, int idx)
···
 	/* ensure we process id in increasing order */
 	qsort(c->map, c->nr, sizeof(int), cmp_ids);
 
+	atomic_set(&cpus->refcnt, 1);
 	*res = c;
 	return 0;
 }
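The `cpu_map__get()`/`cpu_map__put()` pair above follows the usual get/put reference-counting shape: the constructor starts the count at 1, each extra holder takes a reference, and the object is freed when the last reference is dropped. A minimal userspace sketch of that shape follows, assuming C11 `<stdatomic.h>` in place of the kernel-style `atomic_t` helpers; `struct id_map` and its functions are hypothetical names, and the `int` return from `id_map__put()` exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted map, loosely modelled on struct cpu_map. */
struct id_map {
	atomic_int refcnt;
	int nr;
	int map[];
};

static struct id_map *id_map__new(int nr)
{
	struct id_map *m;

	assert(nr >= 0);

	m = calloc(1, sizeof(*m) + (size_t)nr * sizeof(int));
	if (m) {
		m->nr = nr;
		/* The creator holds the first reference. */
		atomic_init(&m->refcnt, 1);
	}
	return m;
}

/* Take an additional reference; NULL is passed through unchanged. */
static struct id_map *id_map__get(struct id_map *m)
{
	if (m)
		atomic_fetch_add(&m->refcnt, 1);
	return m;
}

/* Drop a reference. Returns 1 if this call freed the map, 0 otherwise. */
static int id_map__put(struct id_map *m)
{
	/* atomic_fetch_sub() returns the previous value: 1 means we were last. */
	if (m && atomic_fetch_sub(&m->refcnt, 1) == 1) {
		free(m);
		return 1;
	}
	return 0;
}
```

This is why the callers elsewhere in the diff can replace `*_map__delete()` with `*_map__put()` mechanically: a sole owner dropping its only reference frees the map exactly as before.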
+5 -1
tools/perf/util/cpumap.h
···
 
 #include <stdio.h>
 #include <stdbool.h>
+#include <linux/atomic.h>
 
 #include "perf.h"
 #include "util/debug.h"
 
 struct cpu_map {
+	atomic_t refcnt;
 	int nr;
 	int map[];
 };
 
 struct cpu_map *cpu_map__new(const char *cpu_list);
 struct cpu_map *cpu_map__dummy_new(void);
-void cpu_map__delete(struct cpu_map *map);
 struct cpu_map *cpu_map__read(FILE *file);
 size_t cpu_map__fprintf(struct cpu_map *map, FILE *fp);
 int cpu_map__get_socket(struct cpu_map *map, int idx);
 int cpu_map__get_core(struct cpu_map *map, int idx);
 int cpu_map__build_socket_map(struct cpu_map *cpus, struct cpu_map **sockp);
 int cpu_map__build_core_map(struct cpu_map *cpus, struct cpu_map **corep);
+
+struct cpu_map *cpu_map__get(struct cpu_map *map);
+void cpu_map__put(struct cpu_map *map);
 
 static inline int cpu_map__socket(struct cpu_map *sock, int s)
 {
+3 -3
tools/perf/util/event.c
···
 	for (thread = 0; thread < threads->nr; ++thread) {
 		if (__event__synthesize_thread(comm_event, mmap_event,
 					       fork_event,
-					       threads->map[thread], 0,
+					       thread_map__pid(threads, thread), 0,
 					       process, tool, machine,
 					       mmap_data, proc_map_timeout)) {
 			err = -1;
···
 		 * comm.pid is set to thread group id by
 		 * perf_event__synthesize_comm
 		 */
-		if ((int) comm_event->comm.pid != threads->map[thread]) {
+		if ((int) comm_event->comm.pid != thread_map__pid(threads, thread)) {
 			bool need_leader = true;
 
 			/* is thread group leader in thread_map? */
 			for (j = 0; j < threads->nr; ++j) {
-				if ((int) comm_event->comm.pid == threads->map[j]) {
+				if ((int) comm_event->comm.pid == thread_map__pid(threads, j)) {
 					need_leader = false;
 					break;
 				}
+32 -7
tools/perf/util/evlist.c
···
 {
 	perf_evlist__munmap(evlist);
 	perf_evlist__close(evlist);
-	cpu_map__delete(evlist->cpus);
-	thread_map__delete(evlist->threads);
+	cpu_map__put(evlist->cpus);
+	thread_map__put(evlist->threads);
 	evlist->cpus = NULL;
 	evlist->threads = NULL;
 	perf_evlist__purge(evlist);
···
 	else
 		sid->cpu = -1;
 	if (!evsel->system_wide && evlist->threads && thread >= 0)
-		sid->tid = evlist->threads->map[thread];
+		sid->tid = thread_map__pid(evlist->threads, thread);
 	else
 		sid->tid = -1;
 }
···
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
+static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
+				       struct target *target)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		/*
+		 * We already have cpus for evsel (via PMU sysfs) so
+		 * keep it, if there's no target cpu list defined.
+		 */
+		if (evsel->cpus && target->cpu_list)
+			cpu_map__put(evsel->cpus);
+
+		if (!evsel->cpus || target->cpu_list)
+			evsel->cpus = cpu_map__get(evlist->cpus);
+
+		evsel->threads = thread_map__get(evlist->threads);
+
+		if (!evsel->cpus || !evsel->threads)
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
 	evlist->threads = thread_map__new_str(target->pid, target->tid,
···
 	if (evlist->cpus == NULL)
 		goto out_delete_threads;
 
-	return 0;
+	return perf_evlist__propagate_maps(evlist, target);
 
 out_delete_threads:
-	thread_map__delete(evlist->threads);
+	thread_map__put(evlist->threads);
 	evlist->threads = NULL;
 	return -1;
 }
···
 out:
 	return err;
 out_free_cpus:
-	cpu_map__delete(evlist->cpus);
+	cpu_map__put(evlist->cpus);
 	evlist->cpus = NULL;
 	goto out;
 }
···
 			  __func__, __LINE__);
 			goto out_close_pipes;
 		}
-		evlist->threads->map[0] = evlist->workload.pid;
+		thread_map__set_pid(evlist->threads, 0, evlist->workload.pid);
 	}
 
 	close(child_ready_pipe[1]);
-1
tools/perf/util/evlist.h
···
 
 void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 				     struct perf_evsel *tracking_evsel);
-
 #endif /* __PERF_EVLIST_H */
+14 -14
tools/perf/util/evsel.c
···
 	perf_evsel__free_fd(evsel);
 	perf_evsel__free_id(evsel);
 	close_cgroup(evsel->cgrp);
+	cpu_map__put(evsel->cpus);
+	thread_map__put(evsel->threads);
 	zfree(&evsel->group_name);
 	zfree(&evsel->name);
 	perf_evsel__object.fini(evsel);
···
 	free(evsel);
 }
 
-void perf_evsel__compute_deltas(struct perf_evsel *evsel, int cpu,
+void perf_evsel__compute_deltas(struct perf_evsel *evsel, int cpu, int thread,
 				struct perf_counts_values *count)
 {
 	struct perf_counts_values tmp;
···
 		tmp = evsel->prev_raw_counts->aggr;
 		evsel->prev_raw_counts->aggr = *count;
 	} else {
-		tmp = evsel->prev_raw_counts->cpu[cpu];
-		evsel->prev_raw_counts->cpu[cpu] = *count;
+		tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
+		*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
 	}
 
 	count->val = count->val - tmp.val;
···
 	*pscaled = scaled;
 }
 
-int perf_evsel__read_cb(struct perf_evsel *evsel, int cpu, int thread,
-			perf_evsel__read_cb_t cb)
+int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
+		     struct perf_counts_values *count)
 {
-	struct perf_counts_values count;
-
-	memset(&count, 0, sizeof(count));
+	memset(count, 0, sizeof(*count));
 
 	if (FD(evsel, cpu, thread) < 0)
 		return -EINVAL;
 
-	if (readn(FD(evsel, cpu, thread), &count, sizeof(count)) < 0)
+	if (readn(FD(evsel, cpu, thread), count, sizeof(*count)) < 0)
 		return -errno;
 
-	return cb(evsel, cpu, thread, &count);
+	return 0;
 }
 
 int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
···
 	if (FD(evsel, cpu, thread) < 0)
 		return -EINVAL;
 
-	if (evsel->counts == NULL && perf_evsel__alloc_counts(evsel, cpu + 1) < 0)
+	if (evsel->counts == NULL && perf_evsel__alloc_counts(evsel, cpu + 1, thread + 1) < 0)
 		return -ENOMEM;
 
 	if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) < 0)
 		return -errno;
 
-	perf_evsel__compute_deltas(evsel, cpu, &count);
+	perf_evsel__compute_deltas(evsel, cpu, thread, &count);
 	perf_counts_values__scale(&count, scale, NULL);
-	evsel->counts->cpu[cpu] = count;
+	*perf_counts(evsel->counts, cpu, thread) = count;
 	return 0;
 }
 
···
 			int group_fd;
 
 			if (!evsel->cgrp && !evsel->system_wide)
-				pid = threads->map[thread];
+				pid = thread_map__pid(threads, thread);
 
 			group_fd = get_group_fd(evsel, cpu, thread);
 retry_open:
+16 -24
tools/perf/util/evsel.h
···
 #include <linux/types.h>
 #include "xyarray.h"
 #include "symbol.h"
-
-struct perf_counts_values {
-	union {
-		struct {
-			u64 val;
-			u64 ena;
-			u64 run;
-		};
-		u64 values[3];
-	};
-};
-
-struct perf_counts {
-	s8 scaled;
-	struct perf_counts_values aggr;
-	struct perf_counts_values cpu[];
-};
+#include "cpumap.h"
+#include "stat.h"
 
 struct perf_evsel;
 
···
 	struct cgroup_sel *cgrp;
 	void *handler;
 	struct cpu_map *cpus;
+	struct thread_map *threads;
 	unsigned int sample_size;
 	int id_pos;
 	int is_pos;
···
 struct perf_evlist;
 struct record_opts;
 
+static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel)
+{
+	return evsel->cpus;
+}
+
+static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel)
+{
+	return perf_evsel__cpus(evsel)->nr;
+}
+
 void perf_counts_values__scale(struct perf_counts_values *count,
 			       bool scale, s8 *pscaled);
 
-void perf_evsel__compute_deltas(struct perf_evsel *evsel, int cpu,
+void perf_evsel__compute_deltas(struct perf_evsel *evsel, int cpu, int thread,
 				struct perf_counts_values *count);
 
 int perf_evsel__object_config(size_t object_size,
···
 	(a)->attr.type == (b)->attr.type && \
 	(a)->attr.config == (b)->attr.config)
 
-typedef int (perf_evsel__read_cb_t)(struct perf_evsel *evsel,
-				    int cpu, int thread,
-				    struct perf_counts_values *count);
-
-int perf_evsel__read_cb(struct perf_evsel *evsel, int cpu, int thread,
-			perf_evsel__read_cb_t cb);
+int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
+		     struct perf_counts_values *count);
 
 int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
 			      int cpu, int thread, bool scale);
+1 -2
tools/perf/util/header.c
···
 	free(buf);
 	return events;
 error:
-	if (events)
-		free_event_desc(events);
+	free_event_desc(events);
 	events = NULL;
 	goto out;
 }
+1 -2
tools/perf/util/machine.c
···
 	case PERF_RECORD_AUX:
 		ret = machine__process_aux_event(machine, event); break;
 	case PERF_RECORD_ITRACE_START:
-		ret = machine__process_itrace_start_event(machine, event);
+		ret = machine__process_itrace_start_event(machine, event); break;
 	case PERF_RECORD_LOST_SAMPLES:
 		ret = machine__process_lost_samples_event(machine, event, sample); break;
-		break;
 	default:
 		ret = -1;
 		break;
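The hunk above adds a `break` that was missing after the `PERF_RECORD_ITRACE_START` case, so those events were also handled as `PERF_RECORD_LOST_SAMPLES`. The effect of such a fall-through can be shown with a small self-contained sketch; the enum and both dispatch functions are hypothetical and only record which "handlers" ran as bits.

```c
#include <assert.h>

enum rec { REC_AUX, REC_ITRACE_START, REC_LOST_SAMPLES };

/* Before the fix: the ITRACE_START case runs into the LOST_SAMPLES case. */
static int dispatch_buggy(enum rec type)
{
	int ran = 0;

	switch (type) {
	case REC_AUX:
		ran |= 1;
		break;
	case REC_ITRACE_START:
		ran |= 2;
		/* fall through */
	case REC_LOST_SAMPLES:
		ran |= 4;
		break;
	}
	return ran;
}

/* After the fix: every case ends in its own break. */
static int dispatch_fixed(enum rec type)
{
	int ran = 0;

	switch (type) {
	case REC_AUX:
		ran |= 1;
		break;
	case REC_ITRACE_START:
		ran |= 2;
		break;
	case REC_LOST_SAMPLES:
		ran |= 4;
		break;
	}
	return ran;
}
```

With `REC_ITRACE_START` as input, the first variant reports both handler bits and the second only one, which mirrors the behaviour change described in the merge message.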
+4 -1
tools/perf/util/parse-events.c
···
 #include "parse-events-flex.h"
 #include "pmu.h"
 #include "thread_map.h"
+#include "cpumap.h"
 #include "asm/bug.h"
 
 #define MAX_NAME_LEN 100
···
 	if (!evsel)
 		return NULL;
 
-	evsel->cpus = cpus;
+	if (cpus)
+		evsel->cpus = cpu_map__get(cpus);
+
 	if (name)
 		evsel->name = strdup(name);
 	list_add_tail(&evsel->node, list);
+2 -3
tools/perf/util/parse-events.l
···
 num_dec [0-9]+
 num_hex 0x[a-fA-F0-9]+
 num_raw_hex [a-fA-F0-9]+
-name [a-zA-Z_*?][a-zA-Z0-9_*?]*
-name_minus [a-zA-Z_*?][a-zA-Z0-9\-_*?]*
+name [a-zA-Z_*?][a-zA-Z0-9_*?.]*
+name_minus [a-zA-Z_*?][a-zA-Z0-9\-_*?.]*
 /* If you add a modifier you need to update check_modifier() */
 modifier_event [ukhpGHSDI]+
 modifier_bp [rwx]{1,3}
···
 			return PE_EVENT_NAME;
 		}
 
-. |
 <<EOF>> {
 			BEGIN(INITIAL);
 			REWIND(0);
+29 -16
tools/perf/util/pmu.c
···
 #include <linux/list.h>
+#include <linux/compiler.h>
 #include <sys/types.h>
 #include <unistd.h>
 #include <stdio.h>
···
 	return 0;
 }
 
-static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, FILE *file)
+static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
+				 char *desc __maybe_unused, char *val)
 {
 	struct perf_pmu_alias *alias;
-	char buf[256];
 	int ret;
-
-	ret = fread(buf, 1, sizeof(buf), file);
-	if (ret == 0)
-		return -EINVAL;
-	buf[ret] = 0;
 
 	alias = malloc(sizeof(*alias));
 	if (!alias)
···
 	alias->unit[0] = '\0';
 	alias->per_pkg = false;
 
-	ret = parse_events_terms(&alias->terms, buf);
+	ret = parse_events_terms(&alias->terms, val);
 	if (ret) {
+		pr_err("Cannot parse alias %s: %d\n", val, ret);
 		free(alias);
 		return ret;
 	}
 
 	alias->name = strdup(name);
-	/*
-	 * load unit name and scale if available
-	 */
-	perf_pmu__parse_unit(alias, dir, name);
-	perf_pmu__parse_scale(alias, dir, name);
-	perf_pmu__parse_per_pkg(alias, dir, name);
-	perf_pmu__parse_snapshot(alias, dir, name);
+	if (dir) {
+		/*
+		 * load unit name and scale if available
+		 */
+		perf_pmu__parse_unit(alias, dir, name);
+		perf_pmu__parse_scale(alias, dir, name);
+		perf_pmu__parse_per_pkg(alias, dir, name);
+		perf_pmu__parse_snapshot(alias, dir, name);
+	}
 
 	list_add_tail(&alias->list, list);
 
 	return 0;
+}
+
+static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, FILE *file)
+{
+	char buf[256];
+	int ret;
+
+	ret = fread(buf, 1, sizeof(buf), file);
+	if (ret == 0)
+		return -EINVAL;
+
+	buf[ret] = 0;
+
+	return __perf_pmu__new_alias(list, dir, name, NULL, buf);
 }
 
 static inline bool pmu_alias_info_file(char *name)
···
 	return cpus;
 }
 
-struct perf_event_attr *__attribute__((weak))
+struct perf_event_attr * __weak
 perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
 {
 	return NULL;
+5 -1
tools/perf/util/probe-event.c
···
 static bool kprobe_blacklist__listed(unsigned long address);
 static bool kprobe_warn_out_range(const char *symbol, unsigned long address)
 {
+	u64 etext_addr;
+
 	/* Get the address of _etext for checking non-probable text symbol */
-	if (kernel_get_symbol_address_by_name("_etext", false) < address)
+	etext_addr = kernel_get_symbol_address_by_name("_etext", false);
+
+	if (etext_addr != 0 && etext_addr < address)
 		pr_warning("%s is out of .text, skip it.\n", symbol);
 	else if (kprobe_blacklist__listed(address))
 		pr_warning("%s is blacklisted function, skip it.\n", symbol);
+1
tools/perf/util/python-ext-sources
···
 util/strlist.c
 util/trace-event.c
 ../../lib/rbtree.c
+util/string.c
+2 -2
tools/perf/util/python.c
···
 
 static void pyrf_cpu_map__delete(struct pyrf_cpu_map *pcpus)
 {
-	cpu_map__delete(pcpus->cpus);
+	cpu_map__put(pcpus->cpus);
 	pcpus->ob_type->tp_free((PyObject*)pcpus);
 }
 
···
 
 static void pyrf_thread_map__delete(struct pyrf_thread_map *pthreads)
 {
-	thread_map__delete(pthreads->threads);
+	thread_map__put(pthreads->threads);
 	pthreads->ob_type->tp_free((PyObject*)pthreads);
 }
 
+2 -2
tools/perf/util/record.c
···
 	if (!cpus)
 		return false;
 	cpu = cpus->map[0];
-	cpu_map__delete(cpus);
+	cpu_map__put(cpus);
 
 	do {
 		ret = perf_do_probe_api(fn, cpu, try[i++]);
···
 		struct cpu_map *cpus = cpu_map__new(NULL);
 
 		cpu = cpus ? cpus->map[0] : 0;
-		cpu_map__delete(cpus);
+		cpu_map__put(cpus);
 	} else {
 		cpu = evlist->cpus->map[0];
 	}
+4 -2
tools/perf/util/session.c
···
 			       union perf_event *event __maybe_unused,
 			       struct ordered_events *oe)
 {
+	if (dump_trace)
+		fprintf(stdout, "\n");
 	return ordered_events__flush(oe, OE_FLUSH__ROUND);
 }
 
···
 	if (perf_header__has_feat(&session->header, HEADER_AUXTRACE))
 		msg = " (excludes AUX area (e.g. instruction trace) decoded / synthesized events)";
 
-	ret = fprintf(fp, "Aggregated stats:%s\n", msg);
+	ret = fprintf(fp, "\nAggregated stats:%s\n", msg);
 
 	ret += events_stats__fprintf(&session->evlist->stats, fp);
 	return ret;
···
 	err = 0;
 
 out_delete_map:
-	cpu_map__delete(map);
+	cpu_map__put(map);
 	return err;
 }
 
+120 -12
tools/perf/util/stat.c
···
 #include <math.h>
 #include "stat.h"
+#include "evlist.h"
 #include "evsel.h"
+#include "thread_map.h"
 
 void update_stats(struct stats *stats, u64 val)
 {
···
 	}
 }
 
-struct perf_counts *perf_counts__new(int ncpus)
+struct perf_counts *perf_counts__new(int ncpus, int nthreads)
 {
-	int size = sizeof(struct perf_counts) +
-		   ncpus * sizeof(struct perf_counts_values);
+	struct perf_counts *counts = zalloc(sizeof(*counts));
 
-	return zalloc(size);
+	if (counts) {
+		struct xyarray *values;
+
+		values = xyarray__new(ncpus, nthreads, sizeof(struct perf_counts_values));
+		if (!values) {
+			free(counts);
+			return NULL;
+		}
+
+		counts->values = values;
+	}
+
+	return counts;
 }
 
 void perf_counts__delete(struct perf_counts *counts)
 {
-	free(counts);
+	if (counts) {
+		xyarray__delete(counts->values);
+		free(counts);
+	}
 }
 
-static void perf_counts__reset(struct perf_counts *counts, int ncpus)
+static void perf_counts__reset(struct perf_counts *counts)
 {
-	memset(counts, 0, (sizeof(*counts) +
-	       (ncpus * sizeof(struct perf_counts_values))));
+	xyarray__reset(counts->values);
 }
 
-void perf_evsel__reset_counts(struct perf_evsel *evsel, int ncpus)
+void perf_evsel__reset_counts(struct perf_evsel *evsel)
 {
-	perf_counts__reset(evsel->counts, ncpus);
+	perf_counts__reset(evsel->counts);
 }
 
-int perf_evsel__alloc_counts(struct perf_evsel *evsel, int ncpus)
+int perf_evsel__alloc_counts(struct perf_evsel *evsel, int ncpus, int nthreads)
 {
-	evsel->counts = perf_counts__new(ncpus);
+	evsel->counts = perf_counts__new(ncpus, nthreads);
 	return evsel->counts != NULL ? 0 : -ENOMEM;
 }
 
···
 {
 	perf_counts__delete(evsel->counts);
 	evsel->counts = NULL;
+}
+
+void perf_evsel__reset_stat_priv(struct perf_evsel *evsel)
+{
+	int i;
+	struct perf_stat *ps = evsel->priv;
+
+	for (i = 0; i < 3; i++)
+		init_stats(&ps->res_stats[i]);
+
+	perf_stat_evsel_id_init(evsel);
+}
+
+int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel)
+{
+	evsel->priv = zalloc(sizeof(struct perf_stat));
+	if (evsel->priv == NULL)
+		return -ENOMEM;
+	perf_evsel__reset_stat_priv(evsel);
+	return 0;
+}
+
+void perf_evsel__free_stat_priv(struct perf_evsel *evsel)
+{
+	zfree(&evsel->priv);
+}
+
+int perf_evsel__alloc_prev_raw_counts(struct perf_evsel *evsel,
+				      int ncpus, int nthreads)
+{
+	struct perf_counts *counts;
+
+	counts = perf_counts__new(ncpus, nthreads);
+	if (counts)
+		evsel->prev_raw_counts = counts;
+
+	return counts ? 0 : -ENOMEM;
+}
+
+void perf_evsel__free_prev_raw_counts(struct perf_evsel *evsel)
+{
+	perf_counts__delete(evsel->prev_raw_counts);
+	evsel->prev_raw_counts = NULL;
+}
+
+int perf_evsel__alloc_stats(struct perf_evsel *evsel, bool alloc_raw)
+{
+	int ncpus = perf_evsel__nr_cpus(evsel);
+	int nthreads = thread_map__nr(evsel->threads);
+
+	if (perf_evsel__alloc_stat_priv(evsel) < 0 ||
+	    perf_evsel__alloc_counts(evsel, ncpus, nthreads) < 0 ||
+	    (alloc_raw && perf_evsel__alloc_prev_raw_counts(evsel, ncpus, nthreads) < 0))
+		return -ENOMEM;
+
+	return 0;
+}
+
+int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (perf_evsel__alloc_stats(evsel, alloc_raw))
+			goto out_free;
+	}
+
+	return 0;
+
+out_free:
+	perf_evlist__free_stats(evlist);
+	return -1;
+}
+
+void perf_evlist__free_stats(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		perf_evsel__free_stat_priv(evsel);
+		perf_evsel__free_counts(evsel);
+		perf_evsel__free_prev_raw_counts(evsel);
+	}
+}
+
+void perf_evlist__reset_stats(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		perf_evsel__reset_stat_priv(evsel);
+		perf_evsel__reset_counts(evsel);
+	}
 }
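The stat.c changes above replace the per-CPU flexible array in `struct perf_counts` with a two-dimensional table indexed by (cpu, thread), which is what makes per-thread counts possible. A rough standalone sketch of that layout follows; it assumes a plain row-major `calloc()` buffer rather than perf's `struct xyarray`, and `struct counts` with its helpers are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct counts_values {
	uint64_t val, ena, run;
};

/* Hypothetical 2-D table standing in for perf's struct xyarray. */
struct counts {
	int ncpus, nthreads;
	struct counts_values *values;	/* ncpus * nthreads entries, row-major */
};

static struct counts *counts__new(int ncpus, int nthreads)
{
	struct counts *c;

	assert(ncpus > 0 && nthreads > 0);

	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;

	c->values = calloc((size_t)ncpus * (size_t)nthreads, sizeof(*c->values));
	if (!c->values) {
		free(c);
		return NULL;
	}

	c->ncpus = ncpus;
	c->nthreads = nthreads;
	return c;
}

/* Address of the entry for one (cpu, thread) pair; no bounds checking. */
static struct counts_values *counts__entry(struct counts *c, int cpu, int thread)
{
	return &c->values[(size_t)cpu * (size_t)c->nthreads + (size_t)thread];
}

static void counts__delete(struct counts *c)
{
	if (c) {
		free(c->values);
		free(c);
	}
}
```

An accessor that returns a pointer, like `perf_counts()` in the diff, lets callers both read and assign an entry with the same expression, e.g. `*counts__entry(c, cpu, thread) = value`.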
+44 -3
tools/perf/util/stat.h
···
 
 #include <linux/types.h>
 #include <stdio.h>
+#include "xyarray.h"
 
 struct stats
 {
···
 	AGGR_GLOBAL,
 	AGGR_SOCKET,
 	AGGR_CORE,
+	AGGR_THREAD,
 };
+
+struct perf_counts_values {
+	union {
+		struct {
+			u64 val;
+			u64 ena;
+			u64 run;
+		};
+		u64 values[3];
+	};
+};
+
+struct perf_counts {
+	s8 scaled;
+	struct perf_counts_values aggr;
+	struct xyarray *values;
+};
+
+static inline struct perf_counts_values*
+perf_counts(struct perf_counts *counts, int cpu, int thread)
+{
+	return xyarray__entry(counts->values, cpu, thread);
+}
 
 void update_stats(struct stats *stats, u64 val);
 double avg_stats(struct stats *stats);
···
 }
 
 struct perf_evsel;
+struct perf_evlist;
+
 bool __perf_evsel_stat__is(struct perf_evsel *evsel,
 			   enum perf_stat_evsel_id id);
 
···
 void perf_stat__print_shadow_stats(FILE *out, struct perf_evsel *evsel,
 				   double avg, int cpu, enum aggr_mode aggr);
 
-struct perf_counts *perf_counts__new(int ncpus);
+struct perf_counts *perf_counts__new(int ncpus, int nthreads);
 void perf_counts__delete(struct perf_counts *counts);
 
-void perf_evsel__reset_counts(struct perf_evsel *evsel, int ncpus);
-int perf_evsel__alloc_counts(struct perf_evsel *evsel, int ncpus);
+void perf_evsel__reset_counts(struct perf_evsel *evsel);
+int perf_evsel__alloc_counts(struct perf_evsel *evsel, int ncpus, int nthreads);
 void perf_evsel__free_counts(struct perf_evsel *evsel);
+
+void perf_evsel__reset_stat_priv(struct perf_evsel *evsel);
+int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel);
+void perf_evsel__free_stat_priv(struct perf_evsel *evsel);
+
+int perf_evsel__alloc_prev_raw_counts(struct perf_evsel *evsel,
+				      int ncpus, int nthreads);
+void perf_evsel__free_prev_raw_counts(struct perf_evsel *evsel);
+
+int perf_evsel__alloc_stats(struct perf_evsel *evsel, bool alloc_raw);
+
+int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);
+void perf_evlist__free_stats(struct perf_evlist *evlist);
+void perf_evlist__reset_stats(struct perf_evlist *evlist);
 #endif
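The stat.h hunk above introduces a value triple whose named fields alias a three-element array, plus a per-(cpu, thread) lookup. The following is a minimal standalone sketch of that layout, not the perf implementation: `counts_values`, `counts_grid` and its helpers are hypothetical names, and a flat row-major array stands in for `xyarray`, whose internals are not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same layout idea as perf_counts_values: three named fields
 * aliased by a three-element array (anonymous struct, C11). */
struct counts_values {
	union {
		struct {
			uint64_t val;
			uint64_t ena;
			uint64_t run;
		};
		uint64_t values[3];
	};
};

/* Hypothetical stand-in for xyarray: a flat, row-major cpu x thread grid. */
struct counts_grid {
	int ncpus;
	int nthreads;
	struct counts_values *cells;
};

/* Allocate a zeroed grid; returns NULL on allocation failure. */
static struct counts_grid *counts_grid__new(int ncpus, int nthreads)
{
	struct counts_grid *g = calloc(1, sizeof(*g));

	if (!g)
		return NULL;

	g->cells = calloc((size_t)ncpus * nthreads, sizeof(*g->cells));
	if (!g->cells) {
		free(g);
		return NULL;
	}
	g->ncpus = ncpus;
	g->nthreads = nthreads;
	return g;
}

static void counts_grid__delete(struct counts_grid *g)
{
	if (g) {
		free(g->cells);
		free(g);
	}
}

/* Analogous in spirit to perf_counts(): address one cell by (cpu, thread). */
static struct counts_values *
counts_grid__entry(struct counts_grid *g, int cpu, int thread)
{
	return &g->cells[(size_t)cpu * g->nthreads + thread];
}
```

With this layout, code that wants to loop over all three numbers can use `values[i]`, while code that cares about one of them can use `val`, `ena` or `run` directly; both views refer to the same storage.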
+1 -1
tools/perf/util/svghelper.c
···
 		set_bit(c, cpumask_bits(b));
 	}
 
-	cpu_map__delete(m);
+	cpu_map__put(m);
 
 	return ret;
 }
+4 -1
tools/perf/util/symbol.c
···
 	INIT_LIST_HEAD(&md.maps);
 
 	fd = open(kcore_filename, O_RDONLY);
-	if (fd < 0)
+	if (fd < 0) {
+		pr_err("%s requires CAP_SYS_RAWIO capability to access.\n",
+			kcore_filename);
 		return -EINVAL;
+	}
 
 	/* Read new maps into temporary lists */
 	err = file__read_maps(fd, md.type == MAP__FUNCTION, kcore_mapfn, &md,
+117 -15
tools/perf/util/thread_map.c
···
 #include <unistd.h>
 #include "strlist.h"
 #include <string.h>
+#include <api/fs/fs.h>
+#include "asm/bug.h"
 #include "thread_map.h"
 #include "util.h"
+#include "debug.h"
 
 /* Skip "." and ".." directories */
 static int filter(const struct dirent *dir)
···
 	return 1;
 }
 
+static void thread_map__reset(struct thread_map *map, int start, int nr)
+{
+	size_t size = (nr - start) * sizeof(map->map[0]);
+
+	memset(&map->map[start], 0, size);
+}
+
 static struct thread_map *thread_map__realloc(struct thread_map *map, int nr)
 {
-	size_t size = sizeof(*map) + sizeof(pid_t) * nr;
+	size_t size = sizeof(*map) + sizeof(map->map[0]) * nr;
+	int start = map ? map->nr : 0;
 
-	return realloc(map, size);
+	map = realloc(map, size);
+	/*
+	 * We only realloc to add more items, let's reset new items.
+	 */
+	if (map)
+		thread_map__reset(map, start, nr);
+
+	return map;
 }
 
 #define thread_map__alloc(__nr) thread_map__realloc(NULL, __nr)
···
 	threads = thread_map__alloc(items);
 	if (threads != NULL) {
 		for (i = 0; i < items; i++)
-			threads->map[i] = atoi(namelist[i]->d_name);
+			thread_map__set_pid(threads, i, atoi(namelist[i]->d_name));
 		threads->nr = items;
+		atomic_set(&threads->refcnt, 1);
 	}
 
 	for (i=0; i<items; i++)
···
 	struct thread_map *threads = thread_map__alloc(1);
 
 	if (threads != NULL) {
-		threads->map[0] = tid;
-		threads->nr = 1;
+		thread_map__set_pid(threads, 0, tid);
+		threads->nr = 1;
+		atomic_set(&threads->refcnt, 1);
 	}
 
 	return threads;
···
 		goto out_free_threads;
 
 	threads->nr = 0;
+	atomic_set(&threads->refcnt, 1);
 
 	while (!readdir_r(proc, &dirent, &next) && next) {
 		char *end;
···
 			threads = tmp;
 		}
 
-		for (i = 0; i < items; i++)
-			threads->map[threads->nr + i] = atoi(namelist[i]->d_name);
+		for (i = 0; i < items; i++) {
+			thread_map__set_pid(threads, threads->nr + i,
+					    atoi(namelist[i]->d_name));
+		}
 
 		for (i = 0; i < items; i++)
 			zfree(&namelist[i]);
···
 		threads = nt;
 
 		for (i = 0; i < items; i++) {
-			threads->map[j++] = atoi(namelist[i]->d_name);
+			thread_map__set_pid(threads, j++, atoi(namelist[i]->d_name));
 			zfree(&namelist[i]);
 		}
 		threads->nr = total_tasks;
···
 
 out:
 	strlist__delete(slist);
+	if (threads)
+		atomic_set(&threads->refcnt, 1);
 	return threads;
 
 out_free_namelist:
···
 	struct thread_map *threads = thread_map__alloc(1);
 
 	if (threads != NULL) {
-		threads->map[0] = -1;
-		threads->nr = 1;
+		thread_map__set_pid(threads, 0, -1);
+		threads->nr = 1;
+		atomic_set(&threads->refcnt, 1);
 	}
 	return threads;
 }
···
 			goto out_free_threads;
 
 		threads = nt;
-		threads->map[ntasks - 1] = tid;
-		threads->nr = ntasks;
+		thread_map__set_pid(threads, ntasks - 1, tid);
+		threads->nr = ntasks;
 	}
 out:
+	if (threads)
+		atomic_set(&threads->refcnt, 1);
 	return threads;
 
 out_free_threads:
···
 	return thread_map__new_by_tid_str(tid);
 }
 
-void thread_map__delete(struct thread_map *threads)
+static void thread_map__delete(struct thread_map *threads)
 {
-	free(threads);
+	if (threads) {
+		int i;
+
+		WARN_ONCE(atomic_read(&threads->refcnt) != 0,
+			  "thread map refcnt unbalanced\n");
+		for (i = 0; i < threads->nr; i++)
+			free(thread_map__comm(threads, i));
+		free(threads);
+	}
+}
+
+struct thread_map *thread_map__get(struct thread_map *map)
+{
+	if (map)
+		atomic_inc(&map->refcnt);
+	return map;
+}
+
+void thread_map__put(struct thread_map *map)
+{
+	if (map && atomic_dec_and_test(&map->refcnt))
+		thread_map__delete(map);
 }
 
 size_t thread_map__fprintf(struct thread_map *threads, FILE *fp)
···
 	size_t printed = fprintf(fp, "%d thread%s: ",
 				 threads->nr, threads->nr > 1 ? "s" : "");
 	for (i = 0; i < threads->nr; ++i)
-		printed += fprintf(fp, "%s%d", i ? ", " : "", threads->map[i]);
+		printed += fprintf(fp, "%s%d", i ? ", " : "", thread_map__pid(threads, i));
 
 	return printed + fprintf(fp, "\n");
+}
+
+static int get_comm(char **comm, pid_t pid)
+{
+	char *path;
+	size_t size;
+	int err;
+
+	if (asprintf(&path, "%s/%d/comm", procfs__mountpoint(), pid) == -1)
+		return -ENOMEM;
+
+	err = filename__read_str(path, comm, &size);
+	if (!err) {
+		/*
+		 * We're reading 16 bytes, while filename__read_str
+		 * allocates data per BUFSIZ bytes, so we can safely
+		 * mark the end of the string.
+		 */
+		(*comm)[size] = 0;
+		rtrim(*comm);
+	}
+
+	free(path);
+	return err;
+}
+
+static void comm_init(struct thread_map *map, int i)
+{
+	pid_t pid = thread_map__pid(map, i);
+	char *comm = NULL;
+
+	/* dummy pid comm initialization */
+	if (pid == -1) {
+		map->map[i].comm = strdup("dummy");
+		return;
+	}
+
+	/*
+	 * The comm name is like extra bonus ;-),
+	 * so just warn if we fail for any reason.
+	 */
+	if (get_comm(&comm, pid))
+		pr_warning("Couldn't resolve comm name for pid %d\n", pid);
+
+	map->map[i].comm = comm;
+}
+
+void thread_map__read_comms(struct thread_map *threads)
+{
+	int i;
+
+	for (i = 0; i < threads->nr; ++i)
+		comm_init(threads, i);
 }
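The thread_map__realloc() change above grows a structure with a flexible array member and zeroes only the newly added tail, so fresh `comm` pointers start out NULL. A minimal standalone sketch of that technique follows. The names `item`, `item_map` and `item_map__grow` are hypothetical, and unlike the hunk above this sketch also sets `nr` to 0 on a fresh allocation and leaves the old block with the caller if realloc() fails; both are assumptions of the sketch, not behaviour of the perf code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical per-entry payload, mirroring the pid + comm pairing. */
struct item {
	pid_t pid;
	char *comm;
};

/* Header followed by a flexible array of entries. */
struct item_map {
	int nr;			/* number of entries currently in use */
	struct item map[];
};

/*
 * Grow (or freshly allocate, when map == NULL) room for nr entries and
 * zero every entry from the old in-use count up to nr. The caller is
 * expected to update map->nr once it has filled the new entries.
 * On failure NULL is returned and the old block is left untouched.
 */
static struct item_map *item_map__grow(struct item_map *map, int nr)
{
	int start = map ? map->nr : 0;
	size_t size = sizeof(*map) + sizeof(map->map[0]) * nr;
	struct item_map *tmp = realloc(map, size);

	if (!tmp)
		return NULL;

	if (!map)
		tmp->nr = 0;	/* assumption: fresh maps start empty */

	if (nr > start)
		memset(&tmp->map[start], 0, (nr - start) * sizeof(tmp->map[0]));

	return tmp;
}
```

Zeroing only the tail keeps existing entries intact across the grow, and it means a later cleanup loop can call free() on every `comm` pointer without tracking which entries were ever filled in.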
+28 -3
tools/perf/util/thread_map.h
···
 
 #include <sys/types.h>
 #include <stdio.h>
+#include <linux/atomic.h>
+
+struct thread_map_data {
+	pid_t pid;
+	char *comm;
+};
 
 struct thread_map {
+	atomic_t refcnt;
 	int nr;
-	pid_t map[];
+	struct thread_map_data map[];
 };
 
 struct thread_map *thread_map__new_dummy(void);
···
 struct thread_map *thread_map__new_by_uid(uid_t uid);
 struct thread_map *thread_map__new(pid_t pid, pid_t tid, uid_t uid);
 
+struct thread_map *thread_map__get(struct thread_map *map);
+void thread_map__put(struct thread_map *map);
+
 struct thread_map *thread_map__new_str(const char *pid,
 		const char *tid, uid_t uid);
-
-void thread_map__delete(struct thread_map *threads);
 
 size_t thread_map__fprintf(struct thread_map *threads, FILE *fp);
 
···
 	return threads ? threads->nr : 1;
 }
 
+static inline pid_t thread_map__pid(struct thread_map *map, int thread)
+{
+	return map->map[thread].pid;
+}
+
+static inline void
+thread_map__set_pid(struct thread_map *map, int thread, pid_t pid)
+{
+	map->map[thread].pid = pid;
+}
+
+static inline char *thread_map__comm(struct thread_map *map, int thread)
+{
+	return map->map[thread].comm;
+}
+
+void thread_map__read_comms(struct thread_map *threads);
 #endif /* __PERF_THREAD_MAP_H */
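The header change above replaces a public delete function with a get/put pair, so the object is freed only when the last holder drops its reference. The sketch below illustrates that ownership pattern in isolation. It uses C11 `<stdatomic.h>` as a stand-in for the `atomic_t` helpers used in the perf tree, and `obj`, `obj__new`, `obj__get`, `obj__put` and the `obj_deleted` counter are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct obj {
	atomic_int refcnt;
	int payload;
};

/* Hypothetical hook so the example can observe when an object is freed. */
static int obj_deleted;

/* A new object starts with one reference, owned by the creator. */
static struct obj *obj__new(int payload)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (o) {
		o->payload = payload;
		atomic_init(&o->refcnt, 1);
	}
	return o;
}

/* Take an additional reference; NULL is passed through unchanged. */
static struct obj *obj__get(struct obj *o)
{
	if (o)
		atomic_fetch_add(&o->refcnt, 1);
	return o;
}

/*
 * Drop one reference. atomic_fetch_sub() returns the previous value,
 * so a result of 1 means this call released the last reference.
 */
static void obj__put(struct obj *o)
{
	if (o && atomic_fetch_sub(&o->refcnt, 1) == 1) {
		free(o);
		obj_deleted++;
	}
}
```

A caller that stores the pointer somewhere long-lived takes its own reference with `obj__get()` and later releases it with `obj__put()`; the creator does the same with its initial reference, and neither side needs to know which of them will be last.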