Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

Merge tag 'trace-tools-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tool updates from Steven Rostedt:
"Specific for timerlat:

- Improve the output of timerlat top by adding a missing \n, and by
avoiding printing color-formatting characters where they are
translated to regular characters.

- Improve timerlat auto-analysis output by replacing '\t' with spaces
to avoid copy-and-paste issues when reporting problems.

- Make the user-space (-u) option the default, as it is the most
complete test. Add a -k option to use the in-kernel workload.

- On timerlat top and hist, add a summary with the overall results.
For instance, the minimum value for all CPUs, the overall average
and the maximum value from all CPUs.

- timerlat hist was printing initial values (i.e., 0 as max, and ~0
as min) if the trace stopped before the first Ret-User event. This
problem was fixed by printing the " - " no value string to the
output if that was the case.

For all RTLA tools:

- Add a --warm-up <seconds> option, allowing the workload to run for
<seconds> before starting to collect results.

- Add a --trace-buffer-size option, allowing the user to set the
tracing buffer size for -t option. This option is mainly useful for
reducing the trace file. Now rtla depends on libtracefs >= 1.6.

- Fix the -t [trace_file] parsing, now it does not require the '='
before the option parameter, and better handles the multiple ways a
user can pass the trace_file.txt"

* tag 'trace-tools-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Documentation: Fix -t, --trace
rtla: Fix -t\--trace[=file]
rtla/timerlat: Fix histogram report when a cpu count is 0
rtla: Add --trace-buffer-size option
rtla/timerlat: Make user-space threads the default
rtla: Add the --warm-up option
rtla/timerlat: Add a summary for hist mode
rtla/timerlat: Add a summary for top mode
rtla/timerlat: Use pretty formatting only on interactive tty
rtla/auto-analysis: Replace \t with spaces
rtla/timerlat: Simplify "no value" printing on top

+653 -153
+7 -4
Documentation/tools/rtla/common_options.rst
··· 14 14 15 15 Print debug info. 16 16 17 - **-t**, **--trace**\[*=file*] 18 - 19 - Save the stopped trace to [*file|osnoise_trace.txt*]. 20 - 21 17 **-e**, **--event** *sys:event* 22 18 23 19 Enable an event in the trace (**-t**) session. The argument can be a specific event, e.g., **-e** *sched:sched_switch*, or all events of a system group, e.g., **-e** *sched*. Multiple **-e** are allowed. It is only active when **-t** or **-a** are set. ··· 45 49 **-C**, **--cgroup**\[*=cgroup*] 46 50 47 51 Set a *cgroup* to the tracer's threads. If the **-C** option is passed without arguments, the tracer's thread will inherit **rtla**'s *cgroup*. Otherwise, the threads will be placed on the *cgroup* passed to the option. 52 + 53 + **--warm-up** *s* 54 + 55 + After starting the workload, let it run for *s* seconds before starting collecting the data, allowing the system to warm-up. Statistical data generated during warm-up is discarded. 56 + 57 + **--trace-buffer-size** *kB* 58 + Set the per-cpu trace buffer size in kB for the tracing output. 48 59 49 60 **-h**, **--help** 50 61
+4
Documentation/tools/rtla/common_osnoise_options.rst
··· 25 25 26 26 Specify the minimum delta between two time reads to be considered noise. 27 27 The default threshold is *5 us*. 28 + 29 + **-t**, **--trace** \[*file*] 30 + 31 + Save the stopped trace to [*file|osnoise_trace.txt*].
+9 -1
Documentation/tools/rtla/common_timerlat_options.rst
··· 22 22 Save the stack trace at the *IRQ* if a *Thread* latency is higher than the 23 23 argument in us. 24 24 25 + **-t**, **--trace** \[*file*] 26 + 27 + Save the stopped trace to [*file|timerlat_trace.txt*]. 28 + 25 29 **--dma-latency** *us* 26 30 Set the /dev/cpu_dma_latency to *us*, aiming to bound exit from idle latencies. 27 31 *cyclictest* sets this value to *0* by default, use **--dma-latency** *0* to have 28 32 similar results. 33 + 34 + **-k**, **--kernel-threads** 35 + 36 + Use timerlat kernel-space threads, in contrast of **-u**. 29 37 30 38 **-u**, **--user-threads** 31 39 32 40 Set timerlat to run without a workload, and then dispatches user-space workloads 33 41 to wait on the timerlat_fd. Once the workload is awakes, it goes to sleep again 34 42 adding so the measurement for the kernel-to-user and user-to-kernel to the tracer 35 - output. 43 + output. **--user-threads** will be used unless the user specify **-k**. 36 44 37 45 **-U**, **--user-load** 38 46
+1 -1
tools/tracing/rtla/Makefile.config
··· 3 3 STOP_ERROR := 4 4 5 5 LIBTRACEEVENT_MIN_VERSION = 1.5 6 - LIBTRACEFS_MIN_VERSION = 1.3 6 + LIBTRACEFS_MIN_VERSION = 1.6 7 7 8 8 define lib_setup 9 9 $(eval LIB_INCLUDES += $(shell sh -c "$(PKG_CONFIG) --cflags lib$(1)"))
+47 -8
tools/tracing/rtla/src/osnoise_hist.c
··· 36 36 cpu_set_t hk_cpu_set; 37 37 struct sched_attr sched_param; 38 38 struct trace_events *events; 39 - 40 39 char no_header; 41 40 char no_summary; 42 41 char no_index; 43 42 char with_zeros; 44 43 int bucket_size; 45 44 int entries; 45 + int warmup; 46 + int buffer_size; 46 47 }; 47 48 48 49 struct osnoise_hist_cpu { ··· 437 436 static const char * const msg[] = { 438 437 "", 439 438 " usage: rtla osnoise hist [-h] [-D] [-d s] [-a us] [-p us] [-r us] [-s us] [-S us] \\", 440 - " [-T us] [-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] \\", 439 + " [-T us] [-t[file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] \\", 441 440 " [-c cpu-list] [-H cpu-list] [-P priority] [-b N] [-E N] [--no-header] [--no-summary] \\", 442 - " [--no-index] [--with-zeros] [-C[=cgroup_name]]", 441 + " [--no-index] [--with-zeros] [-C[=cgroup_name]] [--warm-up]", 443 442 "", 444 443 " -h/--help: print this menu", 445 444 " -a/--auto: set automatic trace mode, stopping the session if argument in us sample is hit", ··· 453 452 " -C/--cgroup[=cgroup_name]: set cgroup, if no cgroup_name is passed, the rtla's cgroup will be inherited", 454 453 " -d/--duration time[s|m|h|d]: duration of the session", 455 454 " -D/--debug: print debug info", 456 - " -t/--trace[=file]: save the stopped trace to [file|osnoise_trace.txt]", 455 + " -t/--trace[file]: save the stopped trace to [file|osnoise_trace.txt]", 457 456 " -e/--event <sys:event>: enable the <sys:event> in the trace instance, multiple -e are allowed", 458 457 " --filter <filter>: enable a trace event filter to the previous -e event", 459 458 " --trigger <trigger>: enable a trace event trigger to the previous -e event", ··· 469 468 " f:prio - use SCHED_FIFO with prio", 470 469 " d:runtime[us|ms|s]:period[us|ms|s] - use SCHED_DEADLINE with runtime and period", 471 470 " in nanoseconds", 471 + " --warm-up: let the workload run for s seconds before collecting data", 472 + " --trace-buffer-size kB: set the per-cpu trace buffer size in kB", 472 473 NULL, 473 474 }; 474 475 ··· 534 531 {"with-zeros", no_argument, 0, '3'}, 535 532 {"trigger", required_argument, 0, '4'}, 536 533 {"filter", required_argument, 0, '5'}, 534 + {"warm-up", required_argument, 0, '6'}, 535 + {"trace-buffer-size", required_argument, 0, '7'}, 537 536 {0, 0, 0, 0} 538 537 }; 539 538 540 539 /* getopt_long stores the option index here. */ 541 540 int option_index = 0; 542 541 543 - c = getopt_long(argc, argv, "a:c:C::b:d:e:E:DhH:p:P:r:s:S:t::T:01234:5:", 542 + c = getopt_long(argc, argv, "a:c:C::b:d:e:E:DhH:p:P:r:s:S:t::T:01234:5:6:7:", 544 543 long_options, &option_index); 545 544 546 545 /* detect the end of the options. */ ··· 645 640 params->threshold = get_llong_from_str(optarg); 646 641 break; 647 642 case 't': 648 - if (optarg) 649 - /* skip = */ 650 - params->trace_output = &optarg[1]; 643 + if (optarg) { 644 + if (optarg[0] == '=') 645 + params->trace_output = &optarg[1]; 646 + else 647 + params->trace_output = &optarg[0]; 648 + } else if (optind < argc && argv[optind][0] != '0') 649 + params->trace_output = argv[optind]; 651 650 else 652 651 params->trace_output = "osnoise_trace.txt"; 653 652 break; ··· 688 679 } else { 689 680 osnoise_hist_usage("--filter requires a previous -e\n"); 690 681 } 682 + break; 683 + case '6': 684 + params->warmup = get_llong_from_str(optarg); 685 + break; 686 + case '7': 687 + params->buffer_size = get_llong_from_str(optarg); 691 688 break; 692 689 default: 693 690 osnoise_hist_usage("Invalid option"); ··· 901 886 goto out_hist; 902 887 } 903 888 889 + if (params->buffer_size > 0) { 890 + retval = trace_set_buffer_size(&record->trace, params->buffer_size); 891 + if (retval) 892 + goto out_hist; 893 + } 904 894 } 905 895 906 896 /* ··· 918 898 if (params->trace_output) 919 899 trace_instance_start(&record->trace); 920 900 trace_instance_start(trace); 901 + 902 + if (params->warmup > 0) { 903 + debug_msg("Warming up for %d seconds\n", params->warmup); 904 + sleep(params->warmup); 905 + if (stop_tracing) 906 + goto out_hist; 907 + 908 + /* 909 + * Clean up the buffer. The osnoise workload do not run 910 + * with tracing off to avoid creating a performance penalty 911 + * when not needed. 912 + */ 913 + retval = tracefs_instance_file_write(trace->inst, "trace", ""); 914 + if (retval < 0) { 915 + debug_msg("Error cleaning up the buffer"); 916 + goto out_hist; 917 + } 918 + 919 + } 921 920 922 921 tool->start_time = time(NULL); 923 922 osnoise_hist_set_signals(params);
+48 -7
tools/tracing/rtla/src/osnoise_top.c
··· 40 40 int set_sched; 41 41 int cgroup; 42 42 int hk_cpus; 43 + int warmup; 44 + int buffer_size; 43 45 cpu_set_t hk_cpu_set; 44 46 struct sched_attr sched_param; 45 47 struct trace_events *events; ··· 283 281 284 282 static const char * const msg[] = { 285 283 " [-h] [-q] [-D] [-d s] [-a us] [-p us] [-r us] [-s us] [-S us] \\", 286 - " [-T us] [-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] \\", 287 - " [-c cpu-list] [-H cpu-list] [-P priority] [-C[=cgroup_name]]", 284 + " [-T us] [-t[file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] \\", 285 + " [-c cpu-list] [-H cpu-list] [-P priority] [-C[=cgroup_name]] [--warm-up s]", 288 286 "", 289 287 " -h/--help: print this menu", 290 288 " -a/--auto: set automatic trace mode, stopping the session if argument in us sample is hit", ··· 298 296 " -C/--cgroup[=cgroup_name]: set cgroup, if no cgroup_name is passed, the rtla's cgroup will be inherited", 299 297 " -d/--duration time[s|m|h|d]: duration of the session", 300 298 " -D/--debug: print debug info", 301 - " -t/--trace[=file]: save the stopped trace to [file|osnoise_trace.txt]", 299 + " -t/--trace[file]: save the stopped trace to [file|osnoise_trace.txt]", 302 300 " -e/--event <sys:event>: enable the <sys:event> in the trace instance, multiple -e are allowed", 303 301 " --filter <filter>: enable a trace event filter to the previous -e event", 304 302 " --trigger <trigger>: enable a trace event trigger to the previous -e event", ··· 309 307 " f:prio - use SCHED_FIFO with prio", 310 308 " d:runtime[us|ms|s]:period[us|ms|s] - use SCHED_DEADLINE with runtime and period", 311 309 " in nanoseconds", 310 + " --warm-up s: let the workload run for s seconds before collecting data", 311 + " --trace-buffer-size kB: set the per-cpu trace buffer size in kB", 312 312 NULL, 313 313 }; 314 314 ··· 385 381 {"trace", optional_argument, 0, 't'}, 386 382 {"trigger", required_argument, 0, '0'}, 387 383 {"filter", required_argument, 0, '1'}, 384 + {"warm-up", required_argument, 0, '2'}, 385 + {"trace-buffer-size", required_argument, 0, '3'}, 388 386 {0, 0, 0, 0} 389 387 }; 390 388 391 389 /* getopt_long stores the option index here. */ 392 390 int option_index = 0; 393 391 394 - c = getopt_long(argc, argv, "a:c:C::d:De:hH:p:P:qr:s:S:t::T:0:1:", 392 + c = getopt_long(argc, argv, "a:c:C::d:De:hH:p:P:qr:s:S:t::T:0:1:2:3:", 395 393 long_options, &option_index); 396 394 397 395 /* Detect the end of the options. */ ··· 486 480 params->stop_total_us = get_llong_from_str(optarg); 487 481 break; 488 482 case 't': 489 - if (optarg) 490 - /* skip = */ 491 - params->trace_output = &optarg[1]; 483 + if (optarg) { 484 + if (optarg[0] == '=') 485 + params->trace_output = &optarg[1]; 486 + else 487 + params->trace_output = &optarg[0]; 488 + } else if (optind < argc && argv[optind][0] != '-') 489 + params->trace_output = argv[optind]; 492 490 else 493 491 params->trace_output = "osnoise_trace.txt"; 494 492 break; ··· 520 510 } else { 521 511 osnoise_top_usage(params, "--filter requires a previous -e\n"); 522 512 } 513 + break; 514 + case '2': 515 + params->warmup = get_llong_from_str(optarg); 516 + break; 517 + case '3': 518 + params->buffer_size = get_llong_from_str(optarg); 523 519 break; 524 520 default: 525 521 osnoise_top_usage(params, "Invalid option"); ··· 735 719 if (retval) 736 720 goto out_top; 737 721 } 722 + 723 + if (params->buffer_size > 0) { 724 + retval = trace_set_buffer_size(&record->trace, params->buffer_size); 725 + if (retval) 726 + goto out_top; 727 + } 738 728 } 739 729 740 730 /* ··· 753 731 if (params->trace_output) 754 732 trace_instance_start(&record->trace); 755 733 trace_instance_start(trace); 734 + 735 + if (params->warmup > 0) { 736 + debug_msg("Warming up for %d seconds\n", params->warmup); 737 + sleep(params->warmup); 738 + if (stop_tracing) 739 + goto out_top; 740 + 741 + /* 742 + * Clean up the buffer. The osnoise workload do not run 743 + * with tracing off to avoid creating a performance penalty 744 + * when not needed. 745 + */ 746 + retval = tracefs_instance_file_write(trace->inst, "trace", ""); 747 + if (retval < 0) { 748 + debug_msg("Error cleaning up the buffer"); 749 + goto out_top; 750 + } 751 + 752 + } 756 753 757 754 tool->start_time = time(NULL); 758 755 osnoise_top_set_signals(params);
+63 -46
tools/tracing/rtla/src/timerlat_aa.c
··· 16 16 TIMERLAT_WAITING_THREAD, 17 17 }; 18 18 19 + /* Used to fill spaces in the output */ 20 + static const char *spaces = " "; 21 + 19 22 #define MAX_COMM 24 20 23 21 24 /* ··· 277 274 taa_data->prev_irq_timstamp = start; 278 275 279 276 trace_seq_reset(taa_data->prev_irqs_seq); 280 - trace_seq_printf(taa_data->prev_irqs_seq, "\t%24s \t\t\t%9.2f us\n", 281 - "nmi", ns_to_usf(duration)); 277 + trace_seq_printf(taa_data->prev_irqs_seq, " %24s %.*s %9.2f us\n", 278 + "nmi", 279 + 24, spaces, 280 + ns_to_usf(duration)); 282 281 return 0; 283 282 } 284 283 285 284 taa_data->thread_nmi_sum += duration; 286 - trace_seq_printf(taa_data->nmi_seq, " %24s \t\t\t%9.2f us\n", 287 - "nmi", ns_to_usf(duration)); 285 + trace_seq_printf(taa_data->nmi_seq, " %24s %.*s %9.2f us\n", 286 + "nmi", 287 + 24, spaces, ns_to_usf(duration)); 288 288 289 289 return 0; 290 290 } ··· 329 323 taa_data->prev_irq_timstamp = start; 330 324 331 325 trace_seq_reset(taa_data->prev_irqs_seq); 332 - trace_seq_printf(taa_data->prev_irqs_seq, "\t%24s:%-3llu \t\t%9.2f us\n", 333 - desc, vector, ns_to_usf(duration)); 326 + trace_seq_printf(taa_data->prev_irqs_seq, " %24s:%-3llu %.*s %9.2f us\n", 327 + desc, vector, 328 + 15, spaces, 329 + ns_to_usf(duration)); 334 330 return 0; 335 331 } 336 332 ··· 380 372 * IRQ interference. 381 373 */ 382 374 taa_data->thread_irq_sum += duration; 383 - trace_seq_printf(taa_data->irqs_seq, " %24s:%-3llu \t %9.2f us\n", 384 - desc, vector, ns_to_usf(duration)); 375 + trace_seq_printf(taa_data->irqs_seq, " %24s:%-3llu %.*s %9.2f us\n", 376 + desc, vector, 377 + 24, spaces, 378 + ns_to_usf(duration)); 385 379 386 380 return 0; 387 381 } ··· 418 408 419 409 taa_data->thread_softirq_sum += duration; 420 410 421 - trace_seq_printf(taa_data->softirqs_seq, "\t%24s:%-3llu \t %9.2f us\n", 422 - softirq_name[vector], vector, ns_to_usf(duration)); 411 + trace_seq_printf(taa_data->softirqs_seq, " %24s:%-3llu %.*s %9.2f us\n", 412 + softirq_name[vector], vector, 413 + 24, spaces, 414 + ns_to_usf(duration)); 423 415 return 0; 424 416 } 425 417 ··· 464 452 } else { 465 453 taa_data->thread_thread_sum += duration; 466 454 467 - trace_seq_printf(taa_data->threads_seq, "\t%24s:%-3llu \t\t%9.2f us\n", 468 - comm, pid, ns_to_usf(duration)); 455 + trace_seq_printf(taa_data->threads_seq, " %24s:%-12llu %.*s %9.2f us\n", 456 + comm, pid, 457 + 15, spaces, 458 + ns_to_usf(duration)); 469 459 } 470 460 471 461 return 0; ··· 496 482 function = tep_find_function(taa_ctx->tool->trace.tep, caller[i]); 497 483 if (!function) 498 484 break; 499 - trace_seq_printf(taa_data->stack_seq, "\t\t-> %s\n", function); 485 + trace_seq_printf(taa_data->stack_seq, " %.*s -> %s\n", 486 + 14, spaces, function); 500 487 } 501 488 } 502 489 return 0; ··· 583 568 exp_irq_ts = taa_data->timer_irq_start_time - taa_data->timer_irq_start_delay; 584 569 if (exp_irq_ts < taa_data->prev_irq_timstamp + taa_data->prev_irq_duration) { 585 570 if (taa_data->prev_irq_timstamp < taa_data->timer_irq_start_time) 586 - printf(" Previous IRQ interference: \t\t up to %9.2f us\n", 587 - ns_to_usf(taa_data->prev_irq_duration)); 571 + printf(" Previous IRQ interference: %.*s up to %9.2f us\n", 572 + 16, spaces, 573 + ns_to_usf(taa_data->prev_irq_duration)); 588 574 } 589 575 590 576 /* 591 577 * The delay that the IRQ suffered before starting. 592 578 */ 593 - printf(" IRQ handler delay: %16s %9.2f us (%.2f %%)\n", 594 - (ns_to_usf(taa_data->timer_exit_from_idle) > 10) ? "(exit from idle)" : "", 595 - ns_to_usf(taa_data->timer_irq_start_delay), 596 - ns_to_per(total, taa_data->timer_irq_start_delay)); 579 + printf(" IRQ handler delay: %.*s %16s %9.2f us (%.2f %%)\n", 16, spaces, 580 + (ns_to_usf(taa_data->timer_exit_from_idle) > 10) ? "(exit from idle)" : "", 581 + ns_to_usf(taa_data->timer_irq_start_delay), 582 + ns_to_per(total, taa_data->timer_irq_start_delay)); 597 583 598 584 /* 599 585 * Timerlat IRQ. 600 586 */ 601 - printf(" IRQ latency: \t\t\t\t %9.2f us\n", 602 - ns_to_usf(taa_data->tlat_irq_latency)); 587 + printf(" IRQ latency: %.*s %9.2f us\n", 40, spaces, 588 + ns_to_usf(taa_data->tlat_irq_latency)); 603 589 604 590 if (irq) { 605 591 /* ··· 611 595 * so it will be displayed, it is the key. 612 596 */ 613 597 printf(" Blocking thread:\n"); 614 - printf(" %24s:%-9llu\n", 615 - taa_data->run_thread_comm, taa_data->run_thread_pid); 598 + printf(" %.*s %24s:%-9llu\n", 6, spaces, taa_data->run_thread_comm, 599 + taa_data->run_thread_pid); 616 600 } else { 617 601 /* 618 602 * The duration of the IRQ handler that handled the timerlat IRQ. 619 603 */ 620 - printf(" Timerlat IRQ duration: \t\t %9.2f us (%.2f %%)\n", 621 - ns_to_usf(taa_data->timer_irq_duration), 622 - ns_to_per(total, taa_data->timer_irq_duration)); 604 + printf(" Timerlat IRQ duration: %.*s %9.2f us (%.2f %%)\n", 605 + 30, spaces, 606 + ns_to_usf(taa_data->timer_irq_duration), 607 + ns_to_per(total, taa_data->timer_irq_duration)); 623 608 624 609 /* 625 610 * The amount of time that the current thread postponed the scheduler. ··· 628 611 * Recalling that it is net from NMI/IRQ/Softirq interference, so there 629 612 * is no need to compute values here. 630 613 */ 631 - printf(" Blocking thread: \t\t\t %9.2f us (%.2f %%)\n", 632 - ns_to_usf(taa_data->thread_blocking_duration), 633 - ns_to_per(total, taa_data->thread_blocking_duration)); 614 + printf(" Blocking thread: %.*s %9.2f us (%.2f %%)\n", 36, spaces, 615 + ns_to_usf(taa_data->thread_blocking_duration), 616 + ns_to_per(total, taa_data->thread_blocking_duration)); 634 617 635 - printf(" %24s:%-9llu %9.2f us\n", 636 - taa_data->run_thread_comm, taa_data->run_thread_pid, 637 - ns_to_usf(taa_data->thread_blocking_duration)); 618 + printf(" %.*s %24s:%-9llu %.*s %9.2f us\n", 6, spaces, 619 + taa_data->run_thread_comm, taa_data->run_thread_pid, 620 + 12, spaces, ns_to_usf(taa_data->thread_blocking_duration)); 638 621 } 639 622 640 623 /* ··· 646 629 * NMIs can happen during the IRQ, so they are always possible. 647 630 */ 648 631 if (taa_data->thread_nmi_sum) 649 - printf(" NMI interference \t\t\t %9.2f us (%.2f %%)\n", 650 - ns_to_usf(taa_data->thread_nmi_sum), 651 - ns_to_per(total, taa_data->thread_nmi_sum)); 632 + printf(" NMI interference %.*s %9.2f us (%.2f %%)\n", 36, spaces, 633 + ns_to_usf(taa_data->thread_nmi_sum), 634 + ns_to_per(total, taa_data->thread_nmi_sum)); 652 635 653 636 /* 654 637 * If it is an IRQ latency, the other factors can be skipped. ··· 660 643 * Prints the interference caused by IRQs to the thread latency. 661 644 */ 662 645 if (taa_data->thread_irq_sum) { 663 - printf(" IRQ interference \t\t\t %9.2f us (%.2f %%)\n", 664 - ns_to_usf(taa_data->thread_irq_sum), 665 - ns_to_per(total, taa_data->thread_irq_sum)); 646 + printf(" IRQ interference %.*s %9.2f us (%.2f %%)\n", 36, spaces, 647 + ns_to_usf(taa_data->thread_irq_sum), 648 + ns_to_per(total, taa_data->thread_irq_sum)); 666 649 667 650 trace_seq_do_printf(taa_data->irqs_seq); 668 651 } ··· 671 654 * Prints the interference caused by Softirqs to the thread latency. 672 655 */ 673 656 if (taa_data->thread_softirq_sum) { 674 - printf(" Softirq interference \t\t\t %9.2f us (%.2f %%)\n", 675 - ns_to_usf(taa_data->thread_softirq_sum), 676 - ns_to_per(total, taa_data->thread_softirq_sum)); 657 + printf(" Softirq interference %.*s %9.2f us (%.2f %%)\n", 32, spaces, 658 + ns_to_usf(taa_data->thread_softirq_sum), 659 + ns_to_per(total, taa_data->thread_softirq_sum)); 677 660 678 661 trace_seq_do_printf(taa_data->softirqs_seq); 679 662 } ··· 687 670 * timer handling latency. 688 671 */ 689 672 if (taa_data->thread_thread_sum) { 690 - printf(" Thread interference \t\t\t %9.2f us (%.2f %%)\n", 691 - ns_to_usf(taa_data->thread_thread_sum), 692 - ns_to_per(total, taa_data->thread_thread_sum)); 673 + printf(" Thread interference %.*s %9.2f us (%.2f %%)\n", 33, spaces, 674 + ns_to_usf(taa_data->thread_thread_sum), 675 + ns_to_per(total, taa_data->thread_thread_sum)); 693 676 694 677 trace_seq_do_printf(taa_data->threads_seq); 695 678 } ··· 699 682 */ 700 683 print_total: 701 684 printf("------------------------------------------------------------------------\n"); 702 - printf(" %s latency: \t\t\t %9.2f us (100%%)\n", irq ? "IRQ" : "Thread", 703 - ns_to_usf(total)); 685 + printf(" %s latency: %.*s %9.2f us (100%%)\n", irq ? " IRQ" : "Thread", 686 + 37, spaces, ns_to_usf(total)); 704 687 } 705 688 706 689 static int timerlat_auto_analysis_collect_trace(struct timerlat_aa_context *taa_ctx)
+251 -43
tools/tracing/rtla/src/timerlat_hist.c
··· 40 40 int no_aa; 41 41 int dump_tasks; 42 42 int user_workload; 43 + int kernel_workload; 43 44 int user_hist; 44 45 cpu_set_t hk_cpu_set; 45 46 struct sched_attr sched_param; ··· 53 52 char with_zeros; 54 53 int bucket_size; 55 54 int entries; 55 + int warmup; 56 + int buffer_size; 56 57 }; 57 58 58 59 struct timerlat_hist_cpu { ··· 327 324 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) 328 325 continue; 329 326 330 - if (!params->no_irq) 331 - trace_seq_printf(trace->seq, "%9llu ", 332 - data->hist[cpu].min_irq); 327 + if (!params->no_irq) { 328 + if (data->hist[cpu].irq_count) 329 + trace_seq_printf(trace->seq, "%9llu ", 330 + data->hist[cpu].min_irq); 331 + else 332 + trace_seq_printf(trace->seq, " - "); 333 + } 333 334 334 - if (!params->no_thread) 335 - trace_seq_printf(trace->seq, "%9llu ", 336 - data->hist[cpu].min_thread); 335 + if (!params->no_thread) { 336 + if (data->hist[cpu].thread_count) 337 + trace_seq_printf(trace->seq, "%9llu ", 338 + data->hist[cpu].min_thread); 339 + else 340 + trace_seq_printf(trace->seq, " - "); 341 + } 337 342 338 - if (params->user_hist) 339 - trace_seq_printf(trace->seq, "%9llu ", 340 - data->hist[cpu].min_user); 343 + if (params->user_hist) { 344 + if (data->hist[cpu].user_count) 345 + trace_seq_printf(trace->seq, "%9llu ", 346 + data->hist[cpu].min_user); 347 + else 348 + trace_seq_printf(trace->seq, " - "); 349 + } 341 350 } 342 351 trace_seq_printf(trace->seq, "\n"); 343 352 ··· 399 384 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) 400 385 continue; 401 386 402 - if (!params->no_irq) 403 - trace_seq_printf(trace->seq, "%9llu ", 404 - data->hist[cpu].max_irq); 387 + if (!params->no_irq) { 388 + if (data->hist[cpu].irq_count) 389 + trace_seq_printf(trace->seq, "%9llu ", 390 + data->hist[cpu].max_irq); 391 + else 392 + trace_seq_printf(trace->seq, " - "); 393 + } 405 394 406 - if (!params->no_thread) 407 - trace_seq_printf(trace->seq, "%9llu ", 408 - data->hist[cpu].max_thread); 395 + if (!params->no_thread) { 396 + if (data->hist[cpu].thread_count) 397 + trace_seq_printf(trace->seq, "%9llu ", 398 + data->hist[cpu].max_thread); 399 + else 400 + trace_seq_printf(trace->seq, " - "); 401 + } 409 402 410 - if (params->user_hist) 411 - trace_seq_printf(trace->seq, "%9llu ", 412 - data->hist[cpu].max_user); 403 + if (params->user_hist) { 404 + if (data->hist[cpu].user_count) 405 + trace_seq_printf(trace->seq, "%9llu ", 406 + data->hist[cpu].max_user); 407 + else 408 + trace_seq_printf(trace->seq, " - "); 409 + } 413 410 } 414 411 trace_seq_printf(trace->seq, "\n"); 415 412 trace_seq_do_printf(trace->seq); 416 413 trace_seq_reset(trace->seq); 417 414 } 418 415 416 + static void 417 + timerlat_print_stats_all(struct timerlat_hist_params *params, 418 + struct trace_instance *trace, 419 + struct timerlat_hist_data *data) 420 + { 421 + struct timerlat_hist_cpu *cpu_data; 422 + struct timerlat_hist_cpu sum; 423 + int cpu; 424 + 425 + if (params->no_summary) 426 + return; 427 + 428 + memset(&sum, 0, sizeof(sum)); 429 + sum.min_irq = ~0; 430 + sum.min_thread = ~0; 431 + sum.min_user = ~0; 432 + 433 + for (cpu = 0; cpu < data->nr_cpus; cpu++) { 434 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 435 + continue; 436 + 437 + if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) 438 + continue; 439 + 440 + cpu_data = &data->hist[cpu]; 441 + 442 + sum.irq_count += cpu_data->irq_count; 443 + update_min(&sum.min_irq, &cpu_data->min_irq); 444 + update_sum(&sum.sum_irq, &cpu_data->sum_irq); 445 + update_max(&sum.max_irq, &cpu_data->max_irq); 446 + 447 + sum.thread_count += cpu_data->thread_count; 448 + update_min(&sum.min_thread, &cpu_data->min_thread); 449 + update_sum(&sum.sum_thread, &cpu_data->sum_thread); 450 + update_max(&sum.max_thread, &cpu_data->max_thread); 451 + 452 + sum.user_count += cpu_data->user_count; 453 + update_min(&sum.min_user, &cpu_data->min_user); 454 + update_sum(&sum.sum_user, &cpu_data->sum_user); 455 + update_max(&sum.max_user, &cpu_data->max_user); 456 + } 457 + 458 + if (!params->no_index) 459 + trace_seq_printf(trace->seq, "ALL: "); 460 + 461 + if (!params->no_irq) 462 + trace_seq_printf(trace->seq, " IRQ"); 463 + 464 + if (!params->no_thread) 465 + trace_seq_printf(trace->seq, " Thr"); 466 + 467 + if (params->user_hist) 468 + trace_seq_printf(trace->seq, " Usr"); 469 + 470 + trace_seq_printf(trace->seq, "\n"); 471 + 472 + if (!params->no_index) 473 + trace_seq_printf(trace->seq, "count:"); 474 + 475 + if (!params->no_irq) 476 + trace_seq_printf(trace->seq, "%9d ", 477 + sum.irq_count); 478 + 479 + if (!params->no_thread) 480 + trace_seq_printf(trace->seq, "%9d ", 481 + sum.thread_count); 482 + 483 + if (params->user_hist) 484 + trace_seq_printf(trace->seq, "%9d ", 485 + sum.user_count); 486 + 487 + trace_seq_printf(trace->seq, "\n"); 488 + 489 + if (!params->no_index) 490 + trace_seq_printf(trace->seq, "min: "); 491 + 492 + if (!params->no_irq) 493 + trace_seq_printf(trace->seq, "%9llu ", 494 + sum.min_irq); 495 + 496 + if (!params->no_thread) 497 + trace_seq_printf(trace->seq, "%9llu ", 498 + sum.min_thread); 499 + 500 + if (params->user_hist) 501 + trace_seq_printf(trace->seq, "%9llu ", 502 + sum.min_user); 503 + 504 + trace_seq_printf(trace->seq, "\n"); 505 + 506 + if (!params->no_index) 507 + trace_seq_printf(trace->seq, "avg: "); 508 + 509 + if (!params->no_irq) 510 + trace_seq_printf(trace->seq, "%9llu ", 511 + sum.sum_irq / sum.irq_count); 512 + 513 + if (!params->no_thread) 514 + trace_seq_printf(trace->seq, "%9llu ", 515 + sum.sum_thread / sum.thread_count); 516 + 517 + if (params->user_hist) 518 + trace_seq_printf(trace->seq, "%9llu ", 519 + sum.sum_user / sum.user_count); 520 + 521 + trace_seq_printf(trace->seq, "\n"); 522 + 523 + if (!params->no_index) 524 + trace_seq_printf(trace->seq, "max: "); 525 + 526 + if (!params->no_irq) 527 + trace_seq_printf(trace->seq, "%9llu ", 528 + sum.max_irq); 529 + 530 + if (!params->no_thread) 531 + trace_seq_printf(trace->seq, "%9llu ", 532 + sum.max_thread); 533 + 534 + if (params->user_hist) 535 + trace_seq_printf(trace->seq, "%9llu ", 536 + sum.max_user); 537 + 538 + trace_seq_printf(trace->seq, "\n"); 539 + trace_seq_do_printf(trace->seq); 540 + trace_seq_reset(trace->seq); 541 + } 542 + 419 543 /* 420 - * timerlat_print_stats - print data for all CPUs 544 + * timerlat_print_stats - print data for each CPUs 421 545 */ 422 546 static void 423 547 timerlat_print_stats(struct timerlat_hist_params *params, struct osnoise_tool *tool) ··· 639 485 trace_seq_reset(trace->seq); 640 486 641 487 timerlat_print_summary(params, trace, data); 488 + timerlat_print_stats_all(params, trace, data); 642 489 } 643 490 644 491 /* ··· 652 497 char *msg[] = { 653 498 "", 654 499 " usage: [rtla] timerlat hist [-h] [-q] [-d s] [-D] [-n] [-a us] [-p us] [-i us] [-T us] [-s us] \\", 655 - " [-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] [-c cpu-list] [-H cpu-list]\\", 500 + " [-t[file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] [-c cpu-list] [-H cpu-list]\\", 656 501 " [-P priority] [-E N] [-b N] [--no-irq] [--no-thread] [--no-header] [--no-summary] \\", 657 - " [--no-index] [--with-zeros] [--dma-latency us] [-C[=cgroup_name]] [--no-aa] [--dump-task] [-u]", 502 + " [--no-index] [--with-zeros] [--dma-latency us] [-C[=cgroup_name]] [--no-aa] [--dump-task] [-u|-k]", 503 + " [--warm-up s]", 658 504 "", 659 505 " -h/--help: print this menu", 660 506 " -a/--auto: set automatic trace mode, stopping the session if argument in us latency is hit", ··· 669 513 " -d/--duration time[m|h|d]: duration of the session in seconds", 670 514 " --dump-tasks: prints the task running on all CPUs if stop conditions are met (depends on !--no-aa)", 671 515 " -D/--debug: print debug info", 672 - " -t/--trace[=file]: save the stopped trace to [file|timerlat_trace.txt]", 516 + " -t/--trace[file]: save the stopped trace to [file|timerlat_trace.txt]", 673 517 " -e/--event <sys:event>: enable the <sys:event> in the trace instance, multiple -e are allowed", 674 518 " --filter <filter>: enable a trace event filter to the previous -e event", 675 519 " --trigger <trigger>: enable a trace event trigger to the previous -e event", ··· 690 534 " f:prio - use SCHED_FIFO with prio", 691 535 " d:runtime[us|ms|s]:period[us|ms|s] - use SCHED_DEADLINE with runtime and period", 692 536 " in nanoseconds", 693 - " -u/--user-threads: use rtla user-space threads instead of in-kernel timerlat threads", 537 + " -u/--user-threads: use rtla user-space threads instead of kernel-space timerlat threads", 538 + " -k/--kernel-threads: use timerlat kernel-space threads instead of rtla user-space threads", 694 539 " -U/--user-load: enable timerlat for user-defined user-space workload", 540 + " --warm-up s: let the workload run for s seconds before collecting data", 541 + " --trace-buffer-size kB: set the per-cpu trace buffer size in kB", 695 542 NULL, 696 543 }; 697 544 ··· 756 597 {"thread", required_argument, 0, 'T'}, 757 598 {"trace", optional_argument, 0, 't'}, 758 599 {"user-threads", no_argument, 0, 'u'}, 600 + {"kernel-threads", no_argument, 0, 'k'}, 759 601 {"user-load", no_argument, 0, 'U'}, 760 602 {"event", required_argument, 0, 'e'}, 761 603 {"no-irq", no_argument, 0, '0'}, ··· 770 610 {"dma-latency", required_argument, 0, '8'}, 771 611 {"no-aa", no_argument, 0, '9'}, 772 612 {"dump-task", no_argument, 0, '\1'}, 613 + {"warm-up", required_argument, 0, '\2'}, 614 + {"trace-buffer-size", required_argument, 0, '\3'}, 773 615 {0, 0, 0, 0} 774 616 }; 775 617 776 618 /* getopt_long stores the option index here. */ 777 619 int option_index = 0; 778 620 779 - c = getopt_long(argc, argv, "a:c:C::b:d:e:E:DhH:i:np:P:s:t::T:uU0123456:7:8:9\1", 621 + c = getopt_long(argc, argv, "a:c:C::b:d:e:E:DhH:i:knp:P:s:t::T:uU0123456:7:8:9\1\2:\3", 780 622 long_options, &option_index); 781 623 782 624 /* detect the end of the options. */ ··· 861 699 case 'i': 862 700 params->stop_us = get_llong_from_str(optarg); 863 701 break; 702 + case 'k': 703 + params->kernel_workload = 1; 704 + break; 864 705 case 'n': 865 706 params->output_divisor = 1; 866 707 break; ··· 885 720 params->stop_total_us = get_llong_from_str(optarg); 886 721 break; 887 722 case 't': 888 - if (optarg) 889 - /* skip = */ 890 - params->trace_output = &optarg[1]; 723 + if (optarg) { 724 + if (optarg[0] == '=') 725 + params->trace_output = &optarg[1]; 726 + else 727 + params->trace_output = &optarg[0]; 728 + } else if (optind < argc && argv[optind][0] != '-') 729 + params->trace_output = argv[optind]; 891 730 else 892 731 params->trace_output = "timerlat_trace.txt"; 893 732 break; ··· 954 785 case '\1': 955 786 params->dump_tasks = 1; 956 787 break; 788 + case '\2': 789 + params->warmup = get_llong_from_str(optarg); 790 + break; 791 + case '\3': 792 + params->buffer_size = get_llong_from_str(optarg); 793 + break; 957 794 default: 958 795 timerlat_hist_usage("Invalid option"); 959 796 } ··· 981 806 */ 982 807 if (!params->stop_us && !params->stop_total_us) 983 808 params->no_aa = 1; 809 + 810 + if (params->kernel_workload && params->user_workload) 811 + timerlat_hist_usage("--kernel-threads and --user-threads are mutually exclusive!"); 984 812 985 813 return params; 986 814 } ··· 1058 880 * No need to check results as this is an automatic attempt. 1059 881 */ 1060 882 auto_house_keeping(&params->monitored_cpus); 883 + } 884 + 885 + /* 886 + * If the user did not specify a type of thread, try user-threads first. 887 + * Fall back to kernel threads otherwise. 888 + */ 889 + if (!params->kernel_workload && !params->user_workload) { 890 + retval = tracefs_file_exists(NULL, "osnoise/per_cpu/cpu0/timerlat_fd"); 891 + if (retval) { 892 + debug_msg("User-space interface detected, setting user-threads\n"); 893 + params->user_workload = 1; 894 + params->user_hist = 1; 895 + } else { 896 + debug_msg("User-space interface not detected, setting kernel-threads\n"); 897 + params->kernel_workload = 1; 898 + } 1061 899 } 1062 900 1063 901 if (params->user_hist) { ··· 1213 1019 if (retval) 1214 1020 goto out_hist; 1215 1021 } 1022 + 1023 + if (params->buffer_size > 0) { 1024 + retval = trace_set_buffer_size(&record->trace, params->buffer_size); 1025 + if (retval) 1026 + goto out_hist; 1027 + } 1216 1028 } 1217 1029 1218 1030 if (!params->no_aa) { ··· 1239 1039 } 1240 1040 } 1241 1041 1242 - /* 1243 - * Start the tracers here, after having set all instances. 1244 - * 1245 - * Let the trace instance start first for the case of hitting a stop 1246 - * tracing while enabling other instances. The trace instance is the 1247 - * one with most valuable information. 1248 - */ 1249 - if (params->trace_output) 1250 - trace_instance_start(&record->trace); 1251 - if (!params->no_aa) 1252 - trace_instance_start(&aa->trace); 1253 - trace_instance_start(trace); 1254 - 1255 - tool->start_time = time(NULL); 1256 - timerlat_hist_set_signals(params); 1257 - 1258 1042 if (params->user_workload) { 1259 1043 /* rtla asked to stop */ 1260 1044 params_u.should_run = 1; ··· 1257 1073 if (retval) 1258 1074 err_msg("Error creating timerlat user-space threads\n"); 1259 1075 } 1076 + 1077 + if (params->warmup > 0) { 1078 + debug_msg("Warming up for %d seconds\n", params->warmup); 1079 + sleep(params->warmup); 1080 + if (stop_tracing) 1081 + goto out_hist; 1082 + } 1083 + 1084 + /* 1085 + * Start the tracers here, after having set all instances. 1086 + * 1087 + * Let the trace instance start first for the case of hitting a stop 1088 + * tracing while enabling other instances. The trace instance is the 1089 + * one with most valuable information. 1090 + */ 1091 + if (params->trace_output) 1092 + trace_instance_start(&record->trace); 1093 + if (!params->no_aa) 1094 + trace_instance_start(&aa->trace); 1095 + trace_instance_start(trace); 1096 + 1097 + tool->start_time = time(NULL); 1098 + timerlat_hist_set_signals(params); 1260 1099 1261 1100 while (!stop_tracing) { 1262 1101 sleep(params->sleep_time); ··· 1306 1099 } 1307 1100 } 1308 1101 } 1102 + 1309 1103 if (params->user_workload && !params_u.stopped_running) { 1310 1104 params_u.should_run = 0; 1311 1105 sleep(1);
+207 -43
tools/tracing/rtla/src/timerlat_top.c
··· 44 44 int hk_cpus; 45 45 int user_top; 46 46 int user_workload; 47 + int kernel_workload; 48 + int pretty_output; 49 + int warmup; 50 + int buffer_size; 47 51 cpu_set_t hk_cpu_set; 48 52 struct sched_attr sched_param; 49 53 struct trace_events *events; ··· 122 118 return NULL; 123 119 } 124 120 121 + static void 122 + timerlat_top_reset_sum(struct timerlat_top_cpu *summary) 123 + { 124 + memset(summary, 0, sizeof(*summary)); 125 + summary->min_irq = ~0; 126 + summary->min_thread = ~0; 127 + summary->min_user = ~0; 128 + } 129 + 130 + static void 131 + timerlat_top_update_sum(struct osnoise_tool *tool, int cpu, struct timerlat_top_cpu *sum) 132 + { 133 + struct timerlat_top_data *data = tool->data; 134 + struct timerlat_top_cpu *cpu_data = &data->cpu_data[cpu]; 135 + 136 + sum->irq_count += cpu_data->irq_count; 137 + update_min(&sum->min_irq, &cpu_data->min_irq); 138 + update_sum(&sum->sum_irq, &cpu_data->sum_irq); 139 + update_max(&sum->max_irq, &cpu_data->max_irq); 140 + 141 + sum->thread_count += cpu_data->thread_count; 142 + update_min(&sum->min_thread, &cpu_data->min_thread); 143 + update_sum(&sum->sum_thread, &cpu_data->sum_thread); 144 + update_max(&sum->max_thread, &cpu_data->max_thread); 145 + 146 + sum->user_count += cpu_data->user_count; 147 + update_min(&sum->min_user, &cpu_data->min_user); 148 + update_sum(&sum->sum_user, &cpu_data->sum_user); 149 + update_max(&sum->max_user, &cpu_data->max_user); 150 + } 151 + 125 152 /* 126 153 * timerlat_hist_update - record a new timerlat occurent on cpu, updating data 127 154 */ ··· 214 179 /* 215 180 * timerlat_top_header - print the header of the tool output 216 181 */ 217 - static void timerlat_top_header(struct osnoise_tool *top) 182 + static void timerlat_top_header(struct timerlat_top_params *params, struct osnoise_tool *top) 218 183 { 219 - struct timerlat_top_params *params = top->params; 220 184 struct trace_seq *s = top->trace.seq; 221 185 char duration[26]; 222 186 223 187 get_duration(top->start_time, duration, sizeof(duration)); 224 188 225 - trace_seq_printf(s, "\033[2;37;40m"); 189 + if (params->pretty_output) 190 + trace_seq_printf(s, "\033[2;37;40m"); 191 + 226 192 trace_seq_printf(s, " Timer Latency "); 227 193 if (params->user_top) 228 194 trace_seq_printf(s, " "); 229 - trace_seq_printf(s, "\033[0;0;0m"); 195 + 196 + if (params->pretty_output) 197 + trace_seq_printf(s, "\033[0;0;0m"); 230 198 trace_seq_printf(s, "\n"); 231 199 232 200 trace_seq_printf(s, "%-6s | IRQ Timer Latency (%s) | Thread Timer Latency (%s)", duration, ··· 242 204 } 243 205 244 206 trace_seq_printf(s, "\n"); 245 - trace_seq_printf(s, "\033[2;30;47m"); 207 + if (params->pretty_output) 208 + trace_seq_printf(s, "\033[2;30;47m"); 209 + 246 210 trace_seq_printf(s, "CPU COUNT | cur min avg max | cur min avg max"); 247 211 if (params->user_top) 248 212 trace_seq_printf(s, " | cur min avg max"); 249 - trace_seq_printf(s, "\033[0;0;0m"); 213 + 214 + if (params->pretty_output) 215 + trace_seq_printf(s, "\033[0;0;0m"); 250 216 trace_seq_printf(s, "\n"); 251 217 } 218 + 219 + static const char *no_value = " -"; 252 220 253 221 /* 254 222 * timerlat_top_print - prints the output of a given CPU ··· 283 239 trace_seq_printf(s, "%3d #%-9d |", cpu, cpu_data->irq_count); 284 240 285 241 if (!cpu_data->irq_count) { 286 - trace_seq_printf(s, " - "); 287 - trace_seq_printf(s, " - "); 288 - trace_seq_printf(s, " - "); 289 - trace_seq_printf(s, " - |"); 242 + trace_seq_printf(s, "%s %s %s %s |", no_value, no_value, no_value, no_value); 290 243 } else { 291 244 trace_seq_printf(s, "%9llu ", cpu_data->cur_irq / params->output_divisor); 292 245 trace_seq_printf(s, "%9llu ", cpu_data->min_irq / params->output_divisor); ··· 292 251 } 293 252 294 253 if (!cpu_data->thread_count) { 295 - trace_seq_printf(s, " - "); 296 - trace_seq_printf(s, " - "); 297 - trace_seq_printf(s, " - "); 298 - trace_seq_printf(s, " -\n"); 254 + trace_seq_printf(s, "%s %s %s %s", no_value, no_value, no_value, no_value); 299 255 } else { 300 256 trace_seq_printf(s, "%9llu ", cpu_data->cur_thread / divisor); 301 257 trace_seq_printf(s, "%9llu ", cpu_data->min_thread / divisor); ··· 309 271 trace_seq_printf(s, " |"); 310 272 311 273 if (!cpu_data->user_count) { 312 - trace_seq_printf(s, " - "); 313 - trace_seq_printf(s, " - "); 314 - trace_seq_printf(s, " - "); 315 - trace_seq_printf(s, " -\n"); 274 + trace_seq_printf(s, "%s %s %s %s\n", no_value, no_value, no_value, no_value); 316 275 } else { 317 276 trace_seq_printf(s, "%9llu ", cpu_data->cur_user / divisor); 318 277 trace_seq_printf(s, "%9llu ", cpu_data->min_user / divisor); 319 278 trace_seq_printf(s, "%9llu ", 320 279 (cpu_data->sum_user / cpu_data->user_count) / divisor); 321 280 trace_seq_printf(s, "%9llu\n", cpu_data->max_user / divisor); 281 + } 282 + } 283 + 284 + /* 285 + * timerlat_top_print_sum - prints the summary output 286 + */ 287 + static void 288 + timerlat_top_print_sum(struct osnoise_tool *top, struct timerlat_top_cpu *summary) 289 + { 290 + const char *split = "----------------------------------------"; 291 + struct timerlat_top_params *params = top->params; 292 + unsigned long long count = summary->irq_count; 293 + int divisor = params->output_divisor; 294 + struct trace_seq *s = top->trace.seq; 295 + int e = 0; 296 + 297 + if (divisor == 0) 298 + return; 299 + 300 + /* 301 + * Skip if no data is available: is this cpu offline? 302 + */ 303 + if (!summary->irq_count && !summary->thread_count) 304 + return; 305 + 306 + while (count > 999999) { 307 + e++; 308 + count /= 10; 309 + } 310 + 311 + trace_seq_printf(s, "%.*s|%.*s|%.*s", 15, split, 40, split, 39, split); 312 + if (params->user_top) 313 + trace_seq_printf(s, "-|%.*s", 39, split); 314 + trace_seq_printf(s, "\n"); 315 + 316 + trace_seq_printf(s, "ALL #%-6llu e%d |", count, e); 317 + 318 + if (!summary->irq_count) { 319 + trace_seq_printf(s, " %s %s %s |", no_value, no_value, no_value); 320 + } else { 321 + trace_seq_printf(s, " "); 322 + trace_seq_printf(s, "%9llu ", summary->min_irq / params->output_divisor); 323 + trace_seq_printf(s, "%9llu ", (summary->sum_irq / summary->irq_count) / divisor); 324 + trace_seq_printf(s, "%9llu |", summary->max_irq / divisor); 325 + } 326 + 327 + if (!summary->thread_count) { 328 + trace_seq_printf(s, "%s %s %s %s", no_value, no_value, no_value, no_value); 329 + } else { 330 + trace_seq_printf(s, " "); 331 + trace_seq_printf(s, "%9llu ", summary->min_thread / divisor); 332 + trace_seq_printf(s, "%9llu ", 333 + (summary->sum_thread / summary->thread_count) / divisor); 334 + trace_seq_printf(s, "%9llu", summary->max_thread / divisor); 335 + } 336 + 337 + if (!params->user_top) { 338 + trace_seq_printf(s, "\n"); 339 + return; 340 + } 341 + 342 + trace_seq_printf(s, " |"); 343 + 344 + if (!summary->user_count) { 345 + trace_seq_printf(s, " %s %s %s |", no_value, no_value, no_value); 346 + } else { 347 + trace_seq_printf(s, " "); 348 + trace_seq_printf(s, "%9llu ", summary->min_user / divisor); 349 + trace_seq_printf(s, "%9llu ", 350 + (summary->sum_user / summary->user_count) / divisor); 351 + trace_seq_printf(s, "%9llu\n", summary->max_user / divisor); 322 352 } 323 353 } 324 354 ··· 406 300 timerlat_print_stats(struct timerlat_top_params *params, struct osnoise_tool *top) 407 301 { 408 302 struct trace_instance *trace = &top->trace; 303 + struct timerlat_top_cpu summary; 409 304 static int nr_cpus = -1; 410 305 int i; 411 306 ··· 419 312 if (!params->quiet) 420 313 clear_terminal(trace->seq); 421 314 422 - timerlat_top_header(top); 315 + timerlat_top_reset_sum(&summary); 316 + 317 + timerlat_top_header(params, top); 423 318 424 319 for (i = 0; i < nr_cpus; i++) { 425 320 if (params->cpus && !CPU_ISSET(i, &params->monitored_cpus)) 426 321 continue; 427 322 timerlat_top_print(top, i); 323 + timerlat_top_update_sum(top, i, &summary); 428 324 } 325 + 326 + timerlat_top_print_sum(top, &summary); 429 327 430 328 trace_seq_do_printf(trace->seq); 431 329 trace_seq_reset(trace->seq); ··· 446 334 static const char *const msg[] = { 447 335 "", 448 336 " usage: rtla timerlat [top] [-h] [-q] [-a us] [-d s] [-D] [-n] [-p us] [-i us] [-T us] [-s us] \\", 449 - " [[-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] [-c cpu-list] [-H cpu-list]\\", 450 - " [-P priority] [--dma-latency us] [--aa-only us] [-C[=cgroup_name]] [-u]", 337 + " [[-t[file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] [-c cpu-list] [-H cpu-list]\\", 338 + " [-P priority] [--dma-latency us] [--aa-only us] [-C[=cgroup_name]] [-u|-k] [--warm-up s]", 451 339 "", 452 340 " -h/--help: print this menu", 453 341 " -a/--auto: set automatic trace mode, stopping the session if argument in us latency is hit", ··· 462 350 " -d/--duration time[m|h|d]: duration of the session in seconds", 463 351 " -D/--debug: print debug info", 464 352 " --dump-tasks: prints the task running on all CPUs if stop conditions are met (depends on !--no-aa)", 465 - " -t/--trace[=file]: save the stopped trace to [file|timerlat_trace.txt]", 353 + " -t/--trace[file]: save the stopped trace to [file|timerlat_trace.txt]", 466 354 " -e/--event <sys:event>: enable the <sys:event> in the trace instance, multiple -e are allowed", 467 355 " --filter <command>: enable a trace event filter to the previous -e event", 468 356 " --trigger <command>: enable a trace event trigger to the previous -e event", ··· 476 364 " f:prio - use SCHED_FIFO with prio", 477 365 " d:runtime[us|ms|s]:period[us|ms|s] - use SCHED_DEADLINE with runtime and period", 478 366 " in nanoseconds", 479 - " -u/--user-threads: use rtla user-space threads instead of in-kernel timerlat threads", 367 + " -u/--user-threads: use rtla user-space threads instead of kernel-space timerlat threads", 368 + " -k/--kernel-threads: use timerlat kernel-space threads instead of rtla user-space threads", 480 369 " -U/--user-load: enable timerlat for user-defined user-space workload", 370 + " --warm-up s: let the workload run for s seconds before collecting data", 371 + " --trace-buffer-size kB: set the per-cpu trace buffer size in kB", 481 372 NULL, 482 373 }; 483 374 ··· 540 425 {"thread", required_argument, 0, 'T'}, 541 426 {"trace", optional_argument, 0, 't'}, 542 427 {"user-threads", no_argument, 0, 'u'}, 428 + {"kernel-threads", no_argument, 0, 'k'}, 543 429 {"user-load", no_argument, 0, 'U'}, 544 430 {"trigger", required_argument, 0, '0'}, 545 431 {"filter", required_argument, 0, '1'}, ··· 548 432 {"no-aa", no_argument, 0, '3'}, 549 433 {"dump-tasks", no_argument, 0, '4'}, 550 434 {"aa-only", required_argument, 0, '5'}, 435 + {"warm-up", required_argument, 0, '6'}, 436 + {"trace-buffer-size", required_argument, 0, '7'}, 551 437 {0, 0, 0, 0} 552 438 }; 553 439 554 440 /* getopt_long stores the option index here. */ 555 441 int option_index = 0; 556 442 557 - c = getopt_long(argc, argv, "a:c:C::d:De:hH:i:np:P:qs:t::T:uU0:1:2:345:", 443 + c = getopt_long(argc, argv, "a:c:C::d:De:hH:i:knp:P:qs:t::T:uU0:1:2:345:6:7:", 558 444 long_options, &option_index); 559 445 560 446 /* detect the end of the options. */ ··· 641 523 case 'i': 642 524 params->stop_us = get_llong_from_str(optarg); 643 525 break; 526 + case 'k': 527 + params->kernel_workload = true; 528 + break; 644 529 case 'n': 645 530 params->output_divisor = 1; 646 531 break; ··· 668 547 params->stop_total_us = get_llong_from_str(optarg); 669 548 break; 670 549 case 't': 671 - if (optarg) 672 - /* skip = */ 673 - params->trace_output = &optarg[1]; 550 + if (optarg) { 551 + if (optarg[0] == '=') 552 + params->trace_output = &optarg[1]; 553 + else 554 + params->trace_output = &optarg[0]; 555 + } else if (optind < argc && argv[optind][0] != '-') 556 + params->trace_output = argv[optind]; 674 557 else 675 558 params->trace_output = "timerlat_trace.txt"; 676 559 ··· 720 595 case '4': 721 596 params->dump_tasks = 1; 722 597 break; 598 + case '6': 599 + params->warmup = get_llong_from_str(optarg); 600 + break; 601 + case '7': 602 + params->buffer_size = get_llong_from_str(optarg); 603 + break; 723 604 default: 724 605 timerlat_top_usage("Invalid option"); 725 606 } ··· 744 613 745 614 if (params->no_aa && params->aa_only) 746 615 timerlat_top_usage("--no-aa and --aa-only are mutually exclusive!"); 616 + 617 + if (params->kernel_workload && params->user_workload) 618 + timerlat_top_usage("--kernel-threads and --user-threads are mutually exclusive!"); 747 619 748 620 return params; 749 621 } ··· 826 692 auto_house_keeping(&params->monitored_cpus); 827 693 } 828 694 695 + /* 696 + * If the user did not specify a type of thread, try user-threads first. 697 + * Fall back to kernel threads otherwise. 698 + */ 699 + if (!params->kernel_workload && !params->user_workload) { 700 + retval = tracefs_file_exists(NULL, "osnoise/per_cpu/cpu0/timerlat_fd"); 701 + if (retval) { 702 + debug_msg("User-space interface detected, setting user-threads\n"); 703 + params->user_workload = 1; 704 + params->user_top = 1; 705 + } else { 706 + debug_msg("User-space interface not detected, setting kernel-threads\n"); 707 + params->kernel_workload = 1; 708 + } 709 + } 710 + 829 711 if (params->user_top) { 830 712 retval = osnoise_set_workload(top->context, 0); 831 713 if (retval) { ··· 849 699 goto out_err; 850 700 } 851 701 } 702 + 703 + if (isatty(1) && !params->quiet) 704 + params->pretty_output = 1; 852 705 853 706 return 0; 854 707 ··· 983 830 if (retval) 984 831 goto out_top; 985 832 } 833 + 834 + if (params->buffer_size > 0) { 835 + retval = trace_set_buffer_size(&record->trace, params->buffer_size); 836 + if (retval) 837 + goto out_top; 838 + } 986 839 } 987 840 988 841 if (!params->no_aa) { ··· 1018 859 } 1019 860 } 1020 861 1021 - /* 1022 - * Start the tracers here, after having set all instances. 1023 - * 1024 - * Let the trace instance start first for the case of hitting a stop 1025 - * tracing while enabling other instances. The trace instance is the 1026 - * one with most valuable information. 1027 - */ 1028 - if (params->trace_output) 1029 - trace_instance_start(&record->trace); 1030 - if (!params->no_aa && aa != top) 1031 - trace_instance_start(&aa->trace); 1032 - trace_instance_start(trace); 1033 - 1034 - top->start_time = time(NULL); 1035 - timerlat_top_set_signals(params); 1036 - 1037 862 if (params->user_workload) { 1038 863 /* rtla asked to stop */ 1039 864 params_u.should_run = 1; ··· 1036 893 if (retval) 1037 894 err_msg("Error creating timerlat user-space threads\n"); 1038 895 } 896 + 897 + if (params->warmup > 0) { 898 + debug_msg("Warming up for %d seconds\n", params->warmup); 899 + sleep(params->warmup); 900 + } 901 + 902 + /* 903 + * Start the tracers here, after having set all instances. 904 + * 905 + * Let the trace instance start first for the case of hitting a stop 906 + * tracing while enabling other instances. The trace instance is the 907 + * one with most valuable information. 908 + */ 909 + if (params->trace_output) 910 + trace_instance_start(&record->trace); 911 + if (!params->no_aa && aa != top) 912 + trace_instance_start(&aa->trace); 913 + trace_instance_start(trace); 914 + 915 + top->start_time = time(NULL); 916 + timerlat_top_set_signals(params); 1039 917 1040 918 while (!stop_tracing) { 1041 919 sleep(params->sleep_time);
+15
tools/tracing/rtla/src/trace.c
··· 540 540 541 541 return 0; 542 542 } 543 + 544 + /* 545 + * trace_set_buffer_size - set the per-cpu tracing buffer size. 546 + */ 547 + int trace_set_buffer_size(struct trace_instance *trace, int size) 548 + { 549 + int retval; 550 + 551 + debug_msg("Setting trace buffer size to %d Kb\n", size); 552 + retval = tracefs_instance_set_buffer_size(trace->inst, size, -1); 553 + if (retval) 554 + err_msg("Error setting trace buffer size\n"); 555 + 556 + return retval; 557 + }
+1
tools/tracing/rtla/src/trace.h
··· 48 48 int trace_event_add_filter(struct trace_events *event, char *filter); 49 49 int trace_event_add_trigger(struct trace_events *event, char *trigger); 50 50 int trace_is_off(struct trace_instance *tool, struct trace_instance *trace); 51 + int trace_set_buffer_size(struct trace_instance *trace, int size);