Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'trace-tools-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tooling updates from Steven Rostedt:

 - Add cgroup support for rtla via the -C option

 - Add --house-keeping option that tells rtla where to place the
   housekeeping threads

 - Have rtla/timerlat have its own tracing instance instead of using the
   top level tracing instance that is the default for other tracing
   users to use

 - Add auto analysis to timerlat_hist

 - Have rtla start the tracers after creating the instances

 - Reduce rtla hwnoise down to 75% from 100% as it runs with preemption
   disabled and can cause system instability at 100%

 - Add support to run timerlat_top and timerlat_hist threads in
   user-space instead of just using the kernel tasks

 - Some minor clean ups and documentation changes

* tag 'trace-tools-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation: Add tools/rtla timerlat -u option documentation
  rtla/timerlat_hist: Add timerlat user-space support
  rtla/timerlat_top: Add timerlat user-space support
  rtla/hwnoise: Reduce runtime to 75%
  rtla: Start the tracers after creating all instances
  rtla/timerlat_hist: Add auto-analysis support
  rtla/timerlat: Give timerlat auto analysis its own instance
  rtla: Automatically move rtla to a house-keeping cpu
  rtla: Change monitored_cpus from char * to cpu_set_t
  rtla: Add --house-keeping option
  rtla: Add -C cgroup support

+1280 -109
+8
Documentation/tools/rtla/common_options.rst
···
         Set the osnoise tracer to run the sample threads in the cpu-list.

+**-H**, **--house-keeping** *cpu-list*
+
+        Run rtla control threads only on the given cpu-list.
+
 **-d**, **--duration** *time[s|m|h|d]*

         Set the duration of the session.
···
         - *r:prio* - use SCHED_RR with *prio*;
         - *f:prio* - use SCHED_FIFO with *prio*;
         - *d:runtime[us|ms|s]:period[us|ms|s]* - use SCHED_DEADLINE with *runtime* and *period* in nanoseconds.
+
+**-C**, **--cgroup**\[*=cgroup*]
+
+        Set a *cgroup* to the tracer's threads. If the **-C** option is passed without arguments, the tracer's thread will inherit **rtla**'s *cgroup*. Otherwise, the threads will be placed on the *cgroup* passed to the option.

 **-h**, **--help**
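
The `-C[=cgroup]` syntax documented above follows from how getopt_long() handles optional arguments: with a short option declared as `C::`, the text glued to the option (including a leading `=`) is what arrives in `optarg`, while a bare `-C` leaves `optarg` NULL. The block below is a minimal sketch of that behaviour, assuming a GNU-compatible getopt_long(); `struct cgroup_opt` and `parse_cgroup_opt()` are hypothetical names used for illustration and are not part of rtla.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result holder for this sketch; not an rtla structure. */
struct cgroup_opt {
	int enabled;		/* 1 if -C/--cgroup was given */
	const char *name;	/* NULL means "inherit the caller's cgroup" */
};

/*
 * Parse only -C/--cgroup from argv.
 * Returns 0 on success, -1 if any other option is found.
 */
static int parse_cgroup_opt(int argc, char **argv, struct cgroup_opt *out)
{
	static const struct option long_options[] = {
		{"cgroup", optional_argument, 0, 'C'},
		{0, 0, 0, 0}
	};
	int c;

	assert(out != NULL);
	out->enabled = 0;
	out->name = NULL;

	/* Reset the scanner so the function can be called more than once. */
	optind = 0;

	while ((c = getopt_long(argc, argv, "C::", long_options, NULL)) != -1) {
		if (c != 'C')
			return -1;

		out->enabled = 1;
		if (!optarg)
			out->name = NULL;	/* "-C": inherit */
		else if (*optarg == '=')
			out->name = optarg + 1;	/* "-C=name": skip the '=' */
		else
			out->name = optarg;	/* "--cgroup=name" or "-Cname" */
	}

	return 0;
}
```

Note that a space-separated form such as `-C name` is not treated as an argument to `-C`; the name would be left as a non-option word, which is why the documentation writes the option as `-C[=cgroup]`.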
-7
Documentation/tools/rtla/common_timerlat_aa.rst
···
 **--no-aa**

         disable auto-analysis, reducing rtla timerlat cpu usage
-
-**--aa-only** *us*
-
-        Set stop tracing conditions and run without collecting and displaying statistics.
-        Print the auto-analysis if the system hits the stop tracing condition. This option
-        is useful to reduce rtla timerlat CPU, enabling the debug without the overhead of
-        collecting the statistics.
+7
Documentation/tools/rtla/common_timerlat_options.rst
···
         Set the /dev/cpu_dma_latency to *us*, aiming to bound exit from idle latencies.
         *cyclictest* sets this value to *0* by default, use **--dma-latency** *0* to have
         similar results.
+
+**-u**, **--user-threads**
+
+        Set timerlat to run without a workload, and then dispatches user-space workloads
+        to wait on the timerlat_fd. Once the workload is awakes, it goes to sleep again
+        adding so the measurement for the kernel-to-user and user-to-kernel to the tracer
+        output.
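
The **-u** description above amounts to a simple loop in each user-space thread: pin to one CPU, open that CPU's timerlat_fd, and block in read() until the timer fires. The block below is a rough sketch of such a thread, not the rtla implementation. It assumes tracefs is mounted at /sys/kernel/tracing and that the per-CPU file lives under osnoise/per_cpu/cpuN/timerlat_fd, as described in the kernel's timerlat tracer documentation; `timerlat_fd_path()` and `timerlat_user_loop()` are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: tracefs is mounted here; adjust for other mount points. */
#define TRACING_DIR "/sys/kernel/tracing"

/* Build the per-CPU timerlat_fd path. Returns 0 on success, -1 on truncation. */
static int timerlat_fd_path(int cpu, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/osnoise/per_cpu/cpu%d/timerlat_fd",
			 TRACING_DIR, cpu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Pin the caller to @cpu and wait on the timerlat_fd @wakeups times.
 * Returns the number of completed wakeups, or -1 on setup failure.
 */
static int timerlat_user_loop(int cpu, int wakeups)
{
	char buf[1024];
	cpu_set_t set;
	int fd, i;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return -1;

	/* The thread has to be affined to the single CPU it measures. */
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) == -1)
		return -1;

	if (timerlat_fd_path(cpu, buf, sizeof(buf)))
		return -1;

	fd = open(buf, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Each read() sleeps until the next timer activation on this CPU. */
	for (i = 0; i < wakeups; i++) {
		if (read(fd, buf, sizeof(buf)) < 0)
			break;
	}

	close(fd);
	return i;
}
```

The open() is expected to fail unless the timerlat tracer is enabled and its in-kernel workload is switched off, which is what the OSNOISE_WORKLOAD handling added to osnoise.c in this merge takes care of.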
+5 -2
Documentation/tools/rtla/rtla-timerlat-hist.rst
···

 .. include:: common_options.rst

+.. include:: common_timerlat_aa.rst
+
 EXAMPLE
 =======
 In the example below, **rtla timerlat hist** is set to run for *10* minutes,
 in the cpus *0-4*, *skipping zero* only lines. Moreover, **rtla timerlat
 hist** will change the priority of the *timerlat* threads to run under
 *SCHED_DEADLINE* priority, with a *10us* runtime every *1ms* period. The
-*1ms* period is also passed to the *timerlat* tracer::
+*1ms* period is also passed to the *timerlat* tracer. Auto-analysis is disabled
+to reduce overhead ::

-  [root@alien ~]# timerlat hist -d 10m -c 0-4 -P d:100us:1ms -p 1ms
+  [root@alien ~]# timerlat hist -d 10m -c 0-4 -P d:100us:1ms -p 1ms --no-aa
   # RTLA timerlat histogram
   # Time unit is microseconds (us)
   # Duration: 0 00:10:00
+7
Documentation/tools/rtla/rtla-timerlat-top.rst
···

 .. include:: common_timerlat_aa.rst

+**--aa-only** *us*
+
+        Set stop tracing conditions and run without collecting and displaying statistics.
+        Print the auto-analysis if the system hits the stop tracing condition. This option
+        is useful to reduce rtla timerlat CPU, enabling the debug without the overhead of
+        collecting the statistics.
+
 EXAMPLE
 =======
+65
tools/tracing/rtla/src/osnoise.c
···
 	context->orig_opt_irq_disable = OSNOISE_OPTION_INIT_VAL;
 }

+static int osnoise_get_workload(struct osnoise_context *context)
+{
+	if (context->opt_workload != OSNOISE_OPTION_INIT_VAL)
+		return context->opt_workload;
+
+	if (context->orig_opt_workload != OSNOISE_OPTION_INIT_VAL)
+		return context->orig_opt_workload;
+
+	context->orig_opt_workload = osnoise_options_get_option("OSNOISE_WORKLOAD");
+
+	return context->orig_opt_workload;
+}
+
+int osnoise_set_workload(struct osnoise_context *context, bool onoff)
+{
+	int opt_workload = osnoise_get_workload(context);
+	int retval;
+
+	if (opt_workload == OSNOISE_OPTION_INIT_VAL)
+		return -1;
+
+	if (opt_workload == onoff)
+		return 0;
+
+	retval = osnoise_options_set_option("OSNOISE_WORKLOAD", onoff);
+	if (retval < 0)
+		return -1;
+
+	context->opt_workload = onoff;
+
+	return 0;
+}
+
+static void osnoise_restore_workload(struct osnoise_context *context)
+{
+	int retval;
+
+	if (context->orig_opt_workload == OSNOISE_OPTION_INIT_VAL)
+		return;
+
+	if (context->orig_opt_workload == context->opt_workload)
+		goto out_done;
+
+	retval = osnoise_options_set_option("OSNOISE_WORKLOAD", context->orig_opt_workload);
+	if (retval < 0)
+		err_msg("Could not restore original OSNOISE_WORKLOAD option\n");
+
+out_done:
+	context->orig_opt_workload = OSNOISE_OPTION_INIT_VAL;
+}
+
+static void osnoise_put_workload(struct osnoise_context *context)
+{
+	osnoise_restore_workload(context);
+
+	if (context->orig_opt_workload == OSNOISE_OPTION_INIT_VAL)
+		return;
+
+	context->orig_opt_workload = OSNOISE_OPTION_INIT_VAL;
+}
+
 /*
  * enable_osnoise - enable osnoise tracer in the trace_instance
  */
···
 	context->orig_opt_irq_disable = OSNOISE_OPTION_INIT_VAL;
 	context->opt_irq_disable = OSNOISE_OPTION_INIT_VAL;

+	context->orig_opt_workload = OSNOISE_OPTION_INIT_VAL;
+	context->opt_workload = OSNOISE_OPTION_INIT_VAL;
+
 	osnoise_get_context(context);

 	return context;
···
 	osnoise_put_print_stack(context);
 	osnoise_put_tracing_thresh(context);
 	osnoise_put_irq_disable(context);
+	osnoise_put_workload(context);

 	free(context);
 }
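
The get/set/restore helpers added to osnoise.c follow a save-and-restore pattern: the original value is read lazily, a sentinel marks "not read yet" (because 0 is a legitimate "off" value), and the original is written back when the context is released. The block below is a self-contained sketch of that pattern with an in-memory stand-in for the tracefs option file. All names in it (`backend_*`, `struct opt_ctx`, `opt_*`) are hypothetical, and the restore condition is a slight simplification of the one in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OPTION_INIT_VAL (-1)	/* sentinel: not read yet; 0 means "off" */

/* Stand-in for the tracer option; real code would read and write tracefs. */
static int backend_value = 1;

static int backend_get(void)
{
	return backend_value;
}

static int backend_set(int value)
{
	backend_value = value;
	return 0;
}

struct opt_ctx {
	int orig;	/* value found before the first change */
	int cur;	/* value written by this tool, if any */
};

static void opt_init(struct opt_ctx *c)
{
	assert(c != NULL);
	c->orig = OPTION_INIT_VAL;
	c->cur = OPTION_INIT_VAL;
}

/* Return the current value, reading the backend only on first use. */
static int opt_get(struct opt_ctx *c)
{
	if (c->cur != OPTION_INIT_VAL)
		return c->cur;

	if (c->orig != OPTION_INIT_VAL)
		return c->orig;

	c->orig = backend_get();
	return c->orig;
}

/* Change the option, remembering the original value for opt_restore(). */
static int opt_set(struct opt_ctx *c, bool onoff)
{
	int value = opt_get(c);

	if (value == OPTION_INIT_VAL)
		return -1;

	if (value == onoff)
		return 0;

	if (backend_set(onoff) < 0)
		return -1;

	c->cur = onoff;
	return 0;
}

/* Write the original value back if this tool changed it, then reset. */
static void opt_restore(struct opt_ctx *c)
{
	if (c->orig == OPTION_INIT_VAL)
		return;

	if (c->cur != OPTION_INIT_VAL && c->cur != c->orig)
		backend_set(c->orig);

	c->orig = OPTION_INIT_VAL;
	c->cur = OPTION_INIT_VAL;
}
```

Keeping both `orig` and `cur` lets the tool skip redundant writes and leave the system as it found it, even when the option was never touched.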
+5
tools/tracing/rtla/src/osnoise.h
···
 	/* -1 as init value because 0 is off */
 	int orig_opt_irq_disable;
 	int opt_irq_disable;
+
+	/* -1 as init value because 0 is off */
+	int orig_opt_workload;
+	int opt_workload;
 };

 /*
···
 				    long long print_stack);

 int osnoise_set_irq_disable(struct osnoise_context *context, bool onoff);
+int osnoise_set_workload(struct osnoise_context *context, bool onoff);

 /*
  * osnoise_tool - osnoise based tool definition.
+76 -14
tools/tracing/rtla/src/osnoise_hist.c
··· 3 3 * Copyright (C) 2021 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org> 4 4 */ 5 5 6 + #define _GNU_SOURCE 6 7 #include <getopt.h> 7 8 #include <stdlib.h> 8 9 #include <string.h> ··· 12 11 #include <errno.h> 13 12 #include <stdio.h> 14 13 #include <time.h> 14 + #include <sched.h> 15 15 16 16 #include "utils.h" 17 17 #include "osnoise.h" 18 18 19 19 struct osnoise_hist_params { 20 20 char *cpus; 21 - char *monitored_cpus; 21 + cpu_set_t monitored_cpus; 22 22 char *trace_output; 23 + char *cgroup_name; 23 24 unsigned long long runtime; 24 25 unsigned long long period; 25 26 long long threshold; ··· 31 28 int duration; 32 29 int set_sched; 33 30 int output_divisor; 31 + int cgroup; 32 + int hk_cpus; 33 + cpu_set_t hk_cpu_set; 34 34 struct sched_attr sched_param; 35 35 struct trace_events *events; 36 36 ··· 274 268 trace_seq_printf(s, "Index"); 275 269 276 270 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 277 - if (params->cpus && !params->monitored_cpus[cpu]) 271 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 278 272 continue; 279 273 280 274 if (!data->hist[cpu].count) ··· 305 299 trace_seq_printf(trace->seq, "count:"); 306 300 307 301 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 308 - if (params->cpus && !params->monitored_cpus[cpu]) 302 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 309 303 continue; 310 304 311 305 if (!data->hist[cpu].count) ··· 319 313 trace_seq_printf(trace->seq, "min: "); 320 314 321 315 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 322 - if (params->cpus && !params->monitored_cpus[cpu]) 316 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 323 317 continue; 324 318 325 319 if (!data->hist[cpu].count) ··· 334 328 trace_seq_printf(trace->seq, "avg: "); 335 329 336 330 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 337 - if (params->cpus && !params->monitored_cpus[cpu]) 331 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 338 332 continue; 339 333 340 334 if 
(!data->hist[cpu].count) ··· 352 346 trace_seq_printf(trace->seq, "max: "); 353 347 354 348 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 355 - if (params->cpus && !params->monitored_cpus[cpu]) 349 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 356 350 continue; 357 351 358 352 if (!data->hist[cpu].count) ··· 387 381 bucket * data->bucket_size); 388 382 389 383 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 390 - if (params->cpus && !params->monitored_cpus[cpu]) 384 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 391 385 continue; 392 386 393 387 if (!data->hist[cpu].count) ··· 411 405 trace_seq_printf(trace->seq, "over: "); 412 406 413 407 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 414 - if (params->cpus && !params->monitored_cpus[cpu]) 408 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 415 409 continue; 416 410 417 411 if (!data->hist[cpu].count) ··· 438 432 "", 439 433 " usage: rtla osnoise hist [-h] [-D] [-d s] [-a us] [-p us] [-r us] [-s us] [-S us] \\", 440 434 " [-T us] [-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] \\", 441 - " [-c cpu-list] [-P priority] [-b N] [-E N] [--no-header] [--no-summary] [--no-index] \\", 442 - " [--with-zeros]", 435 + " [-c cpu-list] [-H cpu-list] [-P priority] [-b N] [-E N] [--no-header] [--no-summary] \\", 436 + " [--no-index] [--with-zeros] [-C[=cgroup_name]]", 443 437 "", 444 438 " -h/--help: print this menu", 445 439 " -a/--auto: set automatic trace mode, stopping the session if argument in us sample is hit", ··· 449 443 " -S/--stop-total us: stop trace if the total sample is higher than the argument in us", 450 444 " -T/--threshold us: the minimum delta to be considered a noise", 451 445 " -c/--cpus cpu-list: list of cpus to run osnoise threads", 446 + " -H/--house-keeping cpus: run rtla control threads only on the given cpus", 447 + " -C/--cgroup[=cgroup_name]: set cgroup, if no cgroup_name is passed, the rtla's cgroup will be inherited", 452 448 " 
-d/--duration time[s|m|h|d]: duration of the session", 453 449 " -D/--debug: print debug info", 454 450 " -t/--trace[=file]: save the stopped trace to [file|osnoise_trace.txt]", ··· 509 501 {"bucket-size", required_argument, 0, 'b'}, 510 502 {"entries", required_argument, 0, 'E'}, 511 503 {"cpus", required_argument, 0, 'c'}, 504 + {"cgroup", optional_argument, 0, 'C'}, 512 505 {"debug", no_argument, 0, 'D'}, 513 506 {"duration", required_argument, 0, 'd'}, 507 + {"house-keeping", required_argument, 0, 'H'}, 514 508 {"help", no_argument, 0, 'h'}, 515 509 {"period", required_argument, 0, 'p'}, 516 510 {"priority", required_argument, 0, 'P'}, ··· 534 524 /* getopt_long stores the option index here. */ 535 525 int option_index = 0; 536 526 537 - c = getopt_long(argc, argv, "a:c:b:d:e:E:Dhp:P:r:s:S:t::T:01234:5:", 527 + c = getopt_long(argc, argv, "a:c:C::b:d:e:E:DhH:p:P:r:s:S:t::T:01234:5:", 538 528 long_options, &option_index); 539 529 540 530 /* detect the end of the options. */ ··· 559 549 osnoise_hist_usage("Bucket size needs to be > 0 and <= 1000000\n"); 560 550 break; 561 551 case 'c': 562 - retval = parse_cpu_list(optarg, &params->monitored_cpus); 552 + retval = parse_cpu_set(optarg, &params->monitored_cpus); 563 553 if (retval) 564 554 osnoise_hist_usage("\nInvalid -c cpu list\n"); 565 555 params->cpus = optarg; 556 + break; 557 + case 'C': 558 + params->cgroup = 1; 559 + if (!optarg) { 560 + /* will inherit this cgroup */ 561 + params->cgroup_name = NULL; 562 + } else if (*optarg == '=') { 563 + /* skip the = */ 564 + params->cgroup_name = ++optarg; 565 + } 566 566 break; 567 567 case 'D': 568 568 config_debug = 1; ··· 602 582 case 'h': 603 583 case '?': 604 584 osnoise_hist_usage(NULL); 585 + break; 586 + case 'H': 587 + params->hk_cpus = 1; 588 + retval = parse_cpu_set(optarg, &params->hk_cpu_set); 589 + if (retval) { 590 + err_msg("Error parsing house keeping CPUs\n"); 591 + exit(EXIT_FAILURE); 592 + } 605 593 break; 606 594 case 'p': 607 595 params->period 
= get_llong_from_str(optarg); ··· 746 718 } 747 719 } 748 720 721 + if (params->hk_cpus) { 722 + retval = sched_setaffinity(getpid(), sizeof(params->hk_cpu_set), 723 + &params->hk_cpu_set); 724 + if (retval == -1) { 725 + err_msg("Failed to set rtla to the house keeping CPUs\n"); 726 + goto out_err; 727 + } 728 + } else if (params->cpus) { 729 + /* 730 + * Even if the user do not set a house-keeping CPU, try to 731 + * move rtla to a CPU set different to the one where the user 732 + * set the workload to run. 733 + * 734 + * No need to check results as this is an automatic attempt. 735 + */ 736 + auto_house_keeping(&params->monitored_cpus); 737 + } 738 + 749 739 return 0; 750 740 751 741 out_err: ··· 862 816 } 863 817 } 864 818 865 - trace_instance_start(trace); 819 + if (params->cgroup) { 820 + retval = set_comm_cgroup("timerlat/", params->cgroup_name); 821 + if (!retval) { 822 + err_msg("Failed to move threads to cgroup\n"); 823 + goto out_free; 824 + } 825 + } 866 826 867 827 if (params->trace_output) { 868 828 record = osnoise_init_trace_tool("osnoise"); ··· 883 831 goto out_hist; 884 832 } 885 833 886 - trace_instance_start(&record->trace); 887 834 } 835 + 836 + /* 837 + * Start the tracer here, after having set all instances. 838 + * 839 + * Let the trace instance start first for the case of hitting a stop 840 + * tracing while enabling other instances. The trace instance is the 841 + * one with most valuable information. 842 + */ 843 + if (params->trace_output) 844 + trace_instance_start(&record->trace); 845 + trace_instance_start(trace); 888 846 889 847 tool->start_time = time(NULL); 890 848 osnoise_hist_set_signals(params);
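
The osnoise_hist.c changes replace the `char *monitored_cpus` array with a `cpu_set_t`, so membership tests become `CPU_ISSET()` and the same set type can be handed to sched_setaffinity() for the house-keeping CPUs. The block below is a minimal sketch of turning a list such as `0-2,5` into a `cpu_set_t`. `parse_cpu_set_sketch()` is a hypothetical stand-in written for this example; it is not rtla's `parse_cpu_set()` and may differ from it in what it accepts.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdlib.h>

/*
 * Parse a list like "0-2,5" into @set.
 * Returns 0 on success, 1 on a malformed or out-of-range list.
 */
static int parse_cpu_set_sketch(const char *list, cpu_set_t *set)
{
	const char *p = list;
	char *end;
	long first, last, cpu;

	assert(list != NULL && set != NULL);
	CPU_ZERO(set);

	if (*p == '\0')
		return 1;

	while (*p) {
		first = strtol(p, &end, 10);
		if (end == p || first < 0 || first >= CPU_SETSIZE)
			return 1;
		last = first;
		p = end;

		/* Optional "-last" part of a range. */
		if (*p == '-') {
			p++;
			last = strtol(p, &end, 10);
			if (end == p || last < first || last >= CPU_SETSIZE)
				return 1;
			p = end;
		}

		for (cpu = first; cpu <= last; cpu++)
			CPU_SET(cpu, set);

		if (*p == ',')
			p++;
		else if (*p != '\0')
			return 1;
	}

	return 0;
}
```

A set built this way can be tested per CPU with `CPU_ISSET(cpu, &set)`, as the printing loops in the diff do, or passed as `sched_setaffinity(getpid(), sizeof(set), &set)` to move the calling process, as the house-keeping hunk does.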
+76 -9
tools/tracing/rtla/src/osnoise_top.c
··· 3 3 * Copyright (C) 2021 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org> 4 4 */ 5 5 6 + #define _GNU_SOURCE 6 7 #include <getopt.h> 7 8 #include <stdlib.h> 8 9 #include <string.h> ··· 11 10 #include <unistd.h> 12 11 #include <stdio.h> 13 12 #include <time.h> 13 + #include <sched.h> 14 14 15 15 #include "osnoise.h" 16 16 #include "utils.h" ··· 26 24 */ 27 25 struct osnoise_top_params { 28 26 char *cpus; 29 - char *monitored_cpus; 27 + cpu_set_t monitored_cpus; 30 28 char *trace_output; 29 + char *cgroup_name; 31 30 unsigned long long runtime; 32 31 unsigned long long period; 33 32 long long threshold; ··· 38 35 int duration; 39 36 int quiet; 40 37 int set_sched; 38 + int cgroup; 39 + int hk_cpus; 40 + cpu_set_t hk_cpu_set; 41 41 struct sched_attr sched_param; 42 42 struct trace_events *events; 43 43 enum osnoise_mode mode; ··· 263 257 osnoise_top_header(top); 264 258 265 259 for (i = 0; i < nr_cpus; i++) { 266 - if (params->cpus && !params->monitored_cpus[i]) 260 + if (params->cpus && !CPU_ISSET(i, &params->monitored_cpus)) 267 261 continue; 268 262 osnoise_top_print(top, i); 269 263 } ··· 282 276 static const char * const msg[] = { 283 277 " [-h] [-q] [-D] [-d s] [-a us] [-p us] [-r us] [-s us] [-S us] \\", 284 278 " [-T us] [-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] \\", 285 - " [-c cpu-list] [-P priority]", 279 + " [-c cpu-list] [-H cpu-list] [-P priority] [-C[=cgroup_name]]", 286 280 "", 287 281 " -h/--help: print this menu", 288 282 " -a/--auto: set automatic trace mode, stopping the session if argument in us sample is hit", ··· 292 286 " -S/--stop-total us: stop trace if the total sample is higher than the argument in us", 293 287 " -T/--threshold us: the minimum delta to be considered a noise", 294 288 " -c/--cpus cpu-list: list of cpus to run osnoise threads", 289 + " -H/--house-keeping cpus: run rtla control threads only on the given cpus", 290 + " -C/--cgroup[=cgroup_name]: set cgroup, if no cgroup_name is 
passed, the rtla's cgroup will be inherited", 295 291 " -d/--duration time[s|m|h|d]: duration of the session", 296 292 " -D/--debug: print debug info", 297 293 " -t/--trace[=file]: save the stopped trace to [file|osnoise_trace.txt]", ··· 348 340 if (!params) 349 341 exit(1); 350 342 351 - if (strcmp(argv[0], "hwnoise") == 0) 343 + if (strcmp(argv[0], "hwnoise") == 0) { 352 344 params->mode = MODE_HWNOISE; 345 + /* 346 + * Reduce CPU usage for 75% to avoid killing the system. 347 + */ 348 + params->runtime = 750000; 349 + params->period = 1000000; 350 + } 353 351 354 352 while (1) { 355 353 static struct option long_options[] = { 356 354 {"auto", required_argument, 0, 'a'}, 357 355 {"cpus", required_argument, 0, 'c'}, 356 + {"cgroup", optional_argument, 0, 'C'}, 358 357 {"debug", no_argument, 0, 'D'}, 359 358 {"duration", required_argument, 0, 'd'}, 360 359 {"event", required_argument, 0, 'e'}, 360 + {"house-keeping", required_argument, 0, 'H'}, 361 361 {"help", no_argument, 0, 'h'}, 362 362 {"period", required_argument, 0, 'p'}, 363 363 {"priority", required_argument, 0, 'P'}, ··· 383 367 /* getopt_long stores the option index here. */ 384 368 int option_index = 0; 385 369 386 - c = getopt_long(argc, argv, "a:c:d:De:hp:P:qr:s:S:t::T:0:1:", 370 + c = getopt_long(argc, argv, "a:c:C::d:De:hH:p:P:qr:s:S:t::T:0:1:", 387 371 long_options, &option_index); 388 372 389 373 /* Detect the end of the options. 
*/ ··· 403 387 404 388 break; 405 389 case 'c': 406 - retval = parse_cpu_list(optarg, &params->monitored_cpus); 390 + retval = parse_cpu_set(optarg, &params->monitored_cpus); 407 391 if (retval) 408 392 osnoise_top_usage(params, "\nInvalid -c cpu list\n"); 409 393 params->cpus = optarg; 394 + break; 395 + case 'C': 396 + params->cgroup = 1; 397 + if (!optarg) { 398 + /* will inherit this cgroup */ 399 + params->cgroup_name = NULL; 400 + } else if (*optarg == '=') { 401 + /* skip the = */ 402 + params->cgroup_name = ++optarg; 403 + } 410 404 break; 411 405 case 'D': 412 406 config_debug = 1; ··· 441 415 case 'h': 442 416 case '?': 443 417 osnoise_top_usage(params, NULL); 418 + break; 419 + case 'H': 420 + params->hk_cpus = 1; 421 + retval = parse_cpu_set(optarg, &params->hk_cpu_set); 422 + if (retval) { 423 + err_msg("Error parsing house keeping CPUs\n"); 424 + exit(EXIT_FAILURE); 425 + } 444 426 break; 445 427 case 'p': 446 428 params->period = get_llong_from_str(optarg); ··· 581 547 } 582 548 } 583 549 550 + if (params->hk_cpus) { 551 + retval = sched_setaffinity(getpid(), sizeof(params->hk_cpu_set), 552 + &params->hk_cpu_set); 553 + if (retval == -1) { 554 + err_msg("Failed to set rtla to the house keeping CPUs\n"); 555 + goto out_err; 556 + } 557 + } else if (params->cpus) { 558 + /* 559 + * Even if the user do not set a house-keeping CPU, try to 560 + * move rtla to a CPU set different to the one where the user 561 + * set the workload to run. 562 + * 563 + * No need to check results as this is an automatic attempt. 
564 + */ 565 + auto_house_keeping(&params->monitored_cpus); 566 + } 567 + 584 568 return 0; 585 569 586 570 out_err: ··· 695 643 } 696 644 } 697 645 698 - trace_instance_start(trace); 646 + if (params->cgroup) { 647 + retval = set_comm_cgroup("osnoise/", params->cgroup_name); 648 + if (!retval) { 649 + err_msg("Failed to move threads to cgroup\n"); 650 + goto out_free; 651 + } 652 + } 699 653 700 654 if (params->trace_output) { 701 655 record = osnoise_init_trace_tool("osnoise"); ··· 715 657 if (retval) 716 658 goto out_top; 717 659 } 718 - 719 - trace_instance_start(&record->trace); 720 660 } 661 + 662 + /* 663 + * Start the tracer here, after having set all instances. 664 + * 665 + * Let the trace instance start first for the case of hitting a stop 666 + * tracing while enabling other instances. The trace instance is the 667 + * one with most valuable information. 668 + */ 669 + if (params->trace_output) 670 + trace_instance_start(&record->trace); 671 + trace_instance_start(trace); 721 672 722 673 tool->start_time = time(NULL); 723 674 osnoise_top_set_signals(params);
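
The hwnoise change in osnoise_top.c sets a runtime of 750000 and a period of 1000000, which is where the "75%" in the commit message comes from: the sampling thread runs for 750000 out of every 1000000 time units, leaving the remaining quarter of each period to the rest of the system. The arithmetic can be sketched as follows; `duty_cycle_percent()` and `idle_per_period()` are hypothetical helper names used only for this illustration.

```c
#include <assert.h>

/* Share of each period spent running the sampling thread, in percent. */
static unsigned int duty_cycle_percent(unsigned long long runtime,
				       unsigned long long period)
{
	assert(period > 0 && runtime <= period);

	return (unsigned int)(runtime * 100 / period);
}

/* Time left to other tasks in each period, in the same unit as the inputs. */
static unsigned long long idle_per_period(unsigned long long runtime,
					  unsigned long long period)
{
	assert(runtime <= period);

	return period - runtime;
}
```

With the values from the diff, `duty_cycle_percent(750000, 1000000)` evaluates to 75 and `idle_per_period(750000, 1000000)` to 250000.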
+33 -2
tools/tracing/rtla/src/timerlat_aa.c
···
 #include "utils.h"
 #include "osnoise.h"
 #include "timerlat.h"
+#include <unistd.h>

 enum timelat_state {
 	TIMERLAT_INIT = 0,
···
  *
  * Returns 0 on success, -1 otherwise.
  */
-int timerlat_aa_handler(struct trace_seq *s, struct tep_record *record,
+static int timerlat_aa_handler(struct trace_seq *s, struct tep_record *record,
 			struct tep_event *event, void *context)
 {
 	struct timerlat_aa_context *taa_ctx = timerlat_aa_get_ctx();
···
 		ns_to_usf(total));
 }

+static int timerlat_auto_analysis_collect_trace(struct timerlat_aa_context *taa_ctx)
+{
+	struct trace_instance *trace = &taa_ctx->tool->trace;
+	int retval;
+
+	retval = tracefs_iterate_raw_events(trace->tep,
+					    trace->inst,
+					    NULL,
+					    0,
+					    collect_registered_events,
+					    trace);
+	if (retval < 0) {
+		err_msg("Error iterating on events\n");
+		return 0;
+	}
+
+	return 1;
+}
+
 /**
  * timerlat_auto_analysis - Analyze the collected data
  */
···
 	int max_exit_from_idle_cpu;
 	struct tep_handle *tep;
 	int cpu;
+
+	timerlat_auto_analysis_collect_trace(taa_ctx);

 	/* bring stop tracing to the ns scale */
 	irq_thresh = irq_thresh * 1000;
···
  */
 static void timerlat_aa_unregister_events(struct osnoise_tool *tool, int dump_tasks)
 {
+
+	tep_unregister_event_handler(tool->trace.tep, -1, "ftrace", "timerlat",
+				     timerlat_aa_handler, tool);
+
 	tracefs_event_disable(tool->trace.inst, "osnoise", NULL);

 	tep_unregister_event_handler(tool->trace.tep, -1, "osnoise", "nmi_noise",
···
 static int timerlat_aa_register_events(struct osnoise_tool *tool, int dump_tasks)
 {
 	int retval;
+
+	tep_register_event_handler(tool->trace.tep, -1, "ftrace", "timerlat",
+				   timerlat_aa_handler, tool);
+

 	/*
 	 * register auto-analysis handlers.
···
  *
  * Returns 0 on success, -1 otherwise.
  */
-int timerlat_aa_init(struct osnoise_tool *tool, int nr_cpus, int dump_tasks)
+int timerlat_aa_init(struct osnoise_tool *tool, int dump_tasks)
 {
+	int nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
 	struct timerlat_aa_context *taa_ctx;
 	int retval;

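
In the last hunk of timerlat_aa.c, `timerlat_aa_init()` stops taking `nr_cpus` from its caller and asks the system with `sysconf(_SC_NPROCESSORS_CONF)`. A sketch of sizing a per-CPU array that way is shown below; `struct per_cpu_data` and `alloc_per_cpu()` are hypothetical names and do not reflect the layout of rtla's auto-analysis context.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-CPU record used only in this sketch. */
struct per_cpu_data {
	unsigned long long max_latency;
};

/*
 * Allocate one zeroed record per configured CPU.
 * Stores the CPU count in @nr_cpus_out; returns NULL on failure.
 */
static struct per_cpu_data *alloc_per_cpu(int *nr_cpus_out)
{
	long nr_cpus = sysconf(_SC_NPROCESSORS_CONF);

	assert(nr_cpus_out != NULL);

	if (nr_cpus < 1)
		return NULL;

	*nr_cpus_out = (int)nr_cpus;

	return calloc((size_t)nr_cpus, sizeof(struct per_cpu_data));
}
```

`_SC_NPROCESSORS_CONF` counts configured processors rather than only the online ones, so the array has a slot for every CPU id the tracer might report.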
+1 -4
tools/tracing/rtla/src/timerlat_aa.h
···
  * Copyright (C) 2023 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org>
  */

-int timerlat_aa_init(struct osnoise_tool *tool, int nr_cpus, int dump_task);
+int timerlat_aa_init(struct osnoise_tool *tool, int dump_task);
 void timerlat_aa_destroy(void);
-
-int timerlat_aa_handler(struct trace_seq *s, struct tep_record *record,
-			struct tep_event *event, void *context);

 void timerlat_auto_analysis(int irq_thresh, int thread_thresh);
+239 -25
tools/tracing/rtla/src/timerlat_hist.c
··· 3 3 * Copyright (C) 2021 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org> 4 4 */ 5 5 6 + #define _GNU_SOURCE 6 7 #include <getopt.h> 7 8 #include <stdlib.h> 8 9 #include <string.h> ··· 11 10 #include <unistd.h> 12 11 #include <stdio.h> 13 12 #include <time.h> 13 + #include <sched.h> 14 + #include <pthread.h> 14 15 15 16 #include "utils.h" 16 17 #include "osnoise.h" 17 18 #include "timerlat.h" 19 + #include "timerlat_aa.h" 20 + #include "timerlat_u.h" 18 21 19 22 struct timerlat_hist_params { 20 23 char *cpus; 21 - char *monitored_cpus; 24 + cpu_set_t monitored_cpus; 22 25 char *trace_output; 26 + char *cgroup_name; 23 27 unsigned long long runtime; 24 28 long long stop_us; 25 29 long long stop_total_us; ··· 35 29 int duration; 36 30 int set_sched; 37 31 int dma_latency; 32 + int cgroup; 33 + int hk_cpus; 34 + int no_aa; 35 + int dump_tasks; 36 + int user_hist; 37 + cpu_set_t hk_cpu_set; 38 38 struct sched_attr sched_param; 39 39 struct trace_events *events; 40 - 41 40 char no_irq; 42 41 char no_thread; 43 42 char no_header; ··· 56 45 struct timerlat_hist_cpu { 57 46 int *irq; 58 47 int *thread; 48 + int *user; 59 49 60 50 int irq_count; 61 51 int thread_count; 52 + int user_count; 62 53 63 54 unsigned long long min_irq; 64 55 unsigned long long sum_irq; ··· 69 56 unsigned long long min_thread; 70 57 unsigned long long sum_thread; 71 58 unsigned long long max_thread; 59 + 60 + unsigned long long min_user; 61 + unsigned long long sum_user; 62 + unsigned long long max_user; 72 63 }; 73 64 74 65 struct timerlat_hist_data { ··· 97 80 98 81 if (data->hist[cpu].thread) 99 82 free(data->hist[cpu].thread); 83 + 84 + if (data->hist[cpu].user) 85 + free(data->hist[cpu].user); 86 + 100 87 } 101 88 102 89 /* one set of histograms per CPU */ ··· 137 116 data->hist[cpu].irq = calloc(1, sizeof(*data->hist->irq) * (entries + 1)); 138 117 if (!data->hist[cpu].irq) 139 118 goto cleanup; 119 + 140 120 data->hist[cpu].thread = calloc(1, sizeof(*data->hist->thread) * 
(entries + 1)); 141 121 if (!data->hist[cpu].thread) 122 + goto cleanup; 123 + 124 + data->hist[cpu].user = calloc(1, sizeof(*data->hist->user) * (entries + 1)); 125 + if (!data->hist[cpu].user) 142 126 goto cleanup; 143 127 } 144 128 ··· 151 125 for (cpu = 0; cpu < nr_cpus; cpu++) { 152 126 data->hist[cpu].min_irq = ~0; 153 127 data->hist[cpu].min_thread = ~0; 128 + data->hist[cpu].min_user = ~0; 154 129 } 155 130 156 131 return data; ··· 166 139 */ 167 140 static void 168 141 timerlat_hist_update(struct osnoise_tool *tool, int cpu, 169 - unsigned long long thread, 142 + unsigned long long context, 170 143 unsigned long long latency) 171 144 { 172 145 struct timerlat_hist_params *params = tool->params; ··· 181 154 if (data->bucket_size) 182 155 bucket = latency / data->bucket_size; 183 156 184 - if (!thread) { 157 + if (!context) { 185 158 hist = data->hist[cpu].irq; 186 159 data->hist[cpu].irq_count++; 187 160 update_min(&data->hist[cpu].min_irq, &latency); 188 161 update_sum(&data->hist[cpu].sum_irq, &latency); 189 162 update_max(&data->hist[cpu].max_irq, &latency); 190 - } else { 163 + } else if (context == 1) { 191 164 hist = data->hist[cpu].thread; 192 165 data->hist[cpu].thread_count++; 193 166 update_min(&data->hist[cpu].min_thread, &latency); 194 167 update_sum(&data->hist[cpu].sum_thread, &latency); 195 168 update_max(&data->hist[cpu].max_thread, &latency); 169 + } else { /* user */ 170 + hist = data->hist[cpu].user; 171 + data->hist[cpu].user_count++; 172 + update_min(&data->hist[cpu].min_user, &latency); 173 + update_sum(&data->hist[cpu].sum_user, &latency); 174 + update_max(&data->hist[cpu].max_user, &latency); 196 175 } 197 176 198 177 if (bucket < entries) ··· 215 182 struct tep_event *event, void *data) 216 183 { 217 184 struct trace_instance *trace = data; 218 - unsigned long long thread, latency; 185 + unsigned long long context, latency; 219 186 struct osnoise_tool *tool; 220 187 int cpu = record->cpu; 221 188 222 189 tool = container_of(trace, 
struct osnoise_tool, trace); 223 190 224 - tep_get_field_val(s, event, "context", record, &thread, 1); 191 + tep_get_field_val(s, event, "context", record, &context, 1); 225 192 tep_get_field_val(s, event, "timer_latency", record, &latency, 1); 226 193 227 - timerlat_hist_update(tool, cpu, thread, latency); 194 + timerlat_hist_update(tool, cpu, context, latency); 228 195 229 196 return 0; 230 197 } ··· 255 222 trace_seq_printf(s, "Index"); 256 223 257 224 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 258 - if (params->cpus && !params->monitored_cpus[cpu]) 225 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 259 226 continue; 260 227 261 228 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) ··· 266 233 267 234 if (!params->no_thread) 268 235 trace_seq_printf(s, " Thr-%03d", cpu); 236 + 237 + if (params->user_hist) 238 + trace_seq_printf(s, " Usr-%03d", cpu); 269 239 } 270 240 trace_seq_printf(s, "\n"); 271 241 ··· 294 258 trace_seq_printf(trace->seq, "count:"); 295 259 296 260 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 297 - if (params->cpus && !params->monitored_cpus[cpu]) 261 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 298 262 continue; 299 263 300 264 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) ··· 307 271 if (!params->no_thread) 308 272 trace_seq_printf(trace->seq, "%9d ", 309 273 data->hist[cpu].thread_count); 274 + 275 + if (params->user_hist) 276 + trace_seq_printf(trace->seq, "%9d ", 277 + data->hist[cpu].user_count); 310 278 } 311 279 trace_seq_printf(trace->seq, "\n"); 312 280 ··· 318 278 trace_seq_printf(trace->seq, "min: "); 319 279 320 280 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 321 - if (params->cpus && !params->monitored_cpus[cpu]) 281 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 322 282 continue; 323 283 324 284 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) ··· 331 291 if (!params->no_thread) 332 292 trace_seq_printf(trace->seq, "%9llu ", 
333 293 data->hist[cpu].min_thread); 294 + 295 + if (params->user_hist) 296 + trace_seq_printf(trace->seq, "%9llu ", 297 + data->hist[cpu].min_user); 334 298 } 335 299 trace_seq_printf(trace->seq, "\n"); 336 300 ··· 342 298 trace_seq_printf(trace->seq, "avg: "); 343 299 344 300 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 345 - if (params->cpus && !params->monitored_cpus[cpu]) 301 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 346 302 continue; 347 303 348 304 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) ··· 359 315 if (!params->no_thread) { 360 316 if (data->hist[cpu].thread_count) 361 317 trace_seq_printf(trace->seq, "%9llu ", 362 - data->hist[cpu].sum_thread / data->hist[cpu].thread_count); 318 + data->hist[cpu].sum_thread / data->hist[cpu].thread_count); 319 + else 320 + trace_seq_printf(trace->seq, " - "); 321 + } 322 + 323 + if (params->user_hist) { 324 + if (data->hist[cpu].user_count) 325 + trace_seq_printf(trace->seq, "%9llu ", 326 + data->hist[cpu].sum_user / data->hist[cpu].user_count); 363 327 else 364 328 trace_seq_printf(trace->seq, " - "); 365 329 } ··· 378 326 trace_seq_printf(trace->seq, "max: "); 379 327 380 328 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 381 - if (params->cpus && !params->monitored_cpus[cpu]) 329 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 382 330 continue; 383 331 384 332 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) ··· 391 339 if (!params->no_thread) 392 340 trace_seq_printf(trace->seq, "%9llu ", 393 341 data->hist[cpu].max_thread); 342 + 343 + if (params->user_hist) 344 + trace_seq_printf(trace->seq, "%9llu ", 345 + data->hist[cpu].max_user); 394 346 } 395 347 trace_seq_printf(trace->seq, "\n"); 396 348 trace_seq_do_printf(trace->seq); ··· 422 366 bucket * data->bucket_size); 423 367 424 368 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 425 - if (params->cpus && !params->monitored_cpus[cpu]) 369 + if (params->cpus && !CPU_ISSET(cpu, 
&params->monitored_cpus)) 426 370 continue; 427 371 428 372 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) ··· 438 382 total += data->hist[cpu].thread[bucket]; 439 383 trace_seq_printf(trace->seq, "%9d ", 440 384 data->hist[cpu].thread[bucket]); 385 + } 386 + 387 + if (params->user_hist) { 388 + total += data->hist[cpu].user[bucket]; 389 + trace_seq_printf(trace->seq, "%9d ", 390 + data->hist[cpu].user[bucket]); 441 391 } 442 392 443 393 } ··· 462 400 trace_seq_printf(trace->seq, "over: "); 463 401 464 402 for (cpu = 0; cpu < data->nr_cpus; cpu++) { 465 - if (params->cpus && !params->monitored_cpus[cpu]) 403 + if (params->cpus && !CPU_ISSET(cpu, &params->monitored_cpus)) 466 404 continue; 467 405 468 406 if (!data->hist[cpu].irq_count && !data->hist[cpu].thread_count) ··· 475 413 if (!params->no_thread) 476 414 trace_seq_printf(trace->seq, "%9d ", 477 415 data->hist[cpu].thread[data->entries]); 416 + 417 + if (params->user_hist) 418 + trace_seq_printf(trace->seq, "%9d ", 419 + data->hist[cpu].user[data->entries]); 478 420 } 479 421 trace_seq_printf(trace->seq, "\n"); 480 422 trace_seq_do_printf(trace->seq); ··· 497 431 char *msg[] = { 498 432 "", 499 433 " usage: [rtla] timerlat hist [-h] [-q] [-d s] [-D] [-n] [-a us] [-p us] [-i us] [-T us] [-s us] \\", 500 - " [-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] [-c cpu-list] \\", 434 + " [-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] [-c cpu-list] [-H cpu-list]\\", 501 435 " [-P priority] [-E N] [-b N] [--no-irq] [--no-thread] [--no-header] [--no-summary] \\", 502 - " [--no-index] [--with-zeros] [--dma-latency us]", 436 + " [--no-index] [--with-zeros] [--dma-latency us] [-C[=cgroup_name]] [--no-aa] [--dump-task] [-u]", 503 437 "", 504 438 " -h/--help: print this menu", 505 439 " -a/--auto: set automatic trace mode, stopping the session if argument in us latency is hit", ··· 508 442 " -T/--thread us: stop trace if the thread latency is higher than the 
argument in us", 509 443 " -s/--stack us: save the stack trace at the IRQ if a thread latency is higher than the argument in us", 510 444 " -c/--cpus cpus: run the tracer only on the given cpus", 445 + " -H/--house-keeping cpus: run rtla control threads only on the given cpus", 446 + " -C/--cgroup[=cgroup_name]: set cgroup, if no cgroup_name is passed, the rtla's cgroup will be inherited", 511 447 " -d/--duration time[m|h|d]: duration of the session in seconds", 448 + " --dump-tasks: prints the task running on all CPUs if stop conditions are met (depends on !--no-aa)", 512 449 " -D/--debug: print debug info", 513 450 " -t/--trace[=file]: save the stopped trace to [file|timerlat_trace.txt]", 514 451 " -e/--event <sys:event>: enable the <sys:event> in the trace instance, multiple -e are allowed", 515 452 " --filter <filter>: enable a trace event filter to the previous -e event", 516 453 " --trigger <trigger>: enable a trace event trigger to the previous -e event", 517 454 " -n/--nano: display data in nanoseconds", 455 + " --no-aa: disable auto-analysis, reducing rtla timerlat cpu usage", 518 456 " -b/--bucket-size N: set the histogram bucket size (default 1)", 519 457 " -E/--entries N: set the number of entries of the histogram (default 256)", 520 458 " --no-irq: ignore IRQ latencies", ··· 534 464 " f:prio - use SCHED_FIFO with prio", 535 465 " d:runtime[us|ms|s]:period[us|ms|s] - use SCHED_DEADLINE with runtime and period", 536 466 " in nanoseconds", 467 + " -u/--user-threads: use rtla user-space threads instead of in-kernel timerlat threads", 537 468 NULL, 538 469 }; 539 470 ··· 577 506 static struct option long_options[] = { 578 507 {"auto", required_argument, 0, 'a'}, 579 508 {"cpus", required_argument, 0, 'c'}, 509 + {"cgroup", optional_argument, 0, 'C'}, 580 510 {"bucket-size", required_argument, 0, 'b'}, 581 511 {"debug", no_argument, 0, 'D'}, 582 512 {"entries", required_argument, 0, 'E'}, 583 513 {"duration", required_argument, 0, 'd'}, 514 + 
{"house-keeping", required_argument, 0, 'H'}, 584 515 {"help", no_argument, 0, 'h'}, 585 516 {"irq", required_argument, 0, 'i'}, 586 517 {"nano", no_argument, 0, 'n'}, ··· 591 518 {"stack", required_argument, 0, 's'}, 592 519 {"thread", required_argument, 0, 'T'}, 593 520 {"trace", optional_argument, 0, 't'}, 521 + {"user-threads", no_argument, 0, 'u'}, 594 522 {"event", required_argument, 0, 'e'}, 595 523 {"no-irq", no_argument, 0, '0'}, 596 524 {"no-thread", no_argument, 0, '1'}, ··· 602 528 {"trigger", required_argument, 0, '6'}, 603 529 {"filter", required_argument, 0, '7'}, 604 530 {"dma-latency", required_argument, 0, '8'}, 531 + {"no-aa", no_argument, 0, '9'}, 532 + {"dump-task", no_argument, 0, '\1'}, 605 533 {0, 0, 0, 0} 606 534 }; 607 535 608 536 /* getopt_long stores the option index here. */ 609 537 int option_index = 0; 610 538 611 - c = getopt_long(argc, argv, "a:c:b:d:e:E:Dhi:np:P:s:t::T:0123456:7:8:", 539 + c = getopt_long(argc, argv, "a:c:C::b:d:e:E:DhH:i:np:P:s:t::T:u0123456:7:8:9\1", 612 540 long_options, &option_index); 613 541 614 542 /* detect the end of the options. 
*/ ··· 623 547 624 548 /* set thread stop to auto_thresh */ 625 549 params->stop_total_us = auto_thresh; 550 + params->stop_us = auto_thresh; 626 551 627 552 /* get stack trace */ 628 553 params->print_stack = auto_thresh; ··· 633 556 634 557 break; 635 558 case 'c': 636 - retval = parse_cpu_list(optarg, &params->monitored_cpus); 559 + retval = parse_cpu_set(optarg, &params->monitored_cpus); 637 560 if (retval) 638 561 timerlat_hist_usage("\nInvalid -c cpu list\n"); 639 562 params->cpus = optarg; 563 + break; 564 + case 'C': 565 + params->cgroup = 1; 566 + if (!optarg) { 567 + /* will inherit this cgroup */ 568 + params->cgroup_name = NULL; 569 + } else if (*optarg == '=') { 570 + /* skip the = */ 571 + params->cgroup_name = ++optarg; 572 + } 640 573 break; 641 574 case 'b': 642 575 params->bucket_size = get_llong_from_str(optarg); ··· 682 595 case '?': 683 596 timerlat_hist_usage(NULL); 684 597 break; 598 + case 'H': 599 + params->hk_cpus = 1; 600 + retval = parse_cpu_set(optarg, &params->hk_cpu_set); 601 + if (retval) { 602 + err_msg("Error parsing house keeping CPUs\n"); 603 + exit(EXIT_FAILURE); 604 + } 605 + break; 685 606 case 'i': 686 607 params->stop_us = get_llong_from_str(optarg); 687 608 break; ··· 719 624 params->trace_output = &optarg[1]; 720 625 else 721 626 params->trace_output = "timerlat_trace.txt"; 627 + break; 628 + case 'u': 629 + params->user_hist = 1; 722 630 break; 723 631 case '0': /* no irq */ 724 632 params->no_irq = 1; ··· 770 672 exit(EXIT_FAILURE); 771 673 } 772 674 break; 675 + case '9': 676 + params->no_aa = 1; 677 + break; 678 + case '\1': 679 + params->dump_tasks = 1; 680 + break; 773 681 default: 774 682 timerlat_hist_usage("Invalid option"); 775 683 } ··· 792 688 if (params->no_index && !params->with_zeros) 793 689 timerlat_hist_usage("no-index set with with-zeros is not set - it does not make sense"); 794 690 691 + /* 692 + * Auto analysis only happens if stop tracing, thus: 693 + */ 694 + if (!params->stop_us && 
!params->stop_total_us) 695 + params->no_aa = 1; 696 + 795 697 return params; 796 698 } 797 699 ··· 807 697 static int 808 698 timerlat_hist_apply_config(struct osnoise_tool *tool, struct timerlat_hist_params *params) 809 699 { 810 - int retval; 700 + int retval, i; 811 701 812 702 if (!params->sleep_time) 813 703 params->sleep_time = 1; ··· 818 708 err_msg("Failed to apply CPUs config\n"); 819 709 goto out_err; 820 710 } 711 + } else { 712 + for (i = 0; i < sysconf(_SC_NPROCESSORS_CONF); i++) 713 + CPU_SET(i, &params->monitored_cpus); 821 714 } 822 715 823 716 if (params->stop_us) { ··· 851 738 retval = osnoise_set_print_stack(tool->context, params->print_stack); 852 739 if (retval) { 853 740 err_msg("Failed to set print stack\n"); 741 + goto out_err; 742 + } 743 + } 744 + 745 + if (params->hk_cpus) { 746 + retval = sched_setaffinity(getpid(), sizeof(params->hk_cpu_set), 747 + &params->hk_cpu_set); 748 + if (retval == -1) { 749 + err_msg("Failed to set rtla to the house keeping CPUs\n"); 750 + goto out_err; 751 + } 752 + } else if (params->cpus) { 753 + /* 754 + * Even if the user do not set a house-keeping CPU, try to 755 + * move rtla to a CPU set different to the one where the user 756 + * set the workload to run. 757 + * 758 + * No need to check results as this is an automatic attempt. 
759 + */ 760 + auto_house_keeping(&params->monitored_cpus); 761 + } 762 + 763 + if (params->user_hist) { 764 + retval = osnoise_set_workload(tool->context, 0); 765 + if (retval) { 766 + err_msg("Failed to set OSNOISE_WORKLOAD option\n"); 854 767 goto out_err; 855 768 } 856 769 } ··· 941 802 { 942 803 struct timerlat_hist_params *params; 943 804 struct osnoise_tool *record = NULL; 805 + struct timerlat_u_params params_u; 944 806 struct osnoise_tool *tool = NULL; 807 + struct osnoise_tool *aa = NULL; 945 808 struct trace_instance *trace; 946 809 int dma_latency_fd = -1; 947 810 int return_value = 1; 811 + pthread_t timerlat_u; 948 812 int retval; 949 813 950 814 params = timerlat_hist_parse_args(argc, argv); ··· 982 840 } 983 841 } 984 842 843 + if (params->cgroup && !params->user_hist) { 844 + retval = set_comm_cgroup("timerlat/", params->cgroup_name); 845 + if (!retval) { 846 + err_msg("Failed to move threads to cgroup\n"); 847 + goto out_free; 848 + } 849 + } 850 + 985 851 if (params->dma_latency >= 0) { 986 852 dma_latency_fd = set_cpu_dma_latency(params->dma_latency); 987 853 if (dma_latency_fd < 0) { ··· 997 847 goto out_free; 998 848 } 999 849 } 1000 - 1001 - trace_instance_start(trace); 1002 850 1003 851 if (params->trace_output) { 1004 852 record = osnoise_init_trace_tool("timerlat"); ··· 1010 862 if (retval) 1011 863 goto out_hist; 1012 864 } 1013 - 1014 - trace_instance_start(&record->trace); 1015 865 } 866 + 867 + if (!params->no_aa) { 868 + aa = osnoise_init_tool("timerlat_aa"); 869 + if (!aa) 870 + goto out_hist; 871 + 872 + retval = timerlat_aa_init(aa, params->dump_tasks); 873 + if (retval) { 874 + err_msg("Failed to enable the auto analysis instance\n"); 875 + goto out_hist; 876 + } 877 + 878 + retval = enable_timerlat(&aa->trace); 879 + if (retval) { 880 + err_msg("Failed to enable timerlat tracer\n"); 881 + goto out_hist; 882 + } 883 + } 884 + 885 + /* 886 + * Start the tracers here, after having set all instances. 
887 + * 888 + * Let the trace instance start first for the case of hitting a stop 889 + * tracing while enabling other instances. The trace instance is the 890 + * one with most valuable information. 891 + */ 892 + if (params->trace_output) 893 + trace_instance_start(&record->trace); 894 + if (!params->no_aa) 895 + trace_instance_start(&aa->trace); 896 + trace_instance_start(trace); 1016 897 1017 898 tool->start_time = time(NULL); 1018 899 timerlat_hist_set_signals(params); 900 + 901 + if (params->user_hist) { 902 + /* rtla asked to stop */ 903 + params_u.should_run = 1; 904 + /* all threads left */ 905 + params_u.stopped_running = 0; 906 + 907 + params_u.set = &params->monitored_cpus; 908 + if (params->set_sched) 909 + params_u.sched_param = &params->sched_param; 910 + else 911 + params_u.sched_param = NULL; 912 + 913 + params_u.cgroup_name = params->cgroup_name; 914 + 915 + retval = pthread_create(&timerlat_u, NULL, timerlat_u_dispatcher, &params_u); 916 + if (retval) 917 + err_msg("Error creating timerlat user-space threads\n"); 918 + } 1019 919 1020 920 while (!stop_tracing) { 1021 921 sleep(params->sleep_time); ··· 1081 885 1082 886 if (trace_is_off(&tool->trace, &record->trace)) 1083 887 break; 888 + 889 + /* is there still any user-threads ? 
*/ 890 + if (params->user_hist) { 891 + if (params_u.stopped_running) { 892 + debug_msg("timerlat user-space threads stopped!\n"); 893 + break; 894 + } 895 + } 896 + } 897 + if (params->user_hist && !params_u.stopped_running) { 898 + params_u.should_run = 0; 899 + sleep(1); 1084 900 } 1085 901 1086 902 timerlat_print_stats(params, tool); ··· 1101 893 1102 894 if (trace_is_off(&tool->trace, &record->trace)) { 1103 895 printf("rtla timerlat hit stop tracing\n"); 896 + 897 + if (!params->no_aa) 898 + timerlat_auto_analysis(params->stop_us, params->stop_total_us); 899 + 1104 900 if (params->trace_output) { 1105 901 printf(" Saving trace to %s\n", params->trace_output); 1106 902 save_trace_to_file(record->trace.inst, params->trace_output); ··· 1112 900 } 1113 901 1114 902 out_hist: 903 + timerlat_aa_destroy(); 1115 904 if (dma_latency_fd >= 0) 1116 905 close(dma_latency_fd); 1117 906 trace_events_destroy(&record->trace, params->events); 1118 907 params->events = NULL; 1119 908 out_free: 1120 909 timerlat_free_histogram(tool->data); 910 + osnoise_destroy_tool(aa); 1121 911 osnoise_destroy_tool(record); 1122 912 osnoise_destroy_tool(tool); 1123 913 free(params);
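The `-C` handling in the hunk above relies on getopt's optional-argument form (`C::` in the short option string, `optional_argument` in `long_options`). With the short form, `-C=name` arrives in `optarg` as `=name`, which is why the hunk skips the leading `=`. What follows is a minimal standalone sketch of that parsing, not the rtla code: `struct cgroup_opt` and `sketch_parse_cgroup()` are hypothetical names, and the last `else` branch, which accepts a value that arrives without a leading `=` (as `--cgroup=name` does), is an assumption added for illustration.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical container for this sketch; not the rtla params struct. */
struct cgroup_opt {
	int cgroup;			/* 1 if -C/--cgroup was seen */
	const char *cgroup_name;	/* NULL means "inherit the caller's cgroup" */
};

/*
 * sketch_parse_cgroup - parse -C[=name] and --cgroup[=name]
 *
 * Returns 0 on success, -1 if an unknown option is found.
 */
static int sketch_parse_cgroup(int argc, char **argv, struct cgroup_opt *out)
{
	static struct option long_options[] = {
		{"cgroup", optional_argument, 0, 'C'},
		{0, 0, 0, 0}
	};
	int c;

	out->cgroup = 0;
	out->cgroup_name = NULL;

	/* glibc and musl re-initialise the scanner when optind is 0 */
	optind = 0;
	/* keep getopt quiet; errors are reported through the return value */
	opterr = 0;

	while ((c = getopt_long(argc, argv, "C::", long_options, NULL)) != -1) {
		switch (c) {
		case 'C':
			out->cgroup = 1;
			if (!optarg)
				out->cgroup_name = NULL;	/* bare -C */
			else if (*optarg == '=')
				out->cgroup_name = optarg + 1;	/* -C=name */
			else
				out->cgroup_name = optarg;	/* assumption: --cgroup=name */
			break;
		default:
			return -1;
		}
	}

	return 0;
}
```

Calling it with `{"rtla", "-C=rt"}` leaves `cgroup_name` pointing at `rt`, while a bare `-C` leaves it `NULL`, which the patch treats as "inherit rtla's cgroup".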
+203 -28
tools/tracing/rtla/src/timerlat_top.c
··· 3 3 * Copyright (C) 2021 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org> 4 4 */ 5 5 6 + #define _GNU_SOURCE 6 7 #include <getopt.h> 7 8 #include <stdlib.h> 8 9 #include <string.h> ··· 12 11 #include <stdio.h> 13 12 #include <time.h> 14 13 #include <errno.h> 14 + #include <sched.h> 15 + #include <pthread.h> 15 16 16 17 #include "utils.h" 17 18 #include "osnoise.h" 18 19 #include "timerlat.h" 19 20 #include "timerlat_aa.h" 21 + #include "timerlat_u.h" 20 22 21 23 struct timerlat_top_params { 22 24 char *cpus; 23 - char *monitored_cpus; 25 + cpu_set_t monitored_cpus; 24 26 char *trace_output; 27 + char *cgroup_name; 25 28 unsigned long long runtime; 26 29 long long stop_us; 27 30 long long stop_total_us; ··· 40 35 int no_aa; 41 36 int aa_only; 42 37 int dump_tasks; 38 + int cgroup; 39 + int hk_cpus; 40 + int user_top; 41 + cpu_set_t hk_cpu_set; 43 42 struct sched_attr sched_param; 44 43 struct trace_events *events; 45 44 }; ··· 51 42 struct timerlat_top_cpu { 52 43 int irq_count; 53 44 int thread_count; 45 + int user_count; 54 46 55 47 unsigned long long cur_irq; 56 48 unsigned long long min_irq; ··· 62 52 unsigned long long min_thread; 63 53 unsigned long long sum_thread; 64 54 unsigned long long max_thread; 55 + 56 + unsigned long long cur_user; 57 + unsigned long long min_user; 58 + unsigned long long sum_user; 59 + unsigned long long max_user; 65 60 }; 66 61 67 62 struct timerlat_top_data { ··· 107 92 for (cpu = 0; cpu < nr_cpus; cpu++) { 108 93 data->cpu_data[cpu].min_irq = ~0; 109 94 data->cpu_data[cpu].min_thread = ~0; 95 + data->cpu_data[cpu].min_user = ~0; 110 96 } 111 97 112 98 return data; ··· 134 118 update_min(&cpu_data->min_irq, &latency); 135 119 update_sum(&cpu_data->sum_irq, &latency); 136 120 update_max(&cpu_data->max_irq, &latency); 137 - } else { 121 + } else if (thread == 1) { 138 122 cpu_data->thread_count++; 139 123 cpu_data->cur_thread = latency; 140 124 update_min(&cpu_data->min_thread, &latency); 141 125 
update_sum(&cpu_data->sum_thread, &latency); 142 126 update_max(&cpu_data->max_thread, &latency); 127 + } else { 128 + cpu_data->user_count++; 129 + cpu_data->cur_user = latency; 130 + update_min(&cpu_data->min_user, &latency); 131 + update_sum(&cpu_data->sum_user, &latency); 132 + update_max(&cpu_data->max_user, &latency); 143 133 } 144 134 } 145 135 ··· 172 150 timerlat_top_update(top, cpu, thread, latency); 173 151 } 174 152 175 - if (!params->no_aa) 176 - timerlat_aa_handler(s, record, event, context); 177 - 178 153 return 0; 179 154 } 180 155 ··· 188 169 189 170 trace_seq_printf(s, "\033[2;37;40m"); 190 171 trace_seq_printf(s, " Timer Latency "); 172 + if (params->user_top) 173 + trace_seq_printf(s, " "); 191 174 trace_seq_printf(s, "\033[0;0;0m"); 192 175 trace_seq_printf(s, "\n"); 193 176 194 - trace_seq_printf(s, "%-6s | IRQ Timer Latency (%s) | Thread Timer Latency (%s)\n", duration, 177 + trace_seq_printf(s, "%-6s | IRQ Timer Latency (%s) | Thread Timer Latency (%s)", duration, 195 178 params->output_divisor == 1 ? "ns" : "us", 196 179 params->output_divisor == 1 ? "ns" : "us"); 197 180 181 + if (params->user_top) { 182 + trace_seq_printf(s, " | Ret user Timer Latency (%s)", 183 + params->output_divisor == 1 ? 
"ns" : "us"); 184 + } 185 + 186 + trace_seq_printf(s, "\n"); 198 187 trace_seq_printf(s, "\033[2;30;47m"); 199 188 trace_seq_printf(s, "CPU COUNT | cur min avg max | cur min avg max"); 189 + if (params->user_top) 190 + trace_seq_printf(s, " | cur min avg max"); 200 191 trace_seq_printf(s, "\033[0;0;0m"); 201 192 trace_seq_printf(s, "\n"); 202 193 } ··· 259 230 trace_seq_printf(s, "%9llu ", cpu_data->min_thread / divisor); 260 231 trace_seq_printf(s, "%9llu ", 261 232 (cpu_data->sum_thread / cpu_data->thread_count) / divisor); 262 - trace_seq_printf(s, "%9llu\n", cpu_data->max_thread / divisor); 233 + trace_seq_printf(s, "%9llu", cpu_data->max_thread / divisor); 234 + } 235 + 236 + if (!params->user_top) { 237 + trace_seq_printf(s, "\n"); 238 + return; 239 + } 240 + 241 + trace_seq_printf(s, " |"); 242 + 243 + if (!cpu_data->user_count) { 244 + trace_seq_printf(s, " - "); 245 + trace_seq_printf(s, " - "); 246 + trace_seq_printf(s, " - "); 247 + trace_seq_printf(s, " -\n"); 248 + } else { 249 + trace_seq_printf(s, "%9llu ", cpu_data->cur_user / divisor); 250 + trace_seq_printf(s, "%9llu ", cpu_data->min_user / divisor); 251 + trace_seq_printf(s, "%9llu ", 252 + (cpu_data->sum_user / cpu_data->user_count) / divisor); 253 + trace_seq_printf(s, "%9llu\n", cpu_data->max_user / divisor); 263 254 } 264 255 } 265 256 ··· 314 265 timerlat_top_header(top); 315 266 316 267 for (i = 0; i < nr_cpus; i++) { 317 - if (params->cpus && !params->monitored_cpus[i]) 268 + if (params->cpus && !CPU_ISSET(i, &params->monitored_cpus)) 318 269 continue; 319 270 timerlat_top_print(top, i); 320 271 } ··· 333 284 static const char *const msg[] = { 334 285 "", 335 286 " usage: rtla timerlat [top] [-h] [-q] [-a us] [-d s] [-D] [-n] [-p us] [-i us] [-T us] [-s us] \\", 336 - " [[-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger <trigger>] [-c cpu-list] \\", 337 - " [-P priority] [--dma-latency us] [--aa-only us]", 287 + " [[-t[=file]] [-e sys[:event]] [--filter <filter>] [--trigger 
<trigger>] [-c cpu-list] [-H cpu-list]\\", 288 + " [-P priority] [--dma-latency us] [--aa-only us] [-C[=cgroup_name]] [-u]", 338 289 "", 339 290 " -h/--help: print this menu", 340 291 " -a/--auto: set automatic trace mode, stopping the session if argument in us latency is hit", ··· 344 295 " -T/--thread us: stop trace if the thread latency is higher than the argument in us", 345 296 " -s/--stack us: save the stack trace at the IRQ if a thread latency is higher than the argument in us", 346 297 " -c/--cpus cpus: run the tracer only on the given cpus", 298 + " -H/--house-keeping cpus: run rtla control threads only on the given cpus", 299 + " -C/--cgroup[=cgroup_name]: set cgroup, if no cgroup_name is passed, the rtla's cgroup will be inherited", 347 300 " -d/--duration time[m|h|d]: duration of the session in seconds", 348 301 " -D/--debug: print debug info", 349 302 " --dump-tasks: prints the task running on all CPUs if stop conditions are met (depends on !--no-aa)", ··· 363 312 " f:prio - use SCHED_FIFO with prio", 364 313 " d:runtime[us|ms|s]:period[us|ms|s] - use SCHED_DEADLINE with runtime and period", 365 314 " in nanoseconds", 315 + " -u/--user-threads: use rtla user-space threads instead of in-kernel timerlat threads", 366 316 NULL, 367 317 }; 368 318 ··· 404 352 static struct option long_options[] = { 405 353 {"auto", required_argument, 0, 'a'}, 406 354 {"cpus", required_argument, 0, 'c'}, 355 + {"cgroup", optional_argument, 0, 'C'}, 407 356 {"debug", no_argument, 0, 'D'}, 408 357 {"duration", required_argument, 0, 'd'}, 409 358 {"event", required_argument, 0, 'e'}, 410 359 {"help", no_argument, 0, 'h'}, 360 + {"house-keeping", required_argument, 0, 'H'}, 411 361 {"irq", required_argument, 0, 'i'}, 412 362 {"nano", no_argument, 0, 'n'}, 413 363 {"period", required_argument, 0, 'p'}, ··· 418 364 {"stack", required_argument, 0, 's'}, 419 365 {"thread", required_argument, 0, 'T'}, 420 366 {"trace", optional_argument, 0, 't'}, 367 + {"user-threads", no_argument, 
0, 'u'}, 421 368 {"trigger", required_argument, 0, '0'}, 422 369 {"filter", required_argument, 0, '1'}, 423 370 {"dma-latency", required_argument, 0, '2'}, ··· 431 376 /* getopt_long stores the option index here. */ 432 377 int option_index = 0; 433 378 434 - c = getopt_long(argc, argv, "a:c:d:De:hi:np:P:qs:t::T:0:1:2:345:", 379 + c = getopt_long(argc, argv, "a:c:C::d:De:hH:i:np:P:qs:t::T:u0:1:2:345:", 435 380 long_options, &option_index); 436 381 437 382 /* detect the end of the options. */ ··· 467 412 params->aa_only = 1; 468 413 break; 469 414 case 'c': 470 - retval = parse_cpu_list(optarg, &params->monitored_cpus); 415 + retval = parse_cpu_set(optarg, &params->monitored_cpus); 471 416 if (retval) 472 417 timerlat_top_usage("\nInvalid -c cpu list\n"); 473 418 params->cpus = optarg; 419 + break; 420 + case 'C': 421 + params->cgroup = 1; 422 + if (!optarg) { 423 + /* will inherit this cgroup */ 424 + params->cgroup_name = NULL; 425 + } else if (*optarg == '=') { 426 + /* skip the = */ 427 + params->cgroup_name = ++optarg; 428 + } 474 429 break; 475 430 case 'D': 476 431 config_debug = 1; ··· 504 439 case 'h': 505 440 case '?': 506 441 timerlat_top_usage(NULL); 442 + break; 443 + case 'H': 444 + params->hk_cpus = 1; 445 + retval = parse_cpu_set(optarg, &params->hk_cpu_set); 446 + if (retval) { 447 + err_msg("Error parsing house keeping CPUs\n"); 448 + exit(EXIT_FAILURE); 449 + } 507 450 break; 508 451 case 'i': 509 452 params->stop_us = get_llong_from_str(optarg); ··· 546 473 else 547 474 params->trace_output = "timerlat_trace.txt"; 548 475 476 + break; 477 + case 'u': 478 + params->user_top = true; 549 479 break; 550 480 case '0': /* trigger */ 551 481 if (params->events) { ··· 614 538 timerlat_top_apply_config(struct osnoise_tool *top, struct timerlat_top_params *params) 615 539 { 616 540 int retval; 541 + int i; 617 542 618 543 if (!params->sleep_time) 619 544 params->sleep_time = 1; ··· 625 548 err_msg("Failed to apply CPUs config\n"); 626 549 goto out_err; 627 
550 } 551 + } else { 552 + for (i = 0; i < sysconf(_SC_NPROCESSORS_CONF); i++) 553 + CPU_SET(i, &params->monitored_cpus); 628 554 } 629 555 630 556 if (params->stop_us) { ··· 664 584 } 665 585 } 666 586 587 + if (params->hk_cpus) { 588 + retval = sched_setaffinity(getpid(), sizeof(params->hk_cpu_set), 589 + &params->hk_cpu_set); 590 + if (retval == -1) { 591 + err_msg("Failed to set rtla to the house keeping CPUs\n"); 592 + goto out_err; 593 + } 594 + } else if (params->cpus) { 595 + /* 596 + * Even if the user do not set a house-keeping CPU, try to 597 + * move rtla to a CPU set different to the one where the user 598 + * set the workload to run. 599 + * 600 + * No need to check results as this is an automatic attempt. 601 + */ 602 + auto_house_keeping(&params->monitored_cpus); 603 + } 604 + 605 + if (params->user_top) { 606 + retval = osnoise_set_workload(top->context, 0); 607 + if (retval) { 608 + err_msg("Failed to set OSNOISE_WORKLOAD option\n"); 609 + goto out_err; 610 + } 611 + } 612 + 667 613 return 0; 668 614 669 615 out_err: ··· 704 598 { 705 599 struct osnoise_tool *top; 706 600 int nr_cpus; 707 - int retval; 708 601 709 602 nr_cpus = sysconf(_SC_NPROCESSORS_CONF); 710 603 ··· 719 614 720 615 tep_register_event_handler(top->trace.tep, -1, "ftrace", "timerlat", 721 616 timerlat_top_handler, top); 722 - 723 - /* 724 - * If no auto analysis, we are ready. 
725 - */ 726 - if (params->no_aa) 727 - return top; 728 - 729 - retval = timerlat_aa_init(top, nr_cpus, params->dump_tasks); 730 - if (retval) 731 - goto out_err; 732 617 733 618 return top; 734 619 ··· 750 655 { 751 656 struct timerlat_top_params *params; 752 657 struct osnoise_tool *record = NULL; 658 + struct timerlat_u_params params_u; 753 659 struct osnoise_tool *top = NULL; 660 + struct osnoise_tool *aa = NULL; 754 661 struct trace_instance *trace; 755 662 int dma_latency_fd = -1; 663 + pthread_t timerlat_u; 756 664 int return_value = 1; 757 665 char *max_lat; 758 666 int retval; ··· 792 694 } 793 695 } 794 696 697 + if (params->cgroup && !params->user_top) { 698 + retval = set_comm_cgroup("timerlat/", params->cgroup_name); 699 + if (!retval) { 700 + err_msg("Failed to move threads to cgroup\n"); 701 + goto out_free; 702 + } 703 + } 704 + 795 705 if (params->dma_latency >= 0) { 796 706 dma_latency_fd = set_cpu_dma_latency(params->dma_latency); 797 707 if (dma_latency_fd < 0) { ··· 807 701 goto out_free; 808 702 } 809 703 } 810 - 811 - trace_instance_start(trace); 812 704 813 705 if (params->trace_output) { 814 706 record = osnoise_init_trace_tool("timerlat"); ··· 820 716 if (retval) 821 717 goto out_top; 822 718 } 823 - 824 - trace_instance_start(&record->trace); 825 719 } 720 + 721 + if (!params->no_aa) { 722 + if (params->aa_only) { 723 + /* as top is not used for display, use it for aa */ 724 + aa = top; 725 + } else { 726 + /* otherwise, a new instance is needed */ 727 + aa = osnoise_init_tool("timerlat_aa"); 728 + if (!aa) 729 + goto out_top; 730 + } 731 + 732 + retval = timerlat_aa_init(aa, params->dump_tasks); 733 + if (retval) { 734 + err_msg("Failed to enable the auto analysis instance\n"); 735 + goto out_top; 736 + } 737 + 738 + /* if it is re-using the main instance, there is no need to start it */ 739 + if (aa != top) { 740 + retval = enable_timerlat(&aa->trace); 741 + if (retval) { 742 + err_msg("Failed to enable timerlat tracer\n"); 743 + goto 
out_top; 744 + } 745 + } 746 + } 747 + 748 + /* 749 + * Start the tracers here, after having set all instances. 750 + * 751 + * Let the trace instance start first for the case of hitting a stop 752 + * tracing while enabling other instances. The trace instance is the 753 + * one with most valuable information. 754 + */ 755 + if (params->trace_output) 756 + trace_instance_start(&record->trace); 757 + if (!params->no_aa && aa != top) 758 + trace_instance_start(&aa->trace); 759 + trace_instance_start(trace); 826 760 827 761 top->start_time = time(NULL); 828 762 timerlat_top_set_signals(params); 763 + 764 + if (params->user_top) { 765 + /* rtla asked to stop */ 766 + params_u.should_run = 1; 767 + /* all threads left */ 768 + params_u.stopped_running = 0; 769 + 770 + params_u.set = &params->monitored_cpus; 771 + if (params->set_sched) 772 + params_u.sched_param = &params->sched_param; 773 + else 774 + params_u.sched_param = NULL; 775 + 776 + params_u.cgroup_name = params->cgroup_name; 777 + 778 + retval = pthread_create(&timerlat_u, NULL, timerlat_u_dispatcher, &params_u); 779 + if (retval) 780 + err_msg("Error creating timerlat user-space threads\n"); 781 + } 829 782 830 783 while (!stop_tracing) { 831 784 sleep(params->sleep_time); ··· 907 746 if (trace_is_off(&top->trace, &record->trace)) 908 747 break; 909 748 749 + /* is there still any user-threads ? 
*/ 750 + if (params->user_top) { 751 + if (params_u.stopped_running) { 752 + debug_msg("timerlat user space threads stopped!\n"); 753 + break; 754 + } 755 + } 756 + } 757 + 758 + if (params->user_top && !params_u.stopped_running) { 759 + params_u.should_run = 0; 760 + sleep(1); 910 761 } 911 762 912 763 timerlat_print_stats(params, top); ··· 948 775 } 949 776 950 777 out_top: 778 + timerlat_aa_destroy(); 951 779 if (dma_latency_fd >= 0) 952 780 close(dma_latency_fd); 953 781 trace_events_destroy(&record->trace, params->events); 954 782 params->events = NULL; 955 783 out_free: 956 784 timerlat_free_top(top->data); 957 - timerlat_aa_destroy(); 785 + if (aa && aa != top) 786 + osnoise_destroy_tool(aa); 958 787 osnoise_destroy_tool(record); 959 788 osnoise_destroy_tool(top); 960 789 free(params);
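Both timerlat tools now keep the monitored CPUs in a `cpu_set_t` and test membership with `CPU_ISSET()` instead of indexing a `char` array. As a rough illustration of how a list such as `1-3,5` can be turned into such a set, here is a standalone sketch. It is not the `parse_cpu_set()` from `utils.c`: `sketch_parse_cpu_set()` is a hypothetical name, it takes the CPU count as a parameter instead of calling `sysconf()`, and it assumes `nr_cpus` does not exceed `CPU_SETSIZE`.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for CPU_ZERO/CPU_SET */
#endif
#include <sched.h>
#include <stdlib.h>

/*
 * sketch_parse_cpu_set - fill @set from a list such as "1-3,5"
 *
 * Returns 0 on success and 1 on a malformed or out-of-range list,
 * so callers can treat any non-zero value as an error.
 */
static int sketch_parse_cpu_set(const char *cpu_list, int nr_cpus, cpu_set_t *set)
{
	const char *p = cpu_list;
	long cpu, end_cpu, i;
	char *end;

	CPU_ZERO(set);

	while (*p) {
		/* first (or only) CPU of the current item */
		cpu = strtol(p, &end, 10);
		if (end == p || cpu < 0 || cpu >= nr_cpus)
			return 1;
		p = end;

		/* optional "-N" turns the item into a range */
		end_cpu = cpu;
		if (*p == '-') {
			p++;
			end_cpu = strtol(p, &end, 10);
			if (end == p || end_cpu < cpu || end_cpu >= nr_cpus)
				return 1;
			p = end;
		}

		for (i = cpu; i <= end_cpu; i++)
			CPU_SET(i, set);

		/* items are separated by commas; anything else is an error */
		if (*p == ',')
			p++;
		else if (*p)
			return 1;
	}

	return 0;
}
```

Because the set is a fixed-size value type, the caller needs no allocation or free, and the same object can be handed directly to `sched_setaffinity()`, which is how the patch uses `hk_cpu_set`.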
+224
tools/tracing/rtla/src/timerlat_u.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Copyright (C) 2023 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org> 4 + */ 5 + 6 + #define _GNU_SOURCE 7 + #include <sched.h> 8 + #include <fcntl.h> 9 + #include <stdlib.h> 10 + #include <unistd.h> 11 + #include <stdio.h> 12 + #include <errno.h> 13 + #include <string.h> 14 + #include <tracefs.h> 15 + #include <pthread.h> 16 + #include <sys/wait.h> 17 + #include <sys/prctl.h> 18 + 19 + #include "utils.h" 20 + #include "timerlat_u.h" 21 + 22 + /* 23 + * This is the user-space main for the tool timerlatu/ threads. 24 + * 25 + * It is as simple as this: 26 + * - set affinity 27 + * - set priority 28 + * - open tracer fd 29 + * - spin 30 + * - close 31 + */ 32 + static int timerlat_u_main(int cpu, struct timerlat_u_params *params) 33 + { 34 + struct sched_param sp = { .sched_priority = 95 }; 35 + char buffer[1024]; 36 + int timerlat_fd; 37 + cpu_set_t set; 38 + int retval; 39 + 40 + /* 41 + * This all is only setting up the tool. 42 + */ 43 + CPU_ZERO(&set); 44 + CPU_SET(cpu, &set); 45 + 46 + retval = sched_setaffinity(gettid(), sizeof(set), &set); 47 + if (retval == -1) { 48 + err_msg("Error setting user thread affinity\n"); 49 + exit(1); 50 + } 51 + 52 + if (!params->sched_param) { 53 + retval = sched_setscheduler(0, SCHED_FIFO, &sp); 54 + if (retval < 0) { 55 + err_msg("Error setting timerlat u default priority: %s\n", strerror(errno)); 56 + exit(1); 57 + } 58 + } else { 59 + retval = __set_sched_attr(getpid(), params->sched_param); 60 + if (retval) { 61 + /* __set_sched_attr prints an error message, so */ 62 + exit(0); 63 + } 64 + } 65 + 66 + if (params->cgroup_name) { 67 + retval = set_pid_cgroup(gettid(), params->cgroup_name); 68 + if (!retval) { 69 + err_msg("Error setting timerlat u cgroup pid\n"); 70 + pthread_exit(&retval); 71 + } 72 + } 73 + 74 + /* 75 + * This is the tool's loop. If you want to use as base for your own tool... 76 + * go ahead. 
77 + */ 78 + snprintf(buffer, sizeof(buffer), "osnoise/per_cpu/cpu%d/timerlat_fd", cpu); 79 + 80 + timerlat_fd = tracefs_instance_file_open(NULL, buffer, O_RDONLY); 81 + if (timerlat_fd < 0) { 82 + err_msg("Error opening %s:%s\n", buffer, strerror(errno)); 83 + exit(1); 84 + } 85 + 86 + debug_msg("User-space timerlat pid %d on cpu %d\n", gettid(), cpu); 87 + 88 + /* add should continue with a signal handler */ 89 + while (true) { 90 + retval = read(timerlat_fd, buffer, 1024); 91 + if (retval < 0) 92 + break; 93 + } 94 + 95 + close(timerlat_fd); 96 + 97 + debug_msg("Leaving timerlat pid %d on cpu %d\n", gettid(), cpu); 98 + exit(0); 99 + } 100 + 101 + /* 102 + * timerlat_u_send_kill - send a kill signal for all processes 103 + * 104 + * Return the number of processes that received the kill. 105 + */ 106 + static int timerlat_u_send_kill(pid_t *procs, int nr_cpus) 107 + { 108 + int killed = 0; 109 + int i, retval; 110 + 111 + for (i = 0; i < nr_cpus; i++) { 112 + if (!procs[i]) 113 + continue; 114 + retval = kill(procs[i], SIGKILL); 115 + if (!retval) 116 + killed++; 117 + else 118 + err_msg("Error killing child process %d\n", procs[i]); 119 + } 120 + 121 + return killed; 122 + } 123 + 124 + /** 125 + * timerlat_u_dispatcher - dispatch one timerlatu/ process per monitored CPU 126 + * 127 + * This is a thread main that will fork one new process for each monitored 128 + * CPU. It will wait for: 129 + * 130 + * - rtla to tell to kill the child processes 131 + * - some child process to die, and the cleanup all the processes 132 + * 133 + * whichever comes first. 
134 + * 135 + */ 136 + void *timerlat_u_dispatcher(void *data) 137 + { 138 + int nr_cpus = sysconf(_SC_NPROCESSORS_CONF); 139 + struct timerlat_u_params *params = data; 140 + char proc_name[128]; 141 + int procs_count = 0; 142 + int retval = 1; 143 + pid_t *procs; 144 + int wstatus; 145 + pid_t pid; 146 + int i; 147 + 148 + debug_msg("Dispatching timerlat u procs\n"); 149 + 150 + procs = calloc(nr_cpus, sizeof(pid_t)); 151 + if (!procs) 152 + pthread_exit(&retval); 153 + 154 + for (i = 0; i < nr_cpus; i++) { 155 + if (params->set && !CPU_ISSET(i, params->set)) 156 + continue; 157 + 158 + pid = fork(); 159 + 160 + /* child */ 161 + if (!pid) { 162 + 163 + /* 164 + * rename the process 165 + */ 166 + snprintf(proc_name, sizeof(proc_name), "timerlatu/%d", i); 167 + pthread_setname_np(pthread_self(), proc_name); 168 + prctl(PR_SET_NAME, (unsigned long)proc_name, 0, 0, 0); 169 + 170 + timerlat_u_main(i, params); 171 + /* timerlat_u_main should exit()! Anyways... */ 172 + pthread_exit(&retval); 173 + } 174 + 175 + /* parent */ 176 + if (pid == -1) { 177 + timerlat_u_send_kill(procs, nr_cpus); 178 + debug_msg("Failed to create child processes"); 179 + pthread_exit(&retval); 180 + } 181 + 182 + procs_count++; 183 + procs[i] = pid; 184 + } 185 + 186 + while (params->should_run) { 187 + /* check if processes died */ 188 + pid = waitpid(-1, &wstatus, WNOHANG); 189 + if (pid != 0) { 190 + for (i = 0; i < nr_cpus; i++) { 191 + if (procs[i] == pid) { 192 + procs[i] = 0; 193 + procs_count--; 194 + } 195 + } 196 + break; 197 + } 198 + 199 + sleep(1); 200 + } 201 + 202 + timerlat_u_send_kill(procs, nr_cpus); 203 + 204 + while (procs_count) { 205 + pid = waitpid(-1, &wstatus, 0); 206 + if (pid == -1) { 207 + err_msg("Failed to monitor child processes"); 208 + pthread_exit(&retval); 209 + } 210 + for (i = 0; i < nr_cpus; i++) { 211 + if (procs[i] == pid) { 212 + procs[i] = 0; 213 + procs_count--; 214 + } 215 + } 216 + } 217 + 218 + params->stopped_running = 1; 219 + 220 + 
free(procs); 221 + retval = 0; 222 + pthread_exit(&retval); 223 + 224 + }
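The dispatcher above clears a slot in `procs` and decrements `procs_count` in two places, once after the non-blocking `waitpid()` and once in the final reaping loop. A minimal sketch of that bookkeeping as a single function follows. The helper name `mark_child_reaped` is hypothetical and not part of rtla. It also spells out the case where `waitpid()` returns 0 (nothing exited under `WNOHANG`) or -1 (error), neither of which identifies a tracked child.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * mark_child_reaped - hypothetical helper, not part of rtla.
 *
 * Clear the slot that holds @pid in @procs and return how many
 * children are still tracked. Non-positive pids (0 from WNOHANG when
 * nothing exited, -1 on a waitpid() error) never match a slot, so the
 * count is returned unchanged for them.
 */
static int mark_child_reaped(pid_t *procs, int nr_cpus, pid_t pid, int procs_count)
{
	int i;

	assert(procs != NULL);

	if (pid <= 0)
		return procs_count;

	for (i = 0; i < nr_cpus; i++) {
		if (procs[i] == pid) {
			/* Empty slots are 0, matching the calloc() in the patch. */
			procs[i] = 0;
			procs_count--;
			break;
		}
	}

	return procs_count;
}
```

A caller would pass the value returned by `waitpid()` directly and keep looping while the returned count is non-zero.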
+18
tools/tracing/rtla/src/timerlat_u.h
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Copyright (C) 2023 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org> 4 + */ 5 + 6 + struct timerlat_u_params { 7 + /* timerlat -> timerlat_u: user-space threads can keep running */ 8 + int should_run; 9 + /* timerlat_u -> timerlat: all timerlat_u threads left, no reason to continue */ 10 + int stopped_running; 11 + 12 + /* threads config */ 13 + cpu_set_t *set; 14 + char *cgroup_name; 15 + struct sched_attr *sched_param; 16 + }; 17 + 18 + void *timerlat_u_dispatcher(void *data);
+306 -18
tools/tracing/rtla/src/utils.c
··· 3 3 * Copyright (C) 2021 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org> 4 4 */ 5 5 6 + #define _GNU_SOURCE 6 7 #include <dirent.h> 7 8 #include <stdarg.h> 8 9 #include <stdlib.h> ··· 89 88 } 90 89 91 90 /* 92 - * parse_cpu_list - parse a cpu_list filling a char vector with cpus set 91 + * parse_cpu_set - parse a cpu_list filling cpu_set_t argument 93 92 * 94 - * Receives a cpu list, like 1-3,5 (cpus 1, 2, 3, 5), and then set the char 95 - * in the monitored_cpus. 93 + * Receives a cpu list, like 1-3,5 (cpus 1, 2, 3, 5), and then set 94 + * filling cpu_set_t argument. 96 95 * 97 - * XXX: convert to a bitmask. 96 + * Returns 1 on success, 0 otherwise. 98 97 */ 99 - int parse_cpu_list(char *cpu_list, char **monitored_cpus) 98 + int parse_cpu_set(char *cpu_list, cpu_set_t *set) 100 99 { 101 - char *mon_cpus; 102 100 const char *p; 103 101 int end_cpu; 104 102 int nr_cpus; 105 103 int cpu; 106 104 int i; 107 105 108 - nr_cpus = sysconf(_SC_NPROCESSORS_CONF); 106 + CPU_ZERO(set); 109 107 110 - mon_cpus = calloc(nr_cpus, sizeof(char)); 111 - if (!mon_cpus) 112 - goto err; 108 + nr_cpus = sysconf(_SC_NPROCESSORS_CONF); 113 109 114 110 for (p = cpu_list; *p; ) { 115 111 cpu = atoi(p); ··· 126 128 end_cpu = cpu; 127 129 128 130 if (cpu == end_cpu) { 129 - debug_msg("cpu_list: adding cpu %d\n", cpu); 130 - mon_cpus[cpu] = 1; 131 + debug_msg("cpu_set: adding cpu %d\n", cpu); 132 + CPU_SET(cpu, set); 131 133 } else { 132 134 for (i = cpu; i <= end_cpu; i++) { 133 - debug_msg("cpu_list: adding cpu %d\n", i); 134 - mon_cpus[i] = 1; 135 + debug_msg("cpu_set: adding cpu %d\n", i); 136 + CPU_SET(i, set); 135 137 } 136 138 } 137 139 ··· 139 141 p++; 140 142 } 141 143 142 - *monitored_cpus = mon_cpus; 143 - 144 144 return 0; 145 - 146 145 err: 147 - debug_msg("Error parsing the cpu list %s", cpu_list); 146 + debug_msg("Error parsing the cpu set %s\n", cpu_list); 148 147 return 1; 149 148 } 150 149 ··· 523 528 debug_msg("Set /dev/cpu_dma_latency to %d\n", latency); 
524 529 525 530 return fd; 531 + } 532 + 533 + #define _STR(x) #x 534 + #define STR(x) _STR(x) 535 + 536 + /* 537 + * find_mount - find a the mount point of a given fs 538 + * 539 + * Returns 0 if mount is not found, otherwise return 1 and fill mp 540 + * with the mount point. 541 + */ 542 + static const int find_mount(const char *fs, char *mp, int sizeof_mp) 543 + { 544 + char mount_point[MAX_PATH]; 545 + char type[100]; 546 + int found; 547 + FILE *fp; 548 + 549 + fp = fopen("/proc/mounts", "r"); 550 + if (!fp) 551 + return 0; 552 + 553 + while (fscanf(fp, "%*s %" STR(MAX_PATH) "s %99s %*s %*d %*d\n", mount_point, type) == 2) { 554 + if (strcmp(type, fs) == 0) { 555 + found = 1; 556 + break; 557 + } 558 + } 559 + fclose(fp); 560 + 561 + if (!found) 562 + return 0; 563 + 564 + memset(mp, 0, sizeof_mp); 565 + strncpy(mp, mount_point, sizeof_mp - 1); 566 + 567 + debug_msg("Fs %s found at %s\n", fs, mp); 568 + return 1; 569 + } 570 + 571 + /* 572 + * get_self_cgroup - get the current thread cgroup path 573 + * 574 + * Parse /proc/$$/cgroup file to get the thread's cgroup. As an example of line to parse: 575 + * 576 + * 0::/user.slice/user-0.slice/session-3.scope'\n' 577 + * 578 + * This function is interested in the content after the second : and before the '\n'. 579 + * 580 + * Returns 1 if a string was found, 0 otherwise. 
581 + */ 582 + static int get_self_cgroup(char *self_cg, int sizeof_self_cg) 583 + { 584 + char path[MAX_PATH], *start; 585 + int fd, retval; 586 + 587 + snprintf(path, MAX_PATH, "/proc/%d/cgroup", getpid()); 588 + 589 + fd = open(path, O_RDONLY); 590 + if (fd < 0) 591 + return 0; 592 + 593 + retval = read(fd, path, MAX_PATH); 594 + 595 + close(fd); 596 + 597 + if (retval <= 0) 598 + return 0; 599 + 600 + start = path; 601 + 602 + start = strstr(start, ":"); 603 + if (!start) 604 + return 0; 605 + 606 + /* skip ":" */ 607 + start++; 608 + 609 + start = strstr(start, ":"); 610 + if (!start) 611 + return 0; 612 + 613 + /* skip ":" */ 614 + start++; 615 + 616 + if (strlen(start) >= sizeof_self_cg) 617 + return 0; 618 + 619 + snprintf(self_cg, sizeof_self_cg, "%s", start); 620 + 621 + /* Swap '\n' with '\0' */ 622 + start = strstr(self_cg, "\n"); 623 + 624 + /* there must be '\n' */ 625 + if (!start) 626 + return 0; 627 + 628 + /* ok, it found a string after the second : and before the \n */ 629 + *start = '\0'; 630 + 631 + return 1; 632 + } 633 + 634 + /* 635 + * set_comm_cgroup - Set cgroup to pid_t pid 636 + * 637 + * If cgroup argument is not NULL, the threads will move to the given cgroup. 638 + * Otherwise, the cgroup of the calling, i.e., rtla, thread will be used. 639 + * 640 + * Supports cgroup v2. 641 + * 642 + * Returns 1 on success, 0 otherwise. 
643 + */ 644 + int set_pid_cgroup(pid_t pid, const char *cgroup) 645 + { 646 + char cgroup_path[MAX_PATH - strlen("/cgroup.procs")]; 647 + char cgroup_procs[MAX_PATH]; 648 + char pid_str[24]; 649 + int retval; 650 + int cg_fd; 651 + 652 + retval = find_mount("cgroup2", cgroup_path, sizeof(cgroup_path)); 653 + if (!retval) { 654 + err_msg("Did not find cgroupv2 mount point\n"); 655 + return 0; 656 + } 657 + 658 + if (!cgroup) { 659 + retval = get_self_cgroup(&cgroup_path[strlen(cgroup_path)], 660 + sizeof(cgroup_path) - strlen(cgroup_path)); 661 + if (!retval) { 662 + err_msg("Did not find self cgroup\n"); 663 + return 0; 664 + } 665 + } else { 666 + snprintf(&cgroup_path[strlen(cgroup_path)], 667 + sizeof(cgroup_path) - strlen(cgroup_path), "%s/", cgroup); 668 + } 669 + 670 + snprintf(cgroup_procs, MAX_PATH, "%s/cgroup.procs", cgroup_path); 671 + 672 + debug_msg("Using cgroup path at: %s\n", cgroup_procs); 673 + 674 + cg_fd = open(cgroup_procs, O_RDWR); 675 + if (cg_fd < 0) 676 + return 0; 677 + 678 + snprintf(pid_str, sizeof(pid_str), "%d\n", pid); 679 + 680 + retval = write(cg_fd, pid_str, strlen(pid_str)); 681 + if (retval < 0) 682 + err_msg("Error setting cgroup attributes for pid:%s - %s\n", 683 + pid_str, strerror(errno)); 684 + else 685 + debug_msg("Set cgroup attributes for pid:%s\n", pid_str); 686 + 687 + close(cg_fd); 688 + 689 + return (retval >= 0); 690 + } 691 + 692 + /** 693 + * set_comm_cgroup - Set cgroup to threads starting with char *comm_prefix 694 + * 695 + * If cgroup argument is not NULL, the threads will move to the given cgroup. 696 + * Otherwise, the cgroup of the calling, i.e., rtla, thread will be used. 697 + * 698 + * Supports cgroup v2. 699 + * 700 + * Returns 1 on success, 0 otherwise. 
701 + */ 702 + int set_comm_cgroup(const char *comm_prefix, const char *cgroup) 703 + { 704 + char cgroup_path[MAX_PATH - strlen("/cgroup.procs")]; 705 + char cgroup_procs[MAX_PATH]; 706 + struct dirent *proc_entry; 707 + DIR *procfs; 708 + int retval; 709 + int cg_fd; 710 + 711 + if (strlen(comm_prefix) >= MAX_PATH) { 712 + err_msg("Command prefix is too long: %d < strlen(%s)\n", 713 + MAX_PATH, comm_prefix); 714 + return 0; 715 + } 716 + 717 + retval = find_mount("cgroup2", cgroup_path, sizeof(cgroup_path)); 718 + if (!retval) { 719 + err_msg("Did not find cgroupv2 mount point\n"); 720 + return 0; 721 + } 722 + 723 + if (!cgroup) { 724 + retval = get_self_cgroup(&cgroup_path[strlen(cgroup_path)], 725 + sizeof(cgroup_path) - strlen(cgroup_path)); 726 + if (!retval) { 727 + err_msg("Did not find self cgroup\n"); 728 + return 0; 729 + } 730 + } else { 731 + snprintf(&cgroup_path[strlen(cgroup_path)], 732 + sizeof(cgroup_path) - strlen(cgroup_path), "%s/", cgroup); 733 + } 734 + 735 + snprintf(cgroup_procs, MAX_PATH, "%s/cgroup.procs", cgroup_path); 736 + 737 + debug_msg("Using cgroup path at: %s\n", cgroup_procs); 738 + 739 + cg_fd = open(cgroup_procs, O_RDWR); 740 + if (cg_fd < 0) 741 + return 0; 742 + 743 + procfs = opendir("/proc"); 744 + if (!procfs) { 745 + err_msg("Could not open procfs\n"); 746 + goto out_cg; 747 + } 748 + 749 + while ((proc_entry = readdir(procfs))) { 750 + 751 + retval = procfs_is_workload_pid(comm_prefix, proc_entry); 752 + if (!retval) 753 + continue; 754 + 755 + retval = write(cg_fd, proc_entry->d_name, strlen(proc_entry->d_name)); 756 + if (retval < 0) { 757 + err_msg("Error setting cgroup attributes for pid:%s - %s\n", 758 + proc_entry->d_name, strerror(errno)); 759 + goto out_procfs; 760 + } 761 + 762 + debug_msg("Set cgroup attributes for pid:%s\n", proc_entry->d_name); 763 + } 764 + 765 + closedir(procfs); 766 + close(cg_fd); 767 + return 1; 768 + 769 + out_procfs: 770 + closedir(procfs); 771 + out_cg: 772 + close(cg_fd); 773 + 
return 0; 774 + } 775 + 776 + /** 777 + * auto_house_keeping - Automatically move rtla out of measurement threads 778 + * 779 + * Try to move rtla away from the tracer, if possible. 780 + * 781 + * Returns 1 on success, 0 otherwise. 782 + */ 783 + int auto_house_keeping(cpu_set_t *monitored_cpus) 784 + { 785 + cpu_set_t rtla_cpus, house_keeping_cpus; 786 + int retval; 787 + 788 + /* first get the CPUs in which rtla can actually run. */ 789 + retval = sched_getaffinity(getpid(), sizeof(rtla_cpus), &rtla_cpus); 790 + if (retval == -1) { 791 + debug_msg("Could not get rtla affinity, rtla might run with the threads!\n"); 792 + return 0; 793 + } 794 + 795 + /* then check if the existing setup is already good. */ 796 + CPU_AND(&house_keeping_cpus, &rtla_cpus, monitored_cpus); 797 + if (!CPU_COUNT(&house_keeping_cpus)) { 798 + debug_msg("rtla and the monitored CPUs do not share CPUs."); 799 + debug_msg("Skipping auto house-keeping\n"); 800 + return 1; 801 + } 802 + 803 + /* remove the intersection */ 804 + CPU_XOR(&house_keeping_cpus, &rtla_cpus, monitored_cpus); 805 + 806 + /* get only those that rtla can run */ 807 + CPU_AND(&house_keeping_cpus, &house_keeping_cpus, &rtla_cpus); 808 + 809 + /* is there any cpu left? */ 810 + if (!CPU_COUNT(&house_keeping_cpus)) { 811 + debug_msg("Could not find any CPU for auto house-keeping\n"); 812 + return 0; 813 + } 814 + 815 + retval = sched_setaffinity(getpid(), sizeof(house_keeping_cpus), &house_keeping_cpus); 816 + if (retval == -1) { 817 + debug_msg("Could not set affinity for auto house-keeping\n"); 818 + return 0; 819 + } 820 + 821 + debug_msg("rtla automatically moved to an auto house-keeping cpu set\n"); 822 + 823 + return 1; 526 824 }
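The comment on `get_self_cgroup()` describes the parsing rule: take what follows the second `:` and precedes the `'\n'`. A minimal sketch of that rule as a pure string function is shown below. The name `extract_cgroup_path` is hypothetical and not part of rtla, and the sketch assumes the caller passes a NUL-terminated line (a buffer filled by `read()` is not terminated automatically).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * extract_cgroup_path - hypothetical helper, not part of rtla.
 *
 * Copy the text after the second ':' and before the first '\n' of
 * @line into @out. Returns 1 on success, 0 if the line is malformed
 * or the path does not fit in @out_size bytes (including the '\0').
 */
static int extract_cgroup_path(const char *line, char *out, size_t out_size)
{
	const char *start, *end;
	size_t len;

	assert(line != NULL && out != NULL);

	/* first ':' */
	start = strchr(line, ':');
	if (!start)
		return 0;

	/* second ':' */
	start = strchr(start + 1, ':');
	if (!start)
		return 0;
	start++;

	/* there must be a '\n' closing the entry */
	end = strchr(start, '\n');
	if (!end)
		return 0;

	len = (size_t)(end - start);
	if (len >= out_size)
		return 0;

	memcpy(out, start, len);
	out[len] = '\0';
	return 1;
}
```

With the example line from the comment, `0::/user.slice/user-0.slice/session-3.scope` followed by a newline, the function fills `out` with `/user.slice/user-0.slice/session-3.scope`.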
+7
tools/tracing/rtla/src/utils.h
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 + 2 3 #include <stdint.h> 3 4 #include <time.h> 5 + #include <sched.h> 4 6 5 7 /* 6 8 * '18446744073709551615\0' ··· 56 54 }; 57 55 58 56 int parse_prio(char *arg, struct sched_attr *sched_param); 57 + int parse_cpu_set(char *cpu_list, cpu_set_t *set); 58 + int __set_sched_attr(int pid, struct sched_attr *attr); 59 59 int set_comm_sched_attr(const char *comm_prefix, struct sched_attr *attr); 60 + int set_comm_cgroup(const char *comm_prefix, const char *cgroup); 61 + int set_pid_cgroup(pid_t pid, const char *cgroup); 60 62 int set_cpu_dma_latency(int32_t latency); 63 + int auto_house_keeping(cpu_set_t *monitored_cpus); 61 64 62 65 #define ns_to_usf(x) (((double)x/1000)) 63 66 #define ns_to_per(total, part) ((part * 100) / (double)total)
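`auto_house_keeping()`, declared above, picks the CPUs rtla may run on that are not being monitored by applying `CPU_XOR` and then `CPU_AND`. A minimal sketch of that set arithmetic, detached from the affinity syscalls, is given below. The helper name `house_keeping_candidates` is hypothetical and not part of rtla. It assumes glibc's `CPU_*` macros, which need `_GNU_SOURCE` defined before the first include.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * house_keeping_candidates - hypothetical helper, not part of rtla.
 *
 * Fill @out with the CPUs that are in @allowed but not in @monitored,
 * using the XOR-then-AND sequence. Returns the number of CPUs in @out.
 */
static int house_keeping_candidates(cpu_set_t *allowed, cpu_set_t *monitored,
				    cpu_set_t *out)
{
	cpu_set_t diff;

	assert(allowed != NULL && monitored != NULL && out != NULL);

	/* CPUs that are in exactly one of the two sets. */
	CPU_XOR(&diff, allowed, monitored);

	/* Keep only those the process is allowed to run on. */
	CPU_AND(out, &diff, allowed);

	return CPU_COUNT(out);
}
```

For example, with allowed CPUs 0-3 and monitored CPUs 2, 3 and 5, the XOR yields {0, 1, 5} and the AND with the allowed set leaves {0, 1}. A return value of 0 corresponds to the "Could not find any CPU for auto house-keeping" case in the patch.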