Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'x86_cache_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 resource control updates from Borislav Petkov:
"Add support on AMD for assigning QoS bandwidth counters to resources
(RMIDs) with the ability for those resources to be tracked by the
counters as long as they're assigned to them.

Previously, due to hw limitations, bandwidth counts would get lost
whenever a resource was not being tracked by a counter.

Refactor the code and user interfaces to be able to also support
other, similar features on ARM, for example"

* tag 'x86_cache_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
fs/resctrl: Fix counter auto-assignment on mkdir with mbm_event enabled
MAINTAINERS: resctrl: Add myself as reviewer
x86/resctrl: Configure mbm_event mode if supported
fs/resctrl: Introduce the interface to switch between monitor modes
fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled
fs/resctrl: Introduce the interface to modify assignments in a group
fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group
fs/resctrl: Auto assign counters on mkdir and clean up on group removal
fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir
fs/resctrl: Provide interface to update the event configurations
fs/resctrl: Add event configuration directory under info/L3_MON/
fs/resctrl: Support counter read/reset with mbm_event assignment mode
x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read()
x86/resctrl: Refactor resctrl_arch_rmid_read()
fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode
fs/resctrl: Pass struct rdtgroup instead of individual members
fs/resctrl: Add the functionality to unassign MBM events
fs/resctrl: Add the functionality to assign MBM events
x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
fs/resctrl: Introduce event configuration field in struct mon_evt
...

+2025 -233
+1 -1
Documentation/admin-guide/kernel-parameters.txt
···
 	rdt=		[HW,X86,RDT]
 			Turn on/off individual RDT features. List is:
 			cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
-			mba, smba, bmec.
+			mba, smba, bmec, abmc.
 			E.g. to turn on cmt and turn off mba use:
 			rdt=cmt,!mba
+325
Documentation/filesystems/resctrl.rst
···
 MBA (Memory Bandwidth Allocation)               "mba"
 SMBA (Slow Memory Bandwidth Allocation)         ""
 BMEC (Bandwidth Monitoring Event Configuration) ""
+ABMC (Assignable Bandwidth Monitoring Counters) ""
 =============================================== ================================
 
 Historically, new features were made visible by default in /proc/cpuinfo. This
···
 	# cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
 	0=0x30;1=0x30;3=0x15;4=0x15
 
+"mbm_assign_mode":
+	The supported counter assignment modes. The enclosed brackets indicate which
+	mode is enabled. The MBM events associated with counters may reset when
+	"mbm_assign_mode" is changed.
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+	  [mbm_event]
+	  default
+
+	"mbm_event":
+
+	mbm_event mode allows users to assign a hardware counter to an RMID, event
+	pair and monitor the bandwidth usage as long as it is assigned. The hardware
+	continues to track the assigned counter until it is explicitly unassigned by
+	the user. Each event within a resctrl group can be assigned independently.
+
+	In this mode, a monitoring event can only accumulate data while it is backed
+	by a hardware counter. Use "mbm_L3_assignments", found in each CTRL_MON and
+	MON group, to specify which of the events should have a counter assigned.
+	The number of counters available is described in the "num_mbm_cntrs" file.
+	Changing the mode may cause all counters on the resource to reset.
+
+	Moving to mbm_event counter assignment mode requires users to assign the
+	counters to the events. Otherwise, the MBM event counters will return
+	'Unassigned' when read.
+
+	The mode is beneficial for AMD platforms that support more CTRL_MON and MON
+	groups than available hardware counters. By default, this feature is enabled
+	on AMD platforms with the ABMC (Assignable Bandwidth Monitoring Counters)
+	capability, ensuring counters remain assigned even when the corresponding
+	RMID is not actively used by any processor.
+
+	"default":
+
+	In default mode, resctrl assumes there is a hardware counter for each event
+	within every CTRL_MON and MON group. On AMD platforms, hardware re-allocating
+	counters can reset MBM events between reads, resulting in misleading values
+	or "Unavailable" being displayed if no counter is assigned to the event. It
+	is therefore recommended to use mbm_event mode, if supported.
+
+	* To enable "mbm_event" counter assignment mode:
+	  ::
+
+	    # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
+	* To enable "default" monitoring mode:
+	  ::
+
+	    # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
+"num_mbm_cntrs":
+	The maximum number of counters (total of available and assigned counters)
+	in each domain when the system supports mbm_event mode.
+
+	For example, on a system with a maximum of 32 memory bandwidth monitoring
+	counters in each of its L3 domains:
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
+	  0=32;1=32
+
+"available_mbm_cntrs":
+	The number of counters available for assignment in each domain when
+	mbm_event mode is enabled on the system.
+
+	For example, on a system with 30 assignable hardware counters available in
+	each of its L3 domains:
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
+	  0=30;1=30
+
+"event_configs":
+	Directory that exists when "mbm_event" counter assignment mode is supported.
+	Contains a sub-directory for each MBM event that can be assigned to a
+	counter.
+
+	Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes.
+	Each MBM event's sub-directory contains a file named "event_filter" that is
+	used to view and modify which memory transactions the MBM event is
+	configured with. The file is accessible only when "mbm_event" counter
+	assignment mode is enabled.
+
+	List of memory transaction types supported:
+
+	========================== ========================================================
+	Name                       Description
+	========================== ========================================================
+	dirty_victim_writes_all    Dirty Victims from the QOS domain to all types of memory
+	remote_reads_slow_memory   Reads to slow memory in the non-local NUMA domain
+	local_reads_slow_memory    Reads to slow memory in the local NUMA domain
+	remote_non_temporal_writes Non-temporal writes to non-local NUMA domain
+	local_non_temporal_writes  Non-temporal writes to local NUMA domain
+	remote_reads               Reads to memory in the non-local NUMA domain
+	local_reads                Reads to memory in the local NUMA domain
+	========================== ========================================================
+
+	For example::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+	  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
+	  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
+
+	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+	  local_reads,local_non_temporal_writes,local_reads_slow_memory
+
+	Modify the event configuration by writing to the "event_filter" file within
+	the "event_configs" directory. The read/write "event_filter" file contains
+	the configuration of the event, reflecting which memory transactions it
+	counts.
+
+	For example::
+
+	  # echo "local_reads, local_non_temporal_writes" >
+	    /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+
+	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+	  local_reads,local_non_temporal_writes
+
+"mbm_assign_on_mkdir":
+	Exists when "mbm_event" counter assignment mode is supported. Accessible
+	only when "mbm_event" counter assignment mode is enabled.
+
+	Determines if a counter will automatically be assigned to an RMID, MBM event
+	pair when its associated monitor group is created via mkdir. Enabled by
+	default on boot and when switching from "default" mode to "mbm_event"
+	counter assignment mode. Users can disable this capability by writing to
+	the interface.
+
+	"0":
+		Auto assignment is disabled.
+	"1":
+		Auto assignment is enabled.
+
+	Example::
+
+	  # echo 0 > /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
+	  0
+
 "max_threshold_occupancy":
 	Read/write file provides the largest value (in
 	bytes) at which a previously used LLC_occupancy
···
 	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
 	where "YY" is the node number.
 
+	When the 'mbm_event' counter assignment mode is enabled, reading
+	an MBM event of a MON group returns 'Unassigned' if no hardware
+	counter is assigned to it. For CTRL_MON groups, 'Unassigned' is
+	returned if the MBM event does not have an assigned counter in the
+	CTRL_MON group nor in any of its associated MON groups.
+
 "mon_hw_id":
 	Available only with debug option. The identifier used by hardware
 	for the monitor group. On x86 this is the RMID.
+
+When monitoring is enabled all MON groups may also contain:
+
+"mbm_L3_assignments":
+	Exists when "mbm_event" counter assignment mode is supported and lists the
+	counter assignment states of the group.
+
+	The assignment list is displayed in the following format:
+
+	<Event>:<Domain ID>=<Assignment state>;<Domain ID>=<Assignment state>
+
+	Event: A valid MBM event in the
+	/sys/fs/resctrl/info/L3_MON/event_configs directory.
+
+	Domain ID: A valid domain ID. When writing, '*' applies the changes
+	to all the domains.
+
+	Assignment states:
+
+		_ : No counter assigned.
+
+		e : Counter assigned exclusively.
+
+	Example:
+
+	To display the counter assignment states for the default group:
+	::
+
+	  # cd /sys/fs/resctrl
+	  # cat /sys/fs/resctrl/mbm_L3_assignments
+	  mbm_total_bytes:0=e;1=e
+	  mbm_local_bytes:0=e;1=e
+
+	Assignments can be modified by writing to the interface.
+
+	Examples:
+
+	To unassign the counter associated with the mbm_total_bytes event on
+	domain 0:
+	::
+
+	  # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
+	  # cat /sys/fs/resctrl/mbm_L3_assignments
+	  mbm_total_bytes:0=_;1=e
+	  mbm_local_bytes:0=e;1=e
+
+	To unassign the counter associated with the mbm_total_bytes event on all
+	the domains:
+	::
+
+	  # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
+	  # cat /sys/fs/resctrl/mbm_L3_assignments
+	  mbm_total_bytes:0=_;1=_
+	  mbm_local_bytes:0=e;1=e
+
+	To assign a counter associated with the mbm_total_bytes event on all
+	domains in exclusive mode:
+	::
+
+	  # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
+	  # cat /sys/fs/resctrl/mbm_L3_assignments
+	  mbm_total_bytes:0=e;1=e
+	  mbm_local_bytes:0=e;1=e
 
 When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
···
 	# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
 	11234000
+
+
+Examples on working with mbm_assign_mode
+========================================
+
+a. Check if MBM counter assignment mode is supported.
+   ::
+
+     # mount -t resctrl resctrl /sys/fs/resctrl/
+
+     # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+     [mbm_event]
+     default
+
+   The "mbm_event" mode is detected and enabled.
+
+b. Check how many assignable counters are supported.
+   ::
+
+     # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
+     0=32;1=32
+
+c. Check how many assignable counters are available for assignment in each
+   domain.
+   ::
+
+     # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
+     0=30;1=30
+
+d. List the default group's assignment states.
+   ::
+
+     # cat /sys/fs/resctrl/mbm_L3_assignments
+     mbm_total_bytes:0=e;1=e
+     mbm_local_bytes:0=e;1=e
+
+e. Unassign the counter associated with the mbm_total_bytes event on domain 0.
+   ::
+
+     # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
+     # cat /sys/fs/resctrl/mbm_L3_assignments
+     mbm_total_bytes:0=_;1=e
+     mbm_local_bytes:0=e;1=e
+
+f. Unassign the counter associated with the mbm_total_bytes event on all
+   domains.
+   ::
+
+     # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
+     # cat /sys/fs/resctrl/mbm_L3_assignments
+     mbm_total_bytes:0=_;1=_
+     mbm_local_bytes:0=e;1=e
+
+g. Assign a counter associated with the mbm_total_bytes event on all domains
+   in exclusive mode.
+   ::
+
+     # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
+     # cat /sys/fs/resctrl/mbm_L3_assignments
+     mbm_total_bytes:0=e;1=e
+     mbm_local_bytes:0=e;1=e
+
+h. Read the events mbm_total_bytes and mbm_local_bytes of the default group.
+   Reading the events works the same way whether or not counters are assigned.
+   ::
+
+     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
+     779247936
+     # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_total_bytes
+     562324232
+     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
+     212122123
+     # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_local_bytes
+     121212144
+
+i. Check the event configurations.
+   ::
+
+     # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+     local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
+     local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
+
+     # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+     local_reads,local_non_temporal_writes,local_reads_slow_memory
+
+j. Change the event configuration for mbm_local_bytes.
+   ::
+
+     # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
+       /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+
+     # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+     local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
+
+k. Now read the local events again. The first read may come back with
+   "Unavailable" status. A subsequent read of mbm_local_bytes will display the
+   current value.
+   ::
+
+     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
+     Unavailable
+     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
+     2252323
+     # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_local_bytes
+     Unavailable
+     # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_local_bytes
+     1566565
+
+l. Switch back to 'default' mbm_assign_mode if required. Note that switching
+   the mbm_assign_mode may reset all the MBM counters (and thus all MBM
+   events) of all the resctrl groups.
+   ::
+
+     # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+     # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+     mbm_event
+     [default]
+
+m. Unmount the resctrl filesystem.
+   ::
+
+     # umount /sys/fs/resctrl/
 
 Intel RDT Errata
 ================
+1
MAINTAINERS
···
 M:	Reinette Chatre <reinette.chatre@intel.com>
 R:	Dave Martin <Dave.Martin@arm.com>
 R:	James Morse <james.morse@arm.com>
+R:	Babu Moger <babu.moger@amd.com>
 L:	linux-kernel@vger.kernel.org
 S:	Supported
 F:	Documentation/filesystems/resctrl.rst
+1
arch/x86/include/asm/cpufeatures.h
···
 #define X86_FEATURE_TSA_L1_NO		(21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
 #define X86_FEATURE_CLEAR_CPU_BUF_VM	(21*32+13) /* Clear CPU buffers using VERW before VMRUN */
 #define X86_FEATURE_IBPB_EXIT_TO_USER	(21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
+#define X86_FEATURE_ABMC		(21*32+15) /* Assignable Bandwidth Monitoring Counters */
 
 /*
  * BUG word(s)
+2
arch/x86/include/asm/msr-index.h
···
 /* - AMD: */
 #define MSR_IA32_MBA_BW_BASE		0xc0000200
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
+#define MSR_IA32_L3_QOS_ABMC_CFG	0xc00003fd
+#define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
 
 /* AMD-V MSRs */
-16
arch/x86/include/asm/resctrl.h
···
 extern bool rdt_alloc_capable;
 extern bool rdt_mon_capable;
-extern unsigned int rdt_mon_features;
 
 DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
···
 {
 	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
 	static_branch_dec_cpuslocked(&rdt_enable_key);
-}
-
-static inline bool resctrl_arch_is_llc_occupancy_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
-}
-
-static inline bool resctrl_arch_is_mbm_total_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
-}
-
-static inline bool resctrl_arch_is_mbm_local_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
 }
 
 /*
+51 -28
arch/x86/kernel/cpu/resctrl/core.c
···
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 
 	/* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
-	return r->num_rmid;
+	return r->mon.num_rmid;
 }
 
 struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
···
 static void mon_domain_free(struct rdt_hw_mon_domain *hw_dom)
 {
-	kfree(hw_dom->arch_mbm_total);
-	kfree(hw_dom->arch_mbm_local);
+	int idx;
+
+	for_each_mbm_idx(idx)
+		kfree(hw_dom->arch_mbm_states[idx]);
 	kfree(hw_dom);
 }
···
  */
 static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mon_domain *hw_dom)
 {
-	size_t tsize;
+	size_t tsize = sizeof(*hw_dom->arch_mbm_states[0]);
+	enum resctrl_event_id eventid;
+	int idx;
 
-	if (resctrl_arch_is_mbm_total_enabled()) {
-		tsize = sizeof(*hw_dom->arch_mbm_total);
-		hw_dom->arch_mbm_total = kcalloc(num_rmid, tsize, GFP_KERNEL);
-		if (!hw_dom->arch_mbm_total)
-			return -ENOMEM;
-	}
-	if (resctrl_arch_is_mbm_local_enabled()) {
-		tsize = sizeof(*hw_dom->arch_mbm_local);
-		hw_dom->arch_mbm_local = kcalloc(num_rmid, tsize, GFP_KERNEL);
-		if (!hw_dom->arch_mbm_local) {
-			kfree(hw_dom->arch_mbm_total);
-			hw_dom->arch_mbm_total = NULL;
-			return -ENOMEM;
-		}
+	for_each_mbm_event_id(eventid) {
+		if (!resctrl_is_mon_event_enabled(eventid))
+			continue;
+		idx = MBM_STATE_IDX(eventid);
+		hw_dom->arch_mbm_states[idx] = kcalloc(num_rmid, tsize, GFP_KERNEL);
+		if (!hw_dom->arch_mbm_states[idx])
+			goto cleanup;
 	}
 
 	return 0;
+cleanup:
+	for_each_mbm_idx(idx) {
+		kfree(hw_dom->arch_mbm_states[idx]);
+		hw_dom->arch_mbm_states[idx] = NULL;
+	}
+
+	return -ENOMEM;
 }
 
 static int get_domain_id_from_scope(int cpu, enum resctrl_scope scope)
···
 		d = container_of(hdr, struct rdt_mon_domain, hdr);
 
 		cpumask_set_cpu(cpu, &d->hdr.cpu_mask);
+		/* Update the mbm_assign_mode state for the CPU if supported */
+		if (r->mon.mbm_cntr_assignable)
+			resctrl_arch_mbm_cntr_assign_set_one(r);
 		return;
 	}
···
 	d->ci_id = ci->id;
 	cpumask_set_cpu(cpu, &d->hdr.cpu_mask);
 
+	/* Update the mbm_assign_mode state for the CPU if supported */
+	if (r->mon.mbm_cntr_assignable)
+		resctrl_arch_mbm_cntr_assign_set_one(r);
+
 	arch_mon_domain_online(r, d);
 
-	if (arch_domain_mbm_alloc(r->num_rmid, hw_dom)) {
+	if (arch_domain_mbm_alloc(r->mon.num_rmid, hw_dom)) {
 		mon_domain_free(hw_dom);
 		return;
 	}
···
 	RDT_FLAG_MBA,
 	RDT_FLAG_SMBA,
 	RDT_FLAG_BMEC,
+	RDT_FLAG_ABMC,
 };
 
 #define RDT_OPT(idx, n, f)	\
···
 	RDT_OPT(RDT_FLAG_MBA,	"mba",	X86_FEATURE_MBA),
 	RDT_OPT(RDT_FLAG_SMBA,	"smba",	X86_FEATURE_SMBA),
 	RDT_OPT(RDT_FLAG_BMEC,	"bmec",	X86_FEATURE_BMEC),
+	RDT_OPT(RDT_FLAG_ABMC,	"abmc",	X86_FEATURE_ABMC),
 };
 #define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
···
 static __init bool get_rdt_mon_resources(void)
 {
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	bool ret = false;
 
-	if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC))
-		rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID);
-	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL))
-		rdt_mon_features |= (1 << QOS_L3_MBM_TOTAL_EVENT_ID);
-	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL))
-		rdt_mon_features |= (1 << QOS_L3_MBM_LOCAL_EVENT_ID);
+	if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC)) {
+		resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID);
+		ret = true;
+	}
+	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
+		resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
+		ret = true;
+	}
+	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
+		resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
+		ret = true;
+	}
+	if (rdt_cpu_has(X86_FEATURE_ABMC))
+		ret = true;
 
-	if (!rdt_mon_features)
+	if (!ret)
 		return false;
 
 	return !rdt_get_mon_l3_config(r);
···
 /* Runs once on the BSP during boot. */
 void resctrl_cpu_detect(struct cpuinfo_x86 *c)
 {
-	if (!cpu_has(c, X86_FEATURE_CQM_LLC)) {
+	if (!cpu_has(c, X86_FEATURE_CQM_LLC) && !cpu_has(c, X86_FEATURE_ABMC)) {
 		c->x86_cache_max_rmid  = -1;
 		c->x86_cache_occ_scale = -1;
 		c->x86_cache_mbm_width_offset = -1;
···
 	if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
 	    cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
-	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
+	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL) ||
+	    cpu_has(c, X86_FEATURE_ABMC)) {
 		u32 eax, ebx, ecx, edx;
 
 		/* QoS sub-leaf, EAX=0Fh, ECX=1 */
+52 -4
arch/x86/kernel/cpu/resctrl/internal.h
···
 	u64 prev_msr;
 };
 
+/* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
+#define ABMC_ENABLE_BIT			0
+
+/*
+ * QoS Event Identifiers.
+ */
+#define ABMC_EXTENDED_EVT_ID		BIT(31)
+#define ABMC_EVT_ID			BIT(0)
+
 /**
  * struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
  *			       a resource for a control function
···
  * struct rdt_hw_mon_domain - Arch private attributes of a set of CPUs that share
  *			      a resource for a monitor function
  * @d_resctrl:	Properties exposed to the resctrl file system
- * @arch_mbm_total:	arch private state for MBM total bandwidth
- * @arch_mbm_local:	arch private state for MBM local bandwidth
+ * @arch_mbm_states:	Per-event pointer to the MBM event's saved state.
+ *			An MBM event's state is an array of struct arch_mbm_state
+ *			indexed by RMID on x86.
  *
  * Members of this structure are accessed via helpers that provide abstraction.
  */
 struct rdt_hw_mon_domain {
 	struct rdt_mon_domain		d_resctrl;
-	struct arch_mbm_state		*arch_mbm_total;
-	struct arch_mbm_state		*arch_mbm_local;
+	struct arch_mbm_state		*arch_mbm_states[QOS_NUM_L3_MBM_EVENTS];
 };
 
 static inline struct rdt_hw_ctrl_domain *resctrl_to_arch_ctrl_dom(struct rdt_ctrl_domain *r)
···
  * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
  * @mbm_width:		Monitor width, to detect and correct for overflow.
  * @cdp_enabled:	CDP state of this resource
+ * @mbm_cntr_assign_enabled:	ABMC feature is enabled
  *
  * Members of this structure are either private to the architecture
  * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
···
 	unsigned int		mon_scale;
 	unsigned int		mbm_width;
 	bool			cdp_enabled;
+	bool			mbm_cntr_assign_enabled;
 };
 
 static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
···
 	unsigned int full;
 };
 
+/*
+ * ABMC counters are configured by writing to MSR_IA32_L3_QOS_ABMC_CFG.
+ *
+ * @bw_type:	Event configuration that represents the memory
+ *		transactions being tracked by the @cntr_id.
+ * @bw_src:	Bandwidth source (RMID or CLOSID).
+ * @reserved1:	Reserved.
+ * @is_clos:	@bw_src field is a CLOSID (not an RMID).
+ * @cntr_id:	Counter identifier.
+ * @reserved:	Reserved.
+ * @cntr_en:	Counting enable bit.
+ * @cfg_en:	Configuration enable bit.
+ *
+ * Configuration and counting:
+ * Counter can be configured across multiple writes to MSR. Configuration
+ * is applied only when @cfg_en = 1. Counter @cntr_id is reset when the
+ * configuration is applied.
+ * @cfg_en = 1, @cntr_en = 0 : Apply @cntr_id configuration but do not
+ * count events.
+ * @cfg_en = 1, @cntr_en = 1 : Apply @cntr_id configuration and start
+ * counting events.
+ */
+union l3_qos_abmc_cfg {
+	struct {
+		unsigned long bw_type	:32,
+			      bw_src	:12,
+			      reserved1	: 3,
+			      is_clos	: 1,
+			      cntr_id	: 5,
+			      reserved	: 9,
+			      cntr_en	: 1,
+			      cfg_en	: 1;
+	} split;
+	unsigned long full;
+};
+
 void rdt_ctrl_update(void *arg);
 
 int rdt_get_mon_l3_config(struct rdt_resource *r);
···
 void __init intel_rdt_mbm_apply_quirk(void);
 
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
+void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
 
 #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
+209 -45
arch/x86/kernel/cpu/resctrl/monitor.c
···
  */
 bool rdt_mon_capable;
 
-/*
- * Global to indicate which monitoring events are enabled.
- */
-unsigned int rdt_mon_features;
-
 #define CF(cf)	((unsigned long)(1048576 * (cf) + 0.5))
 
 static int snc_nodes_per_l3_cache = 1;
···
 	if (snc_nodes_per_l3_cache == 1)
 		return lrmid;
 
-	return lrmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmid;
+	return lrmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->mon.num_rmid;
 }
 
 static int __rmid_read_phys(u32 prmid, enum resctrl_event_id eventid, u64 *val)
···
 						 u32 rmid,
 						 enum resctrl_event_id eventid)
 {
-	switch (eventid) {
-	case QOS_L3_OCCUP_EVENT_ID:
+	struct arch_mbm_state *state;
+
+	if (!resctrl_is_mbm_event(eventid))
 		return NULL;
-	case QOS_L3_MBM_TOTAL_EVENT_ID:
-		return &hw_dom->arch_mbm_total[rmid];
-	case QOS_L3_MBM_LOCAL_EVENT_ID:
-		return &hw_dom->arch_mbm_local[rmid];
-	default:
-		/* Never expect to get here */
-		WARN_ON_ONCE(1);
-		return NULL;
-	}
+
+	state = hw_dom->arch_mbm_states[MBM_STATE_IDX(eventid)];
+
+	return state ? &state[rmid] : NULL;
 }
 
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
···
 void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d)
 {
 	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+	enum resctrl_event_id eventid;
+	int idx;
 
-	if (resctrl_arch_is_mbm_total_enabled())
-		memset(hw_dom->arch_mbm_total, 0,
-		       sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
-
-	if (resctrl_arch_is_mbm_local_enabled())
-		memset(hw_dom->arch_mbm_local, 0,
-		       sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
+	for_each_mbm_event_id(eventid) {
+		if (!resctrl_is_mon_event_enabled(eventid))
+			continue;
+		idx = MBM_STATE_IDX(eventid);
+		memset(hw_dom->arch_mbm_states[idx], 0,
+		       sizeof(*hw_dom->arch_mbm_states[0]) * r->mon.num_rmid);
+	}
 }
 
 static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
···
 	return chunks >> shift;
 }
 
-int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
-			   u32 unused, u32 rmid, enum resctrl_event_id eventid,
-			   u64 *val, void *ignored)
+static u64 get_corrected_val(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     u32 rmid, enum resctrl_event_id eventid, u64 msr_val)
 {
 	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
-	int cpu = cpumask_any(&d->hdr.cpu_mask);
 	struct arch_mbm_state *am;
-	u64 msr_val, chunks;
-	u32 prmid;
-	int ret;
-
-	resctrl_arch_rmid_read_context_check();
-
-	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
-	ret = __rmid_read_phys(prmid, eventid, &msr_val);
-	if (ret)
-		return ret;
+	u64 chunks;
 
 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
 	if (am) {
···
 		chunks = msr_val;
 	}
 
-	*val = chunks * hw_res->mon_scale;
+	return chunks * hw_res->mon_scale;
+}
+
+int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+			   u32 unused, u32 rmid, enum resctrl_event_id eventid,
+			   u64 *val, void *ignored)
+{
+	int cpu = cpumask_any(&d->hdr.cpu_mask);
+	u64 msr_val;
+	u32 prmid;
+	int ret;
+
+	resctrl_arch_rmid_read_context_check();
+
+	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
+	ret = __rmid_read_phys(prmid, eventid, &msr_val);
+	if (ret)
+		return ret;
+
+	*val = get_corrected_val(r, d, rmid, eventid, msr_val);
 
 	return 0;
 }
+
+static int __cntr_id_read(u32 cntr_id, u64 *val)
+{
+	u64 msr_val;
+
+	/*
+	 * QM_EVTSEL Register definition:
+	 * =======================================================
+	 * Bits    Mnemonic        Description
+	 * =======================================================
+	 * 63:44   --              Reserved
+	 * 43:32   RMID            RMID or counter ID in ABMC mode
+	 *                         when reading an MBM event
+	 * 31      ExtendedEvtID   Extended Event Identifier
+	 * 30:8    --              Reserved
+	 * 7:0     EvtID           Event Identifier
+	 * =======================================================
+	 * The contents of a specific counter can be read by setting the
+	 * following fields in QM_EVTSEL.ExtendedEvtID (= 1) and
+	 * QM_EVTSEL.EvtID = L3CacheABMC (= 1), and setting QM_EVTSEL.RMID
+	 * to the desired counter ID. Reading the QM_CTR then returns the
+	 * contents of the specified counter. The RMID_VAL_ERROR bit is set
+	 * if the counter configuration is invalid, or if an invalid counter
+	 * ID is set in the QM_EVTSEL.RMID field. The RMID_VAL_UNAVAIL bit
+	 * is set if the counter data is unavailable.
+	 */
+	wrmsr(MSR_IA32_QM_EVTSEL, ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID, cntr_id);
+	rdmsrl(MSR_IA32_QM_CTR, msr_val);
+
+	if (msr_val & RMID_VAL_ERROR)
+		return -EIO;
+	if (msr_val & RMID_VAL_UNAVAIL)
+		return -EINVAL;
+
+	*val = msr_val;
+	return 0;
+}
+
+void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     u32 unused, u32 rmid, int cntr_id,
+			     enum resctrl_event_id eventid)
+{
+	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+	struct arch_mbm_state *am;
+
+	am = get_arch_mbm_state(hw_dom, rmid, eventid);
+	if (am) {
+		memset(am, 0, sizeof(*am));
+
+		/* Record any initial, non-zero count value. */
+		__cntr_id_read(cntr_id, &am->prev_msr);
+	}
+}
+
+int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+			   u32 unused, u32 rmid, int cntr_id,
+			   enum resctrl_event_id eventid, u64 *val)
+{
+	u64 msr_val;
+	int ret;
+
+	ret = __cntr_id_read(cntr_id, &msr_val);
+	if (ret)
+		return ret;
+
+	*val = get_corrected_val(r, d, rmid, eventid, msr_val);
+
+	return 0;
+}
···
 	unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	unsigned int threshold;
+	u32 eax, ebx, ecx, edx;
 
 	snc_nodes_per_l3_cache = snc_get_config();
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
 	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l3_cache;
-	r->num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
+	r->mon.num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
 	hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
 
 	if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
···
 	 *
 	 * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
 	 */
-	threshold = resctrl_rmid_realloc_limit / r->num_rmid;
+	threshold = resctrl_rmid_realloc_limit / r->mon.num_rmid;
 
 	/*
 	 * Because num_rmid may not be a power of two, round the value
···
 	 */
 	resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
 
-	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
-		u32 eax, ebx, ecx, edx;
-
+	if (rdt_cpu_has(X86_FEATURE_BMEC) || rdt_cpu_has(X86_FEATURE_ABMC)) {
 		/* Detect list of bandwidth sources that can be tracked */
 		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
-		r->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+		r->mon.mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+	}
+
+	if (rdt_cpu_has(X86_FEATURE_ABMC)) {
+		r->mon.mbm_cntr_assignable = true;
+		cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
+		r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
+		hw_res->mbm_cntr_assign_enabled = true;
 	}
 
 	r->mon_capable = true;
···
 	mbm_cf_rmidthreshold = mbm_cf_table[cf_index].rmidthreshold;
 	mbm_cf = mbm_cf_table[cf_index].cf;
 }
+
+static void resctrl_abmc_set_one_amd(void *arg)
+{
+	bool *enable = arg;
+
+	if (*enable)
+		msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+	else
+		msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+}
+
+/*
+ * ABMC enable/disable requires update of L3_QOS_EXT_CFG MSR on all the CPUs
+ * associated with all monitor domains.
418 + */ 419 + static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable) 420 + { 421 + struct rdt_mon_domain *d; 422 + 423 + lockdep_assert_cpus_held(); 424 + 425 + list_for_each_entry(d, &r->mon_domains, hdr.list) { 426 + on_each_cpu_mask(&d->hdr.cpu_mask, resctrl_abmc_set_one_amd, 427 + &enable, 1); 428 + resctrl_arch_reset_rmid_all(r, d); 429 + } 430 + } 431 + 432 + int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable) 433 + { 434 + struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); 435 + 436 + if (r->mon.mbm_cntr_assignable && 437 + hw_res->mbm_cntr_assign_enabled != enable) { 438 + _resctrl_abmc_enable(r, enable); 439 + hw_res->mbm_cntr_assign_enabled = enable; 440 + } 441 + 442 + return 0; 443 + } 444 + 445 + bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r) 446 + { 447 + return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled; 448 + } 449 + 450 + static void resctrl_abmc_config_one_amd(void *info) 451 + { 452 + union l3_qos_abmc_cfg *abmc_cfg = info; 453 + 454 + wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full); 455 + } 456 + 457 + /* 458 + * Send an IPI to the domain to assign the counter to RMID, event pair. 459 + */ 460 + void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d, 461 + enum resctrl_event_id evtid, u32 rmid, u32 closid, 462 + u32 cntr_id, bool assign) 463 + { 464 + struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d); 465 + union l3_qos_abmc_cfg abmc_cfg = { 0 }; 466 + struct arch_mbm_state *am; 467 + 468 + abmc_cfg.split.cfg_en = 1; 469 + abmc_cfg.split.cntr_en = assign ? 
1 : 0; 470 + abmc_cfg.split.cntr_id = cntr_id; 471 + abmc_cfg.split.bw_src = rmid; 472 + if (assign) 473 + abmc_cfg.split.bw_type = resctrl_get_mon_evt_cfg(evtid); 474 + 475 + smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1); 476 + 477 + /* 478 + * The hardware counter is reset (because cfg_en == 1) so there is no 479 + * need to record initial non-zero counts. 480 + */ 481 + am = get_arch_mbm_state(hw_dom, rmid, evtid); 482 + if (am) 483 + memset(am, 0, sizeof(*am)); 484 + } 485 + 486 + void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r) 487 + { 488 + struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); 489 + 490 + resctrl_abmc_set_one_amd(&hw_res->mbm_cntr_assign_enabled); 479 491 }
arch/x86/kernel/cpu/scattered.c (+1)
··· 51 51 { X86_FEATURE_COHERENCY_SFW_NO, CPUID_EBX, 31, 0x8000001f, 0 }, 52 52 { X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 }, 53 53 { X86_FEATURE_BMEC, CPUID_EBX, 3, 0x80000020, 0 }, 54 + { X86_FEATURE_ABMC, CPUID_EBX, 5, 0x80000020, 0 }, 54 55 { X86_FEATURE_TSA_SQ_NO, CPUID_ECX, 1, 0x80000021, 0 }, 55 56 { X86_FEATURE_TSA_L1_NO, CPUID_ECX, 2, 0x80000021, 0 }, 56 57 { X86_FEATURE_AMD_WORKLOAD_CLASS, CPUID_EAX, 22, 0x80000021, 0 },
fs/resctrl/ctrlmondata.c (+19 -7)
··· 473 473 rdt_last_cmd_clear(); 474 474 475 475 if (!strcmp(buf, "mbm_local_bytes")) { 476 - if (resctrl_arch_is_mbm_local_enabled()) 476 + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) 477 477 rdtgrp->mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID; 478 478 else 479 479 ret = -EINVAL; 480 480 } else if (!strcmp(buf, "mbm_total_bytes")) { 481 - if (resctrl_arch_is_mbm_total_enabled()) 481 + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) 482 482 rdtgrp->mba_mbps_event = QOS_L3_MBM_TOTAL_EVENT_ID; 483 483 else 484 484 ret = -EINVAL; ··· 563 563 rr->r = r; 564 564 rr->d = d; 565 565 rr->first = first; 566 - rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid); 567 - if (IS_ERR(rr->arch_mon_ctx)) { 568 - rr->err = -EINVAL; 569 - return; 566 + if (resctrl_arch_mbm_cntr_assign_enabled(r) && 567 + resctrl_is_mbm_event(evtid)) { 568 + rr->is_mbm_cntr = true; 569 + } else { 570 + rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid); 571 + if (IS_ERR(rr->arch_mon_ctx)) { 572 + rr->err = -EINVAL; 573 + return; 574 + } 570 575 } 571 576 572 577 cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU); ··· 587 582 else 588 583 smp_call_on_cpu(cpu, smp_mon_event_count, rr, false); 589 584 590 - resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx); 585 + if (rr->arch_mon_ctx) 586 + resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx); 591 587 } 592 588 593 589 int rdtgroup_mondata_show(struct seq_file *m, void *arg) ··· 659 653 660 654 checkresult: 661 655 656 + /* 657 + * -ENOENT is a special case, set only when "mbm_event" counter assignment 658 + * mode is enabled and no counter has been assigned. 659 + */ 662 660 if (rr.err == -EIO) 663 661 seq_puts(m, "Error\n"); 664 662 else if (rr.err == -EINVAL) 665 663 seq_puts(m, "Unavailable\n"); 664 + else if (rr.err == -ENOENT) 665 + seq_puts(m, "Unassigned\n"); 666 666 else 667 667 seq_printf(m, "%llu\n", rr.val); 668 668
fs/resctrl/internal.h (+55 -3)
··· 52 52 } 53 53 54 54 /** 55 - * struct mon_evt - Entry in the event list of a resource 55 + * struct mon_evt - Properties of a monitor event 56 56 * @evtid: event id 57 + * @rid: resource id for this event 57 58 * @name: name of the event 59 + * @evt_cfg: Event configuration value that represents the 60 + * memory transactions (e.g., READS_TO_LOCAL_MEM, 61 + * READS_TO_REMOTE_MEM) being tracked by @evtid. 62 + * Only valid if @evtid is an MBM event. 58 63 * @configurable: true if the event is configurable 59 - * @list: entry in &rdt_resource->evt_list 64 + * @enabled: true if the event is enabled 60 65 */ 61 66 struct mon_evt { 62 67 enum resctrl_event_id evtid; 68 + enum resctrl_res_level rid; 63 69 char *name; 70 + u32 evt_cfg; 64 71 bool configurable; 65 - struct list_head list; 72 + bool enabled; 66 73 }; 74 + 75 + extern struct mon_evt mon_event_all[QOS_NUM_EVENTS]; 76 + 77 + #define for_each_mon_event(mevt) for (mevt = &mon_event_all[QOS_FIRST_EVENT]; \ 78 + mevt < &mon_event_all[QOS_NUM_EVENTS]; mevt++) 67 79 68 80 /** 69 81 * struct mon_data - Monitoring details for each event file. ··· 111 99 * @evtid: Which monitor event to read. 112 100 * @first: Initialize MBM counter when true. 113 101 * @ci: Cacheinfo for L3. Only set when @d is NULL. Used when summing domains. 102 + * @is_mbm_cntr: true if "mbm_event" counter assignment mode is enabled and it 103 + * is an MBM event. 114 104 * @err: Error encountered when reading counter. 115 105 * @val: Returned value of event counter. If @rgrp is a parent resource group, 116 106 * @val includes the sum of event counts from its child resource groups. 
··· 127 113 enum resctrl_event_id evtid; 128 114 bool first; 129 115 struct cacheinfo *ci; 116 + bool is_mbm_cntr; 130 117 int err; 131 118 u64 val; 132 119 void *arch_mon_ctx; ··· 240 225 #define RFTYPE_RES_MB BIT(9) 241 226 242 227 #define RFTYPE_DEBUG BIT(10) 228 + 229 + #define RFTYPE_ASSIGN_CONFIG BIT(11) 243 230 244 231 #define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL) 245 232 ··· 391 374 bool closid_allocated(unsigned int closid); 392 375 393 376 int resctrl_find_cleanest_closid(void); 377 + 378 + void *rdt_kn_parent_priv(struct kernfs_node *kn); 379 + 380 + int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of, struct seq_file *s, void *v); 381 + 382 + ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, char *buf, 383 + size_t nbytes, loff_t off); 384 + 385 + void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn, 386 + bool show); 387 + 388 + int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of, struct seq_file *s, void *v); 389 + 390 + int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of, struct seq_file *s, 391 + void *v); 392 + 393 + void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp); 394 + 395 + void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp); 396 + 397 + int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v); 398 + 399 + ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes, 400 + loff_t off); 401 + 402 + int resctrl_mbm_assign_on_mkdir_show(struct kernfs_open_file *of, 403 + struct seq_file *s, void *v); 404 + 405 + ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of, char *buf, 406 + size_t nbytes, loff_t off); 407 + 408 + int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file *s, void *v); 409 + 410 + ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf, size_t nbytes, 411 + loff_t off); 394 412 395 413 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK 396 414 int 
rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
fs/resctrl/monitor.c (+948 -64)
··· 336 336 337 337 entry = __rmid_entry(idx); 338 338 339 - if (resctrl_arch_is_llc_occupancy_enabled()) 339 + if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID)) 340 340 add_rmid_to_limbo(entry); 341 341 else 342 342 list_add_tail(&entry->list, &rmid_free_lru); ··· 346 346 u32 rmid, enum resctrl_event_id evtid) 347 347 { 348 348 u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid); 349 + struct mbm_state *state; 349 350 350 - switch (evtid) { 351 - case QOS_L3_MBM_TOTAL_EVENT_ID: 352 - return &d->mbm_total[idx]; 353 - case QOS_L3_MBM_LOCAL_EVENT_ID: 354 - return &d->mbm_local[idx]; 355 - default: 351 + if (!resctrl_is_mbm_event(evtid)) 356 352 return NULL; 357 - } 353 + 354 + state = d->mbm_states[MBM_STATE_IDX(evtid)]; 355 + 356 + return state ? &state[idx] : NULL; 358 357 } 359 358 360 - static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr) 359 + /* 360 + * mbm_cntr_get() - Return the counter ID for the matching @evtid and @rdtgrp. 361 + * 362 + * Return: 363 + * Valid counter ID on success, or -ENOENT on failure. 364 + */ 365 + static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d, 366 + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid) 367 + { 368 + int cntr_id; 369 + 370 + if (!r->mon.mbm_cntr_assignable) 371 + return -ENOENT; 372 + 373 + if (!resctrl_is_mbm_event(evtid)) 374 + return -ENOENT; 375 + 376 + for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) { 377 + if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp && 378 + d->cntr_cfg[cntr_id].evtid == evtid) 379 + return cntr_id; 380 + } 381 + 382 + return -ENOENT; 383 + } 384 + 385 + /* 386 + * mbm_cntr_alloc() - Initialize and return a new counter ID in the domain @d. 387 + * Caller must ensure that the specified event is not assigned already. 388 + * 389 + * Return: 390 + * Valid counter ID on success, or -ENOSPC on failure. 
391 + */ 392 + static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d, 393 + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid) 394 + { 395 + int cntr_id; 396 + 397 + for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) { 398 + if (!d->cntr_cfg[cntr_id].rdtgrp) { 399 + d->cntr_cfg[cntr_id].rdtgrp = rdtgrp; 400 + d->cntr_cfg[cntr_id].evtid = evtid; 401 + return cntr_id; 402 + } 403 + } 404 + 405 + return -ENOSPC; 406 + } 407 + 408 + /* 409 + * mbm_cntr_free() - Clear the counter ID configuration details in the domain @d. 410 + */ 411 + static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id) 412 + { 413 + memset(&d->cntr_cfg[cntr_id], 0, sizeof(*d->cntr_cfg)); 414 + } 415 + 416 + static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr) 361 417 { 362 418 int cpu = smp_processor_id(); 419 + u32 closid = rdtgrp->closid; 420 + u32 rmid = rdtgrp->mon.rmid; 363 421 struct rdt_mon_domain *d; 422 + int cntr_id = -ENOENT; 364 423 struct mbm_state *m; 365 424 int err, ret; 366 425 u64 tval = 0; 367 426 427 + if (rr->is_mbm_cntr) { 428 + cntr_id = mbm_cntr_get(rr->r, rr->d, rdtgrp, rr->evtid); 429 + if (cntr_id < 0) { 430 + rr->err = -ENOENT; 431 + return -EINVAL; 432 + } 433 + } 434 + 368 435 if (rr->first) { 369 - resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid); 436 + if (rr->is_mbm_cntr) 437 + resctrl_arch_reset_cntr(rr->r, rr->d, closid, rmid, cntr_id, rr->evtid); 438 + else 439 + resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid); 370 440 m = get_mbm_state(rr->d, closid, rmid, rr->evtid); 371 441 if (m) 372 442 memset(m, 0, sizeof(struct mbm_state)); ··· 447 377 /* Reading a single domain, must be on a CPU in that domain. 
*/ 448 378 if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask)) 449 379 return -EINVAL; 450 - rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, 451 - rr->evtid, &tval, rr->arch_mon_ctx); 380 + if (rr->is_mbm_cntr) 381 + rr->err = resctrl_arch_cntr_read(rr->r, rr->d, closid, rmid, cntr_id, 382 + rr->evtid, &tval); 383 + else 384 + rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, 385 + rr->evtid, &tval, rr->arch_mon_ctx); 452 386 if (rr->err) 453 387 return rr->err; 454 388 ··· 476 402 list_for_each_entry(d, &rr->r->mon_domains, hdr.list) { 477 403 if (d->ci_id != rr->ci->id) 478 404 continue; 479 - err = resctrl_arch_rmid_read(rr->r, d, closid, rmid, 480 - rr->evtid, &tval, rr->arch_mon_ctx); 405 + if (rr->is_mbm_cntr) 406 + err = resctrl_arch_cntr_read(rr->r, d, closid, rmid, cntr_id, 407 + rr->evtid, &tval); 408 + else 409 + err = resctrl_arch_rmid_read(rr->r, d, closid, rmid, 410 + rr->evtid, &tval, rr->arch_mon_ctx); 481 411 if (!err) { 482 412 rr->val += tval; 483 413 ret = 0; ··· 497 419 /* 498 420 * mbm_bw_count() - Update bw count from values previously read by 499 421 * __mon_event_count(). 500 - * @closid: The closid used to identify the cached mbm_state. 501 - * @rmid: The rmid used to identify the cached mbm_state. 422 + * @rdtgrp: resctrl group associated with the CLOSID and RMID to identify 423 + * the cached mbm_state. 502 424 * @rr: The struct rmid_read populated by __mon_event_count(). 503 425 * 504 426 * Supporting function to calculate the memory bandwidth ··· 506 428 * __mon_event_count() is compared with the chunks value from the previous 507 429 * invocation. This must be called once per second to maintain values in MBps. 
508 430 */ 509 - static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr) 431 + static void mbm_bw_count(struct rdtgroup *rdtgrp, struct rmid_read *rr) 510 432 { 511 433 u64 cur_bw, bytes, cur_bytes; 434 + u32 closid = rdtgrp->closid; 435 + u32 rmid = rdtgrp->mon.rmid; 512 436 struct mbm_state *m; 513 437 514 438 m = get_mbm_state(rr->d, closid, rmid, rr->evtid); ··· 539 459 540 460 rdtgrp = rr->rgrp; 541 461 542 - ret = __mon_event_count(rdtgrp->closid, rdtgrp->mon.rmid, rr); 462 + ret = __mon_event_count(rdtgrp, rr); 543 463 544 464 /* 545 465 * For Ctrl groups read data from child monitor groups and ··· 550 470 551 471 if (rdtgrp->type == RDTCTRL_GROUP) { 552 472 list_for_each_entry(entry, head, mon.crdtgrp_list) { 553 - if (__mon_event_count(entry->closid, entry->mon.rmid, 554 - rr) == 0) 473 + if (__mon_event_count(entry, rr) == 0) 555 474 ret = 0; 556 475 } 557 476 } ··· 681 602 } 682 603 683 604 static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d, 684 - u32 closid, u32 rmid, enum resctrl_event_id evtid) 605 + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid) 685 606 { 686 607 struct rmid_read rr = {0}; 687 608 688 609 rr.r = r; 689 610 rr.d = d; 690 611 rr.evtid = evtid; 691 - rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid); 692 - if (IS_ERR(rr.arch_mon_ctx)) { 693 - pr_warn_ratelimited("Failed to allocate monitor context: %ld", 694 - PTR_ERR(rr.arch_mon_ctx)); 695 - return; 612 + if (resctrl_arch_mbm_cntr_assign_enabled(r)) { 613 + rr.is_mbm_cntr = true; 614 + } else { 615 + rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid); 616 + if (IS_ERR(rr.arch_mon_ctx)) { 617 + pr_warn_ratelimited("Failed to allocate monitor context: %ld", 618 + PTR_ERR(rr.arch_mon_ctx)); 619 + return; 620 + } 696 621 } 697 622 698 - __mon_event_count(closid, rmid, &rr); 623 + __mon_event_count(rdtgrp, &rr); 699 624 700 625 /* 701 626 * If the software controller is enabled, compute the 702 627 * bandwidth for this 
event id. 703 628 */ 704 629 if (is_mba_sc(NULL)) 705 - mbm_bw_count(closid, rmid, &rr); 630 + mbm_bw_count(rdtgrp, &rr); 706 631 707 - resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx); 632 + if (rr.arch_mon_ctx) 633 + resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx); 708 634 } 709 635 710 636 static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d, 711 - u32 closid, u32 rmid) 637 + struct rdtgroup *rdtgrp) 712 638 { 713 639 /* 714 640 * This is protected from concurrent reads from user as both 715 641 * the user and overflow handler hold the global mutex. 716 642 */ 717 - if (resctrl_arch_is_mbm_total_enabled()) 718 - mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID); 643 + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) 644 + mbm_update_one_event(r, d, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID); 719 645 720 - if (resctrl_arch_is_mbm_local_enabled()) 721 - mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID); 646 + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) 647 + mbm_update_one_event(r, d, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID); 722 648 } 723 649 724 650 /* ··· 796 712 d = container_of(work, struct rdt_mon_domain, mbm_over.work); 797 713 798 714 list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { 799 - mbm_update(r, d, prgrp->closid, prgrp->mon.rmid); 715 + mbm_update(r, d, prgrp); 800 716 801 717 head = &prgrp->mon.crdtgrp_list; 802 718 list_for_each_entry(crgrp, head, mon.crdtgrp_list) 803 - mbm_update(r, d, crgrp->closid, crgrp->mon.rmid); 719 + mbm_update(r, d, crgrp); 804 720 805 721 if (is_mba_sc(NULL)) 806 722 update_mba_bw(prgrp, d); ··· 926 842 mutex_unlock(&rdtgroup_mutex); 927 843 } 928 844 929 - static struct mon_evt llc_occupancy_event = { 930 - .name = "llc_occupancy", 931 - .evtid = QOS_L3_OCCUP_EVENT_ID, 845 + /* 846 + * All available events. 
Architecture code marks the ones that 847 + * are supported by a system using resctrl_enable_mon_event() 848 + * to set .enabled. 849 + */ 850 + struct mon_evt mon_event_all[QOS_NUM_EVENTS] = { 851 + [QOS_L3_OCCUP_EVENT_ID] = { 852 + .name = "llc_occupancy", 853 + .evtid = QOS_L3_OCCUP_EVENT_ID, 854 + .rid = RDT_RESOURCE_L3, 855 + }, 856 + [QOS_L3_MBM_TOTAL_EVENT_ID] = { 857 + .name = "mbm_total_bytes", 858 + .evtid = QOS_L3_MBM_TOTAL_EVENT_ID, 859 + .rid = RDT_RESOURCE_L3, 860 + }, 861 + [QOS_L3_MBM_LOCAL_EVENT_ID] = { 862 + .name = "mbm_local_bytes", 863 + .evtid = QOS_L3_MBM_LOCAL_EVENT_ID, 864 + .rid = RDT_RESOURCE_L3, 865 + }, 932 866 }; 933 867 934 - static struct mon_evt mbm_total_event = { 935 - .name = "mbm_total_bytes", 936 - .evtid = QOS_L3_MBM_TOTAL_EVENT_ID, 868 + void resctrl_enable_mon_event(enum resctrl_event_id eventid) 869 + { 870 + if (WARN_ON_ONCE(eventid < QOS_FIRST_EVENT || eventid >= QOS_NUM_EVENTS)) 871 + return; 872 + if (mon_event_all[eventid].enabled) { 873 + pr_warn("Duplicate enable for event %d\n", eventid); 874 + return; 875 + } 876 + 877 + mon_event_all[eventid].enabled = true; 878 + } 879 + 880 + bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid) 881 + { 882 + return eventid >= QOS_FIRST_EVENT && eventid < QOS_NUM_EVENTS && 883 + mon_event_all[eventid].enabled; 884 + } 885 + 886 + u32 resctrl_get_mon_evt_cfg(enum resctrl_event_id evtid) 887 + { 888 + return mon_event_all[evtid].evt_cfg; 889 + } 890 + 891 + /** 892 + * struct mbm_transaction - Memory transaction an MBM event can be configured with. 893 + * @name: Name of memory transaction (read, write ...). 894 + * @val: The bit (eg. READS_TO_LOCAL_MEM or READS_TO_REMOTE_MEM) used to 895 + * represent the memory transaction within an event's configuration. 
896 + */ 897 + struct mbm_transaction { 898 + char name[32]; 899 + u32 val; 937 900 }; 938 901 939 - static struct mon_evt mbm_local_event = { 940 - .name = "mbm_local_bytes", 941 - .evtid = QOS_L3_MBM_LOCAL_EVENT_ID, 902 + /* Decoded values for each type of memory transaction. */ 903 + static struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { 904 + {"local_reads", READS_TO_LOCAL_MEM}, 905 + {"remote_reads", READS_TO_REMOTE_MEM}, 906 + {"local_non_temporal_writes", NON_TEMP_WRITE_TO_LOCAL_MEM}, 907 + {"remote_non_temporal_writes", NON_TEMP_WRITE_TO_REMOTE_MEM}, 908 + {"local_reads_slow_memory", READS_TO_LOCAL_S_MEM}, 909 + {"remote_reads_slow_memory", READS_TO_REMOTE_S_MEM}, 910 + {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, 942 911 }; 912 + 913 + int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) 914 + { 915 + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); 916 + struct rdt_resource *r; 917 + bool sep = false; 918 + int ret = 0, i; 919 + 920 + mutex_lock(&rdtgroup_mutex); 921 + rdt_last_cmd_clear(); 922 + 923 + r = resctrl_arch_get_resource(mevt->rid); 924 + if (!resctrl_arch_mbm_cntr_assign_enabled(r)) { 925 + rdt_last_cmd_puts("mbm_event counter assignment mode is not enabled\n"); 926 + ret = -EINVAL; 927 + goto out_unlock; 928 + } 929 + 930 + for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) { 931 + if (mevt->evt_cfg & mbm_transactions[i].val) { 932 + if (sep) 933 + seq_putc(seq, ','); 934 + seq_printf(seq, "%s", mbm_transactions[i].name); 935 + sep = true; 936 + } 937 + } 938 + seq_putc(seq, '\n'); 939 + 940 + out_unlock: 941 + mutex_unlock(&rdtgroup_mutex); 942 + 943 + return ret; 944 + } 945 + 946 + int resctrl_mbm_assign_on_mkdir_show(struct kernfs_open_file *of, struct seq_file *s, 947 + void *v) 948 + { 949 + struct rdt_resource *r = rdt_kn_parent_priv(of->kn); 950 + int ret = 0; 951 + 952 + mutex_lock(&rdtgroup_mutex); 953 + rdt_last_cmd_clear(); 954 + 955 + if (!resctrl_arch_mbm_cntr_assign_enabled(r)) 
{ 956 + rdt_last_cmd_puts("mbm_event counter assignment mode is not enabled\n"); 957 + ret = -EINVAL; 958 + goto out_unlock; 959 + } 960 + 961 + seq_printf(s, "%u\n", r->mon.mbm_assign_on_mkdir); 962 + 963 + out_unlock: 964 + mutex_unlock(&rdtgroup_mutex); 965 + 966 + return ret; 967 + } 968 + 969 + ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of, char *buf, 970 + size_t nbytes, loff_t off) 971 + { 972 + struct rdt_resource *r = rdt_kn_parent_priv(of->kn); 973 + bool value; 974 + int ret; 975 + 976 + ret = kstrtobool(buf, &value); 977 + if (ret) 978 + return ret; 979 + 980 + mutex_lock(&rdtgroup_mutex); 981 + rdt_last_cmd_clear(); 982 + 983 + if (!resctrl_arch_mbm_cntr_assign_enabled(r)) { 984 + rdt_last_cmd_puts("mbm_event counter assignment mode is not enabled\n"); 985 + ret = -EINVAL; 986 + goto out_unlock; 987 + } 988 + 989 + r->mon.mbm_assign_on_mkdir = value; 990 + 991 + out_unlock: 992 + mutex_unlock(&rdtgroup_mutex); 993 + 994 + return ret ?: nbytes; 995 + } 943 996 944 997 /* 945 - * Initialize the event list for the resource. 946 - * 947 - * Note that MBM events are also part of RDT_RESOURCE_L3 resource 948 - * because as per the SDM the total and local memory bandwidth 949 - * are enumerated as part of L3 monitoring. 998 + * mbm_cntr_free_all() - Clear all the counter ID configuration details in the 999 + * domain @d. Called when mbm_assign_mode is changed. 
950 1000 */ 951 - static void l3_mon_evt_init(struct rdt_resource *r) 1001 + static void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d) 952 1002 { 953 - INIT_LIST_HEAD(&r->evt_list); 1003 + memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs); 1004 + } 954 1005 955 - if (resctrl_arch_is_llc_occupancy_enabled()) 956 - list_add_tail(&llc_occupancy_event.list, &r->evt_list); 957 - if (resctrl_arch_is_mbm_total_enabled()) 958 - list_add_tail(&mbm_total_event.list, &r->evt_list); 959 - if (resctrl_arch_is_mbm_local_enabled()) 960 - list_add_tail(&mbm_local_event.list, &r->evt_list); 1006 + /* 1007 + * resctrl_reset_rmid_all() - Reset all non-architecture states for all the 1008 + * supported RMIDs. 1009 + */ 1010 + static void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d) 1011 + { 1012 + u32 idx_limit = resctrl_arch_system_num_rmid_idx(); 1013 + enum resctrl_event_id evt; 1014 + int idx; 1015 + 1016 + for_each_mbm_event_id(evt) { 1017 + if (!resctrl_is_mon_event_enabled(evt)) 1018 + continue; 1019 + idx = MBM_STATE_IDX(evt); 1020 + memset(d->mbm_states[idx], 0, sizeof(*d->mbm_states[0]) * idx_limit); 1021 + } 1022 + } 1023 + 1024 + /* 1025 + * rdtgroup_assign_cntr() - Assign/unassign the counter ID for the event, RMID 1026 + * pair in the domain. 1027 + * 1028 + * Assign the counter if @assign is true else unassign the counter. Reset the 1029 + * associated non-architectural state. 
1030 + */ 1031 + static void rdtgroup_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d, 1032 + enum resctrl_event_id evtid, u32 rmid, u32 closid, 1033 + u32 cntr_id, bool assign) 1034 + { 1035 + struct mbm_state *m; 1036 + 1037 + resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, assign); 1038 + 1039 + m = get_mbm_state(d, closid, rmid, evtid); 1040 + if (m) 1041 + memset(m, 0, sizeof(*m)); 1042 + } 1043 + 1044 + /* 1045 + * rdtgroup_alloc_assign_cntr() - Allocate a counter ID and assign it to the event 1046 + * pointed to by @mevt and the resctrl group @rdtgrp within the domain @d. 1047 + * 1048 + * Return: 1049 + * 0 on success, < 0 on failure. 1050 + */ 1051 + static int rdtgroup_alloc_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d, 1052 + struct rdtgroup *rdtgrp, struct mon_evt *mevt) 1053 + { 1054 + int cntr_id; 1055 + 1056 + /* No action required if the counter is assigned already. */ 1057 + cntr_id = mbm_cntr_get(r, d, rdtgrp, mevt->evtid); 1058 + if (cntr_id >= 0) 1059 + return 0; 1060 + 1061 + cntr_id = mbm_cntr_alloc(r, d, rdtgrp, mevt->evtid); 1062 + if (cntr_id < 0) { 1063 + rdt_last_cmd_printf("Failed to allocate counter for %s in domain %d\n", 1064 + mevt->name, d->hdr.id); 1065 + return cntr_id; 1066 + } 1067 + 1068 + rdtgroup_assign_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid, rdtgrp->closid, cntr_id, true); 1069 + 1070 + return 0; 1071 + } 1072 + 1073 + /* 1074 + * rdtgroup_assign_cntr_event() - Assign a hardware counter for the event in 1075 + * @mevt to the resctrl group @rdtgrp. Assign counters to all domains if @d is 1076 + * NULL; otherwise, assign the counter to the specified domain @d. 1077 + * 1078 + * If all counters in a domain are already in use, rdtgroup_alloc_assign_cntr() 1079 + * will fail. The assignment process will abort at the first failure encountered 1080 + * during domain traversal, which may result in the event being only partially 1081 + * assigned. 
+ *
+ * Return:
+ * 0 on success, < 0 on failure.
+ */
+static int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
+				      struct mon_evt *mevt)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(mevt->rid);
+	int ret = 0;
+
+	if (!d) {
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
+			if (ret)
+				return ret;
+		}
+	} else {
+		ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
+	}
+
+	return ret;
+}
+
+/*
+ * rdtgroup_assign_cntrs() - Assign counters to MBM events. Called when
+ * a new group is created.
+ *
+ * Each group can accommodate two counters per domain: one for the total
+ * event and one for the local event. Assignments may fail due to the limited
+ * number of counters. However, it is not necessary to fail the group creation
+ * and thus no failure is returned. Users have the option to modify the
+ * counter assignments after the group has been created.
+ */
+void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+	if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r) ||
+	    !r->mon.mbm_assign_on_mkdir)
+		return;
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+		rdtgroup_assign_cntr_event(NULL, rdtgrp,
+					   &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+		rdtgroup_assign_cntr_event(NULL, rdtgrp,
+					   &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
+}
+
+/*
+ * rdtgroup_free_unassign_cntr() - Unassign and reset the counter ID configuration
+ * for the event pointed to by @mevt within the domain @d and resctrl group @rdtgrp.
+ */
+static void rdtgroup_free_unassign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+					struct rdtgroup *rdtgrp, struct mon_evt *mevt)
+{
+	int cntr_id;
+
+	cntr_id = mbm_cntr_get(r, d, rdtgrp, mevt->evtid);
+
+	/* If there is no cntr_id assigned, nothing to do */
+	if (cntr_id < 0)
+		return;
+
+	rdtgroup_assign_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid, rdtgrp->closid, cntr_id, false);
+
+	mbm_cntr_free(d, cntr_id);
+}
+
+/*
+ * rdtgroup_unassign_cntr_event() - Unassign a hardware counter associated with
+ * the event structure @mevt from the domain @d and the group @rdtgrp. Unassign
+ * the counters from all the domains if @d is NULL else unassign from @d.
+ */
+static void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
+					 struct mon_evt *mevt)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(mevt->rid);
+
+	if (!d) {
+		list_for_each_entry(d, &r->mon_domains, hdr.list)
+			rdtgroup_free_unassign_cntr(r, d, rdtgrp, mevt);
+	} else {
+		rdtgroup_free_unassign_cntr(r, d, rdtgrp, mevt);
+	}
+}
+
+/*
+ * rdtgroup_unassign_cntrs() - Unassign the counters associated with MBM events.
+ * Called when a group is deleted.
+ */
+void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+	if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r))
+		return;
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+		rdtgroup_unassign_cntr_event(NULL, rdtgrp,
+					     &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+		rdtgroup_unassign_cntr_event(NULL, rdtgrp,
+					     &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
+}
+
+static int resctrl_parse_mem_transactions(char *tok, u32 *val)
+{
+	u32 temp_val = 0;
+	char *evt_str;
+	bool found;
+	int i;
+
+next_config:
+	if (!tok || tok[0] == '\0') {
+		*val = temp_val;
+		return 0;
+	}
+
+	/* Start processing the strings for each memory transaction type */
+	evt_str = strim(strsep(&tok, ","));
+	found = false;
+	for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) {
+		if (!strcmp(mbm_transactions[i].name, evt_str)) {
+			temp_val |= mbm_transactions[i].val;
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		rdt_last_cmd_printf("Invalid memory transaction type %s\n", evt_str);
+		return -EINVAL;
+	}
+
+	goto next_config;
+}
+
+/*
+ * rdtgroup_update_cntr_event - Update the counter assignments for the event
+ * in a group.
+ * @r:      Resource to which the update needs to be done.
+ * @rdtgrp: Resctrl group.
+ * @evtid:  MBM monitor event.
+ */
+static void rdtgroup_update_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+				       enum resctrl_event_id evtid)
+{
+	struct rdt_mon_domain *d;
+	int cntr_id;
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
+		if (cntr_id >= 0)
+			rdtgroup_assign_cntr(r, d, evtid, rdtgrp->mon.rmid,
+					     rdtgrp->closid, cntr_id, true);
+	}
+}
+
+/*
+ * resctrl_update_cntr_allrdtgrp - Update the counter assignments for the event
+ * for all the groups.
+ * @mevt: MBM monitor event.
+ */
+static void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(mevt->rid);
+	struct rdtgroup *prgrp, *crgrp;
+
+	/*
+	 * Find all the groups where the event is assigned and update the
+	 * configuration of existing assignments.
+	 */
+	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
+		rdtgroup_update_cntr_event(r, prgrp, mevt->evtid);
+
+		list_for_each_entry(crgrp, &prgrp->mon.crdtgrp_list, mon.crdtgrp_list)
+			rdtgroup_update_cntr_event(r, crgrp, mevt->evtid);
+	}
+}
+
+ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
+			   loff_t off)
+{
+	struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
+	struct rdt_resource *r;
+	u32 evt_cfg = 0;
+	int ret = 0;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	r = resctrl_arch_get_resource(mevt->rid);
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event counter assignment mode is not enabled\n");
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = resctrl_parse_mem_transactions(buf, &evt_cfg);
+	if (!ret && mevt->evt_cfg != evt_cfg) {
+		mevt->evt_cfg = evt_cfg;
+		resctrl_update_cntr_allrdtgrp(mevt);
+	}
+
+out_unlock:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
+int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
+				 struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	bool enabled;
+
+	mutex_lock(&rdtgroup_mutex);
+	enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
+
+	if (r->mon.mbm_cntr_assignable) {
+		if (enabled)
+			seq_puts(s, "[mbm_event]\n");
+		else
+			seq_puts(s, "[default]\n");
+
+		if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
+			if (enabled)
+				seq_puts(s, "default\n");
+			else
+				seq_puts(s, "mbm_event\n");
+		}
+	} else {
+		seq_puts(s, "[default]\n");
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return 0;
+}
+
+ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, char *buf,
+				      size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	struct rdt_mon_domain *d;
+	int ret = 0;
+	bool enable;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!strcmp(buf, "default")) {
+		enable = 0;
+	} else if (!strcmp(buf, "mbm_event")) {
+		if (r->mon.mbm_cntr_assignable) {
+			enable = 1;
+		} else {
+			ret = -EINVAL;
+			rdt_last_cmd_puts("mbm_event mode is not supported\n");
+			goto out_unlock;
+		}
+	} else {
+		ret = -EINVAL;
+		rdt_last_cmd_puts("Unsupported assign mode\n");
+		goto out_unlock;
+	}
+
+	if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		ret = resctrl_arch_mbm_cntr_assign_set(r, enable);
+		if (ret)
+			goto out_unlock;
+
+		/* Update the visibility of BMEC related files */
+		resctrl_bmec_files_show(r, NULL, !enable);
+
+		/*
+		 * Initialize the default memory transaction values for
+		 * total and local events.
+		 */
+		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+			mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
+		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+			mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask &
+									   (READS_TO_LOCAL_MEM |
+									    READS_TO_LOCAL_S_MEM |
+									    NON_TEMP_WRITE_TO_LOCAL_MEM);
+		/* Enable auto assignment when switching to "mbm_event" mode */
+		if (enable)
+			r->mon.mbm_assign_on_mkdir = true;
+		/*
+		 * Reset all the non-architectural RMID state and assignable counters.
+		 */
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			mbm_cntr_free_all(r, d);
+			resctrl_reset_rmid_all(r, d);
+		}
+	}
+
+out_unlock:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
+int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
+			       struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	struct rdt_mon_domain *dom;
+	bool sep = false;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
+		if (sep)
+			seq_putc(s, ';');
+
+		seq_printf(s, "%d=%d", dom->hdr.id, r->mon.num_mbm_cntrs);
+		sep = true;
+	}
+	seq_putc(s, '\n');
+
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+	return 0;
+}
+
+int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
+				     struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	struct rdt_mon_domain *dom;
+	bool sep = false;
+	u32 cntrs, i;
+	int ret = 0;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event counter assignment mode is not enabled\n");
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
+		if (sep)
+			seq_putc(s, ';');
+
+		cntrs = 0;
+		for (i = 0; i < r->mon.num_mbm_cntrs; i++) {
+			if (!dom->cntr_cfg[i].rdtgrp)
+				cntrs++;
+		}
+
+		seq_printf(s, "%d=%u", dom->hdr.id, cntrs);
+		sep = true;
+	}
+	seq_putc(s, '\n');
+
+out_unlock:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret;
+}
+
+int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct rdt_mon_domain *d;
+	struct rdtgroup *rdtgrp;
+	struct mon_evt *mevt;
+	int ret = 0;
+	bool sep;
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (!rdtgrp) {
+		ret = -ENOENT;
+		goto out_unlock;
+	}
+
+	rdt_last_cmd_clear();
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event counter assignment mode is not enabled\n");
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled || !resctrl_is_mbm_event(mevt->evtid))
+			continue;
+
+		sep = false;
+		seq_printf(s, "%s:", mevt->name);
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			if (sep)
+				seq_putc(s, ';');
+
+			if (mbm_cntr_get(r, d, rdtgrp, mevt->evtid) < 0)
+				seq_printf(s, "%d=_", d->hdr.id);
+			else
+				seq_printf(s, "%d=e", d->hdr.id);
+
+			sep = true;
+		}
+		seq_putc(s, '\n');
+	}
+
+out_unlock:
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret;
+}
+
+/*
+ * mbm_get_mon_event_by_name() - Return the mon_evt entry for the matching
+ * event name.
+ */
+static struct mon_evt *mbm_get_mon_event_by_name(struct rdt_resource *r, char *name)
+{
+	struct mon_evt *mevt;
+
+	for_each_mon_event(mevt) {
+		if (mevt->rid == r->rid && mevt->enabled &&
+		    resctrl_is_mbm_event(mevt->evtid) &&
+		    !strcmp(mevt->name, name))
+			return mevt;
+	}
+
+	return NULL;
+}
+
+static int rdtgroup_modify_assign_state(char *assign, struct rdt_mon_domain *d,
+					struct rdtgroup *rdtgrp, struct mon_evt *mevt)
+{
+	int ret = 0;
+
+	if (!assign || strlen(assign) != 1)
+		return -EINVAL;
+
+	switch (*assign) {
+	case 'e':
+		ret = rdtgroup_assign_cntr_event(d, rdtgrp, mevt);
+		break;
+	case '_':
+		rdtgroup_unassign_cntr_event(d, rdtgrp, mevt);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
+static int resctrl_parse_mbm_assignment(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+					char *event, char *tok)
+{
+	struct rdt_mon_domain *d;
+	unsigned long dom_id = 0;
+	char *dom_str, *id_str;
+	struct mon_evt *mevt;
+	int ret;
+
+	mevt = mbm_get_mon_event_by_name(r, event);
+	if (!mevt) {
+		rdt_last_cmd_printf("Invalid event %s\n", event);
+		return -ENOENT;
+	}
+
+next:
+	if (!tok || tok[0] == '\0')
+		return 0;
+
+	/* Start processing the strings for each domain */
+	dom_str = strim(strsep(&tok, ";"));
+
+	id_str = strsep(&dom_str, "=");
+
+	/* Check for domain id '*' which means all domains */
+	if (id_str && *id_str == '*') {
+		ret = rdtgroup_modify_assign_state(dom_str, NULL, rdtgrp, mevt);
+		if (ret)
+			rdt_last_cmd_printf("Assign operation '%s:*=%s' failed\n",
+					    event, dom_str);
+		return ret;
+	} else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
+		rdt_last_cmd_puts("Missing domain id\n");
+		return -EINVAL;
+	}
+
+	/* Verify if the dom_id is valid */
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		if (d->hdr.id == dom_id) {
+			ret = rdtgroup_modify_assign_state(dom_str, d, rdtgrp, mevt);
+			if (ret) {
+				rdt_last_cmd_printf("Assign operation '%s:%ld=%s' failed\n",
+						    event, dom_id, dom_str);
+				return ret;
+			}
+			goto next;
+		}
+	}
+
+	rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
+	return -EINVAL;
+}
+
+ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf,
+				 size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct rdtgroup *rdtgrp;
+	char *token, *event;
+	int ret = 0;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (!rdtgrp) {
+		rdtgroup_kn_unlock(of->kn);
+		return -ENOENT;
+	}
+	rdt_last_cmd_clear();
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
+		rdtgroup_kn_unlock(of->kn);
+		return -EINVAL;
+	}
+
+	while ((token = strsep(&buf, "\n")) != NULL) {
+		/*
+		 * The write command follows the following format:
+		 * "<Event>:<Domain ID>=<Assignment state>"
+		 * Extract the event name first.
+		 */
+		event = strsep(&token, ":");
+
+		ret = resctrl_parse_mbm_assignment(r, rdtgrp, event, token);
+		if (ret)
+			break;
+	}
+
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret ?: nbytes;
 }
 
 /**
···
 	if (ret)
 		return ret;
 
-	l3_mon_evt_init(r);
-
 	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID)) {
-		mbm_total_event.configurable = true;
+		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].configurable = true;
 		resctrl_file_fflags_init("mbm_total_bytes_config",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
 	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_LOCAL_EVENT_ID)) {
-		mbm_local_event.configurable = true;
+		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].configurable = true;
 		resctrl_file_fflags_init("mbm_local_bytes_config",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
 
-	if (resctrl_arch_is_mbm_local_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 		mba_mbps_default_event = QOS_L3_MBM_LOCAL_EVENT_ID;
-	else if (resctrl_arch_is_mbm_total_enabled())
+	else if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
 		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
+
+	if (r->mon.mbm_cntr_assignable) {
+		if (!resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+			resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
+		if (!resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+			resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
+		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
+		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask &
+								   (READS_TO_LOCAL_MEM |
+								    READS_TO_LOCAL_S_MEM |
+								    NON_TEMP_WRITE_TO_LOCAL_MEM);
+		r->mon.mbm_assign_on_mkdir = true;
+		resctrl_file_fflags_init("num_mbm_cntrs",
+					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
+		resctrl_file_fflags_init("available_mbm_cntrs",
+					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
+		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
+		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO |
+					 RFTYPE_RES_CACHE);
+		resctrl_file_fflags_init("mbm_L3_assignments", RFTYPE_MON_BASE);
+	}
 
 	return 0;
 }
fs/resctrl/rdtgroup.c (+209 -50)
···
 
 static bool resctrl_is_mbm_enabled(void)
 {
-	return (resctrl_arch_is_mbm_total_enabled() ||
-		resctrl_arch_is_mbm_local_enabled());
-}
-
-static bool resctrl_is_mbm_event(int e)
-{
-	return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
-		e <= QOS_L3_MBM_LOCAL_EVENT_ID);
+	return (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID) ||
+		resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID));
 }
 
 /*
···
 	lockdep_assert_held(&rdtgroup_mutex);
 
 	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
-	    resctrl_arch_is_llc_occupancy_enabled()) {
+	    resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID)) {
 		cleanest_closid = resctrl_find_cleanest_closid();
 		if (cleanest_closid < 0)
 			return cleanest_closid;
···
 	return 0;
 }
 
-static void *rdt_kn_parent_priv(struct kernfs_node *kn)
+void *rdt_kn_parent_priv(struct kernfs_node *kn)
 {
 	/*
 	 * The parent pointer is only valid within RCU section since it can be
···
 {
 	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
 
-	seq_printf(seq, "%d\n", r->num_rmid);
+	seq_printf(seq, "%d\n", r->mon.num_rmid);
 
 	return 0;
 }
···
 	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
 	struct mon_evt *mevt;
 
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled)
+			continue;
 		seq_printf(seq, "%s\n", mevt->name);
-		if (mevt->configurable)
+		if (mevt->configurable &&
+		    !resctrl_arch_mbm_cntr_assign_enabled(r))
 			seq_printf(seq, "%s_config\n", mevt->name);
 	}
 
···
 	}
 
 	/* Value from user cannot be more than the supported set of events */
-	if ((val & r->mbm_cfg_mask) != val) {
+	if ((val & r->mon.mbm_cfg_mask) != val) {
 		rdt_last_cmd_printf("Invalid event configuration: max valid mask is 0x%02x\n",
-				    r->mbm_cfg_mask);
+				    r->mon.mbm_cfg_mask);
 		return -EINVAL;
 	}
 
···
 	return ret ?: nbytes;
 }
 
+/*
+ * resctrl_bmec_files_show() - Control the visibility of BMEC-related resctrl
+ * files. When @show is true, the files are displayed; when false, the files
+ * are hidden.
+ * Don't treat kernfs_find_and_get() failure as an error, since this function
+ * may be called regardless of whether BMEC is supported or the event is
+ * enabled.
+ */
+void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn,
+			     bool show)
+{
+	struct kernfs_node *kn_config, *mon_kn = NULL;
+	char name[32];
+
+	if (!l3_mon_kn) {
+		sprintf(name, "%s_MON", r->name);
+		mon_kn = kernfs_find_and_get(kn_info, name);
+		if (!mon_kn)
+			return;
+		l3_mon_kn = mon_kn;
+	}
+
+	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_total_bytes_config");
+	if (kn_config) {
+		kernfs_show(kn_config, show);
+		kernfs_put(kn_config);
+	}
+
+	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_local_bytes_config");
+	if (kn_config) {
+		kernfs_show(kn_config, show);
+		kernfs_put(kn_config);
+	}
+
+	/* Release the reference only if it was acquired */
+	if (mon_kn)
+		kernfs_put(mon_kn);
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
···
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= rdt_last_cmd_status_show,
 		.fflags		= RFTYPE_TOP_INFO,
+	},
+	{
+		.name		= "mbm_assign_on_mkdir",
+		.mode		= 0644,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_mbm_assign_on_mkdir_show,
+		.write		= resctrl_mbm_assign_on_mkdir_write,
 	},
 	{
 		.name		= "num_closids",
···
 		.fflags		= RFTYPE_MON_INFO,
 	},
 	{
+		.name		= "available_mbm_cntrs",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_available_mbm_cntrs_show,
+	},
+	{
 		.name		= "num_rmids",
 		.mode		= 0444,
 		.kf_ops		= &rdtgroup_kf_single_ops,
···
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= rdt_default_ctrl_show,
 		.fflags		= RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
+	},
+	{
+		.name		= "num_mbm_cntrs",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_num_mbm_cntrs_show,
 	},
 	{
 		.name		= "min_cbm_bits",
···
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= mbm_local_bytes_config_show,
 		.write		= mbm_local_bytes_config_write,
+	},
+	{
+		.name		= "event_filter",
+		.mode		= 0644,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= event_filter_show,
+		.write		= event_filter_write,
+	},
+	{
+		.name		= "mbm_L3_assignments",
+		.mode		= 0644,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= mbm_L3_assignments_show,
+		.write		= mbm_L3_assignments_write,
+	},
+	{
+		.name		= "mbm_assign_mode",
+		.mode		= 0644,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_mbm_assign_mode_show,
+		.write		= resctrl_mbm_assign_mode_write,
+		.fflags		= RFTYPE_MON_INFO | RFTYPE_RES_CACHE,
 	},
 	{
 		.name		= "cpus",
···
 	return ret;
 }
 
+static int resctrl_mkdir_event_configs(struct rdt_resource *r, struct kernfs_node *l3_mon_kn)
+{
+	struct kernfs_node *kn_subdir, *kn_subdir2;
+	struct mon_evt *mevt;
+	int ret;
+
+	kn_subdir = kernfs_create_dir(l3_mon_kn, "event_configs", l3_mon_kn->mode, NULL);
+	if (IS_ERR(kn_subdir))
+		return PTR_ERR(kn_subdir);
+
+	ret = rdtgroup_kn_set_ugid(kn_subdir);
+	if (ret)
+		return ret;
+
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled || !resctrl_is_mbm_event(mevt->evtid))
+			continue;
+
+		kn_subdir2 = kernfs_create_dir(kn_subdir, mevt->name, kn_subdir->mode, mevt);
+		if (IS_ERR(kn_subdir2)) {
+			ret = PTR_ERR(kn_subdir2);
+			goto out;
+		}
+
+		ret = rdtgroup_kn_set_ugid(kn_subdir2);
+		if (ret)
+			goto out;
+
+		ret = rdtgroup_add_files(kn_subdir2, RFTYPE_ASSIGN_CONFIG);
+		if (ret)
+			break;
+	}
+
+out:
+	return ret;
+}
+
 static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
 				      unsigned long fflags)
 {
 	struct kernfs_node *kn_subdir;
+	struct rdt_resource *r;
 	int ret;
 
 	kn_subdir = kernfs_create_dir(kn_info, name,
···
 		return ret;
 
 	ret = rdtgroup_add_files(kn_subdir, fflags);
-	if (!ret)
-		kernfs_activate(kn_subdir);
+	if (ret)
+		return ret;
+
+	if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) {
+		r = priv;
+		if (r->mon.mbm_cntr_assignable) {
+			ret = resctrl_mkdir_event_configs(r, kn_subdir);
+			if (ret)
+				return ret;
+			/*
+			 * Hide BMEC related files if mbm_event mode
+			 * is enabled.
+			 */
+			if (resctrl_arch_mbm_cntr_assign_enabled(r))
+				resctrl_bmec_files_show(r, kn_subdir, false);
+		}
+	}
+
+	kernfs_activate(kn_subdir);
 
 	return ret;
 }
···
 		goto out_root;
 
 	ret = schemata_list_create();
-	if (ret) {
-		schemata_list_destroy();
-		goto out_ctx;
-	}
+	if (ret)
+		goto out_schemata_free;
 
 	ret = closid_init();
 	if (ret)
···
 			  &kn_mongrp);
 	if (ret < 0)
 		goto out_info;
+
+	rdtgroup_assign_cntrs(&rdtgroup_default);
 
 	ret = mkdir_mondata_all(rdtgroup_default.kn,
 				&rdtgroup_default, &kn_mondata);
···
 	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mondata);
 out_mongrp:
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
+		rdtgroup_unassign_cntrs(&rdtgroup_default);
 		kernfs_remove(kn_mongrp);
+	}
 out_info:
 	kernfs_remove(kn_info);
 out_closid_exit:
 	closid_exit();
 out_schemata_free:
 	schemata_list_destroy();
-out_ctx:
 	rdt_disable_ctx();
 out_root:
 	rdtgroup_destroy_root();
···
 
 	head = &rdtgrp->mon.crdtgrp_list;
 	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
+		rdtgroup_unassign_cntrs(sentry);
 		free_rmid(sentry->closid, sentry->mon.rmid);
 		list_del(&sentry->mon.crdtgrp_list);
 
···
 	 */
 	cpumask_or(&rdtgroup_default.cpu_mask,
 		   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
+
+	rdtgroup_unassign_cntrs(rdtgrp);
 
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
···
 		return;
 
 	rmdir_all_sub();
+	rdtgroup_unassign_cntrs(&rdtgroup_default);
 	mon_put_kn_priv();
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
···
 	struct mon_evt *mevt;
 	int ret, domid;
 
-	if (WARN_ON(list_empty(&r->evt_list)))
-		return -EPERM;
-
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled)
+			continue;
 		domid = do_sum ? d->ci_id : d->hdr.id;
 		priv = mon_get_kn_priv(r->rid, domid, mevt, do_sum);
 		if (WARN_ON_ONCE(!priv))
···
 	}
 	rdtgrp->mon.rmid = ret;
 
+	rdtgroup_assign_cntrs(rdtgrp);
+
 	ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
 	if (ret) {
 		rdt_last_cmd_puts("kernfs subdir error\n");
+		rdtgroup_unassign_cntrs(rdtgrp);
 		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 		return ret;
 	}
···
 
 static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
 {
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
+		rdtgroup_unassign_cntrs(rgrp);
 		free_rmid(rgrp->closid, rgrp->mon.rmid);
+	}
 }
 
 /*
···
 	update_closid_rmid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
+
+	rdtgroup_unassign_cntrs(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 	/*
···
 	 */
 	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
 	update_closid_rmid(tmpmask, NULL);
+
+	rdtgroup_unassign_cntrs(rdtgrp);
 
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 	closid_free(rdtgrp->closid);
···
 
 static void domain_destroy_mon_state(struct rdt_mon_domain *d)
 {
+	int idx;
+
+	kfree(d->cntr_cfg);
 	bitmap_free(d->rmid_busy_llc);
-	kfree(d->mbm_total);
-	kfree(d->mbm_local);
+	for_each_mbm_idx(idx) {
+		kfree(d->mbm_states[idx]);
+		d->mbm_states[idx] = NULL;
+	}
 }
 
 void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_domain *d)
···
 
 	if (resctrl_is_mbm_enabled())
 		cancel_delayed_work(&d->mbm_over);
-	if (resctrl_arch_is_llc_occupancy_enabled() && has_busy_rmid(d)) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID) && has_busy_rmid(d)) {
 		/*
 		 * When a package is going down, forcefully
 		 * decrement rmid->ebusy. There is no way to know
···
 static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain *d)
 {
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
-	size_t tsize;
+	size_t tsize = sizeof(*d->mbm_states[0]);
+	enum resctrl_event_id eventid;
+	int idx;
 
-	if (resctrl_arch_is_llc_occupancy_enabled()) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID)) {
 		d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
 		if (!d->rmid_busy_llc)
 			return -ENOMEM;
 	}
-	if (resctrl_arch_is_mbm_total_enabled()) {
-		tsize = sizeof(*d->mbm_total);
-		d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
-		if (!d->mbm_total) {
-			bitmap_free(d->rmid_busy_llc);
-			return -ENOMEM;
-		}
-	}
-	if (resctrl_arch_is_mbm_local_enabled()) {
-		tsize = sizeof(*d->mbm_local);
-		d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
-		if (!d->mbm_local) {
-			bitmap_free(d->rmid_busy_llc);
-			kfree(d->mbm_total);
-			return -ENOMEM;
-		}
+
+	for_each_mbm_event_id(eventid) {
+		if (!resctrl_is_mon_event_enabled(eventid))
+			continue;
+		idx = MBM_STATE_IDX(eventid);
+		d->mbm_states[idx] = kcalloc(idx_limit, tsize, GFP_KERNEL);
+		if (!d->mbm_states[idx])
+			goto cleanup;
+	}
+
+	if (resctrl_is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
+		tsize = sizeof(*d->cntr_cfg);
+		d->cntr_cfg = kcalloc(r->mon.num_mbm_cntrs, tsize, GFP_KERNEL);
+		if (!d->cntr_cfg)
+			goto cleanup;
 	}
 
 	return 0;
+cleanup:
+	bitmap_free(d->rmid_busy_llc);
+	for_each_mbm_idx(idx) {
+		kfree(d->mbm_states[idx]);
+		d->mbm_states[idx] = NULL;
+	}
+
+	return -ENOMEM;
 }
 
 int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_domain *d)
···
 				  RESCTRL_PICK_ANY_CPU);
 	}
 
-	if (resctrl_arch_is_llc_occupancy_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID))
 		INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo);
 
 	/*
···
 		cancel_delayed_work(&d->mbm_over);
 		mbm_setup_overflow_handler(d, 0, cpu);
 	}
-	if (resctrl_arch_is_llc_occupancy_enabled() &&
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID) &&
 	    cpu == d->cqm_work_cpu && has_busy_rmid(d)) {
 		cancel_delayed_work(&d->cqm_limbo);
 		cqm_setup_limbo_handler(d, 0, cpu);
include/linux/resctrl.h (+137 -11)
···
 };
 
 /**
+ * struct mbm_cntr_cfg - Assignable counter configuration.
+ * @evtid:		MBM event to which the counter is assigned. Only valid
+ *			if @rdtgrp is not NULL.
+ * @rdtgrp:		resctrl group assigned to the counter. NULL if the
+ *			counter is free.
+ */
+struct mbm_cntr_cfg {
+	enum resctrl_event_id	evtid;
+	struct rdtgroup		*rdtgrp;
+};
+
+/**
  * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource
  * @hdr:		common header for different domain types
  * @ci_id:		cache info id for this domain
  * @rmid_busy_llc:	bitmap of which limbo RMIDs are above threshold
- * @mbm_total:		saved state for MBM total bandwidth
- * @mbm_local:		saved state for MBM local bandwidth
+ * @mbm_states:		Per-event pointer to the MBM event's saved state.
+ *			An MBM event's state is an array of struct mbm_state
+ *			indexed by RMID on x86 or combined CLOSID, RMID on Arm.
  * @mbm_over:		worker to periodically read MBM h/w counters
  * @cqm_limbo:		worker to periodically read CQM h/w counters
  * @mbm_work_cpu:	worker CPU for MBM h/w counters
  * @cqm_work_cpu:	worker CPU for CQM h/w counters
+ * @cntr_cfg:		array of assignable counters' configuration (indexed
+ *			by counter ID)
  */
 struct rdt_mon_domain {
 	struct rdt_domain_hdr	hdr;
 	unsigned int		ci_id;
 	unsigned long		*rmid_busy_llc;
-	struct mbm_state	*mbm_total;
-	struct mbm_state	*mbm_local;
+	struct mbm_state	*mbm_states[QOS_NUM_L3_MBM_EVENTS];
 	struct delayed_work	mbm_over;
 	struct delayed_work	cqm_limbo;
 	int			mbm_work_cpu;
 	int			cqm_work_cpu;
+	struct mbm_cntr_cfg	*cntr_cfg;
 };
 
 /**
···
 };
 
 /**
+ * struct resctrl_mon - Monitoring related data of a resctrl resource.
+ * @num_rmid:		Number of RMIDs available.
+ * @mbm_cfg_mask:	Memory transactions that can be tracked when bandwidth
+ *			monitoring events can be configured.
+ * @num_mbm_cntrs:	Number of assignable counters.
+ * @mbm_cntr_assignable: Is system capable of supporting counter assignment?
+ * @mbm_assign_on_mkdir: True if counters should automatically be assigned to MBM
+ *			events of monitor groups created via mkdir.
+ */
+struct resctrl_mon {
+	int		num_rmid;
+	unsigned int	mbm_cfg_mask;
+	int		num_mbm_cntrs;
+	bool		mbm_cntr_assignable;
+	bool		mbm_assign_on_mkdir;
+};
+
+/**
  * struct rdt_resource - attributes of a resctrl resource
  * @rid:		The index of the resource
  * @alloc_capable:	Is allocation available on this machine
  * @mon_capable:	Is monitor feature available on this machine
- * @num_rmid:		Number of RMIDs available
  * @ctrl_scope:		Scope of this resource for control functions
  * @mon_scope:		Scope of this resource for monitor functions
  * @cache:		Cache allocation related data
  * @membw:		If the component has bandwidth controls, their properties.
+ * @mon:		Monitoring related data.
  * @ctrl_domains:	RCU list of all control domains for this resource
  * @mon_domains:	RCU list of all monitor domains for this resource
  * @name:		Name to use in "schemata" file.
  * @schema_fmt:		Which format string and parser is used for this schema.
- * @evt_list:		List of monitoring events
- * @mbm_cfg_mask:	Bandwidth sources that can be tracked when bandwidth
- *			monitoring events can be configured.
  * @cdp_capable:	Is the CDP feature available on this resource
  */
 struct rdt_resource {
 	int			rid;
 	bool			alloc_capable;
 	bool			mon_capable;
-	int			num_rmid;
 	enum resctrl_scope	ctrl_scope;
 	enum resctrl_scope	mon_scope;
 	struct resctrl_cache	cache;
 	struct resctrl_membw	membw;
+	struct resctrl_mon	mon;
 	struct list_head	ctrl_domains;
 	struct list_head	mon_domains;
 	char			*name;
 	enum resctrl_schema_fmt	schema_fmt;
-	struct list_head	evt_list;
-	unsigned int		mbm_cfg_mask;
 	bool			cdp_capable;
 };
 
···
 u32 resctrl_arch_system_num_rmid_idx(void);
 int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 
+void resctrl_enable_mon_event(enum resctrl_event_id eventid);
+
+bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid);
+
 bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+
+static inline bool resctrl_is_mbm_event(enum resctrl_event_id eventid)
+{
+	return (eventid >= QOS_L3_MBM_TOTAL_EVENT_ID &&
+		eventid <= QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
+u32 resctrl_get_mon_evt_cfg(enum resctrl_event_id eventid);
+
+/* Iterate over all memory bandwidth events */
+#define for_each_mbm_event_id(eventid)				\
+	for (eventid = QOS_L3_MBM_TOTAL_EVENT_ID;		\
+	     eventid <= QOS_L3_MBM_LOCAL_EVENT_ID; eventid++)
+
+/* Iterate over memory bandwidth arrays in domain structures */
+#define for_each_mbm_idx(idx)					\
+	for (idx = 0; idx < QOS_NUM_L3_MBM_EVENTS; idx++)
 
 /**
  * resctrl_arch_mon_event_config_write() - Write the config for an event.
···
 bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l);
 int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
+
+/**
+ * resctrl_arch_mbm_cntr_assign_enabled() - Check if MBM counter assignment
+ *					    mode is enabled.
+ * @r:		Pointer to the resource structure.
+ *
+ * Return:
+ * true if the assignment mode is enabled, false otherwise.
+ */
+bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
+
+/**
+ * resctrl_arch_mbm_cntr_assign_set() - Configure the MBM counter assignment mode.
+ * @r:		Pointer to the resource structure.
+ * @enable:	Set to true to enable, false to disable the assignment mode.
+ *
+ * Return:
+ * 0 on success, < 0 on error.
+ */
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
 
 /*
  * Update the ctrl_val and apply this config right now.
···
  * This can be called from any CPU.
  */
 void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
+
+/**
+ * resctrl_arch_config_cntr() - Configure the counter with its new RMID
+ *				and event details.
+ * @r:		Resource structure.
+ * @d:		The domain in which the counter with ID @cntr_id should be
+ *		configured.
+ * @evtid:	Monitoring event type (e.g., QOS_L3_MBM_TOTAL_EVENT_ID or
+ *		QOS_L3_MBM_LOCAL_EVENT_ID).
+ * @rmid:	RMID.
+ * @closid:	CLOSID.
+ * @cntr_id:	Counter ID to configure.
+ * @assign:	True to assign the counter or update an existing assignment,
+ *		false to unassign the counter.
+ *
+ * This can be called from any CPU.
+ */
+void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			      enum resctrl_event_id evtid, u32 rmid, u32 closid,
+			      u32 cntr_id, bool assign);
+
+/**
+ * resctrl_arch_cntr_read() - Read the event data corresponding to the counter ID
+ *			      assigned to the RMID, event pair for this resource
+ *			      and domain.
+ * @r:		Resource that the counter should be read from.
+ * @d:		Domain that the counter should be read from.
+ * @closid:	CLOSID that matches the RMID.
+ * @rmid:	The RMID to which @cntr_id is assigned.
+ * @cntr_id:	The counter to read.
+ * @eventid:	The MBM event to which @cntr_id is assigned.
+ * @val:	Result of the counter read in bytes.
+ *
+ * Called on a CPU that belongs to domain @d when "mbm_event" mode is enabled.
+ * Called from a non-migratable process context via smp_call_on_cpu() unless all
+ * CPUs are nohz_full, in which case it is called via IPI (smp_call_function_any()).
+ *
+ * Return:
+ * 0 on success, or -EIO, -EINVAL etc on error.
+ */
+int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+			   u32 closid, u32 rmid, int cntr_id,
+			   enum resctrl_event_id eventid, u64 *val);
+
+/**
+ * resctrl_arch_reset_cntr() - Reset any private state associated with counter ID.
+ * @r:		The domain's resource.
+ * @d:		The counter ID's domain.
+ * @closid:	CLOSID that matches the RMID.
+ * @rmid:	The RMID to which @cntr_id is assigned.
+ * @cntr_id:	The counter to reset.
+ * @eventid:	The MBM event to which @cntr_id is assigned.
+ *
+ * This can be called from any CPU.
+ */
+void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     u32 closid, u32 rmid, int cntr_id,
+			     enum resctrl_event_id eventid);
 
 extern unsigned int resctrl_rmid_realloc_threshold;
 extern unsigned int resctrl_rmid_realloc_limit;
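Per the struct mbm_cntr_cfg kerneldoc above, a counter slot with `rdtgrp == NULL` is free. A hypothetical userspace sketch of how a caller might locate a counter for a (group, event) pair in the per-domain `cntr_cfg` array; `demo_cntr_find` is invented for illustration and is not the kernel's allocator:

```c
#include <assert.h>
#include <stddef.h>

enum resctrl_event_id {
	QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
};

struct rdtgroup;	/* opaque for this sketch */

/* Mirrors struct mbm_cntr_cfg from the header diff above. */
struct mbm_cntr_cfg {
	enum resctrl_event_id evtid;
	struct rdtgroup *rdtgrp;
};

/*
 * Return the counter ID already assigned to (rdtgrp, evtid), or the
 * first free slot (rdtgrp == NULL), or -1 if no counter is available.
 */
static int demo_cntr_find(struct mbm_cntr_cfg *cfg, int num_cntrs,
			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
{
	int i, free_slot = -1;

	for (i = 0; i < num_cntrs; i++) {
		if (cfg[i].rdtgrp == rdtgrp && cfg[i].evtid == evtid)
			return i;		/* existing assignment wins */
		if (!cfg[i].rdtgrp && free_slot < 0)
			free_slot = i;		/* remember first free counter */
	}
	return free_slot;
}
```

Scanning for an existing assignment before falling back to a free slot keeps the operation idempotent, which matters when counters are auto-assigned on mkdir and again via the `mbm_L3_assignments` interface.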
+14 -4
include/linux/resctrl_types.h
···
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
 
-/*
- * Event IDs, the values match those used to program IA32_QM_EVTSEL before
- * reading IA32_QM_CTR on RDT systems.
- */
+/* Number of memory transactions that an MBM event can be configured with */
+#define NUM_MBM_TRANSACTIONS		7
+
+/* Event IDs */
 enum resctrl_event_id {
+	/* Must match value of first event below */
+	QOS_FIRST_EVENT			= 0x01,
+
+	/*
+	 * These values match those used to program IA32_QM_EVTSEL before
+	 * reading IA32_QM_CTR on RDT systems.
+	 */
 	QOS_L3_OCCUP_EVENT_ID		= 0x01,
 	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
 	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
···
 	/* Must be the last */
 	QOS_NUM_EVENTS,
 };
+
+#define QOS_NUM_L3_MBM_EVENTS	(QOS_L3_MBM_LOCAL_EVENT_ID - QOS_L3_MBM_TOTAL_EVENT_ID + 1)
+#define MBM_STATE_IDX(evt)	((evt) - QOS_L3_MBM_TOTAL_EVENT_ID)
 
 #endif /* __LINUX_RESCTRL_TYPES_H */
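The new macros lean on the MBM event IDs being contiguous (0x02..0x03), so the same range drives both the event iterator and the 0-based `mbm_states[]` index. A self-contained userspace restatement of the enum and macros from the hunk above, showing how they fit together (`count_mbm_events` is invented for illustration):

```c
#include <assert.h>

/* Restated from include/linux/resctrl_types.h in the diff above. */
enum resctrl_event_id {
	QOS_FIRST_EVENT			= 0x01,
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
	QOS_NUM_EVENTS,
};

#define QOS_NUM_L3_MBM_EVENTS	(QOS_L3_MBM_LOCAL_EVENT_ID - QOS_L3_MBM_TOTAL_EVENT_ID + 1)
#define MBM_STATE_IDX(evt)	((evt) - QOS_L3_MBM_TOTAL_EVENT_ID)

#define for_each_mbm_event_id(eventid)				\
	for (eventid = QOS_L3_MBM_TOTAL_EVENT_ID;		\
	     eventid <= QOS_L3_MBM_LOCAL_EVENT_ID; eventid++)

/* Walk the MBM events and verify the event-to-state-slot mapping. */
static int count_mbm_events(void)
{
	enum resctrl_event_id evt;
	int n = 0;

	for_each_mbm_event_id(evt) {
		/* Each MBM event ID maps to a 0-based mbm_states[] slot. */
		assert(MBM_STATE_IDX(evt) == n);
		n++;
	}
	return n;
}
```

With the current two MBM events, `QOS_NUM_L3_MBM_EVENTS` is 2 and `MBM_STATE_IDX()` maps total to slot 0 and local to slot 1, matching the `mbm_states[QOS_NUM_L3_MBM_EVENTS]` array added to struct rdt_mon_domain.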