Merge tag 'drm-xe-next-2025-08-29' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

+40 -7

Documentation/gpu/drm-uapi.rst

···
 Recovery
 --------
 
-Current implementation defines three recovery methods, out of which, drivers
+Current implementation defines four recovery methods, out of which, drivers
 can use any one, multiple or none. Method(s) of choice will be sent in the
 uevent environment as ``WEDGED=<method1>[,..,<methodN>]`` in order of less to
-more side-effects. If driver is unsure about recovery or method is unknown
-(like soft/hard system reboot, firmware flashing, physical device replacement
-or any other procedure which can't be attempted on the fly), ``WEDGED=unknown``
-will be sent instead.
+more side-effects. See the section `Vendor Specific Recovery`_
+for ``WEDGED=vendor-specific``. If driver is unsure about recovery or
+method is unknown, ``WEDGED=unknown`` will be sent instead.
 
 Userspace consumers can parse this event and attempt recovery as per the
 following expectations.
···
     none            optional telemetry collection
     rebind          unbind + bind driver
     bus-reset       unbind + bus reset/re-enumeration + bind
+    vendor-specific vendor specific recovery method
     unknown         consumer policy
     =============== ========================================
 
···
 telemetry information (devcoredump, syslog). This is useful because the first
 hang is usually the most critical one which can result in consequential hangs or
 complete wedging.
+
+
+Vendor Specific Recovery
+------------------------
+
+When ``WEDGED=vendor-specific`` is sent, it indicates that the device requires
+a recovery procedure specific to the hardware vendor and is not one of the
+standardized approaches.
+
+``WEDGED=vendor-specific`` may be used to indicate different cases within a
+single vendor driver, each requiring a distinct recovery procedure.
+In such scenarios, the vendor driver must provide comprehensive documentation
+that describes each case, include additional hints to identify specific case and
+outline the corresponding recovery procedure. The documentation includes:
+
+Case - A list of all cases that sends the ``WEDGED=vendor-specific`` recovery method.
+
+Hints - Additional Information to assist the userspace consumer in identifying and
+differentiating between different cases. This can be exposed through sysfs, debugfs,
+traces, dmesg etc.
+
+Recovery Procedure - Clear instructions and guidance for recovering each case.
+This may include userspace scripts, tools needed for the recovery procedure.
+
+It is the responsibility of the admin/userspace consumer to identify the case and
+verify additional identification hints before attempting a recovery procedure.
+
+Example: If the device uses the Xe driver, then userspace consumer should refer to
+:ref:`Xe Device Wedging <xe-device-wedging>` for the detailed documentation.
 
 Task information
 ----------------
···
 be closed to prevent leaks or undefined behaviour. The idea here is to clear the
 device of all user context beforehand and set the stage for a clean recovery.
 
-Example
--------
+For ``WEDGED=vendor-specific`` recovery method, it is the responsibility of the
+consumer to check the driver documentation and the usecase before attempting
+a recovery.
+
+Example - rebind
+----------------
 
 Udev rule::
 
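The documentation hunk above says the recovery methods arrive as ``WEDGED=<method1>[,..,<methodN>]``. A minimal userspace sketch of splitting that value into flags could look like the following. The enum, the function names and the choice to fold unrecognised tokens into the "unknown" bucket are assumptions made for illustration; none of them come from the kernel tree or from this merge.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit flags, one per method string named in drm-uapi.rst. */
enum wedged_method {
	WEDGED_NONE      = 1 << 0,
	WEDGED_REBIND    = 1 << 1,
	WEDGED_BUS_RESET = 1 << 2,
	WEDGED_VENDOR    = 1 << 3,
	WEDGED_UNKNOWN   = 1 << 4,
};

/* Map one token (length-delimited, not NUL-terminated) to a flag; 0 if unrecognised. */
static unsigned int wedged_token_to_flag(const char *tok, size_t len)
{
	static const struct {
		const char *name;
		unsigned int flag;
	} table[] = {
		{ "none", WEDGED_NONE },
		{ "rebind", WEDGED_REBIND },
		{ "bus-reset", WEDGED_BUS_RESET },
		{ "vendor-specific", WEDGED_VENDOR },
		{ "unknown", WEDGED_UNKNOWN },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strlen(table[i].name) == len &&
		    memcmp(tok, table[i].name, len) == 0)
			return table[i].flag;
	}
	return 0;
}

/*
 * Parse the value part of WEDGED=..., e.g. "rebind,bus-reset", into a mask.
 * Assumption of this sketch: a token we do not recognise is treated like
 * "unknown", i.e. left to consumer policy.
 */
unsigned int wedged_parse(const char *value)
{
	unsigned int mask = 0;

	assert(value != NULL);
	while (*value) {
		size_t len = strcspn(value, ",");
		unsigned int flag = wedged_token_to_flag(value, len);

		mask |= flag ? flag : WEDGED_UNKNOWN;
		value += len;
		if (*value == ',')
			value++;
	}
	return mask;
}
```

A consumer that sees `WEDGED_VENDOR` in the mask would, per the text above, stop and consult the driver documentation before attempting any recovery.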

+1

Documentation/gpu/xe/index.rst

···
    xe_tile
    xe_debugging
    xe_devcoredump
+   xe_device
    xe-drm-usage-stats.rst
    xe_configfs

+10

Documentation/gpu/xe/xe_device.rst

···
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+.. _xe-device-wedging:
+
+==================
+Xe Device Wedging
+==================
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_device.c
+   :doc: Xe Device Wedging

+4 -2

Documentation/gpu/xe/xe_pcode.rst

···
 .. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
    :internal:
 
+.. _xe-survivability-mode:
+
 ==================
-Boot Survivability
+Survivability Mode
 ==================
 
 .. kernel-doc:: drivers/gpu/drm/xe/xe_survivability_mode.c
-   :doc: Xe Boot Survivability
+   :doc: Survivability Mode

+2

drivers/gpu/drm/drm_drv.c

···
 		return "rebind";
 	case DRM_WEDGE_RECOVERY_BUS_RESET:
 		return "bus-reset";
+	case DRM_WEDGE_RECOVERY_VENDOR:
+		return "vendor-specific";
 	default:
 		return NULL;
 	}

+2 -2

drivers/gpu/drm/drm_gpusvm.c

···
 	};
 
 	for (i = 0, j = 0; i < npages; j++) {
-		struct drm_pagemap_device_addr *addr = &range->dma_addr[j];
+		struct drm_pagemap_addr *addr = &range->dma_addr[j];
 
 		if (addr->proto == DRM_INTERCONNECT_SYSTEM)
 			dma_unmap_page(dev,
···
 				goto err_unmap;
 			}
 
-			range->dma_addr[j] = drm_pagemap_device_addr_encode
+			range->dma_addr[j] = drm_pagemap_addr_encode
 				(addr, DRM_INTERCONNECT_SYSTEM, order,
 				 DMA_BIDIRECTIONAL);
 		}

+93 -49

drivers/gpu/drm/drm_pagemap.c

···
 /**
  * drm_pagemap_migrate_map_pages() - Map migration pages for GPU SVM migration
  * @dev: The device for which the pages are being mapped
- * @dma_addr: Array to store DMA addresses corresponding to mapped pages
+ * @pagemap_addr: Array to store DMA information corresponding to mapped pages
  * @migrate_pfn: Array of migrate page frame numbers to map
  * @npages: Number of pages to map
  * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
···
  * Returns: 0 on success, -EFAULT if an error occurs during mapping.
  */
 static int drm_pagemap_migrate_map_pages(struct device *dev,
-					 dma_addr_t *dma_addr,
+					 struct drm_pagemap_addr *pagemap_addr,
 					 unsigned long *migrate_pfn,
 					 unsigned long npages,
 					 enum dma_data_direction dir)
 {
 	unsigned long i;
 
-	for (i = 0; i < npages; ++i) {
+	for (i = 0; i < npages;) {
 		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
+		dma_addr_t dma_addr;
+		struct folio *folio;
+		unsigned int order = 0;
 
 		if (!page)
-			continue;
+			goto next;
 
 		if (WARN_ON_ONCE(is_zone_device_page(page)))
 			return -EFAULT;
 
-		dma_addr[i] = dma_map_page(dev, page, 0, PAGE_SIZE, dir);
-		if (dma_mapping_error(dev, dma_addr[i]))
+		folio = page_folio(page);
+		order = folio_order(folio);
+
+		dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
+		if (dma_mapping_error(dev, dma_addr))
 			return -EFAULT;
+
+		pagemap_addr[i] =
+			drm_pagemap_addr_encode(dma_addr,
+						DRM_INTERCONNECT_SYSTEM,
+						order, dir);
+
+next:
+		i += NR_PAGES(order);
 	}
 
 	return 0;
···
 /**
  * drm_pagemap_migrate_unmap_pages() - Unmap pages previously mapped for GPU SVM migration
  * @dev: The device for which the pages were mapped
- * @dma_addr: Array of DMA addresses corresponding to mapped pages
+ * @pagemap_addr: Array of DMA information corresponding to mapped pages
  * @npages: Number of pages to unmap
  * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
  *
···
  * if it's valid and not already unmapped, and unmaps the corresponding page.
  */
 static void drm_pagemap_migrate_unmap_pages(struct device *dev,
-					    dma_addr_t *dma_addr,
+					    struct drm_pagemap_addr *pagemap_addr,
 					    unsigned long npages,
 					    enum dma_data_direction dir)
 {
 	unsigned long i;
 
-	for (i = 0; i < npages; ++i) {
-		if (!dma_addr[i] || dma_mapping_error(dev, dma_addr[i]))
-			continue;
+	for (i = 0; i < npages;) {
+		if (!pagemap_addr[i].addr || dma_mapping_error(dev, pagemap_addr[i].addr))
+			goto next;
 
-		dma_unmap_page(dev, dma_addr[i], PAGE_SIZE, dir);
+		dma_unmap_page(dev, pagemap_addr[i].addr, PAGE_SIZE << pagemap_addr[i].order, dir);
+
+next:
+		i += NR_PAGES(pagemap_addr[i].order);
 	}
 }
 
···
 	struct vm_area_struct *vas;
 	struct drm_pagemap_zdd *zdd = NULL;
 	struct page **pages;
-	dma_addr_t *dma_addr;
+	struct drm_pagemap_addr *pagemap_addr;
 	void *buf;
 	int err;
 
···
 		goto err_out;
 	}
 
-	buf = kvcalloc(npages, 2 * sizeof(*migrate.src) + sizeof(*dma_addr) +
+	buf = kvcalloc(npages, 2 * sizeof(*migrate.src) + sizeof(*pagemap_addr) +
 		       sizeof(*pages), GFP_KERNEL);
 	if (!buf) {
 		err = -ENOMEM;
 		goto err_out;
 	}
-	dma_addr = buf + (2 * sizeof(*migrate.src) * npages);
-	pages = buf + (2 * sizeof(*migrate.src) + sizeof(*dma_addr)) * npages;
+	pagemap_addr = buf + (2 * sizeof(*migrate.src) * npages);
+	pages = buf + (2 * sizeof(*migrate.src) + sizeof(*pagemap_addr)) * npages;
 
 	zdd = drm_pagemap_zdd_alloc(pgmap_owner);
 	if (!zdd) {
···
 	if (err)
 		goto err_finalize;
 
-	err = drm_pagemap_migrate_map_pages(devmem_allocation->dev, dma_addr,
+	err = drm_pagemap_migrate_map_pages(devmem_allocation->dev, pagemap_addr,
 					    migrate.src, npages, DMA_TO_DEVICE);
+
 	if (err)
 		goto err_finalize;
 
···
 		drm_pagemap_get_devmem_page(page, zdd);
 	}
 
-	err = ops->copy_to_devmem(pages, dma_addr, npages);
+	err = ops->copy_to_devmem(pages, pagemap_addr, npages);
 	if (err)
 		goto err_finalize;
 
···
 	drm_pagemap_migration_unlock_put_pages(npages, migrate.dst);
 	migrate_vma_pages(&migrate);
 	migrate_vma_finalize(&migrate);
-	drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, dma_addr, npages,
+	drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, pagemap_addr, npages,
 					DMA_TO_DEVICE);
 err_free:
 	if (zdd)
···
 {
 	unsigned long i;
 
-	for (i = 0; i < npages; ++i, addr += PAGE_SIZE) {
-		struct page *page, *src_page;
+	for (i = 0; i < npages;) {
+		struct page *page = NULL, *src_page;
+		struct folio *folio;
+		unsigned int order = 0;
 
 		if (!(src_mpfn[i] & MIGRATE_PFN_MIGRATE))
-			continue;
+			goto next;
 
 		src_page = migrate_pfn_to_page(src_mpfn[i]);
 		if (!src_page)
-			continue;
+			goto next;
 
 		if (fault_page) {
 			if (src_page->zone_device_data !=
 			    fault_page->zone_device_data)
-				continue;
+				goto next;
 		}
 
-		if (vas)
-			page = alloc_page_vma(GFP_HIGHUSER, vas, addr);
-		else
-			page = alloc_page(GFP_HIGHUSER);
+		order = folio_order(page_folio(src_page));
 
-		if (!page)
+		/* TODO: Support fallback to single pages if THP allocation fails */
+		if (vas)
+			folio = vma_alloc_folio(GFP_HIGHUSER, order, vas, addr);
+		else
+			folio = folio_alloc(GFP_HIGHUSER, order);
+
+		if (!folio)
 			goto free_pages;
 
+		page = folio_page(folio, 0);
 		mpfn[i] = migrate_pfn(page_to_pfn(page));
+
+next:
+		if (page)
+			addr += page_size(page);
+		else
+			addr += PAGE_SIZE;
+
+		i += NR_PAGES(order);
 	}
 
-	for (i = 0; i < npages; ++i) {
+	for (i = 0; i < npages;) {
 		struct page *page = migrate_pfn_to_page(mpfn[i]);
+		unsigned int order = 0;
 
 		if (!page)
-			continue;
+			goto next_lock;
 
-		WARN_ON_ONCE(!trylock_page(page));
-		++*mpages;
+		WARN_ON_ONCE(!folio_trylock(page_folio(page)));
+
+		order = folio_order(page_folio(page));
+		*mpages += NR_PAGES(order);
+
+next_lock:
+		i += NR_PAGES(order);
 	}
 
 	return 0;
 
 free_pages:
-	for (i = 0; i < npages; ++i) {
+	for (i = 0; i < npages;) {
 		struct page *page = migrate_pfn_to_page(mpfn[i]);
+		unsigned int order = 0;
 
 		if (!page)
-			continue;
+			goto next_put;
 
 		put_page(page);
 		mpfn[i] = 0;
+
+		order = folio_order(page_folio(page));
+
+next_put:
+		i += NR_PAGES(order);
 	}
 	return -ENOMEM;
 }
···
 	unsigned long npages, mpages = 0;
 	struct page **pages;
 	unsigned long *src, *dst;
-	dma_addr_t *dma_addr;
+	struct drm_pagemap_addr *pagemap_addr;
 	void *buf;
 	int i, err = 0;
 	unsigned int retry_count = 2;
···
 	if (!mmget_not_zero(devmem_allocation->mm))
 		return -EFAULT;
 
-	buf = kvcalloc(npages, 2 * sizeof(*src) + sizeof(*dma_addr) +
+	buf = kvcalloc(npages, 2 * sizeof(*src) + sizeof(*pagemap_addr) +
 		       sizeof(*pages), GFP_KERNEL);
 	if (!buf) {
 		err = -ENOMEM;
···
 	}
 	src = buf;
 	dst = buf + (sizeof(*src) * npages);
-	dma_addr = buf + (2 * sizeof(*src) * npages);
-	pages = buf + (2 * sizeof(*src) + sizeof(*dma_addr)) * npages;
+	pagemap_addr = buf + (2 * sizeof(*src) * npages);
+	pages = buf + (2 * sizeof(*src) + sizeof(*pagemap_addr)) * npages;
 
 	err = ops->populate_devmem_pfn(devmem_allocation, npages, src);
 	if (err)
···
 	if (err || !mpages)
 		goto err_finalize;
 
-	err = drm_pagemap_migrate_map_pages(devmem_allocation->dev, dma_addr,
+	err = drm_pagemap_migrate_map_pages(devmem_allocation->dev, pagemap_addr,
 					    dst, npages, DMA_FROM_DEVICE);
 	if (err)
 		goto err_finalize;
···
 	for (i = 0; i < npages; ++i)
 		pages[i] = migrate_pfn_to_page(src[i]);
 
-	err = ops->copy_to_ram(pages, dma_addr, npages);
+	err = ops->copy_to_ram(pages, pagemap_addr, npages);
 	if (err)
 		goto err_finalize;
 
···
 	drm_pagemap_migration_unlock_put_pages(npages, dst);
 	migrate_device_pages(src, dst, npages);
 	migrate_device_finalize(src, dst, npages);
-	drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, dma_addr, npages,
+	drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, pagemap_addr, npages,
 					DMA_FROM_DEVICE);
 err_free:
 	kvfree(buf);
···
 	struct device *dev = NULL;
 	unsigned long npages, mpages = 0;
 	struct page **pages;
-	dma_addr_t *dma_addr;
+	struct drm_pagemap_addr *pagemap_addr;
 	unsigned long start, end;
 	void *buf;
 	int i, err = 0;
···
 	migrate.end = end;
 	npages = npages_in_range(start, end);
 
-	buf = kvcalloc(npages, 2 * sizeof(*migrate.src) + sizeof(*dma_addr) +
+	buf = kvcalloc(npages, 2 * sizeof(*migrate.src) + sizeof(*pagemap_addr) +
 		       sizeof(*pages), GFP_KERNEL);
 	if (!buf) {
 		err = -ENOMEM;
 		goto err_out;
 	}
-	dma_addr = buf + (2 * sizeof(*migrate.src) * npages);
-	pages = buf + (2 * sizeof(*migrate.src) + sizeof(*dma_addr)) * npages;
+	pagemap_addr = buf + (2 * sizeof(*migrate.src) * npages);
+	pages = buf + (2 * sizeof(*migrate.src) + sizeof(*pagemap_addr)) * npages;
 
 	migrate.vma = vas;
 	migrate.src = buf;
···
 	if (err)
 		goto err_finalize;
 
-	err = drm_pagemap_migrate_map_pages(dev, dma_addr, migrate.dst, npages,
+	err = drm_pagemap_migrate_map_pages(dev, pagemap_addr, migrate.dst, npages,
 					    DMA_FROM_DEVICE);
 	if (err)
 		goto err_finalize;
···
 	for (i = 0; i < npages; ++i)
 		pages[i] = migrate_pfn_to_page(migrate.src[i]);
 
-	err = ops->copy_to_ram(pages, dma_addr, npages);
+	err = ops->copy_to_ram(pages, pagemap_addr, npages);
 	if (err)
 		goto err_finalize;
 
···
 	migrate_vma_pages(&migrate);
 	migrate_vma_finalize(&migrate);
 	if (dev)
-		drm_pagemap_migrate_unmap_pages(dev, pagemap_addr, npages,
+		drm_pagemap_migrate_unmap_pages(dev, pagemap_addr, npages,
 						DMA_FROM_DEVICE);
 err_free:
 	kvfree(buf);
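The recurring change in the drm_pagemap.c hunks above is that loops no longer advance one base page at a time. They visit the head index of each mapping and then skip ahead by the number of pages that mapping covers. The sketch below isolates that stride pattern in plain C. `struct mapped_entry`, `nr_pages()` and `count_mappings()` are hypothetical stand-ins written for this illustration, and the assumption that `NR_PAGES(order)` means `1 << order` is not confirmed by anything in this diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-index entries used in the diff. */
struct mapped_entry {
	unsigned long addr;	/* 0 means nothing is mapped at this index */
	unsigned int order;	/* mapping covers (1 << order) base pages */
};

/* Assumed meaning of NR_PAGES(order): base pages covered by one mapping. */
static unsigned long nr_pages(unsigned int order)
{
	return 1UL << order;
}

/*
 * Walk an array indexed by base page, touching only the head index of each
 * mapping. Returns the number of mappings seen and stores the total number
 * of base pages they cover in *pages_covered.
 */
size_t count_mappings(const struct mapped_entry *e, unsigned long npages,
		      unsigned long *pages_covered)
{
	size_t n = 0;
	unsigned long i;

	assert(e != NULL && pages_covered != NULL);
	*pages_covered = 0;

	for (i = 0; i < npages;) {
		unsigned int order = 0;	/* a hole advances by one base page */

		if (e[i].addr) {
			order = e[i].order;
			n++;
			*pages_covered += nr_pages(order);
		}
		i += nr_pages(order);
	}
	return n;
}
```

Entries at the tail indices of a large mapping are never read, which is why the kernel loops can leave those slots zeroed.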

+9 -1

drivers/gpu/drm/xe/Makefile

···
 xe-y += xe_bb.o \
 	xe_bo.o \
 	xe_bo_evict.o \
+	xe_dep_scheduler.o \
 	xe_devcoredump.o \
 	xe_device.o \
 	xe_device_sysfs.o \
···
 	xe_gt_pagefault.o \
 	xe_gt_sysfs.o \
 	xe_gt_throttle.o \
-	xe_gt_tlb_invalidation.o \
 	xe_gt_topology.o \
 	xe_guc.o \
 	xe_guc_ads.o \
···
 	xe_guc_log.o \
 	xe_guc_pc.o \
 	xe_guc_submit.o \
+	xe_guc_tlb_inval.o \
 	xe_heci_gsc.o \
 	xe_huc.o \
 	xe_hw_engine.o \
 	xe_hw_engine_class_sysfs.o \
 	xe_hw_engine_group.o \
+	xe_hw_error.o \
 	xe_hw_fence.o \
 	xe_irq.o \
 	xe_lrc.o \
 	xe_migrate.o \
 	xe_mmio.o \
+	xe_mmio_gem.o \
 	xe_mocs.o \
 	xe_module.o \
 	xe_nvm.o \
···
 	xe_pcode.o \
 	xe_pm.o \
 	xe_preempt_fence.o \
+	xe_psmi.o \
 	xe_pt.o \
 	xe_pt_walk.o \
 	xe_pxp.o \
···
 	xe_sync.o \
 	xe_tile.o \
 	xe_tile_sysfs.o \
+	xe_tlb_inval.o \
+	xe_tlb_inval_job.o \
 	xe_trace.o \
 	xe_trace_bo.o \
 	xe_trace_guc.o \
···
 	xe_uc.o \
 	xe_uc_fw.o \
 	xe_vm.o \
+	xe_vm_madvise.o \
 	xe_vram.o \
 	xe_vram_freq.o \
 	xe_vsec.o \
···
 	xe_memirq.o \
 	xe_sriov.o \
 	xe_sriov_vf.o \
+	xe_sriov_vf_ccs.o \
 	xe_tile_sriov_vf.o
 
 xe-$(CONFIG_PCI_IOV) += \

+8

drivers/gpu/drm/xe/abi/guc_actions_abi.h

···
 	XE_GUC_REGISTER_CONTEXT_MULTI_LRC_MSG_MIN_LEN = 11,
 };
 
+enum xe_guc_context_wq_item_offsets {
+	XE_GUC_CONTEXT_WQ_HEADER_DATA_0_TYPE_LEN = 0,
+	XE_GUC_CONTEXT_WQ_EL_INFO_DATA_1_CTX_DESC_LOW,
+	XE_GUC_CONTEXT_WQ_EL_INFO_DATA_2_GUCCTX_RINGTAIL_FREEZEPOCS,
+	XE_GUC_CONTEXT_WQ_EL_INFO_DATA_3_WI_FENCE_ID,
+	XE_GUC_CONTEXT_WQ_EL_CHILD_LIST_DATA_4_RINGTAIL,
+};
+
 enum xe_guc_report_status {
 	XE_GUC_REPORT_STATUS_UNKNOWN = 0x0,
 	XE_GUC_REPORT_STATUS_ACKED = 0x1,

+3

drivers/gpu/drm/xe/abi/guc_errors_abi.h

···
 	XE_GUC_LOAD_STATUS_HWCONFIG_START = 0x05,
 	XE_GUC_LOAD_STATUS_HWCONFIG_DONE = 0x06,
 	XE_GUC_LOAD_STATUS_HWCONFIG_ERROR = 0x07,
+	XE_GUC_LOAD_STATUS_BOOTROM_VERSION_MISMATCH = 0x08,
 	XE_GUC_LOAD_STATUS_GDT_DONE = 0x10,
 	XE_GUC_LOAD_STATUS_IDT_DONE = 0x20,
 	XE_GUC_LOAD_STATUS_LAPIC_DONE = 0x30,
···
 	XE_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
 	XE_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
 	XE_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID = 0x74,
+	XE_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR = 0x75,
+	XE_GUC_LOAD_STATUS_INVALID_FTR_FLAG = 0x76,
 	XE_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
 
 	XE_GUC_LOAD_STATUS_READY = 0xF0,

+2

drivers/gpu/drm/xe/abi/guc_klvs_abi.h

···
  */
 enum xe_guc_klv_ids {
 	GUC_WORKAROUND_KLV_BLOCK_INTERRUPTS_WHEN_MGSR_BLOCKED = 0x9002,
+	GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT = 0x9004,
 	GUC_WORKAROUND_KLV_ID_GAM_PFQ_SHADOW_TAIL_POLLING = 0x9005,
 	GUC_WORKAROUND_KLV_ID_DISABLE_MTP_DURING_ASYNC_COMPUTE = 0x9007,
 	GUC_WA_KLV_NP_RD_WRITE_TO_CLEAR_RCSM_AT_CGP_LATE_RESTORE = 0x9008,
 	GUC_WORKAROUND_KLV_ID_BACK_TO_BACK_RCS_ENGINE_RESET = 0x9009,
 	GUC_WA_KLV_WAKE_POWER_DOMAINS_FOR_OUTBOUND_MMIO = 0x900a,
 	GUC_WA_KLV_RESET_BB_STACK_PTR_ON_VF_SWITCH = 0x900b,
+	GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG = 0x900c,
 };
 
 #endif

+1 -1

drivers/gpu/drm/xe/display/intel_fbdev_fb.c

···
 	size = PAGE_ALIGN(size);
 	obj = ERR_PTR(-ENODEV);
 
-	if (!IS_DGFX(xe) && !XE_WA(xe_root_mmio_gt(xe), 22019338487_display)) {
+	if (!IS_DGFX(xe) && !XE_GT_WA(xe_root_mmio_gt(xe), 22019338487_display)) {
 		obj = xe_bo_create_pin_map(xe, xe_device_get_root_tile(xe),
 					   NULL, size,
 					   ttm_bo_type_kernel, XE_BO_FLAG_SCANOUT |

+1 -1

drivers/gpu/drm/xe/display/xe_display_wa.c

···
 {
 	struct xe_device *xe = to_xe_device(display->drm);
 
-	return XE_WA(xe_root_mmio_gt(xe), 16023588340);
+	return XE_GT_WA(xe_root_mmio_gt(xe), 16023588340);
 }

+3 -2

drivers/gpu/drm/xe/display/xe_fb_pin.c

···
 #include "xe_device.h"
 #include "xe_ggtt.h"
 #include "xe_pm.h"
+#include "xe_vram_types.h"
 
 static void
 write_dpt_rotated(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs, u32 bo_ofs,
···
 	if (IS_DGFX(to_xe_device(bo->ttm.base.dev)) &&
 	    intel_fb_rc_ccs_cc_plane(&fb->base) >= 0 &&
 	    !(bo->flags & XE_BO_FLAG_NEEDS_CPU_ACCESS)) {
-		struct xe_tile *tile = xe_device_get_root_tile(xe);
+		struct xe_vram_region *vram = xe_device_get_root_tile(xe)->mem.vram;
 
 		/*
 		 * If we need to able to access the clear-color value stored in
···
 		 * accessible. This is important on small-bar systems where
 		 * only some subset of VRAM is CPU accessible.
 		 */
-		if (tile->mem.vram.io_size < tile->mem.vram.usable_size) {
+		if (xe_vram_region_io_size(vram) < xe_vram_region_usable_size(vram)) {
 			ret = -EINVAL;
 			goto err;
 		}

+3 -2

drivers/gpu/drm/xe/display/xe_plane_initial.c

···
 #include "intel_plane.h"
 #include "intel_plane_initial.h"
 #include "xe_bo.h"
+#include "xe_vram_types.h"
 #include "xe_wa.h"
 
 #include <generated/xe_wa_oob.h>
···
 		 * We don't currently expect this to ever be placed in the
 		 * stolen portion.
 		 */
-		if (phys_base >= tile0->mem.vram.usable_size) {
+		if (phys_base >= xe_vram_region_usable_size(tile0->mem.vram)) {
 			drm_err(&xe->drm,
 				"Initial plane programming using invalid range, phys_base=%pa\n",
 				&phys_base);
···
 		phys_base = base;
 		flags |= XE_BO_FLAG_STOLEN;
 
-		if (XE_WA(xe_root_mmio_gt(xe), 22019338487_display))
+		if (XE_GT_WA(xe_root_mmio_gt(xe), 22019338487_display))
 			return NULL;
 
 		/*

+1

drivers/gpu/drm/xe/instructions/xe_mi_commands.h

···
 
 #define MI_LOAD_REGISTER_MEM		(__MI_INSTR(0x29) | XE_INSTR_NUM_DW(4))
 #define   MI_LRM_USE_GGTT		REG_BIT(22)
+#define   MI_LRM_ASYNC			REG_BIT(21)
 
 #define MI_LOAD_REGISTER_REG		(__MI_INSTR(0x2a) | XE_INSTR_NUM_DW(3))
 #define   MI_LRR_DST_CS_MMIO		REG_BIT(19)

+3

drivers/gpu/drm/xe/regs/xe_engine_regs.h

···
 #define   PPHWSP_CSB_AND_TIMESTAMP_REPORT_DIS	REG_BIT(14)
 #define   CS_PRIORITY_MEM_READ			REG_BIT(7)
 
+#define CS_DEBUG_MODE2(base)			XE_REG((base) + 0xd8, XE_REG_OPTION_MASKED)
+#define   INSTRUCTION_STATE_CACHE_INVALIDATE	REG_BIT(6)
+
 #define FF_SLICE_CS_CHICKEN1(base)		XE_REG((base) + 0xe0, XE_REG_OPTION_MASKED)
 #define   FFSC_PERCTX_PREEMPT_CTRL		REG_BIT(14)
 

+2

drivers/gpu/drm/xe/regs/xe_gsc_regs.h

···
 
 /* Definitions of GSC H/W registers, bits, etc */
 
+#define BMG_GSC_HECI1_BASE	0x373000
+
 #define MTL_GSC_HECI1_BASE	0x00116000
 #define MTL_GSC_HECI2_BASE	0x00117000
 

+1 -1

drivers/gpu/drm/xe/regs/xe_gt_regs.h

···
 #define FORCEWAKE_ACK_GSC			XE_REG(0xdf8)
 #define FORCEWAKE_ACK_GT_MTL			XE_REG(0xdfc)
 
-#define MCFG_MCR_SELECTOR			XE_REG(0xfd0)
+#define STEER_SEMAPHORE				XE_REG(0xfd0)
 #define MTL_MCR_SELECTOR			XE_REG(0xfd4)
 #define SF_MCR_SELECTOR				XE_REG(0xfd8)
 #define MCR_SELECTOR				XE_REG(0xfdc)

+20

drivers/gpu/drm/xe/regs/xe_hw_error_regs.h

···
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_HW_ERROR_REGS_H_
+#define _XE_HW_ERROR_REGS_H_
+
+#define HEC_UNCORR_ERR_STATUS(base)	XE_REG((base) + 0x118)
+#define   UNCORR_FW_REPORTED_ERR	BIT(6)
+
+#define HEC_UNCORR_FW_ERR_DW0(base)	XE_REG((base) + 0x124)
+
+#define DEV_ERR_STAT_NONFATAL		0x100178
+#define DEV_ERR_STAT_CORRECTABLE	0x10017c
+#define DEV_ERR_STAT_REG(x)		XE_REG(_PICK_EVEN((x), \
+						  DEV_ERR_STAT_CORRECTABLE, \
+						  DEV_ERR_STAT_NONFATAL))
+#define   XE_CSC_ERROR			BIT(17)
+#endif
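`DEV_ERR_STAT_REG(x)` in the header above selects a register by index through `_PICK_EVEN`, whose definition is not part of this diff. Assuming it computes an evenly spaced offset, `a + index * (b - a)`, the arithmetic can be sketched as below. `PICK_EVEN` and `dev_err_stat_offset()` are hypothetical names for this illustration, and which severity each index denotes is inferred only from the order of the two macro arguments.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets copied from the xe_hw_error_regs.h hunk. */
#define DEV_ERR_STAT_NONFATAL		0x100178
#define DEV_ERR_STAT_CORRECTABLE	0x10017c

/*
 * Assumed shape of the kernel helper: evenly spaced selection.
 * The real _PICK_EVEN is defined elsewhere in the tree.
 */
#define PICK_EVEN(index, a, b)	((a) + (index) * ((b) - (a)))

/*
 * Offset of the error status register for a given index. Only the two
 * indices whose offsets are named in the header are covered by this sketch.
 */
uint32_t dev_err_stat_offset(unsigned int index)
{
	assert(index <= 1);
	return (uint32_t)PICK_EVEN((int)index, DEV_ERR_STAT_CORRECTABLE,
				   DEV_ERR_STAT_NONFATAL);
}
```

Because the second argument is the larger address, the stride here is negative (minus 4 bytes per index): index 0 yields the correctable offset and index 1 the non-fatal one.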

+1

drivers/gpu/drm/xe/regs/xe_irq_regs.h

···
 #define GFX_MSTR_IRQ		XE_REG(0x190010, XE_REG_OPTION_VF)
 #define   MASTER_IRQ		REG_BIT(31)
 #define   GU_MISC_IRQ		REG_BIT(29)
+#define   ERROR_IRQ(x)		REG_BIT(26 + (x))
 #define   DISPLAY_IRQ		REG_BIT(16)
 #define   I2C_IRQ		REG_BIT(12)
 #define   GT_DW_IRQ(x)		REG_BIT(x)

+10

drivers/gpu/drm/xe/regs/xe_pmt.h

···
 #define SG_REMAP_INDEX1			XE_REG(SOC_BASE + 0x08)
 #define   SG_REMAP_BITS			REG_GENMASK(31, 24)
 
+#define BMG_MODS_RESIDENCY_OFFSET		(0x4D0)
+#define BMG_G2_RESIDENCY_OFFSET			(0x530)
+#define BMG_G6_RESIDENCY_OFFSET			(0x538)
+#define BMG_G8_RESIDENCY_OFFSET			(0x540)
+#define BMG_G10_RESIDENCY_OFFSET		(0x548)
+
+#define BMG_PCIE_LINK_L0_RESIDENCY_OFFSET	(0x570)
+#define BMG_PCIE_LINK_L1_RESIDENCY_OFFSET	(0x578)
+#define BMG_PCIE_LINK_L1_2_RESIDENCY_OFFSET	(0x580)
+
 #endif

+6 -7

drivers/gpu/drm/xe/tests/xe_dma_buf.c

···
 		return;
 
 	/*
-	 * Evict exporter. Note that the gem object dma_buf member isn't
-	 * set from xe_gem_prime_export(), and it's needed for the move_notify()
-	 * functionality, so hack that up here. Evicting the exported bo will
+	 * Evict exporter. Evicting the exported bo will
 	 * evict also the imported bo through the move_notify() functionality if
 	 * importer is on a different device. If they're on the same device,
 	 * the exporter and the importer should be the same bo.
 	 */
-	swap(exported->ttm.base.dma_buf, dmabuf);
 	ret = xe_bo_evict(exported);
-	swap(exported->ttm.base.dma_buf, dmabuf);
 	if (ret) {
 		if (ret != -EINTR && ret != -ERESTARTSYS)
 			KUNIT_FAIL(test, "Evicting exporter failed with err=%d.\n",
···
 			   PTR_ERR(dmabuf));
 		goto out;
 	}
+	bo->ttm.base.dma_buf = dmabuf;
 
 	import = xe_gem_prime_import(&xe->drm, dmabuf);
 	if (!IS_ERR(import)) {
···
 		KUNIT_FAIL(test, "dynamic p2p attachment failed with err=%ld\n",
 			   PTR_ERR(import));
 	}
+	bo->ttm.base.dma_buf = NULL;
 	dma_buf_put(dmabuf);
 out:
 	drm_gem_object_put(&bo->ttm.base);
···
 static const struct dma_buf_test_params test_params[] = {
 	{.mem_mask = XE_BO_FLAG_VRAM0,
 	 .attach_ops = &xe_dma_buf_attach_ops},
-	{.mem_mask = XE_BO_FLAG_VRAM0,
+	{.mem_mask = XE_BO_FLAG_VRAM0 | XE_BO_FLAG_NEEDS_CPU_ACCESS,
 	 .attach_ops = &xe_dma_buf_attach_ops,
 	 .force_different_devices = true},
 
···
 
 	{.mem_mask = XE_BO_FLAG_SYSTEM | XE_BO_FLAG_VRAM0,
 	 .attach_ops = &xe_dma_buf_attach_ops},
-	{.mem_mask = XE_BO_FLAG_SYSTEM | XE_BO_FLAG_VRAM0,
+	{.mem_mask = XE_BO_FLAG_SYSTEM | XE_BO_FLAG_VRAM0 |
+		     XE_BO_FLAG_NEEDS_CPU_ACCESS,
 	 .attach_ops = &xe_dma_buf_attach_ops,
 	 .force_different_devices = true},
 

+7

drivers/gpu/drm/xe/tests/xe_pci.c

···
 	}
 }
 
+static void fake_xe_info_probe_tile_count(struct xe_device *xe)
+{
+	/* Nothing to do, just use the statically defined value. */
+}
+
 int xe_pci_fake_device_init(struct xe_device *xe)
 {
 	struct kunit *test = kunit_get_current_test();
···
 		data->sriov_mode : XE_SRIOV_MODE_NONE;
 
 	kunit_activate_static_stub(test, read_gmdid, fake_read_gmdid);
+	kunit_activate_static_stub(test, xe_info_probe_tile_count,
+				   fake_xe_info_probe_tile_count);
 
 	xe_info_init_early(xe, desc, subplatform_desc);
 	xe_info_init(xe, desc);

+1

drivers/gpu/drm/xe/tests/xe_wa_test.c

···
 	GMDID_CASE(LUNARLAKE, 2004, A0, 2000, A0),
 	GMDID_CASE(LUNARLAKE, 2004, B0, 2000, A0),
 	GMDID_CASE(BATTLEMAGE, 2001, A0, 1301, A1),
+	GMDID_CASE(PANTHERLAKE, 3000, A0, 3000, A0),
 };
 
 static void platform_desc(const struct platform_test_case *t, char *desc)

+3 -1

drivers/gpu/drm/xe/xe_assert.h

···
 
 #include "xe_gt_types.h"
 #include "xe_step.h"
+#include "xe_vram.h"
 
 /**
  * DOC: Xe Asserts
···
 	const struct xe_tile *__tile = (tile); \
 	char __buf[10] __maybe_unused; \
 	xe_assert_msg(tile_to_xe(__tile), condition, "tile: %u VRAM %s\n" msg, \
-		      __tile->id, ({ string_get_size(__tile->mem.vram.actual_physical_size, 1, \
+		      __tile->id, ({ string_get_size( \
+				     xe_vram_region_actual_physical_size(__tile->mem.vram), 1, \
 				     STRING_UNITS_2, __buf, sizeof(__buf)); __buf; }), ## arg); \
 })
 

+35

drivers/gpu/drm/xe/xe_bb.c

···
 	return ERR_PTR(err);
 }
 
+struct xe_bb *xe_bb_ccs_new(struct xe_gt *gt, u32 dwords,
+			    enum xe_sriov_vf_ccs_rw_ctxs ctx_id)
+{
+	struct xe_bb *bb = kmalloc(sizeof(*bb), GFP_KERNEL);
+	struct xe_tile *tile = gt_to_tile(gt);
+	struct xe_sa_manager *bb_pool;
+	int err;
+
+	if (!bb)
+		return ERR_PTR(-ENOMEM);
+	/*
+	 * We need to allocate space for the requested number of dwords &
+	 * one additional MI_BATCH_BUFFER_END dword. Since the whole SA
+	 * is submitted to HW, we need to make sure that the last instruction
+	 * is not over written when the last chunk of SA is allocated for BB.
+	 * So, this extra DW acts as a guard here.
+	 */
+
+	bb_pool = tile->sriov.vf.ccs[ctx_id].mem.ccs_bb_pool;
+	bb->bo = xe_sa_bo_new(bb_pool, 4 * (dwords + 1));
+
+	if (IS_ERR(bb->bo)) {
+		err = PTR_ERR(bb->bo);
+		goto err;
+	}
+
+	bb->cs = xe_sa_bo_cpu_addr(bb->bo);
+	bb->len = 0;
+
+	return bb;
+err:
+	kfree(bb);
+	return ERR_PTR(err);
+}
+
 static struct xe_sched_job *
 __xe_bb_create_job(struct xe_exec_queue *q, struct xe_bb *bb, u64 *addr)
 {
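The comment in `xe_bb_ccs_new()` explains that the allocation is the requested dwords plus one guard dword, which the code expresses as `4 * (dwords + 1)`. The size arithmetic can be written out on its own as a small sketch. `bb_alloc_bytes()` is a hypothetical helper for illustration and is not a function in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes to reserve for a batch of `dwords` 32-bit commands plus the single
 * guard dword described in the comment above (the slot kept free for the
 * trailing MI_BATCH_BUFFER_END).
 */
size_t bb_alloc_bytes(uint32_t dwords)
{
	/* Keep dwords + 1 representable before widening; illustrative check only. */
	assert(dwords < UINT32_MAX);
	return sizeof(uint32_t) * ((size_t)dwords + 1);
}
```

For a 15-dword batch this reserves 64 bytes, the sixteenth dword being the guard.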

+3

drivers/gpu/drm/xe/xe_bb.h

···
 struct xe_gt;
 struct xe_exec_queue;
 struct xe_sched_job;
+enum xe_sriov_vf_ccs_rw_ctxs;
 
 struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm);
+struct xe_bb *xe_bb_ccs_new(struct xe_gt *gt, u32 dwords,
+			    enum xe_sriov_vf_ccs_rw_ctxs ctx_id);
 struct xe_sched_job *xe_bb_create_job(struct xe_exec_queue *q,
 				      struct xe_bb *bb);
 struct xe_sched_job *xe_bb_create_migration_job(struct xe_exec_queue *q,

+73 -21

drivers/gpu/drm/xe/xe_bo.c

···
 #include "xe_pxp.h"
 #include "xe_res_cursor.h"
 #include "xe_shrinker.h"
+#include "xe_sriov_vf_ccs.h"
 #include "xe_trace_bo.h"
 #include "xe_ttm_stolen_mgr.h"
 #include "xe_vm.h"
+#include "xe_vram_types.h"

 const char *const xe_mem_type_to_name[TTM_NUM_MEM_TYPES] = {
 	[XE_PL_SYSTEM] = "system",
···
 	else if (bo_flags & XE_BO_FLAG_PINNED &&
 		 !(bo_flags & XE_BO_FLAG_PINNED_LATE_RESTORE))
 		return true; /* needs vmap */
+	else if (bo_flags & XE_BO_FLAG_CPU_ADDR_MIRROR)
+		return true;

 	/*
 	 * For eviction / restore on suspend / resume objects pinned in VRAM
···
 	}

 	if (ttm_bo->type == ttm_bo_type_sg) {
-		ret = xe_bo_move_notify(bo, ctx);
+		if (new_mem->mem_type == XE_PL_SYSTEM)
+			ret = xe_bo_move_notify(bo, ctx);
 		if (!ret)
 			ret = xe_bo_move_dmabuf(ttm_bo, new_mem);
 		return ret;
 	}

-	tt_has_data = ttm && (ttm_tt_is_populated(ttm) ||
-			      (ttm->page_flags & TTM_TT_FLAG_SWAPPED));
+	tt_has_data = ttm && (ttm_tt_is_populated(ttm) || ttm_tt_is_swapped(ttm));

 	move_lacks_source = !old_mem || (handle_system_ccs ? (!bo->ccs_cleared) :
 					 (!mem_type_is_vram(old_mem_type) && !tt_has_data));
···
 	dma_fence_put(fence);
 	xe_pm_runtime_put(xe);

+	/*
+	 * CCS meta data is migrated from TT -> SMEM. So, let us detach the
+	 * BBs from BO as it is no longer needed.
+	 */
+	if (IS_VF_CCS_BB_VALID(xe, bo) && old_mem_type == XE_PL_TT &&
+	    new_mem->mem_type == XE_PL_SYSTEM)
+		xe_sriov_vf_ccs_detach_bo(bo);
+
+	if (IS_SRIOV_VF(xe) &&
+	    ((move_lacks_source && new_mem->mem_type == XE_PL_TT) ||
+	     (old_mem_type == XE_PL_SYSTEM && new_mem->mem_type == XE_PL_TT)) &&
+	    handle_system_ccs)
+		ret = xe_sriov_vf_ccs_attach_bo(bo);
+
 out:
 	if ((!ttm_bo->resource || ttm_bo->resource->mem_type == XE_PL_SYSTEM) &&
 	    ttm_bo->ttm) {
···
 						     MAX_SCHEDULE_TIMEOUT);
 		if (timeout < 0)
 			ret = timeout;
+
+		if (IS_VF_CCS_BB_VALID(xe, bo))
+			xe_sriov_vf_ccs_detach_bo(bo);

 		xe_tt_unmap_sg(xe, ttm_bo->ttm);
 	}
···

 static void xe_ttm_bo_delete_mem_notify(struct ttm_buffer_object *ttm_bo)
 {
+	struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
+
 	if (!xe_bo_is_xe_bo(ttm_bo))
 		return;
+
+	if (IS_VF_CCS_BB_VALID(ttm_to_xe_device(ttm_bo->bdev), bo))
+		xe_sriov_vf_ccs_detach_bo(bo);

 	/*
 	 * Object is idle and about to be destroyed. Release the
···
 	}
 }

+static bool should_migrate_to_smem(struct xe_bo *bo)
+{
+	/*
+	 * NOTE: The following atomic checks are platform-specific. For example,
+	 * if a device supports CXL atomics, these may not be necessary or
+	 * may behave differently.
+	 */
+
+	return bo->attr.atomic_access == DRM_XE_ATOMIC_GLOBAL ||
+	       bo->attr.atomic_access == DRM_XE_ATOMIC_CPU;
+}
+
 static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
 {
 	struct ttm_buffer_object *tbo = vmf->vma->vm_private_data;
···
 	struct xe_bo *bo = ttm_to_xe_bo(tbo);
 	bool needs_rpm = bo->flags & XE_BO_FLAG_VRAM_MASK;
 	vm_fault_t ret;
-	int idx;
+	int idx, r = 0;

 	if (needs_rpm)
 		xe_pm_runtime_get(xe);
···
 	if (drm_dev_enter(ddev, &idx)) {
 		trace_xe_bo_cpu_fault(bo);

-		ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
-					       TTM_BO_VM_NUM_PREFAULT);
+		if (should_migrate_to_smem(bo)) {
+			xe_assert(xe, bo->flags & XE_BO_FLAG_SYSTEM);
+
+			r = xe_bo_migrate(bo, XE_PL_TT);
+			if (r == -EBUSY || r == -ERESTARTSYS || r == -EINTR)
+				ret = VM_FAULT_NOPAGE;
+			else if (r)
+				ret = VM_FAULT_SIGBUS;
+		}
+		if (!ret)
+			ret = ttm_bo_vm_fault_reserved(vmf,
+						       vmf->vma->vm_page_prot,
+						       TTM_BO_VM_NUM_PREFAULT);
 		drm_dev_exit(idx);
+
+		if (ret == VM_FAULT_RETRY &&
+		    !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
+			goto out;
+
+		/*
+		 * ttm_bo_vm_reserve() already has dma_resv_lock.
+		 */
+		if (ret == VM_FAULT_NOPAGE &&
+		    mem_type_is_vram(tbo->resource->mem_type)) {
+			mutex_lock(&xe->mem_access.vram_userfault.lock);
+			if (list_empty(&bo->vram_userfault_link))
+				list_add(&bo->vram_userfault_link,
+					 &xe->mem_access.vram_userfault.list);
+			mutex_unlock(&xe->mem_access.vram_userfault.lock);
+		}
 	} else {
 		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
-	}
-
-	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
-		goto out;
-	/*
-	 * ttm_bo_vm_reserve() already has dma_resv_lock.
-	 */
-	if (ret == VM_FAULT_NOPAGE && mem_type_is_vram(tbo->resource->mem_type)) {
-		mutex_lock(&xe->mem_access.vram_userfault.lock);
-		if (list_empty(&bo->vram_userfault_link))
-			list_add(&bo->vram_userfault_link, &xe->mem_access.vram_userfault.list);
-		mutex_unlock(&xe->mem_access.vram_userfault.lock);
 	}

 	dma_resv_unlock(tbo->base.resv);
···
 		.no_wait_gpu = false,
 		.gfp_retry_mayfail = true,
 	};
-	struct pin_cookie cookie;
 	int ret;

 	if (vm) {
···
 		ctx.resv = xe_vm_resv(vm);
 	}

-	cookie = xe_vm_set_validating(vm, allow_res_evict);
+	xe_vm_set_validating(vm, allow_res_evict);
 	trace_xe_bo_validate(bo);
 	ret = ttm_bo_validate(&bo->ttm, &bo->placement, &ctx);
-	xe_vm_clear_validating(vm, allow_res_evict, cookie);
+	xe_vm_clear_validating(vm, allow_res_evict);

 	return ret;
 }
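The error mapping that the xe_gem_fault() hunk introduces can be looked at in isolation. The following is a minimal userspace sketch, not driver code: migrate_err_to_fault() is a hypothetical helper name, the FAULT_MODEL_* values are local stand-ins for the kernel's VM_FAULT_* codes, and ERESTARTSYS_MODEL stands in for the kernel-internal ERESTARTSYS, which userspace headers do not export.

```c
#include <errno.h>

/* Kernel-internal errno, not exported to userspace; value assumed for this model. */
#define ERESTARTSYS_MODEL 512

/* Local stand-ins for VM_FAULT_* codes; the numeric values are illustrative only. */
enum fault_model {
	FAULT_MODEL_OK = 0,
	FAULT_MODEL_SIGBUS = 1,
	FAULT_MODEL_NOPAGE = 2,
};

/*
 * Hypothetical helper mirroring the branch structure in the hunk:
 * interruptions and busy conditions are treated as transient, so the
 * faulting access is simply retried; any other migration error is fatal
 * for this access; success lets the regular fault path run.
 */
static enum fault_model migrate_err_to_fault(int r)
{
	if (r == -EBUSY || r == -ERESTARTSYS_MODEL || r == -EINTR)
		return FAULT_MODEL_NOPAGE;
	if (r)
		return FAULT_MODEL_SIGBUS;
	return FAULT_MODEL_OK;
}
```

In the driver the "OK" case is what allows the subsequent `if (!ret)` check to fall through to ttm_bo_vm_fault_reserved().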

+3 -1

drivers/gpu/drm/xe/xe_bo.h

···
 #include "xe_macros.h"
 #include "xe_vm_types.h"
 #include "xe_vm.h"
+#include "xe_vram_types.h"

 #define XE_DEFAULT_GTT_SIZE_MB          3072ULL /* 3GB by default */

···
 #define XE_BO_FLAG_VRAM_MASK		(XE_BO_FLAG_VRAM0 | XE_BO_FLAG_VRAM1)
 /* -- */
 #define XE_BO_FLAG_STOLEN		BIT(4)
+#define XE_BO_FLAG_VRAM(vram)		(XE_BO_FLAG_VRAM0 << ((vram)->id))
 #define XE_BO_FLAG_VRAM_IF_DGFX(tile)	(IS_DGFX(tile_to_xe(tile)) ? \
-					 XE_BO_FLAG_VRAM0 << (tile)->id : \
+					 XE_BO_FLAG_VRAM((tile)->mem.vram) : \
 					 XE_BO_FLAG_SYSTEM)
 #define XE_BO_FLAG_GGTT			BIT(5)
 #define XE_BO_FLAG_IGNORE_MIN_PAGE_SIZE BIT(6)
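The new XE_BO_FLAG_VRAM() macro derives the placement flag from a VRAM region's id instead of a tile's id. A minimal userspace sketch of that shift follows. The bit positions of the SYSTEM/VRAM0/VRAM1 flags are assumptions for illustration (the hunk only shows XE_BO_FLAG_STOLEN as BIT(4) and the mask definition), and struct vram_region_model is a hypothetical stand-in that models only the id field.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Assumed bit positions, for illustration only. */
#define XE_BO_FLAG_SYSTEM	BIT(1)
#define XE_BO_FLAG_VRAM0	BIT(2)
#define XE_BO_FLAG_VRAM1	BIT(3)
#define XE_BO_FLAG_VRAM_MASK	(XE_BO_FLAG_VRAM0 | XE_BO_FLAG_VRAM1)

/* Hypothetical stand-in for the VRAM region type; only the id is modelled. */
struct vram_region_model {
	uint8_t id;
};

/* Same shape as the macro added by the hunk: shift VRAM0 by the region id. */
#define XE_BO_FLAG_VRAM(vram)	(XE_BO_FLAG_VRAM0 << ((vram)->id))

/* Small wrapper so the macro can be exercised as a function. */
static uint32_t vram_flag_for(const struct vram_region_model *vram)
{
	return XE_BO_FLAG_VRAM(vram);
}
```

With two regions, ids 0 and 1 map onto the two flags covered by XE_BO_FLAG_VRAM_MASK.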

+12

drivers/gpu/drm/xe/xe_bo_types.h

···
 #include <linux/iosys-map.h>

 #include <drm/drm_gpusvm.h>
+#include <drm/drm_pagemap.h>
 #include <drm/ttm/ttm_bo.h>
 #include <drm/ttm/ttm_device.h>
 #include <drm/ttm/ttm_placement.h>
···
 	 */
 	struct list_head client_link;
 #endif
+	/** @attr: User controlled attributes for bo */
+	struct {
+		/**
+		 * @atomic_access: type of atomic access bo needs
+		 * protected by bo dma-resv lock
+		 */
+		u32 atomic_access;
+	} attr;
 	/**
 	 * @pxp_key_instance: PXP key instance this BO was created against. A
 	 * 0 in this variable indicates that the BO does not use PXP encryption.
···

 	/** @ccs_cleared */
 	bool ccs_cleared;
+
+	/** @bb_ccs_rw: BB instructions of CCS read/write. Valid only for VF */
+	struct xe_bb *bb_ccs[XE_SRIOV_VF_CCS_CTX_COUNT];

 	/**
 	 * @cpu_caching: CPU caching mode. Currently only used for userspace

+278 -68

drivers/gpu/drm/xe/xe_configfs.c

···

 #include <linux/bitops.h>
 #include <linux/configfs.h>
+#include <linux/cleanup.h>
 #include <linux/find.h>
 #include <linux/init.h>
 #include <linux/module.h>
···
 #include <linux/string.h>

 #include "xe_configfs.h"
-#include "xe_module.h"
-
 #include "xe_hw_engine_types.h"
+#include "xe_module.h"
+#include "xe_pci_types.h"

 /**
  * DOC: Xe Configfs
···
  * =========
  *
  * Configfs is a filesystem-based manager of kernel objects. XE KMD registers a
- * configfs subsystem called ``'xe'`` that creates a directory in the mounted configfs directory
- * The user can create devices under this directory and configure them as necessary
- * See Documentation/filesystems/configfs.rst for more information about how configfs works.
+ * configfs subsystem called ``xe`` that creates a directory in the mounted
+ * configfs directory. The user can create devices under this directory and
+ * configure them as necessary. See Documentation/filesystems/configfs.rst for
+ * more information about how configfs works.
  *
  * Create devices
- * ===============
+ * ==============
  *
- * In order to create a device, the user has to create a directory inside ``'xe'``::
+ * To create a device, the ``xe`` module should already be loaded, but some
+ * attributes can only be set before binding the device. It can be accomplished
+ * by blocking the driver autoprobe:
  *
- *	mkdir /sys/kernel/config/xe/0000:03:00.0/
+ *	# echo 0 > /sys/bus/pci/drivers_autoprobe
+ *	# modprobe xe
+ *
+ * In order to create a device, the user has to create a directory inside ``xe``::
+ *
+ *	# mkdir /sys/kernel/config/xe/0000:03:00.0/
  *
  * Every device created is populated by the driver with entries that can be
  * used to configure it::
  *
  *	/sys/kernel/config/xe/
- *	.. 0000:03:00.0/
- *	... survivability_mode
+ *	├── 0000:00:02.0
+ *	│   └── ...
+ *	├── 0000:00:02.1
+ *	│   └── ...
+ *	:
+ *	└── 0000:03:00.0
+ *	    ├── survivability_mode
+ *	    ├── engines_allowed
+ *	    └── enable_psmi
+ *
+ * After configuring the attributes as per next section, the device can be
+ * probed with::
+ *
+ *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind
+ *	# # or
+ *	# echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
  *
  * Configure Attributes
  * ====================
···
  * effect when probing the device. Example to enable it::
  *
  *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
- *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind  (Enters survivability mode if supported)
+ *
+ * This attribute can only be set before binding to the device.
  *
  * Allowed engines:
  * ----------------
···
  * available for migrations, but it's disabled. This is intended for debugging
  * purposes only.
  *
+ * This attribute can only be set before binding to the device.
+ *
+ * PSMI
+ * ----
+ *
+ * Enable extra debugging capabilities to trace engine execution. Only useful
+ * during early platform enabling and requires additional hardware connected.
+ * Once it's enabled, additionals WAs are added and runtime configuration is
+ * done via debugfs. Example to enable it::
+ *
+ *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi
+ *
+ * This attribute can only be set before binding to the device.
+ *
  * Remove devices
  * ==============
  *
  * The created device directories can be removed using ``rmdir``::
  *
- *	rmdir /sys/kernel/config/xe/0000:03:00.0/
+ *	# rmdir /sys/kernel/config/xe/0000:03:00.0/
  */

-struct xe_config_device {
+struct xe_config_group_device {
 	struct config_group group;

-	bool survivability_mode;
-	u64 engines_allowed;
+	struct xe_config_device {
+		u64 engines_allowed;
+		bool survivability_mode;
+		bool enable_psmi;
+	} config;

 	/* protects attributes */
 	struct mutex lock;
 };
+
+static const struct xe_config_device device_defaults = {
+	.engines_allowed = U64_MAX,
+	.survivability_mode = false,
+	.enable_psmi = false,
+};
+
+static void set_device_defaults(struct xe_config_device *config)
+{
+	*config = device_defaults;
+}

 struct engine_info {
 	const char *cls;
···
 	{ .cls = "gsccs", .mask = XE_HW_ENGINE_GSCCS_MASK },
 };

+static struct xe_config_group_device *to_xe_config_group_device(struct config_item *item)
+{
+	return container_of(to_config_group(item), struct xe_config_group_device, group);
+}
+
 static struct xe_config_device *to_xe_config_device(struct config_item *item)
 {
-	return container_of(to_config_group(item), struct xe_config_device, group);
+	return &to_xe_config_group_device(item)->config;
+}
+
+static bool is_bound(struct xe_config_group_device *dev)
+{
+	unsigned int domain, bus, slot, function;
+	struct pci_dev *pdev;
+	const char *name;
+	bool ret;
+
+	lockdep_assert_held(&dev->lock);
+
+	name = dev->group.cg_item.ci_name;
+	if (sscanf(name, "%x:%x:%x.%x", &domain, &bus, &slot, &function) != 4)
+		return false;
+
+	pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(slot, function));
+	if (!pdev)
+		return false;
+
+	ret = pci_get_drvdata(pdev);
+	pci_dev_put(pdev);
+
+	if (ret)
+		pci_dbg(pdev, "Already bound to driver\n");
+
+	return ret;
 }

 static ssize_t survivability_mode_show(struct config_item *item, char *page)
···

 static ssize_t survivability_mode_store(struct config_item *item, const char *page, size_t len)
 {
-	struct xe_config_device *dev = to_xe_config_device(item);
+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
 	bool survivability_mode;
 	int ret;

···
 	if (ret)
 		return ret;

-	mutex_lock(&dev->lock);
-	dev->survivability_mode = survivability_mode;
-	mutex_unlock(&dev->lock);
+	guard(mutex)(&dev->lock);
+	if (is_bound(dev))
+		return -EBUSY;
+
+	dev->config.survivability_mode = survivability_mode;

 	return len;
 }
···
 static ssize_t engines_allowed_store(struct config_item *item, const char *page,
 				     size_t len)
 {
-	struct xe_config_device *dev = to_xe_config_device(item);
+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
 	size_t patternlen, p;
 	u64 mask, val = 0;

···
 		val |= mask;
 	}

-	mutex_lock(&dev->lock);
-	dev->engines_allowed = val;
-	mutex_unlock(&dev->lock);
+	guard(mutex)(&dev->lock);
+	if (is_bound(dev))
+		return -EBUSY;
+
+	dev->config.engines_allowed = val;

 	return len;
 }

-CONFIGFS_ATTR(, survivability_mode);
+static ssize_t enable_psmi_show(struct config_item *item, char *page)
+{
+	struct xe_config_device *dev = to_xe_config_device(item);
+
+	return sprintf(page, "%d\n", dev->enable_psmi);
+}
+
+static ssize_t enable_psmi_store(struct config_item *item, const char *page, size_t len)
+{
+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
+	bool val;
+	int ret;
+
+	ret = kstrtobool(page, &val);
+	if (ret)
+		return ret;
+
+	guard(mutex)(&dev->lock);
+	if (is_bound(dev))
+		return -EBUSY;
+
+	dev->config.enable_psmi = val;
+
+	return len;
+}
+
+CONFIGFS_ATTR(, enable_psmi);
 CONFIGFS_ATTR(, engines_allowed);
+CONFIGFS_ATTR(, survivability_mode);

 static struct configfs_attribute *xe_config_device_attrs[] = {
-	&attr_survivability_mode,
+	&attr_enable_psmi,
 	&attr_engines_allowed,
+	&attr_survivability_mode,
 	NULL,
 };

 static void xe_config_device_release(struct config_item *item)
 {
-	struct xe_config_device *dev = to_xe_config_device(item);
+	struct xe_config_group_device *dev = to_xe_config_group_device(item);

 	mutex_destroy(&dev->lock);
 	kfree(dev);
···
 	.ct_owner = THIS_MODULE,
 };

+static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
+{
+	struct device_driver *driver = driver_find("xe", &pci_bus_type);
+	struct pci_driver *drv = to_pci_driver(driver);
+	const struct pci_device_id *ids = drv ? drv->id_table : NULL;
+	const struct pci_device_id *found = pci_match_id(ids, pdev);
+
+	return found ? (const void *)found->driver_data : NULL;
+}
+
+static struct pci_dev *get_physfn_instead(struct pci_dev *virtfn)
+{
+	struct pci_dev *physfn = pci_physfn(virtfn);
+
+	pci_dev_get(physfn);
+	pci_dev_put(virtfn);
+	return physfn;
+}
+
 static struct config_group *xe_config_make_device_group(struct config_group *group,
 							 const char *name)
 {
 	unsigned int domain, bus, slot, function;
-	struct xe_config_device *dev;
+	struct xe_config_group_device *dev;
+	const struct xe_device_desc *match;
 	struct pci_dev *pdev;
+	char canonical[16];
+	int vfnumber = 0;
 	int ret;

-	ret = sscanf(name, "%04x:%02x:%02x.%x", &domain, &bus, &slot, &function);
+	ret = sscanf(name, "%x:%x:%x.%x", &domain, &bus, &slot, &function);
 	if (ret != 4)
 		return ERR_PTR(-EINVAL);

+	ret = scnprintf(canonical, sizeof(canonical), "%04x:%02x:%02x.%d", domain, bus,
+			PCI_SLOT(PCI_DEVFN(slot, function)),
+			PCI_FUNC(PCI_DEVFN(slot, function)));
+	if (ret != 12 || strcmp(name, canonical))
+		return ERR_PTR(-EINVAL);
+
 	pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(slot, function));
+	if (!pdev && function)
+		pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(slot, 0));
+	if (!pdev && slot)
+		pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(0, 0));
 	if (!pdev)
 		return ERR_PTR(-ENODEV);
+
+	if (PCI_DEVFN(slot, function) != pdev->devfn) {
+		pdev = get_physfn_instead(pdev);
+		vfnumber = PCI_DEVFN(slot, function) - pdev->devfn;
+		if (!dev_is_pf(&pdev->dev) || vfnumber > pci_sriov_get_totalvfs(pdev)) {
+			pci_dev_put(pdev);
+			return ERR_PTR(-ENODEV);
+		}
+	}
+
+	match = xe_match_desc(pdev);
+	if (match && vfnumber && !match->has_sriov) {
+		pci_info(pdev, "xe driver does not support VFs on this device\n");
+		match = NULL;
+	} else if (!match) {
+		pci_info(pdev, "xe driver does not support configuration of this device\n");
+	}
+
 	pci_dev_put(pdev);
+
+	if (!match)
+		return ERR_PTR(-ENOENT);

 	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
 	if (!dev)
 		return ERR_PTR(-ENOMEM);

-	/* Default values */
-	dev->engines_allowed = U64_MAX;
+	set_device_defaults(&dev->config);

 	config_group_init_type_name(&dev->group, name, &xe_config_device_type);

···
 	},
 };

-static struct xe_config_device *configfs_find_group(struct pci_dev *pdev)
+static struct xe_config_group_device *find_xe_config_group_device(struct pci_dev *pdev)
 {
 	struct config_item *item;
-	char name[64];
-
-	snprintf(name, sizeof(name), "%04x:%02x:%02x.%x", pci_domain_nr(pdev->bus),
-		 pdev->bus->number, PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));

 	mutex_lock(&xe_configfs.su_mutex);
-	item = config_group_find_item(&xe_configfs.su_group, name);
+	item = config_group_find_item(&xe_configfs.su_group, pci_name(pdev));
 	mutex_unlock(&xe_configfs.su_mutex);

 	if (!item)
 		return NULL;

-	return to_xe_config_device(item);
+	return to_xe_config_group_device(item);
+}
+
+static void dump_custom_dev_config(struct pci_dev *pdev,
+				   struct xe_config_group_device *dev)
+{
+#define PRI_CUSTOM_ATTR(fmt_, attr_) do { \
+		if (dev->config.attr_ != device_defaults.attr_) \
+			pci_info(pdev, "configfs: " __stringify(attr_) " = " fmt_ "\n", \
+				 dev->config.attr_); \
+	} while (0)
+
+	PRI_CUSTOM_ATTR("%llx", engines_allowed);
+	PRI_CUSTOM_ATTR("%d", enable_psmi);
+	PRI_CUSTOM_ATTR("%d", survivability_mode);
+
+#undef PRI_CUSTOM_ATTR
+}
+
+/**
+ * xe_configfs_check_device() - Test if device was configured by configfs
+ * @pdev: the &pci_dev device to test
+ *
+ * Try to find the configfs group that belongs to the specified pci device
+ * and print a diagnostic message if different than the default value.
+ */
+void xe_configfs_check_device(struct pci_dev *pdev)
+{
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
+
+	if (!dev)
+		return;
+
+	/* memcmp here is safe as both are zero-initialized */
+	if (memcmp(&dev->config, &device_defaults, sizeof(dev->config))) {
+		pci_info(pdev, "Found custom settings in configfs\n");
+		dump_custom_dev_config(pdev, dev);
+	}
+
+	config_group_put(&dev->group);
 }

 /**
  * xe_configfs_get_survivability_mode - get configfs survivability mode attribute
  * @pdev: pci device
  *
- * find the configfs group that belongs to the pci device and return
- * the survivability mode attribute
- *
- * Return: survivability mode if config group is found, false otherwise
+ * Return: survivability_mode attribute in configfs
  */
 bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)
 {
-	struct xe_config_device *dev = configfs_find_group(pdev);
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
 	bool mode;

 	if (!dev)
-		return false;
+		return device_defaults.survivability_mode;

-	mode = dev->survivability_mode;
-	config_item_put(&dev->group.cg_item);
+	mode = dev->config.survivability_mode;
+	config_group_put(&dev->group);

 	return mode;
 }

 /**
- * xe_configfs_clear_survivability_mode - clear configfs survivability mode attribute
+ * xe_configfs_clear_survivability_mode - clear configfs survivability mode
  * @pdev: pci device
- *
- * find the configfs group that belongs to the pci device and clear survivability
- * mode attribute
  */
 void xe_configfs_clear_survivability_mode(struct pci_dev *pdev)
 {
-	struct xe_config_device *dev = configfs_find_group(pdev);
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);

 	if (!dev)
 		return;

-	mutex_lock(&dev->lock);
-	dev->survivability_mode = 0;
-	mutex_unlock(&dev->lock);
+	guard(mutex)(&dev->lock);
+	dev->config.survivability_mode = 0;

-	config_item_put(&dev->group.cg_item);
+	config_group_put(&dev->group);
 }

 /**
  * xe_configfs_get_engines_allowed - get engine allowed mask from configfs
  * @pdev: pci device
  *
- * Find the configfs group that belongs to the pci device and return
- * the mask of engines allowed to be used.
- *
- * Return: engine mask with allowed engines
+ * Return: engine mask with allowed engines set in configfs
  */
 u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)
 {
-	struct xe_config_device *dev = configfs_find_group(pdev);
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
 	u64 engines_allowed;

 	if (!dev)
-		return U64_MAX;
+		return device_defaults.engines_allowed;

-	engines_allowed = dev->engines_allowed;
-	config_item_put(&dev->group.cg_item);
+	engines_allowed = dev->config.engines_allowed;
+	config_group_put(&dev->group);

 	return engines_allowed;
 }

+/**
+ * xe_configfs_get_psmi_enabled - get configfs enable_psmi setting
+ * @pdev: pci device
+ *
+ * Return: enable_psmi setting in configfs
+ */
+bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev)
+{
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
+	bool ret;
+
+	if (!dev)
+		return false;
+
+	ret = dev->config.enable_psmi;
+	config_item_put(&dev->group.cg_item);
+
+	return ret;
+}
+
 int __init xe_configfs_init(void)
 {
-	struct config_group *root = &xe_configfs.su_group;
 	int ret;

-	config_group_init(root);
+	config_group_init(&xe_configfs.su_group);
 	mutex_init(&xe_configfs.su_mutex);
 	ret = configfs_register_subsystem(&xe_configfs);
 	if (ret) {
-		pr_err("Error %d while registering %s subsystem\n",
-		       ret, root->cg_item.ci_namebuf);
+		mutex_destroy(&xe_configfs.su_mutex);
 		return ret;
 	}

···
 void __exit xe_configfs_exit(void)
 {
 	configfs_unregister_subsystem(&xe_configfs);
+	mutex_destroy(&xe_configfs.su_mutex);
 }
-
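The stricter name check in xe_config_make_device_group() parses the directory name leniently, re-prints it in canonical form, and rejects anything that does not round-trip. A minimal userspace sketch of that idea follows. is_canonical_bdf() is a hypothetical helper name, the PCI_* macros are local copies of the usual devfn bit layout, and snprintf() is used in place of the kernel's scnprintf() (the two differ on truncation, which does not matter here because an over-long result is rejected either way).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Local copies of the PCI devfn helpers: 5-bit slot, 3-bit function. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical helper: accept a name only if it is exactly the canonical
 * "dddd:bb:ss.f" spelling of the address it parses to.
 */
static bool is_canonical_bdf(const char *name)
{
	unsigned int domain, bus, slot, function;
	char canonical[16];
	int ret;

	/* Lenient parse: any hex width is accepted at this stage. */
	if (sscanf(name, "%x:%x:%x.%x", &domain, &bus, &slot, &function) != 4)
		return false;

	/* Re-print with fixed widths; slot and function are masked via devfn. */
	ret = snprintf(canonical, sizeof(canonical), "%04x:%02x:%02x.%d",
		       domain, bus,
		       PCI_SLOT(PCI_DEVFN(slot, function)),
		       (int)PCI_FUNC(PCI_DEVFN(slot, function)));

	/* Canonical names are 12 characters and must match the input exactly. */
	return ret == 12 && strcmp(name, canonical) == 0;
}
```

Under this sketch a short spelling such as "0:3:0.0" or an out-of-range function such as "0000:03:00.8" fails the round trip, while "0000:03:00.0" passes.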

+4

drivers/gpu/drm/xe/xe_configfs.h

···
 #if IS_ENABLED(CONFIG_CONFIGFS_FS)
 int xe_configfs_init(void);
 void xe_configfs_exit(void);
+void xe_configfs_check_device(struct pci_dev *pdev);
 bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
 void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
 u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
+bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
 #else
 static inline int xe_configfs_init(void) { return 0; }
 static inline void xe_configfs_exit(void) { }
+static inline void xe_configfs_check_device(struct pci_dev *pdev) { }
 static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
 static inline void xe_configfs_clear_survivability_mode(struct pci_dev *pdev) { }
 static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
+static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
 #endif

 #endif
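xe_configfs_check_device(), declared above, reports only the attributes that differ from a table of defaults. A minimal userspace sketch of that per-field comparison follows. struct config_model, model_defaults and report_custom() are hypothetical names, fprintf() stands in for pci_info(), and `unsigned long long` replaces u64 so that the "%llx" format is exact in portable C.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the per-device configuration block. */
struct config_model {
	unsigned long long engines_allowed;
	bool survivability_mode;
	bool enable_psmi;
};

/* Single source of truth for default values, as in the patch. */
static const struct config_model model_defaults = {
	.engines_allowed = ~0ULL,
	.survivability_mode = false,
	.enable_psmi = false,
};

/* Print every attribute that differs from its default; return how many did. */
static int report_custom(const struct config_model *cfg, FILE *out)
{
	int n = 0;

/* Stringify the field name so the message cannot drift from the field. */
#define REPORT_ATTR(fmt_, attr_) do { \
		if (cfg->attr_ != model_defaults.attr_) { \
			fprintf(out, "configfs: " #attr_ " = " fmt_ "\n", \
				cfg->attr_); \
			n++; \
		} \
	} while (0)

	REPORT_ATTR("%llx", engines_allowed);
	REPORT_ATTR("%d", enable_psmi);
	REPORT_ATTR("%d", survivability_mode);

#undef REPORT_ATTR

	return n;
}
```

Comparing field by field, rather than relying only on a whole-struct memcmp(), keeps the report independent of any padding bytes in the structure.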

+114

drivers/gpu/drm/xe/xe_debugfs.c

···

 #include <drm/drm_debugfs.h>

+#include "regs/xe_pmt.h"
 #include "xe_bo.h"
 #include "xe_device.h"
 #include "xe_force_wake.h"
 #include "xe_gt_debugfs.h"
 #include "xe_gt_printk.h"
 #include "xe_guc_ads.h"
+#include "xe_mmio.h"
 #include "xe_pm.h"
+#include "xe_psmi.h"
 #include "xe_pxp_debugfs.h"
 #include "xe_sriov.h"
 #include "xe_sriov_pf.h"
 #include "xe_step.h"
 #include "xe_wa.h"
+#include "xe_vsec.h"

 #ifdef CONFIG_DRM_XE_DEBUG
 #include "xe_bo_evict.h"
···
 #endif

 DECLARE_FAULT_ATTR(gt_reset_failure);
+DECLARE_FAULT_ATTR(inject_csc_hw_error);
+
+static void read_residency_counter(struct xe_device *xe, struct xe_mmio *mmio,
+				   u32 offset, char *name, struct drm_printer *p)
+{
+	u64 residency = 0;
+	int ret;
+
+	ret = xe_pmt_telem_read(to_pci_dev(xe->drm.dev),
+				xe_mmio_read32(mmio, PUNIT_TELEMETRY_GUID),
+				&residency, offset, sizeof(residency));
+	if (ret != sizeof(residency)) {
+		drm_warn(&xe->drm, "%s counter failed to read, ret %d\n", name, ret);
+		return;
+	}
+
+	drm_printf(p, "%s : %llu\n", name, residency);
+}

 static struct xe_device *node_to_xe(struct drm_info_node *node)
 {
···
 	return 0;
 }

+static int dgfx_pkg_residencies_show(struct seq_file *m, void *data)
+{
+	struct xe_device *xe;
+	struct xe_mmio *mmio;
+	struct drm_printer p;
+
+	xe = node_to_xe(m->private);
+	p = drm_seq_file_printer(m);
+	xe_pm_runtime_get(xe);
+	mmio = xe_root_tile_mmio(xe);
+	struct {
+		u32 offset;
+		char *name;
+	} residencies[] = {
+		{BMG_G2_RESIDENCY_OFFSET, "Package G2"},
+		{BMG_G6_RESIDENCY_OFFSET, "Package G6"},
+		{BMG_G8_RESIDENCY_OFFSET, "Package G8"},
+		{BMG_G10_RESIDENCY_OFFSET, "Package G10"},
+		{BMG_MODS_RESIDENCY_OFFSET, "Package ModS"}
+	};
+
+	for (int i = 0; i < ARRAY_SIZE(residencies); i++)
+		read_residency_counter(xe, mmio, residencies[i].offset, residencies[i].name, &p);
+
+	xe_pm_runtime_put(xe);
+	return 0;
+}
+
+static int dgfx_pcie_link_residencies_show(struct seq_file *m, void *data)
+{
+	struct xe_device *xe;
+	struct xe_mmio *mmio;
+	struct drm_printer p;
+
+	xe = node_to_xe(m->private);
+	p = drm_seq_file_printer(m);
+	xe_pm_runtime_get(xe);
+	mmio = xe_root_tile_mmio(xe);
+
+	struct {
+		u32 offset;
+		char *name;
+	} residencies[] = {
+		{BMG_PCIE_LINK_L0_RESIDENCY_OFFSET, "PCIE LINK L0 RESIDENCY"},
+		{BMG_PCIE_LINK_L1_RESIDENCY_OFFSET, "PCIE LINK L1 RESIDENCY"},
+		{BMG_PCIE_LINK_L1_2_RESIDENCY_OFFSET, "PCIE LINK L1.2 RESIDENCY"}
+	};
+
+	for (int i = 0; i < ARRAY_SIZE(residencies); i++)
+		read_residency_counter(xe, mmio, residencies[i].offset, residencies[i].name, &p);
+
+	xe_pm_runtime_put(xe);
+	return 0;
+}
+
 static const struct drm_info_list debugfs_list[] = {
 	{"info", info, 0},
 	{ .name = "sriov_info", .show = sriov_info, },
 	{ .name = "workarounds", .show = workaround_info, },
+};
+
+static const struct drm_info_list debugfs_residencies[] = {
+	{ .name = "dgfx_pkg_residencies", .show = dgfx_pkg_residencies_show, },
+	{ .name = "dgfx_pcie_link_residencies", .show = dgfx_pcie_link_residencies_show, },
 };

 static int forcewake_open(struct inode *inode, struct file *file)
···
 	.write = atomic_svm_timeslice_ms_set,
 };

+static void create_tile_debugfs(struct xe_tile *tile, struct dentry *root)
+{
+	char name[8];
+
+	snprintf(name, sizeof(name), "tile%u", tile->id);
+	tile->debugfs = debugfs_create_dir(name, root);
+	if (IS_ERR(tile->debugfs))
+		return;
+
+	/*
+	 * Store the xe_tile pointer as private data of the tile/ directory
+	 * node so other tile specific attributes under that directory may
+	 * refer to it by looking at its parent node private data.
+	 */
+	tile->debugfs->d_inode->i_private = tile;
+}
+
 void xe_debugfs_register(struct xe_device *xe)
 {
 	struct ttm_device *bdev = &xe->ttm;
 	struct drm_minor *minor = xe->drm.primary;
 	struct dentry *root = minor->debugfs_root;
 	struct ttm_resource_manager *man;
+	struct xe_tile *tile;
 	struct xe_gt *gt;
 	u32 mem_type;
+	u8 tile_id;
 	u8 id;

 	drm_debugfs_create_files(debugfs_list,
 				 ARRAY_SIZE(debugfs_list),
 				 root, minor);
+
+	if (xe->info.platform == XE_BATTLEMAGE) {
+		drm_debugfs_create_files(debugfs_residencies,
+					 ARRAY_SIZE(debugfs_residencies),
+					 root, minor);
+		fault_create_debugfs_attr("inject_csc_hw_error", root,
+					  &inject_csc_hw_error);
+	}

 	debugfs_create_file("forcewake_all", 0400, root, xe,
 			    &forcewake_all_fops);
···
 	if (man)
 		ttm_resource_manager_create_debugfs(man, root, "stolen_mm");

+	for_each_tile(tile, xe, tile_id)
+		create_tile_debugfs(tile, root);
+
 	for_each_gt(gt, xe, id)
 		xe_gt_debugfs_register(gt);

 	xe_pxp_debugfs_register(xe->pxp);
+
+	xe_psmi_debugfs_register(xe);

 	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);


+29

drivers/gpu/drm/xe/xe_dep_job_types.h

···
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_DEP_JOB_TYPES_H_
+#define _XE_DEP_JOB_TYPES_H_
+
+#include <drm/gpu_scheduler.h>
+
+struct xe_dep_job;
+
+/** struct xe_dep_job_ops - Generic Xe dependency job operations */
+struct xe_dep_job_ops {
+	/** @run_job: Run generic Xe dependency job */
+	struct dma_fence *(*run_job)(struct xe_dep_job *job);
+	/** @free_job: Free generic Xe dependency job */
+	void (*free_job)(struct xe_dep_job *job);
+};
+
+/** struct xe_dep_job - Generic dependency Xe job */
+struct xe_dep_job {
+	/** @drm: base DRM scheduler job */
+	struct drm_sched_job drm;
+	/** @ops: dependency job operations */
+	const struct xe_dep_job_ops *ops;
+};
+
+#endif

+143

drivers/gpu/drm/xe/xe_dep_scheduler.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include <linux/slab.h> 7 + 8 + #include <drm/gpu_scheduler.h> 9 + 10 + #include "xe_dep_job_types.h" 11 + #include "xe_dep_scheduler.h" 12 + #include "xe_device_types.h" 13 + 14 + /** 15 + * DOC: Xe Dependency Scheduler 16 + * 17 + * The Xe dependency scheduler is a simple wrapper built around the DRM 18 + * scheduler to execute jobs once their dependencies are resolved (i.e., all 19 + * input fences specified as dependencies are signaled). The jobs that are 20 + * executed contain virtual functions to run (execute) and free the job, 21 + * allowing a single dependency scheduler to handle jobs performing different 22 + * operations. 23 + * 24 + * Example use cases include deferred resource freeing, TLB invalidations after 25 + * bind jobs, etc. 26 + */ 27 + 28 + /** struct xe_dep_scheduler - Generic Xe dependency scheduler */ 29 + struct xe_dep_scheduler { 30 + /** @sched: DRM GPU scheduler */ 31 + struct drm_gpu_scheduler sched; 32 + /** @entity: DRM scheduler entity */ 33 + struct drm_sched_entity entity; 34 + /** @rcu: For safe freeing of exported dma fences */ 35 + struct rcu_head rcu; 36 + }; 37 + 38 + static struct dma_fence *xe_dep_scheduler_run_job(struct drm_sched_job *drm_job) 39 + { 40 + struct xe_dep_job *dep_job = 41 + container_of(drm_job, typeof(*dep_job), drm); 42 + 43 + return dep_job->ops->run_job(dep_job); 44 + } 45 + 46 + static void xe_dep_scheduler_free_job(struct drm_sched_job *drm_job) 47 + { 48 + struct xe_dep_job *dep_job = 49 + container_of(drm_job, typeof(*dep_job), drm); 50 + 51 + dep_job->ops->free_job(dep_job); 52 + } 53 + 54 + static const struct drm_sched_backend_ops sched_ops = { 55 + .run_job = xe_dep_scheduler_run_job, 56 + .free_job = xe_dep_scheduler_free_job, 57 + }; 58 + 59 + /** 60 + * xe_dep_scheduler_create() - Generic Xe dependency scheduler create 61 + * @xe: Xe device 62 + * @submit_wq: Submit workqueue struct (can be 
NULL) 63 + * @name: Name of dependency scheduler 64 + * @job_limit: Max dependency jobs that can be scheduled 65 + * 66 + * Create a generic Xe dependency scheduler and initialize internal DRM 67 + * scheduler objects. 68 + * 69 + * Return: Generic Xe dependency scheduler object on success, ERR_PTR on failure 70 + */ 71 + struct xe_dep_scheduler * 72 + xe_dep_scheduler_create(struct xe_device *xe, 73 + struct workqueue_struct *submit_wq, 74 + const char *name, u32 job_limit) 75 + { 76 + struct xe_dep_scheduler *dep_scheduler; 77 + struct drm_gpu_scheduler *sched; 78 + const struct drm_sched_init_args args = { 79 + .ops = &sched_ops, 80 + .submit_wq = submit_wq, 81 + .num_rqs = 1, 82 + .credit_limit = job_limit, 83 + .timeout = MAX_SCHEDULE_TIMEOUT, 84 + .name = name, 85 + .dev = xe->drm.dev, 86 + }; 87 + int err; 88 + 89 + dep_scheduler = kzalloc(sizeof(*dep_scheduler), GFP_KERNEL); 90 + if (!dep_scheduler) 91 + return ERR_PTR(-ENOMEM); 92 + 93 + err = drm_sched_init(&dep_scheduler->sched, &args); 94 + if (err) 95 + goto err_free; 96 + 97 + sched = &dep_scheduler->sched; 98 + err = drm_sched_entity_init(&dep_scheduler->entity, 0, &sched, 1, NULL); 99 + if (err) 100 + goto err_sched; 101 + 102 + init_rcu_head(&dep_scheduler->rcu); 103 + 104 + return dep_scheduler; 105 + 106 + err_sched: 107 + drm_sched_fini(&dep_scheduler->sched); 108 + err_free: 109 + kfree(dep_scheduler); 110 + 111 + return ERR_PTR(err); 112 + } 113 + 114 + /** 115 + * xe_dep_scheduler_fini() - Generic Xe dependency scheduler finalize 116 + * @dep_scheduler: Generic Xe dependency scheduler object 117 + * 118 + * Finalize internal DRM scheduler objects and free generic Xe dependency 119 + * scheduler object 120 + */ 121 + void xe_dep_scheduler_fini(struct xe_dep_scheduler *dep_scheduler) 122 + { 123 + drm_sched_entity_fini(&dep_scheduler->entity); 124 + drm_sched_fini(&dep_scheduler->sched); 125 + /* 126 + * RCU free due to sched being exported via DRM scheduler fences 127 + * (timeline name). 
128 + */ 129 + kfree_rcu(dep_scheduler, rcu); 130 + } 131 + 132 + /** 133 + * xe_dep_scheduler_entity() - Retrieve a generic Xe dependency scheduler 134 + * DRM scheduler entity 135 + * @dep_scheduler: Generic Xe dependency scheduler object 136 + * 137 + * Return: The generic Xe dependency scheduler's DRM scheduler entity 138 + */ 139 + struct drm_sched_entity * 140 + xe_dep_scheduler_entity(struct xe_dep_scheduler *dep_scheduler) 141 + { 142 + return &dep_scheduler->entity; 143 + }
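
The run_job/free_job trampolines in xe_dep_scheduler.c recover the wrapping job from the embedded DRM scheduler job with container_of() and then dispatch through the job's ops table, which is what lets one scheduler serve jobs of different kinds. The following is only a userspace sketch of that pattern. Every `demo_*` name is a hypothetical stand-in invented for illustration (none of them exist in the kernel), and `demo_container_of` is a simplified substitute for the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of() (assumption). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Plays the role of struct drm_sched_job: the object the scheduler core sees. */
struct demo_base_job {
	int id;
};

struct demo_dep_job;

/* Per-job virtual functions, analogous to the ops referenced by xe_dep_job. */
struct demo_dep_job_ops {
	int (*run_job)(struct demo_dep_job *job);
	void (*free_job)(struct demo_dep_job *job);
};

/* Wrapper job embedding the base job, like the drm member of xe_dep_job. */
struct demo_dep_job {
	const struct demo_dep_job_ops *ops;
	struct demo_base_job base;
	int payload;
	int freed;
};

/* Scheduler-facing callback: map base job back to its wrapper, then dispatch. */
static int demo_run_trampoline(struct demo_base_job *base)
{
	struct demo_dep_job *job =
		demo_container_of(base, struct demo_dep_job, base);

	assert(job->ops != NULL);
	return job->ops->run_job(job);
}

static void demo_free_trampoline(struct demo_base_job *base)
{
	struct demo_dep_job *job =
		demo_container_of(base, struct demo_dep_job, base);

	assert(job->ops != NULL);
	job->ops->free_job(job);
}

/* One concrete job type: "running" it doubles the payload. */
static int demo_double_run(struct demo_dep_job *job)
{
	return job->payload * 2;
}

/* A real free_job would release memory; the sketch only records the call. */
static void demo_mark_freed(struct demo_dep_job *job)
{
	job->freed = 1;
}

static const struct demo_dep_job_ops demo_double_ops = {
	.run_job = demo_double_run,
	.free_job = demo_mark_freed,
};
```

A second job type would only need its own ops table; the trampolines stay unchanged, which mirrors how the single `sched_ops` backend in the patch handles all dependency jobs.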

+21

drivers/gpu/drm/xe/xe_dep_scheduler.h

··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include <linux/types.h> 7 + 8 + struct drm_sched_entity; 9 + struct workqueue_struct; 10 + struct xe_dep_scheduler; 11 + struct xe_device; 12 + 13 + struct xe_dep_scheduler * 14 + xe_dep_scheduler_create(struct xe_device *xe, 15 + struct workqueue_struct *submit_wq, 16 + const char *name, u32 job_limit); 17 + 18 + void xe_dep_scheduler_fini(struct xe_dep_scheduler *dep_scheduler); 19 + 20 + struct drm_sched_entity * 21 + xe_dep_scheduler_entity(struct xe_dep_scheduler *dep_scheduler);

+101 -11

drivers/gpu/drm/xe/xe_device.c

··· 54 54 #include "xe_pcode.h" 55 55 #include "xe_pm.h" 56 56 #include "xe_pmu.h" 57 + #include "xe_psmi.h" 57 58 #include "xe_pxp.h" 58 59 #include "xe_query.h" 59 60 #include "xe_shrinker.h" ··· 64 63 #include "xe_ttm_stolen_mgr.h" 65 64 #include "xe_ttm_sys_mgr.h" 66 65 #include "xe_vm.h" 66 + #include "xe_vm_madvise.h" 67 67 #include "xe_vram.h" 68 + #include "xe_vram_types.h" 68 69 #include "xe_vsec.h" 69 70 #include "xe_wait_user_fence.h" 70 71 #include "xe_wa.h" ··· 203 200 DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl, 204 201 DRM_RENDER_ALLOW), 205 202 DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW), 203 + DRM_IOCTL_DEF_DRV(XE_MADVISE, xe_vm_madvise_ioctl, DRM_RENDER_ALLOW), 204 + DRM_IOCTL_DEF_DRV(XE_VM_QUERY_MEM_RANGE_ATTRS, xe_vm_query_vmas_attrs_ioctl, 205 + DRM_RENDER_ALLOW), 206 206 }; 207 207 208 208 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg) ··· 694 688 } 695 689 } 696 690 691 + static int xe_device_vram_alloc(struct xe_device *xe) 692 + { 693 + struct xe_vram_region *vram; 694 + 695 + if (!IS_DGFX(xe)) 696 + return 0; 697 + 698 + vram = drmm_kzalloc(&xe->drm, sizeof(*vram), GFP_KERNEL); 699 + if (!vram) 700 + return -ENOMEM; 701 + 702 + xe->mem.vram = vram; 703 + return 0; 704 + } 705 + 697 706 /** 698 707 * xe_device_probe_early: Device early probe 699 708 * @xe: xe device instance ··· 743 722 * possible, but still return the previous error for error 744 723 * propagation 745 724 */ 746 - err = xe_survivability_mode_enable(xe); 725 + err = xe_survivability_mode_boot_enable(xe); 747 726 if (err) 748 727 return err; 749 728 ··· 755 734 return err; 756 735 757 736 xe->wedged.mode = xe_modparam.wedged_mode; 737 + 738 + err = xe_device_vram_alloc(xe); 739 + if (err) 740 + return err; 758 741 759 742 return 0; 760 743 } ··· 888 863 } 889 864 890 865 if (xe->tiles->media_gt && 891 - XE_WA(xe->tiles->media_gt, 15015404425_disable)) 866 + XE_GT_WA(xe->tiles->media_gt, 
15015404425_disable)) 892 867 XE_DEVICE_WA_DISABLE(xe, 15015404425); 893 868 894 869 err = xe_devcoredump_init(xe); ··· 910 885 return err; 911 886 912 887 err = xe_pxp_init(xe); 888 + if (err) 889 + return err; 890 + 891 + err = xe_psmi_init(xe); 913 892 if (err) 914 893 return err; 915 894 ··· 949 920 xe_gt_sanitize_freq(gt); 950 921 951 922 xe_vsec_init(xe); 923 + 924 + err = xe_sriov_late_init(xe); 925 + if (err) 926 + goto err_unregister_display; 952 927 953 928 return devm_add_action_or_reset(xe->drm.dev, xe_device_sanitize, xe); 954 929 ··· 1052 1019 1053 1020 gt = xe_root_mmio_gt(xe); 1054 1021 1055 - if (!XE_WA(gt, 16023588340)) 1022 + if (!XE_GT_WA(gt, 16023588340)) 1056 1023 return; 1057 1024 1058 1025 fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); ··· 1096 1063 return; 1097 1064 1098 1065 root_gt = xe_root_mmio_gt(xe); 1099 - if (XE_WA(root_gt, 16023588340)) { 1066 + if (XE_GT_WA(root_gt, 16023588340)) { 1100 1067 /* A transient flush is not sufficient: flush the L2 */ 1101 1068 xe_device_l2_flush(xe); 1102 1069 } else { ··· 1167 1134 } 1168 1135 1169 1136 /** 1137 + * DOC: Xe Device Wedging 1138 + * 1139 + * Xe driver uses drm device wedged uevent as documented in Documentation/gpu/drm-uapi.rst. 1140 + * When device is in wedged state, every IOCTL will be blocked and GT cannot be 1141 + * used. Certain critical errors like gt reset failure, firmware failures can cause 1142 + * the device to be wedged. The default recovery method for a wedged state 1143 + * is rebind/bus-reset. 1144 + * 1145 + * Another recovery method is vendor-specific. Below are the cases that send 1146 + * ``WEDGED=vendor-specific`` recovery method in drm device wedged uevent. 
1147 + * 1148 + * Case: Firmware Flash 1149 + * -------------------- 1150 + * 1151 + * Identification Hint 1152 + * +++++++++++++++++++ 1153 + * 1154 + * ``WEDGED=vendor-specific`` drm device wedged uevent with 1155 + * :ref:`Runtime Survivability mode <xe-survivability-mode>` is used to notify 1156 + * admin/userspace consumer about the need for a firmware flash. 1157 + * 1158 + * Recovery Procedure 1159 + * ++++++++++++++++++ 1160 + * 1161 + * Once ``WEDGED=vendor-specific`` drm device wedged uevent is received, follow 1162 + * the below steps 1163 + * 1164 + * - Check Runtime Survivability mode sysfs. 1165 + * If enabled, firmware flash is required to recover the device. 1166 + * 1167 + * /sys/bus/pci/devices/<device>/survivability_mode 1168 + * 1169 + * - Admin/userspace consumer can use firmware flashing tools like fwupd to flash 1170 + * firmware and restore device to normal operation. 1171 + */ 1172 + 1173 + /** 1174 + * xe_device_set_wedged_method - Set wedged recovery method 1175 + * @xe: xe device instance 1176 + * @method: recovery method to set 1177 + * 1178 + * Set wedged recovery method to be sent in drm wedged uevent. 1179 + */ 1180 + void xe_device_set_wedged_method(struct xe_device *xe, unsigned long method) 1181 + { 1182 + xe->wedged.method = method; 1183 + } 1184 + 1185 + /** 1170 1186 * xe_device_declare_wedged - Declare device wedged 1171 1187 * @xe: xe device instance 1172 1188 * 1173 - * This is a final state that can only be cleared with a module 1174 - * re-probe (unbind + bind). 1189 + * This is a final state that can only be cleared with the recovery method 1190 + * specified in the drm wedged uevent. The method can be set using 1191 + * xe_device_set_wedged_method before declaring the device as wedged. If no method 1192 + * is set, reprobe (unbind/re-bind) will be sent by default. 1193 + * 1175 1194 * In this state every IOCTL will be blocked so the GT cannot be used. 
1176 1195 * In general it will be called upon any critical error such as gt reset 1177 1196 * failure or guc loading failure. Userspace will be notified of this state ··· 1257 1172 "IOCTLs and executions are blocked. Only a rebind may clear the failure\n" 1258 1173 "Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n", 1259 1174 dev_name(xe->drm.dev)); 1260 - 1261 - /* Notify userspace of wedged device */ 1262 - drm_dev_wedged_event(&xe->drm, 1263 - DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET, 1264 - NULL); 1265 1175 } 1266 1176 1267 1177 for_each_gt(gt, xe, id) 1268 1178 xe_gt_declare_wedged(gt); 1179 + 1180 + if (xe_device_wedged(xe)) { 1181 + /* If no wedge recovery method is set, use default */ 1182 + if (!xe->wedged.method) 1183 + xe_device_set_wedged_method(xe, DRM_WEDGE_RECOVERY_REBIND | 1184 + DRM_WEDGE_RECOVERY_BUS_RESET); 1185 + 1186 + /* Notify userspace of wedged device */ 1187 + drm_dev_wedged_event(&xe->drm, xe->wedged.method, NULL); 1188 + } 1269 1189 }
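
On the consumer side, the uevent carries the methods as ``WEDGED=<method1>[,..,<methodN>]``, so with the default set by this patch a consumer would expect to see ``WEDGED=rebind,bus-reset``. A minimal sketch of parsing that environment string into a bitmask is shown here. The `demo_*` names and the bit values are assumptions made for illustration; they are not the kernel's `DRM_WEDGE_RECOVERY_*` values and not part of any existing userspace library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical consumer-side flags; values are arbitrary for this sketch. */
enum demo_wedge_method {
	DEMO_WEDGE_NONE            = 1 << 0,
	DEMO_WEDGE_REBIND          = 1 << 1,
	DEMO_WEDGE_BUS_RESET       = 1 << 2,
	DEMO_WEDGE_VENDOR_SPECIFIC = 1 << 3,
	DEMO_WEDGE_UNKNOWN         = 1 << 4,
};

/* Map one token of length len (not NUL-terminated) to a flag, 0 if unrecognized. */
static unsigned int demo_method_from_token(const char *tok, size_t len)
{
	static const struct {
		const char *name;
		unsigned int bit;
	} table[] = {
		{ "none", DEMO_WEDGE_NONE },
		{ "rebind", DEMO_WEDGE_REBIND },
		{ "bus-reset", DEMO_WEDGE_BUS_RESET },
		{ "vendor-specific", DEMO_WEDGE_VENDOR_SPECIFIC },
		{ "unknown", DEMO_WEDGE_UNKNOWN },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strlen(table[i].name) == len &&
		    !strncmp(table[i].name, tok, len))
			return table[i].bit;
	}

	return 0;
}

/*
 * Parse a single uevent environment entry. Returns the mask of recognized
 * methods, or 0 if the entry is not a WEDGED= entry. Unrecognized tokens
 * are skipped rather than treated as errors.
 */
static unsigned int demo_parse_wedged(const char *env)
{
	static const char prefix[] = "WEDGED=";
	unsigned int mask = 0;
	const char *p;

	if (strncmp(env, prefix, sizeof(prefix) - 1))
		return 0;

	p = env + sizeof(prefix) - 1;
	while (*p) {
		size_t len = strcspn(p, ",");

		mask |= demo_method_from_token(p, len);
		p += len;
		if (*p == ',')
			p++;
	}

	return mask;
}
```

A consumer that sees `DEMO_WEDGE_VENDOR_SPECIFIC` in the result would then, per the Firmware Flash case described in the patch, check the survivability_mode sysfs file before deciding on a firmware flash.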

+1

drivers/gpu/drm/xe/xe_device.h

··· 187 187 return atomic_read(&xe->wedged.flag); 188 188 } 189 189 190 + void xe_device_set_wedged_method(struct xe_device *xe, unsigned long method); 190 191 void xe_device_declare_wedged(struct xe_device *xe); 191 192 192 193 struct xe_file *xe_file_get(struct xe_file *xef);

+4 -4

drivers/gpu/drm/xe/xe_device_sysfs.c

··· 76 76 { 77 77 struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev)); 78 78 struct xe_tile *root = xe_device_get_root_tile(xe); 79 - u32 cap, ver_low = FAN_TABLE, ver_high = FAN_TABLE; 79 + u32 cap = 0, ver_low = FAN_TABLE, ver_high = FAN_TABLE; 80 80 u16 major = 0, minor = 0, hotfix = 0, build = 0; 81 81 int ret; 82 82 ··· 115 115 { 116 116 struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev)); 117 117 struct xe_tile *root = xe_device_get_root_tile(xe); 118 - u32 cap, ver_low = VR_CONFIG, ver_high = VR_CONFIG; 118 + u32 cap = 0, ver_low = VR_CONFIG, ver_high = VR_CONFIG; 119 119 u16 major = 0, minor = 0, hotfix = 0, build = 0; 120 120 int ret; 121 121 ··· 153 153 { 154 154 struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev)); 155 155 struct xe_tile *root = xe_device_get_root_tile(xe); 156 - u32 cap; 156 + u32 cap = 0; 157 157 int ret; 158 158 159 159 xe_pm_runtime_get(xe); ··· 186 186 { 187 187 struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev)); 188 188 struct xe_tile *root = xe_device_get_root_tile(xe); 189 - u32 cap; 189 + u32 cap = 0; 190 190 int ret; 191 191 192 192 xe_pm_runtime_get(xe);

+25 -61

drivers/gpu/drm/xe/xe_device_types.h

··· 10 10 11 11 #include <drm/drm_device.h> 12 12 #include <drm/drm_file.h> 13 - #include <drm/drm_pagemap.h> 14 13 #include <drm/ttm/ttm_device.h> 15 14 16 15 #include "xe_devcoredump_types.h" ··· 23 24 #include "xe_sriov_pf_types.h" 24 25 #include "xe_sriov_types.h" 25 26 #include "xe_sriov_vf_types.h" 27 + #include "xe_sriov_vf_ccs_types.h" 26 28 #include "xe_step_types.h" 27 29 #include "xe_survivability_mode_types.h" 28 - #include "xe_ttm_vram_mgr_types.h" 29 30 30 31 #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 31 32 #define TEST_VM_OPS_ERROR ··· 38 39 struct xe_i2c; 39 40 struct xe_pat_ops; 40 41 struct xe_pxp; 42 + struct xe_vram_region; 41 43 42 44 #define XE_BO_INVALID_OFFSET LONG_MAX 43 45 ··· 70 70 _Generic(tile__, \ 71 71 const struct xe_tile * : (const struct xe_device *)((tile__)->xe), \ 72 72 struct xe_tile * : (tile__)->xe) 73 - 74 - /** 75 - * struct xe_vram_region - memory region structure 76 - * This is used to describe a memory region in xe 77 - * device, such as HBM memory or CXL extension memory. 78 - */ 79 - struct xe_vram_region { 80 - /** @io_start: IO start address of this VRAM instance */ 81 - resource_size_t io_start; 82 - /** 83 - * @io_size: IO size of this VRAM instance 84 - * 85 - * This represents how much of this VRAM we can access 86 - * via the CPU through the VRAM BAR. This can be smaller 87 - * than @usable_size, in which case only part of VRAM is CPU 88 - * accessible (typically the first 256M). This 89 - * configuration is known as small-bar. 
90 - */ 91 - resource_size_t io_size; 92 - /** @dpa_base: This memory regions's DPA (device physical address) base */ 93 - resource_size_t dpa_base; 94 - /** 95 - * @usable_size: usable size of VRAM 96 - * 97 - * Usable size of VRAM excluding reserved portions 98 - * (e.g stolen mem) 99 - */ 100 - resource_size_t usable_size; 101 - /** 102 - * @actual_physical_size: Actual VRAM size 103 - * 104 - * Actual VRAM size including reserved portions 105 - * (e.g stolen mem) 106 - */ 107 - resource_size_t actual_physical_size; 108 - /** @mapping: pointer to VRAM mappable space */ 109 - void __iomem *mapping; 110 - /** @ttm: VRAM TTM manager */ 111 - struct xe_ttm_vram_mgr ttm; 112 - #if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP) 113 - /** @pagemap: Used to remap device memory as ZONE_DEVICE */ 114 - struct dev_pagemap pagemap; 115 - /** 116 - * @dpagemap: The struct drm_pagemap of the ZONE_DEVICE memory 117 - * pages of this tile. 118 - */ 119 - struct drm_pagemap dpagemap; 120 - /** 121 - * @hpa_base: base host physical address 122 - * 123 - * This is generated when remap device memory as ZONE_DEVICE 124 - */ 125 - resource_size_t hpa_base; 126 - #endif 127 - }; 128 73 129 74 /** 130 75 * struct xe_mmio - register mmio structure ··· 161 216 * Although VRAM is associated with a specific tile, it can 162 217 * still be accessed by all tiles' GTs. 163 218 */ 164 - struct xe_vram_region vram; 219 + struct xe_vram_region *vram; 165 220 166 221 /** @mem.ggtt: Global graphics translation table */ 167 222 struct xe_ggtt *ggtt; ··· 183 238 struct { 184 239 /** @sriov.vf.ggtt_balloon: GGTT regions excluded from use. */ 185 240 struct xe_ggtt_node *ggtt_balloon[2]; 241 + 242 + /** @sriov.vf.ccs: CCS read and write contexts for VF. */ 243 + struct xe_tile_vf_ccs ccs[XE_SRIOV_VF_CCS_CTX_COUNT]; 186 244 } vf; 187 245 } sriov; 188 246 189 247 /** @memirq: Memory Based Interrupts. 
*/ 190 248 struct xe_memirq memirq; 249 + 250 + /** @csc_hw_error_work: worker to report CSC HW errors */ 251 + struct work_struct csc_hw_error_work; 191 252 192 253 /** @pcode: tile's PCODE */ 193 254 struct { ··· 206 255 207 256 /** @sysfs: sysfs' kobj used by xe_tile_sysfs */ 208 257 struct kobject *sysfs; 258 + 259 + /** @debugfs: debugfs directory associated with this tile */ 260 + struct dentry *debugfs; 209 261 }; 210 262 211 263 /** ··· 290 336 u8 has_mbx_power_limits:1; 291 337 /** @info.has_pxp: Device has PXP support */ 292 338 u8 has_pxp:1; 293 - /** @info.has_range_tlb_invalidation: Has range based TLB invalidations */ 294 - u8 has_range_tlb_invalidation:1; 339 + /** @info.has_range_tlb_inval: Has range based TLB invalidations */ 340 + u8 has_range_tlb_inval:1; 295 341 /** @info.has_sriov: Supports SR-IOV */ 296 342 u8 has_sriov:1; 297 343 /** @info.has_usm: Device has unified shared memory support */ ··· 366 412 /** @mem: memory info for device */ 367 413 struct { 368 414 /** @mem.vram: VRAM info for device */ 369 - struct xe_vram_region vram; 415 + struct xe_vram_region *vram; 370 416 /** @mem.sys_mgr: system TTM manager */ 371 417 struct ttm_resource_manager sys_mgr; 372 418 /** @mem.sys_mgr: system memory shrinker. */ ··· 544 590 atomic_t flag; 545 591 /** @wedged.mode: Mode controlled by kernel parameter and debugfs */ 546 592 int mode; 593 + /** @wedged.method: Recovery method to be sent in the drm device wedged uevent */ 594 + unsigned long method; 547 595 } wedged; 548 596 549 597 /** @bo_device: Struct to control async free of BOs */ ··· 580 624 */ 581 625 atomic64_t global_total_pages; 582 626 #endif 627 + 628 + /** @psmi: GPU debugging via additional validation HW */ 629 + struct { 630 + /** @psmi.capture_obj: PSMI buffer for VRAM */ 631 + struct xe_bo *capture_obj[XE_MAX_TILES_PER_DEVICE + 1]; 632 + /** @psmi.region_mask: Mask of valid memory regions */ 633 + u8 region_mask; 634 + } psmi; 583 635 584 636 /* private: */ 585 637

+2 -2

drivers/gpu/drm/xe/xe_eu_stall.c

··· 649 649 return -ETIMEDOUT; 650 650 } 651 651 652 - if (XE_WA(gt, 22016596838)) 652 + if (XE_GT_WA(gt, 22016596838)) 653 653 xe_gt_mcr_multicast_write(gt, ROW_CHICKEN2, 654 654 _MASKED_BIT_ENABLE(DISABLE_DOP_GATING)); 655 655 ··· 805 805 806 806 cancel_delayed_work_sync(&stream->buf_poll_work); 807 807 808 - if (XE_WA(gt, 22016596838)) 808 + if (XE_GT_WA(gt, 22016596838)) 809 809 xe_gt_mcr_multicast_write(gt, ROW_CHICKEN2, 810 810 _MASKED_BIT_DISABLE(DISABLE_DOP_GATING)); 811 811

+111

drivers/gpu/drm/xe/xe_exec_queue.c

··· 12 12 #include <drm/drm_file.h> 13 13 #include <uapi/drm/xe_drm.h> 14 14 15 + #include "xe_dep_scheduler.h" 15 16 #include "xe_device.h" 16 17 #include "xe_gt.h" 17 18 #include "xe_hw_engine_class_sysfs.h" ··· 40 39 41 40 static void __xe_exec_queue_free(struct xe_exec_queue *q) 42 41 { 42 + int i; 43 + 44 + for (i = 0; i < XE_EXEC_QUEUE_TLB_INVAL_COUNT; ++i) 45 + if (q->tlb_inval[i].dep_scheduler) 46 + xe_dep_scheduler_fini(q->tlb_inval[i].dep_scheduler); 47 + 43 48 if (xe_exec_queue_uses_pxp(q)) 44 49 xe_pxp_exec_queue_remove(gt_to_xe(q->gt)->pxp, q); 45 50 if (q->vm) ··· 55 48 xe_file_put(q->xef); 56 49 57 50 kfree(q); 51 + } 52 + 53 + static int alloc_dep_schedulers(struct xe_device *xe, struct xe_exec_queue *q) 54 + { 55 + struct xe_tile *tile = gt_to_tile(q->gt); 56 + int i; 57 + 58 + for (i = 0; i < XE_EXEC_QUEUE_TLB_INVAL_COUNT; ++i) { 59 + struct xe_dep_scheduler *dep_scheduler; 60 + struct xe_gt *gt; 61 + struct workqueue_struct *wq; 62 + 63 + if (i == XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT) 64 + gt = tile->primary_gt; 65 + else 66 + gt = tile->media_gt; 67 + 68 + if (!gt) 69 + continue; 70 + 71 + wq = gt->tlb_inval.job_wq; 72 + 73 + #define MAX_TLB_INVAL_JOBS 16 /* Picking a reasonable value */ 74 + dep_scheduler = xe_dep_scheduler_create(xe, wq, q->name, 75 + MAX_TLB_INVAL_JOBS); 76 + if (IS_ERR(dep_scheduler)) 77 + return PTR_ERR(dep_scheduler); 78 + 79 + q->tlb_inval[i].dep_scheduler = dep_scheduler; 80 + } 81 + #undef MAX_TLB_INVAL_JOBS 82 + 83 + return 0; 58 84 } 59 85 60 86 static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe, ··· 133 93 q->sched_props.priority = XE_EXEC_QUEUE_PRIORITY_KERNEL; 134 94 else 135 95 q->sched_props.priority = XE_EXEC_QUEUE_PRIORITY_NORMAL; 96 + 97 + if (q->flags & (EXEC_QUEUE_FLAG_MIGRATE | EXEC_QUEUE_FLAG_VM)) { 98 + err = alloc_dep_schedulers(xe, q); 99 + if (err) { 100 + __xe_exec_queue_free(q); 101 + return ERR_PTR(err); 102 + } 103 + } 136 104 137 105 if (vm) 138 106 q->vm = xe_vm_get(vm); ··· 
790 742 } 791 743 792 744 /** 745 + * xe_exec_queue_lrc() - Get the LRC from exec queue. 746 + * @q: The exec_queue. 747 + * 748 + * Retrieves the primary LRC for the exec queue. Note that this function 749 + * returns only the first LRC instance, even when multiple parallel LRCs 750 + * are configured. 751 + * 752 + * Return: Pointer to LRC on success, error on failure 753 + */ 754 + struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q) 755 + { 756 + return q->lrc[0]; 757 + } 758 + 759 + /** 793 760 * xe_exec_queue_is_lr() - Whether an exec_queue is long-running 794 761 * @q: The exec_queue 795 762 * ··· 1090 1027 } 1091 1028 1092 1029 return err; 1030 + } 1031 + 1032 + /** 1033 + * xe_exec_queue_contexts_hwsp_rebase - Re-compute GGTT references 1034 + * within all LRCs of a queue. 1035 + * @q: the &xe_exec_queue struct instance containing target LRCs 1036 + * @scratch: scratch buffer to be used as temporary storage 1037 + * 1038 + * Returns: zero on success, negative error code on failure 1039 + */ 1040 + int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch) 1041 + { 1042 + int i; 1043 + int err = 0; 1044 + 1045 + for (i = 0; i < q->width; ++i) { 1046 + xe_lrc_update_memirq_regs_with_address(q->lrc[i], q->hwe, scratch); 1047 + xe_lrc_update_hwctx_regs_with_address(q->lrc[i]); 1048 + err = xe_lrc_setup_wa_bb_with_scratch(q->lrc[i], q->hwe, scratch); 1049 + if (err) 1050 + break; 1051 + } 1052 + 1053 + return err; 1054 + } 1055 + 1056 + /** 1057 + * xe_exec_queue_jobs_ring_restore - Re-emit ring commands of requests pending on given queue. 1058 + * @q: the &xe_exec_queue struct instance 1059 + */ 1060 + void xe_exec_queue_jobs_ring_restore(struct xe_exec_queue *q) 1061 + { 1062 + struct xe_gpu_scheduler *sched = &q->guc->sched; 1063 + struct xe_sched_job *job; 1064 + 1065 + /* 1066 + * This routine is used within VF migration recovery. 
This means 1067 + * using the lock here introduces a restriction: we cannot wait 1068 + * for any GFX HW response while the lock is taken. 1069 + */ 1070 + spin_lock(&sched->base.job_list_lock); 1071 + list_for_each_entry(job, &sched->base.pending_list, drm.list) { 1072 + if (xe_sched_job_is_error(job)) 1073 + continue; 1074 + 1075 + q->ring_ops->emit_job(job); 1076 + } 1077 + spin_unlock(&sched->base.job_list_lock); 1093 1078 }

+5

drivers/gpu/drm/xe/xe_exec_queue.h

··· 90 90 struct xe_vm *vm); 91 91 void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q); 92 92 93 + int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch); 94 + 95 + void xe_exec_queue_jobs_ring_restore(struct xe_exec_queue *q); 96 + 97 + struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q); 93 98 #endif

+15

drivers/gpu/drm/xe/xe_exec_queue_types.h

··· 87 87 #define EXEC_QUEUE_FLAG_HIGH_PRIORITY BIT(4) 88 88 /* flag to indicate low latency hint to guc */ 89 89 #define EXEC_QUEUE_FLAG_LOW_LATENCY BIT(5) 90 + /* for migration (kernel copy, clear, bind) jobs */ 91 + #define EXEC_QUEUE_FLAG_MIGRATE BIT(6) 90 92 91 93 /** 92 94 * @flags: flags for this exec queue, should statically setup aside from ban ··· 133 131 /** @lr.link: link into VM's list of exec queues */ 134 132 struct list_head link; 135 133 } lr; 134 + 135 + #define XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT 0 136 + #define XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT 1 137 + #define XE_EXEC_QUEUE_TLB_INVAL_COUNT (XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT + 1) 138 + 139 + /** @tlb_inval: TLB invalidations exec queue state */ 140 + struct { 141 + /** 142 + * @tlb_inval.dep_scheduler: The TLB invalidation 143 + * dependency scheduler 144 + */ 145 + struct xe_dep_scheduler *dep_scheduler; 146 + } tlb_inval[XE_EXEC_QUEUE_TLB_INVAL_COUNT]; 136 147 137 148 /** @pxp: PXP info tracking */ 138 149 struct {

+9 -1

drivers/gpu/drm/xe/xe_gen_wa_oob.c

··· 123 123 return 0; 124 124 } 125 125 126 + /* Avoid GNU vs POSIX basename() discrepancy, just use our own */ 127 + static const char *xbasename(const char *s) 128 + { 129 + const char *p = strrchr(s, '/'); 130 + 131 + return p ? p + 1 : s; 132 + } 133 + 126 134 static int fn_to_prefix(const char *fn, char *prefix, size_t size) 127 135 { 128 136 size_t len; 129 137 130 - fn = basename(fn); 138 + fn = xbasename(fn); 131 139 len = strlen(fn); 132 140 133 141 if (len > size - 1)
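
The xbasename() helper added to xe_gen_wa_oob.c is small enough to exercise in isolation. The sketch below copies its body as shown in the diff; the input strings used with it are examples chosen for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Same logic as the helper in the diff: return what follows the last '/'. */
static const char *xbasename(const char *s)
{
	const char *p;

	assert(s != NULL);
	p = strrchr(s, '/');

	return p ? p + 1 : s;
}
```

With this definition a path with directories yields its last component, a bare file name is returned unchanged, and a path ending in '/' yields an empty string. The helper never modifies its argument, which is one of the points on which GNU and POSIX basename() differ.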

+7 -8

drivers/gpu/drm/xe/xe_ggtt.c

··· 23 23 #include "xe_device.h" 24 24 #include "xe_gt.h" 25 25 #include "xe_gt_printk.h" 26 - #include "xe_gt_tlb_invalidation.h" 27 26 #include "xe_map.h" 28 27 #include "xe_mmio.h" 29 28 #include "xe_pm.h" 30 29 #include "xe_res_cursor.h" 31 30 #include "xe_sriov.h" 32 31 #include "xe_tile_sriov_vf.h" 32 + #include "xe_tlb_inval.h" 33 33 #include "xe_wa.h" 34 34 #include "xe_wopcm.h" 35 35 ··· 106 106 static void ggtt_update_access_counter(struct xe_ggtt *ggtt) 107 107 { 108 108 struct xe_tile *tile = ggtt->tile; 109 - struct xe_gt *affected_gt = XE_WA(tile->primary_gt, 22019338487) ? 109 + struct xe_gt *affected_gt = XE_GT_WA(tile->primary_gt, 22019338487) ? 110 110 tile->primary_gt : tile->media_gt; 111 111 struct xe_mmio *mmio = &affected_gt->mmio; 112 - u32 max_gtt_writes = XE_WA(ggtt->tile->primary_gt, 22019338487) ? 1100 : 63; 112 + u32 max_gtt_writes = XE_GT_WA(ggtt->tile->primary_gt, 22019338487) ? 1100 : 63; 113 113 /* 114 114 * Wa_22019338487: GMD_ID is a RO register, a dummy write forces gunit 115 115 * to wait for completion of prior GTT writes before letting this through. ··· 284 284 285 285 if (GRAPHICS_VERx100(xe) >= 1270) 286 286 ggtt->pt_ops = (ggtt->tile->media_gt && 287 - XE_WA(ggtt->tile->media_gt, 22019338487)) || 288 - XE_WA(ggtt->tile->primary_gt, 22019338487) ? 287 + XE_GT_WA(ggtt->tile->media_gt, 22019338487)) || 288 + XE_GT_WA(ggtt->tile->primary_gt, 22019338487) ? 289 289 &xelpg_pt_wa_ops : &xelpg_pt_ops; 290 290 else 291 291 ggtt->pt_ops = &xelp_pt_ops; ··· 438 438 if (!gt) 439 439 return; 440 440 441 - err = xe_gt_tlb_invalidation_ggtt(gt); 442 - if (err) 443 - drm_warn(&gt_to_xe(gt)->drm, "xe_gt_tlb_invalidation_ggtt error=%d", err); 441 + err = xe_tlb_inval_ggtt(&gt->tlb_inval); 442 + xe_gt_WARN(gt, err, "Failed to invalidate GGTT (%pe)", ERR_PTR(err)); 444 443 } 445 444 446 445 static void xe_ggtt_invalidate(struct xe_ggtt *ggtt)

+13

drivers/gpu/drm/xe/xe_gpu_scheduler.c

··· 101 101 cancel_work_sync(&sched->work_process_msg); 102 102 } 103 103 104 + /** 105 + * xe_sched_submission_stop_async - Stop further runs of submission tasks on a scheduler. 106 + * @sched: the &xe_gpu_scheduler struct instance 107 + * 108 + * This call disables further runs of scheduling work queue. It does not wait 109 + * for any in-progress runs to finish, only makes sure no further runs happen 110 + * afterwards. 111 + */ 112 + void xe_sched_submission_stop_async(struct xe_gpu_scheduler *sched) 113 + { 114 + drm_sched_wqueue_stop(&sched->base); 115 + } 116 + 104 117 void xe_sched_submission_resume_tdr(struct xe_gpu_scheduler *sched) 105 118 { 106 119 drm_sched_resume_timeout(&sched->base, sched->base.timeout);

+1

drivers/gpu/drm/xe/xe_gpu_scheduler.h

··· 21 21 22 22 void xe_sched_submission_start(struct xe_gpu_scheduler *sched); 23 23 void xe_sched_submission_stop(struct xe_gpu_scheduler *sched); 24 + void xe_sched_submission_stop_async(struct xe_gpu_scheduler *sched); 24 25 25 26 void xe_sched_submission_resume_tdr(struct xe_gpu_scheduler *sched); 26 27

+3 -3

drivers/gpu/drm/xe/xe_gsc.c

··· 266 266 unsigned int fw_ref; 267 267 int ret; 268 268 269 - if (XE_WA(tile->primary_gt, 14018094691)) { 269 + if (XE_GT_WA(tile->primary_gt, 14018094691)) { 270 270 fw_ref = xe_force_wake_get(gt_to_fw(tile->primary_gt), XE_FORCEWAKE_ALL); 271 271 272 272 /* ··· 281 281 282 282 ret = gsc_upload(gsc); 283 283 284 - if (XE_WA(tile->primary_gt, 14018094691)) 284 + if (XE_GT_WA(tile->primary_gt, 14018094691)) 285 285 xe_force_wake_put(gt_to_fw(tile->primary_gt), fw_ref); 286 286 287 287 if (ret) ··· 593 593 u32 gs1_clr = prep ? 0 : HECI_H_GS1_ER_PREP; 594 594 595 595 /* WA only applies if the GSC is loaded */ 596 - if (!XE_WA(gt, 14015076503) || !gsc_fw_is_loaded(gt)) 596 + if (!XE_GT_WA(gt, 14015076503) || !gsc_fw_is_loaded(gt)) 597 597 return; 598 598 599 599 xe_mmio_rmw32(&gt->mmio, HECI_H_GS1(MTL_GSC_HECI2_BASE), gs1_clr, gs1_set);

+21 -13

drivers/gpu/drm/xe/xe_gt.c

··· 37 37 #include "xe_gt_sriov_pf.h" 38 38 #include "xe_gt_sriov_vf.h" 39 39 #include "xe_gt_sysfs.h" 40 - #include "xe_gt_tlb_invalidation.h" 41 40 #include "xe_gt_topology.h" 42 41 #include "xe_guc_exec_queue_types.h" 43 42 #include "xe_guc_pc.h" 43 + #include "xe_guc_submit.h" 44 44 #include "xe_hw_fence.h" 45 45 #include "xe_hw_engine_class_sysfs.h" 46 46 #include "xe_irq.h" ··· 57 57 #include "xe_sa.h" 58 58 #include "xe_sched_job.h" 59 59 #include "xe_sriov.h" 60 + #include "xe_tlb_inval.h" 60 61 #include "xe_tuning.h" 61 62 #include "xe_uc.h" 62 63 #include "xe_uc_fw.h" ··· 106 105 unsigned int fw_ref; 107 106 u32 reg; 108 107 109 - if (!XE_WA(gt, 16023588340)) 108 + if (!XE_GT_WA(gt, 16023588340)) 110 109 return; 111 110 112 111 fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); ··· 128 127 unsigned int fw_ref; 129 128 u32 reg; 130 129 131 - if (!XE_WA(gt, 16023588340)) 130 + if (!XE_GT_WA(gt, 16023588340)) 132 131 return; 133 132 134 133 if (xe_gt_is_media_type(gt)) ··· 400 399 401 400 xe_reg_sr_init(&gt->reg_sr, "GT", gt_to_xe(gt)); 402 401 403 - err = xe_wa_init(gt); 402 + err = xe_wa_gt_init(gt); 404 403 if (err) 405 404 return err; 406 405 ··· 408 407 if (err) 409 408 return err; 410 409 411 - xe_wa_process_oob(gt); 410 + xe_wa_process_gt_oob(gt); 412 411 413 412 xe_force_wake_init_gt(gt, gt_to_fw(gt)); 414 413 spin_lock_init(&gt->global_invl_lock); 415 414 416 - err = xe_gt_tlb_invalidation_init_early(gt); 415 + err = xe_gt_tlb_inval_init_early(gt); 417 416 if (err) 418 417 return err; 419 418 ··· 565 564 if (xe_gt_is_main_type(gt)) { 566 565 struct xe_tile *tile = gt_to_tile(gt); 567 566 568 - tile->migrate = xe_migrate_init(tile); 569 - if (IS_ERR(tile->migrate)) { 570 - err = PTR_ERR(tile->migrate); 567 + err = xe_migrate_init(tile->migrate); 568 + if (err) 571 569 goto err_force_wake; 572 - } 573 570 } 574 571 575 572 err = xe_uc_load_hw(&gt->uc); ··· 803 804 return 0; 804 805 } 805 806 807 + static int gt_wait_reset_unblock(struct xe_gt *gt) 
808 + { 809 + return xe_guc_wait_reset_unblock(&gt->uc.guc); 810 + } 811 + 806 812 static int gt_reset(struct xe_gt *gt) 807 813 { 808 814 unsigned int fw_ref; ··· 821 817 return -ENODEV; 822 818 823 819 xe_gt_info(gt, "reset started\n"); 820 + 821 + err = gt_wait_reset_unblock(gt); 822 + if (!err) 823 + xe_gt_warn(gt, "reset block failed to get lifted"); 824 824 825 825 xe_pm_runtime_get(gt_to_xe(gt)); 826 826 ··· 850 842 851 843 xe_uc_stop(&gt->uc); 852 844 853 - xe_gt_tlb_invalidation_reset(gt); 845 + xe_tlb_inval_reset(&gt->tlb_inval); 854 846 855 847 err = do_gt_reset(gt); 856 848 if (err) ··· 966 958 if ((!xe_uc_fw_is_available(&gt->uc.gsc.fw) || 967 959 xe_uc_fw_is_loaded(&gt->uc.gsc.fw) || 968 960 xe_uc_fw_is_in_error_state(&gt->uc.gsc.fw)) && 969 - XE_WA(gt, 22019338487)) 961 + XE_GT_WA(gt, 22019338487)) 970 962 ret = xe_guc_pc_restore_stashed_freq(&gt->uc.guc.pc); 971 963 972 964 return ret; ··· 1064 1056 xe_gt_assert(gt, gt_to_xe(gt)->wedged.mode); 1065 1057 1066 1058 xe_uc_declare_wedged(&gt->uc); 1067 - xe_gt_tlb_invalidation_reset(gt); 1059 + xe_tlb_inval_reset(&gt->tlb_inval); 1068 1060 }

+54 -2

drivers/gpu/drm/xe/xe_gt_debugfs.c

··· 29 29 #include "xe_pm.h" 30 30 #include "xe_reg_sr.h" 31 31 #include "xe_reg_whitelist.h" 32 + #include "xe_sa.h" 32 33 #include "xe_sriov.h" 33 34 #include "xe_tuning.h" 34 35 #include "xe_uc_debugfs.h" ··· 129 128 130 129 xe_pm_runtime_get(gt_to_xe(gt)); 131 130 drm_suballoc_dump_debug_info(&tile->mem.kernel_bb_pool->base, p, 132 - tile->mem.kernel_bb_pool->gpu_addr); 131 + xe_sa_manager_gpu_addr(tile->mem.kernel_bb_pool)); 132 + xe_pm_runtime_put(gt_to_xe(gt)); 133 + 134 + return 0; 135 + } 136 + 137 + static int sa_info_vf_ccs(struct xe_gt *gt, struct drm_printer *p) 138 + { 139 + struct xe_tile *tile = gt_to_tile(gt); 140 + struct xe_sa_manager *bb_pool; 141 + enum xe_sriov_vf_ccs_rw_ctxs ctx_id; 142 + 143 + if (!IS_VF_CCS_READY(gt_to_xe(gt))) 144 + return 0; 145 + 146 + xe_pm_runtime_get(gt_to_xe(gt)); 147 + 148 + for_each_ccs_rw_ctx(ctx_id) { 149 + bb_pool = tile->sriov.vf.ccs[ctx_id].mem.ccs_bb_pool; 150 + if (!bb_pool) 151 + break; 152 + 153 + drm_printf(p, "ccs %s bb suballoc info\n", ctx_id ? "write" : "read"); 154 + drm_printf(p, "-------------------------\n"); 155 + drm_suballoc_dump_debug_info(&bb_pool->base, p, xe_sa_manager_gpu_addr(bb_pool)); 156 + drm_puts(p, "\n"); 157 + } 158 + 133 159 xe_pm_runtime_put(gt_to_xe(gt)); 134 160 135 161 return 0; ··· 331 303 {"hwconfig", .show = xe_gt_debugfs_simple_show, .data = hwconfig}, 332 304 }; 333 305 306 + /* 307 + * only for GT debugfs files which are valid on VF. Not valid on PF. 
308 + */ 309 + static const struct drm_info_list vf_only_debugfs_list[] = { 310 + {"sa_info_vf_ccs", .show = xe_gt_debugfs_simple_show, .data = sa_info_vf_ccs}, 311 + }; 312 + 334 313 /* everything else should be added here */ 335 314 static const struct drm_info_list pf_only_debugfs_list[] = { 336 315 {"hw_engines", .show = xe_gt_debugfs_simple_show, .data = hw_engines}, ··· 423 388 { 424 389 struct xe_device *xe = gt_to_xe(gt); 425 390 struct drm_minor *minor = gt_to_xe(gt)->drm.primary; 391 + struct dentry *parent = gt->tile->debugfs; 426 392 struct dentry *root; 393 + char symlink[16]; 427 394 char name[8]; 428 395 429 396 xe_gt_assert(gt, minor->debugfs_root); 430 397 398 + if (IS_ERR(parent)) 399 + return; 400 + 431 401 snprintf(name, sizeof(name), "gt%d", gt->info.id); 432 - root = debugfs_create_dir(name, minor->debugfs_root); 402 + root = debugfs_create_dir(name, parent); 433 403 if (IS_ERR(root)) { 434 404 drm_warn(&xe->drm, "Create GT directory failed"); 435 405 return; ··· 459 419 drm_debugfs_create_files(pf_only_debugfs_list, 460 420 ARRAY_SIZE(pf_only_debugfs_list), 461 421 root, minor); 422 + else 423 + drm_debugfs_create_files(vf_only_debugfs_list, 424 + ARRAY_SIZE(vf_only_debugfs_list), 425 + root, minor); 426 + 462 427 463 428 xe_uc_debugfs_register(&gt->uc, root); 464 429 ··· 471 426 xe_gt_sriov_pf_debugfs_register(gt, root); 472 427 else if (IS_SRIOV_VF(xe)) 473 428 xe_gt_sriov_vf_debugfs_register(gt, root); 429 + 430 + /* 431 + * Backwards compatibility only: create a link for the legacy clients 432 + * who may expect gt/ directory at the root level, not the tile level. 433 + */ 434 + snprintf(symlink, sizeof(symlink), "tile%u/%s", gt->tile->id, name); 435 + debugfs_create_symlink(name, minor->debugfs_root, symlink); 474 436 }
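Note on the compatibility symlink above: the link target is built as "tile<id>/<gt name>" into a 16-byte buffer. The following is a minimal userspace sketch of that string construction, assuming the same buffer sizes as the diff; build_gt_link() is a hypothetical helper, not a function in the driver, and the truncation check is an addition for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the driver): build the relative symlink
 * target "tile<tile_id>/gt<gt_id>" the way the hunk above does with
 * snprintf(). Returns 0 on success, -1 if the result was truncated.
 */
static int build_gt_link(char *dst, size_t dst_len,
			 unsigned int tile_id, int gt_id)
{
	char name[8];	/* same size as name[] in the diff */
	int n;

	snprintf(name, sizeof(name), "gt%d", gt_id);
	n = snprintf(dst, dst_len, "tile%u/%s", tile_id, name);

	/* snprintf() returns the untruncated length; compare with the buffer */
	return (n < 0 || (size_t)n >= dst_len) ? -1 : 0;
}
```

With a 16-byte destination, tile 0 and GT 1 produce "tile0/gt1", which resolves relative to the DRM minor's debugfs root where the link is created.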

+13 -8

drivers/gpu/drm/xe/xe_gt_idle.c

···
 {
 	struct kobject *kobj = arg;
 	struct xe_gt *gt = kobj_to_gt(kobj->parent);
-	unsigned int fw_ref;
 
 	xe_gt_idle_disable_pg(gt);
 
-	if (gt_to_xe(gt)->info.skip_guc_pc) {
-		fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
+	if (gt_to_xe(gt)->info.skip_guc_pc)
 		xe_gt_idle_disable_c6(gt);
-		xe_force_wake_put(gt_to_fw(gt), fw_ref);
-	}
 
 	sysfs_remove_files(kobj, gt_idle_attrs);
 	kobject_put(kobj);
···
 			RC_CTL_HW_ENABLE | RC_CTL_TO_MODE | RC_CTL_RC6_ENABLE);
 }
 
-void xe_gt_idle_disable_c6(struct xe_gt *gt)
+int xe_gt_idle_disable_c6(struct xe_gt *gt)
 {
+	unsigned int fw_ref;
+
 	xe_device_assert_mem_access(gt_to_xe(gt));
-	xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT);
 
 	if (IS_SRIOV_VF(gt_to_xe(gt)))
-		return;
+		return 0;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
+	if (!fw_ref)
+		return -ETIMEDOUT;
 
 	xe_mmio_write32(&gt->mmio, RC_CONTROL, 0);
 	xe_mmio_write32(&gt->mmio, RC_STATE, 0);
+
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+
+	return 0;
 }

+1 -1

drivers/gpu/drm/xe/xe_gt_idle.h

···
 
 int xe_gt_idle_init(struct xe_gt_idle *gtidle);
 void xe_gt_idle_enable_c6(struct xe_gt *gt);
-void xe_gt_idle_disable_c6(struct xe_gt *gt);
+int xe_gt_idle_disable_c6(struct xe_gt *gt);
 void xe_gt_idle_enable_pg(struct xe_gt *gt);
 void xe_gt_idle_disable_pg(struct xe_gt *gt);
 int xe_gt_idle_pg_print(struct xe_gt *gt, struct drm_printer *p);

+1 -3

drivers/gpu/drm/xe/xe_gt_mcr.c

···
  * MCR registers are not available on Virtual Function (VF).
  */
 
-#define STEER_SEMAPHORE XE_REG(0xFD0)
-
 static inline struct xe_reg to_xe_reg(struct xe_reg_mcr reg_mcr)
 {
 	return reg_mcr.__reg;
···
 		u32 steer_val = REG_FIELD_PREP(MCR_SLICE_MASK, 0) |
 			REG_FIELD_PREP(MCR_SUBSLICE_MASK, 2);
 
-		xe_mmio_write32(&gt->mmio, MCFG_MCR_SELECTOR, steer_val);
+		xe_mmio_write32(&gt->mmio, STEER_SEMAPHORE, steer_val);
 		xe_mmio_write32(&gt->mmio, SF_MCR_SELECTOR, steer_val);
 		/*
 		 * For GAM registers, all reads should be directed to instance 1

+14 -21

drivers/gpu/drm/xe/xe_gt_pagefault.c

··· 16 16 #include "xe_gt.h" 17 17 #include "xe_gt_printk.h" 18 18 #include "xe_gt_stats.h" 19 - #include "xe_gt_tlb_invalidation.h" 20 19 #include "xe_guc.h" 21 20 #include "xe_guc_ct.h" 22 21 #include "xe_migrate.h" 23 22 #include "xe_svm.h" 24 23 #include "xe_trace_bo.h" 25 24 #include "xe_vm.h" 25 + #include "xe_vram_types.h" 26 26 27 27 struct pagefault { 28 28 u64 page_addr; ··· 74 74 } 75 75 76 76 static int xe_pf_begin(struct drm_exec *exec, struct xe_vma *vma, 77 - bool atomic, unsigned int id) 77 + bool need_vram_move, struct xe_vram_region *vram) 78 78 { 79 79 struct xe_bo *bo = xe_vma_bo(vma); 80 80 struct xe_vm *vm = xe_vma_vm(vma); ··· 84 84 if (err) 85 85 return err; 86 86 87 - if (atomic && IS_DGFX(vm->xe)) { 88 - if (xe_vma_is_userptr(vma)) { 89 - err = -EACCES; 90 - return err; 91 - } 87 + if (!bo) 88 + return 0; 92 89 93 - /* Migrate to VRAM, move should invalidate the VMA first */ 94 - err = xe_bo_migrate(bo, XE_PL_VRAM0 + id); 95 - if (err) 96 - return err; 97 - } else if (bo) { 98 - /* Create backing store if needed */ 99 - err = xe_bo_validate(bo, vm, true); 100 - if (err) 101 - return err; 102 - } 90 + err = need_vram_move ? xe_bo_migrate(bo, vram->placement) : 91 + xe_bo_validate(bo, vm, true); 103 92 104 - return 0; 93 + return err; 105 94 } 106 95 107 96 static int handle_vma_pagefault(struct xe_gt *gt, struct xe_vma *vma, ··· 101 112 struct drm_exec exec; 102 113 struct dma_fence *fence; 103 114 ktime_t end = 0; 104 - int err; 115 + int err, needs_vram; 105 116 106 117 lockdep_assert_held_write(&vm->lock); 118 + 119 + needs_vram = xe_vma_need_vram_for_atomic(vm->xe, vma, atomic); 120 + if (needs_vram < 0 || (needs_vram && xe_vma_is_userptr(vma))) 121 + return needs_vram < 0 ? 
needs_vram : -EACCES; 107 122 108 123 xe_gt_stats_incr(gt, XE_GT_STATS_ID_VMA_PAGEFAULT_COUNT, 1); 109 124 xe_gt_stats_incr(gt, XE_GT_STATS_ID_VMA_PAGEFAULT_KB, xe_vma_size(vma) / 1024); ··· 131 138 /* Lock VM and BOs dma-resv */ 132 139 drm_exec_init(&exec, 0, 0); 133 140 drm_exec_until_all_locked(&exec) { 134 - err = xe_pf_begin(&exec, vma, atomic, tile->id); 141 + err = xe_pf_begin(&exec, vma, needs_vram == 1, tile->mem.vram); 135 142 drm_exec_retry_on_contention(&exec); 136 143 if (xe_vm_validate_should_retry(&exec, err, &end)) 137 144 err = -EAGAIN; ··· 566 573 /* Lock VM and BOs dma-resv */ 567 574 drm_exec_init(&exec, 0, 0); 568 575 drm_exec_until_all_locked(&exec) { 569 - ret = xe_pf_begin(&exec, vma, true, tile->id); 576 + ret = xe_pf_begin(&exec, vma, IS_DGFX(vm->xe), tile->mem.vram); 570 577 drm_exec_retry_on_contention(&exec); 571 578 if (ret) 572 579 break;
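Note on the pagefault change above: handle_vma_pagefault() now treats the result of xe_vma_need_vram_for_atomic() as a tri-state value before taking any locks. The sketch below restates that early check in plain C. It assumes, from how the diff uses the value, that a negative result is an error code, 0 means no move is needed, and 1 means a VRAM move is required. pf_precheck() and pf_need_vram_move() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical restatement of the early check in handle_vma_pagefault().
 * needs_vram: <0 error code, 0 no VRAM move needed, 1 VRAM move needed
 * (assumed from how the diff consumes the value).
 */
static int pf_precheck(int needs_vram, bool is_userptr)
{
	if (needs_vram < 0)
		return needs_vram;	/* propagate the helper's error */
	if (needs_vram && is_userptr)
		return -EACCES;		/* userptr memory cannot move to VRAM */
	return 0;			/* continue with the fault handling */
}

/* Value passed as need_vram_move to xe_pf_begin() in the diff. */
static bool pf_need_vram_move(int needs_vram)
{
	return needs_vram == 1;
}
```

The rejection of userptr VMAs moves out of xe_pf_begin() and into the caller, so xe_pf_begin() reduces to choosing between a migrate and a validate.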

+20 -4

drivers/gpu/drm/xe/xe_gt_sriov_pf.c

···
 static void pf_fini_workers(struct xe_gt *gt)
 {
 	xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
-	disable_work_sync(&gt->sriov.pf.workers.restart);
+
+	if (disable_work_sync(&gt->sriov.pf.workers.restart)) {
+		xe_gt_sriov_dbg_verbose(gt, "pending restart disabled!\n");
+		/* release an rpm reference taken on the worker's behalf */
+		xe_pm_runtime_put(gt_to_xe(gt));
+	}
 }
 
 /**
···
 {
 	xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
 
-	if (cancel_work_sync(&gt->sriov.pf.workers.restart))
+	if (cancel_work_sync(&gt->sriov.pf.workers.restart)) {
 		xe_gt_sriov_dbg_verbose(gt, "pending restart canceled!\n");
+		/* release an rpm reference taken on the worker's behalf */
+		xe_pm_runtime_put(gt_to_xe(gt));
+	}
 }
 
 /**
···
 {
 	struct xe_device *xe = gt_to_xe(gt);
 
-	xe_pm_runtime_get(xe);
+	xe_gt_assert(gt, !xe_pm_runtime_suspended(xe));
+
 	xe_gt_sriov_pf_config_restart(gt);
 	xe_gt_sriov_pf_control_restart(gt);
+
+	/* release an rpm reference taken on our behalf */
 	xe_pm_runtime_put(xe);
 
 	xe_gt_sriov_dbg(gt, "restart completed\n");
···
 
 	xe_gt_assert(gt, IS_SRIOV_PF(xe));
 
-	if (!queue_work(xe->sriov.wq, &gt->sriov.pf.workers.restart))
+	/* take an rpm reference on behalf of the worker */
+	xe_pm_runtime_get_noresume(xe);
+
+	if (!queue_work(xe->sriov.wq, &gt->sriov.pf.workers.restart)) {
 		xe_gt_sriov_dbg(gt, "restart already in queue!\n");
+		xe_pm_runtime_put(xe);
+	}
 }
 
 /**
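Note on the runtime PM change above: the reference is now taken by whoever queues the restart work and released by whoever ends up owning that work item (the worker itself, or the path that cancels or disables it while pending). The sketch below models that handoff with a plain counter. Everything in it is a mock for illustration: struct mock_dev and the mock_* helpers stand in for the device, queue_work() and cancel_work_sync(), and are not kernel APIs.

```c
#include <stdbool.h>

/* Mock device: a reference counter and a single pending work item. */
struct mock_dev {
	int rpm_refs;
	bool work_pending;
};

/* Like queue_work(): returns false if the work was already queued. */
static bool mock_queue_work(struct mock_dev *d)
{
	if (d->work_pending)
		return false;
	d->work_pending = true;
	return true;
}

/* Like cancel_work_sync(): returns true if the work was pending. */
static bool mock_cancel_work(struct mock_dev *d)
{
	bool was_pending = d->work_pending;

	d->work_pending = false;
	return was_pending;
}

static void start_restart(struct mock_dev *d)
{
	d->rpm_refs++;			/* taken on behalf of the worker */
	if (!mock_queue_work(d))
		d->rpm_refs--;		/* already queued: earlier ref covers it */
}

static void restart_worker(struct mock_dev *d)
{
	d->work_pending = false;
	/* ... restart config and control here ... */
	d->rpm_refs--;			/* release the ref taken on our behalf */
}

static void stop_restart(struct mock_dev *d)
{
	if (mock_cancel_work(d))
		d->rpm_refs--;		/* worker will not run: drop its ref */
}
```

In this model each queued work item holds exactly one reference, so the counter returns to zero whether the worker runs or is cancelled first.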

+9 -4

drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c

···
 #include "xe_migrate.h"
 #include "xe_sriov.h"
 #include "xe_ttm_vram_mgr.h"
+#include "xe_vram_types.h"
 #include "xe_wopcm.h"
 
 #define make_u64_from_u32(hi, lo) ((u64)((u64)(u32)(hi) << 32 | (u32)(lo)))
···
 	return err;
 }
 
-static void pf_release_vf_config_lmem(struct xe_gt *gt, struct xe_gt_sriov_config *config)
+/* Return: %true if there was an LMEM provisioned, %false otherwise */
+static bool pf_release_vf_config_lmem(struct xe_gt *gt, struct xe_gt_sriov_config *config)
 {
 	xe_gt_assert(gt, IS_DGFX(gt_to_xe(gt)));
 	xe_gt_assert(gt, xe_gt_is_main_type(gt));
···
 	if (config->lmem_obj) {
 		xe_bo_unpin_map_no_vm(config->lmem_obj);
 		config->lmem_obj = NULL;
+		return true;
 	}
+	return false;
 }
 
 static int pf_provision_vf_lmem(struct xe_gt *gt, unsigned int vfid, u64 size)
···
 {
 	struct xe_tile *tile = gt->tile;
 
-	return xe_ttm_vram_get_avail(&tile->mem.vram.ttm.manager);
+	return xe_ttm_vram_get_avail(&tile->mem.vram->ttm.manager);
 }
 
 static u64 pf_query_max_lmem(struct xe_gt *gt)
···
 {
 	struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
 	struct xe_device *xe = gt_to_xe(gt);
+	bool released;
 
 	if (xe_gt_is_main_type(gt)) {
 		pf_release_vf_config_ggtt(gt, config);
 		if (IS_DGFX(xe)) {
-			pf_release_vf_config_lmem(gt, config);
-			if (xe_device_has_lmtt(xe))
+			released = pf_release_vf_config_lmem(gt, config);
+			if (released && xe_device_has_lmtt(xe))
 				pf_update_vf_lmtt(xe, vfid);
 		}
 	}

+14

drivers/gpu/drm/xe/xe_gt_sriov_vf.c

···
 #include "xe_guc.h"
 #include "xe_guc_hxg_helpers.h"
 #include "xe_guc_relay.h"
+#include "xe_lrc.h"
 #include "xe_mmio.h"
 #include "xe_sriov.h"
 #include "xe_sriov_vf.h"
···
 failed:
 	xe_gt_sriov_err(gt, "Failed to get version info (%pe)\n", ERR_PTR(err));
 	return err;
+}
+
+/**
+ * xe_gt_sriov_vf_default_lrcs_hwsp_rebase - Update GGTT references in HWSP of default LRCs.
+ * @gt: the &xe_gt struct instance
+ */
+void xe_gt_sriov_vf_default_lrcs_hwsp_rebase(struct xe_gt *gt)
+{
+	struct xe_hw_engine *hwe;
+	enum xe_hw_engine_id id;
+
+	for_each_hw_engine(hwe, gt, id)
+		xe_default_lrc_update_memirq_regs_with_address(hwe);
 }
 
 /**

+1

drivers/gpu/drm/xe/xe_gt_sriov_vf.h

···
 int xe_gt_sriov_vf_query_config(struct xe_gt *gt);
 int xe_gt_sriov_vf_connect(struct xe_gt *gt);
 int xe_gt_sriov_vf_query_runtime(struct xe_gt *gt);
+void xe_gt_sriov_vf_default_lrcs_hwsp_rebase(struct xe_gt *gt);
 int xe_gt_sriov_vf_notify_resfix_done(struct xe_gt *gt);
 void xe_gt_sriov_vf_migrated_event_handler(struct xe_gt *gt);
 

-596

drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c

··· 1 - // SPDX-License-Identifier: MIT 2 - /* 3 - * Copyright © 2023 Intel Corporation 4 - */ 5 - 6 - #include "xe_gt_tlb_invalidation.h" 7 - 8 - #include "abi/guc_actions_abi.h" 9 - #include "xe_device.h" 10 - #include "xe_force_wake.h" 11 - #include "xe_gt.h" 12 - #include "xe_gt_printk.h" 13 - #include "xe_guc.h" 14 - #include "xe_guc_ct.h" 15 - #include "xe_gt_stats.h" 16 - #include "xe_mmio.h" 17 - #include "xe_pm.h" 18 - #include "xe_sriov.h" 19 - #include "xe_trace.h" 20 - #include "regs/xe_guc_regs.h" 21 - 22 - #define FENCE_STACK_BIT DMA_FENCE_FLAG_USER_BITS 23 - 24 - /* 25 - * TLB inval depends on pending commands in the CT queue and then the real 26 - * invalidation time. Double up the time to process full CT queue 27 - * just to be on the safe side. 28 - */ 29 - static long tlb_timeout_jiffies(struct xe_gt *gt) 30 - { 31 - /* this reflects what HW/GuC needs to process TLB inv request */ 32 - const long hw_tlb_timeout = HZ / 4; 33 - 34 - /* this estimates actual delay caused by the CTB transport */ 35 - long delay = xe_guc_ct_queue_proc_time_jiffies(&gt->uc.guc.ct); 36 - 37 - return hw_tlb_timeout + 2 * delay; 38 - } 39 - 40 - static void xe_gt_tlb_invalidation_fence_fini(struct xe_gt_tlb_invalidation_fence *fence) 41 - { 42 - if (WARN_ON_ONCE(!fence->gt)) 43 - return; 44 - 45 - xe_pm_runtime_put(gt_to_xe(fence->gt)); 46 - fence->gt = NULL; /* fini() should be called once */ 47 - } 48 - 49 - static void 50 - __invalidation_fence_signal(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence) 51 - { 52 - bool stack = test_bit(FENCE_STACK_BIT, &fence->base.flags); 53 - 54 - trace_xe_gt_tlb_invalidation_fence_signal(xe, fence); 55 - xe_gt_tlb_invalidation_fence_fini(fence); 56 - dma_fence_signal(&fence->base); 57 - if (!stack) 58 - dma_fence_put(&fence->base); 59 - } 60 - 61 - static void 62 - invalidation_fence_signal(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence) 63 - { 64 - list_del(&fence->link); 65 - 
__invalidation_fence_signal(xe, fence); 66 - } 67 - 68 - void xe_gt_tlb_invalidation_fence_signal(struct xe_gt_tlb_invalidation_fence *fence) 69 - { 70 - if (WARN_ON_ONCE(!fence->gt)) 71 - return; 72 - 73 - __invalidation_fence_signal(gt_to_xe(fence->gt), fence); 74 - } 75 - 76 - static void xe_gt_tlb_fence_timeout(struct work_struct *work) 77 - { 78 - struct xe_gt *gt = container_of(work, struct xe_gt, 79 - tlb_invalidation.fence_tdr.work); 80 - struct xe_device *xe = gt_to_xe(gt); 81 - struct xe_gt_tlb_invalidation_fence *fence, *next; 82 - 83 - LNL_FLUSH_WORK(&gt->uc.guc.ct.g2h_worker); 84 - 85 - spin_lock_irq(&gt->tlb_invalidation.pending_lock); 86 - list_for_each_entry_safe(fence, next, 87 - &gt->tlb_invalidation.pending_fences, link) { 88 - s64 since_inval_ms = ktime_ms_delta(ktime_get(), 89 - fence->invalidation_time); 90 - 91 - if (msecs_to_jiffies(since_inval_ms) < tlb_timeout_jiffies(gt)) 92 - break; 93 - 94 - trace_xe_gt_tlb_invalidation_fence_timeout(xe, fence); 95 - xe_gt_err(gt, "TLB invalidation fence timeout, seqno=%d recv=%d", 96 - fence->seqno, gt->tlb_invalidation.seqno_recv); 97 - 98 - fence->base.error = -ETIME; 99 - invalidation_fence_signal(xe, fence); 100 - } 101 - if (!list_empty(&gt->tlb_invalidation.pending_fences)) 102 - queue_delayed_work(system_wq, 103 - &gt->tlb_invalidation.fence_tdr, 104 - tlb_timeout_jiffies(gt)); 105 - spin_unlock_irq(&gt->tlb_invalidation.pending_lock); 106 - } 107 - 108 - /** 109 - * xe_gt_tlb_invalidation_init_early - Initialize GT TLB invalidation state 110 - * @gt: GT structure 111 - * 112 - * Initialize GT TLB invalidation state, purely software initialization, should 113 - * be called once during driver load. 114 - * 115 - * Return: 0 on success, negative error code on error. 
116 - */ 117 - int xe_gt_tlb_invalidation_init_early(struct xe_gt *gt) 118 - { 119 - gt->tlb_invalidation.seqno = 1; 120 - INIT_LIST_HEAD(&gt->tlb_invalidation.pending_fences); 121 - spin_lock_init(&gt->tlb_invalidation.pending_lock); 122 - spin_lock_init(&gt->tlb_invalidation.lock); 123 - INIT_DELAYED_WORK(&gt->tlb_invalidation.fence_tdr, 124 - xe_gt_tlb_fence_timeout); 125 - 126 - return 0; 127 - } 128 - 129 - /** 130 - * xe_gt_tlb_invalidation_reset - Initialize GT TLB invalidation reset 131 - * @gt: GT structure 132 - * 133 - * Signal any pending invalidation fences, should be called during a GT reset 134 - */ 135 - void xe_gt_tlb_invalidation_reset(struct xe_gt *gt) 136 - { 137 - struct xe_gt_tlb_invalidation_fence *fence, *next; 138 - int pending_seqno; 139 - 140 - /* 141 - * we can get here before the CTs are even initialized if we're wedging 142 - * very early, in which case there are not going to be any pending 143 - * fences so we can bail immediately. 144 - */ 145 - if (!xe_guc_ct_initialized(&gt->uc.guc.ct)) 146 - return; 147 - 148 - /* 149 - * CT channel is already disabled at this point. No new TLB requests can 150 - * appear. 151 - */ 152 - 153 - mutex_lock(&gt->uc.guc.ct.lock); 154 - spin_lock_irq(&gt->tlb_invalidation.pending_lock); 155 - cancel_delayed_work(&gt->tlb_invalidation.fence_tdr); 156 - /* 157 - * We might have various kworkers waiting for TLB flushes to complete 158 - * which are not tracked with an explicit TLB fence, however at this 159 - * stage that will never happen since the CT is already disabled, so 160 - * make sure we signal them here under the assumption that we have 161 - * completed a full GT reset. 
162 - */ 163 - if (gt->tlb_invalidation.seqno == 1) 164 - pending_seqno = TLB_INVALIDATION_SEQNO_MAX - 1; 165 - else 166 - pending_seqno = gt->tlb_invalidation.seqno - 1; 167 - WRITE_ONCE(gt->tlb_invalidation.seqno_recv, pending_seqno); 168 - 169 - list_for_each_entry_safe(fence, next, 170 - &gt->tlb_invalidation.pending_fences, link) 171 - invalidation_fence_signal(gt_to_xe(gt), fence); 172 - spin_unlock_irq(&gt->tlb_invalidation.pending_lock); 173 - mutex_unlock(&gt->uc.guc.ct.lock); 174 - } 175 - 176 - static bool tlb_invalidation_seqno_past(struct xe_gt *gt, int seqno) 177 - { 178 - int seqno_recv = READ_ONCE(gt->tlb_invalidation.seqno_recv); 179 - 180 - if (seqno - seqno_recv < -(TLB_INVALIDATION_SEQNO_MAX / 2)) 181 - return false; 182 - 183 - if (seqno - seqno_recv > (TLB_INVALIDATION_SEQNO_MAX / 2)) 184 - return true; 185 - 186 - return seqno_recv >= seqno; 187 - } 188 - 189 - static int send_tlb_invalidation(struct xe_guc *guc, 190 - struct xe_gt_tlb_invalidation_fence *fence, 191 - u32 *action, int len) 192 - { 193 - struct xe_gt *gt = guc_to_gt(guc); 194 - struct xe_device *xe = gt_to_xe(gt); 195 - int seqno; 196 - int ret; 197 - 198 - xe_gt_assert(gt, fence); 199 - 200 - /* 201 - * XXX: The seqno algorithm relies on TLB invalidation being processed 202 - * in order which they currently are, if that changes the algorithm will 203 - * need to be updated. 204 - */ 205 - 206 - mutex_lock(&guc->ct.lock); 207 - seqno = gt->tlb_invalidation.seqno; 208 - fence->seqno = seqno; 209 - trace_xe_gt_tlb_invalidation_fence_send(xe, fence); 210 - action[1] = seqno; 211 - ret = xe_guc_ct_send_locked(&guc->ct, action, len, 212 - G2H_LEN_DW_TLB_INVALIDATE, 1); 213 - if (!ret) { 214 - spin_lock_irq(&gt->tlb_invalidation.pending_lock); 215 - /* 216 - * We haven't actually published the TLB fence as per 217 - * pending_fences, but in theory our seqno could have already 218 - * been written as we acquired the pending_lock. 
In such a case 219 - * we can just go ahead and signal the fence here. 220 - */ 221 - if (tlb_invalidation_seqno_past(gt, seqno)) { 222 - __invalidation_fence_signal(xe, fence); 223 - } else { 224 - fence->invalidation_time = ktime_get(); 225 - list_add_tail(&fence->link, 226 - &gt->tlb_invalidation.pending_fences); 227 - 228 - if (list_is_singular(&gt->tlb_invalidation.pending_fences)) 229 - queue_delayed_work(system_wq, 230 - &gt->tlb_invalidation.fence_tdr, 231 - tlb_timeout_jiffies(gt)); 232 - } 233 - spin_unlock_irq(&gt->tlb_invalidation.pending_lock); 234 - } else { 235 - __invalidation_fence_signal(xe, fence); 236 - } 237 - if (!ret) { 238 - gt->tlb_invalidation.seqno = (gt->tlb_invalidation.seqno + 1) % 239 - TLB_INVALIDATION_SEQNO_MAX; 240 - if (!gt->tlb_invalidation.seqno) 241 - gt->tlb_invalidation.seqno = 1; 242 - } 243 - mutex_unlock(&guc->ct.lock); 244 - xe_gt_stats_incr(gt, XE_GT_STATS_ID_TLB_INVAL, 1); 245 - 246 - return ret; 247 - } 248 - 249 - #define MAKE_INVAL_OP(type) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \ 250 - XE_GUC_TLB_INVAL_MODE_HEAVY << XE_GUC_TLB_INVAL_MODE_SHIFT | \ 251 - XE_GUC_TLB_INVAL_FLUSH_CACHE) 252 - 253 - /** 254 - * xe_gt_tlb_invalidation_guc - Issue a TLB invalidation on this GT for the GuC 255 - * @gt: GT structure 256 - * @fence: invalidation fence which will be signal on TLB invalidation 257 - * completion 258 - * 259 - * Issue a TLB invalidation for the GuC. Completion of TLB is asynchronous and 260 - * caller can use the invalidation fence to wait for completion. 
261 - * 262 - * Return: 0 on success, negative error code on error 263 - */ 264 - static int xe_gt_tlb_invalidation_guc(struct xe_gt *gt, 265 - struct xe_gt_tlb_invalidation_fence *fence) 266 - { 267 - u32 action[] = { 268 - XE_GUC_ACTION_TLB_INVALIDATION, 269 - 0, /* seqno, replaced in send_tlb_invalidation */ 270 - MAKE_INVAL_OP(XE_GUC_TLB_INVAL_GUC), 271 - }; 272 - int ret; 273 - 274 - ret = send_tlb_invalidation(&gt->uc.guc, fence, action, 275 - ARRAY_SIZE(action)); 276 - /* 277 - * -ECANCELED indicates the CT is stopped for a GT reset. TLB caches 278 - * should be nuked on a GT reset so this error can be ignored. 279 - */ 280 - if (ret == -ECANCELED) 281 - return 0; 282 - 283 - return ret; 284 - } 285 - 286 - /** 287 - * xe_gt_tlb_invalidation_ggtt - Issue a TLB invalidation on this GT for the GGTT 288 - * @gt: GT structure 289 - * 290 - * Issue a TLB invalidation for the GGTT. Completion of TLB invalidation is 291 - * synchronous. 292 - * 293 - * Return: 0 on success, negative error code on error 294 - */ 295 - int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt) 296 - { 297 - struct xe_device *xe = gt_to_xe(gt); 298 - unsigned int fw_ref; 299 - 300 - if (xe_guc_ct_enabled(&gt->uc.guc.ct) && 301 - gt->uc.guc.submission_state.enabled) { 302 - struct xe_gt_tlb_invalidation_fence fence; 303 - int ret; 304 - 305 - xe_gt_tlb_invalidation_fence_init(gt, &fence, true); 306 - ret = xe_gt_tlb_invalidation_guc(gt, &fence); 307 - if (ret) 308 - return ret; 309 - 310 - xe_gt_tlb_invalidation_fence_wait(&fence); 311 - } else if (xe_device_uc_enabled(xe) && !xe_device_wedged(xe)) { 312 - struct xe_mmio *mmio = &gt->mmio; 313 - 314 - if (IS_SRIOV_VF(xe)) 315 - return 0; 316 - 317 - fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); 318 - if (xe->info.platform == XE_PVC || GRAPHICS_VER(xe) >= 20) { 319 - xe_mmio_write32(mmio, PVC_GUC_TLB_INV_DESC1, 320 - PVC_GUC_TLB_INV_DESC1_INVALIDATE); 321 - xe_mmio_write32(mmio, PVC_GUC_TLB_INV_DESC0, 322 - PVC_GUC_TLB_INV_DESC0_VALID); 
323 - } else { 324 - xe_mmio_write32(mmio, GUC_TLB_INV_CR, 325 - GUC_TLB_INV_CR_INVALIDATE); 326 - } 327 - xe_force_wake_put(gt_to_fw(gt), fw_ref); 328 - } 329 - 330 - return 0; 331 - } 332 - 333 - static int send_tlb_invalidation_all(struct xe_gt *gt, 334 - struct xe_gt_tlb_invalidation_fence *fence) 335 - { 336 - u32 action[] = { 337 - XE_GUC_ACTION_TLB_INVALIDATION_ALL, 338 - 0, /* seqno, replaced in send_tlb_invalidation */ 339 - MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL), 340 - }; 341 - 342 - return send_tlb_invalidation(&gt->uc.guc, fence, action, ARRAY_SIZE(action)); 343 - } 344 - 345 - /** 346 - * xe_gt_tlb_invalidation_all - Invalidate all TLBs across PF and all VFs. 347 - * @gt: the &xe_gt structure 348 - * @fence: the &xe_gt_tlb_invalidation_fence to be signaled on completion 349 - * 350 - * Send a request to invalidate all TLBs across PF and all VFs. 351 - * 352 - * Return: 0 on success, negative error code on error 353 - */ 354 - int xe_gt_tlb_invalidation_all(struct xe_gt *gt, struct xe_gt_tlb_invalidation_fence *fence) 355 - { 356 - int err; 357 - 358 - xe_gt_assert(gt, gt == fence->gt); 359 - 360 - err = send_tlb_invalidation_all(gt, fence); 361 - if (err) 362 - xe_gt_err(gt, "TLB invalidation request failed (%pe)", ERR_PTR(err)); 363 - 364 - return err; 365 - } 366 - 367 - /* 368 - * Ensure that roundup_pow_of_two(length) doesn't overflow. 369 - * Note that roundup_pow_of_two() operates on unsigned long, 370 - * not on u64. 
371 - */ 372 - #define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX)) 373 - 374 - /** 375 - * xe_gt_tlb_invalidation_range - Issue a TLB invalidation on this GT for an 376 - * address range 377 - * 378 - * @gt: GT structure 379 - * @fence: invalidation fence which will be signal on TLB invalidation 380 - * completion 381 - * @start: start address 382 - * @end: end address 383 - * @asid: address space id 384 - * 385 - * Issue a range based TLB invalidation if supported, if not fallback to a full 386 - * TLB invalidation. Completion of TLB is asynchronous and caller can use 387 - * the invalidation fence to wait for completion. 388 - * 389 - * Return: Negative error code on error, 0 on success 390 - */ 391 - int xe_gt_tlb_invalidation_range(struct xe_gt *gt, 392 - struct xe_gt_tlb_invalidation_fence *fence, 393 - u64 start, u64 end, u32 asid) 394 - { 395 - struct xe_device *xe = gt_to_xe(gt); 396 - #define MAX_TLB_INVALIDATION_LEN 7 397 - u32 action[MAX_TLB_INVALIDATION_LEN]; 398 - u64 length = end - start; 399 - int len = 0; 400 - 401 - xe_gt_assert(gt, fence); 402 - 403 - /* Execlists not supported */ 404 - if (gt_to_xe(gt)->info.force_execlist) { 405 - __invalidation_fence_signal(xe, fence); 406 - return 0; 407 - } 408 - 409 - action[len++] = XE_GUC_ACTION_TLB_INVALIDATION; 410 - action[len++] = 0; /* seqno, replaced in send_tlb_invalidation */ 411 - if (!xe->info.has_range_tlb_invalidation || 412 - length > MAX_RANGE_TLB_INVALIDATION_LENGTH) { 413 - action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL); 414 - } else { 415 - u64 orig_start = start; 416 - u64 align; 417 - 418 - if (length < SZ_4K) 419 - length = SZ_4K; 420 - 421 - /* 422 - * We need to invalidate a higher granularity if start address 423 - * is not aligned to length. When start is not aligned with 424 - * length we need to find the length large enough to create an 425 - * address mask covering the required range. 
426 - */ 427 - align = roundup_pow_of_two(length); 428 - start = ALIGN_DOWN(start, align); 429 - end = ALIGN(end, align); 430 - length = align; 431 - while (start + length < end) { 432 - length <<= 1; 433 - start = ALIGN_DOWN(orig_start, length); 434 - } 435 - 436 - /* 437 - * Minimum invalidation size for a 2MB page that the hardware 438 - * expects is 16MB 439 - */ 440 - if (length >= SZ_2M) { 441 - length = max_t(u64, SZ_16M, length); 442 - start = ALIGN_DOWN(orig_start, length); 443 - } 444 - 445 - xe_gt_assert(gt, length >= SZ_4K); 446 - xe_gt_assert(gt, is_power_of_2(length)); 447 - xe_gt_assert(gt, !(length & GENMASK(ilog2(SZ_16M) - 1, 448 - ilog2(SZ_2M) + 1))); 449 - xe_gt_assert(gt, IS_ALIGNED(start, length)); 450 - 451 - action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE); 452 - action[len++] = asid; 453 - action[len++] = lower_32_bits(start); 454 - action[len++] = upper_32_bits(start); 455 - action[len++] = ilog2(length) - ilog2(SZ_4K); 456 - } 457 - 458 - xe_gt_assert(gt, len <= MAX_TLB_INVALIDATION_LEN); 459 - 460 - return send_tlb_invalidation(&gt->uc.guc, fence, action, len); 461 - } 462 - 463 - /** 464 - * xe_gt_tlb_invalidation_vm - Issue a TLB invalidation on this GT for a VM 465 - * @gt: graphics tile 466 - * @vm: VM to invalidate 467 - * 468 - * Invalidate entire VM's address space 469 - */ 470 - void xe_gt_tlb_invalidation_vm(struct xe_gt *gt, struct xe_vm *vm) 471 - { 472 - struct xe_gt_tlb_invalidation_fence fence; 473 - u64 range = 1ull << vm->xe->info.va_bits; 474 - int ret; 475 - 476 - xe_gt_tlb_invalidation_fence_init(gt, &fence, true); 477 - 478 - ret = xe_gt_tlb_invalidation_range(gt, &fence, 0, range, vm->usm.asid); 479 - if (ret < 0) 480 - return; 481 - 482 - xe_gt_tlb_invalidation_fence_wait(&fence); 483 - } 484 - 485 - /** 486 - * xe_guc_tlb_invalidation_done_handler - TLB invalidation done handler 487 - * @guc: guc 488 - * @msg: message indicating TLB invalidation done 489 - * @len: length of message 490 - * 491 - * 
Parse seqno of TLB invalidation, wake any waiters for seqno, and signal any 492 - * invalidation fences for seqno. Algorithm for this depends on seqno being 493 - * received in-order and asserts this assumption. 494 - * 495 - * Return: 0 on success, -EPROTO for malformed messages. 496 - */ 497 - int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len) 498 - { 499 - struct xe_gt *gt = guc_to_gt(guc); 500 - struct xe_device *xe = gt_to_xe(gt); 501 - struct xe_gt_tlb_invalidation_fence *fence, *next; 502 - unsigned long flags; 503 - 504 - if (unlikely(len != 1)) 505 - return -EPROTO; 506 - 507 - /* 508 - * This can also be run both directly from the IRQ handler and also in 509 - * process_g2h_msg(). Only one may process any individual CT message, 510 - * however the order they are processed here could result in skipping a 511 - * seqno. To handle that we just process all the seqnos from the last 512 - * seqno_recv up to and including the one in msg[0]. The delta should be 513 - * very small so there shouldn't be much of pending_fences we actually 514 - * need to iterate over here. 515 - * 516 - * From GuC POV we expect the seqnos to always appear in-order, so if we 517 - * see something later in the timeline we can be sure that anything 518 - * appearing earlier has already signalled, just that we have yet to 519 - * officially process the CT message like if racing against 520 - * process_g2h_msg(). 
521 - */ 522 - spin_lock_irqsave(&gt->tlb_invalidation.pending_lock, flags); 523 - if (tlb_invalidation_seqno_past(gt, msg[0])) { 524 - spin_unlock_irqrestore(&gt->tlb_invalidation.pending_lock, flags); 525 - return 0; 526 - } 527 - 528 - WRITE_ONCE(gt->tlb_invalidation.seqno_recv, msg[0]); 529 - 530 - list_for_each_entry_safe(fence, next, 531 - &gt->tlb_invalidation.pending_fences, link) { 532 - trace_xe_gt_tlb_invalidation_fence_recv(xe, fence); 533 - 534 - if (!tlb_invalidation_seqno_past(gt, fence->seqno)) 535 - break; 536 - 537 - invalidation_fence_signal(xe, fence); 538 - } 539 - 540 - if (!list_empty(&gt->tlb_invalidation.pending_fences)) 541 - mod_delayed_work(system_wq, 542 - &gt->tlb_invalidation.fence_tdr, 543 - tlb_timeout_jiffies(gt)); 544 - else 545 - cancel_delayed_work(&gt->tlb_invalidation.fence_tdr); 546 - 547 - spin_unlock_irqrestore(&gt->tlb_invalidation.pending_lock, flags); 548 - 549 - return 0; 550 - } 551 - 552 - static const char * 553 - invalidation_fence_get_driver_name(struct dma_fence *dma_fence) 554 - { 555 - return "xe"; 556 - } 557 - 558 - static const char * 559 - invalidation_fence_get_timeline_name(struct dma_fence *dma_fence) 560 - { 561 - return "invalidation_fence"; 562 - } 563 - 564 - static const struct dma_fence_ops invalidation_fence_ops = { 565 - .get_driver_name = invalidation_fence_get_driver_name, 566 - .get_timeline_name = invalidation_fence_get_timeline_name, 567 - }; 568 - 569 - /** 570 - * xe_gt_tlb_invalidation_fence_init - Initialize TLB invalidation fence 571 - * @gt: GT 572 - * @fence: TLB invalidation fence to initialize 573 - * @stack: fence is stack variable 574 - * 575 - * Initialize TLB invalidation fence for use. xe_gt_tlb_invalidation_fence_fini 576 - * will be automatically called when fence is signalled (all fences must signal), 577 - * even on error. 
578 - */ 579 - void xe_gt_tlb_invalidation_fence_init(struct xe_gt *gt, 580 - struct xe_gt_tlb_invalidation_fence *fence, 581 - bool stack) 582 - { 583 - xe_pm_runtime_get_noresume(gt_to_xe(gt)); 584 - 585 - spin_lock_irq(&gt->tlb_invalidation.lock); 586 - dma_fence_init(&fence->base, &invalidation_fence_ops, 587 - &gt->tlb_invalidation.lock, 588 - dma_fence_context_alloc(1), 1); 589 - spin_unlock_irq(&gt->tlb_invalidation.lock); 590 - INIT_LIST_HEAD(&fence->link); 591 - if (stack) 592 - set_bit(FENCE_STACK_BIT, &fence->base.flags); 593 - else 594 - dma_fence_get(&fence->base); 595 - fence->gt = gt; 596 - }
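Note on the removed file above: two pieces of its logic are easy to misread in diff form, namely the wraparound-tolerant seqno comparison and the widening of an address range to an aligned power-of-two span. The sketch below restates both in userspace C for reference. It is a simplified model under stated assumptions: roundup_pow2(), struct inval_span and widen_range() are hypothetical names, the constants mirror those in the removed code, and the overflow guard the driver applies via MAX_RANGE_TLB_INVALIDATION_LENGTH is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQNO_MAX 0x100000	/* mirrors TLB_INVALIDATION_SEQNO_MAX */
#define SZ_4K	0x1000ULL
#define SZ_2M	0x200000ULL
#define SZ_16M	0x1000000ULL

/* True if 'seqno' has already been received, tolerating wraparound. */
static bool seqno_past(int seqno_recv, int seqno)
{
	if (seqno - seqno_recv < -(SEQNO_MAX / 2))
		return false;	/* seqno wrapped and is ahead of seqno_recv */
	if (seqno - seqno_recv > (SEQNO_MAX / 2))
		return true;	/* seqno_recv wrapped and is ahead of seqno */
	return seqno_recv >= seqno;
}

/* Hypothetical helper: smallest power of two >= v, for v > 0. */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

struct inval_span {
	uint64_t start;		/* aligned to length */
	uint64_t length;	/* power of two, at least 4K */
};

/* Widen [start, end) to one naturally aligned power-of-two span. */
static struct inval_span widen_range(uint64_t start, uint64_t end)
{
	uint64_t orig_start = start;
	uint64_t length = end - start;
	uint64_t align;

	if (length < SZ_4K)
		length = SZ_4K;

	align = roundup_pow2(length);
	start &= ~(align - 1);				/* ALIGN_DOWN */
	end = (end + align - 1) & ~(align - 1);		/* ALIGN */
	length = align;

	/* Grow until a single aligned span covers the whole range. */
	while (start + length < end) {
		length <<= 1;
		start = orig_start & ~(length - 1);
	}

	/* Spans of 2M and up are widened to at least 16M, as in the diff. */
	if (length >= SZ_2M) {
		if (length < SZ_16M)
			length = SZ_16M;
		start = orig_start & ~(length - 1);
	}

	return (struct inval_span){ .start = start, .length = length };
}
```

For example, the unaligned range [0x1000, 0x3000) is widened to start 0 with length 0x4000, because no aligned 0x2000 span covers it.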

-40

drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h

··· 1 - /* SPDX-License-Identifier: MIT */ 2 - /* 3 - * Copyright © 2023 Intel Corporation 4 - */ 5 - 6 - #ifndef _XE_GT_TLB_INVALIDATION_H_ 7 - #define _XE_GT_TLB_INVALIDATION_H_ 8 - 9 - #include <linux/types.h> 10 - 11 - #include "xe_gt_tlb_invalidation_types.h" 12 - 13 - struct xe_gt; 14 - struct xe_guc; 15 - struct xe_vm; 16 - struct xe_vma; 17 - 18 - int xe_gt_tlb_invalidation_init_early(struct xe_gt *gt); 19 - 20 - void xe_gt_tlb_invalidation_reset(struct xe_gt *gt); 21 - int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt); 22 - void xe_gt_tlb_invalidation_vm(struct xe_gt *gt, struct xe_vm *vm); 23 - int xe_gt_tlb_invalidation_all(struct xe_gt *gt, struct xe_gt_tlb_invalidation_fence *fence); 24 - int xe_gt_tlb_invalidation_range(struct xe_gt *gt, 25 - struct xe_gt_tlb_invalidation_fence *fence, 26 - u64 start, u64 end, u32 asid); 27 - int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len); 28 - 29 - void xe_gt_tlb_invalidation_fence_init(struct xe_gt *gt, 30 - struct xe_gt_tlb_invalidation_fence *fence, 31 - bool stack); 32 - void xe_gt_tlb_invalidation_fence_signal(struct xe_gt_tlb_invalidation_fence *fence); 33 - 34 - static inline void 35 - xe_gt_tlb_invalidation_fence_wait(struct xe_gt_tlb_invalidation_fence *fence) 36 - { 37 - dma_fence_wait(&fence->base, false); 38 - } 39 - 40 - #endif /* _XE_GT_TLB_INVALIDATION_ */

-32

drivers/gpu/drm/xe/xe_gt_tlb_invalidation_types.h

···
1 - /* SPDX-License-Identifier: MIT */
2 - /*
3 -  * Copyright © 2023 Intel Corporation
4 -  */
5 -
6 - #ifndef _XE_GT_TLB_INVALIDATION_TYPES_H_
7 - #define _XE_GT_TLB_INVALIDATION_TYPES_H_
8 -
9 - #include <linux/dma-fence.h>
10 -
11 - struct xe_gt;
12 -
13 - /**
14 -  * struct xe_gt_tlb_invalidation_fence - XE GT TLB invalidation fence
15 -  *
16 -  * Optionally passed to xe_gt_tlb_invalidation and will be signaled upon TLB
17 -  * invalidation completion.
18 -  */
19 - struct xe_gt_tlb_invalidation_fence {
20 - 	/** @base: dma fence base */
21 - 	struct dma_fence base;
22 - 	/** @gt: GT which fence belong to */
23 - 	struct xe_gt *gt;
24 - 	/** @link: link into list of pending tlb fences */
25 - 	struct list_head link;
26 - 	/** @seqno: seqno of TLB invalidation to signal fence one */
27 - 	int seqno;
28 - 	/** @invalidation_time: time of TLB invalidation */
29 - 	ktime_t invalidation_time;
30 - };
31 -
32 - #endif

+1 -1

drivers/gpu/drm/xe/xe_gt_topology.c

···
138 138 	 * but there's no tracking number assigned yet so we use a custom
139 139 	 * OOB workaround descriptor.
140 140 	 */
141 - 	if (XE_WA(gt, no_media_l3))
141 + 	if (XE_GT_WA(gt, no_media_l3))
142 142 		return;
143 143
144 144 	if (GRAPHICS_VER(xe) >= 30) {

+4 -29

drivers/gpu/drm/xe/xe_gt_types.h

···
17 17 #include "xe_oa_types.h"
18 18 #include "xe_reg_sr_types.h"
19 19 #include "xe_sa_types.h"
20 + #include "xe_tlb_inval_types.h"
20 21 #include "xe_uc_types.h"
21 22
22 23 struct xe_exec_queue_ops;
···
186 185 		struct work_struct worker;
187 186 	} reset;
188 187
189 - 	/** @tlb_invalidation: TLB invalidation state */
190 - 	struct {
191 - 		/** @tlb_invalidation.seqno: TLB invalidation seqno, protected by CT lock */
192 - #define TLB_INVALIDATION_SEQNO_MAX	0x100000
193 - 		int seqno;
194 - 		/**
195 - 		 * @tlb_invalidation.seqno_recv: last received TLB invalidation seqno,
196 - 		 * protected by CT lock
197 - 		 */
198 - 		int seqno_recv;
199 - 		/**
200 - 		 * @tlb_invalidation.pending_fences: list of pending fences waiting TLB
201 - 		 * invaliations, protected by CT lock
202 - 		 */
203 - 		struct list_head pending_fences;
204 - 		/**
205 - 		 * @tlb_invalidation.pending_lock: protects @tlb_invalidation.pending_fences
206 - 		 * and updating @tlb_invalidation.seqno_recv.
207 - 		 */
208 - 		spinlock_t pending_lock;
209 - 		/**
210 - 		 * @tlb_invalidation.fence_tdr: schedules a delayed call to
211 - 		 * xe_gt_tlb_fence_timeout after the timeut interval is over.
212 - 		 */
213 - 		struct delayed_work fence_tdr;
214 - 		/** @tlb_invalidation.lock: protects TLB invalidation fences */
215 - 		spinlock_t lock;
216 - 	} tlb_invalidation;
188 + 	/** @tlb_inval: TLB invalidation state */
189 + 	struct xe_tlb_inval tlb_inval;
217 190
218 191 	/**
219 192 	 * @ccs_mode: Number of compute engines enabled.
···
386 411 		unsigned long *oob;
387 412 		/**
388 413 		 * @wa_active.oob_initialized: mark oob as initialized to help
389 - 		 * detecting misuse of XE_WA() - it can only be called on
414 + 		 * detecting misuse of XE_GT_WA() - it can only be called on
390 415 		 * initialization after OOB WAs have being processed
391 416 		 */
392 417 		bool oob_initialized;

+33 -10

drivers/gpu/drm/xe/xe_guc.c

··· 16 16 #include "regs/xe_guc_regs.h" 17 17 #include "regs/xe_irq_regs.h" 18 18 #include "xe_bo.h" 19 + #include "xe_configfs.h" 19 20 #include "xe_device.h" 20 21 #include "xe_force_wake.h" 21 22 #include "xe_gt.h" ··· 82 81 83 82 static u32 guc_ctl_feature_flags(struct xe_guc *guc) 84 83 { 84 + struct xe_device *xe = guc_to_xe(guc); 85 85 u32 flags = GUC_CTL_ENABLE_LITE_RESTORE; 86 86 87 - if (!guc_to_xe(guc)->info.skip_guc_pc) 87 + if (!xe->info.skip_guc_pc) 88 88 flags |= GUC_CTL_ENABLE_SLPC; 89 + 90 + if (xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev))) 91 + flags |= GUC_CTL_ENABLE_PSMI_LOGGING; 89 92 90 93 return flags; 91 94 } ··· 162 157 * on RCS and CCSes with different address spaces, which on DG2 is 163 158 * required as a WA for an HW bug. 164 159 */ 165 - if (XE_WA(gt, 22011391025)) 160 + if (XE_GT_WA(gt, 22011391025)) 166 161 return true; 167 162 168 163 /* ··· 189 184 struct xe_gt *gt = guc_to_gt(guc); 190 185 u32 flags = 0; 191 186 192 - if (XE_WA(gt, 22012773006)) 187 + if (XE_GT_WA(gt, 22012773006)) 193 188 flags |= GUC_WA_POLLCS; 194 189 195 - if (XE_WA(gt, 14014475959)) 190 + if (XE_GT_WA(gt, 14014475959)) 196 191 flags |= GUC_WA_HOLD_CCS_SWITCHOUT; 197 192 198 193 if (needs_wa_dual_queue(gt)) ··· 206 201 if (GRAPHICS_VERx100(xe) < 1270) 207 202 flags |= GUC_WA_PRE_PARSER; 208 203 209 - if (XE_WA(gt, 22012727170) || XE_WA(gt, 22012727685)) 204 + if (XE_GT_WA(gt, 22012727170) || XE_GT_WA(gt, 22012727685)) 210 205 flags |= GUC_WA_CONTEXT_ISOLATION; 211 206 212 - if (XE_WA(gt, 18020744125) && 207 + if (XE_GT_WA(gt, 18020744125) && 213 208 !xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_RENDER)) 214 209 flags |= GUC_WA_RCS_REGS_IN_CCS_REGS_LIST; 215 210 216 - if (XE_WA(gt, 1509372804)) 211 + if (XE_GT_WA(gt, 1509372804)) 217 212 flags |= GUC_WA_RENDER_RST_RC6_EXIT; 218 213 219 - if (XE_WA(gt, 14018913170)) 214 + if (XE_GT_WA(gt, 14018913170)) 220 215 flags |= GUC_WA_ENABLE_TSC_CHECK_ON_RC6; 216 + 217 + if (XE_GT_WA(gt, 16023683509)) 218 + 
flags |= GUC_WA_SAVE_RESTORE_MCFG_REG_AT_MC6; 221 219 222 220 return flags; 223 221 } ··· 1000 992 case XE_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH: 1001 993 case XE_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE: 1002 994 case XE_GUC_LOAD_STATUS_HWCONFIG_ERROR: 995 + case XE_GUC_LOAD_STATUS_BOOTROM_VERSION_MISMATCH: 1003 996 case XE_GUC_LOAD_STATUS_DPC_ERROR: 1004 997 case XE_GUC_LOAD_STATUS_EXCEPTION: 1005 998 case XE_GUC_LOAD_STATUS_INIT_DATA_INVALID: 1006 999 case XE_GUC_LOAD_STATUS_MPU_DATA_INVALID: 1007 1000 case XE_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID: 1001 + case XE_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR: 1002 + case XE_GUC_LOAD_STATUS_INVALID_FTR_FLAG: 1008 1003 return -1; 1009 1004 } 1010 1005 ··· 1147 1136 } 1148 1137 1149 1138 switch (ukernel) { 1139 + case XE_GUC_LOAD_STATUS_HWCONFIG_START: 1140 + xe_gt_err(gt, "still extracting hwconfig table.\n"); 1141 + break; 1142 + 1150 1143 case XE_GUC_LOAD_STATUS_EXCEPTION: 1151 1144 xe_gt_err(gt, "firmware exception. EIP: %#x\n", 1152 1145 xe_mmio_read32(mmio, SOFT_SCRATCH(13))); 1146 + break; 1147 + 1148 + case XE_GUC_LOAD_STATUS_INIT_DATA_INVALID: 1149 + xe_gt_err(gt, "illegal init/ADS data\n"); 1153 1150 break; 1154 1151 1155 1152 case XE_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID: 1156 1153 xe_gt_err(gt, "illegal register in save/restore workaround list\n"); 1157 1154 break; 1158 1155 1159 - case XE_GUC_LOAD_STATUS_HWCONFIG_START: 1160 - xe_gt_err(gt, "still extracting hwconfig table.\n"); 1156 + case XE_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR: 1157 + xe_gt_err(gt, "illegal workaround KLV data\n"); 1158 + break; 1159 + 1160 + case XE_GUC_LOAD_STATUS_INVALID_FTR_FLAG: 1161 + xe_gt_err(gt, "illegal feature flag specified\n"); 1161 1162 break; 1162 1163 } 1163 1164

+52 -71

drivers/gpu/drm/xe/xe_guc_ads.c

··· 247 247 248 248 count += ADS_REGSET_EXTRA_MAX * XE_NUM_HW_ENGINES; 249 249 250 - if (XE_WA(gt, 1607983814)) 250 + if (XE_GT_WA(gt, 1607983814)) 251 251 count += LNCFCMOCS_REG_COUNT; 252 252 253 253 return count * sizeof(struct guc_mmio_reg); ··· 284 284 return total_size; 285 285 } 286 286 287 - static void guc_waklv_enable_one_word(struct xe_guc_ads *ads, 288 - enum xe_guc_klv_ids klv_id, 289 - u32 value, 290 - u32 *offset, u32 *remain) 287 + static void guc_waklv_enable(struct xe_guc_ads *ads, 288 + u32 data[], u32 data_len_dw, 289 + u32 *offset, u32 *remain, 290 + enum xe_guc_klv_ids klv_id) 291 291 { 292 - u32 size; 293 - u32 klv_entry[] = { 294 - /* 16:16 key/length */ 295 - FIELD_PREP(GUC_KLV_0_KEY, klv_id) | 296 - FIELD_PREP(GUC_KLV_0_LEN, 1), 297 - value, 298 - /* 1 dword data */ 299 - }; 300 - 301 - size = sizeof(klv_entry); 292 + size_t size = sizeof(u32) * (1 + data_len_dw); 302 293 303 294 if (*remain < size) { 304 295 drm_warn(&ads_to_xe(ads)->drm, 305 - "w/a klv buffer too small to add klv id %d\n", klv_id); 306 - } else { 307 - xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), *offset, 308 - klv_entry, size); 309 - *offset += size; 310 - *remain -= size; 311 - } 312 - } 313 - 314 - static void guc_waklv_enable_simple(struct xe_guc_ads *ads, 315 - enum xe_guc_klv_ids klv_id, u32 *offset, u32 *remain) 316 - { 317 - u32 klv_entry[] = { 318 - /* 16:16 key/length */ 319 - FIELD_PREP(GUC_KLV_0_KEY, klv_id) | 320 - FIELD_PREP(GUC_KLV_0_LEN, 0), 321 - /* 0 dwords data */ 322 - }; 323 - u32 size; 324 - 325 - size = sizeof(klv_entry); 326 - 327 - if (xe_gt_WARN(ads_to_gt(ads), *remain < size, 328 - "w/a klv buffer too small to add klv id %d\n", klv_id)) 296 + "w/a klv buffer too small to add klv id 0x%04X\n", klv_id); 329 297 return; 298 + } 330 299 331 - xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), *offset, 332 - klv_entry, size); 300 + /* 16:16 key/length */ 301 + xe_map_wr(ads_to_xe(ads), ads_to_map(ads), *offset, u32, 302 + 
FIELD_PREP(GUC_KLV_0_KEY, klv_id) | FIELD_PREP(GUC_KLV_0_LEN, data_len_dw)); 303 + /* data_len_dw dwords of data */ 304 + xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), 305 + *offset + sizeof(u32), data, data_len_dw * sizeof(u32)); 306 + 333 307 *offset += size; 334 308 *remain -= size; 335 309 } ··· 317 343 offset = guc_ads_waklv_offset(ads); 318 344 remain = guc_ads_waklv_size(ads); 319 345 320 - if (XE_WA(gt, 14019882105) || XE_WA(gt, 16021333562)) 321 - guc_waklv_enable_simple(ads, 322 - GUC_WORKAROUND_KLV_BLOCK_INTERRUPTS_WHEN_MGSR_BLOCKED, 323 - &offset, &remain); 324 - if (XE_WA(gt, 18024947630)) 325 - guc_waklv_enable_simple(ads, 326 - GUC_WORKAROUND_KLV_ID_GAM_PFQ_SHADOW_TAIL_POLLING, 327 - &offset, &remain); 328 - if (XE_WA(gt, 16022287689)) 329 - guc_waklv_enable_simple(ads, 330 - GUC_WORKAROUND_KLV_ID_DISABLE_MTP_DURING_ASYNC_COMPUTE, 331 - &offset, &remain); 346 + if (XE_GT_WA(gt, 14019882105) || XE_GT_WA(gt, 16021333562)) 347 + guc_waklv_enable(ads, NULL, 0, &offset, &remain, 348 + GUC_WORKAROUND_KLV_BLOCK_INTERRUPTS_WHEN_MGSR_BLOCKED); 349 + if (XE_GT_WA(gt, 18024947630)) 350 + guc_waklv_enable(ads, NULL, 0, &offset, &remain, 351 + GUC_WORKAROUND_KLV_ID_GAM_PFQ_SHADOW_TAIL_POLLING); 352 + if (XE_GT_WA(gt, 16022287689)) 353 + guc_waklv_enable(ads, NULL, 0, &offset, &remain, 354 + GUC_WORKAROUND_KLV_ID_DISABLE_MTP_DURING_ASYNC_COMPUTE); 332 355 333 - if (XE_WA(gt, 14022866841)) 334 - guc_waklv_enable_simple(ads, 335 - GUC_WA_KLV_WAKE_POWER_DOMAINS_FOR_OUTBOUND_MMIO, 336 - &offset, &remain); 356 + if (XE_GT_WA(gt, 14022866841)) 357 + guc_waklv_enable(ads, NULL, 0, &offset, &remain, 358 + GUC_WA_KLV_WAKE_POWER_DOMAINS_FOR_OUTBOUND_MMIO); 337 359 338 360 /* 339 361 * On RC6 exit, GuC will write register 0xB04 with the default value provided. As of now, 340 362 * the default value for this register is determined to be 0xC40. This could change in the 341 363 * future, so GuC depends on KMD to send it the correct value. 
342 364 */ 343 - if (XE_WA(gt, 13011645652)) 344 - guc_waklv_enable_one_word(ads, 345 - GUC_WA_KLV_NP_RD_WRITE_TO_CLEAR_RCSM_AT_CGP_LATE_RESTORE, 346 - 0xC40, 347 - &offset, &remain); 365 + if (XE_GT_WA(gt, 13011645652)) { 366 + u32 data = 0xC40; 348 367 349 - if (XE_WA(gt, 14022293748) || XE_WA(gt, 22019794406)) 350 - guc_waklv_enable_simple(ads, 351 - GUC_WORKAROUND_KLV_ID_BACK_TO_BACK_RCS_ENGINE_RESET, 352 - &offset, &remain); 368 + guc_waklv_enable(ads, &data, sizeof(data) / sizeof(u32), &offset, &remain, 369 + GUC_WA_KLV_NP_RD_WRITE_TO_CLEAR_RCSM_AT_CGP_LATE_RESTORE); 370 + } 353 371 354 - if (GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 44, 0) && XE_WA(gt, 16026508708)) 355 - guc_waklv_enable_simple(ads, 356 - GUC_WA_KLV_RESET_BB_STACK_PTR_ON_VF_SWITCH, 357 - &offset, &remain); 372 + if (XE_GT_WA(gt, 14022293748) || XE_GT_WA(gt, 22019794406)) 373 + guc_waklv_enable(ads, NULL, 0, &offset, &remain, 374 + GUC_WORKAROUND_KLV_ID_BACK_TO_BACK_RCS_ENGINE_RESET); 375 + 376 + if (GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 44, 0) && XE_GT_WA(gt, 16026508708)) 377 + guc_waklv_enable(ads, NULL, 0, &offset, &remain, 378 + GUC_WA_KLV_RESET_BB_STACK_PTR_ON_VF_SWITCH); 379 + if (GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 47, 0) && XE_GT_WA(gt, 16026007364)) { 380 + u32 data[] = { 381 + 0x0, 382 + 0xF, 383 + }; 384 + guc_waklv_enable(ads, data, sizeof(data) / sizeof(u32), &offset, &remain, 385 + GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG); 386 + } 387 + 388 + if (XE_GT_WA(gt, 14020001231)) 389 + guc_waklv_enable(ads, NULL, 0, &offset, &remain, 390 + GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT); 358 391 359 392 size = guc_ads_waklv_size(ads) - remain; 360 393 if (!size) ··· 765 784 guc_mmio_regset_write_one(ads, regset_map, e->reg, count++); 766 785 } 767 786 768 - if (XE_WA(hwe->gt, 1607983814) && hwe->class == XE_ENGINE_CLASS_RENDER) { 787 + if (XE_GT_WA(hwe->gt, 1607983814) && hwe->class == XE_ENGINE_CLASS_RENDER) { 769 788 
for (i = 0; i < LNCFCMOCS_REG_COUNT; i++) { 770 789 guc_mmio_regset_write_one(ads, regset_map, 771 790 XELP_LNCFCMOCS(i), count++);

+1 -1

drivers/gpu/drm/xe/xe_guc_buf.c

···
164 164 	if (offset < 0 || offset + size > cache->sam->base.size)
165 165 		return 0;
166 166
167 - 	return cache->sam->gpu_addr + offset;
167 + 	return xe_sa_manager_gpu_addr(cache->sam) + offset;
168 168 }
169 169
170 170 #if IS_BUILTIN(CONFIG_DRM_XE_KUNIT_TEST)

+3 -5

drivers/gpu/drm/xe/xe_guc_ct.c

···
26 26 #include "xe_gt_sriov_pf_control.h"
27 27 #include "xe_gt_sriov_pf_monitor.h"
28 28 #include "xe_gt_sriov_printk.h"
29 - #include "xe_gt_tlb_invalidation.h"
30 29 #include "xe_guc.h"
31 30 #include "xe_guc_log.h"
32 31 #include "xe_guc_relay.h"
33 32 #include "xe_guc_submit.h"
33 + #include "xe_guc_tlb_inval.h"
34 34 #include "xe_map.h"
35 35 #include "xe_pm.h"
36 36 #include "xe_trace_guc.h"
···
1416 1416 		ret = xe_guc_pagefault_handler(guc, payload, adj_len);
1417 1417 		break;
1418 1418 	case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
1419 - 		ret = xe_guc_tlb_invalidation_done_handler(guc, payload,
1420 - 							   adj_len);
1419 + 		ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
1421 1420 		break;
1422 1421 	case XE_GUC_ACTION_ACCESS_COUNTER_NOTIFY:
1423 1422 		ret = xe_guc_access_counter_notify_handler(guc, payload,
···
1617 1618 		break;
1618 1619 	case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
1619 1620 		__g2h_release_space(ct, len);
1620 - 		ret = xe_guc_tlb_invalidation_done_handler(guc, payload,
1621 - 							   adj_len);
1621 + 		ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
1622 1622 		break;
1623 1623 	default:
1624 1624 		xe_gt_warn(gt, "NOT_POSSIBLE");

+7

drivers/gpu/drm/xe/xe_guc_fwif.h

···
45 45 #define GUC_MAX_ENGINE_CLASSES		16
46 46 #define GUC_MAX_INSTANCES_PER_CLASS	32
47 47
48 + #define GUC_CONTEXT_NORMAL			0
49 + #define GUC_CONTEXT_COMPRESSION_SAVE		1
50 + #define GUC_CONTEXT_COMPRESSION_RESTORE	2
51 + #define GUC_CONTEXT_COUNT			(GUC_CONTEXT_COMPRESSION_RESTORE + 1)
52 +
48 53 /* Helper for context registration H2G */
49 54 struct guc_ctxt_registration_info {
50 55 	u32 flags;
···
108 103 #define   GUC_WA_RENDER_RST_RC6_EXIT	BIT(19)
109 104 #define   GUC_WA_RCS_REGS_IN_CCS_REGS_LIST	BIT(21)
110 105 #define   GUC_WA_ENABLE_TSC_CHECK_ON_RC6	BIT(22)
106 + #define   GUC_WA_SAVE_RESTORE_MCFG_REG_AT_MC6	BIT(25)
111 107
112 108 #define GUC_CTL_FEATURE			2
113 109 #define   GUC_CTL_ENABLE_SLPC		BIT(2)
114 110 #define   GUC_CTL_ENABLE_LITE_RESTORE	BIT(4)
111 + #define   GUC_CTL_ENABLE_PSMI_LOGGING	BIT(7)
115 112 #define   GUC_CTL_DISABLE_SCHEDULER	BIT(14)
116 113
117 114 #define GUC_CTL_DEBUG			3

+6 -17

drivers/gpu/drm/xe/xe_guc_pc.c

··· 722 722 */ 723 723 int xe_guc_pc_set_max_freq(struct xe_guc_pc *pc, u32 freq) 724 724 { 725 - if (XE_WA(pc_to_gt(pc), 22019338487)) { 725 + if (XE_GT_WA(pc_to_gt(pc), 22019338487)) { 726 726 if (wait_for_flush_complete(pc) != 0) 727 727 return -EAGAIN; 728 728 } ··· 835 835 { 836 836 struct xe_gt *gt = pc_to_gt(pc); 837 837 838 - if (XE_WA(gt, 22019338487)) { 838 + if (XE_GT_WA(gt, 22019338487)) { 839 839 if (xe_gt_is_media_type(gt)) 840 840 return min(LNL_MERT_FREQ_CAP, pc->rp0_freq); 841 841 else ··· 899 899 if (pc_get_min_freq(pc) > pc->rp0_freq) 900 900 ret = pc_set_min_freq(pc, pc->rp0_freq); 901 901 902 - if (XE_WA(tile->primary_gt, 14022085890)) 902 + if (XE_GT_WA(tile->primary_gt, 14022085890)) 903 903 ret = pc_set_min_freq(pc, max(BMG_MIN_FREQ, pc_get_min_freq(pc))); 904 904 905 905 out: ··· 931 931 { 932 932 struct xe_gt *gt = pc_to_gt(pc); 933 933 934 - return XE_WA(gt, 22019338487) && 934 + return XE_GT_WA(gt, 22019338487) && 935 935 pc->rp0_freq > BMG_MERT_FLUSH_FREQ_CAP; 936 936 } 937 937 ··· 1017 1017 { 1018 1018 int ret; 1019 1019 1020 - if (!XE_WA(pc_to_gt(pc), 22019338487)) 1020 + if (!XE_GT_WA(pc_to_gt(pc), 22019338487)) 1021 1021 return 0; 1022 1022 1023 1023 guard(mutex)(&pc->freq_lock); ··· 1076 1076 { 1077 1077 struct xe_device *xe = pc_to_xe(pc); 1078 1078 struct xe_gt *gt = pc_to_gt(pc); 1079 - unsigned int fw_ref; 1080 1079 int ret = 0; 1081 1080 1082 1081 if (xe->info.skip_guc_pc) ··· 1085 1086 if (ret) 1086 1087 return ret; 1087 1088 1088 - fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL); 1089 - if (!xe_force_wake_ref_has_domain(fw_ref, XE_FORCEWAKE_ALL)) { 1090 - xe_force_wake_put(gt_to_fw(gt), fw_ref); 1091 - return -ETIMEDOUT; 1092 - } 1093 - 1094 - xe_gt_idle_disable_c6(gt); 1095 - 1096 - xe_force_wake_put(gt_to_fw(gt), fw_ref); 1097 - 1098 - return 0; 1089 + return xe_gt_idle_disable_c6(gt); 1099 1090 } 1100 1091 1101 1092 /**

+207 -2

drivers/gpu/drm/xe/xe_guc_submit.c

··· 542 542 xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0); 543 543 } 544 544 545 - static void register_exec_queue(struct xe_exec_queue *q) 545 + static void register_exec_queue(struct xe_exec_queue *q, int ctx_type) 546 546 { 547 547 struct xe_guc *guc = exec_queue_to_guc(q); 548 548 struct xe_device *xe = guc_to_xe(guc); ··· 550 550 struct guc_ctxt_registration_info info; 551 551 552 552 xe_gt_assert(guc_to_gt(guc), !exec_queue_registered(q)); 553 + xe_gt_assert(guc_to_gt(guc), ctx_type < GUC_CONTEXT_COUNT); 553 554 554 555 memset(&info, 0, sizeof(info)); 555 556 info.context_idx = q->guc->id; ··· 559 558 info.hwlrca_lo = lower_32_bits(xe_lrc_descriptor(lrc)); 560 559 info.hwlrca_hi = upper_32_bits(xe_lrc_descriptor(lrc)); 561 560 info.flags = CONTEXT_REGISTRATION_FLAG_KMD; 561 + 562 + if (ctx_type != GUC_CONTEXT_NORMAL) 563 + info.flags |= BIT(ctx_type); 562 564 563 565 if (xe_exec_queue_is_parallel(q)) { 564 566 u64 ggtt_addr = xe_lrc_parallel_ggtt_addr(lrc); ··· 671 667 if (wq_wait_for_space(q, wqi_size)) 672 668 return; 673 669 670 + xe_gt_assert(guc_to_gt(guc), i == XE_GUC_CONTEXT_WQ_HEADER_DATA_0_TYPE_LEN); 674 671 wqi[i++] = FIELD_PREP(WQ_TYPE_MASK, WQ_TYPE_MULTI_LRC) | 675 672 FIELD_PREP(WQ_LEN_MASK, len_dw); 673 + xe_gt_assert(guc_to_gt(guc), i == XE_GUC_CONTEXT_WQ_EL_INFO_DATA_1_CTX_DESC_LOW); 676 674 wqi[i++] = xe_lrc_descriptor(q->lrc[0]); 675 + xe_gt_assert(guc_to_gt(guc), i == 676 + XE_GUC_CONTEXT_WQ_EL_INFO_DATA_2_GUCCTX_RINGTAIL_FREEZEPOCS); 677 677 wqi[i++] = FIELD_PREP(WQ_GUC_ID_MASK, q->guc->id) | 678 678 FIELD_PREP(WQ_RING_TAIL_MASK, q->lrc[0]->ring.tail / sizeof(u64)); 679 + xe_gt_assert(guc_to_gt(guc), i == XE_GUC_CONTEXT_WQ_EL_INFO_DATA_3_WI_FENCE_ID); 679 680 wqi[i++] = 0; 681 + xe_gt_assert(guc_to_gt(guc), i == XE_GUC_CONTEXT_WQ_EL_CHILD_LIST_DATA_4_RINGTAIL); 680 682 for (j = 1; j < q->width; ++j) { 681 683 struct xe_lrc *lrc = q->lrc[j]; 682 684 ··· 701 691 702 692 map = xe_lrc_parallel_map(q->lrc[0]); 703 693 
parallel_write(xe, map, wq_desc.tail, q->guc->wqi_tail); 694 + } 695 + 696 + static int wq_items_rebase(struct xe_exec_queue *q) 697 + { 698 + struct xe_guc *guc = exec_queue_to_guc(q); 699 + struct xe_device *xe = guc_to_xe(guc); 700 + struct iosys_map map = xe_lrc_parallel_map(q->lrc[0]); 701 + int i = q->guc->wqi_head; 702 + 703 + /* the ring starts after a header struct */ 704 + iosys_map_incr(&map, offsetof(struct guc_submit_parallel_scratch, wq[0])); 705 + 706 + while ((i % WQ_SIZE) != (q->guc->wqi_tail % WQ_SIZE)) { 707 + u32 len_dw, type, val; 708 + 709 + if (drm_WARN_ON_ONCE(&xe->drm, i < 0 || i > 2 * WQ_SIZE)) 710 + break; 711 + 712 + val = xe_map_rd_ring_u32(xe, &map, i / sizeof(u32) + 713 + XE_GUC_CONTEXT_WQ_HEADER_DATA_0_TYPE_LEN, 714 + WQ_SIZE / sizeof(u32)); 715 + len_dw = FIELD_GET(WQ_LEN_MASK, val); 716 + type = FIELD_GET(WQ_TYPE_MASK, val); 717 + 718 + if (drm_WARN_ON_ONCE(&xe->drm, len_dw >= WQ_SIZE / sizeof(u32))) 719 + break; 720 + 721 + if (type == WQ_TYPE_MULTI_LRC) { 722 + val = xe_lrc_descriptor(q->lrc[0]); 723 + xe_map_wr_ring_u32(xe, &map, i / sizeof(u32) + 724 + XE_GUC_CONTEXT_WQ_EL_INFO_DATA_1_CTX_DESC_LOW, 725 + WQ_SIZE / sizeof(u32), val); 726 + } else if (drm_WARN_ON_ONCE(&xe->drm, type != WQ_TYPE_NOOP)) { 727 + break; 728 + } 729 + 730 + i += (len_dw + 1) * sizeof(u32); 731 + } 732 + 733 + if ((i % WQ_SIZE) != (q->guc->wqi_tail % WQ_SIZE)) { 734 + xe_gt_err(q->gt, "Exec queue fixups incomplete - wqi parse failed\n"); 735 + return -EBADMSG; 736 + } 737 + return 0; 704 738 } 705 739 706 740 #define RESUME_PENDING ~0x0ull ··· 815 761 816 762 if (!exec_queue_killed_or_banned_or_wedged(q) && !xe_sched_job_is_error(job)) { 817 763 if (!exec_queue_registered(q)) 818 - register_exec_queue(q); 764 + register_exec_queue(q, GUC_CONTEXT_NORMAL); 819 765 if (!lr) /* LR jobs are emitted in the exec IOCTL */ 820 766 q->ring_ops->emit_job(job); 821 767 submit_exec_queue(q); ··· 829 775 } 830 776 831 777 return fence; 778 + } 779 + 780 + /** 781 + * 
xe_guc_jobs_ring_rebase - Re-emit ring commands of requests pending 782 + * on all queues under a guc. 783 + * @guc: the &xe_guc struct instance 784 + */ 785 + void xe_guc_jobs_ring_rebase(struct xe_guc *guc) 786 + { 787 + struct xe_exec_queue *q; 788 + unsigned long index; 789 + 790 + /* 791 + * This routine is used within VF migration recovery. This means 792 + * using the lock here introduces a restriction: we cannot wait 793 + * for any GFX HW response while the lock is taken. 794 + */ 795 + mutex_lock(&guc->submission_state.lock); 796 + xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) { 797 + if (exec_queue_killed_or_banned_or_wedged(q)) 798 + continue; 799 + xe_exec_queue_jobs_ring_restore(q); 800 + } 801 + mutex_unlock(&guc->submission_state.lock); 832 802 } 833 803 834 804 static void guc_exec_queue_free_job(struct drm_sched_job *drm_job) ··· 1851 1773 } 1852 1774 } 1853 1775 1776 + /** 1777 + * xe_guc_submit_reset_block - Disallow reset calls on given GuC. 1778 + * @guc: the &xe_guc struct instance 1779 + */ 1780 + int xe_guc_submit_reset_block(struct xe_guc *guc) 1781 + { 1782 + return atomic_fetch_or(1, &guc->submission_state.reset_blocked); 1783 + } 1784 + 1785 + /** 1786 + * xe_guc_submit_reset_unblock - Allow back reset calls on given GuC. 1787 + * @guc: the &xe_guc struct instance 1788 + */ 1789 + void xe_guc_submit_reset_unblock(struct xe_guc *guc) 1790 + { 1791 + atomic_set_release(&guc->submission_state.reset_blocked, 0); 1792 + wake_up_all(&guc->ct.wq); 1793 + } 1794 + 1795 + static int guc_submit_reset_is_blocked(struct xe_guc *guc) 1796 + { 1797 + return atomic_read_acquire(&guc->submission_state.reset_blocked); 1798 + } 1799 + 1800 + /* Maximum time of blocking reset */ 1801 + #define RESET_BLOCK_PERIOD_MAX (HZ * 5) 1802 + 1803 + /** 1804 + * xe_guc_wait_reset_unblock - Wait until reset blocking flag is lifted, or timeout. 
1805 + * @guc: the &xe_guc struct instance 1806 + */ 1807 + int xe_guc_wait_reset_unblock(struct xe_guc *guc) 1808 + { 1809 + return wait_event_timeout(guc->ct.wq, 1810 + !guc_submit_reset_is_blocked(guc), RESET_BLOCK_PERIOD_MAX); 1811 + } 1812 + 1854 1813 int xe_guc_submit_reset_prepare(struct xe_guc *guc) 1855 1814 { 1856 1815 int ret; ··· 1941 1826 1942 1827 } 1943 1828 1829 + /** 1830 + * xe_guc_submit_pause - Stop further runs of submission tasks on given GuC. 1831 + * @guc: the &xe_guc struct instance whose scheduler is to be disabled 1832 + */ 1833 + void xe_guc_submit_pause(struct xe_guc *guc) 1834 + { 1835 + struct xe_exec_queue *q; 1836 + unsigned long index; 1837 + 1838 + xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) 1839 + xe_sched_submission_stop_async(&q->guc->sched); 1840 + } 1841 + 1944 1842 static void guc_exec_queue_start(struct xe_exec_queue *q) 1945 1843 { 1946 1844 struct xe_gpu_scheduler *sched = &q->guc->sched; ··· 1992 1864 wake_up_all(&guc->ct.wq); 1993 1865 1994 1866 return 0; 1867 + } 1868 + 1869 + static void guc_exec_queue_unpause(struct xe_exec_queue *q) 1870 + { 1871 + struct xe_gpu_scheduler *sched = &q->guc->sched; 1872 + 1873 + xe_sched_submission_start(sched); 1874 + } 1875 + 1876 + /** 1877 + * xe_guc_submit_unpause - Allow further runs of submission tasks on given GuC. 1878 + * @guc: the &xe_guc struct instance whose scheduler is to be enabled 1879 + */ 1880 + void xe_guc_submit_unpause(struct xe_guc *guc) 1881 + { 1882 + struct xe_exec_queue *q; 1883 + unsigned long index; 1884 + 1885 + xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) 1886 + guc_exec_queue_unpause(q); 1887 + 1888 + wake_up_all(&guc->ct.wq); 1995 1889 } 1996 1890 1997 1891 static struct xe_exec_queue * ··· 2528 2378 } 2529 2379 2530 2380 /** 2381 + * xe_guc_register_exec_queue - Register exec queue for a given context type. 
2382 + * @q: Execution queue 2383 + * @ctx_type: Type of the context 2384 + * 2385 + * This function registers the execution queue with the guc. Special context 2386 + * types like GUC_CONTEXT_COMPRESSION_SAVE and GUC_CONTEXT_COMPRESSION_RESTORE 2387 + * are only applicable for IGPU and in the VF. 2388 + * Submits the execution queue to GUC after registering it. 2389 + * 2390 + * Returns - None. 2391 + */ 2392 + void xe_guc_register_exec_queue(struct xe_exec_queue *q, int ctx_type) 2393 + { 2394 + struct xe_guc *guc = exec_queue_to_guc(q); 2395 + struct xe_device *xe = guc_to_xe(guc); 2396 + 2397 + xe_assert(xe, IS_SRIOV_VF(xe)); 2398 + xe_assert(xe, !IS_DGFX(xe)); 2399 + xe_assert(xe, (ctx_type > GUC_CONTEXT_NORMAL && 2400 + ctx_type < GUC_CONTEXT_COUNT)); 2401 + 2402 + register_exec_queue(q, ctx_type); 2403 + enable_scheduling(q); 2404 + } 2405 + 2406 + /** 2531 2407 * xe_guc_submit_print - GuC Submit Print. 2532 2408 * @guc: GuC. 2533 2409 * @p: drm_printer where it will be printed out. ··· 2572 2396 xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) 2573 2397 guc_exec_queue_print(q, p); 2574 2398 mutex_unlock(&guc->submission_state.lock); 2399 + } 2400 + 2401 + /** 2402 + * xe_guc_contexts_hwsp_rebase - Re-compute GGTT references within all 2403 + * exec queues registered to given GuC. 2404 + * @guc: the &xe_guc struct instance 2405 + * @scratch: scratch buffer to be used as temporary storage 2406 + * 2407 + * Returns: zero on success, negative error code on failure. 
2408 + */ 2409 + int xe_guc_contexts_hwsp_rebase(struct xe_guc *guc, void *scratch) 2410 + { 2411 + struct xe_exec_queue *q; 2412 + unsigned long index; 2413 + int err = 0; 2414 + 2415 + mutex_lock(&guc->submission_state.lock); 2416 + xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) { 2417 + err = xe_exec_queue_contexts_hwsp_rebase(q, scratch); 2418 + if (err) 2419 + break; 2420 + if (xe_exec_queue_is_parallel(q)) 2421 + err = wq_items_rebase(q); 2422 + if (err) 2423 + break; 2424 + } 2425 + mutex_unlock(&guc->submission_state.lock); 2426 + 2427 + return err; 2575 2428 }
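
Note on `wq_items_rebase()` above: it walks the parallel work queue from `wqi_head` to `wqi_tail`, decodes each item's header dword, rewrites the context descriptor in multi-LRC items, and steps forward by `(len_dw + 1)` dwords. The sketch below shows that walk over a plain array. It is an illustration under stated assumptions, not the driver code: the ring size, the field masks and the type values are placeholders, and `ring_rebase()` is a hypothetical name.

```c
#include <stdint.h>

/* All constants below are placeholders for the sketch. */
#define RING_BYTES	64u			/* stands in for WQ_SIZE */
#define RING_DW		(RING_BYTES / 4u)
#define TYPE_MASK	0x000000ffu		/* assumed header layout */
#define LEN_SHIFT	16
#define LEN_MASK	0x07ff0000u
#define TYPE_NOOP	4u
#define TYPE_MULTI	5u

/*
 * Walk items between byte offsets head and tail, patching dword 1 of every
 * multi-LRC item with new_desc. Returns 0 if the walk ends exactly at tail,
 * -1 if parsing stopped early (bad length, unknown type, runaway offset).
 */
static int ring_rebase(uint32_t ring[RING_DW], uint32_t head, uint32_t tail,
		       uint32_t new_desc)
{
	uint32_t i = head;

	while ((i % RING_BYTES) != (tail % RING_BYTES)) {
		uint32_t hdr, len_dw, type;

		if (i > 2 * RING_BYTES)		/* runaway guard */
			break;

		hdr = ring[(i / 4u) % RING_DW];
		len_dw = (hdr & LEN_MASK) >> LEN_SHIFT;
		type = hdr & TYPE_MASK;

		if (len_dw >= RING_DW)		/* item cannot exceed the ring */
			break;

		if (type == TYPE_MULTI)
			ring[(i / 4u + 1u) % RING_DW] = new_desc;
		else if (type != TYPE_NOOP)
			break;

		i += (len_dw + 1u) * 4u;	/* header dword plus payload */
	}

	return (i % RING_BYTES) == (tail % RING_BYTES) ? 0 : -1;
}
```

As in the diff, success is decided by comparing the final offset with the tail modulo the ring size, so an item that wraps past the end of the ring still parses cleanly.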

+10

drivers/gpu/drm/xe/xe_guc_submit.h

···
18 18 void xe_guc_submit_reset_wait(struct xe_guc *guc);
19 19 void xe_guc_submit_stop(struct xe_guc *guc);
20 20 int xe_guc_submit_start(struct xe_guc *guc);
21 + void xe_guc_submit_pause(struct xe_guc *guc);
22 + void xe_guc_submit_unpause(struct xe_guc *guc);
23 + int xe_guc_submit_reset_block(struct xe_guc *guc);
24 + void xe_guc_submit_reset_unblock(struct xe_guc *guc);
25 + int xe_guc_wait_reset_unblock(struct xe_guc *guc);
21 26 void xe_guc_submit_wedge(struct xe_guc *guc);
22 27
23 28 int xe_guc_read_stopped(struct xe_guc *guc);
···
34 29 int xe_guc_exec_queue_reset_failure_handler(struct xe_guc *guc, u32 *msg, u32 len);
35 30 int xe_guc_error_capture_handler(struct xe_guc *guc, u32 *msg, u32 len);
36 31
32 + void xe_guc_jobs_ring_rebase(struct xe_guc *guc);
33 +
37 34 struct xe_guc_submit_exec_queue_snapshot *
38 35 xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q);
39 36 void
···
46 39 void
47 40 xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *snapshot);
48 41 void xe_guc_submit_print(struct xe_guc *guc, struct drm_printer *p);
42 + void xe_guc_register_exec_queue(struct xe_exec_queue *q, int ctx_type);
43 +
44 + int xe_guc_contexts_hwsp_rebase(struct xe_guc *guc, void *scratch);
49 45
50 46 #endif
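
Note on the reset-blocking helpers declared above: `xe_guc_submit_reset_block()` returns the result of `atomic_fetch_or()`, that is, the previous value of the flag, so a caller can tell whether a block was already in place. A rough userspace analogue with C11 atomics follows. It is only a sketch: `struct reset_gate` and the function names are hypothetical, and the wait queue and timeout used by `xe_guc_wait_reset_unblock()` are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for guc->submission_state.reset_blocked. */
struct reset_gate {
	atomic_int blocked;
};

/* Set the flag; returns the previous value (nonzero: already blocked). */
static int reset_gate_block(struct reset_gate *g)
{
	return atomic_fetch_or_explicit(&g->blocked, 1, memory_order_seq_cst);
}

/* Clear the flag with release ordering; the driver also wakes waiters here. */
static void reset_gate_unblock(struct reset_gate *g)
{
	atomic_store_explicit(&g->blocked, 0, memory_order_release);
}

/* Acquire-load the flag, pairing with the release store above. */
static bool reset_gate_is_blocked(struct reset_gate *g)
{
	return atomic_load_explicit(&g->blocked, memory_order_acquire) != 0;
}
```

The release store paired with the acquire load means that whatever the blocking side wrote before unblocking is visible to a reset path that observes the flag as clear.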

+242

drivers/gpu/drm/xe/xe_guc_tlb_inval.c
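
Note on the range path in this file, `send_tlb_inval_ppgtt()`: a selective invalidation is expressed as a power-of-two length with a start address aligned to that length, so the requested `[start, end)` range is widened until one such block covers it, and any block of 2M or more is raised to at least 16M. The standalone sketch below follows the arithmetic in the hunk. The helper names (`widen_range()`, `size_field()`) are hypothetical, and it assumes the length is small enough that the power-of-two rounding cannot overflow, which the driver guards with `MAX_RANGE_TLB_INVALIDATION_LENGTH`.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K	0x1000ull
#define SZ_2M	0x200000ull
#define SZ_16M	0x1000000ull

struct inval_range {
	uint64_t start;
	uint64_t length;
};

/* Smallest power of two >= v; assumes v <= 2^63 (see lead-in). */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/* Widen [start, end) to a single aligned power-of-two block. */
static struct inval_range widen_range(uint64_t start, uint64_t end)
{
	uint64_t orig_start = start;
	uint64_t length = end - start;
	uint64_t align;

	if (length < SZ_4K)
		length = SZ_4K;

	align = roundup_pow2(length);
	start &= ~(align - 1);				/* ALIGN_DOWN */
	end = (end + align - 1) & ~(align - 1);		/* ALIGN */
	length = align;
	while (start + length < end) {
		length <<= 1;
		start = orig_start & ~(length - 1);
	}

	/* a block of 2M or more must be at least 16M */
	if (length >= SZ_2M) {
		if (length < SZ_16M)
			length = SZ_16M;
		start = orig_start & ~(length - 1);
	}

	assert(length >= SZ_4K);
	assert((length & (length - 1)) == 0);
	assert((start & (length - 1)) == 0);

	return (struct inval_range){ start, length };
}

/* Size dword sent last in the action: log2(length) - log2(4K). */
static unsigned int size_field(uint64_t length)
{
	unsigned int n = 0;

	while (length >>= 1)
		n++;
	return n - 12;
}
```

For example, the unaligned request `[0x1000, 0x3000)` cannot be covered by an aligned 8K block, so it grows to the 16K block starting at 0.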

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include "abi/guc_actions_abi.h" 7 + 8 + #include "xe_device.h" 9 + #include "xe_gt_stats.h" 10 + #include "xe_gt_types.h" 11 + #include "xe_guc.h" 12 + #include "xe_guc_ct.h" 13 + #include "xe_guc_tlb_inval.h" 14 + #include "xe_force_wake.h" 15 + #include "xe_mmio.h" 16 + #include "xe_tlb_inval.h" 17 + 18 + #include "regs/xe_guc_regs.h" 19 + 20 + /* 21 + * XXX: The seqno algorithm relies on TLB invalidation being processed in order 22 + * which they currently are by the GuC, if that changes the algorithm will need 23 + * to be updated. 24 + */ 25 + 26 + static int send_tlb_inval(struct xe_guc *guc, const u32 *action, int len) 27 + { 28 + struct xe_gt *gt = guc_to_gt(guc); 29 + 30 + xe_gt_assert(gt, action[1]); /* Seqno */ 31 + 32 + xe_gt_stats_incr(gt, XE_GT_STATS_ID_TLB_INVAL, 1); 33 + return xe_guc_ct_send(&guc->ct, action, len, 34 + G2H_LEN_DW_TLB_INVALIDATE, 1); 35 + } 36 + 37 + #define MAKE_INVAL_OP(type) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \ 38 + XE_GUC_TLB_INVAL_MODE_HEAVY << XE_GUC_TLB_INVAL_MODE_SHIFT | \ 39 + XE_GUC_TLB_INVAL_FLUSH_CACHE) 40 + 41 + static int send_tlb_inval_all(struct xe_tlb_inval *tlb_inval, u32 seqno) 42 + { 43 + struct xe_guc *guc = tlb_inval->private; 44 + u32 action[] = { 45 + XE_GUC_ACTION_TLB_INVALIDATION_ALL, 46 + seqno, 47 + MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL), 48 + }; 49 + 50 + return send_tlb_inval(guc, action, ARRAY_SIZE(action)); 51 + } 52 + 53 + static int send_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval, u32 seqno) 54 + { 55 + struct xe_guc *guc = tlb_inval->private; 56 + struct xe_gt *gt = guc_to_gt(guc); 57 + struct xe_device *xe = guc_to_xe(guc); 58 + 59 + /* 60 + * Returning -ECANCELED in this function is squashed at the caller and 61 + * signals waiters. 
62 + */ 63 + 64 + if (xe_guc_ct_enabled(&guc->ct) && guc->submission_state.enabled) { 65 + u32 action[] = { 66 + XE_GUC_ACTION_TLB_INVALIDATION, 67 + seqno, 68 + MAKE_INVAL_OP(XE_GUC_TLB_INVAL_GUC), 69 + }; 70 + 71 + return send_tlb_inval(guc, action, ARRAY_SIZE(action)); 72 + } else if (xe_device_uc_enabled(xe) && !xe_device_wedged(xe)) { 73 + struct xe_mmio *mmio = &gt->mmio; 74 + unsigned int fw_ref; 75 + 76 + if (IS_SRIOV_VF(xe)) 77 + return -ECANCELED; 78 + 79 + fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); 80 + if (xe->info.platform == XE_PVC || GRAPHICS_VER(xe) >= 20) { 81 + xe_mmio_write32(mmio, PVC_GUC_TLB_INV_DESC1, 82 + PVC_GUC_TLB_INV_DESC1_INVALIDATE); 83 + xe_mmio_write32(mmio, PVC_GUC_TLB_INV_DESC0, 84 + PVC_GUC_TLB_INV_DESC0_VALID); 85 + } else { 86 + xe_mmio_write32(mmio, GUC_TLB_INV_CR, 87 + GUC_TLB_INV_CR_INVALIDATE); 88 + } 89 + xe_force_wake_put(gt_to_fw(gt), fw_ref); 90 + } 91 + 92 + return -ECANCELED; 93 + } 94 + 95 + /* 96 + * Ensure that roundup_pow_of_two(length) doesn't overflow. 97 + * Note that roundup_pow_of_two() operates on unsigned long, 98 + * not on u64. 
99 + */ 100 + #define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX)) 101 + 102 + static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno, 103 + u64 start, u64 end, u32 asid) 104 + { 105 + #define MAX_TLB_INVALIDATION_LEN 7 106 + struct xe_guc *guc = tlb_inval->private; 107 + struct xe_gt *gt = guc_to_gt(guc); 108 + u32 action[MAX_TLB_INVALIDATION_LEN]; 109 + u64 length = end - start; 110 + int len = 0; 111 + 112 + if (guc_to_xe(guc)->info.force_execlist) 113 + return -ECANCELED; 114 + 115 + action[len++] = XE_GUC_ACTION_TLB_INVALIDATION; 116 + action[len++] = seqno; 117 + if (!gt_to_xe(gt)->info.has_range_tlb_inval || 118 + length > MAX_RANGE_TLB_INVALIDATION_LENGTH) { 119 + action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL); 120 + } else { 121 + u64 orig_start = start; 122 + u64 align; 123 + 124 + if (length < SZ_4K) 125 + length = SZ_4K; 126 + 127 + /* 128 + * We need to invalidate a higher granularity if start address 129 + * is not aligned to length. When start is not aligned with 130 + * length we need to find the length large enough to create an 131 + * address mask covering the required range. 
132 + */ 133 + align = roundup_pow_of_two(length); 134 + start = ALIGN_DOWN(start, align); 135 + end = ALIGN(end, align); 136 + length = align; 137 + while (start + length < end) { 138 + length <<= 1; 139 + start = ALIGN_DOWN(orig_start, length); 140 + } 141 + 142 + /* 143 + * Minimum invalidation size for a 2MB page that the hardware 144 + * expects is 16MB 145 + */ 146 + if (length >= SZ_2M) { 147 + length = max_t(u64, SZ_16M, length); 148 + start = ALIGN_DOWN(orig_start, length); 149 + } 150 + 151 + xe_gt_assert(gt, length >= SZ_4K); 152 + xe_gt_assert(gt, is_power_of_2(length)); 153 + xe_gt_assert(gt, !(length & GENMASK(ilog2(SZ_16M) - 1, 154 + ilog2(SZ_2M) + 1))); 155 + xe_gt_assert(gt, IS_ALIGNED(start, length)); 156 + 157 + action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE); 158 + action[len++] = asid; 159 + action[len++] = lower_32_bits(start); 160 + action[len++] = upper_32_bits(start); 161 + action[len++] = ilog2(length) - ilog2(SZ_4K); 162 + } 163 + 164 + xe_gt_assert(gt, len <= MAX_TLB_INVALIDATION_LEN); 165 + 166 + return send_tlb_inval(guc, action, len); 167 + } 168 + 169 + static bool tlb_inval_initialized(struct xe_tlb_inval *tlb_inval) 170 + { 171 + struct xe_guc *guc = tlb_inval->private; 172 + 173 + return xe_guc_ct_initialized(&guc->ct); 174 + } 175 + 176 + static void tlb_inval_flush(struct xe_tlb_inval *tlb_inval) 177 + { 178 + struct xe_guc *guc = tlb_inval->private; 179 + 180 + LNL_FLUSH_WORK(&guc->ct.g2h_worker); 181 + } 182 + 183 + static long tlb_inval_timeout_delay(struct xe_tlb_inval *tlb_inval) 184 + { 185 + struct xe_guc *guc = tlb_inval->private; 186 + 187 + /* this reflects what HW/GuC needs to process TLB inv request */ 188 + const long hw_tlb_timeout = HZ / 4; 189 + 190 + /* this estimates actual delay caused by the CTB transport */ 191 + long delay = xe_guc_ct_queue_proc_time_jiffies(&guc->ct); 192 + 193 + return hw_tlb_timeout + 2 * delay; 194 + } 195 + 196 + static const struct xe_tlb_inval_ops guc_tlb_inval_ops = { 
197 + .all = send_tlb_inval_all, 198 + .ggtt = send_tlb_inval_ggtt, 199 + .ppgtt = send_tlb_inval_ppgtt, 200 + .initialized = tlb_inval_initialized, 201 + .flush = tlb_inval_flush, 202 + .timeout_delay = tlb_inval_timeout_delay, 203 + }; 204 + 205 + /** 206 + * xe_guc_tlb_inval_init_early() - Init GuC TLB invalidation early 207 + * @guc: GuC object 208 + * @tlb_inval: TLB invalidation client 209 + * 210 + * Initialize GuC TLB invalidation by setting back pointer in TLB invalidation 211 + * client to the GuC and setting GuC backend ops. 212 + */ 213 + void xe_guc_tlb_inval_init_early(struct xe_guc *guc, 214 + struct xe_tlb_inval *tlb_inval) 215 + { 216 + tlb_inval->private = guc; 217 + tlb_inval->ops = &guc_tlb_inval_ops; 218 + } 219 + 220 + /** 221 + * xe_guc_tlb_inval_done_handler() - TLB invalidation done handler 222 + * @guc: guc 223 + * @msg: message indicating TLB invalidation done 224 + * @len: length of message 225 + * 226 + * Parse seqno of TLB invalidation, wake any waiters for seqno, and signal any 227 + * invalidation fences for seqno. Algorithm for this depends on seqno being 228 + * received in-order and asserts this assumption. 229 + * 230 + * Return: 0 on success, -EPROTO for malformed messages. 231 + */ 232 + int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len) 233 + { 234 + struct xe_gt *gt = guc_to_gt(guc); 235 + 236 + if (unlikely(len != 1)) 237 + return -EPROTO; 238 + 239 + xe_tlb_inval_done_handler(&gt->tlb_inval, msg[0]); 240 + 241 + return 0; 242 + }
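
The range widening done by send_tlb_inval_ppgtt() in the hunk above is easier to follow outside the driver. The following is a minimal userspace sketch of that arithmetic, not the driver code. The names `struct inval_range`, `tlb_inval_range()` and the small helpers are hypothetical stand-ins for the kernel's roundup_pow_of_two(), ALIGN_DOWN(), ALIGN() and ilog2(). It assumes the caller has already rejected lengths above 2^63, as the MAX_RANGE_TLB_INVALIDATION_LENGTH check does in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_2M  (UINT64_C(1) << 21)
#define SZ_16M (UINT64_C(1) << 24)

/* Hypothetical result type: aligned base, power-of-two length, and the
 * log2 of the length in 4K pages (the last dword of the GuC action). */
struct inval_range {
	uint64_t start;
	uint64_t length;
	unsigned int order;
};

/* Userspace stand-in for roundup_pow_of_two(); v must be in 1..2^63. */
static inline uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0 && v <= (UINT64_C(1) << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/* Stand-ins for ALIGN_DOWN()/ALIGN(); a must be a power of two. */
static inline uint64_t align_down(uint64_t v, uint64_t a)
{
	return v & ~(a - 1);
}

static inline uint64_t align_up(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* Stand-in for ilog2(); v must be non-zero. */
static inline unsigned int ilog2_u64(uint64_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Widen [start, end) to a naturally aligned power-of-two block. */
struct inval_range tlb_inval_range(uint64_t start, uint64_t end)
{
	struct inval_range r;
	uint64_t orig_start = start;
	uint64_t length = end - start;
	uint64_t align;

	if (length < SZ_4K)
		length = SZ_4K;

	/* Grow the block until one aligned mask covers the whole range. */
	align = roundup_pow2(length);
	start = align_down(start, align);
	end = align_up(end, align);
	length = align;
	while (start + length < end) {
		length <<= 1;
		start = align_down(orig_start, length);
	}

	/* From 2M upwards the smallest block used is 16M. */
	if (length >= SZ_2M) {
		length = length > SZ_16M ? length : SZ_16M;
		start = align_down(orig_start, length);
	}

	assert(length >= SZ_4K);
	assert((length & (length - 1)) == 0);
	assert((start & (length - 1)) == 0);

	r.start = start;
	r.length = length;
	r.order = ilog2_u64(length) - ilog2_u64(SZ_4K);
	return r;
}
```

For example, a request for 0x1000..0x3000 is not aligned to its own 8K length, so the sketch widens it to a 16K block at address 0. A 2M request is promoted to a 16M block.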

+19

drivers/gpu/drm/xe/xe_guc_tlb_inval.h

··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_GUC_TLB_INVAL_H_ 7 + #define _XE_GUC_TLB_INVAL_H_ 8 + 9 + #include <linux/types.h> 10 + 11 + struct xe_guc; 12 + struct xe_tlb_inval; 13 + 14 + void xe_guc_tlb_inval_init_early(struct xe_guc *guc, 15 + struct xe_tlb_inval *tlb_inval); 16 + 17 + int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len); 18 + 19 + #endif

+6

drivers/gpu/drm/xe/xe_guc_types.h

··· 85 85 struct xarray exec_queue_lookup; 86 86 /** @submission_state.stopped: submissions are stopped */ 87 87 atomic_t stopped; 88 + /** 89 + * @submission_state.reset_blocked: reset attempts are blocked; 90 + * blocking reset in order to delay it may be required if running 91 + * an operation which is sensitive to resets. 92 + */ 93 + atomic_t reset_blocked; 88 94 /** @submission_state.lock: protects submission state */ 89 95 struct mutex lock; 90 96 /** @submission_state.enabled: submission is enabled */

+1 -1

drivers/gpu/drm/xe/xe_heci_gsc.c

··· 197 197 if (ret) 198 198 return ret; 199 199 200 - if (!def->use_polling && !xe_survivability_mode_is_enabled(xe)) { 200 + if (!def->use_polling && !xe_survivability_mode_is_boot_enabled(xe)) { 201 201 ret = heci_gsc_irq_setup(xe); 202 202 if (ret) 203 203 return ret;

+1 -1

drivers/gpu/drm/xe/xe_hw_engine.c

··· 576 576 u32 maxcnt_units_ns = 640; 577 577 bool inhibit_switch = 0; 578 578 579 - if (!IS_SRIOV_VF(gt_to_xe(hwe->gt)) && XE_WA(gt, 16023105232)) { 579 + if (!IS_SRIOV_VF(gt_to_xe(hwe->gt)) && XE_GT_WA(gt, 16023105232)) { 580 580 idledly = xe_mmio_read32(&gt->mmio, RING_IDLEDLY(hwe->mmio_base)); 581 581 maxcnt = xe_mmio_read32(&gt->mmio, RING_PWRCTX_MAXCNT(hwe->mmio_base)); 582 582

+2 -2

drivers/gpu/drm/xe/xe_hw_engine_group.c

··· 103 103 break; 104 104 case XE_ENGINE_CLASS_OTHER: 105 105 break; 106 - default: 107 - drm_warn(&xe->drm, "NOT POSSIBLE"); 106 + case XE_ENGINE_CLASS_MAX: 107 + xe_gt_assert(gt, false); 108 108 } 109 109 } 110 110

+182

drivers/gpu/drm/xe/xe_hw_error.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include <linux/fault-inject.h> 7 + 8 + #include "regs/xe_gsc_regs.h" 9 + #include "regs/xe_hw_error_regs.h" 10 + #include "regs/xe_irq_regs.h" 11 + 12 + #include "xe_device.h" 13 + #include "xe_hw_error.h" 14 + #include "xe_mmio.h" 15 + #include "xe_survivability_mode.h" 16 + 17 + #define HEC_UNCORR_FW_ERR_BITS 4 18 + extern struct fault_attr inject_csc_hw_error; 19 + 20 + /* Error categories reported by hardware */ 21 + enum hardware_error { 22 + HARDWARE_ERROR_CORRECTABLE = 0, 23 + HARDWARE_ERROR_NONFATAL = 1, 24 + HARDWARE_ERROR_FATAL = 2, 25 + HARDWARE_ERROR_MAX, 26 + }; 27 + 28 + static const char * const hec_uncorrected_fw_errors[] = { 29 + "Fatal", 30 + "CSE Disabled", 31 + "FD Corruption", 32 + "Data Corruption" 33 + }; 34 + 35 + static const char *hw_error_to_str(const enum hardware_error hw_err) 36 + { 37 + switch (hw_err) { 38 + case HARDWARE_ERROR_CORRECTABLE: 39 + return "CORRECTABLE"; 40 + case HARDWARE_ERROR_NONFATAL: 41 + return "NONFATAL"; 42 + case HARDWARE_ERROR_FATAL: 43 + return "FATAL"; 44 + default: 45 + return "UNKNOWN"; 46 + } 47 + } 48 + 49 + static bool fault_inject_csc_hw_error(void) 50 + { 51 + return IS_ENABLED(CONFIG_DEBUG_FS) && should_fail(&inject_csc_hw_error, 1); 52 + } 53 + 54 + static void csc_hw_error_work(struct work_struct *work) 55 + { 56 + struct xe_tile *tile = container_of(work, typeof(*tile), csc_hw_error_work); 57 + struct xe_device *xe = tile_to_xe(tile); 58 + int ret; 59 + 60 + ret = xe_survivability_mode_runtime_enable(xe); 61 + if (ret) 62 + drm_err(&xe->drm, "Failed to enable runtime survivability mode\n"); 63 + } 64 + 65 + static void csc_hw_error_handler(struct xe_tile *tile, const enum hardware_error hw_err) 66 + { 67 + const char *hw_err_str = hw_error_to_str(hw_err); 68 + struct xe_device *xe = tile_to_xe(tile); 69 + struct xe_mmio *mmio = &tile->mmio; 70 + u32 base, err_bit, err_src; 71 + unsigned long 
fw_err; 72 + 73 + if (xe->info.platform != XE_BATTLEMAGE) 74 + return; 75 + 76 + base = BMG_GSC_HECI1_BASE; 77 + lockdep_assert_held(&xe->irq.lock); 78 + err_src = xe_mmio_read32(mmio, HEC_UNCORR_ERR_STATUS(base)); 79 + if (!err_src) { 80 + drm_err_ratelimited(&xe->drm, HW_ERR "Tile%d reported HEC_ERR_STATUS_%s blank\n", 81 + tile->id, hw_err_str); 82 + return; 83 + } 84 + 85 + if (err_src & UNCORR_FW_REPORTED_ERR) { 86 + fw_err = xe_mmio_read32(mmio, HEC_UNCORR_FW_ERR_DW0(base)); 87 + for_each_set_bit(err_bit, &fw_err, HEC_UNCORR_FW_ERR_BITS) { 88 + drm_err_ratelimited(&xe->drm, HW_ERR 89 + "%s: HEC Uncorrected FW %s error reported, bit[%d] is set\n", 90 + hw_err_str, hec_uncorrected_fw_errors[err_bit], 91 + err_bit); 92 + 93 + schedule_work(&tile->csc_hw_error_work); 94 + } 95 + } 96 + 97 + xe_mmio_write32(mmio, HEC_UNCORR_ERR_STATUS(base), err_src); 98 + } 99 + 100 + static void hw_error_source_handler(struct xe_tile *tile, const enum hardware_error hw_err) 101 + { 102 + const char *hw_err_str = hw_error_to_str(hw_err); 103 + struct xe_device *xe = tile_to_xe(tile); 104 + unsigned long flags; 105 + u32 err_src; 106 + 107 + if (xe->info.platform != XE_BATTLEMAGE) 108 + return; 109 + 110 + spin_lock_irqsave(&xe->irq.lock, flags); 111 + err_src = xe_mmio_read32(&tile->mmio, DEV_ERR_STAT_REG(hw_err)); 112 + if (!err_src) { 113 + drm_err_ratelimited(&xe->drm, HW_ERR "Tile%d reported DEV_ERR_STAT_%s blank!\n", 114 + tile->id, hw_err_str); 115 + goto unlock; 116 + } 117 + 118 + if (err_src & XE_CSC_ERROR) 119 + csc_hw_error_handler(tile, hw_err); 120 + 121 + xe_mmio_write32(&tile->mmio, DEV_ERR_STAT_REG(hw_err), err_src); 122 + 123 + unlock: 124 + spin_unlock_irqrestore(&xe->irq.lock, flags); 125 + } 126 + 127 + /** 128 + * xe_hw_error_irq_handler - irq handling for hw errors 129 + * @tile: tile instance 130 + * @master_ctl: value read from master interrupt register 131 + * 132 + * Xe platforms add three error bits to the master interrupt register to support error 
handling. 133 + * These three bits are used to convey the class of error FATAL, NONFATAL, or CORRECTABLE. 134 + * To process the interrupt, determine the source of error by reading the Device Error Source 135 + * Register that corresponds to the class of error being serviced. 136 + */ 137 + void xe_hw_error_irq_handler(struct xe_tile *tile, const u32 master_ctl) 138 + { 139 + enum hardware_error hw_err; 140 + 141 + if (fault_inject_csc_hw_error()) 142 + schedule_work(&tile->csc_hw_error_work); 143 + 144 + for (hw_err = 0; hw_err < HARDWARE_ERROR_MAX; hw_err++) 145 + if (master_ctl & ERROR_IRQ(hw_err)) 146 + hw_error_source_handler(tile, hw_err); 147 + } 148 + 149 + /* 150 + * Process hardware errors during boot 151 + */ 152 + static void process_hw_errors(struct xe_device *xe) 153 + { 154 + struct xe_tile *tile; 155 + u32 master_ctl; 156 + u8 id; 157 + 158 + for_each_tile(tile, xe, id) { 159 + master_ctl = xe_mmio_read32(&tile->mmio, GFX_MSTR_IRQ); 160 + xe_hw_error_irq_handler(tile, master_ctl); 161 + xe_mmio_write32(&tile->mmio, GFX_MSTR_IRQ, master_ctl); 162 + } 163 + } 164 + 165 + /** 166 + * xe_hw_error_init - Initialize hw errors 167 + * @xe: xe device instance 168 + * 169 + * Initialize and check for errors that occurred during boot 170 + * prior to driver load 171 + */ 172 + void xe_hw_error_init(struct xe_device *xe) 173 + { 174 + struct xe_tile *tile = xe_device_get_root_tile(xe); 175 + 176 + if (!IS_DGFX(xe) || IS_SRIOV_VF(xe)) 177 + return; 178 + 179 + INIT_WORK(&tile->csc_hw_error_work, csc_hw_error_work); 180 + 181 + process_hw_errors(xe); 182 + }
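
csc_hw_error_handler() above walks the low HEC_UNCORR_FW_ERR_BITS bits of the firmware error dword and maps each set bit to a name from hec_uncorrected_fw_errors[]. The following is a rough userspace sketch of that decode step only. `decode_fw_errors()` is a hypothetical helper, a plain loop replaces for_each_set_bit(), and the register value is passed in rather than read over MMIO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEC_UNCORR_FW_ERR_BITS 4

/* Same strings as the table in the patch, indexed by bit position. */
static const char * const hec_uncorrected_fw_errors[HEC_UNCORR_FW_ERR_BITS] = {
	"Fatal",
	"CSE Disabled",
	"FD Corruption",
	"Data Corruption",
};

/*
 * Hypothetical helper: store the name of every set bit in the low
 * HEC_UNCORR_FW_ERR_BITS bits of fw_err into out[] and return how many
 * names were stored. Higher bits are ignored, mirroring the size limit
 * passed to for_each_set_bit() in the driver.
 */
size_t decode_fw_errors(uint32_t fw_err,
			const char *out[HEC_UNCORR_FW_ERR_BITS])
{
	size_t n = 0;
	unsigned int bit;

	assert(out != NULL);

	for (bit = 0; bit < HEC_UNCORR_FW_ERR_BITS; bit++) {
		if (fw_err & (UINT32_C(1) << bit))
			out[n++] = hec_uncorrected_fw_errors[bit];
	}

	return n;
}
```

Bounding the loop by the table size is what keeps the string lookup in range. The driver gets the same guarantee from the bit count it passes to for_each_set_bit().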

+15

drivers/gpu/drm/xe/xe_hw_error.h

··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + #ifndef XE_HW_ERROR_H_ 6 + #define XE_HW_ERROR_H_ 7 + 8 + #include <linux/types.h> 9 + 10 + struct xe_tile; 11 + struct xe_device; 12 + 13 + void xe_hw_error_irq_handler(struct xe_tile *tile, const u32 master_ctl); 14 + void xe_hw_error_init(struct xe_device *xe); 15 + #endif

+4 -4

drivers/gpu/drm/xe/xe_hwmon.c

··· 179 179 u32 clr, u32 set) 180 180 { 181 181 struct xe_tile *root_tile = xe_device_get_root_tile(hwmon->xe); 182 - u32 val0, val1; 182 + u32 val0 = 0, val1 = 0; 183 183 int ret = 0; 184 184 185 185 ret = xe_pcode_read(root_tile, PCODE_MBOX(PCODE_POWER_SETUP, ··· 734 734 long *value, u32 scale_factor) 735 735 { 736 736 int ret; 737 - u32 uval; 737 + u32 uval = 0; 738 738 739 739 mutex_lock(&hwmon->hwmon_lock); 740 740 ··· 918 918 static umode_t 919 919 xe_hwmon_curr_is_visible(const struct xe_hwmon *hwmon, u32 attr, int channel) 920 920 { 921 - u32 uval; 921 + u32 uval = 0; 922 922 923 923 /* hwmon sysfs attribute of current available only for package */ 924 924 if (channel != CHANNEL_PKG) ··· 1020 1020 static umode_t 1021 1021 xe_hwmon_fan_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel) 1022 1022 { 1023 - u32 uval; 1023 + u32 uval = 0; 1024 1024 1025 1025 if (!hwmon->xe->info.has_fan_control) 1026 1026 return 0;

+16 -2

drivers/gpu/drm/xe/xe_i2c.c

··· 147 147 } 148 148 149 149 /** 150 + * xe_i2c_present - I2C controller is present and functional 151 + * @xe: xe device instance 152 + * 153 + * Check whether the I2C controller is present and functioning with valid 154 + * endpoint cookie. 155 + * 156 + * Return: %true if present, %false otherwise. 157 + */ 158 + bool xe_i2c_present(struct xe_device *xe) 159 + { 160 + return xe->i2c && xe->i2c->ep.cookie == XE_I2C_EP_COOKIE_DEVICE; 161 + } 162 + 163 + /** 150 164 * xe_i2c_irq_handler: Handler for I2C interrupts 151 165 * @xe: xe device instance 152 166 * @master_ctl: interrupt register ··· 244 230 { 245 231 struct xe_mmio *mmio = xe_root_tile_mmio(xe); 246 232 247 - if (!xe->i2c || xe->i2c->ep.cookie != XE_I2C_EP_COOKIE_DEVICE) 233 + if (!xe_i2c_present(xe)) 248 234 return; 249 235 250 236 xe_mmio_rmw32(mmio, I2C_CONFIG_PMCSR, PCI_PM_CTRL_STATE_MASK, (__force u32)PCI_D3hot); ··· 255 241 { 256 242 struct xe_mmio *mmio = xe_root_tile_mmio(xe); 257 243 258 - if (!xe->i2c || xe->i2c->ep.cookie != XE_I2C_EP_COOKIE_DEVICE) 244 + if (!xe_i2c_present(xe)) 259 245 return; 260 246 261 247 if (d3cold)

+2

drivers/gpu/drm/xe/xe_i2c.h

··· 49 49 50 50 #if IS_ENABLED(CONFIG_I2C) 51 51 int xe_i2c_probe(struct xe_device *xe); 52 + bool xe_i2c_present(struct xe_device *xe); 52 53 void xe_i2c_irq_handler(struct xe_device *xe, u32 master_ctl); 53 54 void xe_i2c_pm_suspend(struct xe_device *xe); 54 55 void xe_i2c_pm_resume(struct xe_device *xe, bool d3cold); 55 56 #else 56 57 static inline int xe_i2c_probe(struct xe_device *xe) { return 0; } 58 + static inline bool xe_i2c_present(struct xe_device *xe) { return false; } 57 59 static inline void xe_i2c_irq_handler(struct xe_device *xe, u32 master_ctl) { } 58 60 static inline void xe_i2c_pm_suspend(struct xe_device *xe) { } 59 61 static inline void xe_i2c_pm_resume(struct xe_device *xe, bool d3cold) { }

+4

drivers/gpu/drm/xe/xe_irq.c

··· 18 18 #include "xe_gt.h" 19 19 #include "xe_guc.h" 20 20 #include "xe_hw_engine.h" 21 + #include "xe_hw_error.h" 21 22 #include "xe_i2c.h" 22 23 #include "xe_memirq.h" 23 24 #include "xe_mmio.h" ··· 469 468 xe_mmio_write32(mmio, GFX_MSTR_IRQ, master_ctl); 470 469 471 470 gt_irq_handler(tile, master_ctl, intr_dw, identity); 471 + xe_hw_error_irq_handler(tile, master_ctl); 472 472 473 473 /* 474 474 * Display interrupts (including display backlight operations ··· 757 755 unsigned int irq_flags = PCI_IRQ_MSI; 758 756 int nvec = 1; 759 757 int err; 758 + 759 + xe_hw_error_init(xe); 760 760 761 761 xe_irq_reset(xe); 762 762

+12 -9

drivers/gpu/drm/xe/xe_lmtt.c

··· 11 11 12 12 #include "xe_assert.h" 13 13 #include "xe_bo.h" 14 - #include "xe_gt_tlb_invalidation.h" 14 + #include "xe_tlb_inval.h" 15 15 #include "xe_lmtt.h" 16 16 #include "xe_map.h" 17 17 #include "xe_mmio.h" ··· 195 195 struct xe_tile *tile = lmtt_to_tile(lmtt); 196 196 struct xe_device *xe = tile_to_xe(tile); 197 197 dma_addr_t offset = xe_bo_main_addr(lmtt->pd->bo, XE_PAGE_SIZE); 198 + struct xe_gt *gt; 199 + u8 id; 198 200 199 201 lmtt_debug(lmtt, "DIR offset %pad\n", &offset); 200 202 lmtt_assert(lmtt, xe_bo_is_vram(lmtt->pd->bo)); 201 203 lmtt_assert(lmtt, IS_ALIGNED(offset, SZ_64K)); 202 204 203 - xe_mmio_write32(&tile->mmio, 204 - GRAPHICS_VER(xe) >= 20 ? XE2_LMEM_CFG : LMEM_CFG, 205 - LMEM_EN | REG_FIELD_PREP(LMTT_DIR_PTR, offset / SZ_64K)); 205 + for_each_gt_on_tile(gt, tile, id) 206 + xe_mmio_write32(&gt->mmio, 207 + GRAPHICS_VER(xe) >= 20 ? XE2_LMEM_CFG : LMEM_CFG, 208 + LMEM_EN | REG_FIELD_PREP(LMTT_DIR_PTR, offset / SZ_64K)); 206 209 } 207 210 208 211 /** ··· 228 225 229 226 static int lmtt_invalidate_hw(struct xe_lmtt *lmtt) 230 227 { 231 - struct xe_gt_tlb_invalidation_fence fences[XE_MAX_GT_PER_TILE]; 232 - struct xe_gt_tlb_invalidation_fence *fence = fences; 228 + struct xe_tlb_inval_fence fences[XE_MAX_GT_PER_TILE]; 229 + struct xe_tlb_inval_fence *fence = fences; 233 230 struct xe_tile *tile = lmtt_to_tile(lmtt); 234 231 struct xe_gt *gt; 235 232 int result = 0; ··· 237 234 u8 id; 238 235 239 236 for_each_gt_on_tile(gt, tile, id) { 240 - xe_gt_tlb_invalidation_fence_init(gt, fence, true); 241 - err = xe_gt_tlb_invalidation_all(gt, fence); 237 + xe_tlb_inval_fence_init(&gt->tlb_inval, fence, true); 238 + err = xe_tlb_inval_all(&gt->tlb_inval, fence); 242 239 result = result ?: err; 243 240 fence++; 244 241 } ··· 252 249 */ 253 250 fence = fences; 254 251 for_each_gt_on_tile(gt, tile, id) 255 - xe_gt_tlb_invalidation_fence_wait(fence++); 252 + xe_tlb_inval_fence_wait(fence++); 256 253 257 254 return result; 258 255 }
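
lmtt_invalidate_hw() above issues an invalidation on every GT first, keeps only the first error (`result = result ?: err`), and waits on all fences afterwards. The following is a simplified sketch of that two-phase shape in standard C. `struct fake_fence`, `issue_fn`, `demo_issue()` and `invalidate_all()` are hypothetical names, and the fences here are plain flags rather than real xe_tlb_inval_fence objects.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UNITS 3

/* Hypothetical stand-in for a per-GT invalidation fence. */
struct fake_fence {
	int issued;	/* an invalidation was requested for this unit */
	int waited;	/* the fence was waited on */
};

/* Hypothetical issue callback: returns 0 or a negative error code. */
typedef int (*issue_fn)(size_t unit, struct fake_fence *fence);

/* Demo callback: unit 0 succeeds, unit 1 fails with -5, others with -7. */
int demo_issue(size_t unit, struct fake_fence *fence)
{
	fence->issued = 1;
	if (unit == 0)
		return 0;
	return unit == 1 ? -5 : -7;
}

int invalidate_all(size_t count, issue_fn issue, struct fake_fence *fences)
{
	int result = 0;
	size_t i;

	assert(count <= MAX_UNITS);

	/* Phase 1: issue on every unit; only the first error is kept. */
	for (i = 0; i < count; i++) {
		int err = issue(i, &fences[i]);

		if (!result)
			result = err;	/* standard C spelling of "result ?: err" */
	}

	/* Phase 2: wait on every fence, including after an error. */
	for (i = 0; i < count; i++)
		fences[i].waited = 1;

	return result;
}
```

Issuing everything before waiting lets the units work in parallel. Keeping only the first error gives the caller the earliest failure instead of whichever one happened last.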

+164 -10

drivers/gpu/drm/xe/xe_lrc.c

··· 41 41 #define LRC_PPHWSP_SIZE SZ_4K 42 42 #define LRC_INDIRECT_CTX_BO_SIZE SZ_4K 43 43 #define LRC_INDIRECT_RING_STATE_SIZE SZ_4K 44 - #define LRC_WA_BB_SIZE SZ_4K 45 44 46 45 /* 47 46 * Layout of the LRC and associated data allocated as ··· 75 76 static bool 76 77 gt_engine_needs_indirect_ctx(struct xe_gt *gt, enum xe_engine_class class) 77 78 { 79 + if (XE_GT_WA(gt, 16010904313) && 80 + (class == XE_ENGINE_CLASS_RENDER || 81 + class == XE_ENGINE_CLASS_COMPUTE)) 82 + return true; 83 + 78 84 return false; 79 85 } 80 86 ··· 696 692 return xe_lrc_pphwsp_offset(lrc) + LRC_PPHWSP_SIZE; 697 693 } 698 694 699 - static size_t lrc_reg_size(struct xe_device *xe) 695 + /** 696 + * xe_lrc_reg_size() - Get size of the LRC registers area within queues 697 + * @xe: the &xe_device struct instance 698 + * 699 + * Returns: Size of the LRC registers area for current platform 700 + */ 701 + size_t xe_lrc_reg_size(struct xe_device *xe) 700 702 { 701 703 if (GRAPHICS_VERx100(xe) >= 1250) 702 704 return 96 * sizeof(u32); ··· 712 702 713 703 size_t xe_lrc_skip_size(struct xe_device *xe) 714 704 { 715 - return LRC_PPHWSP_SIZE + lrc_reg_size(xe); 705 + return LRC_PPHWSP_SIZE + xe_lrc_reg_size(xe); 716 706 } 717 707 718 708 static inline u32 __xe_lrc_seqno_offset(struct xe_lrc *lrc) ··· 953 943 return data; 954 944 } 955 945 946 + /** 947 + * xe_default_lrc_update_memirq_regs_with_address - Re-compute GGTT references in default LRC 948 + * of given engine. 949 + * @hwe: the &xe_hw_engine struct instance 950 + */ 951 + void xe_default_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe) 952 + { 953 + struct xe_gt *gt = hwe->gt; 954 + u32 *regs; 955 + 956 + if (!gt->default_lrc[hwe->class]) 957 + return; 958 + 959 + regs = gt->default_lrc[hwe->class] + LRC_PPHWSP_SIZE; 960 + set_memory_based_intr(regs, hwe); 961 + } 962 + 963 + /** 964 + * xe_lrc_update_memirq_regs_with_address - Re-compute GGTT references in mem interrupt data 965 + * for given LRC. 
966 + * @lrc: the &xe_lrc struct instance 967 + * @hwe: the &xe_hw_engine struct instance 968 + * @regs: scratch buffer to be used as temporary storage 969 + */ 970 + void xe_lrc_update_memirq_regs_with_address(struct xe_lrc *lrc, struct xe_hw_engine *hwe, 971 + u32 *regs) 972 + { 973 + struct xe_gt *gt = hwe->gt; 974 + struct iosys_map map; 975 + size_t regs_len; 976 + 977 + if (!xe_device_uses_memirq(gt_to_xe(gt))) 978 + return; 979 + 980 + map = __xe_lrc_regs_map(lrc); 981 + regs_len = xe_lrc_reg_size(gt_to_xe(gt)); 982 + xe_map_memcpy_from(gt_to_xe(gt), regs, &map, 0, regs_len); 983 + set_memory_based_intr(regs, hwe); 984 + xe_map_memcpy_to(gt_to_xe(gt), &map, 0, regs, regs_len); 985 + } 986 + 956 987 static void xe_lrc_set_ppgtt(struct xe_lrc *lrc, struct xe_vm *vm) 957 988 { 958 989 u64 desc = xe_vm_pdp4_descriptor(vm, gt_to_tile(lrc->gt)); ··· 1065 1014 return cmd - batch; 1066 1015 } 1067 1016 1017 + static ssize_t setup_timestamp_wa(struct xe_lrc *lrc, struct xe_hw_engine *hwe, 1018 + u32 *batch, size_t max_len) 1019 + { 1020 + const u32 ts_addr = __xe_lrc_ctx_timestamp_ggtt_addr(lrc); 1021 + u32 *cmd = batch; 1022 + 1023 + if (!XE_GT_WA(lrc->gt, 16010904313) || 1024 + !(hwe->class == XE_ENGINE_CLASS_RENDER || 1025 + hwe->class == XE_ENGINE_CLASS_COMPUTE || 1026 + hwe->class == XE_ENGINE_CLASS_COPY || 1027 + hwe->class == XE_ENGINE_CLASS_VIDEO_DECODE || 1028 + hwe->class == XE_ENGINE_CLASS_VIDEO_ENHANCE)) 1029 + return 0; 1030 + 1031 + if (xe_gt_WARN_ON(lrc->gt, max_len < 12)) 1032 + return -ENOSPC; 1033 + 1034 + *cmd++ = MI_LOAD_REGISTER_MEM | MI_LRM_USE_GGTT | MI_LRI_LRM_CS_MMIO | 1035 + MI_LRM_ASYNC; 1036 + *cmd++ = RING_CTX_TIMESTAMP(0).addr; 1037 + *cmd++ = ts_addr; 1038 + *cmd++ = 0; 1039 + 1040 + *cmd++ = MI_LOAD_REGISTER_MEM | MI_LRM_USE_GGTT | MI_LRI_LRM_CS_MMIO | 1041 + MI_LRM_ASYNC; 1042 + *cmd++ = RING_CTX_TIMESTAMP(0).addr; 1043 + *cmd++ = ts_addr; 1044 + *cmd++ = 0; 1045 + 1046 + *cmd++ = MI_LOAD_REGISTER_MEM | MI_LRM_USE_GGTT | 
MI_LRI_LRM_CS_MMIO; 1047 + *cmd++ = RING_CTX_TIMESTAMP(0).addr; 1048 + *cmd++ = ts_addr; 1049 + *cmd++ = 0; 1050 + 1051 + return cmd - batch; 1052 + } 1053 + 1054 + static ssize_t setup_invalidate_state_cache_wa(struct xe_lrc *lrc, 1055 + struct xe_hw_engine *hwe, 1056 + u32 *batch, size_t max_len) 1057 + { 1058 + u32 *cmd = batch; 1059 + 1060 + if (!XE_GT_WA(lrc->gt, 18022495364) || 1061 + hwe->class != XE_ENGINE_CLASS_RENDER) 1062 + return 0; 1063 + 1064 + if (xe_gt_WARN_ON(lrc->gt, max_len < 3)) 1065 + return -ENOSPC; 1066 + 1067 + *cmd++ = MI_LOAD_REGISTER_IMM | MI_LRI_NUM_REGS(1); 1068 + *cmd++ = CS_DEBUG_MODE1(0).addr; 1069 + *cmd++ = _MASKED_BIT_ENABLE(INSTRUCTION_STATE_CACHE_INVALIDATE); 1070 + 1071 + return cmd - batch; 1072 + } 1073 + 1068 1074 struct bo_setup { 1069 1075 ssize_t (*setup)(struct xe_lrc *lrc, struct xe_hw_engine *hwe, 1070 1076 u32 *batch, size_t max_size); ··· 1148 1040 ssize_t remain; 1149 1041 1150 1042 if (state->lrc->bo->vmap.is_iomem) { 1151 - state->buffer = kmalloc(state->max_size, GFP_KERNEL); 1152 1043 if (!state->buffer) 1153 1044 return -ENOMEM; 1154 1045 state->ptr = state->buffer; 1155 1046 } else { 1156 1047 state->ptr = state->lrc->bo->vmap.vaddr + state->offset; 1157 - state->buffer = NULL; 1158 1048 } 1159 1049 1160 1050 remain = state->max_size / sizeof(u32); ··· 1177 1071 return 0; 1178 1072 1179 1073 fail: 1180 - kfree(state->buffer); 1181 1074 return -ENOSPC; 1182 1075 } 1183 1076 ··· 1188 1083 xe_map_memcpy_to(gt_to_xe(state->lrc->gt), &state->lrc->bo->vmap, 1189 1084 state->offset, state->buffer, 1190 1085 state->written * sizeof(u32)); 1191 - kfree(state->buffer); 1192 1086 } 1193 1087 1194 - static int setup_wa_bb(struct xe_lrc *lrc, struct xe_hw_engine *hwe) 1088 + /** 1089 + * xe_lrc_setup_wa_bb_with_scratch - Execute all wa bb setup callbacks. 
1090 + * @lrc: the &xe_lrc struct instance 1091 + * @hwe: the &xe_hw_engine struct instance 1092 + * @scratch: preallocated scratch buffer for temporary storage 1093 + * Return: 0 on success, negative error code on failure 1094 + */ 1095 + int xe_lrc_setup_wa_bb_with_scratch(struct xe_lrc *lrc, struct xe_hw_engine *hwe, u32 *scratch) 1195 1096 { 1196 1097 static const struct bo_setup funcs[] = { 1098 + { .setup = setup_timestamp_wa }, 1099 + { .setup = setup_invalidate_state_cache_wa }, 1197 1100 { .setup = setup_utilization_wa }, 1198 1101 }; 1199 1102 struct bo_setup_state state = { 1200 1103 .lrc = lrc, 1201 1104 .hwe = hwe, 1202 1105 .max_size = LRC_WA_BB_SIZE, 1106 + .buffer = scratch, 1203 1107 .reserve_dw = 1, 1204 1108 .offset = __xe_lrc_wa_bb_offset(lrc), 1205 1109 .funcs = funcs, ··· 1231 1117 return 0; 1232 1118 } 1233 1119 1120 + static int setup_wa_bb(struct xe_lrc *lrc, struct xe_hw_engine *hwe) 1121 + { 1122 + u32 *buf = NULL; 1123 + int ret; 1124 + 1125 + if (lrc->bo->vmap.is_iomem) 1126 + buf = kmalloc(LRC_WA_BB_SIZE, GFP_KERNEL); 1127 + 1128 + ret = xe_lrc_setup_wa_bb_with_scratch(lrc, hwe, buf); 1129 + 1130 + kfree(buf); 1131 + 1132 + return ret; 1133 + } 1134 + 1234 1135 static int 1235 1136 setup_indirect_ctx(struct xe_lrc *lrc, struct xe_hw_engine *hwe) 1236 1137 { 1237 1138 static struct bo_setup rcs_funcs[] = { 1139 + { .setup = setup_timestamp_wa }, 1238 1140 }; 1239 1141 struct bo_setup_state state = { 1240 1142 .lrc = lrc, 1241 1143 .hwe = hwe, 1242 1144 .max_size = (63 * 64) /* max 63 cachelines */, 1145 + .buffer = NULL, 1243 1146 .offset = __xe_lrc_indirect_ctx_offset(lrc), 1244 1147 }; 1245 1148 int ret; ··· 1273 1142 if (xe_gt_WARN_ON(lrc->gt, !state.funcs)) 1274 1143 return 0; 1275 1144 1145 + if (lrc->bo->vmap.is_iomem) 1146 + state.buffer = kmalloc(state.max_size, GFP_KERNEL); 1147 + 1276 1148 ret = setup_bo(&state); 1277 - if (ret) 1149 + if (ret) { 1150 + kfree(state.buffer); 1278 1151 return ret; 1152 + } 1279 1153 1280 1154 /* 
1281 1155 * Align to 64B cacheline so there's no garbage at the end for CS to ··· 1292 1156 } 1293 1157 1294 1158 finish_bo(&state); 1159 + kfree(state.buffer); 1295 1160 1296 1161 xe_lrc_write_ctx_reg(lrc, 1297 1162 CTX_CS_INDIRECT_CTX, ··· 1509 1372 1510 1373 xe_lrc_finish(lrc); 1511 1374 kfree(lrc); 1375 + } 1376 + 1377 + /** 1378 + * xe_lrc_update_hwctx_regs_with_address - Re-compute GGTT references within given LRC. 1379 + * @lrc: the &xe_lrc struct instance 1380 + */ 1381 + void xe_lrc_update_hwctx_regs_with_address(struct xe_lrc *lrc) 1382 + { 1383 + if (xe_lrc_has_indirect_ring_state(lrc)) { 1384 + xe_lrc_write_ctx_reg(lrc, CTX_INDIRECT_RING_STATE, 1385 + __xe_lrc_indirect_ring_ggtt_addr(lrc)); 1386 + 1387 + xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START, 1388 + __xe_lrc_ring_ggtt_addr(lrc)); 1389 + } else { 1390 + xe_lrc_write_ctx_reg(lrc, CTX_RING_START, __xe_lrc_ring_ggtt_addr(lrc)); 1391 + } 1512 1392 } 1513 1393 1514 1394 void xe_lrc_set_ring_tail(struct xe_lrc *lrc, u32 tail) ··· 2093 1939 * continue to emit all of the SVG state since it's best not to leak 2094 1940 * any of the state between contexts, even if that leakage is harmless. 2095 1941 */ 2096 - if (XE_WA(gt, 14019789679) && q->hwe->class == XE_ENGINE_CLASS_RENDER) { 1942 + if (XE_GT_WA(gt, 14019789679) && q->hwe->class == XE_ENGINE_CLASS_RENDER) { 2097 1943 state_table = xe_hpg_svg_state; 2098 1944 state_table_size = ARRAY_SIZE(xe_hpg_svg_state); 2099 1945 }
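
The workaround emitters added in this file (setup_timestamp_wa(), setup_invalidate_state_cache_wa()) share one contract. Each writes dwords into a batch and returns the count written, 0 when it does not apply, or a negative error when space runs out. The loop in setup_bo() that drives them is elided from the hunk. The sketch below is therefore an assumption about its general shape, not a copy of it. `emit_fn`, `emit_three()`, `emit_nothing()`, `run_emitters()` and `ERR_NOSPC` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_NOSPC (-28)	/* stands in for -ENOSPC */

/* Emitter contract: return dwords written, 0 if not applicable, or < 0. */
typedef ptrdiff_t (*emit_fn)(uint32_t *batch, size_t max_len);

/* Hypothetical emitter that needs exactly three dwords. */
ptrdiff_t emit_three(uint32_t *batch, size_t max_len)
{
	uint32_t *cmd = batch;

	if (max_len < 3)
		return ERR_NOSPC;

	*cmd++ = 0x11;
	*cmd++ = 0x22;
	*cmd++ = 0x33;

	return cmd - batch;
}

/* Hypothetical emitter whose workaround does not apply. */
ptrdiff_t emit_nothing(uint32_t *batch, size_t max_len)
{
	(void)batch;
	(void)max_len;
	return 0;
}

/*
 * Run every emitter in order into a caller-owned buffer of max_dw dwords,
 * keeping at least reserve_dw dwords free at the end. Returns the total
 * number of dwords written or a negative error.
 */
ptrdiff_t run_emitters(const emit_fn *funcs, size_t num_funcs,
		       uint32_t *buf, size_t max_dw, size_t reserve_dw)
{
	uint32_t *ptr = buf;
	size_t remain = max_dw;
	size_t i;

	assert(buf != NULL && max_dw >= reserve_dw);

	for (i = 0; i < num_funcs; i++) {
		ptrdiff_t len = funcs[i](ptr, remain);

		if (len < 0)
			return len;
		if ((size_t)len > remain || remain - (size_t)len < reserve_dw)
			return ERR_NOSPC;

		remain -= (size_t)len;
		ptr += len;
	}

	return ptr - buf;
}
```

Taking the buffer from the caller, as xe_lrc_setup_wa_bb_with_scratch() does with its `scratch` argument, keeps allocation out of the emit path. That is presumably the reason the patch moves kmalloc()/kfree() out of setup_bo() and finish_bo().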

+9

drivers/gpu/drm/xe/xe_lrc.h

··· 42 42 #define LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR (0x34 * 4) 43 43 #define LRC_PPHWSP_PXP_INVAL_SCRATCH_ADDR (0x40 * 4) 44 44 45 + #define LRC_WA_BB_SIZE SZ_4K 46 + 45 47 #define XE_LRC_CREATE_RUNALONE 0x1 46 48 #define XE_LRC_CREATE_PXP 0x2 47 49 struct xe_lrc *xe_lrc_create(struct xe_hw_engine *hwe, struct xe_vm *vm, ··· 90 88 u32 xe_lrc_indirect_ring_ggtt_addr(struct xe_lrc *lrc); 91 89 u32 xe_lrc_ggtt_addr(struct xe_lrc *lrc); 92 90 u32 *xe_lrc_regs(struct xe_lrc *lrc); 91 + void xe_lrc_update_hwctx_regs_with_address(struct xe_lrc *lrc); 92 + void xe_default_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe); 93 + void xe_lrc_update_memirq_regs_with_address(struct xe_lrc *lrc, struct xe_hw_engine *hwe, 94 + u32 *regs); 93 95 94 96 u32 xe_lrc_read_ctx_reg(struct xe_lrc *lrc, int reg_nr); 95 97 void xe_lrc_write_ctx_reg(struct xe_lrc *lrc, int reg_nr, u32 val); ··· 112 106 u32 xe_lrc_parallel_ggtt_addr(struct xe_lrc *lrc); 113 107 struct iosys_map xe_lrc_parallel_map(struct xe_lrc *lrc); 114 108 109 + size_t xe_lrc_reg_size(struct xe_device *xe); 115 110 size_t xe_lrc_skip_size(struct xe_device *xe); 116 111 117 112 void xe_lrc_dump_default(struct drm_printer *p, ··· 131 124 u64 xe_lrc_ctx_timestamp(struct xe_lrc *lrc); 132 125 u32 xe_lrc_ctx_job_timestamp_ggtt_addr(struct xe_lrc *lrc); 133 126 u32 xe_lrc_ctx_job_timestamp(struct xe_lrc *lrc); 127 + int xe_lrc_setup_wa_bb_with_scratch(struct xe_lrc *lrc, struct xe_hw_engine *hwe, 128 + u32 *scratch); 134 129 135 130 /** 136 131 * xe_lrc_update_timestamp - readout LRC timestamp and update cached value

+331 -93

drivers/gpu/drm/xe/xe_migrate.c

··· 9 9 #include <linux/sizes.h> 10 10 11 11 #include <drm/drm_managed.h> 12 + #include <drm/drm_pagemap.h> 12 13 #include <drm/ttm/ttm_tt.h> 13 14 #include <uapi/drm/xe_drm.h> 14 15 ··· 31 30 #include "xe_mocs.h" 32 31 #include "xe_pt.h" 33 32 #include "xe_res_cursor.h" 33 + #include "xe_sa.h" 34 34 #include "xe_sched_job.h" 35 35 #include "xe_sync.h" 36 36 #include "xe_trace_bo.h" 37 37 #include "xe_vm.h" 38 + #include "xe_vram.h" 38 39 39 40 /** 40 41 * struct xe_migrate - migrate context. ··· 87 84 */ 88 85 #define MAX_PTE_PER_SDI 0x1FEU 89 86 90 - /** 91 - * xe_tile_migrate_exec_queue() - Get this tile's migrate exec queue. 92 - * @tile: The tile. 93 - * 94 - * Returns the default migrate exec queue of this tile. 95 - * 96 - * Return: The default migrate exec queue 97 - */ 98 - struct xe_exec_queue *xe_tile_migrate_exec_queue(struct xe_tile *tile) 99 - { 100 - return tile->migrate->q; 101 - } 102 - 103 87 static void xe_migrate_fini(void *arg) 104 88 { 105 89 struct xe_migrate *m = arg; ··· 120 130 u64 identity_offset = IDENTITY_OFFSET; 121 131 122 132 if (GRAPHICS_VER(xe) >= 20 && is_comp_pte) 123 - identity_offset += DIV_ROUND_UP_ULL(xe->mem.vram.actual_physical_size, SZ_1G); 133 + identity_offset += DIV_ROUND_UP_ULL(xe_vram_region_actual_physical_size 134 + (xe->mem.vram), SZ_1G); 124 135 125 - addr -= xe->mem.vram.dpa_base; 136 + addr -= xe_vram_region_dpa_base(xe->mem.vram); 126 137 return addr + (identity_offset << xe_pt_shift(2)); 127 138 } 128 139 129 140 static void xe_migrate_program_identity(struct xe_device *xe, struct xe_vm *vm, struct xe_bo *bo, 130 141 u64 map_ofs, u64 vram_offset, u16 pat_index, u64 pt_2m_ofs) 131 142 { 143 + struct xe_vram_region *vram = xe->mem.vram; 144 + resource_size_t dpa_base = xe_vram_region_dpa_base(vram); 132 145 u64 pos, ofs, flags; 133 146 u64 entry; 134 147 /* XXX: Unclear if this should be usable_size? 
*/ 135 - u64 vram_limit = xe->mem.vram.actual_physical_size + 136 - xe->mem.vram.dpa_base; 148 + u64 vram_limit = xe_vram_region_actual_physical_size(vram) + dpa_base; 137 149 u32 level = 2; 138 150 139 151 ofs = map_ofs + XE_PAGE_SIZE * level + vram_offset * 8; 140 152 flags = vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level, 141 153 true, 0); 142 154 143 - xe_assert(xe, IS_ALIGNED(xe->mem.vram.usable_size, SZ_2M)); 155 + xe_assert(xe, IS_ALIGNED(xe_vram_region_usable_size(vram), SZ_2M)); 144 156 145 157 /* 146 158 * Use 1GB pages when possible, last chunk always use 2M 147 159 * pages as mixing reserved memory (stolen, WOCPM) with a single 148 160 * mapping is not allowed on certain platforms. 149 161 */ 150 - for (pos = xe->mem.vram.dpa_base; pos < vram_limit; 162 + for (pos = dpa_base; pos < vram_limit; 151 163 pos += SZ_1G, ofs += 8) { 152 164 if (pos + SZ_1G >= vram_limit) { 153 - entry = vm->pt_ops->pde_encode_bo(bo, pt_2m_ofs, 154 - pat_index); 165 + entry = vm->pt_ops->pde_encode_bo(bo, pt_2m_ofs); 155 166 xe_map_wr(xe, &bo->vmap, ofs, u64, entry); 156 167 157 168 flags = vm->pt_ops->pte_encode_addr(xe, 0, ··· 206 215 207 216 /* PT30 & PT31 reserved for 2M identity map */ 208 217 pt29_ofs = xe_bo_size(bo) - 3 * XE_PAGE_SIZE; 209 - entry = vm->pt_ops->pde_encode_bo(bo, pt29_ofs, pat_index); 218 + entry = vm->pt_ops->pde_encode_bo(bo, pt29_ofs); 210 219 xe_pt_write(xe, &vm->pt_root[id]->bo->vmap, 0, entry); 211 220 212 221 map_ofs = (num_entries - num_setup) * XE_PAGE_SIZE; ··· 274 283 flags = XE_PDE_64K; 275 284 276 285 entry = vm->pt_ops->pde_encode_bo(bo, map_ofs + (u64)(level - 1) * 277 - XE_PAGE_SIZE, pat_index); 286 + XE_PAGE_SIZE); 278 287 xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE * level, u64, 279 288 entry | flags); 280 289 } 281 290 282 291 /* Write PDE's that point to our BO. 
*/ 283 - for (i = 0; i < map_ofs / PAGE_SIZE; i++) { 284 - entry = vm->pt_ops->pde_encode_bo(bo, (u64)i * XE_PAGE_SIZE, 285 - pat_index); 292 + for (i = 0; i < map_ofs / XE_PAGE_SIZE; i++) { 293 + entry = vm->pt_ops->pde_encode_bo(bo, (u64)i * XE_PAGE_SIZE); 286 294 287 295 xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE + 288 296 (i + 1) * 8, u64, entry); ··· 297 307 /* Identity map the entire vram at 256GiB offset */ 298 308 if (IS_DGFX(xe)) { 299 309 u64 pt30_ofs = xe_bo_size(bo) - 2 * XE_PAGE_SIZE; 310 + resource_size_t actual_phy_size = xe_vram_region_actual_physical_size(xe->mem.vram); 300 311 301 312 xe_migrate_program_identity(xe, vm, bo, map_ofs, IDENTITY_OFFSET, 302 313 pat_index, pt30_ofs); 303 - xe_assert(xe, xe->mem.vram.actual_physical_size <= 304 - (MAX_NUM_PTE - IDENTITY_OFFSET) * SZ_1G); 314 + xe_assert(xe, actual_phy_size <= (MAX_NUM_PTE - IDENTITY_OFFSET) * SZ_1G); 305 315 306 316 /* 307 317 * Identity map the entire vram for compressed pat_index for xe2+ ··· 310 320 if (GRAPHICS_VER(xe) >= 20 && xe_device_has_flat_ccs(xe)) { 311 321 u16 comp_pat_index = xe->pat.idx[XE_CACHE_NONE_COMPRESSION]; 312 322 u64 vram_offset = IDENTITY_OFFSET + 313 - DIV_ROUND_UP_ULL(xe->mem.vram.actual_physical_size, SZ_1G); 323 + DIV_ROUND_UP_ULL(actual_phy_size, SZ_1G); 314 324 u64 pt31_ofs = xe_bo_size(bo) - XE_PAGE_SIZE; 315 325 316 - xe_assert(xe, xe->mem.vram.actual_physical_size <= (MAX_NUM_PTE - 317 - IDENTITY_OFFSET - IDENTITY_OFFSET / 2) * SZ_1G); 326 + xe_assert(xe, actual_phy_size <= (MAX_NUM_PTE - IDENTITY_OFFSET - 327 + IDENTITY_OFFSET / 2) * SZ_1G); 318 328 xe_migrate_program_identity(xe, vm, bo, map_ofs, vram_offset, 319 329 comp_pat_index, pt31_ofs); 320 330 } ··· 377 387 } 378 388 379 389 /** 380 - * xe_migrate_init() - Initialize a migrate context 381 - * @tile: Back-pointer to the tile we're initializing for. 
390 + * xe_migrate_alloc - Allocate a migrate struct for a given &xe_tile 391 + * @tile: &xe_tile 382 392 * 383 - * Return: Pointer to a migrate context on success. Error pointer on error. 393 + * Allocates a &xe_migrate for a given tile. 394 + * 395 + * Return: &xe_migrate on success, or NULL when out of memory. 384 396 */ 385 - struct xe_migrate *xe_migrate_init(struct xe_tile *tile) 397 + struct xe_migrate *xe_migrate_alloc(struct xe_tile *tile) 386 398 { 387 - struct xe_device *xe = tile_to_xe(tile); 399 + struct xe_migrate *m = drmm_kzalloc(&tile_to_xe(tile)->drm, sizeof(*m), GFP_KERNEL); 400 + 401 + if (m) 402 + m->tile = tile; 403 + return m; 404 + } 405 + 406 + /** 407 + * xe_migrate_init() - Initialize a migrate context 408 + * @m: The migration context 409 + * 410 + * Return: 0 if successful, negative error code on failure 411 + */ 412 + int xe_migrate_init(struct xe_migrate *m) 413 + { 414 + struct xe_tile *tile = m->tile; 388 415 struct xe_gt *primary_gt = tile->primary_gt; 389 - struct xe_migrate *m; 416 + struct xe_device *xe = tile_to_xe(tile); 390 417 struct xe_vm *vm; 391 418 int err; 392 419 393 - m = devm_kzalloc(xe->drm.dev, sizeof(*m), GFP_KERNEL); 394 - if (!m) 395 - return ERR_PTR(-ENOMEM); 396 - 397 - m->tile = tile; 398 - 399 420 /* Special layout, prepared below.. 
*/ 400 421 vm = xe_vm_create(xe, XE_VM_FLAG_MIGRATION | 401 - XE_VM_FLAG_SET_TILE_ID(tile)); 422 + XE_VM_FLAG_SET_TILE_ID(tile), NULL); 402 423 if (IS_ERR(vm)) 403 - return ERR_CAST(vm); 424 + return PTR_ERR(vm); 404 425 405 426 xe_vm_lock(vm, false); 406 427 err = xe_migrate_prepare_vm(tile, m, vm); 407 428 xe_vm_unlock(vm); 408 - if (err) { 409 - xe_vm_close_and_put(vm); 410 - return ERR_PTR(err); 411 - } 429 + if (err) 430 + goto err_out; 412 431 413 432 if (xe->info.has_usm) { 414 433 struct xe_hw_engine *hwe = xe_gt_hw_engine(primary_gt, ··· 426 427 false); 427 428 u32 logical_mask = xe_migrate_usm_logical_mask(primary_gt); 428 429 429 - if (!hwe || !logical_mask) 430 - return ERR_PTR(-EINVAL); 430 + if (!hwe || !logical_mask) { 431 + err = -EINVAL; 432 + goto err_out; 433 + } 431 434 432 435 /* 433 436 * XXX: Currently only reserving 1 (likely slow) BCS instance on ··· 438 437 m->q = xe_exec_queue_create(xe, vm, logical_mask, 1, hwe, 439 438 EXEC_QUEUE_FLAG_KERNEL | 440 439 EXEC_QUEUE_FLAG_PERMANENT | 441 - EXEC_QUEUE_FLAG_HIGH_PRIORITY, 0); 440 + EXEC_QUEUE_FLAG_HIGH_PRIORITY | 441 + EXEC_QUEUE_FLAG_MIGRATE, 0); 442 442 } else { 443 443 m->q = xe_exec_queue_create_class(xe, primary_gt, vm, 444 444 XE_ENGINE_CLASS_COPY, 445 445 EXEC_QUEUE_FLAG_KERNEL | 446 - EXEC_QUEUE_FLAG_PERMANENT, 0); 446 + EXEC_QUEUE_FLAG_PERMANENT | 447 + EXEC_QUEUE_FLAG_MIGRATE, 0); 447 448 } 448 449 if (IS_ERR(m->q)) { 449 - xe_vm_close_and_put(vm); 450 - return ERR_CAST(m->q); 450 + err = PTR_ERR(m->q); 451 + goto err_out; 451 452 } 452 453 453 454 mutex_init(&m->job_mutex); ··· 459 456 460 457 err = devm_add_action_or_reset(xe->drm.dev, xe_migrate_fini, m); 461 458 if (err) 462 - return ERR_PTR(err); 459 + return err; 463 460 464 461 if (IS_DGFX(xe)) { 465 462 if (xe_migrate_needs_ccs_emit(xe)) ··· 474 471 (unsigned long long)m->min_chunk_size); 475 472 } 476 473 477 - return m; 474 + return err; 475 + 476 + err_out: 477 + xe_vm_close_and_put(vm); 478 + return err; 479 + 478 480 } 
479 481 480 482 static u64 max_mem_transfer_per_pass(struct xe_device *xe) ··· 904 896 goto err; 905 897 } 906 898 907 - xe_sched_job_add_migrate_flush(job, flush_flags); 899 + xe_sched_job_add_migrate_flush(job, flush_flags | MI_INVALIDATE_TLB); 908 900 if (!fence) { 909 901 err = xe_sched_job_add_deps(job, src_bo->ttm.base.resv, 910 902 DMA_RESV_USAGE_BOOKKEEP); ··· 946 938 } 947 939 948 940 return fence; 941 + } 942 + 943 + /** 944 + * xe_migrate_lrc() - Get the LRC from migrate context. 945 + * @migrate: Migrate context. 946 + * 947 + * Return: Pointer to LRC on success, error on failure 948 + */ 949 + struct xe_lrc *xe_migrate_lrc(struct xe_migrate *migrate) 950 + { 951 + return migrate->q->lrc[0]; 952 + } 953 + 954 + static int emit_flush_invalidate(struct xe_exec_queue *q, u32 *dw, int i, 955 + u32 flags) 956 + { 957 + struct xe_lrc *lrc = xe_exec_queue_lrc(q); 958 + dw[i++] = MI_FLUSH_DW | MI_INVALIDATE_TLB | MI_FLUSH_DW_OP_STOREDW | 959 + MI_FLUSH_IMM_DW | flags; 960 + dw[i++] = lower_32_bits(xe_lrc_start_seqno_ggtt_addr(lrc)) | 961 + MI_FLUSH_DW_USE_GTT; 962 + dw[i++] = upper_32_bits(xe_lrc_start_seqno_ggtt_addr(lrc)); 963 + dw[i++] = MI_NOOP; 964 + dw[i++] = MI_NOOP; 965 + 966 + return i; 967 + } 968 + 969 + /** 970 + * xe_migrate_ccs_rw_copy() - Copy content of TTM resources. 971 + * @tile: Tile whose migration context to be used. 972 + * @q : Execution to be used along with migration context. 973 + * @src_bo: The buffer object @src is currently bound to. 974 + * @read_write : Creates BB commands for CCS read/write. 975 + * 976 + * Creates batch buffer instructions to copy CCS metadata from CCS pool to 977 + * memory and vice versa. 978 + * 979 + * This function should only be called for IGPU. 980 + * 981 + * Return: 0 if successful, negative error code on failure. 
982 + */ 983 + int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q, 984 + struct xe_bo *src_bo, 985 + enum xe_sriov_vf_ccs_rw_ctxs read_write) 986 + 987 + { 988 + bool src_is_pltt = read_write == XE_SRIOV_VF_CCS_READ_CTX; 989 + bool dst_is_pltt = read_write == XE_SRIOV_VF_CCS_WRITE_CTX; 990 + struct ttm_resource *src = src_bo->ttm.resource; 991 + struct xe_migrate *m = tile->migrate; 992 + struct xe_gt *gt = tile->primary_gt; 993 + u32 batch_size, batch_size_allocated; 994 + struct xe_device *xe = gt_to_xe(gt); 995 + struct xe_res_cursor src_it, ccs_it; 996 + u64 size = xe_bo_size(src_bo); 997 + struct xe_bb *bb = NULL; 998 + u64 src_L0, src_L0_ofs; 999 + u32 src_L0_pt; 1000 + int err; 1001 + 1002 + xe_res_first_sg(xe_bo_sg(src_bo), 0, size, &src_it); 1003 + 1004 + xe_res_first_sg(xe_bo_sg(src_bo), xe_bo_ccs_pages_start(src_bo), 1005 + PAGE_ALIGN(xe_device_ccs_bytes(xe, size)), 1006 + &ccs_it); 1007 + 1008 + /* Calculate Batch buffer size */ 1009 + batch_size = 0; 1010 + while (size) { 1011 + batch_size += 10; /* Flush + ggtt addr + 2 NOP */ 1012 + u64 ccs_ofs, ccs_size; 1013 + u32 ccs_pt; 1014 + 1015 + u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE; 1016 + 1017 + src_L0 = min_t(u64, max_mem_transfer_per_pass(xe), size); 1018 + 1019 + batch_size += pte_update_size(m, false, src, &src_it, &src_L0, 1020 + &src_L0_ofs, &src_L0_pt, 0, 0, 1021 + avail_pts); 1022 + 1023 + ccs_size = xe_device_ccs_bytes(xe, src_L0); 1024 + batch_size += pte_update_size(m, 0, NULL, &ccs_it, &ccs_size, &ccs_ofs, 1025 + &ccs_pt, 0, avail_pts, avail_pts); 1026 + xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE)); 1027 + 1028 + /* Add copy commands size here */ 1029 + batch_size += EMIT_COPY_CCS_DW; 1030 + 1031 + size -= src_L0; 1032 + } 1033 + 1034 + bb = xe_bb_ccs_new(gt, batch_size, read_write); 1035 + if (IS_ERR(bb)) { 1036 + drm_err(&xe->drm, "BB allocation failed.\n"); 1037 + err = PTR_ERR(bb); 1038 + goto err_ret; 1039 + } 1040 + 1041 
+ batch_size_allocated = batch_size; 1042 + size = xe_bo_size(src_bo); 1043 + batch_size = 0; 1044 + 1045 + /* 1046 + * Emit PTE and copy commands here. 1047 + * The CCS copy command can only support limited size. If the size to be 1048 + * copied is more than the limit, divide copy into chunks. So, calculate 1049 + * sizes here again before copy command is emitted. 1050 + */ 1051 + while (size) { 1052 + batch_size += 10; /* Flush + ggtt addr + 2 NOP */ 1053 + u32 flush_flags = 0; 1054 + u64 ccs_ofs, ccs_size; 1055 + u32 ccs_pt; 1056 + 1057 + u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE; 1058 + 1059 + src_L0 = xe_migrate_res_sizes(m, &src_it); 1060 + 1061 + batch_size += pte_update_size(m, false, src, &src_it, &src_L0, 1062 + &src_L0_ofs, &src_L0_pt, 0, 0, 1063 + avail_pts); 1064 + 1065 + ccs_size = xe_device_ccs_bytes(xe, src_L0); 1066 + batch_size += pte_update_size(m, 0, NULL, &ccs_it, &ccs_size, &ccs_ofs, 1067 + &ccs_pt, 0, avail_pts, avail_pts); 1068 + xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE)); 1069 + batch_size += EMIT_COPY_CCS_DW; 1070 + 1071 + emit_pte(m, bb, src_L0_pt, false, true, &src_it, src_L0, src); 1072 + 1073 + emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, src); 1074 + 1075 + bb->len = emit_flush_invalidate(q, bb->cs, bb->len, flush_flags); 1076 + flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, src_is_pltt, 1077 + src_L0_ofs, dst_is_pltt, 1078 + src_L0, ccs_ofs, true); 1079 + bb->len = emit_flush_invalidate(q, bb->cs, bb->len, flush_flags); 1080 + 1081 + size -= src_L0; 1082 + } 1083 + 1084 + xe_assert(xe, (batch_size_allocated == bb->len)); 1085 + src_bo->bb_ccs[read_write] = bb; 1086 + 1087 + return 0; 1088 + 1089 + err_ret: 1090 + return err; 1091 + } 1092 + 1093 + /** 1094 + * xe_get_migrate_exec_queue() - Get the execution queue from migrate context. 1095 + * @migrate: Migrate context. 
1096 + * 1097 + * Return: Pointer to execution queue on success, error on failure 1098 + */ 1099 + struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate) 1100 + { 1101 + return migrate->q; 949 1102 } 950 1103 951 1104 static void emit_clear_link_copy(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs, ··· 1288 1119 1289 1120 size -= clear_L0; 1290 1121 /* Preemption is enabled again by the ring ops. */ 1291 - if (clear_vram && xe_migrate_allow_identity(clear_L0, &src_it)) 1122 + if (clear_vram && xe_migrate_allow_identity(clear_L0, &src_it)) { 1292 1123 xe_res_next(&src_it, clear_L0); 1293 - else 1294 - emit_pte(m, bb, clear_L0_pt, clear_vram, clear_only_system_ccs, 1295 - &src_it, clear_L0, dst); 1124 + } else { 1125 + emit_pte(m, bb, clear_L0_pt, clear_vram, 1126 + clear_only_system_ccs, &src_it, clear_L0, dst); 1127 + flush_flags |= MI_INVALIDATE_TLB; 1128 + } 1296 1129 1297 1130 bb->cs[bb->len++] = MI_BATCH_BUFFER_END; 1298 1131 update_idx = bb->len; ··· 1305 1134 if (xe_migrate_needs_ccs_emit(xe)) { 1306 1135 emit_copy_ccs(gt, bb, clear_L0_ofs, true, 1307 1136 m->cleared_mem_ofs, false, clear_L0); 1308 - flush_flags = MI_FLUSH_DW_CCS; 1137 + flush_flags |= MI_FLUSH_DW_CCS; 1309 1138 } 1310 1139 1311 1140 job = xe_bb_create_migration_job(m->q, bb, ··· 1640 1469 goto err_sa; 1641 1470 } 1642 1471 1472 + xe_sched_job_add_migrate_flush(job, MI_INVALIDATE_TLB); 1473 + 1643 1474 if (ops->pre_commit) { 1644 1475 pt_update->job = job; 1645 1476 err = ops->pre_commit(pt_update); ··· 1744 1571 1745 1572 static void build_pt_update_batch_sram(struct xe_migrate *m, 1746 1573 struct xe_bb *bb, u32 pt_offset, 1747 - dma_addr_t *sram_addr, u32 size) 1574 + struct drm_pagemap_addr *sram_addr, 1575 + u32 size) 1748 1576 { 1749 1577 u16 pat_index = tile_to_xe(m->tile)->pat.idx[XE_CACHE_WB]; 1750 1578 u32 ptes; ··· 1763 1589 ptes -= chunk; 1764 1590 1765 1591 while (chunk--) { 1766 - u64 addr = sram_addr[i++] & PAGE_MASK; 1592 + u64 addr = sram_addr[i].addr & 
PAGE_MASK; 1767 1593 1594 + xe_tile_assert(m->tile, sram_addr[i].proto == 1595 + DRM_INTERCONNECT_SYSTEM); 1768 1596 xe_tile_assert(m->tile, addr); 1769 1597 addr = m->q->vm->pt_ops->pte_encode_addr(m->tile->xe, 1770 1598 addr, pat_index, 1771 1599 0, false, 0); 1772 1600 bb->cs[bb->len++] = lower_32_bits(addr); 1773 1601 bb->cs[bb->len++] = upper_32_bits(addr); 1602 + 1603 + i++; 1774 1604 } 1775 1605 } 1776 1606 } ··· 1790 1612 static struct dma_fence *xe_migrate_vram(struct xe_migrate *m, 1791 1613 unsigned long len, 1792 1614 unsigned long sram_offset, 1793 - dma_addr_t *sram_addr, u64 vram_addr, 1615 + struct drm_pagemap_addr *sram_addr, 1616 + u64 vram_addr, 1794 1617 const enum xe_migrate_copy_dir dir) 1795 1618 { 1796 1619 struct xe_gt *gt = m->tile->primary_gt; ··· 1807 1628 unsigned int pitch = len >= PAGE_SIZE && !(len & ~PAGE_MASK) ? 1808 1629 PAGE_SIZE : 4; 1809 1630 int err; 1631 + unsigned long i, j; 1810 1632 1811 1633 if (drm_WARN_ON(&xe->drm, (len & XE_CACHELINE_MASK) || 1812 1634 (sram_offset | vram_addr) & XE_CACHELINE_MASK)) ··· 1822 1642 if (IS_ERR(bb)) { 1823 1643 err = PTR_ERR(bb); 1824 1644 return ERR_PTR(err); 1645 + } 1646 + 1647 + /* 1648 + * If the order of a struct drm_pagemap_addr entry is greater than 0, 1649 + * the entry is populated by GPU pagemap but subsequent entries within 1650 + * the range of that order are not populated. 1651 + * build_pt_update_batch_sram() expects a fully populated array of 1652 + * struct drm_pagemap_addr. Ensure this is the case even with higher 1653 + * orders. 
1654 + */ 1655 + for (i = 0; i < npages;) { 1656 + unsigned int order = sram_addr[i].order; 1657 + 1658 + for (j = 1; j < NR_PAGES(order) && i + j < npages; j++) 1659 + if (!sram_addr[i + j].addr) 1660 + sram_addr[i + j].addr = sram_addr[i].addr + j * PAGE_SIZE; 1661 + 1662 + i += NR_PAGES(order); 1825 1663 } 1826 1664 1827 1665 build_pt_update_batch_sram(m, bb, pt_slot * XE_PAGE_SIZE, ··· 1867 1669 goto err; 1868 1670 } 1869 1671 1870 - xe_sched_job_add_migrate_flush(job, 0); 1672 + xe_sched_job_add_migrate_flush(job, MI_INVALIDATE_TLB); 1871 1673 1872 1674 mutex_lock(&m->job_mutex); 1873 1675 xe_sched_job_arm(job); ··· 1892 1694 * xe_migrate_to_vram() - Migrate to VRAM 1893 1695 * @m: The migration context. 1894 1696 * @npages: Number of pages to migrate. 1895 - * @src_addr: Array of dma addresses (source of migrate) 1697 + * @src_addr: Array of DMA information (source of migrate) 1896 1698 * @dst_addr: Device physical address of VRAM (destination of migrate) 1897 1699 * 1898 1700 * Copy from an array dma addresses to a VRAM device physical address ··· 1902 1704 */ 1903 1705 struct dma_fence *xe_migrate_to_vram(struct xe_migrate *m, 1904 1706 unsigned long npages, 1905 - dma_addr_t *src_addr, 1707 + struct drm_pagemap_addr *src_addr, 1906 1708 u64 dst_addr) 1907 1709 { 1908 1710 return xe_migrate_vram(m, npages * PAGE_SIZE, 0, src_addr, dst_addr, ··· 1914 1716 * @m: The migration context. 1915 1717 * @npages: Number of pages to migrate. 
1916 1718 * @src_addr: Device physical address of VRAM (source of migrate) 1917 - * @dst_addr: Array of dma addresses (destination of migrate) 1719 + * @dst_addr: Array of DMA information (destination of migrate) 1918 1720 * 1919 1721 * Copy from a VRAM device physical address to an array dma addresses 1920 1722 * ··· 1924 1726 struct dma_fence *xe_migrate_from_vram(struct xe_migrate *m, 1925 1727 unsigned long npages, 1926 1728 u64 src_addr, 1927 - dma_addr_t *dst_addr) 1729 + struct drm_pagemap_addr *dst_addr) 1928 1730 { 1929 1731 return xe_migrate_vram(m, npages * PAGE_SIZE, 0, dst_addr, src_addr, 1930 1732 XE_MIGRATE_COPY_TO_SRAM); 1931 1733 } 1932 1734 1933 - static void xe_migrate_dma_unmap(struct xe_device *xe, dma_addr_t *dma_addr, 1735 + static void xe_migrate_dma_unmap(struct xe_device *xe, 1736 + struct drm_pagemap_addr *pagemap_addr, 1934 1737 int len, int write) 1935 1738 { 1936 1739 unsigned long i, npages = DIV_ROUND_UP(len, PAGE_SIZE); 1937 1740 1938 1741 for (i = 0; i < npages; ++i) { 1939 - if (!dma_addr[i]) 1742 + if (!pagemap_addr[i].addr) 1940 1743 break; 1941 1744 1942 - dma_unmap_page(xe->drm.dev, dma_addr[i], PAGE_SIZE, 1745 + dma_unmap_page(xe->drm.dev, pagemap_addr[i].addr, PAGE_SIZE, 1943 1746 write ? 
DMA_TO_DEVICE : DMA_FROM_DEVICE); 1944 1747 } 1945 - kfree(dma_addr); 1748 + kfree(pagemap_addr); 1946 1749 } 1947 1750 1948 - static dma_addr_t *xe_migrate_dma_map(struct xe_device *xe, 1949 - void *buf, int len, int write) 1751 + static struct drm_pagemap_addr *xe_migrate_dma_map(struct xe_device *xe, 1752 + void *buf, int len, 1753 + int write) 1950 1754 { 1951 - dma_addr_t *dma_addr; 1755 + struct drm_pagemap_addr *pagemap_addr; 1952 1756 unsigned long i, npages = DIV_ROUND_UP(len, PAGE_SIZE); 1953 1757 1954 - dma_addr = kcalloc(npages, sizeof(*dma_addr), GFP_KERNEL); 1955 - if (!dma_addr) 1758 + pagemap_addr = kcalloc(npages, sizeof(*pagemap_addr), GFP_KERNEL); 1759 + if (!pagemap_addr) 1956 1760 return ERR_PTR(-ENOMEM); 1957 1761 1958 1762 for (i = 0; i < npages; ++i) { 1959 1763 dma_addr_t addr; 1960 1764 struct page *page; 1765 + enum dma_data_direction dir = write ? DMA_TO_DEVICE : 1766 + DMA_FROM_DEVICE; 1961 1767 1962 1768 if (is_vmalloc_addr(buf)) 1963 1769 page = vmalloc_to_page(buf); 1964 1770 else 1965 1771 page = virt_to_page(buf); 1966 1772 1967 - addr = dma_map_page(xe->drm.dev, 1968 - page, 0, PAGE_SIZE, 1969 - write ? 
DMA_TO_DEVICE : 1970 - DMA_FROM_DEVICE); 1773 + addr = dma_map_page(xe->drm.dev, page, 0, PAGE_SIZE, dir); 1971 1774 if (dma_mapping_error(xe->drm.dev, addr)) 1972 1775 goto err_fault; 1973 1776 1974 - dma_addr[i] = addr; 1777 + pagemap_addr[i] = 1778 + drm_pagemap_addr_encode(addr, 1779 + DRM_INTERCONNECT_SYSTEM, 1780 + 0, dir); 1975 1781 buf += PAGE_SIZE; 1976 1782 } 1977 1783 1978 - return dma_addr; 1784 + return pagemap_addr; 1979 1785 1980 1786 err_fault: 1981 - xe_migrate_dma_unmap(xe, dma_addr, len, write); 1787 + xe_migrate_dma_unmap(xe, pagemap_addr, len, write); 1982 1788 return ERR_PTR(-EFAULT); 1983 1789 } 1984 1790 ··· 2011 1809 struct xe_device *xe = tile_to_xe(tile); 2012 1810 struct xe_res_cursor cursor; 2013 1811 struct dma_fence *fence = NULL; 2014 - dma_addr_t *dma_addr; 1812 + struct drm_pagemap_addr *pagemap_addr; 2015 1813 unsigned long page_offset = (unsigned long)buf & ~PAGE_MASK; 2016 1814 int bytes_left = len, current_page = 0; 2017 1815 void *orig_buf = buf; ··· 2071 1869 return err; 2072 1870 } 2073 1871 2074 - dma_addr = xe_migrate_dma_map(xe, buf, len + page_offset, write); 2075 - if (IS_ERR(dma_addr)) 2076 - return PTR_ERR(dma_addr); 1872 + pagemap_addr = xe_migrate_dma_map(xe, buf, len + page_offset, write); 1873 + if (IS_ERR(pagemap_addr)) 1874 + return PTR_ERR(pagemap_addr); 2077 1875 2078 1876 xe_res_first(bo->ttm.resource, offset, xe_bo_size(bo) - offset, &cursor); 2079 1877 ··· 2097 1895 2098 1896 __fence = xe_migrate_vram(m, current_bytes, 2099 1897 (unsigned long)buf & ~PAGE_MASK, 2100 - dma_addr + current_page, 1898 + &pagemap_addr[current_page], 2101 1899 vram_addr, write ? 2102 1900 XE_MIGRATE_COPY_TO_VRAM : 2103 1901 XE_MIGRATE_COPY_TO_SRAM); ··· 2125 1923 dma_fence_put(fence); 2126 1924 2127 1925 out_err: 2128 - xe_migrate_dma_unmap(xe, dma_addr, len + page_offset, write); 1926 + xe_migrate_dma_unmap(xe, pagemap_addr, len + page_offset, write); 2129 1927 return IS_ERR(fence) ? 
PTR_ERR(fence) : 0; 1928 + } 1929 + 1930 + /** 1931 + * xe_migrate_job_lock() - Lock migrate job lock 1932 + * @m: The migration context. 1933 + * @q: Queue associated with the operation which requires a lock 1934 + * 1935 + * Lock the migrate job lock if the queue is a migration queue, otherwise 1936 + * assert the VM's dma-resv is held (user queue's have own locking). 1937 + */ 1938 + void xe_migrate_job_lock(struct xe_migrate *m, struct xe_exec_queue *q) 1939 + { 1940 + bool is_migrate = q == m->q; 1941 + 1942 + if (is_migrate) 1943 + mutex_lock(&m->job_mutex); 1944 + else 1945 + xe_vm_assert_held(q->vm); /* User queues VM's should be locked */ 1946 + } 1947 + 1948 + /** 1949 + * xe_migrate_job_unlock() - Unlock migrate job lock 1950 + * @m: The migration context. 1951 + * @q: Queue associated with the operation which requires a lock 1952 + * 1953 + * Unlock the migrate job lock if the queue is a migration queue, otherwise 1954 + * assert the VM's dma-resv is held (user queue's have own locking). 1955 + */ 1956 + void xe_migrate_job_unlock(struct xe_migrate *m, struct xe_exec_queue *q) 1957 + { 1958 + bool is_migrate = q == m->q; 1959 + 1960 + if (is_migrate) 1961 + mutex_unlock(&m->job_mutex); 1962 + else 1963 + xe_vm_assert_held(q->vm); /* User queues VM's should be locked */ 2130 1964 } 2131 1965 2132 1966 #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
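Note on the higher-order handling added to xe_migrate_vram() above: the new loop fills in the address of every page covered by an entry whose order is greater than 0, so that build_pt_update_batch_sram() sees a fully populated array. The following is a minimal userspace sketch of that loop, not the driver code. `struct model_pagemap_addr`, `MODEL_PAGE_SIZE` and `fill_higher_order_entries()` are hypothetical stand-ins for `struct drm_pagemap_addr`, `PAGE_SIZE` and the inline loop; only the `addr` and `order` fields are modelled.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Hypothetical, simplified stand-in for struct drm_pagemap_addr. */
struct model_pagemap_addr {
	uint64_t addr;		/* DMA address, 0 means "not populated" */
	unsigned int order;	/* entry covers (1 << order) pages */
};

/*
 * Walk the array one mapping at a time. For a mapping of order N, only the
 * first entry carries an address, so derive the following (1 << N) - 1
 * addresses from it. Entries that already hold an address are left alone.
 */
static void fill_higher_order_entries(struct model_pagemap_addr *a, size_t npages)
{
	size_t i, j;

	for (i = 0; i < npages;) {
		size_t nr = (size_t)1 << a[i].order;

		for (j = 1; j < nr && i + j < npages; j++)
			if (!a[i + j].addr)
				a[i + j].addr = a[i].addr + j * MODEL_PAGE_SIZE;

		i += nr;
	}
}
```

The `i + j < npages` bound matters when the last mapping is larger than what remains of the array: the fill stops at the end of the array instead of writing past it.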

+25 -4

drivers/gpu/drm/xe/xe_migrate.h

···
 #include <linux/types.h>
 
 struct dma_fence;
+struct drm_pagemap_addr;
 struct iosys_map;
 struct ttm_resource;
 
 struct xe_bo;
 struct xe_gt;
+struct xe_tlb_inval_job;
 struct xe_exec_queue;
 struct xe_migrate;
 struct xe_migrate_pt_update;
···
 struct xe_vm;
 struct xe_vm_pgtable_update;
 struct xe_vma;
+
+enum xe_sriov_vf_ccs_rw_ctxs;
 
 /**
  * struct xe_migrate_pt_update_ops - Callbacks for the
···
 	struct xe_vma_ops *vops;
 	/** @job: The job if a GPU page-table update. NULL otherwise */
 	struct xe_sched_job *job;
+	/**
+	 * @ijob: The TLB invalidation job for primary GT. NULL otherwise
+	 */
+	struct xe_tlb_inval_job *ijob;
+	/**
+	 * @mjob: The TLB invalidation job for media GT. NULL otherwise
+	 */
+	struct xe_tlb_inval_job *mjob;
 	/** @tile_id: Tile ID of the update */
 	u8 tile_id;
 };
 
-struct xe_migrate *xe_migrate_init(struct xe_tile *tile);
+struct xe_migrate *xe_migrate_alloc(struct xe_tile *tile);
+int xe_migrate_init(struct xe_migrate *m);
 
 struct dma_fence *xe_migrate_to_vram(struct xe_migrate *m,
 				     unsigned long npages,
-				     dma_addr_t *src_addr,
+				     struct drm_pagemap_addr *src_addr,
 				     u64 dst_addr);
 
 struct dma_fence *xe_migrate_from_vram(struct xe_migrate *m,
 				       unsigned long npages,
 				       u64 src_addr,
-				       dma_addr_t *dst_addr);
+				       struct drm_pagemap_addr *dst_addr);
 
 struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
 				  struct xe_bo *src_bo,
···
 				  struct ttm_resource *dst,
 				  bool copy_only_ccs);
 
+int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
+			   struct xe_bo *src_bo,
+			   enum xe_sriov_vf_ccs_rw_ctxs read_write);
+
+struct xe_lrc *xe_migrate_lrc(struct xe_migrate *migrate);
+struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate);
 int xe_migrate_access_memory(struct xe_migrate *m, struct xe_bo *bo,
 			     unsigned long offset, void *buf, int len,
 			     int write);
···
 
 void xe_migrate_wait(struct xe_migrate *m);
 
-struct xe_exec_queue *xe_tile_migrate_exec_queue(struct xe_tile *tile);
+void xe_migrate_job_lock(struct xe_migrate *m, struct xe_exec_queue *q);
+void xe_migrate_job_unlock(struct xe_migrate *m, struct xe_exec_queue *q);
+
 #endif

-33

drivers/gpu/drm/xe/xe_mmio.c

···
 static void mmio_multi_tile_setup(struct xe_device *xe, size_t tile_mmio_size)
 {
 	struct xe_tile *tile;
-	struct xe_gt *gt;
 	u8 id;
 
 	/*
···
 	 */
 	if (xe->info.tile_count == 1)
 		return;
-
-	/* Possibly override number of tile based on configuration register */
-	if (!xe->info.skip_mtcfg) {
-		struct xe_mmio *mmio = xe_root_tile_mmio(xe);
-		u8 tile_count, gt_count;
-		u32 mtcfg;
-
-		/*
-		 * Although the per-tile mmio regs are not yet initialized, this
-		 * is fine as it's going to the root tile's mmio, that's
-		 * guaranteed to be initialized earlier in xe_mmio_probe_early()
-		 */
-		mtcfg = xe_mmio_read32(mmio, XEHP_MTCFG_ADDR);
-		tile_count = REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
-
-		if (tile_count < xe->info.tile_count) {
-			drm_info(&xe->drm, "tile_count: %d, reduced_tile_count %d\n",
-				 xe->info.tile_count, tile_count);
-			xe->info.tile_count = tile_count;
-
-			/*
-			 * We've already setup gt_count according to the full
-			 * tile count. Re-calculate it to only include the GTs
-			 * that belong to the remaining tile(s).
-			 */
-			gt_count = 0;
-			for_each_gt(gt, xe, id)
-				if (gt->info.id < tile_count * xe->info.max_gt_per_tile)
-					gt_count++;
-			xe->info.gt_count = gt_count;
-		}
-	}
 
 	for_each_remote_tile(tile, xe, id)
 		xe_mmio_init(&tile->mmio, tile, xe->mmio.regs + id * tile_mmio_size, SZ_4M);

+226

drivers/gpu/drm/xe/xe_mmio_gem.c

···
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_mmio_gem.h"
+
+#include <drm/drm_drv.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_managed.h>
+
+#include "xe_device_types.h"
+
+/**
+ * DOC: Exposing MMIO regions to userspace
+ *
+ * In certain cases, the driver may allow userspace to mmap a portion of the hardware registers.
+ *
+ * This can be done as follows:
+ * 1. Call xe_mmio_gem_create() to create a GEM object with an mmap-able fake offset.
+ * 2. Use xe_mmio_gem_mmap_offset() on the created GEM object to retrieve the fake offset.
+ * 3. Provide the fake offset to userspace.
+ * 4. Userspace can call mmap with the fake offset. The length provided to mmap
+ *    must match the size of the GEM object.
+ * 5. When the region is no longer needed, call xe_mmio_gem_destroy() to release the GEM object.
+ *
+ * NOTE: The exposed MMIO region must be page-aligned with regards to its BAR offset and size.
+ *
+ * WARNING: Exposing MMIO regions to userspace can have security and stability implications.
+ * Make sure not to expose any sensitive registers.
+ */
+
+static void xe_mmio_gem_free(struct drm_gem_object *);
+static int xe_mmio_gem_mmap(struct drm_gem_object *, struct vm_area_struct *);
+static vm_fault_t xe_mmio_gem_vm_fault(struct vm_fault *);
+
+struct xe_mmio_gem {
+	struct drm_gem_object base;
+	phys_addr_t phys_addr;
+};
+
+static const struct vm_operations_struct vm_ops = {
+	.open = drm_gem_vm_open,
+	.close = drm_gem_vm_close,
+	.fault = xe_mmio_gem_vm_fault,
+};
+
+static const struct drm_gem_object_funcs xe_mmio_gem_funcs = {
+	.free = xe_mmio_gem_free,
+	.mmap = xe_mmio_gem_mmap,
+	.vm_ops = &vm_ops,
+};
+
+static inline struct xe_mmio_gem *to_xe_mmio_gem(struct drm_gem_object *obj)
+{
+	return container_of(obj, struct xe_mmio_gem, base);
+}
+
+/**
+ * xe_mmio_gem_create - Expose an MMIO region to userspace
+ * @xe: The xe device
+ * @file: DRM file descriptor
+ * @phys_addr: Start of the exposed MMIO region
+ * @size: The size of the exposed MMIO region
+ *
+ * This function creates a GEM object that exposes an MMIO region with an mmap-able
+ * fake offset.
+ *
+ * See: "Exposing MMIO regions to userspace"
+ */
+struct xe_mmio_gem *xe_mmio_gem_create(struct xe_device *xe, struct drm_file *file,
+				       phys_addr_t phys_addr, size_t size)
+{
+	struct xe_mmio_gem *obj;
+	struct drm_gem_object *base;
+	int err;
+
+	if ((phys_addr % PAGE_SIZE != 0) || (size % PAGE_SIZE != 0))
+		return ERR_PTR(-EINVAL);
+
+	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	if (!obj)
+		return ERR_PTR(-ENOMEM);
+
+	base = &obj->base;
+	base->funcs = &xe_mmio_gem_funcs;
+	obj->phys_addr = phys_addr;
+
+	drm_gem_private_object_init(&xe->drm, base, size);
+
+	err = drm_gem_create_mmap_offset(base);
+	if (err)
+		goto free_gem;
+
+	err = drm_vma_node_allow(&base->vma_node, file);
+	if (err)
+		goto free_gem;
+
+	return obj;
+
+free_gem:
+	xe_mmio_gem_free(base);
+	return ERR_PTR(err);
+}
+
+/**
+ * xe_mmio_gem_mmap_offset - Return the mmap-able fake offset
+ * @gem: the GEM object created with xe_mmio_gem_create()
+ *
+ * This function returns the mmap-able fake offset allocated during
+ * xe_mmio_gem_create().
+ *
+ * See: "Exposing MMIO regions to userspace"
+ */
+u64 xe_mmio_gem_mmap_offset(struct xe_mmio_gem *gem)
+{
+	return drm_vma_node_offset_addr(&gem->base.vma_node);
+}
+
+static void xe_mmio_gem_free(struct drm_gem_object *base)
+{
+	struct xe_mmio_gem *obj = to_xe_mmio_gem(base);
+
+	drm_gem_object_release(base);
+	kfree(obj);
+}
+
+/**
+ * xe_mmio_gem_destroy - Destroy the GEM object that exposes an MMIO region
+ * @gem: the GEM object to destroy
+ *
+ * This function releases resources associated with the GEM object created by
+ * xe_mmio_gem_create().
+ *
+ * See: "Exposing MMIO regions to userspace"
+ */
+void xe_mmio_gem_destroy(struct xe_mmio_gem *gem)
+{
+	xe_mmio_gem_free(&gem->base);
+}
+
+static int xe_mmio_gem_mmap(struct drm_gem_object *base, struct vm_area_struct *vma)
+{
+	if (vma->vm_end - vma->vm_start != base->size)
+		return -EINVAL;
+
+	if ((vma->vm_flags & VM_SHARED) == 0)
+		return -EINVAL;
+
+	/* Set vm_pgoff (used as a fake buffer offset by DRM) to 0 */
+	vma->vm_pgoff = 0;
+	vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags));
+	vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP |
+		     VM_DONTCOPY | VM_NORESERVE);
+
+	/* Defer actual mapping to the fault handler. */
+	return 0;
+}
+
+static void xe_mmio_gem_release_dummy_page(struct drm_device *dev, void *res)
+{
+	__free_page((struct page *)res);
+}
+
+static vm_fault_t xe_mmio_gem_vm_fault_dummy_page(struct vm_area_struct *vma)
+{
+	struct drm_gem_object *base = vma->vm_private_data;
+	struct drm_device *dev = base->dev;
+	vm_fault_t ret = VM_FAULT_NOPAGE;
+	struct page *page;
+	unsigned long pfn;
+	unsigned long i;
+
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!page)
+		return VM_FAULT_OOM;
+
+	if (drmm_add_action_or_reset(dev, xe_mmio_gem_release_dummy_page, page))
+		return VM_FAULT_OOM;
+
+	pfn = page_to_pfn(page);
+
+	/* Map the entire VMA to the same dummy page */
+	for (i = 0; i < base->size; i += PAGE_SIZE) {
+		unsigned long addr = vma->vm_start + i;
+
+		ret = vmf_insert_pfn(vma, addr, pfn);
+		if (ret & VM_FAULT_ERROR)
+			break;
+	}
+
+	return ret;
+}
+
+static vm_fault_t xe_mmio_gem_vm_fault(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct drm_gem_object *base = vma->vm_private_data;
+	struct xe_mmio_gem *obj = to_xe_mmio_gem(base);
+	struct drm_device *dev = base->dev;
+	vm_fault_t ret = VM_FAULT_NOPAGE;
+	unsigned long i;
+	int idx;
+
+	if (!drm_dev_enter(dev, &idx)) {
+		/*
+		 * Provide a dummy page to avoid SIGBUS for events such as hot-unplug.
+		 * This gives the userspace the option to recover instead of crashing.
+		 * It is assumed the userspace will receive the notification via some
+		 * other channel (e.g. drm uevent).
+		 */
+		return xe_mmio_gem_vm_fault_dummy_page(vma);
+	}
+
+	for (i = 0; i < base->size; i += PAGE_SIZE) {
+		unsigned long addr = vma->vm_start + i;
+		unsigned long phys_addr = obj->phys_addr + i;
+
+		ret = vmf_insert_pfn(vma, addr, PHYS_PFN(phys_addr));
+		if (ret & VM_FAULT_ERROR)
+			break;
+	}
+
+	drm_dev_exit(idx);
+	return ret;
+}
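Note on the constraints in the new file above: xe_mmio_gem_create() rejects regions whose physical start or size is not page-aligned, and xe_mmio_gem_mmap() rejects mappings that do not cover the whole object or are not shared. The sketch below restates those two checks as plain predicates that can be compiled in userspace. It is only a model under stated assumptions: `MODEL_PAGE_SIZE`, `mmio_region_is_exposable()` and `mmio_mmap_request_ok()` are hypothetical names, the page size is assumed to be 4 KiB, and the `shared` flag stands in for the `VM_SHARED` test.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/*
 * Creation rule: both the physical start address and the size of the
 * region must be multiples of the page size, otherwise -EINVAL.
 */
static bool mmio_region_is_exposable(uint64_t phys_addr, uint64_t size)
{
	return (phys_addr % MODEL_PAGE_SIZE == 0) &&
	       (size % MODEL_PAGE_SIZE == 0);
}

/*
 * mmap rule: the requested range must span exactly the object size and
 * the mapping must be shared, otherwise -EINVAL.
 */
static bool mmio_mmap_request_ok(uint64_t vm_start, uint64_t vm_end,
				 uint64_t obj_size, bool shared)
{
	return (vm_end - vm_start == obj_size) && shared;
}
```

In practice this means userspace has to pass the full object length to mmap and request a shared mapping. A partial or private mapping of the exposed region is refused rather than silently truncated.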

+20

drivers/gpu/drm/xe/xe_mmio_gem.h

···
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_MMIO_GEM_H_
+#define _XE_MMIO_GEM_H_
+
+#include <linux/types.h>
+
+struct drm_file;
+struct xe_device;
+struct xe_mmio_gem;
+
+struct xe_mmio_gem *xe_mmio_gem_create(struct xe_device *xe, struct drm_file *file,
+				       phys_addr_t phys_addr, size_t size);
+u64 xe_mmio_gem_mmap_offset(struct xe_mmio_gem *gem);
+void xe_mmio_gem_destroy(struct xe_mmio_gem *gem);
+
+#endif /* _XE_MMIO_GEM_H_ */

+12 -17

drivers/gpu/drm/xe/xe_module.c

···
 	},
 };
 
-static int __init xe_call_init_func(unsigned int i)
+static int __init xe_call_init_func(const struct init_funcs *func)
 {
-	if (WARN_ON(i >= ARRAY_SIZE(init_funcs)))
-		return 0;
-	if (!init_funcs[i].init)
-		return 0;
-
-	return init_funcs[i].init();
+	if (func->init)
+		return func->init();
+	return 0;
 }
 
-static void xe_call_exit_func(unsigned int i)
+static void xe_call_exit_func(const struct init_funcs *func)
 {
-	if (WARN_ON(i >= ARRAY_SIZE(init_funcs)))
-		return;
-	if (!init_funcs[i].exit)
-		return;
-
-	init_funcs[i].exit();
+	if (func->exit)
+		func->exit();
 }
 
 static int __init xe_init(void)
···
 	int err, i;
 
 	for (i = 0; i < ARRAY_SIZE(init_funcs); i++) {
-		err = xe_call_init_func(i);
+		err = xe_call_init_func(init_funcs + i);
 		if (err) {
+			pr_info("%s: module_init aborted at %ps %pe\n",
+				DRIVER_NAME, init_funcs[i].init, ERR_PTR(err));
 			while (i--)
-				xe_call_exit_func(i);
+				xe_call_exit_func(init_funcs + i);
 			return err;
 		}
 	}
···
 	int i;
 
 	for (i = ARRAY_SIZE(init_funcs) - 1; i >= 0; i--)
-		xe_call_exit_func(i);
+		xe_call_exit_func(init_funcs + i);
 }
 
 module_init(xe_init);
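Note on the init/exit table in xe_module.c above: xe_init() walks the table in order and, when one step fails, calls the exit hooks of the steps that already succeeded, in reverse. Below is a minimal userspace sketch of that unwind pattern. It is a model, not the driver code: `struct init_step`, `run_init_steps()`, the `trace` buffer and the `demo_steps` table are all hypothetical, and the third demo step fails on purpose with an arbitrary error value.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry of the driver's init_funcs table. */
struct init_step {
	int (*init)(void);	/* may be NULL: nothing to set up */
	void (*exit)(void);	/* may be NULL: nothing to tear down */
};

static int call_init(const struct init_step *s)
{
	return s->init ? s->init() : 0;
}

static void call_exit(const struct init_step *s)
{
	if (s->exit)
		s->exit();
}

/* Run the steps in order; on failure unwind the completed ones in reverse. */
static int run_init_steps(const struct init_step *steps, size_t n)
{
	size_t i;
	int err;

	for (i = 0; i < n; i++) {
		err = call_init(&steps[i]);
		if (err) {
			/* steps[i] failed, so only steps[0..i-1] need their exit hook */
			while (i--)
				call_exit(&steps[i]);
			return err;
		}
	}
	return 0;
}

/* Demo table: records the call order so the unwind can be inspected. */
static char trace[16];
static size_t trace_len;

static void note(char c)
{
	if (trace_len < sizeof(trace) - 1)
		trace[trace_len++] = c;
}

static int init_a(void) { note('A'); return 0; }
static void exit_a(void) { note('a'); }
static int init_b(void) { note('B'); return 0; }
static void exit_b(void) { note('b'); }
static int init_c(void) { note('C'); return -5; }	/* simulated failure */

static const struct init_step demo_steps[] = {
	{ init_a, exit_a },
	{ init_b, exit_b },
	{ init_c, NULL },
};
```

Passing a pointer to the table entry, as the patch does, removes the need for the bounds check that the index-based helpers carried, since the callers only ever hand in entries taken from the table itself.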

+4 -4

drivers/gpu/drm/xe/xe_nvm.c

··· 39 39 40 40 static bool xe_nvm_non_posted_erase(struct xe_device *xe) 41 41 { 42 - struct xe_gt *gt = xe_root_mmio_gt(xe); 42 + struct xe_mmio *mmio = xe_root_tile_mmio(xe); 43 43 44 44 if (xe->info.platform != XE_BATTLEMAGE) 45 45 return false; 46 - return !(xe_mmio_read32(&gt->mmio, XE_REG(GEN12_CNTL_PROTECTED_NVM_REG)) & 46 + return !(xe_mmio_read32(mmio, XE_REG(GEN12_CNTL_PROTECTED_NVM_REG)) & 47 47 NVM_NON_POSTED_ERASE_CHICKEN_BIT); 48 48 } 49 49 50 50 static bool xe_nvm_writable_override(struct xe_device *xe) 51 51 { 52 - struct xe_gt *gt = xe_root_mmio_gt(xe); 52 + struct xe_mmio *mmio = xe_root_tile_mmio(xe); 53 53 bool writable_override; 54 54 resource_size_t base; 55 55 ··· 72 72 } 73 73 74 74 writable_override = 75 - !(xe_mmio_read32(&gt->mmio, HECI_FWSTS2(base)) & 75 + !(xe_mmio_read32(mmio, HECI_FWSTS2(base)) & 76 76 HECI_FW_STATUS_2_NVM_ACCESS_MODE); 77 77 if (writable_override) 78 78 drm_info(&xe->drm, "NVM access overridden by jumper\n");

+4 -4

drivers/gpu/drm/xe/xe_oa.c

··· 822 822 u32 sqcnt1; 823 823 824 824 /* Enable thread stall DOP gating and EU DOP gating. */ 825 - if (XE_WA(stream->gt, 1508761755)) { 825 + if (XE_GT_WA(stream->gt, 1508761755)) { 826 826 xe_gt_mcr_multicast_write(stream->gt, ROW_CHICKEN, 827 827 _MASKED_BIT_DISABLE(STALL_DOP_GATING_DISABLE)); 828 828 xe_gt_mcr_multicast_write(stream->gt, ROW_CHICKEN2, ··· 1079 1079 * EU NOA signals behave incorrectly if EU clock gating is enabled. 1080 1080 * Disable thread stall DOP gating and EU DOP gating. 1081 1081 */ 1082 - if (XE_WA(stream->gt, 1508761755)) { 1082 + if (XE_GT_WA(stream->gt, 1508761755)) { 1083 1083 xe_gt_mcr_multicast_write(stream->gt, ROW_CHICKEN, 1084 1084 _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)); 1085 1085 xe_gt_mcr_multicast_write(stream->gt, ROW_CHICKEN2, ··· 1754 1754 * GuC reset of engines causes OA to lose configuration 1755 1755 * state. Prevent this by overriding GUCRC mode. 1756 1756 */ 1757 - if (XE_WA(stream->gt, 1509372804)) { 1757 + if (XE_GT_WA(stream->gt, 1509372804)) { 1758 1758 ret = xe_guc_pc_override_gucrc_mode(&gt->uc.guc.pc, 1759 1759 SLPC_GUCRC_MODE_GUCRC_NO_RC6); 1760 1760 if (ret) ··· 1886 1886 { 1887 1887 u32 reg, shift; 1888 1888 1889 - if (XE_WA(gt, 18013179988) || XE_WA(gt, 14015568240)) { 1889 + if (XE_GT_WA(gt, 18013179988) || XE_GT_WA(gt, 14015568240)) { 1890 1890 xe_pm_runtime_get(gt_to_xe(gt)); 1891 1891 reg = xe_mmio_read32(&gt->mmio, RPM_CONFIG0); 1892 1892 xe_pm_runtime_put(gt_to_xe(gt));

+68 -8

drivers/gpu/drm/xe/xe_pci.c

··· 17 17 18 18 #include "display/xe_display.h" 19 19 #include "regs/xe_gt_regs.h" 20 + #include "regs/xe_regs.h" 21 + #include "xe_configfs.h" 20 22 #include "xe_device.h" 21 23 #include "xe_drv.h" 22 24 #include "xe_gt.h" ··· 57 55 }; 58 56 59 57 #define XE_HP_FEATURES \ 60 - .has_range_tlb_invalidation = true, \ 58 + .has_range_tlb_inval = true, \ 61 59 .va_bits = 48, \ 62 60 .vm_max_level = 3 63 61 ··· 105 103 .has_asid = 1, \ 106 104 .has_atomic_enable_pte_bit = 1, \ 107 105 .has_flat_ccs = 1, \ 108 - .has_range_tlb_invalidation = 1, \ 106 + .has_range_tlb_inval = 1, \ 109 107 .has_usm = 1, \ 110 108 .has_64bit_timestamp = 1, \ 111 109 .va_bits = 48, \ ··· 171 169 .dma_mask_size = 39, 172 170 .has_display = true, 173 171 .has_llc = true, 172 + .has_sriov = true, 174 173 .max_gt_per_tile = 1, 175 174 .require_force_probe = true, 176 175 }; ··· 196 193 .dma_mask_size = 39, 197 194 .has_display = true, 198 195 .has_llc = true, 196 + .has_sriov = true, 199 197 .max_gt_per_tile = 1, 200 198 .require_force_probe = true, 201 199 .subplatforms = (const struct xe_subplatform_desc[]) { ··· 214 210 .dma_mask_size = 39, 215 211 .has_display = true, 216 212 .has_llc = true, 213 + .has_sriov = true, 217 214 .max_gt_per_tile = 1, 218 215 .require_force_probe = true, 219 216 .subplatforms = (const struct xe_subplatform_desc[]) { ··· 230 225 .dma_mask_size = 39, 231 226 .has_display = true, 232 227 .has_llc = true, 228 + .has_sriov = true, 233 229 .max_gt_per_tile = 1, 234 230 .require_force_probe = true, 235 231 }; ··· 276 270 277 271 DG2_FEATURES, 278 272 .has_display = false, 273 + .has_sriov = true, 279 274 }; 280 275 281 276 static const struct xe_device_desc dg2_desc = { ··· 606 599 } 607 600 608 601 /* 602 + * Possibly override number of tile based on configuration register. 
603 + */ 604 + static void xe_info_probe_tile_count(struct xe_device *xe) 605 + { 606 + struct xe_mmio *mmio; 607 + u8 tile_count; 608 + u32 mtcfg; 609 + 610 + KUNIT_STATIC_STUB_REDIRECT(xe_info_probe_tile_count, xe); 611 + 612 + /* 613 + * Probe for tile count only for platforms that support multiple 614 + * tiles. 615 + */ 616 + if (xe->info.tile_count == 1) 617 + return; 618 + 619 + if (xe->info.skip_mtcfg) 620 + return; 621 + 622 + mmio = xe_root_tile_mmio(xe); 623 + 624 + /* 625 + * Although the per-tile mmio regs are not yet initialized, this 626 + * is fine as it's going to the root tile's mmio, that's 627 + * guaranteed to be initialized earlier in xe_mmio_probe_early() 628 + */ 629 + mtcfg = xe_mmio_read32(mmio, XEHP_MTCFG_ADDR); 630 + tile_count = REG_FIELD_GET(TILE_COUNT, mtcfg) + 1; 631 + 632 + if (tile_count < xe->info.tile_count) { 633 + drm_info(&xe->drm, "tile_count: %d, reduced_tile_count %d\n", 634 + xe->info.tile_count, tile_count); 635 + xe->info.tile_count = tile_count; 636 + } 637 + } 638 + 639 + /* 609 640 * Initialize device info content that does require knowledge about 610 641 * graphics / media IP version. 611 642 * Make sure that GT / tile structures allocated by the driver match the data ··· 713 668 /* Runtime detection may change this later */ 714 669 xe->info.has_flat_ccs = graphics_desc->has_flat_ccs; 715 670 716 - xe->info.has_range_tlb_invalidation = graphics_desc->has_range_tlb_invalidation; 671 + xe->info.has_range_tlb_inval = graphics_desc->has_range_tlb_inval; 717 672 xe->info.has_usm = graphics_desc->has_usm; 718 673 xe->info.has_64bit_timestamp = graphics_desc->has_64bit_timestamp; 674 + 675 + xe_info_probe_tile_count(xe); 719 676 720 677 for_each_remote_tile(tile, xe, id) { 721 678 int err; ··· 734 687 * All of these together determine the overall GT count. 
735 688 */ 736 689 for_each_tile(tile, xe, id) { 690 + int err; 691 + 737 692 gt = tile->primary_gt; 738 693 gt->info.type = XE_GT_TYPE_MAIN; 739 694 gt->info.id = tile->id * xe->info.max_gt_per_tile; 740 695 gt->info.has_indirect_ring_state = graphics_desc->has_indirect_ring_state; 741 696 gt->info.engine_mask = graphics_desc->hw_engine_mask; 742 - xe->info.gt_count++; 697 + 698 + err = xe_tile_alloc_vram(tile); 699 + if (err) 700 + return err; 743 701 744 702 if (MEDIA_VER(xe) < 13 && media_desc) 745 703 gt->info.engine_mask |= media_desc->hw_engine_mask; ··· 765 713 gt->info.id = tile->id * xe->info.max_gt_per_tile + 1; 766 714 gt->info.has_indirect_ring_state = media_desc->has_indirect_ring_state; 767 715 gt->info.engine_mask = media_desc->hw_engine_mask; 768 - xe->info.gt_count++; 769 716 } 717 + 718 + /* 719 + * Now that we have tiles and GTs defined, let's loop over valid GTs 720 + * in order to define gt_count. 721 + */ 722 + for_each_gt(gt, xe, id) 723 + xe->info.gt_count++; 770 724 771 725 return 0; 772 726 } ··· 784 726 if (IS_SRIOV_PF(xe)) 785 727 xe_pci_sriov_configure(pdev, 0); 786 728 787 - if (xe_survivability_mode_is_enabled(xe)) 729 + if (xe_survivability_mode_is_boot_enabled(xe)) 788 730 return; 789 731 790 732 xe_device_remove(xe); ··· 816 758 const struct xe_subplatform_desc *subplatform_desc; 817 759 struct xe_device *xe; 818 760 int err; 761 + 762 + xe_configfs_check_device(pdev); 819 763 820 764 if (desc->require_force_probe && !id_forced(pdev->device)) { 821 765 dev_info(&pdev->dev, ··· 866 806 * flashed through mei. 
Return success, if survivability mode 867 807 * is enabled due to pcode failure or configfs being set 868 808 */ 869 - if (xe_survivability_mode_is_enabled(xe)) 809 + if (xe_survivability_mode_is_boot_enabled(xe)) 870 810 return 0; 871 811 872 812 if (err) ··· 960 900 struct xe_device *xe = pdev_to_xe_device(pdev); 961 901 int err; 962 902 963 - if (xe_survivability_mode_is_enabled(xe)) 903 + if (xe_survivability_mode_is_boot_enabled(xe)) 964 904 return -EBUSY; 965 905 966 906 err = xe_pm_suspend(xe);
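The `xe_info_probe_tile_count()` hunk above reads a configuration register, decodes a "tile count minus one" field, and only ever lowers the tile count taken from the device descriptor. The sketch below illustrates that decode-and-clamp step in plain C. The field position (bits 15:8) and every `demo_`/`DEMO_` name are assumptions made for illustration; the diff does not show the real `TILE_COUNT` mask, and the real function also skips the probe when `skip_mtcfg` is set.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only: bits 15:8 hold (tile count - 1). */
#define DEMO_TILE_COUNT_SHIFT 8
#define DEMO_TILE_COUNT_MASK  (0xffu << DEMO_TILE_COUNT_SHIFT)

/*
 * Return the tile count to use, given the descriptor's count and the raw
 * register value. The hardware value can reduce the count, never raise it.
 */
uint8_t demo_probe_tile_count(uint8_t desc_tile_count, uint32_t mtcfg)
{
	unsigned int hw_tiles;

	assert(desc_tile_count >= 1);

	/* Single-tile platforms are not probed at all. */
	if (desc_tile_count == 1)
		return desc_tile_count;

	hw_tiles = ((mtcfg & DEMO_TILE_COUNT_MASK) >> DEMO_TILE_COUNT_SHIFT) + 1;

	if (hw_tiles < desc_tile_count)
		return (uint8_t)hw_tiles;

	return desc_tile_count;
}

/* Usage example: a two-tile descriptor against different register values. */
void demo_tile_selftest(void)
{
	assert(demo_probe_tile_count(2, 0x0000) == 1); /* field 0: one tile */
	assert(demo_probe_tile_count(2, 0x0100) == 2); /* field 1: two tiles */
	assert(demo_probe_tile_count(2, 0x0300) == 2); /* never raised above descriptor */
	assert(demo_probe_tile_count(1, 0x0300) == 1); /* single tile: no probe */
}
```

The sketch keeps the decoded value in an `unsigned int` so that a field value of 0xff does not wrap to zero before the comparison.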

+1 -1

drivers/gpu/drm/xe/xe_pci_types.h

··· 60 60 u8 has_atomic_enable_pte_bit:1; 61 61 u8 has_flat_ccs:1; 62 62 u8 has_indirect_ring_state:1; 63 - u8 has_range_tlb_invalidation:1; 63 + u8 has_range_tlb_inval:1; 64 64 u8 has_usm:1; 65 65 u8 has_64bit_timestamp:1; 66 66 };

+22 -1

drivers/gpu/drm/xe/xe_pm.c

··· 18 18 #include "xe_device.h" 19 19 #include "xe_ggtt.h" 20 20 #include "xe_gt.h" 21 - #include "xe_guc.h" 21 + #include "xe_gt_idle.h" 22 22 #include "xe_i2c.h" 23 23 #include "xe_irq.h" 24 24 #include "xe_pcode.h" 25 25 #include "xe_pxp.h" 26 + #include "xe_sriov_vf_ccs.h" 26 27 #include "xe_trace.h" 27 28 #include "xe_wa.h" 28 29 ··· 177 176 drm_dbg(&xe->drm, "Resuming device\n"); 178 177 trace_xe_pm_resume(xe, __builtin_return_address(0)); 179 178 179 + for_each_gt(gt, xe, id) 180 + xe_gt_idle_disable_c6(gt); 181 + 180 182 for_each_tile(tile, xe, id) 181 183 xe_wa_apply_tile_workarounds(tile); 182 184 ··· 211 207 goto err; 212 208 213 209 xe_pxp_pm_resume(xe->pxp); 210 + 211 + if (IS_SRIOV_VF(xe)) 212 + xe_sriov_vf_ccs_register_context(xe); 214 213 215 214 drm_dbg(&xe->drm, "Device resumed\n"); 216 215 return 0; ··· 249 242 static void xe_pm_runtime_init(struct xe_device *xe) 250 243 { 251 244 struct device *dev = xe->drm.dev; 245 + 246 + /* Our current VFs do not support RPM. so, disable it */ 247 + if (IS_SRIOV_VF(xe)) 248 + return; 252 249 253 250 /* 254 251 * Disable the system suspend direct complete optimization. ··· 377 366 static void xe_pm_runtime_fini(struct xe_device *xe) 378 367 { 379 368 struct device *dev = xe->drm.dev; 369 + 370 + /* Our current VFs do not support RPM. so, disable it */ 371 + if (IS_SRIOV_VF(xe)) 372 + return; 380 373 381 374 pm_runtime_get_sync(dev); 382 375 pm_runtime_forbid(dev); ··· 540 525 541 526 xe_rpm_lockmap_acquire(xe); 542 527 528 + for_each_gt(gt, xe, id) 529 + xe_gt_idle_disable_c6(gt); 530 + 543 531 if (xe->d3cold.allowed) { 544 532 err = xe_pcode_ready(xe, true); 545 533 if (err) ··· 575 557 } 576 558 577 559 xe_pxp_pm_resume(xe->pxp); 560 + 561 + if (IS_SRIOV_VF(xe)) 562 + xe_sriov_vf_ccs_register_context(xe); 578 563 579 564 out: 580 565 xe_rpm_lockmap_release(xe);

+306

drivers/gpu/drm/xe/xe_psmi.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include <linux/debugfs.h> 7 + 8 + #include "xe_bo.h" 9 + #include "xe_device.h" 10 + #include "xe_configfs.h" 11 + #include "xe_psmi.h" 12 + 13 + /* 14 + * PSMI capture support 15 + * 16 + * Requirement for PSMI capture is to have a physically contiguous buffer. The 17 + * PSMI tool owns doing all necessary configuration (MMIO register writes are 18 + * done from user-space). However, KMD needs to provide the PSMI tool with the 19 + * required physical address of the base of PSMI buffer in case of VRAM. 20 + * 21 + * VRAM backed PSMI buffer: 22 + * Buffer is allocated as GEM object and with XE_BO_CREATE_PINNED_BIT flag which 23 + * creates a contiguous allocation. The physical address is returned from 24 + * psmi_debugfs_capture_addr_show(). PSMI tool can mmap the buffer via the 25 + * PCIBAR through sysfs. 26 + * 27 + * SYSTEM memory backed PSMI buffer: 28 + * Interface here does not support allocating from SYSTEM memory region. The 29 + * PSMI tool needs to allocate memory themselves using hugetlbfs. In order to 30 + * get the physical address, user-space can query /proc/[pid]/pagemap. As an 31 + * alternative, CMA debugfs could also be used to allocate reserved CMA memory. 32 + */ 33 + 34 + static bool psmi_enabled(struct xe_device *xe) 35 + { 36 + return xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev)); 37 + } 38 + 39 + static void psmi_free_object(struct xe_bo *bo) 40 + { 41 + xe_bo_lock(bo, NULL); 42 + xe_bo_unpin(bo); 43 + xe_bo_unlock(bo); 44 + xe_bo_put(bo); 45 + } 46 + 47 + /* 48 + * Free PSMI capture buffer objects. 
49 + */ 50 + static void psmi_cleanup(struct xe_device *xe) 51 + { 52 + unsigned long id, region_mask = xe->psmi.region_mask; 53 + struct xe_bo *bo; 54 + 55 + for_each_set_bit(id, &region_mask, 56 + ARRAY_SIZE(xe->psmi.capture_obj)) { 57 + /* smem should never be set */ 58 + xe_assert(xe, id); 59 + 60 + bo = xe->psmi.capture_obj[id]; 61 + if (bo) { 62 + psmi_free_object(bo); 63 + xe->psmi.capture_obj[id] = NULL; 64 + } 65 + } 66 + } 67 + 68 + static struct xe_bo *psmi_alloc_object(struct xe_device *xe, 69 + unsigned int id, size_t bo_size) 70 + { 71 + struct xe_bo *bo = NULL; 72 + struct xe_tile *tile; 73 + int err; 74 + 75 + if (!id || !bo_size) 76 + return NULL; 77 + 78 + tile = &xe->tiles[id - 1]; 79 + 80 + /* VRAM: Allocate GEM object for the capture buffer */ 81 + bo = xe_bo_create_locked(xe, tile, NULL, bo_size, 82 + ttm_bo_type_kernel, 83 + XE_BO_FLAG_VRAM_IF_DGFX(tile) | 84 + XE_BO_FLAG_PINNED | 85 + XE_BO_FLAG_PINNED_LATE_RESTORE | 86 + XE_BO_FLAG_NEEDS_CPU_ACCESS); 87 + 88 + if (!IS_ERR(bo)) { 89 + /* Buffer written by HW, ensure stays resident */ 90 + err = xe_bo_pin(bo); 91 + if (err) 92 + bo = ERR_PTR(err); 93 + xe_bo_unlock(bo); 94 + } 95 + 96 + return bo; 97 + } 98 + 99 + /* 100 + * Allocate PSMI capture buffer objects (via debugfs set function), based on 101 + * which regions the user has selected in region_mask. @size: size in bytes 102 + * (should be power of 2) 103 + * 104 + * Always release/free the current buffer objects before attempting to allocate 105 + * new ones. Size == 0 will free all current buffers. 106 + * 107 + * Note, we don't write any registers as the capture tool is already configuring 108 + * all PSMI registers itself via mmio space. 
109 + */ 110 + static int psmi_resize_object(struct xe_device *xe, size_t size) 111 + { 112 + unsigned long id, region_mask = xe->psmi.region_mask; 113 + struct xe_bo *bo = NULL; 114 + int err = 0; 115 + 116 + /* if resizing, free currently allocated buffers first */ 117 + psmi_cleanup(xe); 118 + 119 + /* can set size to 0, in which case, now done */ 120 + if (!size) 121 + return 0; 122 + 123 + for_each_set_bit(id, &region_mask, 124 + ARRAY_SIZE(xe->psmi.capture_obj)) { 125 + /* smem should never be set */ 126 + xe_assert(xe, id); 127 + 128 + bo = psmi_alloc_object(xe, id, size); 129 + if (IS_ERR(bo)) { 130 + err = PTR_ERR(bo); 131 + break; 132 + } 133 + xe->psmi.capture_obj[id] = bo; 134 + 135 + drm_info(&xe->drm, 136 + "PSMI capture size requested: %zu bytes, allocated: %lu:%zu\n", 137 + size, id, bo ? xe_bo_size(bo) : 0); 138 + } 139 + 140 + /* on error, reverse what was allocated */ 141 + if (err) 142 + psmi_cleanup(xe); 143 + 144 + return err; 145 + } 146 + 147 + /* 148 + * Returns an address for the capture tool to use to find start of capture 149 + * buffer. Capture tool requires the capability to have a buffer allocated per 150 + * each tile (VRAM region), thus we return an address for each region. 
151 + */ 152 + static int psmi_debugfs_capture_addr_show(struct seq_file *m, void *data) 153 + { 154 + struct xe_device *xe = m->private; 155 + unsigned long id, region_mask; 156 + struct xe_bo *bo; 157 + u64 val; 158 + 159 + region_mask = xe->psmi.region_mask; 160 + for_each_set_bit(id, &region_mask, 161 + ARRAY_SIZE(xe->psmi.capture_obj)) { 162 + /* smem should never be set */ 163 + xe_assert(xe, id); 164 + 165 + /* VRAM region */ 166 + bo = xe->psmi.capture_obj[id]; 167 + if (!bo) 168 + continue; 169 + 170 + /* pinned, so don't need bo_lock */ 171 + val = __xe_bo_addr(bo, 0, PAGE_SIZE); 172 + seq_printf(m, "%ld: 0x%llx\n", id, val); 173 + } 174 + 175 + return 0; 176 + } 177 + 178 + /* 179 + * Return capture buffer size, using the size from first allocated object that 180 + * is found. This works because all objects must be of the same size. 181 + */ 182 + static int psmi_debugfs_capture_size_get(void *data, u64 *val) 183 + { 184 + unsigned long id, region_mask; 185 + struct xe_device *xe = data; 186 + struct xe_bo *bo; 187 + 188 + region_mask = xe->psmi.region_mask; 189 + for_each_set_bit(id, &region_mask, 190 + ARRAY_SIZE(xe->psmi.capture_obj)) { 191 + /* smem should never be set */ 192 + xe_assert(xe, id); 193 + 194 + bo = xe->psmi.capture_obj[id]; 195 + if (bo) { 196 + *val = xe_bo_size(bo); 197 + return 0; 198 + } 199 + } 200 + 201 + /* no capture objects are allocated */ 202 + *val = 0; 203 + 204 + return 0; 205 + } 206 + 207 + /* 208 + * Set size of PSMI capture buffer. This triggers the allocation of capture 209 + * buffer in each memory region as specified with prior write to 210 + * psmi_capture_region_mask. 
211 + */ 212 + static int psmi_debugfs_capture_size_set(void *data, u64 val) 213 + { 214 + struct xe_device *xe = data; 215 + 216 + /* user must have specified at least one region */ 217 + if (!xe->psmi.region_mask) 218 + return -EINVAL; 219 + 220 + return psmi_resize_object(xe, val); 221 + } 222 + 223 + static int psmi_debugfs_capture_region_mask_get(void *data, u64 *val) 224 + { 225 + struct xe_device *xe = data; 226 + 227 + *val = xe->psmi.region_mask; 228 + 229 + return 0; 230 + } 231 + 232 + /* 233 + * Select VRAM regions for multi-tile devices, only allowed when buffer is not 234 + * currently allocated. 235 + */ 236 + static int psmi_debugfs_capture_region_mask_set(void *data, u64 region_mask) 237 + { 238 + struct xe_device *xe = data; 239 + u64 size = 0; 240 + 241 + /* SMEM is not supported (see comments at top of file) */ 242 + if (region_mask & 0x1) 243 + return -EOPNOTSUPP; 244 + 245 + /* input bitmask should contain only valid TTM regions */ 246 + if (!region_mask || region_mask & ~xe->info.mem_region_mask) 247 + return -EINVAL; 248 + 249 + /* only allow setting mask if buffer is not yet allocated */ 250 + psmi_debugfs_capture_size_get(xe, &size); 251 + if (size) 252 + return -EBUSY; 253 + 254 + xe->psmi.region_mask = region_mask; 255 + 256 + return 0; 257 + } 258 + 259 + DEFINE_SHOW_ATTRIBUTE(psmi_debugfs_capture_addr); 260 + 261 + DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_region_mask_fops, 262 + psmi_debugfs_capture_region_mask_get, 263 + psmi_debugfs_capture_region_mask_set, 264 + "0x%llx\n"); 265 + 266 + DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_size_fops, 267 + psmi_debugfs_capture_size_get, 268 + psmi_debugfs_capture_size_set, 269 + "%lld\n"); 270 + 271 + void xe_psmi_debugfs_register(struct xe_device *xe) 272 + { 273 + struct drm_minor *minor; 274 + 275 + if (!psmi_enabled(xe)) 276 + return; 277 + 278 + minor = xe->drm.primary; 279 + if (!minor->debugfs_root) 280 + return; 281 + 282 + debugfs_create_file("psmi_capture_addr", 283 + 0400, 
minor->debugfs_root, xe, 284 + &psmi_debugfs_capture_addr_fops); 285 + 286 + debugfs_create_file("psmi_capture_region_mask", 287 + 0600, minor->debugfs_root, xe, 288 + &psmi_debugfs_capture_region_mask_fops); 289 + 290 + debugfs_create_file("psmi_capture_size", 291 + 0600, minor->debugfs_root, xe, 292 + &psmi_debugfs_capture_size_fops); 293 + } 294 + 295 + static void psmi_fini(void *arg) 296 + { 297 + psmi_cleanup(arg); 298 + } 299 + 300 + int xe_psmi_init(struct xe_device *xe) 301 + { 302 + if (!psmi_enabled(xe)) 303 + return 0; 304 + 305 + return devm_add_action(xe->drm.dev, psmi_fini, xe); 306 + }
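The validation order in `psmi_debugfs_capture_region_mask_set()` above is: reject system memory (bit 0), reject empty or out-of-range masks, then refuse changes while a buffer is allocated. The following sketch restates those checks as a pure function so the error codes are easy to follow. `demo_check_region_mask` and its parameters are hypothetical; the real handler reads the valid mask and the current size from the device structure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Validate a requested PSMI region mask.
 * requested:     mask written by the user (bit 0 = system memory)
 * valid_regions: mask of memory regions the device actually has
 * current_size:  size of the currently allocated capture buffer, 0 if none
 */
int demo_check_region_mask(uint64_t requested, uint64_t valid_regions,
			   uint64_t current_size)
{
	/* System memory is not supported by this interface. */
	if (requested & 0x1)
		return -EOPNOTSUPP;

	/* The mask must be non-empty and name only existing regions. */
	if (!requested || (requested & ~valid_regions))
		return -EINVAL;

	/* The mask may only change while no buffer is allocated. */
	if (current_size)
		return -EBUSY;

	return 0;
}

/* Usage example: a device with system memory plus two VRAM regions. */
void demo_region_selftest(void)
{
	const uint64_t valid = 0x7;

	assert(demo_check_region_mask(0x2, valid, 0) == 0);
	assert(demo_check_region_mask(0x3, valid, 0) == -EOPNOTSUPP);
	assert(demo_check_region_mask(0x0, valid, 0) == -EINVAL);
	assert(demo_check_region_mask(0x8, valid, 0) == -EINVAL);
	assert(demo_check_region_mask(0x6, valid, 4096) == -EBUSY);
}
```

Because the system memory check comes first, a mask that is both out of range and has bit 0 set reports `-EOPNOTSUPP` rather than `-EINVAL`.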

+14

drivers/gpu/drm/xe/xe_psmi.h

··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_PSMI_H_ 7 + #define _XE_PSMI_H_ 8 + 9 + struct xe_device; 10 + 11 + int xe_psmi_init(struct xe_device *xe); 12 + void xe_psmi_debugfs_register(struct xe_device *xe); 13 + 14 + #endif

+117 -121

drivers/gpu/drm/xe/xe_pt.c

··· 13 13 #include "xe_drm_client.h" 14 14 #include "xe_exec_queue.h" 15 15 #include "xe_gt.h" 16 - #include "xe_gt_tlb_invalidation.h" 16 + #include "xe_tlb_inval_job.h" 17 17 #include "xe_migrate.h" 18 18 #include "xe_pt_types.h" 19 19 #include "xe_pt_walk.h" ··· 21 21 #include "xe_sched_job.h" 22 22 #include "xe_sync.h" 23 23 #include "xe_svm.h" 24 + #include "xe_tlb_inval_job.h" 24 25 #include "xe_trace.h" 25 26 #include "xe_ttm_stolen_mgr.h" 26 27 #include "xe_vm.h" ··· 70 69 71 70 if (level > MAX_HUGEPTE_LEVEL) 72 71 return vm->pt_ops->pde_encode_bo(vm->scratch_pt[id][level - 1]->bo, 73 - 0, pat_index); 72 + 0); 74 73 75 74 return vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level, IS_DGFX(xe), 0) | 76 75 XE_PTE_NULL; ··· 519 518 { 520 519 struct xe_pt_stage_bind_walk *xe_walk = 521 520 container_of(walk, typeof(*xe_walk), base); 522 - u16 pat_index = xe_walk->vma->pat_index; 521 + u16 pat_index = xe_walk->vma->attr.pat_index; 523 522 struct xe_pt *xe_parent = container_of(parent, typeof(*xe_parent), base); 524 523 struct xe_vm *vm = xe_walk->vm; 525 524 struct xe_pt *xe_child; ··· 617 616 xe_child->is_compact = true; 618 617 } 619 618 620 - pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0, pat_index) | flags; 619 + pte = vm->pt_ops->pde_encode_bo(xe_child->bo, 0) | flags; 621 620 ret = xe_pt_insert_entry(xe_walk, xe_parent, offset, xe_child, 622 621 pte); 623 622 } ··· 641 640 * - In all other cases device atomics will be disabled with AE=0 until an application 642 641 * request differently using a ioctl like madvise. 
643 642 */ 644 - static bool xe_atomic_for_vram(struct xe_vm *vm) 643 + static bool xe_atomic_for_vram(struct xe_vm *vm, struct xe_vma *vma) 645 644 { 645 + if (vma->attr.atomic_access == DRM_XE_ATOMIC_CPU) 646 + return false; 647 + 646 648 return true; 647 649 } 648 650 649 - static bool xe_atomic_for_system(struct xe_vm *vm, struct xe_bo *bo) 651 + static bool xe_atomic_for_system(struct xe_vm *vm, struct xe_vma *vma) 650 652 { 651 653 struct xe_device *xe = vm->xe; 654 + struct xe_bo *bo = xe_vma_bo(vma); 652 655 653 - if (!xe->info.has_device_atomics_on_smem) 656 + if (!xe->info.has_device_atomics_on_smem || 657 + vma->attr.atomic_access == DRM_XE_ATOMIC_CPU) 654 658 return false; 659 + 660 + if (vma->attr.atomic_access == DRM_XE_ATOMIC_DEVICE) 661 + return true; 655 662 656 663 /* 657 664 * If a SMEM+LMEM allocation is backed by SMEM, a device 658 665 * atomics will cause a gpu page fault and which then 659 666 * gets migrated to LMEM, bind such allocations with 660 667 * device atomics enabled. 661 - * 662 - * TODO: Revisit this. Perhaps add something like a 663 - * fault_on_atomics_in_system UAPI flag. 664 - * Note that this also prohibits GPU atomics in LR mode for 665 - * userptr and system memory on DGFX. 666 668 */ 667 669 return (!IS_DGFX(xe) || (!xe_vm_in_lr_mode(vm) || 668 670 (bo && xe_bo_has_single_placement(bo)))); ··· 748 744 goto walk_pt; 749 745 750 746 if (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) { 751 - xe_walk.default_vram_pte = xe_atomic_for_vram(vm) ? XE_USM_PPGTT_PTE_AE : 0; 752 - xe_walk.default_system_pte = xe_atomic_for_system(vm, bo) ? 747 + xe_walk.default_vram_pte = xe_atomic_for_vram(vm, vma) ? XE_USM_PPGTT_PTE_AE : 0; 748 + xe_walk.default_system_pte = xe_atomic_for_system(vm, vma) ? 
753 749 XE_USM_PPGTT_PTE_AE : 0; 754 750 } 755 751 ··· 954 950 struct xe_pt *pt = vm->pt_root[tile->id]; 955 951 u8 pt_mask = (range->tile_present & ~range->tile_invalidated); 956 952 957 - xe_svm_assert_in_notifier(vm); 953 + /* 954 + * Locking rules: 955 + * 956 + * - notifier_lock (write): full protection against page table changes 957 + * and MMU notifier invalidations. 958 + * 959 + * - notifier_lock (read) + vm_lock (write): combined protection against 960 + * invalidations and concurrent page table modifications. (e.g., madvise) 961 + * 962 + */ 963 + lockdep_assert(lockdep_is_held_type(&vm->svm.gpusvm.notifier_lock, 0) || 964 + (lockdep_is_held_type(&vm->svm.gpusvm.notifier_lock, 1) && 965 + lockdep_is_held_type(&vm->lock, 0))); 958 966 959 967 if (!(pt_mask & BIT(tile->id))) 960 968 return false; ··· 1277 1261 } 1278 1262 1279 1263 static int xe_pt_vm_dependencies(struct xe_sched_job *job, 1264 + struct xe_tlb_inval_job *ijob, 1265 + struct xe_tlb_inval_job *mjob, 1280 1266 struct xe_vm *vm, 1281 1267 struct xe_vma_ops *vops, 1282 1268 struct xe_vm_pgtable_update_ops *pt_update_ops, ··· 1346 1328 for (i = 0; job && !err && i < vops->num_syncs; i++) 1347 1329 err = xe_sync_entry_add_deps(&vops->syncs[i], job); 1348 1330 1331 + if (job) { 1332 + if (ijob) { 1333 + err = xe_tlb_inval_job_alloc_dep(ijob); 1334 + if (err) 1335 + return err; 1336 + } 1337 + 1338 + if (mjob) { 1339 + err = xe_tlb_inval_job_alloc_dep(mjob); 1340 + if (err) 1341 + return err; 1342 + } 1343 + } 1344 + 1349 1345 return err; 1350 1346 } 1351 1347 ··· 1371 1339 struct xe_vm_pgtable_update_ops *pt_update_ops = 1372 1340 &vops->pt_update_ops[pt_update->tile_id]; 1373 1341 1374 - return xe_pt_vm_dependencies(pt_update->job, vm, pt_update->vops, 1342 + return xe_pt_vm_dependencies(pt_update->job, pt_update->ijob, 1343 + pt_update->mjob, vm, pt_update->vops, 1375 1344 pt_update_ops, rftree); 1376 1345 } 1377 1346 ··· 1541 1508 return 0; 1542 1509 } 1543 1510 #endif 1544 - 1545 - struct 
invalidation_fence { 1546 - struct xe_gt_tlb_invalidation_fence base; 1547 - struct xe_gt *gt; 1548 - struct dma_fence *fence; 1549 - struct dma_fence_cb cb; 1550 - struct work_struct work; 1551 - u64 start; 1552 - u64 end; 1553 - u32 asid; 1554 - }; 1555 - 1556 - static void invalidation_fence_cb(struct dma_fence *fence, 1557 - struct dma_fence_cb *cb) 1558 - { 1559 - struct invalidation_fence *ifence = 1560 - container_of(cb, struct invalidation_fence, cb); 1561 - struct xe_device *xe = gt_to_xe(ifence->gt); 1562 - 1563 - trace_xe_gt_tlb_invalidation_fence_cb(xe, &ifence->base); 1564 - if (!ifence->fence->error) { 1565 - queue_work(system_wq, &ifence->work); 1566 - } else { 1567 - ifence->base.base.error = ifence->fence->error; 1568 - xe_gt_tlb_invalidation_fence_signal(&ifence->base); 1569 - } 1570 - dma_fence_put(ifence->fence); 1571 - } 1572 - 1573 - static void invalidation_fence_work_func(struct work_struct *w) 1574 - { 1575 - struct invalidation_fence *ifence = 1576 - container_of(w, struct invalidation_fence, work); 1577 - struct xe_device *xe = gt_to_xe(ifence->gt); 1578 - 1579 - trace_xe_gt_tlb_invalidation_fence_work_func(xe, &ifence->base); 1580 - xe_gt_tlb_invalidation_range(ifence->gt, &ifence->base, ifence->start, 1581 - ifence->end, ifence->asid); 1582 - } 1583 - 1584 - static void invalidation_fence_init(struct xe_gt *gt, 1585 - struct invalidation_fence *ifence, 1586 - struct dma_fence *fence, 1587 - u64 start, u64 end, u32 asid) 1588 - { 1589 - int ret; 1590 - 1591 - trace_xe_gt_tlb_invalidation_fence_create(gt_to_xe(gt), &ifence->base); 1592 - 1593 - xe_gt_tlb_invalidation_fence_init(gt, &ifence->base, false); 1594 - 1595 - ifence->fence = fence; 1596 - ifence->gt = gt; 1597 - ifence->start = start; 1598 - ifence->end = end; 1599 - ifence->asid = asid; 1600 - 1601 - INIT_WORK(&ifence->work, invalidation_fence_work_func); 1602 - ret = dma_fence_add_callback(fence, &ifence->cb, invalidation_fence_cb); 1603 - if (ret == -ENOENT) { 1604 - 
dma_fence_put(ifence->fence); /* Usually dropped in CB */ 1605 - invalidation_fence_work_func(&ifence->work); 1606 - } else if (ret) { 1607 - dma_fence_put(&ifence->base.base); /* Caller ref */ 1608 - dma_fence_put(&ifence->base.base); /* Creation ref */ 1609 - } 1610 - 1611 - xe_gt_assert(gt, !ret || ret == -ENOENT); 1612 - } 1613 1511 1614 1512 struct xe_pt_stage_unbind_walk { 1615 1513 /** @base: The pagewalk base-class. */ ··· 2354 2390 static const struct xe_migrate_pt_update_ops svm_migrate_ops; 2355 2391 #endif 2356 2392 2393 + static struct xe_dep_scheduler *to_dep_scheduler(struct xe_exec_queue *q, 2394 + struct xe_gt *gt) 2395 + { 2396 + if (xe_gt_is_media_type(gt)) 2397 + return q->tlb_inval[XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT].dep_scheduler; 2398 + 2399 + return q->tlb_inval[XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT].dep_scheduler; 2400 + } 2401 + 2357 2402 /** 2358 2403 * xe_pt_update_ops_run() - Run PT update operations 2359 2404 * @tile: Tile of PT update operations ··· 2380 2407 struct xe_vm *vm = vops->vm; 2381 2408 struct xe_vm_pgtable_update_ops *pt_update_ops = 2382 2409 &vops->pt_update_ops[tile->id]; 2383 - struct dma_fence *fence; 2384 - struct invalidation_fence *ifence = NULL, *mfence = NULL; 2410 + struct dma_fence *fence, *ifence, *mfence; 2411 + struct xe_tlb_inval_job *ijob = NULL, *mjob = NULL; 2385 2412 struct dma_fence **fences = NULL; 2386 2413 struct dma_fence_array *cf = NULL; 2387 2414 struct xe_range_fence *rfence; ··· 2413 2440 #endif 2414 2441 2415 2442 if (pt_update_ops->needs_invalidation) { 2416 - ifence = kzalloc(sizeof(*ifence), GFP_KERNEL); 2417 - if (!ifence) { 2418 - err = -ENOMEM; 2443 + struct xe_exec_queue *q = pt_update_ops->q; 2444 + struct xe_dep_scheduler *dep_scheduler = 2445 + to_dep_scheduler(q, tile->primary_gt); 2446 + 2447 + ijob = xe_tlb_inval_job_create(q, &tile->primary_gt->tlb_inval, 2448 + dep_scheduler, 2449 + pt_update_ops->start, 2450 + pt_update_ops->last, 2451 + vm->usm.asid); 2452 + if (IS_ERR(ijob)) { 
2453 + err = PTR_ERR(ijob); 2419 2454 goto kill_vm_tile1; 2420 2455 } 2456 + update.ijob = ijob; 2457 + 2421 2458 if (tile->media_gt) { 2422 - mfence = kzalloc(sizeof(*ifence), GFP_KERNEL); 2423 - if (!mfence) { 2424 - err = -ENOMEM; 2425 - goto free_ifence; 2459 + dep_scheduler = to_dep_scheduler(q, tile->media_gt); 2460 + 2461 + mjob = xe_tlb_inval_job_create(q, 2462 + &tile->media_gt->tlb_inval, 2463 + dep_scheduler, 2464 + pt_update_ops->start, 2465 + pt_update_ops->last, 2466 + vm->usm.asid); 2467 + if (IS_ERR(mjob)) { 2468 + err = PTR_ERR(mjob); 2469 + goto free_ijob; 2426 2470 } 2471 + update.mjob = mjob; 2472 + 2427 2473 fences = kmalloc_array(2, sizeof(*fences), GFP_KERNEL); 2428 2474 if (!fences) { 2429 2475 err = -ENOMEM; 2430 - goto free_ifence; 2476 + goto free_ijob; 2431 2477 } 2432 2478 cf = dma_fence_array_alloc(2); 2433 2479 if (!cf) { 2434 2480 err = -ENOMEM; 2435 - goto free_ifence; 2481 + goto free_ijob; 2436 2482 } 2437 2483 } 2438 2484 } ··· 2459 2467 rfence = kzalloc(sizeof(*rfence), GFP_KERNEL); 2460 2468 if (!rfence) { 2461 2469 err = -ENOMEM; 2462 - goto free_ifence; 2470 + goto free_ijob; 2463 2471 } 2464 2472 2465 2473 fence = xe_migrate_update_pgtables(tile->migrate, &update); ··· 2483 2491 pt_update_ops->last, fence)) 2484 2492 dma_fence_wait(fence, false); 2485 2493 2486 - /* tlb invalidation must be done before signaling rebind */ 2487 - if (ifence) { 2488 - if (mfence) 2489 - dma_fence_get(fence); 2490 - invalidation_fence_init(tile->primary_gt, ifence, fence, 2491 - pt_update_ops->start, 2492 - pt_update_ops->last, vm->usm.asid); 2493 - if (mfence) { 2494 - invalidation_fence_init(tile->media_gt, mfence, fence, 2495 - pt_update_ops->start, 2496 - pt_update_ops->last, vm->usm.asid); 2497 - fences[0] = &ifence->base.base; 2498 - fences[1] = &mfence->base.base; 2494 + /* tlb invalidation must be done before signaling unbind/rebind */ 2495 + if (ijob) { 2496 + struct dma_fence *__fence; 2497 + 2498 + ifence = 
xe_tlb_inval_job_push(ijob, tile->migrate, fence); 2499 + __fence = ifence; 2500 + 2501 + if (mjob) { 2502 + fences[0] = ifence; 2503 + mfence = xe_tlb_inval_job_push(mjob, tile->migrate, 2504 + fence); 2505 + fences[1] = mfence; 2506 + 2499 2507 dma_fence_array_init(cf, 2, fences, 2500 2508 vm->composite_fence_ctx, 2501 2509 vm->composite_fence_seqno++, 2502 2510 false); 2503 - fence = &cf->base; 2504 - } else { 2505 - fence = &ifence->base.base; 2511 + __fence = &cf->base; 2506 2512 } 2513 + 2514 + dma_fence_put(fence); 2515 + fence = __fence; 2507 2516 } 2508 2517 2509 - if (!mfence) { 2518 + if (!mjob) { 2510 2519 dma_resv_add_fence(xe_vm_resv(vm), fence, 2511 2520 pt_update_ops->wait_vm_bookkeep ? 2512 2521 DMA_RESV_USAGE_KERNEL : ··· 2516 2523 list_for_each_entry(op, &vops->list, link) 2517 2524 op_commit(vops->vm, tile, pt_update_ops, op, fence, NULL); 2518 2525 } else { 2519 - dma_resv_add_fence(xe_vm_resv(vm), &ifence->base.base, 2526 + dma_resv_add_fence(xe_vm_resv(vm), ifence, 2520 2527 pt_update_ops->wait_vm_bookkeep ? 2521 2528 DMA_RESV_USAGE_KERNEL : 2522 2529 DMA_RESV_USAGE_BOOKKEEP); 2523 2530 2524 - dma_resv_add_fence(xe_vm_resv(vm), &mfence->base.base, 2531 + dma_resv_add_fence(xe_vm_resv(vm), mfence, 2525 2532 pt_update_ops->wait_vm_bookkeep ? 
2526 2533 DMA_RESV_USAGE_KERNEL : 2527 2534 DMA_RESV_USAGE_BOOKKEEP); 2528 2535 2529 2536 list_for_each_entry(op, &vops->list, link) 2530 - op_commit(vops->vm, tile, pt_update_ops, op, 2531 - &ifence->base.base, &mfence->base.base); 2537 + op_commit(vops->vm, tile, pt_update_ops, op, ifence, 2538 + mfence); 2532 2539 } 2533 2540 2534 2541 if (pt_update_ops->needs_svm_lock) ··· 2536 2543 if (pt_update_ops->needs_userptr_lock) 2537 2544 up_read(&vm->userptr.notifier_lock); 2538 2545 2546 + xe_tlb_inval_job_put(mjob); 2547 + xe_tlb_inval_job_put(ijob); 2548 + 2539 2549 return fence; 2540 2550 2541 2551 free_rfence: 2542 2552 kfree(rfence); 2543 - free_ifence: 2553 + free_ijob: 2544 2554 kfree(cf); 2545 2555 kfree(fences); 2546 - kfree(mfence); 2547 - kfree(ifence); 2556 + xe_tlb_inval_job_put(mjob); 2557 + xe_tlb_inval_job_put(ijob); 2548 2558 kill_vm_tile1: 2549 2559 if (err != -EAGAIN && err != -ENODATA && tile->id) 2550 2560 xe_vm_kill(vops->vm, false);

+1 -2

drivers/gpu/drm/xe/xe_pt_types.h

···
 	u64 (*pte_encode_addr)(struct xe_device *xe, u64 addr,
 			       u16 pat_index,
 			       u32 pt_level, bool devmem, u64 flags);
-	u64 (*pde_encode_bo)(struct xe_bo *bo, u64 bo_offset,
-			     u16 pat_index);
+	u64 (*pde_encode_bo)(struct xe_bo *bo, u64 bo_offset);
 };
 
 struct xe_pt_entry {

+1 -1

drivers/gpu/drm/xe/xe_pxp_submit.c

···
 	xe_assert(xe, hwe);
 
 	/* PXP instructions must be issued from PPGTT */
-	vm = xe_vm_create(xe, XE_VM_FLAG_GSC);
+	vm = xe_vm_create(xe, XE_VM_FLAG_GSC, NULL);
 	if (IS_ERR(vm))
 		return PTR_ERR(vm);

+6 -7

drivers/gpu/drm/xe/xe_query.c

···
 #include "xe_oa.h"
 #include "xe_pxp.h"
 #include "xe_ttm_vram_mgr.h"
+#include "xe_vram_types.h"
 #include "xe_wa.h"
 
 static const u16 xe_to_user_engine_class[] = {
···
 	config->num_params = num_params;
 	config->info[DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID] =
 		xe->info.devid | (xe->info.revid << 16);
-	if (xe_device_get_root_tile(xe)->mem.vram.usable_size)
+	if (xe->mem.vram)
 		config->info[DRM_XE_QUERY_CONFIG_FLAGS] |=
 			DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM;
 	if (xe->info.has_usm && IS_ENABLED(CONFIG_DRM_XE_GPUSVM))
···
 			gt_list->gt_list[iter].near_mem_regions = 0x1;
 		else
 			gt_list->gt_list[iter].near_mem_regions =
-				BIT(gt_to_tile(gt)->id) << 1;
+				BIT(gt_to_tile(gt)->mem.vram->id) << 1;
 		gt_list->gt_list[iter].far_mem_regions = xe->info.mem_region_mask ^
 			gt_list->gt_list[iter].near_mem_regions;
 
···
 			sizeof_field(struct xe_gt, fuse_topo.eu_mask_per_dss);
 
 		/* L3bank mask may not be available for some GTs */
-		if (!XE_WA(gt, no_media_l3))
+		if (!XE_GT_WA(gt, no_media_l3))
 			query_size += sizeof(struct drm_xe_query_topology_mask) +
 				sizeof_field(struct xe_gt, fuse_topo.l3_bank_mask);
 	}
···
 		 * mask, then it's better to omit L3 from the query rather than
 		 * reporting bogus or zeroed information to userspace.
 		 */
-		if (!XE_WA(gt, no_media_l3)) {
+		if (!XE_GT_WA(gt, no_media_l3)) {
 			topo.type = DRM_XE_TOPO_L3_BANK;
 			err = copy_mask(&query_ptr, &topo, gt->fuse_topo.l3_bank_mask,
 					sizeof(gt->fuse_topo.l3_bank_mask));
···
 	u32 num_rates;
 	int ret;
 
-	if (!xe_eu_stall_supported_on_platform(xe)) {
-		drm_dbg(&xe->drm, "EU stall monitoring is not supported on this platform\n");
+	if (!xe_eu_stall_supported_on_platform(xe))
 		return -ENODEV;
-	}
 
 	array_size = xe_eu_stall_get_sampling_rates(&num_rates, &rates);
 	size = sizeof(struct drm_xe_query_eu_stall) + array_size;

+5 -5

drivers/gpu/drm/xe/xe_res_cursor.h

···
 	u32 mem_type;
 	/** @sgl: Scatterlist for cursor */
 	struct scatterlist *sgl;
-	/** @dma_addr: Current element in a struct drm_pagemap_device_addr array */
-	const struct drm_pagemap_device_addr *dma_addr;
+	/** @dma_addr: Current element in a struct drm_pagemap_addr array */
+	const struct drm_pagemap_addr *dma_addr;
 	/** @mm: Buddy allocator for VRAM cursor */
 	struct drm_buddy *mm;
 	/**
···
  */
 static inline void __xe_res_dma_next(struct xe_res_cursor *cur)
 {
-	const struct drm_pagemap_device_addr *addr = cur->dma_addr;
+	const struct drm_pagemap_addr *addr = cur->dma_addr;
 	u64 start = cur->start;
 
 	while (start >= cur->dma_seg_size) {
···
 /**
  * xe_res_first_dma - initialize a xe_res_cursor with dma_addr array
  *
- * @dma_addr: struct drm_pagemap_device_addr array to walk
+ * @dma_addr: struct drm_pagemap_addr array to walk
  * @start: Start of the range
  * @size: Size of the range
  * @cur: cursor object to initialize
  *
  * Start walking over the range of allocations between @start and @size.
  */
-static inline void xe_res_first_dma(const struct drm_pagemap_device_addr *dma_addr,
+static inline void xe_res_first_dma(const struct drm_pagemap_addr *dma_addr,
 				    u64 start, u64 size,
 				    struct xe_res_cursor *cur)
 {

+10 -12

drivers/gpu/drm/xe/xe_ring_ops.c

···
 	return i;
 }
 
-static int emit_flush_invalidate(u32 addr, u32 val, u32 *dw, int i)
+static int emit_flush_invalidate(u32 addr, u32 val, u32 flush_flags, u32 *dw, int i)
 {
-	dw[i++] = MI_FLUSH_DW | MI_INVALIDATE_TLB | MI_FLUSH_DW_OP_STOREDW |
-		  MI_FLUSH_IMM_DW;
+	dw[i++] = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW |
+		  MI_FLUSH_IMM_DW | (flush_flags & MI_INVALIDATE_TLB) ?: 0;
 
 	dw[i++] = addr | MI_FLUSH_DW_USE_GTT;
 	dw[i++] = 0;
···
 	bool lacks_render = !(gt->info.engine_mask & XE_HW_ENGINE_RCS_MASK);
 	u32 flags;
 
-	if (XE_WA(gt, 14016712196))
+	if (XE_GT_WA(gt, 14016712196))
 		i = emit_pipe_control(dw, i, 0, PIPE_CONTROL_DEPTH_CACHE_FLUSH,
 				      LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
 
···
 		 PIPE_CONTROL_DC_FLUSH_ENABLE |
 		 PIPE_CONTROL_FLUSH_ENABLE);
 
-	if (XE_WA(gt, 1409600907))
+	if (XE_GT_WA(gt, 1409600907))
 		flags |= PIPE_CONTROL_DEPTH_STALL;
 
 	if (lacks_render)
···
 	if (hwe->class != XE_ENGINE_CLASS_RENDER)
 		return i;
 
-	if (XE_WA(hwe->gt, 16020292621))
+	if (XE_GT_WA(hwe->gt, 16020292621))
 		i = emit_pipe_control(dw, i, 0, PIPE_CONTROL_LRI_POST_SYNC,
 				      RING_NOPID(hwe->mmio_base).addr, 0);
 
···
 	i = emit_bb_start(job->ptrs[0].batch_addr, BIT(8), dw, i);
 
 	dw[i++] = preparser_disable(true);
-	i = emit_flush_invalidate(saddr, seqno, dw, i);
+	i = emit_flush_invalidate(saddr, seqno, job->migrate_flush_flags, dw, i);
 	dw[i++] = preparser_disable(false);
 
 	i = emit_bb_start(job->ptrs[1].batch_addr, BIT(8), dw, i);
 
-	dw[i++] = MI_FLUSH_DW | MI_INVALIDATE_TLB | job->migrate_flush_flags |
-		  MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_IMM_DW;
-	dw[i++] = xe_lrc_seqno_ggtt_addr(lrc) | MI_FLUSH_DW_USE_GTT;
-	dw[i++] = 0;
-	dw[i++] = seqno; /* value */
+	i = emit_flush_imm_ggtt(xe_lrc_seqno_ggtt_addr(lrc), seqno,
+				job->migrate_flush_flags,
+				dw, i);
 
 	i = emit_user_interrupt(dw, i);
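A note on the new header dword in emit_flush_invalidate(): it ends in `| (flush_flags & MI_INVALIDATE_TLB) ?: 0`. In C the conditional operator binds more loosely than `|`, so the whole OR chain becomes the condition, and with the GNU omitted-operand form the result is that same value. Read that way, the trailing `?: 0` looks redundant rather than harmful. The sketch below checks this reading with made-up bit values; the `DEMO_*` constants and function names are placeholders for illustration, not the hardware encodings or driver symbols, and the `?:` form needs GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; NOT the real MI_* encodings. */
#define DEMO_FLUSH_DW       (1u << 0)
#define DEMO_OP_STOREDW     (1u << 1)
#define DEMO_IMM_DW         (1u << 2)
#define DEMO_INVALIDATE_TLB (1u << 3)

/* Same shape as the expression in the diff (GNU "a ?: b" extension). */
static uint32_t header_as_written(uint32_t flush_flags)
{
	return DEMO_FLUSH_DW | DEMO_OP_STOREDW |
	       DEMO_IMM_DW | (flush_flags & DEMO_INVALIDATE_TLB) ?: 0;
}

/* Plain OR of the fixed bits and the masked caller flag. */
static uint32_t header_plain(uint32_t flush_flags)
{
	return DEMO_FLUSH_DW | DEMO_OP_STOREDW | DEMO_IMM_DW |
	       (flush_flags & DEMO_INVALIDATE_TLB);
}

/* Returns 1 when both forms produce the same dword for the given flags. */
static int headers_agree(uint32_t flush_flags)
{
	return header_as_written(flush_flags) == header_plain(flush_flags);
}
```

Because the fixed bits are always non-zero, the condition is always true and both forms agree for any flag value; only the masked TLB bit from the caller reaches the header.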

+7

drivers/gpu/drm/xe/xe_rtp.c

···
 
 #include <uapi/drm/xe_drm.h>
 
+#include "xe_configfs.h"
 #include "xe_gt.h"
 #include "xe_gt_topology.h"
 #include "xe_macros.h"
···
 			       const struct xe_hw_engine *hwe)
 {
 	return !IS_SRIOV_VF(gt_to_xe(gt));
+}
+
+bool xe_rtp_match_psmi_enabled(const struct xe_gt *gt,
+			       const struct xe_hw_engine *hwe)
+{
+	return xe_configfs_get_psmi_enabled(to_pci_dev(gt_to_xe(gt)->drm.dev));
 }

+3

drivers/gpu/drm/xe/xe_rtp.h

···
 bool xe_rtp_match_not_sriov_vf(const struct xe_gt *gt,
 			       const struct xe_hw_engine *hwe);
 
+bool xe_rtp_match_psmi_enabled(const struct xe_gt *gt,
+			       const struct xe_hw_engine *hwe);
+
 #endif

-1

drivers/gpu/drm/xe/xe_sa.c

···
 	}
 	sa_manager->bo = bo;
 	sa_manager->is_iomem = bo->vmap.is_iomem;
-	sa_manager->gpu_addr = xe_bo_ggtt_addr(bo);
 
 	if (bo->vmap.is_iomem) {
 		sa_manager->cpu_ptr = kvzalloc(managed_size, GFP_KERNEL);

+14 -1

drivers/gpu/drm/xe/xe_sa.h

···
 
 #include <linux/sizes.h>
 #include <linux/types.h>
+
+#include "xe_bo.h"
 #include "xe_sa_types.h"
 
 struct dma_fence;
···
 	return container_of(mng, struct xe_sa_manager, base);
 }
 
+/**
+ * xe_sa_manager_gpu_addr - Retrieve GPU address of a back storage BO
+ * within suballocator.
+ * @sa_manager: the &xe_sa_manager struct instance
+ * Return: GGTT address of the back storage BO.
+ */
+static inline u64 xe_sa_manager_gpu_addr(struct xe_sa_manager *sa_manager)
+{
+	return xe_bo_ggtt_addr(sa_manager->bo);
+}
+
 static inline u64 xe_sa_bo_gpu_addr(struct drm_suballoc *sa)
 {
-	return to_xe_sa_manager(sa->manager)->gpu_addr +
+	return xe_sa_manager_gpu_addr(to_xe_sa_manager(sa->manager)) +
 		drm_suballoc_soffset(sa);
 }
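The xe_sa changes drop the cached `gpu_addr` field and derive the address from the backing BO on every call. The diff does not state the motivation; one plausible reading is that a derived address cannot go stale if the BO's GGTT address changes later. A minimal sketch of that pattern, using hypothetical `demo_*` types that only mimic the shape of the driver structures:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real structs carry far more state. */
struct demo_bo {
	uint64_t ggtt_addr;	/* current address of the backing object */
};

struct demo_sa_manager {
	struct demo_bo *bo;	/* no cached copy of the address is kept */
};

/* Base address is read from the BO each time, mirroring the new helper. */
static uint64_t demo_sa_manager_gpu_addr(const struct demo_sa_manager *m)
{
	return m->bo->ggtt_addr;
}

/* Address of a suballocation = current base + its offset in the pool. */
static uint64_t demo_sa_bo_gpu_addr(const struct demo_sa_manager *m,
				    uint64_t soffset)
{
	return demo_sa_manager_gpu_addr(m) + soffset;
}
```

If the BO's address is updated after the manager was created, the next call picks up the new base without any extra bookkeeping in the manager.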

-1

drivers/gpu/drm/xe/xe_sa_types.h

···
 struct xe_sa_manager {
 	struct drm_suballoc_manager base;
 	struct xe_bo *bo;
-	u64 gpu_addr;
 	void *cpu_ptr;
 	bool is_iomem;
 };

+19

drivers/gpu/drm/xe/xe_sriov.c

···
 #include "xe_sriov.h"
 #include "xe_sriov_pf.h"
 #include "xe_sriov_vf.h"
+#include "xe_sriov_vf_ccs.h"
 
 /**
  * xe_sriov_mode_to_string - Convert enum value to string.
···
 	else
 		strscpy(buf, "PF", size);
 	return buf;
+}
+
+/**
+ * xe_sriov_late_init() - SR-IOV late initialization functions.
+ * @xe: the &xe_device to initialize
+ *
+ * On VF this function will initialize code for CCS migration.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_late_init(struct xe_device *xe)
+{
+	int err = 0;
+
+	if (IS_VF_CCS_INIT_NEEDED(xe))
+		err = xe_sriov_vf_ccs_init(xe);
+
+	return err;
 }

+1

drivers/gpu/drm/xe/xe_sriov.h

···
 void xe_sriov_probe_early(struct xe_device *xe);
 void xe_sriov_print_info(struct xe_device *xe, struct drm_printer *p);
 int xe_sriov_init(struct xe_device *xe);
+int xe_sriov_late_init(struct xe_device *xe);
 
 static inline enum xe_sriov_mode xe_device_sriov_mode(const struct xe_device *xe)
 {

+75 -3

drivers/gpu/drm/xe/xe_sriov_vf.c

··· 11 11 #include "xe_gt_sriov_printk.h" 12 12 #include "xe_gt_sriov_vf.h" 13 13 #include "xe_guc_ct.h" 14 + #include "xe_guc_submit.h" 15 + #include "xe_irq.h" 16 + #include "xe_lrc.h" 14 17 #include "xe_pm.h" 15 18 #include "xe_sriov.h" 16 19 #include "xe_sriov_printk.h" ··· 150 147 xe_sriov_info(xe, "migration not supported by this module version\n"); 151 148 } 152 149 150 + /** 151 + * vf_post_migration_shutdown - Stop the driver activities after VF migration. 152 + * @xe: the &xe_device struct instance 153 + * 154 + * After this VM is migrated and assigned to a new VF, it is running on a new 155 + * hardware, and therefore many hardware-dependent states and related structures 156 + * require fixups. Without fixups, the hardware cannot do any work, and therefore 157 + * all GPU pipelines are stalled. 158 + * Stop some of kernel activities to make the fixup process faster. 159 + */ 160 + static void vf_post_migration_shutdown(struct xe_device *xe) 161 + { 162 + struct xe_gt *gt; 163 + unsigned int id; 164 + int ret = 0; 165 + 166 + for_each_gt(gt, xe, id) { 167 + xe_guc_submit_pause(&gt->uc.guc); 168 + ret |= xe_guc_submit_reset_block(&gt->uc.guc); 169 + } 170 + 171 + if (ret) 172 + drm_info(&xe->drm, "migration recovery encountered ongoing reset\n"); 173 + } 174 + 175 + /** 176 + * vf_post_migration_kickstart - Re-start the driver activities under new hardware. 177 + * @xe: the &xe_device struct instance 178 + * 179 + * After we have finished with all post-migration fixups, restart the driver 180 + * activities to continue feeding the GPU with workloads. 181 + */ 182 + static void vf_post_migration_kickstart(struct xe_device *xe) 183 + { 184 + struct xe_gt *gt; 185 + unsigned int id; 186 + 187 + /* 188 + * Make sure interrupts on the new HW are properly set. The GuC IRQ 189 + * must be working at this point, since the recovery did started, 190 + * but the rest was not enabled using the procedure from spec. 
191 + */ 192 + xe_irq_resume(xe); 193 + 194 + for_each_gt(gt, xe, id) { 195 + xe_guc_submit_reset_unblock(&gt->uc.guc); 196 + xe_guc_submit_unpause(&gt->uc.guc); 197 + } 198 + } 199 + 153 200 static bool gt_vf_post_migration_needed(struct xe_gt *gt) 154 201 { 155 202 return test_bit(gt->info.id, &gt_to_xe(gt)->sriov.vf.migration.gt_flags); ··· 245 192 return -1; 246 193 } 247 194 195 + static size_t post_migration_scratch_size(struct xe_device *xe) 196 + { 197 + return max(xe_lrc_reg_size(xe), LRC_WA_BB_SIZE); 198 + } 199 + 248 200 /** 249 201 * Perform post-migration fixups on a single GT. 250 202 * ··· 266 208 static int gt_vf_post_migration_fixups(struct xe_gt *gt) 267 209 { 268 210 s64 shift; 211 + void *buf; 269 212 int err; 213 + 214 + buf = kmalloc(post_migration_scratch_size(gt_to_xe(gt)), GFP_KERNEL); 215 + if (!buf) 216 + return -ENOMEM; 270 217 271 218 err = xe_gt_sriov_vf_query_config(gt); 272 219 if (err) 273 - return err; 220 + goto out; 274 221 275 222 shift = xe_gt_sriov_vf_ggtt_shift(gt); 276 223 if (shift) { 277 224 xe_tile_sriov_vf_fixup_ggtt_nodes(gt_to_tile(gt), shift); 278 - /* FIXME: add the recovery steps */ 225 + xe_gt_sriov_vf_default_lrcs_hwsp_rebase(gt); 226 + err = xe_guc_contexts_hwsp_rebase(&gt->uc.guc, buf); 227 + if (err) 228 + goto out; 229 + xe_guc_jobs_ring_rebase(&gt->uc.guc); 279 230 xe_guc_ct_fixup_messages_with_ggtt(&gt->uc.guc.ct, shift); 280 231 } 281 - return 0; 232 + 233 + out: 234 + kfree(buf); 235 + return err; 282 236 } 283 237 284 238 static void vf_post_migration_recovery(struct xe_device *xe) ··· 300 230 301 231 drm_dbg(&xe->drm, "migration recovery in progress\n"); 302 232 xe_pm_runtime_get(xe); 233 + vf_post_migration_shutdown(xe); 303 234 304 235 if (!vf_migration_supported(xe)) { 305 236 xe_sriov_err(xe, "migration not supported by this module version\n"); ··· 318 247 set_bit(id, &fixed_gts); 319 248 } 320 249 250 + vf_post_migration_kickstart(xe); 321 251 err = vf_post_migration_notify_resfix_done(xe, 
fixed_gts); 322 252 if (err) 323 253 goto fail;
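The reworked gt_vf_post_migration_fixups() allocates a scratch buffer up front and routes every later failure through a single `out:` label that frees it. A minimal userspace sketch of that single-exit cleanup shape follows; the `demo_*` helpers are hypothetical stand-ins for the real fixup steps, and `malloc`/`free` stand in for `kmalloc`/`kfree`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the fixup steps; each returns 0 or -errno. */
static int demo_query_config(int fail)
{
	return fail ? -EIO : 0;
}

static int demo_rebase(void *scratch, int fail)
{
	(void)scratch;
	return fail ? -EINVAL : 0;
}

static int demo_live_buffers;	/* scratch buffers allocated but not yet freed */

static int demo_fixups(size_t scratch_size, int fail_query, int fail_rebase)
{
	void *buf;
	int err;

	buf = malloc(scratch_size);
	if (!buf)
		return -ENOMEM;	/* nothing to release yet, plain return is fine */
	demo_live_buffers++;

	err = demo_query_config(fail_query);
	if (err)
		goto out;	/* buffer exists now, so leave via the label */

	err = demo_rebase(buf, fail_rebase);

out:
	free(buf);
	demo_live_buffers--;
	return err;
}
```

Success and failure share the same exit, so a step added later cannot leak the buffer as long as it jumps to `out` instead of returning directly.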

+377

drivers/gpu/drm/xe/xe_sriov_vf_ccs.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include "instructions/xe_mi_commands.h" 7 + #include "instructions/xe_gpu_commands.h" 8 + #include "xe_bb.h" 9 + #include "xe_bo.h" 10 + #include "xe_device.h" 11 + #include "xe_exec_queue.h" 12 + #include "xe_exec_queue_types.h" 13 + #include "xe_guc_submit.h" 14 + #include "xe_lrc.h" 15 + #include "xe_migrate.h" 16 + #include "xe_sa.h" 17 + #include "xe_sriov_printk.h" 18 + #include "xe_sriov_vf_ccs.h" 19 + #include "xe_sriov_vf_ccs_types.h" 20 + 21 + /** 22 + * DOC: VF save/restore of compression Meta Data 23 + * 24 + * VF KMD registers two special contexts/LRCAs. 25 + * 26 + * Save Context/LRCA: contain necessary cmds+page table to trigger Meta data / 27 + * compression control surface (Aka CCS) save in regular System memory in VM. 28 + * 29 + * Restore Context/LRCA: contain necessary cmds+page table to trigger Meta data / 30 + * compression control surface (Aka CCS) Restore from regular System memory in 31 + * VM to corresponding CCS pool. 32 + * 33 + * Below diagram explain steps needed for VF save/Restore of compression Meta Data:: 34 + * 35 + * CCS Save CCS Restore VF KMD Guc BCS 36 + * LRCA LRCA 37 + * | | | | | 38 + * | | | | | 39 + * | Create Save LRCA | | | 40 + * [ ]<----------------------------- [ ] | | 41 + * | | | | | 42 + * | | | | | 43 + * | | | Register save LRCA | | 44 + * | | | with Guc | | 45 + * | | [ ]--------------------------->[ ] | 46 + * | | | | | 47 + * | | Create restore LRCA | | | 48 + * | [ ]<------------------[ ] | | 49 + * | | | | | 50 + * | | | Register restore LRCA | | 51 + * | | | with Guc | | 52 + * | | [ ]--------------------------->[ ] | 53 + * | | | | | 54 + * | | | | | 55 + * | | [ ]------------------------- | | 56 + * | | [ ] Allocate main memory. | | | 57 + * | | [ ] Allocate CCS memory. 
| | | 58 + * | | [ ] Update Main memory & | | | 59 + * [ ]<------------------------------[ ] CCS pages PPGTT + BB | | | 60 + * | [ ]<------------------[ ] cmds to save & restore.| | | 61 + * | | [ ]<------------------------ | | 62 + * | | | | | 63 + * | | | | | 64 + * | | | | | 65 + * : : : : : 66 + * ---------------------------- VF Paused ------------------------------------- 67 + * | | | | | 68 + * | | | | | 69 + * | | | |Schedule | 70 + * | | | |CCS Save | 71 + * | | | | LRCA | 72 + * | | | [ ]------>[ ] 73 + * | | | | | 74 + * | | | | | 75 + * | | | |CCS save | 76 + * | | | |completed| 77 + * | | | [ ]<------[ ] 78 + * | | | | | 79 + * : : : : : 80 + * ---------------------------- VM Migrated ----------------------------------- 81 + * | | | | | 82 + * | | | | | 83 + * : : : : : 84 + * ---------------------------- VF Resumed ------------------------------------ 85 + * | | | | | 86 + * | | | | | 87 + * | | [ ]-------------- | | 88 + * | | [ ] Fix up GGTT | | | 89 + * | | [ ]<------------- | | 90 + * | | | | | 91 + * | | | | | 92 + * | | | Notify VF_RESFIX_DONE | | 93 + * | | [ ]--------------------------->[ ] | 94 + * | | | | | 95 + * | | | |Schedule | 96 + * | | | |CCS | 97 + * | | | |Restore | 98 + * | | | |LRCA | 99 + * | | | [ ]------>[ ] 100 + * | | | | | 101 + * | | | | | 102 + * | | | |CCS | 103 + * | | | |restore | 104 + * | | | |completed| 105 + * | | | [ ]<------[ ] 106 + * | | | | | 107 + * | | | | | 108 + * | | | VF_RESFIX_DONE complete | | 109 + * | | | notification | | 110 + * | | [ ]<---------------------------[ ] | 111 + * | | | | | 112 + * | | | | | 113 + * : : : : : 114 + * ------------------------- Continue VM restore ------------------------------ 115 + */ 116 + 117 + static u64 get_ccs_bb_pool_size(struct xe_device *xe) 118 + { 119 + u64 sys_mem_size, ccs_mem_size, ptes, bb_pool_size; 120 + struct sysinfo si; 121 + 122 + si_meminfo(&si); 123 + sys_mem_size = si.totalram * si.mem_unit; 124 + ccs_mem_size = div64_u64(sys_mem_size, 
NUM_BYTES_PER_CCS_BYTE(xe)); 125 + ptes = DIV_ROUND_UP_ULL(sys_mem_size + ccs_mem_size, XE_PAGE_SIZE); 126 + 127 + /** 128 + * We need below BB size to hold PTE mappings and some DWs for copy 129 + * command. In reality, we need space for many copy commands. So, let 130 + * us allocate double the calculated size which is enough to holds GPU 131 + * instructions for the whole region. 132 + */ 133 + bb_pool_size = ptes * sizeof(u32); 134 + 135 + return round_up(bb_pool_size * 2, SZ_1M); 136 + } 137 + 138 + static int alloc_bb_pool(struct xe_tile *tile, struct xe_tile_vf_ccs *ctx) 139 + { 140 + struct xe_device *xe = tile_to_xe(tile); 141 + struct xe_sa_manager *sa_manager; 142 + u64 bb_pool_size; 143 + int offset, err; 144 + 145 + bb_pool_size = get_ccs_bb_pool_size(xe); 146 + xe_sriov_info(xe, "Allocating %s CCS BB pool size = %lldMB\n", 147 + ctx->ctx_id ? "Restore" : "Save", bb_pool_size / SZ_1M); 148 + 149 + sa_manager = xe_sa_bo_manager_init(tile, bb_pool_size, SZ_16); 150 + 151 + if (IS_ERR(sa_manager)) { 152 + xe_sriov_err(xe, "Suballocator init failed with error: %pe\n", 153 + sa_manager); 154 + err = PTR_ERR(sa_manager); 155 + return err; 156 + } 157 + 158 + offset = 0; 159 + xe_map_memset(xe, &sa_manager->bo->vmap, offset, MI_NOOP, 160 + bb_pool_size); 161 + 162 + offset = bb_pool_size - sizeof(u32); 163 + xe_map_wr(xe, &sa_manager->bo->vmap, offset, u32, MI_BATCH_BUFFER_END); 164 + 165 + ctx->mem.ccs_bb_pool = sa_manager; 166 + 167 + return 0; 168 + } 169 + 170 + static void ccs_rw_update_ring(struct xe_tile_vf_ccs *ctx) 171 + { 172 + u64 addr = xe_sa_manager_gpu_addr(ctx->mem.ccs_bb_pool); 173 + struct xe_lrc *lrc = xe_exec_queue_lrc(ctx->mig_q); 174 + u32 dw[10], i = 0; 175 + 176 + dw[i++] = MI_ARB_ON_OFF | MI_ARB_ENABLE; 177 + dw[i++] = MI_BATCH_BUFFER_START | XE_INSTR_NUM_DW(3); 178 + dw[i++] = lower_32_bits(addr); 179 + dw[i++] = upper_32_bits(addr); 180 + dw[i++] = MI_NOOP; 181 + dw[i++] = MI_NOOP; 182 + 183 + xe_lrc_write_ring(lrc, dw, i * 
sizeof(u32)); 184 + xe_lrc_set_ring_tail(lrc, lrc->ring.tail); 185 + } 186 + 187 + static int register_save_restore_context(struct xe_tile_vf_ccs *ctx) 188 + { 189 + int err = -EINVAL; 190 + int ctx_type; 191 + 192 + switch (ctx->ctx_id) { 193 + case XE_SRIOV_VF_CCS_READ_CTX: 194 + ctx_type = GUC_CONTEXT_COMPRESSION_SAVE; 195 + break; 196 + case XE_SRIOV_VF_CCS_WRITE_CTX: 197 + ctx_type = GUC_CONTEXT_COMPRESSION_RESTORE; 198 + break; 199 + default: 200 + return err; 201 + } 202 + 203 + xe_guc_register_exec_queue(ctx->mig_q, ctx_type); 204 + return 0; 205 + } 206 + 207 + /** 208 + * xe_sriov_vf_ccs_register_context - Register read/write contexts with guc. 209 + * @xe: the &xe_device to register contexts on. 210 + * 211 + * This function registers read and write contexts with Guc. Re-registration 212 + * is needed whenever resuming from pm runtime suspend. 213 + * 214 + * Return: 0 on success. Negative error code on failure. 215 + */ 216 + int xe_sriov_vf_ccs_register_context(struct xe_device *xe) 217 + { 218 + struct xe_tile *tile = xe_device_get_root_tile(xe); 219 + enum xe_sriov_vf_ccs_rw_ctxs ctx_id; 220 + struct xe_tile_vf_ccs *ctx; 221 + int err; 222 + 223 + if (!IS_VF_CCS_READY(xe)) 224 + return 0; 225 + 226 + for_each_ccs_rw_ctx(ctx_id) { 227 + ctx = &tile->sriov.vf.ccs[ctx_id]; 228 + err = register_save_restore_context(ctx); 229 + if (err) 230 + return err; 231 + } 232 + 233 + return err; 234 + } 235 + 236 + static void xe_sriov_vf_ccs_fini(void *arg) 237 + { 238 + struct xe_tile_vf_ccs *ctx = arg; 239 + struct xe_lrc *lrc = xe_exec_queue_lrc(ctx->mig_q); 240 + 241 + /* 242 + * Make TAIL = HEAD in the ring so that no issues are seen if Guc 243 + * submits this context to HW on VF pause after unbinding device. 244 + */ 245 + xe_lrc_set_ring_tail(lrc, xe_lrc_ring_head(lrc)); 246 + xe_exec_queue_put(ctx->mig_q); 247 + } 248 + 249 + /** 250 + * xe_sriov_vf_ccs_init - Setup LRCA for save & restore. 
251 + * @xe: the &xe_device to start recovery on 252 + * 253 + * This function shall be called only by VF. It initializes 254 + * LRCA and suballocator needed for CCS save & restore. 255 + * 256 + * Return: 0 on success. Negative error code on failure. 257 + */ 258 + int xe_sriov_vf_ccs_init(struct xe_device *xe) 259 + { 260 + struct xe_tile *tile = xe_device_get_root_tile(xe); 261 + enum xe_sriov_vf_ccs_rw_ctxs ctx_id; 262 + struct xe_tile_vf_ccs *ctx; 263 + struct xe_exec_queue *q; 264 + u32 flags; 265 + int err; 266 + 267 + xe_assert(xe, IS_SRIOV_VF(xe)); 268 + xe_assert(xe, !IS_DGFX(xe)); 269 + xe_assert(xe, xe_device_has_flat_ccs(xe)); 270 + 271 + for_each_ccs_rw_ctx(ctx_id) { 272 + ctx = &tile->sriov.vf.ccs[ctx_id]; 273 + ctx->ctx_id = ctx_id; 274 + 275 + flags = EXEC_QUEUE_FLAG_KERNEL | 276 + EXEC_QUEUE_FLAG_PERMANENT | 277 + EXEC_QUEUE_FLAG_MIGRATE; 278 + q = xe_exec_queue_create_bind(xe, tile, flags, 0); 279 + if (IS_ERR(q)) { 280 + err = PTR_ERR(q); 281 + goto err_ret; 282 + } 283 + ctx->mig_q = q; 284 + 285 + err = alloc_bb_pool(tile, ctx); 286 + if (err) 287 + goto err_free_queue; 288 + 289 + ccs_rw_update_ring(ctx); 290 + 291 + err = register_save_restore_context(ctx); 292 + if (err) 293 + goto err_free_queue; 294 + 295 + err = devm_add_action_or_reset(xe->drm.dev, 296 + xe_sriov_vf_ccs_fini, 297 + ctx); 298 + if (err) 299 + goto err_ret; 300 + } 301 + 302 + xe->sriov.vf.ccs.initialized = 1; 303 + 304 + return 0; 305 + 306 + err_free_queue: 307 + xe_exec_queue_put(q); 308 + 309 + err_ret: 310 + return err; 311 + } 312 + 313 + /** 314 + * xe_sriov_vf_ccs_attach_bo - Insert CCS read write commands in the BO. 315 + * @bo: the &buffer object to which batch buffer commands will be added. 316 + * 317 + * This function shall be called only by VF. It inserts the PTEs and copy 318 + * command instructions in the BO by calling xe_migrate_ccs_rw_copy() 319 + * function. 320 + * 321 + * Returns: 0 if successful, negative error code on failure. 
322 + */ 323 + int xe_sriov_vf_ccs_attach_bo(struct xe_bo *bo) 324 + { 325 + struct xe_device *xe = xe_bo_device(bo); 326 + enum xe_sriov_vf_ccs_rw_ctxs ctx_id; 327 + struct xe_tile_vf_ccs *ctx; 328 + struct xe_tile *tile; 329 + struct xe_bb *bb; 330 + int err = 0; 331 + 332 + if (!IS_VF_CCS_READY(xe)) 333 + return 0; 334 + 335 + tile = xe_device_get_root_tile(xe); 336 + 337 + for_each_ccs_rw_ctx(ctx_id) { 338 + bb = bo->bb_ccs[ctx_id]; 339 + /* bb should be NULL here. Assert if not NULL */ 340 + xe_assert(xe, !bb); 341 + 342 + ctx = &tile->sriov.vf.ccs[ctx_id]; 343 + err = xe_migrate_ccs_rw_copy(tile, ctx->mig_q, bo, ctx_id); 344 + } 345 + return err; 346 + } 347 + 348 + /** 349 + * xe_sriov_vf_ccs_detach_bo - Remove CCS read write commands from the BO. 350 + * @bo: the &buffer object from which batch buffer commands will be removed. 351 + * 352 + * This function shall be called only by VF. It removes the PTEs and copy 353 + * command instructions from the BO. Make sure to update the BB with MI_NOOP 354 + * before freeing. 355 + * 356 + * Returns: 0 if successful. 357 + */ 358 + int xe_sriov_vf_ccs_detach_bo(struct xe_bo *bo) 359 + { 360 + struct xe_device *xe = xe_bo_device(bo); 361 + enum xe_sriov_vf_ccs_rw_ctxs ctx_id; 362 + struct xe_bb *bb; 363 + 364 + if (!IS_VF_CCS_READY(xe)) 365 + return 0; 366 + 367 + for_each_ccs_rw_ctx(ctx_id) { 368 + bb = bo->bb_ccs[ctx_id]; 369 + if (!bb) 370 + continue; 371 + 372 + memset(bb->cs, MI_NOOP, bb->len * sizeof(u32)); 373 + xe_bb_free(bb, NULL); 374 + bo->bb_ccs[ctx_id] = NULL; 375 + } 376 + return 0; 377 + }
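The sizing in get_ccs_bb_pool_size() is easier to follow with numbers. The sketch below repeats the same arithmetic in plain C, with the system memory size and the main-memory-to-CCS ratio passed in as parameters. The 4 KiB page size and the ratio of 512 used in the example are assumptions made for illustration; the driver takes them from XE_PAGE_SIZE and NUM_BYTES_PER_CCS_BYTE(xe).

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ull		/* assumed page size */
#define DEMO_SZ_1M     (1ull << 20)

/*
 * Same steps as get_ccs_bb_pool_size(), with the inputs made explicit:
 * CCS size, PTE count, bytes for the PTEs, doubling, round-up to 1 MiB.
 */
static uint64_t demo_ccs_bb_pool_size(uint64_t sys_mem_size,
				      uint64_t bytes_per_ccs_byte)
{
	uint64_t ccs_mem_size = sys_mem_size / bytes_per_ccs_byte;
	/* One PTE per page of main memory plus CCS memory, rounded up. */
	uint64_t ptes = (sys_mem_size + ccs_mem_size + DEMO_PAGE_SIZE - 1) /
			DEMO_PAGE_SIZE;
	uint64_t bb_pool_size = ptes * sizeof(uint32_t);

	/* Double for the copy commands, then round up to the next 1 MiB. */
	return (bb_pool_size * 2 + DEMO_SZ_1M - 1) / DEMO_SZ_1M * DEMO_SZ_1M;
}
```

For 16 GiB of system memory and a ratio of 512: the CCS area is 32 MiB, the PTE count is 4194304 + 8192 = 4202496, the PTEs take 16809984 bytes, doubling gives 33619968 bytes (32 MiB plus 64 KiB), and rounding up yields 33 MiB per pool. The driver allocates one such pool for each of the save and restore contexts.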

+17

drivers/gpu/drm/xe/xe_sriov_vf_ccs.h

···
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_VF_CCS_H_
+#define _XE_SRIOV_VF_CCS_H_
+
+struct xe_device;
+struct xe_bo;
+
+int xe_sriov_vf_ccs_init(struct xe_device *xe);
+int xe_sriov_vf_ccs_attach_bo(struct xe_bo *bo);
+int xe_sriov_vf_ccs_detach_bo(struct xe_bo *bo);
+int xe_sriov_vf_ccs_register_context(struct xe_device *xe);
+
+#endif

+53

drivers/gpu/drm/xe/xe_sriov_vf_ccs_types.h

···
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_VF_CCS_TYPES_H_
+#define _XE_SRIOV_VF_CCS_TYPES_H_
+
+#define for_each_ccs_rw_ctx(id__) \
+	for ((id__) = 0; (id__) < XE_SRIOV_VF_CCS_CTX_COUNT; (id__)++)
+
+#define IS_VF_CCS_READY(xe) ({ \
+		struct xe_device *___xe = (xe); \
+		xe_assert(___xe, IS_SRIOV_VF(___xe)); \
+		___xe->sriov.vf.ccs.initialized; \
+		})
+
+#define IS_VF_CCS_INIT_NEEDED(xe) ({\
+		struct xe_device *___xe = (xe); \
+		IS_SRIOV_VF(___xe) && !IS_DGFX(___xe) && \
+		xe_device_has_flat_ccs(___xe) && GRAPHICS_VER(___xe) >= 20; \
+		})
+
+enum xe_sriov_vf_ccs_rw_ctxs {
+	XE_SRIOV_VF_CCS_READ_CTX,
+	XE_SRIOV_VF_CCS_WRITE_CTX,
+	XE_SRIOV_VF_CCS_CTX_COUNT
+};
+
+#define IS_VF_CCS_BB_VALID(xe, bo) ({ \
+		struct xe_device *___xe = (xe); \
+		struct xe_bo *___bo = (bo); \
+		IS_SRIOV_VF(___xe) && \
+		___bo->bb_ccs[XE_SRIOV_VF_CCS_READ_CTX] && \
+		___bo->bb_ccs[XE_SRIOV_VF_CCS_WRITE_CTX]; \
+		})
+
+struct xe_migrate;
+struct xe_sa_manager;
+
+struct xe_tile_vf_ccs {
+	/** @id: Id to which context it belongs to */
+	enum xe_sriov_vf_ccs_rw_ctxs ctx_id;
+	/** @mig_q: exec queues used for migration */
+	struct xe_exec_queue *mig_q;
+
+	struct {
+		/** @ccs_bb_pool: Pool from which batch buffers are allocated. */
+		struct xe_sa_manager *ccs_bb_pool;
+	} mem;
+};
+
+#endif
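The types header pairs an enum that ends in a `_COUNT` sentinel with a `for_each_ccs_rw_ctx()` macro, so callers visit the save and restore contexts without hard-coding how many there are. A minimal sketch of that idiom, with hypothetical `demo_*` names standing in for the driver's:

```c
#include <assert.h>

enum demo_ccs_ctx {
	DEMO_CCS_READ_CTX,
	DEMO_CCS_WRITE_CTX,
	DEMO_CCS_CTX_COUNT	/* sentinel: equals the number of real entries */
};

/* Same shape as for_each_ccs_rw_ctx(): walk every id below the sentinel. */
#define demo_for_each_ccs_ctx(id__) \
	for ((id__) = 0; (id__) < DEMO_CCS_CTX_COUNT; (id__)++)

struct demo_ctx {
	enum demo_ccs_ctx ctx_id;
	int registered;
};

/* Marks every context as registered and returns how many were visited. */
static int demo_register_all(struct demo_ctx ctxs[DEMO_CCS_CTX_COUNT])
{
	enum demo_ccs_ctx id;
	int visited = 0;

	demo_for_each_ccs_ctx(id) {
		ctxs[id].ctx_id = id;
		ctxs[id].registered = 1;
		visited++;
	}

	return visited;
}
```

Adding a third context type would only require a new enumerator before the sentinel; arrays sized by the sentinel and loops built on the macro follow automatically.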

+6

drivers/gpu/drm/xe/xe_sriov_vf_types.h

···
 		/** @migration.gt_flags: Per-GT request flags for VF migration recovery */
 		unsigned long gt_flags;
 	} migration;
+
+	/** @ccs: VF CCS state data */
+	struct {
+		/** @ccs.initialized: Initilalization of VF CCS is completed or not */
+		bool initialized;
+	} ccs;
 };
 
 #endif

+135 -34

drivers/gpu/drm/xe/xe_survivability_mode.c

··· 22 22 #define MAX_SCRATCH_MMIO 8 23 23 24 24 /** 25 - * DOC: Xe Boot Survivability 25 + * DOC: Survivability Mode 26 26 * 27 - * Boot Survivability is a software based workflow for recovering a system in a failed boot state 27 + * Survivability Mode is a software based workflow for recovering a system in a failed boot state 28 28 * Here system recoverability is concerned with recovering the firmware responsible for boot. 29 29 * 30 - * This is implemented by loading the driver with bare minimum (no drm card) to allow the firmware 31 - * to be flashed through mei and collect telemetry. The driver's probe flow is modified 32 - * such that it enters survivability mode when pcode initialization is incomplete and boot status 33 - * denotes a failure. 30 + * Boot Survivability 31 + * =================== 32 + * 33 + * Boot Survivability is implemented by loading the driver with bare minimum (no drm card) to allow 34 + * the firmware to be flashed through mei driver and collect telemetry. The driver's probe flow is 35 + * modified such that it enters survivability mode when pcode initialization is incomplete and boot 36 + * status denotes a failure. 34 37 * 35 38 * Survivability mode can also be entered manually using the survivability mode attribute available 36 39 * through configfs which is beneficial in several usecases. 
It can be used to address scenarios ··· 49 46 * Survivability mode is indicated by the below admin-only readable sysfs which provides additional 50 47 * debug information:: 51 48 * 52 - * /sys/bus/pci/devices/<device>/surivability_mode 49 + * /sys/bus/pci/devices/<device>/survivability_mode 53 50 * 54 51 * Capability Information: 55 52 * Provides boot status ··· 59 56 * Provides history of previous failures 60 57 * Auxiliary Information 61 58 * Certain failures may have information in addition to postcode information 59 + * 60 + * Runtime Survivability 61 + * ===================== 62 + * 63 + * Certain runtime firmware errors can cause the device to enter a wedged state 64 + * (:ref:`xe-device-wedging`) requiring a firmware flash to restore normal operation. 65 + * Runtime Survivability Mode indicates that a firmware flash is necessary to recover the device and 66 + * is indicated by the presence of survivability mode sysfs:: 67 + * 68 + * /sys/bus/pci/devices/<device>/survivability_mode 69 + * 70 + * Survivability mode sysfs provides information about the type of survivability mode. 71 + * 72 + * When such errors occur, userspace is notified with the drm device wedged uevent and runtime 73 + * survivability mode. User can then initiate a firmware flash using userspace tools like fwupd 74 + * to restore device to normal operation. 
62 75 */ 63 76 64 77 static u32 aux_history_offset(u32 reg_value) ··· 140 121 } 141 122 } 142 123 124 + static int check_boot_failure(struct xe_device *xe) 125 + { 126 + struct xe_survivability *survivability = &xe->survivability; 127 + 128 + return survivability->boot_status == NON_CRITICAL_FAILURE || 129 + survivability->boot_status == CRITICAL_FAILURE; 130 + } 131 + 143 132 static ssize_t survivability_mode_show(struct device *dev, 144 133 struct device_attribute *attr, char *buff) 145 134 { ··· 156 129 struct xe_survivability *survivability = &xe->survivability; 157 130 struct xe_survivability_info *info = survivability->info; 158 131 int index = 0, count = 0; 132 + 133 + count += sysfs_emit_at(buff, count, "Survivability mode type: %s\n", 134 + survivability->type ? "Runtime" : "Boot"); 135 + 136 + if (!check_boot_failure(xe)) 137 + return count; 159 138 160 139 for (index = 0; index < MAX_SCRATCH_MMIO; index++) { 161 140 if (info[index].reg) ··· 184 151 sysfs_remove_file(&dev->kobj, &dev_attr_survivability_mode.attr); 185 152 } 186 153 187 - static int enable_survivability_mode(struct pci_dev *pdev) 154 + static int create_survivability_sysfs(struct pci_dev *pdev) 188 155 { 189 156 struct device *dev = &pdev->dev; 190 157 struct xe_device *xe = pdev_to_xe_device(pdev); 191 - struct xe_survivability *survivability = &xe->survivability; 192 - int ret = 0; 158 + int ret; 193 159 194 160 /* create survivability mode sysfs */ 195 161 ret = sysfs_create_file(&dev->kobj, &dev_attr_survivability_mode.attr); ··· 199 167 200 168 ret = devm_add_action_or_reset(xe->drm.dev, 201 169 xe_survivability_mode_fini, xe); 170 + if (ret) 171 + return ret; 172 + 173 + return 0; 174 + } 175 + 176 + static int enable_boot_survivability_mode(struct pci_dev *pdev) 177 + { 178 + struct device *dev = &pdev->dev; 179 + struct xe_device *xe = pdev_to_xe_device(pdev); 180 + struct xe_survivability *survivability = &xe->survivability; 181 + int ret = 0; 182 + 183 + ret = 
create_survivability_sysfs(pdev); 202 184 if (ret) 203 185 return ret; 204 186 ··· 238 192 return ret; 239 193 } 240 194 195 + static int init_survivability_mode(struct xe_device *xe) 196 + { 197 + struct xe_survivability *survivability = &xe->survivability; 198 + struct xe_survivability_info *info; 199 + 200 + survivability->size = MAX_SCRATCH_MMIO; 201 + 202 + info = devm_kcalloc(xe->drm.dev, survivability->size, sizeof(*info), 203 + GFP_KERNEL); 204 + if (!info) 205 + return -ENOMEM; 206 + 207 + survivability->info = info; 208 + 209 + populate_survivability_info(xe); 210 + 211 + return 0; 212 + } 213 + 241 214 /** 242 - * xe_survivability_mode_is_enabled - check if survivability mode is enabled 215 + * xe_survivability_mode_is_boot_enabled- check if boot survivability mode is enabled 243 216 * @xe: xe device instance 244 217 * 245 - * Returns true if in survivability mode, false otherwise 218 + * Returns true if in boot survivability mode of type, else false 246 219 */ 247 - bool xe_survivability_mode_is_enabled(struct xe_device *xe) 220 + bool xe_survivability_mode_is_boot_enabled(struct xe_device *xe) 248 221 { 249 - return xe->survivability.mode; 222 + struct xe_survivability *survivability = &xe->survivability; 223 + 224 + return survivability->mode && survivability->type == XE_SURVIVABILITY_TYPE_BOOT; 250 225 } 251 226 252 227 /** ··· 308 241 data = xe_mmio_read32(mmio, PCODE_SCRATCH(0)); 309 242 survivability->boot_status = REG_FIELD_GET(BOOT_STATUS, data); 310 243 311 - return survivability->boot_status == NON_CRITICAL_FAILURE || 312 - survivability->boot_status == CRITICAL_FAILURE; 244 + return check_boot_failure(xe); 313 245 } 314 246 315 247 /** 316 - * xe_survivability_mode_enable - Initialize and enable the survivability mode 248 + * xe_survivability_mode_runtime_enable - Initialize and enable runtime survivability mode 317 249 * @xe: xe device instance 318 250 * 319 - * Initialize survivability information and enable survivability mode 251 + * 
Initialize survivability information and enable runtime survivability mode. 252 + * Runtime survivability mode is enabled when certain errors cause the device to be 253 + * in non-recoverable state. The device is declared wedged with the appropriate 254 + * recovery method and survivability mode sysfs exposed to userspace 320 255 * 321 - * Return: 0 if survivability mode is enabled or not requested; negative error 322 - * code otherwise. 256 + * Return: 0 if runtime survivability mode is enabled, negative error code otherwise. 323 257 */ 324 - int xe_survivability_mode_enable(struct xe_device *xe) 258 + int xe_survivability_mode_runtime_enable(struct xe_device *xe) 325 259 { 326 260 struct xe_survivability *survivability = &xe->survivability; 327 - struct xe_survivability_info *info; 328 261 struct pci_dev *pdev = to_pci_dev(xe->drm.dev); 262 + int ret; 263 + 264 + if (!IS_DGFX(xe) || IS_SRIOV_VF(xe) || xe->info.platform < XE_BATTLEMAGE) { 265 + dev_err(&pdev->dev, "Runtime Survivability Mode not supported\n"); 266 + return -EINVAL; 267 + } 268 + 269 + ret = init_survivability_mode(xe); 270 + if (ret) 271 + return ret; 272 + 273 + ret = create_survivability_sysfs(pdev); 274 + if (ret) 275 + dev_err(&pdev->dev, "Failed to create survivability mode sysfs\n"); 276 + 277 + survivability->type = XE_SURVIVABILITY_TYPE_RUNTIME; 278 + dev_err(&pdev->dev, "Runtime Survivability mode enabled\n"); 279 + 280 + xe_device_set_wedged_method(xe, DRM_WEDGE_RECOVERY_VENDOR); 281 + xe_device_declare_wedged(xe); 282 + dev_err(&pdev->dev, "Firmware flash required, Please refer to the userspace documentation for more details!\n"); 283 + 284 + return 0; 285 + } 286 + 287 + /** 288 + * xe_survivability_mode_boot_enable - Initialize and enable boot survivability mode 289 + * @xe: xe device instance 290 + * 291 + * Initialize survivability information and enable boot survivability mode 292 + * 293 + * Return: 0 if boot survivability mode is enabled or not requested, negative error 294 + * 
code otherwise. 295 + */ 296 + int xe_survivability_mode_boot_enable(struct xe_device *xe) 297 + { 298 + struct xe_survivability *survivability = &xe->survivability; 299 + struct pci_dev *pdev = to_pci_dev(xe->drm.dev); 300 + int ret; 329 301 330 302 if (!xe_survivability_mode_is_requested(xe)) 331 303 return 0; 332 304 333 - survivability->size = MAX_SCRATCH_MMIO; 305 + ret = init_survivability_mode(xe); 306 + if (ret) 307 + return ret; 334 308 335 - info = devm_kcalloc(xe->drm.dev, survivability->size, sizeof(*info), 336 - GFP_KERNEL); 337 - if (!info) 338 - return -ENOMEM; 339 - 340 - survivability->info = info; 341 - 342 - populate_survivability_info(xe); 343 - 344 - /* Only log debug information and exit if it is a critical failure */ 309 + /* Log breadcrumbs but do not enter survivability mode for Critical boot errors */ 345 310 if (survivability->boot_status == CRITICAL_FAILURE) { 346 311 log_survivability_info(pdev); 347 312 return -ENXIO; 348 313 } 349 314 350 - return enable_survivability_mode(pdev); 315 + survivability->type = XE_SURVIVABILITY_TYPE_BOOT; 316 + 317 + return enable_boot_survivability_mode(pdev); 351 318 }
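
Note on the new sysfs output: with this change, survivability_mode_show() always emits a first line of the form "Survivability mode type: Runtime" or "Survivability mode type: Boot", and only prints the register breadcrumbs when a boot failure was recorded. A userspace tool could classify the mode from that first line alone. The following is a minimal sketch under that assumption; the enum and the function name parse_survivability_type() are hypothetical and not part of the driver or of any existing tool.

```c
#include <string.h>

/* Hypothetical userspace-side enum mirroring the two strings the sysfs file prints. */
enum surv_type { SURV_UNKNOWN = -1, SURV_BOOT = 0, SURV_RUNTIME = 1 };

/*
 * Classify the contents of the survivability_mode sysfs file by its
 * first line. Any later lines (register breadcrumbs) are ignored.
 */
enum surv_type parse_survivability_type(const char *buf)
{
	static const char prefix[] = "Survivability mode type: ";
	const char *val;
	size_t len;

	if (!buf || strncmp(buf, prefix, sizeof(prefix) - 1) != 0)
		return SURV_UNKNOWN;

	val = buf + sizeof(prefix) - 1;
	len = strcspn(val, "\n");	/* length of the value up to end of line */

	if (len == 4 && strncmp(val, "Boot", 4) == 0)
		return SURV_BOOT;
	if (len == 7 && strncmp(val, "Runtime", 7) == 0)
		return SURV_RUNTIME;
	return SURV_UNKNOWN;
}
```

A caller would read /sys/bus/pci/devices/<device>/survivability_mode into a buffer first. Per the comment in the diff the file is admin-only readable, so the read itself may fail for unprivileged users.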

+3 -2

drivers/gpu/drm/xe/xe_survivability_mode.h

···

 struct xe_device;

-int xe_survivability_mode_enable(struct xe_device *xe);
-bool xe_survivability_mode_is_enabled(struct xe_device *xe);
+int xe_survivability_mode_boot_enable(struct xe_device *xe);
+int xe_survivability_mode_runtime_enable(struct xe_device *xe);
+bool xe_survivability_mode_is_boot_enabled(struct xe_device *xe);
 bool xe_survivability_mode_is_requested(struct xe_device *xe);

 #endif /* _XE_SURVIVABILITY_MODE_H_ */

+8

drivers/gpu/drm/xe/xe_survivability_mode_types.h

···
 #include <linux/limits.h>
 #include <linux/types.h>

+enum xe_survivability_type {
+	XE_SURVIVABILITY_TYPE_BOOT,
+	XE_SURVIVABILITY_TYPE_RUNTIME,
+};
+
 struct xe_survivability_info {
 	char name[NAME_MAX];
 	u32 reg;
···

 	/** @mode: boolean to indicate survivability mode */
 	bool mode;
+
+	/** @type: survivability type */
+	enum xe_survivability_type type;
 };

 #endif /* _XE_SURVIVABILITY_MODE_TYPES_H_ */

+291 -78

drivers/gpu/drm/xe/xe_svm.c

··· 7 7 8 8 #include "xe_bo.h" 9 9 #include "xe_gt_stats.h" 10 - #include "xe_gt_tlb_invalidation.h" 11 10 #include "xe_migrate.h" 12 11 #include "xe_module.h" 13 12 #include "xe_pm.h" ··· 16 17 #include "xe_ttm_vram_mgr.h" 17 18 #include "xe_vm.h" 18 19 #include "xe_vm_types.h" 20 + #include "xe_vram_types.h" 19 21 20 22 static bool xe_svm_range_in_vram(struct xe_svm_range *range) 21 23 { ··· 224 224 225 225 xe_device_wmb(xe); 226 226 227 - err = xe_vm_range_tilemask_tlb_invalidation(vm, adj_start, adj_end, tile_mask); 227 + err = xe_vm_range_tilemask_tlb_inval(vm, adj_start, adj_end, tile_mask); 228 228 WARN_ON_ONCE(err); 229 229 230 230 range_notifier_event_end: ··· 252 252 return 0; 253 253 } 254 254 255 + static int xe_svm_range_set_default_attr(struct xe_vm *vm, u64 range_start, u64 range_end) 256 + { 257 + struct xe_vma *vma; 258 + struct xe_vma_mem_attr default_attr = { 259 + .preferred_loc = { 260 + .devmem_fd = DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE, 261 + .migration_policy = DRM_XE_MIGRATE_ALL_PAGES, 262 + }, 263 + .atomic_access = DRM_XE_ATOMIC_UNDEFINED, 264 + }; 265 + int err = 0; 266 + 267 + vma = xe_vm_find_vma_by_addr(vm, range_start); 268 + if (!vma) 269 + return -EINVAL; 270 + 271 + if (xe_vma_has_default_mem_attrs(vma)) 272 + return 0; 273 + 274 + vm_dbg(&vm->xe->drm, "Existing VMA start=0x%016llx, vma_end=0x%016llx", 275 + xe_vma_start(vma), xe_vma_end(vma)); 276 + 277 + if (xe_vma_start(vma) == range_start && xe_vma_end(vma) == range_end) { 278 + default_attr.pat_index = vma->attr.default_pat_index; 279 + default_attr.default_pat_index = vma->attr.default_pat_index; 280 + vma->attr = default_attr; 281 + } else { 282 + vm_dbg(&vm->xe->drm, "Split VMA start=0x%016llx, vma_end=0x%016llx", 283 + range_start, range_end); 284 + err = xe_vm_alloc_cpu_addr_mirror_vma(vm, range_start, range_end - range_start); 285 + if (err) { 286 + drm_warn(&vm->xe->drm, "VMA SPLIT failed: %pe\n", ERR_PTR(err)); 287 + xe_vm_kill(vm, true); 288 + return err; 289 + } 290 + 
} 291 + 292 + /* 293 + * On call from xe_svm_handle_pagefault original VMA might be changed 294 + * signal this to lookup for VMA again. 295 + */ 296 + return -EAGAIN; 297 + } 298 + 255 299 static int xe_svm_garbage_collector(struct xe_vm *vm) 256 300 { 257 301 struct xe_svm_range *range; 258 - int err; 302 + u64 range_start; 303 + u64 range_end; 304 + int err, ret = 0; 259 305 260 306 lockdep_assert_held_write(&vm->lock); 261 307 ··· 316 270 if (!range) 317 271 break; 318 272 273 + range_start = xe_svm_range_start(range); 274 + range_end = xe_svm_range_end(range); 275 + 319 276 list_del(&range->garbage_collector_link); 320 277 spin_unlock(&vm->svm.garbage_collector.lock); 321 278 ··· 331 282 return err; 332 283 } 333 284 285 + err = xe_svm_range_set_default_attr(vm, range_start, range_end); 286 + if (err) { 287 + if (err == -EAGAIN) 288 + ret = -EAGAIN; 289 + else 290 + return err; 291 + } 292 + 334 293 spin_lock(&vm->svm.garbage_collector.lock); 335 294 } 336 295 spin_unlock(&vm->svm.garbage_collector.lock); 337 296 338 - return 0; 297 + return ret; 339 298 } 340 299 341 300 static void xe_svm_garbage_collector_work_func(struct work_struct *w) ··· 363 306 return container_of(page_pgmap(page), struct xe_vram_region, pagemap); 364 307 } 365 308 366 - static struct xe_tile *vr_to_tile(struct xe_vram_region *vr) 367 - { 368 - return container_of(vr, struct xe_tile, mem.vram); 369 - } 370 - 371 309 static u64 xe_vram_region_page_to_dpa(struct xe_vram_region *vr, 372 310 struct page *page) 373 311 { 374 312 u64 dpa; 375 - struct xe_tile *tile = vr_to_tile(vr); 376 313 u64 pfn = page_to_pfn(page); 377 314 u64 offset; 378 315 379 - xe_tile_assert(tile, is_device_private_page(page)); 380 - xe_tile_assert(tile, (pfn << PAGE_SHIFT) >= vr->hpa_base); 316 + xe_assert(vr->xe, is_device_private_page(page)); 317 + xe_assert(vr->xe, (pfn << PAGE_SHIFT) >= vr->hpa_base); 381 318 382 319 offset = (pfn << PAGE_SHIFT) - vr->hpa_base; 383 320 dpa = vr->dpa_base + offset; ··· 384 333 
XE_SVM_COPY_TO_SRAM, 385 334 }; 386 335 387 - static int xe_svm_copy(struct page **pages, dma_addr_t *dma_addr, 336 + static int xe_svm_copy(struct page **pages, 337 + struct drm_pagemap_addr *pagemap_addr, 388 338 unsigned long npages, const enum xe_svm_copy_dir dir) 389 339 { 390 340 struct xe_vram_region *vr = NULL; 391 - struct xe_tile *tile; 341 + struct xe_device *xe; 392 342 struct dma_fence *fence = NULL; 393 343 unsigned long i; 394 344 #define XE_VRAM_ADDR_INVALID ~0x0ull ··· 417 365 last = (i + 1) == npages; 418 366 419 367 /* No CPU page and no device pages queue'd to copy */ 420 - if (!dma_addr[i] && vram_addr == XE_VRAM_ADDR_INVALID) 368 + if (!pagemap_addr[i].addr && vram_addr == XE_VRAM_ADDR_INVALID) 421 369 continue; 422 370 423 371 if (!vr && spage) { 424 372 vr = page_to_vr(spage); 425 - tile = vr_to_tile(vr); 373 + xe = vr->xe; 426 374 } 427 375 XE_WARN_ON(spage && page_to_vr(spage) != vr); 428 376 ··· 431 379 * first device page, check if physical contiguous on subsequent 432 380 * device pages. 
433 381 */ 434 - if (dma_addr[i] && spage) { 382 + if (pagemap_addr[i].addr && spage) { 435 383 __vram_addr = xe_vram_region_page_to_dpa(vr, spage); 436 384 if (vram_addr == XE_VRAM_ADDR_INVALID) { 437 385 vram_addr = __vram_addr; ··· 439 387 } 440 388 441 389 match = vram_addr + PAGE_SIZE * (i - pos) == __vram_addr; 390 + /* Expected with contiguous memory */ 391 + xe_assert(vr->xe, match); 392 + 393 + if (pagemap_addr[i].order) { 394 + i += NR_PAGES(pagemap_addr[i].order) - 1; 395 + chunk = (i - pos) == (XE_MIGRATE_CHUNK_SIZE / PAGE_SIZE); 396 + last = (i + 1) == npages; 397 + } 442 398 } 443 399 444 400 /* ··· 462 402 463 403 if (vram_addr != XE_VRAM_ADDR_INVALID) { 464 404 if (sram) { 465 - vm_dbg(&tile->xe->drm, 405 + vm_dbg(&xe->drm, 466 406 "COPY TO SRAM - 0x%016llx -> 0x%016llx, NPAGES=%ld", 467 - vram_addr, (u64)dma_addr[pos], i - pos + incr); 468 - __fence = xe_migrate_from_vram(tile->migrate, 407 + vram_addr, 408 + (u64)pagemap_addr[pos].addr, i - pos + incr); 409 + __fence = xe_migrate_from_vram(vr->migrate, 469 410 i - pos + incr, 470 411 vram_addr, 471 - dma_addr + pos); 412 + &pagemap_addr[pos]); 472 413 } else { 473 - vm_dbg(&tile->xe->drm, 414 + vm_dbg(&xe->drm, 474 415 "COPY TO VRAM - 0x%016llx -> 0x%016llx, NPAGES=%ld", 475 - (u64)dma_addr[pos], vram_addr, i - pos + incr); 476 - __fence = xe_migrate_to_vram(tile->migrate, 416 + (u64)pagemap_addr[pos].addr, vram_addr, 417 + i - pos + incr); 418 + __fence = xe_migrate_to_vram(vr->migrate, 477 419 i - pos + incr, 478 - dma_addr + pos, 420 + &pagemap_addr[pos], 479 421 vram_addr); 480 422 } 481 423 if (IS_ERR(__fence)) { ··· 490 428 } 491 429 492 430 /* Setup physical address of next device page */ 493 - if (dma_addr[i] && spage) { 431 + if (pagemap_addr[i].addr && spage) { 494 432 vram_addr = __vram_addr; 495 433 pos = i; 496 434 } else { ··· 500 438 /* Extra mismatched device page, copy it */ 501 439 if (!match && last && vram_addr != XE_VRAM_ADDR_INVALID) { 502 440 if (sram) { 503 - 
vm_dbg(&tile->xe->drm, 441 + vm_dbg(&xe->drm, 504 442 "COPY TO SRAM - 0x%016llx -> 0x%016llx, NPAGES=%d", 505 - vram_addr, (u64)dma_addr[pos], 1); 506 - __fence = xe_migrate_from_vram(tile->migrate, 1, 443 + vram_addr, (u64)pagemap_addr[pos].addr, 1); 444 + __fence = xe_migrate_from_vram(vr->migrate, 1, 507 445 vram_addr, 508 - dma_addr + pos); 446 + &pagemap_addr[pos]); 509 447 } else { 510 - vm_dbg(&tile->xe->drm, 448 + vm_dbg(&xe->drm, 511 449 "COPY TO VRAM - 0x%016llx -> 0x%016llx, NPAGES=%d", 512 - (u64)dma_addr[pos], vram_addr, 1); 513 - __fence = xe_migrate_to_vram(tile->migrate, 1, 514 - dma_addr + pos, 450 + (u64)pagemap_addr[pos].addr, vram_addr, 1); 451 + __fence = xe_migrate_to_vram(vr->migrate, 1, 452 + &pagemap_addr[pos], 515 453 vram_addr); 516 454 } 517 455 if (IS_ERR(__fence)) { ··· 537 475 #undef XE_VRAM_ADDR_INVALID 538 476 } 539 477 540 - static int xe_svm_copy_to_devmem(struct page **pages, dma_addr_t *dma_addr, 478 + static int xe_svm_copy_to_devmem(struct page **pages, 479 + struct drm_pagemap_addr *pagemap_addr, 541 480 unsigned long npages) 542 481 { 543 - return xe_svm_copy(pages, dma_addr, npages, XE_SVM_COPY_TO_VRAM); 482 + return xe_svm_copy(pages, pagemap_addr, npages, XE_SVM_COPY_TO_VRAM); 544 483 } 545 484 546 - static int xe_svm_copy_to_ram(struct page **pages, dma_addr_t *dma_addr, 485 + static int xe_svm_copy_to_ram(struct page **pages, 486 + struct drm_pagemap_addr *pagemap_addr, 547 487 unsigned long npages) 548 488 { 549 - return xe_svm_copy(pages, dma_addr, npages, XE_SVM_COPY_TO_SRAM); 489 + return xe_svm_copy(pages, pagemap_addr, npages, XE_SVM_COPY_TO_SRAM); 550 490 } 551 491 552 492 static struct xe_bo *to_xe_bo(struct drm_pagemap_devmem *devmem_allocation) ··· 570 506 return PHYS_PFN(offset + vr->hpa_base); 571 507 } 572 508 573 - static struct drm_buddy *tile_to_buddy(struct xe_tile *tile) 509 + static struct drm_buddy *vram_to_buddy(struct xe_vram_region *vram) 574 510 { 575 - return &tile->mem.vram.ttm.mm; 511 + return 
&vram->ttm.mm; 576 512 } 577 513 578 514 static int xe_svm_populate_devmem_pfn(struct drm_pagemap_devmem *devmem_allocation, ··· 586 522 587 523 list_for_each_entry(block, blocks, link) { 588 524 struct xe_vram_region *vr = block->private; 589 - struct xe_tile *tile = vr_to_tile(vr); 590 - struct drm_buddy *buddy = tile_to_buddy(tile); 525 + struct drm_buddy *buddy = vram_to_buddy(vr); 591 526 u64 block_pfn = block_offset_to_pfn(vr, drm_buddy_block_offset(block)); 592 527 int i; 593 528 ··· 746 683 } 747 684 748 685 #if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP) 749 - static struct xe_vram_region *tile_to_vr(struct xe_tile *tile) 750 - { 751 - return &tile->mem.vram; 752 - } 753 - 754 686 static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap, 755 687 unsigned long start, unsigned long end, 756 688 struct mm_struct *mm, 757 689 unsigned long timeslice_ms) 758 690 { 759 - struct xe_tile *tile = container_of(dpagemap, typeof(*tile), mem.vram.dpagemap); 760 - struct xe_device *xe = tile_to_xe(tile); 691 + struct xe_vram_region *vr = container_of(dpagemap, typeof(*vr), dpagemap); 692 + struct xe_device *xe = vr->xe; 761 693 struct device *dev = xe->drm.dev; 762 - struct xe_vram_region *vr = tile_to_vr(tile); 763 694 struct drm_buddy_block *block; 764 695 struct list_head *blocks; 765 696 struct xe_bo *bo; ··· 766 709 xe_pm_runtime_get(xe); 767 710 768 711 retry: 769 - bo = xe_bo_create_locked(tile_to_xe(tile), NULL, NULL, end - start, 712 + bo = xe_bo_create_locked(vr->xe, NULL, NULL, end - start, 770 713 ttm_bo_type_device, 771 - XE_BO_FLAG_VRAM_IF_DGFX(tile) | 714 + (IS_DGFX(xe) ? 
XE_BO_FLAG_VRAM(vr) : XE_BO_FLAG_SYSTEM) | 772 715 XE_BO_FLAG_CPU_ADDR_MIRROR); 773 716 if (IS_ERR(bo)) { 774 717 err = PTR_ERR(bo); ··· 778 721 } 779 722 780 723 drm_pagemap_devmem_init(&bo->devmem_allocation, dev, mm, 781 - &dpagemap_devmem_ops, 782 - &tile->mem.vram.dpagemap, 783 - end - start); 724 + &dpagemap_devmem_ops, dpagemap, end - start); 784 725 785 726 blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks; 786 727 list_for_each_entry(block, blocks, link) ··· 845 790 return true; 846 791 } 847 792 848 - /** 849 - * xe_svm_handle_pagefault() - SVM handle page fault 850 - * @vm: The VM. 851 - * @vma: The CPU address mirror VMA. 852 - * @gt: The gt upon the fault occurred. 853 - * @fault_addr: The GPU fault address. 854 - * @atomic: The fault atomic access bit. 855 - * 856 - * Create GPU bindings for a SVM page fault. Optionally migrate to device 857 - * memory. 858 - * 859 - * Return: 0 on success, negative error code on error. 860 - */ 861 - int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, 862 - struct xe_gt *gt, u64 fault_addr, 863 - bool atomic) 793 + static int __xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, 794 + struct xe_gt *gt, u64 fault_addr, 795 + bool need_vram) 864 796 { 865 797 struct drm_gpusvm_ctx ctx = { 866 798 .read_only = xe_vma_read_only(vma), ··· 855 813 IS_ENABLED(CONFIG_DRM_XE_PAGEMAP), 856 814 .check_pages_threshold = IS_DGFX(vm->xe) && 857 815 IS_ENABLED(CONFIG_DRM_XE_PAGEMAP) ? SZ_64K : 0, 858 - .devmem_only = atomic && IS_DGFX(vm->xe) && 859 - IS_ENABLED(CONFIG_DRM_XE_PAGEMAP), 860 - .timeslice_ms = atomic && IS_DGFX(vm->xe) && 816 + .devmem_only = need_vram && IS_ENABLED(CONFIG_DRM_XE_PAGEMAP), 817 + .timeslice_ms = need_vram && IS_DGFX(vm->xe) && 861 818 IS_ENABLED(CONFIG_DRM_XE_PAGEMAP) ? 
862 819 vm->xe->atomic_svm_timeslice_ms : 0, 863 820 }; 864 821 struct xe_svm_range *range; 865 822 struct dma_fence *fence; 823 + struct drm_pagemap *dpagemap; 866 824 struct xe_tile *tile = gt_to_tile(gt); 867 825 int migrate_try_count = ctx.devmem_only ? 3 : 1; 868 826 ktime_t end = 0; ··· 892 850 893 851 range_debug(range, "PAGE FAULT"); 894 852 853 + dpagemap = xe_vma_resolve_pagemap(vma, tile); 895 854 if (--migrate_try_count >= 0 && 896 - xe_svm_range_needs_migrate_to_vram(range, vma, IS_DGFX(vm->xe))) { 855 + xe_svm_range_needs_migrate_to_vram(range, vma, !!dpagemap || ctx.devmem_only)) { 856 + /* TODO : For multi-device dpagemap will be used to find the 857 + * remote tile and remote device. Will need to modify 858 + * xe_svm_alloc_vram to use dpagemap for future multi-device 859 + * support. 860 + */ 897 861 err = xe_svm_alloc_vram(tile, range, &ctx); 898 862 ctx.timeslice_ms <<= 1; /* Double timeslice if we have to retry */ 899 863 if (err) { ··· 967 919 } 968 920 969 921 /** 922 + * xe_svm_handle_pagefault() - SVM handle page fault 923 + * @vm: The VM. 924 + * @vma: The CPU address mirror VMA. 925 + * @gt: The gt upon the fault occurred. 926 + * @fault_addr: The GPU fault address. 927 + * @atomic: The fault atomic access bit. 928 + * 929 + * Create GPU bindings for a SVM page fault. Optionally migrate to device 930 + * memory. 931 + * 932 + * Return: 0 on success, negative error code on error. 933 + */ 934 + int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, 935 + struct xe_gt *gt, u64 fault_addr, 936 + bool atomic) 937 + { 938 + int need_vram, ret; 939 + retry: 940 + need_vram = xe_vma_need_vram_for_atomic(vm->xe, vma, atomic); 941 + if (need_vram < 0) 942 + return need_vram; 943 + 944 + ret = __xe_svm_handle_pagefault(vm, vma, gt, fault_addr, 945 + need_vram ? 
true : false); 946 + if (ret == -EAGAIN) { 947 + /* 948 + * Retry once on -EAGAIN to re-lookup the VMA, as the original VMA 949 + * may have been split by xe_svm_range_set_default_attr. 950 + */ 951 + vma = xe_vm_find_vma_by_addr(vm, fault_addr); 952 + if (!vma) 953 + return -EINVAL; 954 + 955 + goto retry; 956 + } 957 + return ret; 958 + } 959 + 960 + /** 970 961 * xe_svm_has_mapping() - SVM has mappings 971 962 * @vm: The VM. 972 963 * @start: Start address. ··· 1018 931 bool xe_svm_has_mapping(struct xe_vm *vm, u64 start, u64 end) 1019 932 { 1020 933 return drm_gpusvm_has_mapping(&vm->svm.gpusvm, start, end); 934 + } 935 + 936 + /** 937 + * xe_svm_unmap_address_range - UNMAP SVM mappings and ranges 938 + * @vm: The VM 939 + * @start: start addr 940 + * @end: end addr 941 + * 942 + * This function UNMAPS svm ranges if start or end address are inside them. 943 + */ 944 + void xe_svm_unmap_address_range(struct xe_vm *vm, u64 start, u64 end) 945 + { 946 + struct drm_gpusvm_notifier *notifier, *next; 947 + 948 + lockdep_assert_held_write(&vm->lock); 949 + 950 + drm_gpusvm_for_each_notifier_safe(notifier, next, &vm->svm.gpusvm, start, end) { 951 + struct drm_gpusvm_range *range, *__next; 952 + 953 + drm_gpusvm_for_each_range_safe(range, __next, notifier, start, end) { 954 + if (start > drm_gpusvm_range_start(range) || 955 + end < drm_gpusvm_range_end(range)) { 956 + if (IS_DGFX(vm->xe) && xe_svm_range_in_vram(to_xe_range(range))) 957 + drm_gpusvm_range_evict(&vm->svm.gpusvm, range); 958 + drm_gpusvm_range_get(range); 959 + __xe_svm_garbage_collector(vm, to_xe_range(range)); 960 + if (!list_empty(&to_xe_range(range)->garbage_collector_link)) { 961 + spin_lock(&vm->svm.garbage_collector.lock); 962 + list_del(&to_xe_range(range)->garbage_collector_link); 963 + spin_unlock(&vm->svm.garbage_collector.lock); 964 + } 965 + drm_gpusvm_range_put(range); 966 + } 967 + } 968 + } 1021 969 } 1022 970 1023 971 /** ··· 1119 997 return err; 1120 998 } 1121 999 1000 + /** 1001 + * 
xe_svm_ranges_zap_ptes_in_range - clear ptes of svm ranges in input range 1002 + * @vm: Pointer to the xe_vm structure 1003 + * @start: Start of the input range 1004 + * @end: End of the input range 1005 + * 1006 + * This function removes the page table entries (PTEs) associated 1007 + * with the svm ranges within the given input start and end 1008 + * 1009 + * Return: tile_mask for which gt's need to be tlb invalidated. 1010 + */ 1011 + u8 xe_svm_ranges_zap_ptes_in_range(struct xe_vm *vm, u64 start, u64 end) 1012 + { 1013 + struct drm_gpusvm_notifier *notifier; 1014 + struct xe_svm_range *range; 1015 + u64 adj_start, adj_end; 1016 + struct xe_tile *tile; 1017 + u8 tile_mask = 0; 1018 + u8 id; 1019 + 1020 + lockdep_assert(lockdep_is_held_type(&vm->svm.gpusvm.notifier_lock, 1) && 1021 + lockdep_is_held_type(&vm->lock, 0)); 1022 + 1023 + drm_gpusvm_for_each_notifier(notifier, &vm->svm.gpusvm, start, end) { 1024 + struct drm_gpusvm_range *r = NULL; 1025 + 1026 + adj_start = max(start, drm_gpusvm_notifier_start(notifier)); 1027 + adj_end = min(end, drm_gpusvm_notifier_end(notifier)); 1028 + drm_gpusvm_for_each_range(r, notifier, adj_start, adj_end) { 1029 + range = to_xe_range(r); 1030 + for_each_tile(tile, vm->xe, id) { 1031 + if (xe_pt_zap_ptes_range(tile, vm, range)) { 1032 + tile_mask |= BIT(id); 1033 + /* 1034 + * WRITE_ONCE pairs with READ_ONCE in 1035 + * xe_vm_has_valid_gpu_mapping(). 1036 + * Must not fail after setting 1037 + * tile_invalidated and before 1038 + * TLB invalidation. 
1039 + */ 1040 + WRITE_ONCE(range->tile_invalidated, 1041 + range->tile_invalidated | BIT(id)); 1042 + } 1043 + } 1044 + } 1045 + } 1046 + 1047 + return tile_mask; 1048 + } 1049 + 1122 1050 #if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP) 1051 + 1052 + static struct drm_pagemap *tile_local_pagemap(struct xe_tile *tile) 1053 + { 1054 + return &tile->mem.vram->dpagemap; 1055 + } 1056 + 1057 + /** 1058 + * xe_vma_resolve_pagemap - Resolve the appropriate DRM pagemap for a VMA 1059 + * @vma: Pointer to the xe_vma structure containing memory attributes 1060 + * @tile: Pointer to the xe_tile structure used as fallback for VRAM mapping 1061 + * 1062 + * This function determines the correct DRM pagemap to use for a given VMA. 1063 + * It first checks if a valid devmem_fd is provided in the VMA's preferred 1064 + * location. If the devmem_fd is negative, it returns NULL, indicating no 1065 + * pagemap is available and smem to be used as preferred location. 1066 + * If the devmem_fd is equal to the default faulting 1067 + * GT identifier, it returns the VRAM pagemap associated with the tile. 1068 + * 1069 + * Future support for multi-device configurations may use drm_pagemap_from_fd() 1070 + * to resolve pagemaps from arbitrary file descriptors. 1071 + * 1072 + * Return: A pointer to the resolved drm_pagemap, or NULL if none is applicable. 1073 + */ 1074 + struct drm_pagemap *xe_vma_resolve_pagemap(struct xe_vma *vma, struct xe_tile *tile) 1075 + { 1076 + s32 fd = (s32)vma->attr.preferred_loc.devmem_fd; 1077 + 1078 + if (fd == DRM_XE_PREFERRED_LOC_DEFAULT_SYSTEM) 1079 + return NULL; 1080 + 1081 + if (fd == DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE) 1082 + return IS_DGFX(tile_to_xe(tile)) ? 
tile_local_pagemap(tile) : NULL; 1083 + 1084 + /* TODO: Support multi-device with drm_pagemap_from_fd(fd) */ 1085 + return NULL; 1086 + } 1123 1087 1124 1088 /** 1125 1089 * xe_svm_alloc_vram()- Allocate device memory pages for range, ··· 1224 1016 xe_assert(tile_to_xe(tile), range->base.flags.migrate_devmem); 1225 1017 range_debug(range, "ALLOCATE VRAM"); 1226 1018 1227 - dpagemap = xe_tile_local_pagemap(tile); 1019 + dpagemap = tile_local_pagemap(tile); 1228 1020 return drm_pagemap_populate_mm(dpagemap, xe_svm_range_start(range), 1229 1021 xe_svm_range_end(range), 1230 1022 range->base.gpusvm->mm, 1231 1023 ctx->timeslice_ms); 1232 1024 } 1233 1025 1234 - static struct drm_pagemap_device_addr 1026 + static struct drm_pagemap_addr 1235 1027 xe_drm_pagemap_device_map(struct drm_pagemap *dpagemap, 1236 1028 struct device *dev, 1237 1029 struct page *page, ··· 1250 1042 prot = 0; 1251 1043 } 1252 1044 1253 - return drm_pagemap_device_addr_encode(addr, prot, order, dir); 1045 + return drm_pagemap_addr_encode(addr, prot, order, dir); 1254 1046 } 1255 1047 1256 1048 static const struct drm_pagemap_ops xe_drm_pagemap_ops = { ··· 1318 1110 int xe_devm_add(struct xe_tile *tile, struct xe_vram_region *vr) 1319 1111 { 1320 1112 return 0; 1113 + } 1114 + 1115 + struct drm_pagemap *xe_vma_resolve_pagemap(struct xe_vma *vma, struct xe_tile *tile) 1116 + { 1117 + return NULL; 1321 1118 } 1322 1119 #endif 1323 1120
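
Note on the garbage collector return value: xe_svm_garbage_collector() now distinguishes two kinds of failure from xe_svm_range_set_default_attr(). A return of -EAGAIN is remembered but does not stop the loop, so the remaining ranges are still drained. Any other error aborts immediately. The page fault handler then treats -EAGAIN as a signal to look up the VMA again. The policy can be sketched in isolation as below. The function collect_results() and its array input are hypothetical stand-ins for the per-range calls and are not driver code.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Model of the loop's error policy. 'results' stands in for the
 * per-range return values of xe_svm_range_set_default_attr()
 * (hypothetical input, not a driver data structure).
 */
int collect_results(const int *results, size_t n)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int err = results[i];

		if (!err)
			continue;
		if (err == -EAGAIN)
			ret = -EAGAIN;	/* remember it, keep draining the list */
		else
			return err;	/* hard error: stop immediately */
	}

	return ret;
}
```

The design choice visible in the diff is that a changed VMA is not a failure of garbage collection itself, so the collector finishes its work and leaves the retry decision to the caller.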

+25 -2

drivers/gpu/drm/xe/xe_svm.h

···

 u64 xe_svm_find_vma_start(struct xe_vm *vm, u64 addr, u64 end, struct xe_vma *vma);

+void xe_svm_unmap_address_range(struct xe_vm *vm, u64 start, u64 end);
+
+u8 xe_svm_ranges_zap_ptes_in_range(struct xe_vm *vm, u64 start, u64 end);
+
+struct drm_pagemap *xe_vma_resolve_pagemap(struct xe_vma *vma, struct xe_tile *tile);
+
 /**
  * xe_svm_range_has_dma_mapping() - SVM range has DMA mapping
  * @range: SVM range
···
 #else
 #include <linux/interval_tree.h>

-struct drm_pagemap_device_addr;
+struct drm_pagemap_addr;
 struct drm_gpusvm_ctx;
 struct drm_gpusvm_range;
 struct xe_bo;
···
 struct xe_svm_range {
 	struct {
 		struct interval_tree_node itree;
-		const struct drm_pagemap_device_addr *dma_addr;
+		const struct drm_pagemap_addr *dma_addr;
 	} base;
 	u32 tile_present;
 	u32 tile_invalidated;
···
 u64 xe_svm_find_vma_start(struct xe_vm *vm, u64 addr, u64 end, struct xe_vma *vma)
 {
 	return ULONG_MAX;
+}
+
+static inline
+void xe_svm_unmap_address_range(struct xe_vm *vm, u64 start, u64 end)
+{
+}
+
+static inline
+u8 xe_svm_ranges_zap_ptes_in_range(struct xe_vm *vm, u64 start, u64 end)
+{
+	return 0;
+}
+
+static inline
+struct drm_pagemap *xe_vma_resolve_pagemap(struct xe_vma *vma, struct xe_tile *tile)
+{
+	return NULL;
 }

 #define xe_svm_assert_in_notifier(...) do {} while (0)

+1 -1

drivers/gpu/drm/xe/xe_sync.c

···
 {
 	struct xe_user_fence *ufence = container_of(w, struct xe_user_fence, worker);

+	WRITE_ONCE(ufence->signalled, 1);
 	if (mmget_not_zero(ufence->mm)) {
 		kthread_use_mm(ufence->mm);
 		if (copy_to_user(ufence->addr, &ufence->value, sizeof(ufence->value)))
···
 	 * Wake up waiters only after updating the ufence state, allowing the UMD
 	 * to safely reuse the same ufence without encountering -EBUSY errors.
 	 */
-	WRITE_ONCE(ufence->signalled, 1);
 	wake_up_all(&ufence->xe->ufence_wq);
 	user_fence_put(ufence);
 }

+41 -21

drivers/gpu/drm/xe/xe_tile.c

···

 #include <drm/drm_managed.h>

+#include "xe_bo.h"
 #include "xe_device.h"
 #include "xe_ggtt.h"
 #include "xe_gt.h"
···
 #include "xe_tile_sysfs.h"
 #include "xe_ttm_vram_mgr.h"
 #include "xe_wa.h"
+#include "xe_vram.h"
+#include "xe_vram_types.h"

 /**
  * DOC: Multi-tile Design
···
 	if (!tile->mem.ggtt)
 		return -ENOMEM;

+	tile->migrate = xe_migrate_alloc(tile);
+	if (!tile->migrate)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/**
+ * xe_tile_alloc_vram - Perform per-tile VRAM structs allocation
+ * @tile: Tile to perform allocations for
+ *
+ * Allocates VRAM per-tile data structures using DRM-managed allocations.
+ * Does not touch the hardware.
+ *
+ * Returns -ENOMEM if allocations fail, otherwise 0.
+ */
+int xe_tile_alloc_vram(struct xe_tile *tile)
+{
+	struct xe_device *xe = tile_to_xe(tile);
+	struct xe_vram_region *vram;
+
+	if (!IS_DGFX(xe))
+		return 0;
+
+	vram = xe_vram_region_alloc(xe, tile->id, XE_PL_VRAM0 + tile->id);
+	if (!vram)
+		return -ENOMEM;
+	tile->mem.vram = vram;
+
 	return 0;
 }

···
 }
 ALLOW_ERROR_INJECTION(xe_tile_init_early, ERRNO); /* See xe_pci_probe() */

-static int tile_ttm_mgr_init(struct xe_tile *tile)
-{
-	struct xe_device *xe = tile_to_xe(tile);
-	int err;
-
-	if (tile->mem.vram.usable_size) {
-		err = xe_ttm_vram_mgr_init(tile, &tile->mem.vram.ttm);
-		if (err)
-			return err;
-		xe->info.mem_region_mask |= BIT(tile->id) << 1;
-	}
-
-	return 0;
-}
-
 /**
  * xe_tile_init_noalloc - Init tile up to the point where allocations can happen.
  * @tile: The tile to initialize.
···
 int xe_tile_init_noalloc(struct xe_tile *tile)
 {
 	struct xe_device *xe = tile_to_xe(tile);
-	int err;
-
-	err = tile_ttm_mgr_init(tile);
-	if (err)
-		return err;

 	xe_wa_apply_tile_workarounds(tile);

 	if (xe->info.has_usm && IS_DGFX(xe))
-		xe_devm_add(tile, &tile->mem.vram);
+		xe_devm_add(tile, tile->mem.vram);
+
+	if (IS_DGFX(xe) && !ttm_resource_manager_used(&tile->mem.vram->ttm.manager)) {
+		int err = xe_ttm_vram_mgr_init(xe, tile->mem.vram);
+
+		if (err)
+			return err;
+		xe->info.mem_region_mask |= BIT(tile->mem.vram->id) << 1;
+	}

 	return xe_tile_sysfs_init(tile);
 }
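
Note on the region mask arithmetic: both the removed helper and the new code set a bit at position id + 1 in xe->info.mem_region_mask, now keyed on the VRAM region id instead of the tile id. The worked example below assumes that bit 0 of the mask denotes system memory, which this diff does not state. The helper name region_mask_with_vram() is hypothetical, and BIT() is redefined locally as a stand-in for the kernel macro.

```c
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1u << (n))

/*
 * Hypothetical helper: return 'mask' with the bit for VRAM region
 * 'vram_id' set. The shift by one skips bit 0, assumed here to be
 * reserved for system memory.
 */
uint32_t region_mask_with_vram(uint32_t mask, unsigned int vram_id)
{
	return mask | (BIT(vram_id) << 1);
}
```

Under that assumption, a two-tile discrete device starting from a mask of 0x1 would end up with 0x7 once both VRAM regions are registered.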

+2 -12

drivers/gpu/drm/xe/xe_tile.h

···
14 14 int xe_tile_init_noalloc(struct xe_tile *tile);
15 15 int xe_tile_init(struct xe_tile *tile);
16 16
17 - void xe_tile_migrate_wait(struct xe_tile *tile);
17 + int xe_tile_alloc_vram(struct xe_tile *tile);
18 18
19 - #if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
20 - static inline struct drm_pagemap *xe_tile_local_pagemap(struct xe_tile *tile)
21 - {
22 -     return &tile->mem.vram.dpagemap;
23 - }
24 - #else
25 - static inline struct drm_pagemap *xe_tile_local_pagemap(struct xe_tile *tile)
26 - {
27 -     return NULL;
28 - }
29 - #endif
19 + void xe_tile_migrate_wait(struct xe_tile *tile);
30 20
31 21 static inline bool xe_tile_is_root(struct xe_tile *tile)
32 22 {

+434

drivers/gpu/drm/xe/xe_tlb_inval.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2023 Intel Corporation 4 + */ 5 + 6 + #include <drm/drm_managed.h> 7 + 8 + #include "abi/guc_actions_abi.h" 9 + #include "xe_device.h" 10 + #include "xe_force_wake.h" 11 + #include "xe_gt.h" 12 + #include "xe_gt_printk.h" 13 + #include "xe_guc.h" 14 + #include "xe_guc_ct.h" 15 + #include "xe_guc_tlb_inval.h" 16 + #include "xe_gt_stats.h" 17 + #include "xe_tlb_inval.h" 18 + #include "xe_mmio.h" 19 + #include "xe_pm.h" 20 + #include "xe_tlb_inval.h" 21 + #include "xe_trace.h" 22 + 23 + /** 24 + * DOC: Xe TLB invalidation 25 + * 26 + * Xe TLB invalidation is implemented in two layers. The first is the frontend 27 + * API, which provides an interface for TLB invalidations to the driver code. 28 + * The frontend handles seqno assignment, synchronization (fences), and the 29 + * timeout mechanism. The frontend is implemented via an embedded structure 30 + * xe_tlb_inval that includes a set of ops hooking into the backend. The backend 31 + * interacts with the hardware (or firmware) to perform the actual invalidation. 
32 + */ 33 + 34 + #define FENCE_STACK_BIT DMA_FENCE_FLAG_USER_BITS 35 + 36 + static void xe_tlb_inval_fence_fini(struct xe_tlb_inval_fence *fence) 37 + { 38 + if (WARN_ON_ONCE(!fence->tlb_inval)) 39 + return; 40 + 41 + xe_pm_runtime_put(fence->tlb_inval->xe); 42 + fence->tlb_inval = NULL; /* fini() should be called once */ 43 + } 44 + 45 + static void 46 + xe_tlb_inval_fence_signal(struct xe_tlb_inval_fence *fence) 47 + { 48 + bool stack = test_bit(FENCE_STACK_BIT, &fence->base.flags); 49 + 50 + lockdep_assert_held(&fence->tlb_inval->pending_lock); 51 + 52 + list_del(&fence->link); 53 + trace_xe_tlb_inval_fence_signal(fence->tlb_inval->xe, fence); 54 + xe_tlb_inval_fence_fini(fence); 55 + dma_fence_signal(&fence->base); 56 + if (!stack) 57 + dma_fence_put(&fence->base); 58 + } 59 + 60 + static void 61 + xe_tlb_inval_fence_signal_unlocked(struct xe_tlb_inval_fence *fence) 62 + { 63 + struct xe_tlb_inval *tlb_inval = fence->tlb_inval; 64 + 65 + spin_lock_irq(&tlb_inval->pending_lock); 66 + xe_tlb_inval_fence_signal(fence); 67 + spin_unlock_irq(&tlb_inval->pending_lock); 68 + } 69 + 70 + static void xe_tlb_inval_fence_timeout(struct work_struct *work) 71 + { 72 + struct xe_tlb_inval *tlb_inval = container_of(work, struct xe_tlb_inval, 73 + fence_tdr.work); 74 + struct xe_device *xe = tlb_inval->xe; 75 + struct xe_tlb_inval_fence *fence, *next; 76 + long timeout_delay = tlb_inval->ops->timeout_delay(tlb_inval); 77 + 78 + tlb_inval->ops->flush(tlb_inval); 79 + 80 + spin_lock_irq(&tlb_inval->pending_lock); 81 + list_for_each_entry_safe(fence, next, 82 + &tlb_inval->pending_fences, link) { 83 + s64 since_inval_ms = ktime_ms_delta(ktime_get(), 84 + fence->inval_time); 85 + 86 + if (msecs_to_jiffies(since_inval_ms) < timeout_delay) 87 + break; 88 + 89 + trace_xe_tlb_inval_fence_timeout(xe, fence); 90 + drm_err(&xe->drm, 91 + "TLB invalidation fence timeout, seqno=%d recv=%d", 92 + fence->seqno, tlb_inval->seqno_recv); 93 + 94 + fence->base.error = -ETIME; 95 + 
xe_tlb_inval_fence_signal(fence); 96 + } 97 + if (!list_empty(&tlb_inval->pending_fences)) 98 + queue_delayed_work(system_wq, &tlb_inval->fence_tdr, 99 + timeout_delay); 100 + spin_unlock_irq(&tlb_inval->pending_lock); 101 + } 102 + 103 + /** 104 + * tlb_inval_fini - Clean up TLB invalidation state 105 + * @drm: @drm_device 106 + * @arg: pointer to struct @xe_tlb_inval 107 + * 108 + * Cancel pending fence workers and clean up any additional 109 + * TLB invalidation state. 110 + */ 111 + static void tlb_inval_fini(struct drm_device *drm, void *arg) 112 + { 113 + struct xe_tlb_inval *tlb_inval = arg; 114 + 115 + xe_tlb_inval_reset(tlb_inval); 116 + } 117 + 118 + /** 119 + * xe_gt_tlb_inval_init - Initialize TLB invalidation state 120 + * @gt: GT structure 121 + * 122 + * Initialize TLB invalidation state, purely software initialization, should 123 + * be called once during driver load. 124 + * 125 + * Return: 0 on success, negative error code on error. 126 + */ 127 + int xe_gt_tlb_inval_init_early(struct xe_gt *gt) 128 + { 129 + struct xe_device *xe = gt_to_xe(gt); 130 + struct xe_tlb_inval *tlb_inval = &gt->tlb_inval; 131 + int err; 132 + 133 + tlb_inval->xe = xe; 134 + tlb_inval->seqno = 1; 135 + INIT_LIST_HEAD(&tlb_inval->pending_fences); 136 + spin_lock_init(&tlb_inval->pending_lock); 137 + spin_lock_init(&tlb_inval->lock); 138 + INIT_DELAYED_WORK(&tlb_inval->fence_tdr, xe_tlb_inval_fence_timeout); 139 + 140 + err = drmm_mutex_init(&xe->drm, &tlb_inval->seqno_lock); 141 + if (err) 142 + return err; 143 + 144 + tlb_inval->job_wq = drmm_alloc_ordered_workqueue(&xe->drm, 145 + "gt-tbl-inval-job-wq", 146 + WQ_MEM_RECLAIM); 147 + if (IS_ERR(tlb_inval->job_wq)) 148 + return PTR_ERR(tlb_inval->job_wq); 149 + 150 + /* XXX: Blindly setting up backend to GuC */ 151 + xe_guc_tlb_inval_init_early(&gt->uc.guc, tlb_inval); 152 + 153 + return drmm_add_action_or_reset(&xe->drm, tlb_inval_fini, tlb_inval); 154 + } 155 + 156 + /** 157 + * xe_tlb_inval_reset() - TLB invalidation 
reset 158 + * @tlb_inval: TLB invalidation client 159 + * 160 + * Signal any pending invalidation fences, should be called during a GT reset 161 + */ 162 + void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval) 163 + { 164 + struct xe_tlb_inval_fence *fence, *next; 165 + int pending_seqno; 166 + 167 + /* 168 + * we can get here before the backends are even initialized if we're 169 + * wedging very early, in which case there are not going to be any 170 + * pendind fences so we can bail immediately. 171 + */ 172 + if (!tlb_inval->ops->initialized(tlb_inval)) 173 + return; 174 + 175 + /* 176 + * Backend is already disabled at this point. No new TLB requests can 177 + * appear. 178 + */ 179 + 180 + mutex_lock(&tlb_inval->seqno_lock); 181 + spin_lock_irq(&tlb_inval->pending_lock); 182 + cancel_delayed_work(&tlb_inval->fence_tdr); 183 + /* 184 + * We might have various kworkers waiting for TLB flushes to complete 185 + * which are not tracked with an explicit TLB fence, however at this 186 + * stage that will never happen since the backend is already disabled, 187 + * so make sure we signal them here under the assumption that we have 188 + * completed a full GT reset. 
189 + */ 190 + if (tlb_inval->seqno == 1) 191 + pending_seqno = TLB_INVALIDATION_SEQNO_MAX - 1; 192 + else 193 + pending_seqno = tlb_inval->seqno - 1; 194 + WRITE_ONCE(tlb_inval->seqno_recv, pending_seqno); 195 + 196 + list_for_each_entry_safe(fence, next, 197 + &tlb_inval->pending_fences, link) 198 + xe_tlb_inval_fence_signal(fence); 199 + spin_unlock_irq(&tlb_inval->pending_lock); 200 + mutex_unlock(&tlb_inval->seqno_lock); 201 + } 202 + 203 + static bool xe_tlb_inval_seqno_past(struct xe_tlb_inval *tlb_inval, int seqno) 204 + { 205 + int seqno_recv = READ_ONCE(tlb_inval->seqno_recv); 206 + 207 + lockdep_assert_held(&tlb_inval->pending_lock); 208 + 209 + if (seqno - seqno_recv < -(TLB_INVALIDATION_SEQNO_MAX / 2)) 210 + return false; 211 + 212 + if (seqno - seqno_recv > (TLB_INVALIDATION_SEQNO_MAX / 2)) 213 + return true; 214 + 215 + return seqno_recv >= seqno; 216 + } 217 + 218 + static void xe_tlb_inval_fence_prep(struct xe_tlb_inval_fence *fence) 219 + { 220 + struct xe_tlb_inval *tlb_inval = fence->tlb_inval; 221 + 222 + fence->seqno = tlb_inval->seqno; 223 + trace_xe_tlb_inval_fence_send(tlb_inval->xe, fence); 224 + 225 + spin_lock_irq(&tlb_inval->pending_lock); 226 + fence->inval_time = ktime_get(); 227 + list_add_tail(&fence->link, &tlb_inval->pending_fences); 228 + 229 + if (list_is_singular(&tlb_inval->pending_fences)) 230 + queue_delayed_work(system_wq, &tlb_inval->fence_tdr, 231 + tlb_inval->ops->timeout_delay(tlb_inval)); 232 + spin_unlock_irq(&tlb_inval->pending_lock); 233 + 234 + tlb_inval->seqno = (tlb_inval->seqno + 1) % 235 + TLB_INVALIDATION_SEQNO_MAX; 236 + if (!tlb_inval->seqno) 237 + tlb_inval->seqno = 1; 238 + } 239 + 240 + #define xe_tlb_inval_issue(__tlb_inval, __fence, op, args...) 
\ 241 + ({ \ 242 + int __ret; \ 243 + \ 244 + xe_assert((__tlb_inval)->xe, (__tlb_inval)->ops); \ 245 + xe_assert((__tlb_inval)->xe, (__fence)); \ 246 + \ 247 + mutex_lock(&(__tlb_inval)->seqno_lock); \ 248 + xe_tlb_inval_fence_prep((__fence)); \ 249 + __ret = op((__tlb_inval), (__fence)->seqno, ##args); \ 250 + if (__ret < 0) \ 251 + xe_tlb_inval_fence_signal_unlocked((__fence)); \ 252 + mutex_unlock(&(__tlb_inval)->seqno_lock); \ 253 + \ 254 + __ret == -ECANCELED ? 0 : __ret; \ 255 + }) 256 + 257 + /** 258 + * xe_tlb_inval_all() - Issue a TLB invalidation for all TLBs 259 + * @tlb_inval: TLB invalidation client 260 + * @fence: invalidation fence which will be signal on TLB invalidation 261 + * completion 262 + * 263 + * Issue a TLB invalidation for all TLBs. Completion of TLB is asynchronous and 264 + * caller can use the invalidation fence to wait for completion. 265 + * 266 + * Return: 0 on success, negative error code on error 267 + */ 268 + int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval, 269 + struct xe_tlb_inval_fence *fence) 270 + { 271 + return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->all); 272 + } 273 + 274 + /** 275 + * xe_tlb_inval_ggtt() - Issue a TLB invalidation for the GGTT 276 + * @tlb_inval: TLB invalidation client 277 + * 278 + * Issue a TLB invalidation for the GGTT. Completion of TLB is asynchronous and 279 + * caller can use the invalidation fence to wait for completion. 
280 + * 281 + * Return: 0 on success, negative error code on error 282 + */ 283 + int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval) 284 + { 285 + struct xe_tlb_inval_fence fence, *fence_ptr = &fence; 286 + int ret; 287 + 288 + xe_tlb_inval_fence_init(tlb_inval, fence_ptr, true); 289 + ret = xe_tlb_inval_issue(tlb_inval, fence_ptr, tlb_inval->ops->ggtt); 290 + xe_tlb_inval_fence_wait(fence_ptr); 291 + 292 + return ret; 293 + } 294 + 295 + /** 296 + * xe_tlb_inval_range() - Issue a TLB invalidation for an address range 297 + * @tlb_inval: TLB invalidation client 298 + * @fence: invalidation fence which will be signal on TLB invalidation 299 + * completion 300 + * @start: start address 301 + * @end: end address 302 + * @asid: address space id 303 + * 304 + * Issue a range based TLB invalidation if supported, if not fallback to a full 305 + * TLB invalidation. Completion of TLB is asynchronous and caller can use 306 + * the invalidation fence to wait for completion. 307 + * 308 + * Return: Negative error code on error, 0 on success 309 + */ 310 + int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval, 311 + struct xe_tlb_inval_fence *fence, u64 start, u64 end, 312 + u32 asid) 313 + { 314 + return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt, 315 + start, end, asid); 316 + } 317 + 318 + /** 319 + * xe_tlb_inval_vm() - Issue a TLB invalidation for a VM 320 + * @tlb_inval: TLB invalidation client 321 + * @vm: VM to invalidate 322 + * 323 + * Invalidate entire VM's address space 324 + */ 325 + void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm) 326 + { 327 + struct xe_tlb_inval_fence fence; 328 + u64 range = 1ull << vm->xe->info.va_bits; 329 + 330 + xe_tlb_inval_fence_init(tlb_inval, &fence, true); 331 + xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid); 332 + xe_tlb_inval_fence_wait(&fence); 333 + } 334 + 335 + /** 336 + * xe_tlb_inval_done_handler() - TLB invalidation done handler 337 + * @tlb_inval: TLB invalidation client 
338 + * @seqno: seqno of invalidation that is done 339 + * 340 + * Update recv seqno, signal any TLB invalidation fences, and restart TDR 341 + */ 342 + void xe_tlb_inval_done_handler(struct xe_tlb_inval *tlb_inval, int seqno) 343 + { 344 + struct xe_device *xe = tlb_inval->xe; 345 + struct xe_tlb_inval_fence *fence, *next; 346 + unsigned long flags; 347 + 348 + /* 349 + * This can also be run both directly from the IRQ handler and also in 350 + * process_g2h_msg(). Only one may process any individual CT message, 351 + * however the order they are processed here could result in skipping a 352 + * seqno. To handle that we just process all the seqnos from the last 353 + * seqno_recv up to and including the one in msg[0]. The delta should be 354 + * very small so there shouldn't be much of pending_fences we actually 355 + * need to iterate over here. 356 + * 357 + * From GuC POV we expect the seqnos to always appear in-order, so if we 358 + * see something later in the timeline we can be sure that anything 359 + * appearing earlier has already signalled, just that we have yet to 360 + * officially process the CT message like if racing against 361 + * process_g2h_msg(). 
362 + */ 363 + spin_lock_irqsave(&tlb_inval->pending_lock, flags); 364 + if (xe_tlb_inval_seqno_past(tlb_inval, seqno)) { 365 + spin_unlock_irqrestore(&tlb_inval->pending_lock, flags); 366 + return; 367 + } 368 + 369 + WRITE_ONCE(tlb_inval->seqno_recv, seqno); 370 + 371 + list_for_each_entry_safe(fence, next, 372 + &tlb_inval->pending_fences, link) { 373 + trace_xe_tlb_inval_fence_recv(xe, fence); 374 + 375 + if (!xe_tlb_inval_seqno_past(tlb_inval, fence->seqno)) 376 + break; 377 + 378 + xe_tlb_inval_fence_signal(fence); 379 + } 380 + 381 + if (!list_empty(&tlb_inval->pending_fences)) 382 + mod_delayed_work(system_wq, 383 + &tlb_inval->fence_tdr, 384 + tlb_inval->ops->timeout_delay(tlb_inval)); 385 + else 386 + cancel_delayed_work(&tlb_inval->fence_tdr); 387 + 388 + spin_unlock_irqrestore(&tlb_inval->pending_lock, flags); 389 + } 390 + 391 + static const char * 392 + xe_inval_fence_get_driver_name(struct dma_fence *dma_fence) 393 + { 394 + return "xe"; 395 + } 396 + 397 + static const char * 398 + xe_inval_fence_get_timeline_name(struct dma_fence *dma_fence) 399 + { 400 + return "tlb_inval_fence"; 401 + } 402 + 403 + static const struct dma_fence_ops inval_fence_ops = { 404 + .get_driver_name = xe_inval_fence_get_driver_name, 405 + .get_timeline_name = xe_inval_fence_get_timeline_name, 406 + }; 407 + 408 + /** 409 + * xe_tlb_inval_fence_init() - Initialize TLB invalidation fence 410 + * @tlb_inval: TLB invalidation client 411 + * @fence: TLB invalidation fence to initialize 412 + * @stack: fence is stack variable 413 + * 414 + * Initialize TLB invalidation fence for use. xe_tlb_inval_fence_fini 415 + * will be automatically called when fence is signalled (all fences must signal), 416 + * even on error. 
417 + */ 418 + void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval, 419 + struct xe_tlb_inval_fence *fence, 420 + bool stack) 421 + { 422 + xe_pm_runtime_get_noresume(tlb_inval->xe); 423 + 424 + spin_lock_irq(&tlb_inval->lock); 425 + dma_fence_init(&fence->base, &inval_fence_ops, &tlb_inval->lock, 426 + dma_fence_context_alloc(1), 1); 427 + spin_unlock_irq(&tlb_inval->lock); 428 + INIT_LIST_HEAD(&fence->link); 429 + if (stack) 430 + set_bit(FENCE_STACK_BIT, &fence->base.flags); 431 + else 432 + dma_fence_get(&fence->base); 433 + fence->tlb_inval = tlb_inval; 434 + }
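The seqno handling in xe_tlb_inval.c above (the wrap in xe_tlb_inval_fence_prep() and the window comparison in xe_tlb_inval_seqno_past()) can be tried outside the kernel. The following is a minimal userspace sketch, not driver code: `SEQNO_MAX`, `seqno_next()` and `seqno_past()` are illustrative names, and the locking, READ_ONCE() and lockdep checks of the original are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors TLB_INVALIDATION_SEQNO_MAX from the diff (assumed unchanged). */
#define SEQNO_MAX 0x100000

/*
 * Advance the send seqno the way xe_tlb_inval_fence_prep() does:
 * wrap modulo SEQNO_MAX and never hand out 0.
 */
static int seqno_next(int seqno)
{
	assert(seqno > 0 && seqno < SEQNO_MAX);

	seqno = (seqno + 1) % SEQNO_MAX;
	return seqno ? seqno : 1;
}

/*
 * Same comparison as xe_tlb_inval_seqno_past(): a difference larger than
 * half the seqno space is treated as a wraparound rather than a real gap.
 */
static bool seqno_past(int seqno_recv, int seqno)
{
	if (seqno - seqno_recv < -(SEQNO_MAX / 2))
		return false;	/* seqno was issued after the counter wrapped */
	if (seqno - seqno_recv > (SEQNO_MAX / 2))
		return true;	/* seqno_recv wrapped, seqno predates the wrap */
	return seqno_recv >= seqno;
}
```

With these helpers, a fence carrying seqno `SEQNO_MAX - 1` counts as completed once seqno 2 has been received after a wrap, while a fence carrying seqno 2 does not count as completed when the last received seqno is still `SEQNO_MAX - 1`.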

+46

drivers/gpu/drm/xe/xe_tlb_inval.h

···
1 + /* SPDX-License-Identifier: MIT */
2 + /*
3 +  * Copyright © 2025 Intel Corporation
4 +  */
5 +
6 + #ifndef _XE_TLB_INVAL_H_
7 + #define _XE_TLB_INVAL_H_
8 +
9 + #include <linux/types.h>
10 +
11 + #include "xe_tlb_inval_types.h"
12 +
13 + struct xe_gt;
14 + struct xe_guc;
15 + struct xe_vm;
16 +
17 + int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
18 +
19 + void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval);
20 + int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
21 +                      struct xe_tlb_inval_fence *fence);
22 + int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
23 + void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm);
24 + int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
25 +                        struct xe_tlb_inval_fence *fence,
26 +                        u64 start, u64 end, u32 asid);
27 +
28 + void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
29 +                              struct xe_tlb_inval_fence *fence,
30 +                              bool stack);
31 +
32 + /**
33 +  * xe_tlb_inval_fence_wait() - TLB invalidiation fence wait
34 +  * @fence: TLB invalidation fence to wait on
35 +  *
36 +  * Wait on a TLB invalidiation fence until it signals, non interruptable
37 +  */
38 + static inline void
39 + xe_tlb_inval_fence_wait(struct xe_tlb_inval_fence *fence)
40 + {
41 +     dma_fence_wait(&fence->base, false);
42 + }
43 +
44 + void xe_tlb_inval_done_handler(struct xe_tlb_inval *tlb_inval, int seqno);
45 +
46 + #endif /* _XE_TLB_INVAL_ */

+268

drivers/gpu/drm/xe/xe_tlb_inval_job.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include "xe_assert.h" 7 + #include "xe_dep_job_types.h" 8 + #include "xe_dep_scheduler.h" 9 + #include "xe_exec_queue.h" 10 + #include "xe_gt_types.h" 11 + #include "xe_tlb_inval.h" 12 + #include "xe_tlb_inval_job.h" 13 + #include "xe_migrate.h" 14 + #include "xe_pm.h" 15 + 16 + /** struct xe_tlb_inval_job - TLB invalidation job */ 17 + struct xe_tlb_inval_job { 18 + /** @dep: base generic dependency Xe job */ 19 + struct xe_dep_job dep; 20 + /** @tlb_inval: TLB invalidation client */ 21 + struct xe_tlb_inval *tlb_inval; 22 + /** @q: exec queue issuing the invalidate */ 23 + struct xe_exec_queue *q; 24 + /** @refcount: ref count of this job */ 25 + struct kref refcount; 26 + /** 27 + * @fence: dma fence to indicate completion. 1 way relationship - job 28 + * can safely reference fence, fence cannot safely reference job. 29 + */ 30 + struct dma_fence *fence; 31 + /** @start: Start address to invalidate */ 32 + u64 start; 33 + /** @end: End address to invalidate */ 34 + u64 end; 35 + /** @asid: Address space ID to invalidate */ 36 + u32 asid; 37 + /** @fence_armed: Fence has been armed */ 38 + bool fence_armed; 39 + }; 40 + 41 + static struct dma_fence *xe_tlb_inval_job_run(struct xe_dep_job *dep_job) 42 + { 43 + struct xe_tlb_inval_job *job = 44 + container_of(dep_job, typeof(*job), dep); 45 + struct xe_tlb_inval_fence *ifence = 46 + container_of(job->fence, typeof(*ifence), base); 47 + 48 + xe_tlb_inval_range(job->tlb_inval, ifence, job->start, 49 + job->end, job->asid); 50 + 51 + return job->fence; 52 + } 53 + 54 + static void xe_tlb_inval_job_free(struct xe_dep_job *dep_job) 55 + { 56 + struct xe_tlb_inval_job *job = 57 + container_of(dep_job, typeof(*job), dep); 58 + 59 + /* Pairs with get in xe_tlb_inval_job_push */ 60 + xe_tlb_inval_job_put(job); 61 + } 62 + 63 + static const struct xe_dep_job_ops dep_job_ops = { 64 + .run_job = xe_tlb_inval_job_run, 65 + 
.free_job = xe_tlb_inval_job_free, 66 + }; 67 + 68 + /** 69 + * xe_tlb_inval_job_create() - TLB invalidation job create 70 + * @q: exec queue issuing the invalidate 71 + * @tlb_inval: TLB invalidation client 72 + * @dep_scheduler: Dependency scheduler for job 73 + * @start: Start address to invalidate 74 + * @end: End address to invalidate 75 + * @asid: Address space ID to invalidate 76 + * 77 + * Create a TLB invalidation job and initialize internal fields. The caller is 78 + * responsible for releasing the creation reference. 79 + * 80 + * Return: TLB invalidation job object on success, ERR_PTR failure 81 + */ 82 + struct xe_tlb_inval_job * 83 + xe_tlb_inval_job_create(struct xe_exec_queue *q, struct xe_tlb_inval *tlb_inval, 84 + struct xe_dep_scheduler *dep_scheduler, u64 start, 85 + u64 end, u32 asid) 86 + { 87 + struct xe_tlb_inval_job *job; 88 + struct drm_sched_entity *entity = 89 + xe_dep_scheduler_entity(dep_scheduler); 90 + struct xe_tlb_inval_fence *ifence; 91 + int err; 92 + 93 + job = kmalloc(sizeof(*job), GFP_KERNEL); 94 + if (!job) 95 + return ERR_PTR(-ENOMEM); 96 + 97 + job->q = q; 98 + job->tlb_inval = tlb_inval; 99 + job->start = start; 100 + job->end = end; 101 + job->asid = asid; 102 + job->fence_armed = false; 103 + job->dep.ops = &dep_job_ops; 104 + kref_init(&job->refcount); 105 + xe_exec_queue_get(q); /* Pairs with put in xe_tlb_inval_job_destroy */ 106 + 107 + ifence = kmalloc(sizeof(*ifence), GFP_KERNEL); 108 + if (!ifence) { 109 + err = -ENOMEM; 110 + goto err_job; 111 + } 112 + job->fence = &ifence->base; 113 + 114 + err = drm_sched_job_init(&job->dep.drm, entity, 1, NULL, 115 + q->xef ? 
q->xef->drm->client_id : 0); 116 + if (err) 117 + goto err_fence; 118 + 119 + /* Pairs with put in xe_tlb_inval_job_destroy */ 120 + xe_pm_runtime_get_noresume(gt_to_xe(q->gt)); 121 + 122 + return job; 123 + 124 + err_fence: 125 + kfree(ifence); 126 + err_job: 127 + xe_exec_queue_put(q); 128 + kfree(job); 129 + 130 + return ERR_PTR(err); 131 + } 132 + 133 + static void xe_tlb_inval_job_destroy(struct kref *ref) 134 + { 135 + struct xe_tlb_inval_job *job = container_of(ref, typeof(*job), 136 + refcount); 137 + struct xe_tlb_inval_fence *ifence = 138 + container_of(job->fence, typeof(*ifence), base); 139 + struct xe_exec_queue *q = job->q; 140 + struct xe_device *xe = gt_to_xe(q->gt); 141 + 142 + if (!job->fence_armed) 143 + kfree(ifence); 144 + else 145 + /* Ref from xe_tlb_inval_fence_init */ 146 + dma_fence_put(job->fence); 147 + 148 + drm_sched_job_cleanup(&job->dep.drm); 149 + kfree(job); 150 + xe_exec_queue_put(q); /* Pairs with get from xe_tlb_inval_job_create */ 151 + xe_pm_runtime_put(xe); /* Pairs with get from xe_tlb_inval_job_create */ 152 + } 153 + 154 + /** 155 + * xe_tlb_inval_alloc_dep() - TLB invalidation job alloc dependency 156 + * @job: TLB invalidation job to alloc dependency for 157 + * 158 + * Allocate storage for a dependency in the TLB invalidation fence. This 159 + * function should be called at most once per job and must be paired with 160 + * xe_tlb_inval_job_push being called with a real fence. 
161 + * 162 + * Return: 0 on success, -errno on failure 163 + */ 164 + int xe_tlb_inval_job_alloc_dep(struct xe_tlb_inval_job *job) 165 + { 166 + xe_assert(gt_to_xe(job->q->gt), !xa_load(&job->dep.drm.dependencies, 0)); 167 + might_alloc(GFP_KERNEL); 168 + 169 + return drm_sched_job_add_dependency(&job->dep.drm, 170 + dma_fence_get_stub()); 171 + } 172 + 173 + /** 174 + * xe_tlb_inval_job_push() - TLB invalidation job push 175 + * @job: TLB invalidation job to push 176 + * @m: The migration object being used 177 + * @fence: Dependency for TLB invalidation job 178 + * 179 + * Pushes a TLB invalidation job for execution, using @fence as a dependency. 180 + * Storage for @fence must be preallocated with xe_tlb_inval_job_alloc_dep 181 + * prior to this call if @fence is not signaled. Takes a reference to the job’s 182 + * finished fence, which the caller is responsible for releasing, and return it 183 + * to the caller. This function is safe to be called in the path of reclaim. 184 + * 185 + * Return: Job's finished fence on success, cannot fail 186 + */ 187 + struct dma_fence *xe_tlb_inval_job_push(struct xe_tlb_inval_job *job, 188 + struct xe_migrate *m, 189 + struct dma_fence *fence) 190 + { 191 + struct xe_tlb_inval_fence *ifence = 192 + container_of(job->fence, typeof(*ifence), base); 193 + 194 + if (!dma_fence_is_signaled(fence)) { 195 + void *ptr; 196 + 197 + /* 198 + * Can be in path of reclaim, hence the preallocation of fence 199 + * storage in xe_tlb_inval_job_alloc_dep. Verify caller did 200 + * this correctly. 
201 + */ 202 + xe_assert(gt_to_xe(job->q->gt), 203 + xa_load(&job->dep.drm.dependencies, 0) == 204 + dma_fence_get_stub()); 205 + 206 + dma_fence_get(fence); /* ref released once dependency processed by scheduler */ 207 + ptr = xa_store(&job->dep.drm.dependencies, 0, fence, 208 + GFP_ATOMIC); 209 + xe_assert(gt_to_xe(job->q->gt), !xa_is_err(ptr)); 210 + } 211 + 212 + xe_tlb_inval_job_get(job); /* Pairs with put in free_job */ 213 + job->fence_armed = true; 214 + 215 + /* 216 + * We need the migration lock to protect the job's seqno and the spsc 217 + * queue, only taken on migration queue, user queues protected dma-resv 218 + * VM lock. 219 + */ 220 + xe_migrate_job_lock(m, job->q); 221 + 222 + /* Creation ref pairs with put in xe_tlb_inval_job_destroy */ 223 + xe_tlb_inval_fence_init(job->tlb_inval, ifence, false); 224 + dma_fence_get(job->fence); /* Pairs with put in DRM scheduler */ 225 + 226 + drm_sched_job_arm(&job->dep.drm); 227 + /* 228 + * caller ref, get must be done before job push as it could immediately 229 + * signal and free. 230 + */ 231 + dma_fence_get(&job->dep.drm.s_fence->finished); 232 + drm_sched_entity_push_job(&job->dep.drm); 233 + 234 + xe_migrate_job_unlock(m, job->q); 235 + 236 + /* 237 + * Not using job->fence, as it has its own dma-fence context, which does 238 + * not allow TLB invalidation fences on the same queue, GT tuple to 239 + * be squashed in dma-resv/DRM scheduler. Instead, we use the DRM scheduler 240 + * context and job's finished fence, which enables squashing. 
241 + */ 242 + return &job->dep.drm.s_fence->finished; 243 + } 244 + 245 + /** 246 + * xe_tlb_inval_job_get() - Get a reference to TLB invalidation job 247 + * @job: TLB invalidation job object 248 + * 249 + * Increment the TLB invalidation job's reference count 250 + */ 251 + void xe_tlb_inval_job_get(struct xe_tlb_inval_job *job) 252 + { 253 + kref_get(&job->refcount); 254 + } 255 + 256 + /** 257 + * xe_tlb_inval_job_put() - Put a reference to TLB invalidation job 258 + * @job: TLB invalidation job object 259 + * 260 + * Decrement the TLB invalidation job's reference count, call 261 + * xe_tlb_inval_job_destroy when reference count == 0. Skips decrement if 262 + * input @job is NULL or IS_ERR. 263 + */ 264 + void xe_tlb_inval_job_put(struct xe_tlb_inval_job *job) 265 + { 266 + if (!IS_ERR_OR_NULL(job)) 267 + kref_put(&job->refcount, xe_tlb_inval_job_destroy); 268 + }
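The reference counting described in the comments of xe_tlb_inval_job.c above (a creation reference owned by the caller, plus a reference taken in xe_tlb_inval_job_push() and dropped from the free_job callback) can be modelled in plain C. This is a hedged sketch with hypothetical names (`job_model`, `job_model_get()`, `job_model_put()`); it uses a bare counter instead of `struct kref`, and it only handles NULL, not ERR_PTR values as the real xe_tlb_inval_job_put() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xe_tlb_inval_job. */
struct job_model {
	int refcount;
	bool destroyed;	/* set where xe_tlb_inval_job_destroy() would run */
};

static void job_model_init(struct job_model *job)
{
	job->refcount = 1;	/* creation reference, released by the caller */
	job->destroyed = false;
}

static void job_model_get(struct job_model *job)
{
	assert(job->refcount > 0);
	job->refcount++;
}

/* Drops one reference; a NULL job is ignored, as in the driver's put. */
static void job_model_put(struct job_model *job)
{
	if (!job)
		return;

	assert(job->refcount > 0);
	if (--job->refcount == 0)
		job->destroyed = true;
}
```

In this model the job survives the caller dropping its creation reference after a push, and is only destroyed once the scheduler side drops the reference taken at push time.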

+33

drivers/gpu/drm/xe/xe_tlb_inval_job.h

···
1 + /* SPDX-License-Identifier: MIT */
2 + /*
3 +  * Copyright © 2025 Intel Corporation
4 +  */
5 +
6 + #ifndef _XE_TLB_INVAL_JOB_H_
7 + #define _XE_TLB_INVAL_JOB_H_
8 +
9 + #include <linux/types.h>
10 +
11 + struct dma_fence;
12 + struct xe_dep_scheduler;
13 + struct xe_exec_queue;
14 + struct xe_tlb_inval;
15 + struct xe_tlb_inval_job;
16 + struct xe_migrate;
17 +
18 + struct xe_tlb_inval_job *
19 + xe_tlb_inval_job_create(struct xe_exec_queue *q, struct xe_tlb_inval *tlb_inval,
20 +                         struct xe_dep_scheduler *dep_scheduler,
21 +                         u64 start, u64 end, u32 asid);
22 +
23 + int xe_tlb_inval_job_alloc_dep(struct xe_tlb_inval_job *job);
24 +
25 + struct dma_fence *xe_tlb_inval_job_push(struct xe_tlb_inval_job *job,
26 +                                         struct xe_migrate *m,
27 +                                         struct dma_fence *fence);
28 +
29 + void xe_tlb_inval_job_get(struct xe_tlb_inval_job *job);
30 +
31 + void xe_tlb_inval_job_put(struct xe_tlb_inval_job *job);
32 +
33 + #endif

+130

drivers/gpu/drm/xe/xe_tlb_inval_types.h

··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2023 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_TLB_INVAL_TYPES_H_ 7 + #define _XE_TLB_INVAL_TYPES_H_ 8 + 9 + #include <linux/workqueue.h> 10 + #include <linux/dma-fence.h> 11 + 12 + struct xe_tlb_inval; 13 + 14 + /** struct xe_tlb_inval_ops - TLB invalidation ops (backend) */ 15 + struct xe_tlb_inval_ops { 16 + /** 17 + * @all: Invalidate all TLBs 18 + * @tlb_inval: TLB invalidation client 19 + * @seqno: Seqno of TLB invalidation 20 + * 21 + * Return 0 on success, -ECANCELED if backend is mid-reset, error on 22 + * failure 23 + */ 24 + int (*all)(struct xe_tlb_inval *tlb_inval, u32 seqno); 25 + 26 + /** 27 + * @ggtt: Invalidate global translation TLBs 28 + * @tlb_inval: TLB invalidation client 29 + * @seqno: Seqno of TLB invalidation 30 + * 31 + * Return 0 on success, -ECANCELED if backend is mid-reset, error on 32 + * failure 33 + */ 34 + int (*ggtt)(struct xe_tlb_inval *tlb_inval, u32 seqno); 35 + 36 + /** 37 + * @ppgtt: Invalidate per-process translation TLBs 38 + * @tlb_inval: TLB invalidation client 39 + * @seqno: Seqno of TLB invalidation 40 + * @start: Start address 41 + * @end: End address 42 + * @asid: Address space ID 43 + * 44 + * Return 0 on success, -ECANCELED if backend is mid-reset, error on 45 + * failure 46 + */ 47 + int (*ppgtt)(struct xe_tlb_inval *tlb_inval, u32 seqno, u64 start, 48 + u64 end, u32 asid); 49 + 50 + /** 51 + * @initialized: Backend is initialized 52 + * @tlb_inval: TLB invalidation client 53 + * 54 + * Return: True if back is initialized, False otherwise 55 + */ 56 + bool (*initialized)(struct xe_tlb_inval *tlb_inval); 57 + 58 + /** 59 + * @flush: Flush pending TLB invalidations 60 + * @tlb_inval: TLB invalidation client 61 + */ 62 + void (*flush)(struct xe_tlb_inval *tlb_inval); 63 + 64 + /** 65 + * @timeout_delay: Timeout delay for TLB invalidation 66 + * @tlb_inval: TLB invalidation client 67 + * 68 + * Return: Timeout delay for TLB invalidation in jiffies 
69 + */ 70 + long (*timeout_delay)(struct xe_tlb_inval *tlb_inval); 71 + }; 72 + 73 + /** struct xe_tlb_inval - TLB invalidation client (frontend) */ 74 + struct xe_tlb_inval { 75 + /** @private: Backend private pointer */ 76 + void *private; 77 + /** @xe: Pointer to Xe device */ 78 + struct xe_device *xe; 79 + /** @ops: TLB invalidation ops */ 80 + const struct xe_tlb_inval_ops *ops; 81 + /** @tlb_inval.seqno: TLB invalidation seqno, protected by CT lock */ 82 + #define TLB_INVALIDATION_SEQNO_MAX 0x100000 83 + int seqno; 84 + /** @tlb_invalidation.seqno_lock: protects @tlb_invalidation.seqno */ 85 + struct mutex seqno_lock; 86 + /** 87 + * @seqno_recv: last received TLB invalidation seqno, protected by 88 + * CT lock 89 + */ 90 + int seqno_recv; 91 + /** 92 + * @pending_fences: list of pending fences waiting TLB invaliations, 93 + * protected CT lock 94 + */ 95 + struct list_head pending_fences; 96 + /** 97 + * @pending_lock: protects @pending_fences and updating @seqno_recv. 98 + */ 99 + spinlock_t pending_lock; 100 + /** 101 + * @fence_tdr: schedules a delayed call to xe_tlb_fence_timeout after 102 + * the timeout interval is over. 103 + */ 104 + struct delayed_work fence_tdr; 105 + /** @job_wq: schedules TLB invalidation jobs */ 106 + struct workqueue_struct *job_wq; 107 + /** @tlb_inval.lock: protects TLB invalidation fences */ 108 + spinlock_t lock; 109 + }; 110 + 111 + /** 112 + * struct xe_tlb_inval_fence - TLB invalidation fence 113 + * 114 + * Optionally passed to xe_tlb_inval* functions and will be signaled upon TLB 115 + * invalidation completion. 
116 + */ 117 + struct xe_tlb_inval_fence { 118 + /** @base: dma fence base */ 119 + struct dma_fence base; 120 + /** @tlb_inval: TLB invalidation client which fence belong to */ 121 + struct xe_tlb_inval *tlb_inval; 122 + /** @link: link into list of pending tlb fences */ 123 + struct list_head link; 124 + /** @seqno: seqno of TLB invalidation to signal fence one */ 125 + int seqno; 126 + /** @inval_time: time of TLB invalidation */ 127 + ktime_t inval_time; 128 + }; 129 + 130 + #endif
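The frontend/backend split in xe_tlb_inval_types.h above is an ops table: the frontend owns the seqno and calls into whichever backend is installed. The sketch below is a reduced userspace model of that dispatch and of the return-value handling in the xe_tlb_inval_issue() macro, where -ECANCELED (backend mid-reset) is reported to the caller as success. All names (`inval_model`, `inval_model_ops`, the three mock backends) are assumptions made for illustration; the mutex, the fence object and the seqno wrap are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct inval_model;

/* Reduced stand-in for struct xe_tlb_inval_ops: only the "all" hook. */
struct inval_model_ops {
	int (*all)(struct inval_model *iv, uint32_t seqno);
};

/* Reduced stand-in for struct xe_tlb_inval (frontend state). */
struct inval_model {
	const struct inval_model_ops *ops;
	int seqno;
	int signalled_on_error;	/* counts fences signalled right away */
};

/*
 * Models xe_tlb_inval_issue(): on any backend error the fence is signalled
 * immediately, because no completion will ever arrive for it, and
 * -ECANCELED is then folded into success for the caller.
 */
static int inval_model_issue_all(struct inval_model *iv)
{
	int ret;

	assert(iv->ops && iv->ops->all);

	ret = iv->ops->all(iv, (uint32_t)iv->seqno++);
	if (ret < 0)
		iv->signalled_on_error++;

	return ret == -ECANCELED ? 0 : ret;
}

/* Mock backends, purely hypothetical. */
static int backend_ok(struct inval_model *iv, uint32_t seqno)
{
	(void)iv; (void)seqno;
	return 0;
}

static int backend_mid_reset(struct inval_model *iv, uint32_t seqno)
{
	(void)iv; (void)seqno;
	return -ECANCELED;
}

static int backend_broken(struct inval_model *iv, uint32_t seqno)
{
	(void)iv; (void)seqno;
	return -EIO;
}

static const struct inval_model_ops ok_ops = { .all = backend_ok };
static const struct inval_model_ops reset_ops = { .all = backend_mid_reset };
static const struct inval_model_ops broken_ops = { .all = backend_broken };
```

Keeping the error folding in the frontend means each backend only has to report what happened; callers do not need to special-case a reset in progress.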

+12 -28

drivers/gpu/drm/xe/xe_trace.h

··· 14 14 15 15 #include "xe_exec_queue_types.h" 16 16 #include "xe_gpu_scheduler_types.h" 17 - #include "xe_gt_tlb_invalidation_types.h" 18 17 #include "xe_gt_types.h" 19 18 #include "xe_guc_exec_queue_types.h" 20 19 #include "xe_sched_job.h" 20 + #include "xe_tlb_inval_types.h" 21 21 #include "xe_vm.h" 22 22 23 23 #define __dev_name_xe(xe) dev_name((xe)->drm.dev) ··· 25 25 #define __dev_name_gt(gt) __dev_name_xe(gt_to_xe((gt))) 26 26 #define __dev_name_eq(q) __dev_name_gt((q)->gt) 27 27 28 - DECLARE_EVENT_CLASS(xe_gt_tlb_invalidation_fence, 29 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 28 + DECLARE_EVENT_CLASS(xe_tlb_inval_fence, 29 + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), 30 30 TP_ARGS(xe, fence), 31 31 32 32 TP_STRUCT__entry( 33 33 __string(dev, __dev_name_xe(xe)) 34 - __field(struct xe_gt_tlb_invalidation_fence *, fence) 34 + __field(struct xe_tlb_inval_fence *, fence) 35 35 __field(int, seqno) 36 36 ), 37 37 ··· 45 45 __get_str(dev), __entry->fence, __entry->seqno) 46 46 ); 47 47 48 - DEFINE_EVENT(xe_gt_tlb_invalidation_fence, xe_gt_tlb_invalidation_fence_create, 49 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 48 + DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_send, 49 + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), 50 50 TP_ARGS(xe, fence) 51 51 ); 52 52 53 - DEFINE_EVENT(xe_gt_tlb_invalidation_fence, 54 - xe_gt_tlb_invalidation_fence_work_func, 55 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 53 + DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_recv, 54 + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), 56 55 TP_ARGS(xe, fence) 57 56 ); 58 57 59 - DEFINE_EVENT(xe_gt_tlb_invalidation_fence, xe_gt_tlb_invalidation_fence_cb, 60 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 58 + DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_signal, 59 + TP_PROTO(struct 
xe_device *xe, struct xe_tlb_inval_fence *fence), 61 60 TP_ARGS(xe, fence) 62 61 ); 63 62 64 - DEFINE_EVENT(xe_gt_tlb_invalidation_fence, xe_gt_tlb_invalidation_fence_send, 65 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 66 - TP_ARGS(xe, fence) 67 - ); 68 - 69 - DEFINE_EVENT(xe_gt_tlb_invalidation_fence, xe_gt_tlb_invalidation_fence_recv, 70 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 71 - TP_ARGS(xe, fence) 72 - ); 73 - 74 - DEFINE_EVENT(xe_gt_tlb_invalidation_fence, xe_gt_tlb_invalidation_fence_signal, 75 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 76 - TP_ARGS(xe, fence) 77 - ); 78 - 79 - DEFINE_EVENT(xe_gt_tlb_invalidation_fence, xe_gt_tlb_invalidation_fence_timeout, 80 - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence), 63 + DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_timeout, 64 + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), 81 65 TP_ARGS(xe, fence) 82 66 ); 83 67

+7 -5

drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c

··· 25 25 #include "xe_ttm_stolen_mgr.h" 26 26 #include "xe_ttm_vram_mgr.h" 27 27 #include "xe_wa.h" 28 + #include "xe_vram.h" 28 29 29 30 struct xe_ttm_stolen_mgr { 30 31 struct xe_ttm_vram_mgr base; ··· 83 82 84 83 static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr) 85 84 { 86 - struct xe_tile *tile = xe_device_get_root_tile(xe); 85 + struct xe_vram_region *tile_vram = xe_device_get_root_tile(xe)->mem.vram; 86 + resource_size_t tile_io_start = xe_vram_region_io_start(tile_vram); 87 87 struct xe_mmio *mmio = xe_root_tile_mmio(xe); 88 88 struct pci_dev *pdev = to_pci_dev(xe->drm.dev); 89 89 u64 stolen_size, wopcm_size; 90 90 u64 tile_offset; 91 91 u64 tile_size; 92 92 93 - tile_offset = tile->mem.vram.io_start - xe->mem.vram.io_start; 94 - tile_size = tile->mem.vram.actual_physical_size; 93 + tile_offset = tile_io_start - xe_vram_region_io_start(xe->mem.vram); 94 + tile_size = xe_vram_region_actual_physical_size(tile_vram); 95 95 96 96 /* Use DSM base address instead for stolen memory */ 97 97 mgr->stolen_base = (xe_mmio_read64_2x32(mmio, DSMBASE) & BDSM_MASK) - tile_offset; ··· 109 107 110 108 /* Verify usage fits in the actual resource available */ 111 109 if (mgr->stolen_base + stolen_size <= pci_resource_len(pdev, LMEM_BAR)) 112 - mgr->io_base = tile->mem.vram.io_start + mgr->stolen_base; 110 + mgr->io_base = tile_io_start + mgr->stolen_base; 113 111 114 112 /* 115 113 * There may be few KB of platform dependent reserved memory at the end ··· 166 164 167 165 stolen_size -= wopcm_size; 168 166 169 - if (media_gt && XE_WA(media_gt, 14019821291)) { 167 + if (media_gt && XE_GT_WA(media_gt, 14019821291)) { 170 168 u64 gscpsmi_base = xe_mmio_read64_2x32(&media_gt->mmio, GSCPSMI_BASE) 171 169 & ~GENMASK_ULL(5, 0); 172 170
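The xe_ttm_stolen_mgr.c hunk above stops reading `tile->mem.vram.io_start` directly and goes through `xe_vram_region_*()` accessors on a region pointer instead. A minimal userspace sketch of that pattern follows. The struct layout and the `vram_region_*` and `tile_offset` names are hypothetical stand-ins for illustration, not the driver's actual definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct xe_vram_region; the real layout is not shown in this diff. */
struct vram_region {
	uint64_t io_start;             /* start of the CPU-visible window */
	uint64_t actual_physical_size; /* size of the backing memory */
};

/* Accessors keep callers from depending on the struct layout. */
static inline uint64_t vram_region_io_start(const struct vram_region *vram)
{
	return vram->io_start;
}

static inline uint64_t vram_region_actual_physical_size(const struct vram_region *vram)
{
	return vram->actual_physical_size;
}

/* Offset of a tile's window from the start of the device-wide window. */
static inline uint64_t tile_offset(const struct vram_region *tile_vram,
				   const struct vram_region *dev_vram)
{
	return vram_region_io_start(tile_vram) - vram_region_io_start(dev_vram);
}
```

With accessors in place, a caller such as `detect_bar2_dgfx()` only needs a region pointer, so the region can presumably be moved or allocated separately without touching every user.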

+15 -7

drivers/gpu/drm/xe/xe_ttm_vram_mgr.c

··· 15 15 #include "xe_gt.h" 16 16 #include "xe_res_cursor.h" 17 17 #include "xe_ttm_vram_mgr.h" 18 + #include "xe_vram_types.h" 18 19 19 20 static inline struct drm_buddy_block * 20 21 xe_ttm_vram_mgr_first_block(struct list_head *list) ··· 338 337 return drmm_add_action_or_reset(&xe->drm, ttm_vram_mgr_fini, mgr); 339 338 } 340 339 341 - int xe_ttm_vram_mgr_init(struct xe_tile *tile, struct xe_ttm_vram_mgr *mgr) 340 + /** 341 + * xe_ttm_vram_mgr_init - initialize TTM VRAM region 342 + * @xe: pointer to Xe device 343 + * @vram: pointer to xe_vram_region that contains the memory region attributes 344 + * 345 + * Initialize the Xe TTM for given @vram region using the given parameters. 346 + * 347 + * Returns 0 for success, negative error code otherwise. 348 + */ 349 + int xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_vram_region *vram) 342 350 { 343 - struct xe_device *xe = tile_to_xe(tile); 344 - struct xe_vram_region *vram = &tile->mem.vram; 345 - 346 - return __xe_ttm_vram_mgr_init(xe, mgr, XE_PL_VRAM0 + tile->id, 347 - vram->usable_size, vram->io_size, 351 + return __xe_ttm_vram_mgr_init(xe, &vram->ttm, vram->placement, 352 + xe_vram_region_usable_size(vram), 353 + xe_vram_region_io_size(vram), 348 354 PAGE_SIZE); 349 355 } 350 356 ··· 400 392 */ 401 393 xe_res_first(res, offset, length, &cursor); 402 394 for_each_sgtable_sg((*sgt), sg, i) { 403 - phys_addr_t phys = cursor.start + tile->mem.vram.io_start; 395 + phys_addr_t phys = cursor.start + xe_vram_region_io_start(tile->mem.vram); 404 396 size_t size = min_t(u64, cursor.size, SZ_2G); 405 397 dma_addr_t addr; 406 398

+2 -1

drivers/gpu/drm/xe/xe_ttm_vram_mgr.h

··· 11 11 enum dma_data_direction; 12 12 struct xe_device; 13 13 struct xe_tile; 14 + struct xe_vram_region; 14 15 15 16 int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr, 16 17 u32 mem_type, u64 size, u64 io_size, 17 18 u64 default_page_size); 18 - int xe_ttm_vram_mgr_init(struct xe_tile *tile, struct xe_ttm_vram_mgr *mgr); 19 + int xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_vram_region *vram); 19 20 int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe, 20 21 struct ttm_resource *res, 21 22 u64 offset, u64 length,

+1 -1

drivers/gpu/drm/xe/xe_tuning.c

··· 99 99 XE_RTP_ACTIONS(SET(SAMPLER_MODE, INDIRECT_STATE_BASE_ADDR_OVERRIDE)) 100 100 }, 101 101 { XE_RTP_NAME("Tuning: Disable NULL query for Anyhit Shader"), 102 - XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, XE_RTP_END_VERSION_UNDEFINED), 102 + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2000, XE_RTP_END_VERSION_UNDEFINED), 103 103 FUNC(xe_rtp_match_first_render_or_compute)), 104 104 XE_RTP_ACTIONS(SET(RT_CTRL, DIS_NULL_QUERY)) 105 105 },
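The one-line xe_tuning.c change above lowers the start of the version range from 3000 to 2000, so the "Disable NULL query for Anyhit Shader" tuning also matches graphics versions in the 20.xx series. The sketch below models the range check under two assumptions that the diff does not confirm: versions are encoded as major * 100 + minor, and an undefined end version behaves as an open upper bound (modeled here as `UINT32_MAX`). The names `version_range` and `version_in_range` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: an undefined end version means "no upper bound". */
#define END_VERSION_UNDEFINED UINT32_MAX

/* Hypothetical model of a GRAPHICS_VERSION_RANGE(start, end) rule. */
struct version_range {
	uint32_t start;
	uint32_t end;
};

/* True when verx100 lies inside the inclusive range [start, end]. */
static inline bool version_in_range(uint32_t verx100, struct version_range r)
{
	return verx100 >= r.start && verx100 <= r.end;
}
```

Under these assumptions a device reporting 2004 is skipped by the old rule `{ 3000, END_VERSION_UNDEFINED }` and matched by the new rule `{ 2000, END_VERSION_UNDEFINED }`, while a device reporting 3000 matches both.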

+501 -101

drivers/gpu/drm/xe/xe_vm.c

··· 28 28 #include "xe_drm_client.h" 29 29 #include "xe_exec_queue.h" 30 30 #include "xe_gt_pagefault.h" 31 - #include "xe_gt_tlb_invalidation.h" 32 31 #include "xe_migrate.h" 33 32 #include "xe_pat.h" 34 33 #include "xe_pm.h" ··· 37 38 #include "xe_res_cursor.h" 38 39 #include "xe_svm.h" 39 40 #include "xe_sync.h" 41 + #include "xe_tile.h" 42 + #include "xe_tlb_inval.h" 40 43 #include "xe_trace_bo.h" 41 44 #include "xe_wa.h" 42 45 #include "xe_hmm.h" ··· 954 953 for_each_tile(tile, vm->xe, id) { 955 954 vops.pt_update_ops[id].wait_vm_bookkeep = true; 956 955 vops.pt_update_ops[tile->id].q = 957 - xe_tile_migrate_exec_queue(tile); 956 + xe_migrate_exec_queue(tile->migrate); 958 957 } 959 958 960 959 err = xe_vm_ops_add_rebind(&vops, vma, tile_mask); ··· 1044 1043 for_each_tile(tile, vm->xe, id) { 1045 1044 vops.pt_update_ops[id].wait_vm_bookkeep = true; 1046 1045 vops.pt_update_ops[tile->id].q = 1047 - xe_tile_migrate_exec_queue(tile); 1046 + xe_migrate_exec_queue(tile->migrate); 1048 1047 } 1049 1048 1050 1049 err = xe_vm_ops_add_range_rebind(&vops, vma, range, tile_mask); ··· 1127 1126 for_each_tile(tile, vm->xe, id) { 1128 1127 vops.pt_update_ops[id].wait_vm_bookkeep = true; 1129 1128 vops.pt_update_ops[tile->id].q = 1130 - xe_tile_migrate_exec_queue(tile); 1129 + xe_migrate_exec_queue(tile->migrate); 1131 1130 } 1132 1131 1133 1132 err = xe_vm_ops_add_range_unbind(&vops, range); ··· 1169 1168 struct xe_bo *bo, 1170 1169 u64 bo_offset_or_userptr, 1171 1170 u64 start, u64 end, 1172 - u16 pat_index, unsigned int flags) 1171 + struct xe_vma_mem_attr *attr, 1172 + unsigned int flags) 1173 1173 { 1174 1174 struct xe_vma *vma; 1175 1175 struct xe_tile *tile; ··· 1225 1223 if (vm->xe->info.has_atomic_enable_pte_bit) 1226 1224 vma->gpuva.flags |= XE_VMA_ATOMIC_PTE_BIT; 1227 1225 1228 - vma->pat_index = pat_index; 1226 + vma->attr = *attr; 1229 1227 1230 1228 if (bo) { 1231 1229 struct drm_gpuvm_bo *vm_bo; ··· 1514 1512 return 0; 1515 1513 } 1516 1514 1517 - static u64 
xelp_pde_encode_bo(struct xe_bo *bo, u64 bo_offset, 1518 - const u16 pat_index) 1515 + static u16 pde_pat_index(struct xe_bo *bo) 1516 + { 1517 + struct xe_device *xe = xe_bo_device(bo); 1518 + u16 pat_index; 1519 + 1520 + /* 1521 + * We only have two bits to encode the PAT index in non-leaf nodes, but 1522 + * these only point to other paging structures so we only need a minimal 1523 + * selection of options. The user PAT index is only for encoding leaf 1524 + * nodes, where we have use of more bits to do the encoding. The 1525 + * non-leaf nodes are instead under driver control so the chosen index 1526 + * here should be distict from the user PAT index. Also the 1527 + * corresponding coherency of the PAT index should be tied to the 1528 + * allocation type of the page table (or at least we should pick 1529 + * something which is always safe). 1530 + */ 1531 + if (!xe_bo_is_vram(bo) && bo->ttm.ttm->caching == ttm_cached) 1532 + pat_index = xe->pat.idx[XE_CACHE_WB]; 1533 + else 1534 + pat_index = xe->pat.idx[XE_CACHE_NONE]; 1535 + 1536 + xe_assert(xe, pat_index <= 3); 1537 + 1538 + return pat_index; 1539 + } 1540 + 1541 + static u64 xelp_pde_encode_bo(struct xe_bo *bo, u64 bo_offset) 1519 1542 { 1520 1543 u64 pde; 1521 1544 1522 1545 pde = xe_bo_addr(bo, bo_offset, XE_PAGE_SIZE); 1523 1546 pde |= XE_PAGE_PRESENT | XE_PAGE_RW; 1524 - pde |= pde_encode_pat_index(pat_index); 1547 + pde |= pde_encode_pat_index(pde_pat_index(bo)); 1525 1548 1526 1549 return pde; 1527 1550 } ··· 1637 1610 1638 1611 for (i = MAX_HUGEPTE_LEVEL; i < vm->pt_root[id]->level; i++) { 1639 1612 vm->scratch_pt[id][i] = xe_pt_create(vm, tile, i); 1640 - if (IS_ERR(vm->scratch_pt[id][i])) 1641 - return PTR_ERR(vm->scratch_pt[id][i]); 1613 + if (IS_ERR(vm->scratch_pt[id][i])) { 1614 + int err = PTR_ERR(vm->scratch_pt[id][i]); 1615 + 1616 + vm->scratch_pt[id][i] = NULL; 1617 + return err; 1618 + } 1642 1619 1643 1620 xe_pt_populate_empty(tile, vm, vm->scratch_pt[id][i]); 1644 1621 } ··· 1671 1640 } 
1672 1641 } 1673 1642 1674 - struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) 1643 + struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags, struct xe_file *xef) 1675 1644 { 1676 1645 struct drm_gem_object *vm_resv_obj; 1677 1646 struct xe_vm *vm; ··· 1692 1661 vm->xe = xe; 1693 1662 1694 1663 vm->size = 1ull << xe->info.va_bits; 1695 - 1696 1664 vm->flags = flags; 1697 1665 1666 + if (xef) 1667 + vm->xef = xe_file_get(xef); 1698 1668 /** 1699 1669 * GSC VMs are kernel-owned, only used for PXP ops and can sometimes be 1700 1670 * manipulated under the PXP mutex. However, the PXP mutex can be taken ··· 1826 1794 if (number_tiles > 1) 1827 1795 vm->composite_fence_ctx = dma_fence_context_alloc(1); 1828 1796 1797 + if (xef && xe->info.has_asid) { 1798 + u32 asid; 1799 + 1800 + down_write(&xe->usm.lock); 1801 + err = xa_alloc_cyclic(&xe->usm.asid_to_vm, &asid, vm, 1802 + XA_LIMIT(1, XE_MAX_ASID - 1), 1803 + &xe->usm.next_asid, GFP_KERNEL); 1804 + up_write(&xe->usm.lock); 1805 + if (err < 0) 1806 + goto err_unlock_close; 1807 + 1808 + vm->usm.asid = asid; 1809 + } 1810 + 1829 1811 trace_xe_vm_create(vm); 1830 1812 1831 1813 return vm; ··· 1860 1814 for_each_tile(tile, xe, id) 1861 1815 xe_range_fence_tree_fini(&vm->rftree[id]); 1862 1816 ttm_lru_bulk_move_fini(&xe->ttm, &vm->lru_bulk_move); 1817 + if (vm->xef) 1818 + xe_file_put(vm->xef); 1863 1819 kfree(vm); 1864 1820 if (flags & XE_VM_FLAG_LR_MODE) 1865 1821 xe_pm_runtime_put(xe); ··· 1898 1850 xe_pt_clear(xe, vm->pt_root[id]); 1899 1851 1900 1852 for_each_gt(gt, xe, id) 1901 - xe_gt_tlb_invalidation_vm(gt, vm); 1853 + xe_tlb_inval_vm(&gt->tlb_inval, vm); 1902 1854 } 1903 1855 } 1904 1856 ··· 2072 2024 2073 2025 u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile) 2074 2026 { 2075 - return vm->pt_ops->pde_encode_bo(vm->pt_root[tile->id]->bo, 0, 2076 - tile_to_xe(tile)->pat.idx[XE_CACHE_WB]); 2027 + return vm->pt_ops->pde_encode_bo(vm->pt_root[tile->id]->bo, 0); 2077 2028 } 2078 2029 2079 
2030 static struct xe_exec_queue * ··· 2106 2059 struct xe_device *xe = to_xe_device(dev); 2107 2060 struct xe_file *xef = to_xe_file(file); 2108 2061 struct drm_xe_vm_create *args = data; 2109 - struct xe_tile *tile; 2110 2062 struct xe_vm *vm; 2111 - u32 id, asid; 2063 + u32 id; 2112 2064 int err; 2113 2065 u32 flags = 0; 2114 2066 2115 2067 if (XE_IOCTL_DBG(xe, args->extensions)) 2116 2068 return -EINVAL; 2117 2069 2118 - if (XE_WA(xe_root_mmio_gt(xe), 14016763929)) 2070 + if (XE_GT_WA(xe_root_mmio_gt(xe), 14016763929)) 2119 2071 args->flags |= DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE; 2120 2072 2121 2073 if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE && ··· 2143 2097 if (args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE) 2144 2098 flags |= XE_VM_FLAG_FAULT_MODE; 2145 2099 2146 - vm = xe_vm_create(xe, flags); 2100 + vm = xe_vm_create(xe, flags, xef); 2147 2101 if (IS_ERR(vm)) 2148 2102 return PTR_ERR(vm); 2149 - 2150 - if (xe->info.has_asid) { 2151 - down_write(&xe->usm.lock); 2152 - err = xa_alloc_cyclic(&xe->usm.asid_to_vm, &asid, vm, 2153 - XA_LIMIT(1, XE_MAX_ASID - 1), 2154 - &xe->usm.next_asid, GFP_KERNEL); 2155 - up_write(&xe->usm.lock); 2156 - if (err < 0) 2157 - goto err_close_and_put; 2158 - 2159 - vm->usm.asid = asid; 2160 - } 2161 - 2162 - vm->xef = xe_file_get(xef); 2163 - 2164 - /* Record BO memory for VM pagetable created against client */ 2165 - for_each_tile(tile, xe, id) 2166 - if (vm->pt_root[id]) 2167 - xe_drm_client_add_bo(vm->xef->client, vm->pt_root[id]->bo); 2168 2103 2169 2104 #if IS_ENABLED(CONFIG_DRM_XE_DEBUG_MEM) 2170 2105 /* Warning: Security issue - never enable by default */ ··· 2193 2166 if (!err) 2194 2167 xe_vm_close_and_put(vm); 2195 2168 2169 + return err; 2170 + } 2171 + 2172 + static int xe_vm_query_vmas(struct xe_vm *vm, u64 start, u64 end) 2173 + { 2174 + struct drm_gpuva *gpuva; 2175 + u32 num_vmas = 0; 2176 + 2177 + lockdep_assert_held(&vm->lock); 2178 + drm_gpuvm_for_each_va_range(gpuva, &vm->gpuvm, start, 
end) 2179 + num_vmas++; 2180 + 2181 + return num_vmas; 2182 + } 2183 + 2184 + static int get_mem_attrs(struct xe_vm *vm, u32 *num_vmas, u64 start, 2185 + u64 end, struct drm_xe_mem_range_attr *attrs) 2186 + { 2187 + struct drm_gpuva *gpuva; 2188 + int i = 0; 2189 + 2190 + lockdep_assert_held(&vm->lock); 2191 + 2192 + drm_gpuvm_for_each_va_range(gpuva, &vm->gpuvm, start, end) { 2193 + struct xe_vma *vma = gpuva_to_vma(gpuva); 2194 + 2195 + if (i == *num_vmas) 2196 + return -ENOSPC; 2197 + 2198 + attrs[i].start = xe_vma_start(vma); 2199 + attrs[i].end = xe_vma_end(vma); 2200 + attrs[i].atomic.val = vma->attr.atomic_access; 2201 + attrs[i].pat_index.val = vma->attr.pat_index; 2202 + attrs[i].preferred_mem_loc.devmem_fd = vma->attr.preferred_loc.devmem_fd; 2203 + attrs[i].preferred_mem_loc.migration_policy = 2204 + vma->attr.preferred_loc.migration_policy; 2205 + 2206 + i++; 2207 + } 2208 + 2209 + *num_vmas = i; 2210 + return 0; 2211 + } 2212 + 2213 + int xe_vm_query_vmas_attrs_ioctl(struct drm_device *dev, void *data, struct drm_file *file) 2214 + { 2215 + struct xe_device *xe = to_xe_device(dev); 2216 + struct xe_file *xef = to_xe_file(file); 2217 + struct drm_xe_mem_range_attr *mem_attrs; 2218 + struct drm_xe_vm_query_mem_range_attr *args = data; 2219 + u64 __user *attrs_user = u64_to_user_ptr(args->vector_of_mem_attr); 2220 + struct xe_vm *vm; 2221 + int err = 0; 2222 + 2223 + if (XE_IOCTL_DBG(xe, 2224 + ((args->num_mem_ranges == 0 && 2225 + (attrs_user || args->sizeof_mem_range_attr != 0)) || 2226 + (args->num_mem_ranges > 0 && 2227 + (!attrs_user || 2228 + args->sizeof_mem_range_attr != 2229 + sizeof(struct drm_xe_mem_range_attr)))))) 2230 + return -EINVAL; 2231 + 2232 + vm = xe_vm_lookup(xef, args->vm_id); 2233 + if (XE_IOCTL_DBG(xe, !vm)) 2234 + return -EINVAL; 2235 + 2236 + err = down_read_interruptible(&vm->lock); 2237 + if (err) 2238 + goto put_vm; 2239 + 2240 + attrs_user = u64_to_user_ptr(args->vector_of_mem_attr); 2241 + 2242 + if (args->num_mem_ranges == 
0 && !attrs_user) { 2243 + args->num_mem_ranges = xe_vm_query_vmas(vm, args->start, args->start + args->range); 2244 + args->sizeof_mem_range_attr = sizeof(struct drm_xe_mem_range_attr); 2245 + goto unlock_vm; 2246 + } 2247 + 2248 + mem_attrs = kvmalloc_array(args->num_mem_ranges, args->sizeof_mem_range_attr, 2249 + GFP_KERNEL | __GFP_ACCOUNT | 2250 + __GFP_RETRY_MAYFAIL | __GFP_NOWARN); 2251 + if (!mem_attrs) { 2252 + err = args->num_mem_ranges > 1 ? -ENOBUFS : -ENOMEM; 2253 + goto unlock_vm; 2254 + } 2255 + 2256 + memset(mem_attrs, 0, args->num_mem_ranges * args->sizeof_mem_range_attr); 2257 + err = get_mem_attrs(vm, &args->num_mem_ranges, args->start, 2258 + args->start + args->range, mem_attrs); 2259 + if (err) 2260 + goto free_mem_attrs; 2261 + 2262 + err = copy_to_user(attrs_user, mem_attrs, 2263 + args->sizeof_mem_range_attr * args->num_mem_ranges); 2264 + 2265 + free_mem_attrs: 2266 + kvfree(mem_attrs); 2267 + unlock_vm: 2268 + up_read(&vm->lock); 2269 + put_vm: 2270 + xe_vm_put(vm); 2196 2271 return err; 2197 2272 } 2198 2273 ··· 2503 2374 __xe_vm_needs_clear_scratch_pages(vm, flags); 2504 2375 } else if (__op->op == DRM_GPUVA_OP_PREFETCH) { 2505 2376 struct xe_vma *vma = gpuva_to_vma(op->base.prefetch.va); 2377 + struct xe_tile *tile; 2506 2378 struct xe_svm_range *svm_range; 2507 2379 struct drm_gpusvm_ctx ctx = {}; 2508 - struct xe_tile *tile; 2380 + struct drm_pagemap *dpagemap; 2509 2381 u8 id, tile_mask = 0; 2510 2382 u32 i; 2511 2383 ··· 2523 2393 tile_mask |= 0x1 << id; 2524 2394 2525 2395 xa_init_flags(&op->prefetch_range.range, XA_FLAGS_ALLOC); 2526 - op->prefetch_range.region = prefetch_region; 2527 2396 op->prefetch_range.ranges_count = 0; 2397 + tile = NULL; 2398 + 2399 + if (prefetch_region == DRM_XE_CONSULT_MEM_ADVISE_PREF_LOC) { 2400 + dpagemap = xe_vma_resolve_pagemap(vma, 2401 + xe_device_get_root_tile(vm->xe)); 2402 + /* 2403 + * TODO: Once multigpu support is enabled will need 2404 + * something to dereference tile from dpagemap. 
2405 + */ 2406 + if (dpagemap) 2407 + tile = xe_device_get_root_tile(vm->xe); 2408 + } else if (prefetch_region) { 2409 + tile = &vm->xe->tiles[region_to_mem_type[prefetch_region] - 2410 + XE_PL_VRAM0]; 2411 + } 2412 + 2413 + op->prefetch_range.tile = tile; 2528 2414 alloc_next_range: 2529 2415 svm_range = xe_svm_range_find_or_insert(vm, addr, vma, &ctx); 2530 2416 ··· 2559 2413 goto unwind_prefetch_ops; 2560 2414 } 2561 2415 2562 - if (xe_svm_range_validate(vm, svm_range, tile_mask, !!prefetch_region)) { 2416 + if (xe_svm_range_validate(vm, svm_range, tile_mask, !!tile)) { 2563 2417 xe_svm_range_debug(svm_range, "PREFETCH - RANGE IS VALID"); 2564 2418 goto check_next_range; 2565 2419 } ··· 2596 2450 ALLOW_ERROR_INJECTION(vm_bind_ioctl_ops_create, ERRNO); 2597 2451 2598 2452 static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op, 2599 - u16 pat_index, unsigned int flags) 2453 + struct xe_vma_mem_attr *attr, unsigned int flags) 2600 2454 { 2601 2455 struct xe_bo *bo = op->gem.obj ? gem_to_xe_bo(op->gem.obj) : NULL; 2602 2456 struct drm_exec exec; ··· 2625 2479 } 2626 2480 vma = xe_vma_create(vm, bo, op->gem.offset, 2627 2481 op->va.addr, op->va.addr + 2628 - op->va.range - 1, pat_index, flags); 2482 + op->va.range - 1, attr, flags); 2629 2483 if (IS_ERR(vma)) 2630 2484 goto err_unlock; 2631 2485 ··· 2742 2596 return err; 2743 2597 } 2744 2598 2599 + /** 2600 + * xe_vma_has_default_mem_attrs - Check if a VMA has default memory attributes 2601 + * @vma: Pointer to the xe_vma structure to check 2602 + * 2603 + * This function determines whether the given VMA (Virtual Memory Area) 2604 + * has its memory attributes set to their default values. 
Specifically, 2605 + * it checks the following conditions: 2606 + * 2607 + * - `atomic_access` is `DRM_XE_VMA_ATOMIC_UNDEFINED` 2608 + * - `pat_index` is equal to `default_pat_index` 2609 + * - `preferred_loc.devmem_fd` is `DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE` 2610 + * - `preferred_loc.migration_policy` is `DRM_XE_MIGRATE_ALL_PAGES` 2611 + * 2612 + * Return: true if all attributes are at their default values, false otherwise. 2613 + */ 2614 + bool xe_vma_has_default_mem_attrs(struct xe_vma *vma) 2615 + { 2616 + return (vma->attr.atomic_access == DRM_XE_ATOMIC_UNDEFINED && 2617 + vma->attr.pat_index == vma->attr.default_pat_index && 2618 + vma->attr.preferred_loc.devmem_fd == DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE && 2619 + vma->attr.preferred_loc.migration_policy == DRM_XE_MIGRATE_ALL_PAGES); 2620 + } 2621 + 2745 2622 static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops, 2746 2623 struct xe_vma_ops *vops) 2747 2624 { ··· 2791 2622 switch (op->base.op) { 2792 2623 case DRM_GPUVA_OP_MAP: 2793 2624 { 2625 + struct xe_vma_mem_attr default_attr = { 2626 + .preferred_loc = { 2627 + .devmem_fd = DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE, 2628 + .migration_policy = DRM_XE_MIGRATE_ALL_PAGES, 2629 + }, 2630 + .atomic_access = DRM_XE_ATOMIC_UNDEFINED, 2631 + .default_pat_index = op->map.pat_index, 2632 + .pat_index = op->map.pat_index, 2633 + }; 2634 + 2794 2635 flags |= op->map.read_only ? 2795 2636 VMA_CREATE_FLAG_READ_ONLY : 0; 2796 2637 flags |= op->map.is_null ? ··· 2810 2631 flags |= op->map.is_cpu_addr_mirror ? 
2811 2632 VMA_CREATE_FLAG_IS_SYSTEM_ALLOCATOR : 0; 2812 2633 2813 - vma = new_vma(vm, &op->base.map, op->map.pat_index, 2634 + vma = new_vma(vm, &op->base.map, &default_attr, 2814 2635 flags); 2815 2636 if (IS_ERR(vma)) 2816 2637 return PTR_ERR(vma); ··· 2838 2659 end = op->base.remap.next->va.addr; 2839 2660 2840 2661 if (xe_vma_is_cpu_addr_mirror(old) && 2841 - xe_svm_has_mapping(vm, start, end)) 2842 - return -EBUSY; 2662 + xe_svm_has_mapping(vm, start, end)) { 2663 + if (vops->flags & XE_VMA_OPS_FLAG_MADVISE) 2664 + xe_svm_unmap_address_range(vm, start, end); 2665 + else 2666 + return -EBUSY; 2667 + } 2843 2668 2844 2669 op->remap.start = xe_vma_start(old); 2845 2670 op->remap.range = xe_vma_size(old); ··· 2862 2679 2863 2680 if (op->base.remap.prev) { 2864 2681 vma = new_vma(vm, op->base.remap.prev, 2865 - old->pat_index, flags); 2682 + &old->attr, flags); 2866 2683 if (IS_ERR(vma)) 2867 2684 return PTR_ERR(vma); 2868 2685 ··· 2892 2709 2893 2710 if (op->base.remap.next) { 2894 2711 vma = new_vma(vm, op->base.remap.next, 2895 - old->pat_index, flags); 2712 + &old->attr, flags); 2896 2713 if (IS_ERR(vma)) 2897 2714 return PTR_ERR(vma); 2898 2715 ··· 3079 2896 { 3080 2897 bool devmem_possible = IS_DGFX(vm->xe) && IS_ENABLED(CONFIG_DRM_XE_PAGEMAP); 3081 2898 struct xe_vma *vma = gpuva_to_vma(op->base.prefetch.va); 2899 + struct xe_tile *tile = op->prefetch_range.tile; 3082 2900 int err = 0; 3083 2901 3084 2902 struct xe_svm_range *svm_range; 3085 2903 struct drm_gpusvm_ctx ctx = {}; 3086 - struct xe_tile *tile; 3087 2904 unsigned long i; 3088 - u32 region; 3089 2905 3090 2906 if (!xe_vma_is_cpu_addr_mirror(vma)) 3091 2907 return 0; 3092 - 3093 - region = op->prefetch_range.region; 3094 2908 3095 2909 ctx.read_only = xe_vma_read_only(vma); 3096 2910 ctx.devmem_possible = devmem_possible; ··· 3095 2915 3096 2916 /* TODO: Threading the migration */ 3097 2917 xa_for_each(&op->prefetch_range.range, i, svm_range) { 3098 - if (!region) 2918 + if (!tile) 3099 2919 
xe_svm_range_migrate_to_smem(vm, svm_range); 3100 2920 3101 - if (xe_svm_range_needs_migrate_to_vram(svm_range, vma, region)) { 3102 - tile = &vm->xe->tiles[region_to_mem_type[region] - XE_PL_VRAM0]; 2921 + if (xe_svm_range_needs_migrate_to_vram(svm_range, vma, !!tile)) { 3103 2922 err = xe_svm_alloc_vram(tile, svm_range, &ctx); 3104 2923 if (err) { 3105 2924 drm_dbg(&vm->xe->drm, "VRAM allocation failed, retry from userspace, asid=%u, gpusvm=%p, errno=%pe\n", ··· 3161 2982 struct xe_vma *vma = gpuva_to_vma(op->base.prefetch.va); 3162 2983 u32 region; 3163 2984 3164 - if (xe_vma_is_cpu_addr_mirror(vma)) 3165 - region = op->prefetch_range.region; 3166 - else 2985 + if (!xe_vma_is_cpu_addr_mirror(vma)) { 3167 2986 region = op->prefetch.region; 3168 - 3169 - xe_assert(vm->xe, region <= ARRAY_SIZE(region_to_mem_type)); 2987 + xe_assert(vm->xe, region == DRM_XE_CONSULT_MEM_ADVISE_PREF_LOC || 2988 + region <= ARRAY_SIZE(region_to_mem_type)); 2989 + } 3170 2990 3171 2991 err = vma_lock_and_validate(exec, 3172 2992 gpuva_to_vma(op->base.prefetch.va), ··· 3583 3405 op == DRM_XE_VM_BIND_OP_PREFETCH) || 3584 3406 XE_IOCTL_DBG(xe, prefetch_region && 3585 3407 op != DRM_XE_VM_BIND_OP_PREFETCH) || 3586 - XE_IOCTL_DBG(xe, !(BIT(prefetch_region) & 3587 - xe->info.mem_region_mask)) || 3408 + XE_IOCTL_DBG(xe, (prefetch_region != DRM_XE_CONSULT_MEM_ADVISE_PREF_LOC && 3409 + !(BIT(prefetch_region) & xe->info.mem_region_mask))) || 3588 3410 XE_IOCTL_DBG(xe, obj && 3589 3411 op == DRM_XE_VM_BIND_OP_UNMAP)) { 3590 3412 err = -EINVAL; ··· 3606 3428 free_bind_ops: 3607 3429 if (args->num_binds > 1) 3608 3430 kvfree(*bind_ops); 3431 + *bind_ops = NULL; 3609 3432 return err; 3610 3433 } 3611 3434 ··· 3713 3534 struct xe_exec_queue *q = NULL; 3714 3535 u32 num_syncs, num_ufence = 0; 3715 3536 struct xe_sync_entry *syncs = NULL; 3716 - struct drm_xe_vm_bind_op *bind_ops; 3537 + struct drm_xe_vm_bind_op *bind_ops = NULL; 3717 3538 struct xe_vma_ops vops; 3718 3539 struct dma_fence *fence; 3719 
3540 int err; ··· 3731 3552 q = xe_exec_queue_lookup(xef, args->exec_queue_id); 3732 3553 if (XE_IOCTL_DBG(xe, !q)) { 3733 3554 err = -ENOENT; 3734 - goto put_vm; 3555 + goto free_bind_ops; 3735 3556 } 3736 3557 3737 3558 if (XE_IOCTL_DBG(xe, !(q->flags & EXEC_QUEUE_FLAG_VM))) { ··· 3777 3598 __GFP_RETRY_MAYFAIL | __GFP_NOWARN); 3778 3599 if (!ops) { 3779 3600 err = -ENOMEM; 3780 - goto release_vm_lock; 3601 + goto free_bos; 3781 3602 } 3782 3603 } 3783 3604 ··· 3911 3732 put_obj: 3912 3733 for (i = 0; i < args->num_binds; ++i) 3913 3734 xe_bo_put(bos[i]); 3735 + 3736 + kvfree(ops); 3737 + free_bos: 3738 + kvfree(bos); 3914 3739 release_vm_lock: 3915 3740 up_write(&vm->lock); 3916 3741 put_exec_queue: 3917 3742 if (q) 3918 3743 xe_exec_queue_put(q); 3919 - put_vm: 3920 - xe_vm_put(vm); 3921 - kvfree(bos); 3922 - kvfree(ops); 3744 + free_bind_ops: 3923 3745 if (args->num_binds > 1) 3924 3746 kvfree(bind_ops); 3747 + put_vm: 3748 + xe_vm_put(vm); 3925 3749 return err; 3926 3750 } 3927 3751 ··· 4032 3850 } 4033 3851 4034 3852 /** 4035 - * xe_vm_range_tilemask_tlb_invalidation - Issue a TLB invalidation on this tilemask for an 3853 + * xe_vm_range_tilemask_tlb_inval - Issue a TLB invalidation on this tilemask for an 4036 3854 * address range 4037 3855 * @vm: The VM 4038 3856 * @start: start address ··· 4043 3861 * 4044 3862 * Returns 0 for success, negative error code otherwise. 
4045 3863 */ 4046 - int xe_vm_range_tilemask_tlb_invalidation(struct xe_vm *vm, u64 start, 4047 - u64 end, u8 tile_mask) 3864 + int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start, 3865 + u64 end, u8 tile_mask) 4048 3866 { 4049 - struct xe_gt_tlb_invalidation_fence fence[XE_MAX_TILES_PER_DEVICE * XE_MAX_GT_PER_TILE]; 3867 + struct xe_tlb_inval_fence 3868 + fence[XE_MAX_TILES_PER_DEVICE * XE_MAX_GT_PER_TILE]; 4050 3869 struct xe_tile *tile; 4051 3870 u32 fence_id = 0; 4052 3871 u8 id; ··· 4057 3874 return 0; 4058 3875 4059 3876 for_each_tile(tile, vm->xe, id) { 4060 - if (tile_mask & BIT(id)) { 4061 - xe_gt_tlb_invalidation_fence_init(tile->primary_gt, 4062 - &fence[fence_id], true); 3877 + if (!(tile_mask & BIT(id))) 3878 + continue; 4063 3879 4064 - err = xe_gt_tlb_invalidation_range(tile->primary_gt, 4065 - &fence[fence_id], 4066 - start, 4067 - end, 4068 - vm->usm.asid); 4069 - if (err) 4070 - goto wait; 4071 - ++fence_id; 3880 + xe_tlb_inval_fence_init(&tile->primary_gt->tlb_inval, 3881 + &fence[fence_id], true); 4072 3882 4073 - if (!tile->media_gt) 4074 - continue; 3883 + err = xe_tlb_inval_range(&tile->primary_gt->tlb_inval, 3884 + &fence[fence_id], start, end, 3885 + vm->usm.asid); 3886 + if (err) 3887 + goto wait; 3888 + ++fence_id; 4075 3889 4076 - xe_gt_tlb_invalidation_fence_init(tile->media_gt, 4077 - &fence[fence_id], true); 3890 + if (!tile->media_gt) 3891 + continue; 4078 3892 4079 - err = xe_gt_tlb_invalidation_range(tile->media_gt, 4080 - &fence[fence_id], 4081 - start, 4082 - end, 4083 - vm->usm.asid); 4084 - if (err) 4085 - goto wait; 4086 - ++fence_id; 4087 - } 3893 + xe_tlb_inval_fence_init(&tile->media_gt->tlb_inval, 3894 + &fence[fence_id], true); 3895 + 3896 + err = xe_tlb_inval_range(&tile->media_gt->tlb_inval, 3897 + &fence[fence_id], start, end, 3898 + vm->usm.asid); 3899 + if (err) 3900 + goto wait; 3901 + ++fence_id; 4088 3902 } 4089 3903 4090 3904 wait: 4091 3905 for (id = 0; id < fence_id; ++id) 4092 - 
xe_gt_tlb_invalidation_fence_wait(&fence[id]); 3906 + xe_tlb_inval_fence_wait(&fence[id]); 4093 3907 4094 3908 return err; 4095 3909 } ··· 4145 3965 4146 3966 xe_device_wmb(xe); 4147 3967 4148 - ret = xe_vm_range_tilemask_tlb_invalidation(xe_vma_vm(vma), xe_vma_start(vma), 4149 - xe_vma_end(vma), tile_mask); 3968 + ret = xe_vm_range_tilemask_tlb_inval(xe_vma_vm(vma), xe_vma_start(vma), 3969 + xe_vma_end(vma), tile_mask); 4150 3970 4151 3971 /* WRITE_ONCE pairs with READ_ONCE in xe_vm_has_valid_gpu_mapping() */ 4152 3972 WRITE_ONCE(vma->tile_invalidated, vma->tile_mask); ··· 4347 4167 mmput(snap->snap[i].mm); 4348 4168 } 4349 4169 kvfree(snap); 4170 + } 4171 + 4172 + /** 4173 + * xe_vma_need_vram_for_atomic - Check if VMA needs VRAM migration for atomic operations 4174 + * @xe: Pointer to the XE device structure 4175 + * @vma: Pointer to the virtual memory area (VMA) structure 4176 + * @is_atomic: In pagefault path and atomic operation 4177 + * 4178 + * This function determines whether the given VMA needs to be migrated to 4179 + * VRAM in order to do atomic GPU operation. 4180 + * 4181 + * Return: 4182 + * 1 - Migration to VRAM is required 4183 + * 0 - Migration is not required 4184 + * -EACCES - Invalid access for atomic memory attr 4185 + * 4186 + */ 4187 + int xe_vma_need_vram_for_atomic(struct xe_device *xe, struct xe_vma *vma, bool is_atomic) 4188 + { 4189 + u32 atomic_access = xe_vma_bo(vma) ? xe_vma_bo(vma)->attr.atomic_access : 4190 + vma->attr.atomic_access; 4191 + 4192 + if (!IS_DGFX(xe) || !is_atomic) 4193 + return false; 4194 + 4195 + /* 4196 + * NOTE: The checks implemented here are platform-specific. For 4197 + * instance, on a device supporting CXL atomics, these would ideally 4198 + * work universally without additional handling. 
4199 + */ 4200 + switch (atomic_access) { 4201 + case DRM_XE_ATOMIC_DEVICE: 4202 + return !xe->info.has_device_atomics_on_smem; 4203 + 4204 + case DRM_XE_ATOMIC_CPU: 4205 + return -EACCES; 4206 + 4207 + case DRM_XE_ATOMIC_UNDEFINED: 4208 + case DRM_XE_ATOMIC_GLOBAL: 4209 + default: 4210 + return 1; 4211 + } 4212 + } 4213 + 4214 + static int xe_vm_alloc_vma(struct xe_vm *vm, 4215 + struct drm_gpuvm_map_req *map_req, 4216 + bool is_madvise) 4217 + { 4218 + struct xe_vma_ops vops; 4219 + struct drm_gpuva_ops *ops = NULL; 4220 + struct drm_gpuva_op *__op; 4221 + bool is_cpu_addr_mirror = false; 4222 + bool remap_op = false; 4223 + struct xe_vma_mem_attr tmp_attr; 4224 + u16 default_pat; 4225 + int err; 4226 + 4227 + lockdep_assert_held_write(&vm->lock); 4228 + 4229 + if (is_madvise) 4230 + ops = drm_gpuvm_madvise_ops_create(&vm->gpuvm, map_req); 4231 + else 4232 + ops = drm_gpuvm_sm_map_ops_create(&vm->gpuvm, map_req); 4233 + 4234 + if (IS_ERR(ops)) 4235 + return PTR_ERR(ops); 4236 + 4237 + if (list_empty(&ops->list)) { 4238 + err = 0; 4239 + goto free_ops; 4240 + } 4241 + 4242 + drm_gpuva_for_each_op(__op, ops) { 4243 + struct xe_vma_op *op = gpuva_op_to_vma_op(__op); 4244 + struct xe_vma *vma = NULL; 4245 + 4246 + if (!is_madvise) { 4247 + if (__op->op == DRM_GPUVA_OP_UNMAP) { 4248 + vma = gpuva_to_vma(op->base.unmap.va); 4249 + XE_WARN_ON(!xe_vma_has_default_mem_attrs(vma)); 4250 + default_pat = vma->attr.default_pat_index; 4251 + } 4252 + 4253 + if (__op->op == DRM_GPUVA_OP_REMAP) { 4254 + vma = gpuva_to_vma(op->base.remap.unmap->va); 4255 + default_pat = vma->attr.default_pat_index; 4256 + } 4257 + 4258 + if (__op->op == DRM_GPUVA_OP_MAP) { 4259 + op->map.is_cpu_addr_mirror = true; 4260 + op->map.pat_index = default_pat; 4261 + } 4262 + } else { 4263 + if (__op->op == DRM_GPUVA_OP_REMAP) { 4264 + vma = gpuva_to_vma(op->base.remap.unmap->va); 4265 + xe_assert(vm->xe, !remap_op); 4266 + xe_assert(vm->xe, xe_vma_has_no_bo(vma)); 4267 + remap_op = true; 4268 + 4269 + 
if (xe_vma_is_cpu_addr_mirror(vma)) 4270 + is_cpu_addr_mirror = true; 4271 + else 4272 + is_cpu_addr_mirror = false; 4273 + } 4274 + 4275 + if (__op->op == DRM_GPUVA_OP_MAP) { 4276 + xe_assert(vm->xe, remap_op); 4277 + remap_op = false; 4278 + /* 4279 + * In case of madvise ops DRM_GPUVA_OP_MAP is 4280 + * always after DRM_GPUVA_OP_REMAP, so ensure 4281 + * we assign op->map.is_cpu_addr_mirror true 4282 + * if REMAP is for xe_vma_is_cpu_addr_mirror vma 4283 + */ 4284 + op->map.is_cpu_addr_mirror = is_cpu_addr_mirror; 4285 + } 4286 + } 4287 + print_op(vm->xe, __op); 4288 + } 4289 + 4290 + xe_vma_ops_init(&vops, vm, NULL, NULL, 0); 4291 + 4292 + if (is_madvise) 4293 + vops.flags |= XE_VMA_OPS_FLAG_MADVISE; 4294 + 4295 + err = vm_bind_ioctl_ops_parse(vm, ops, &vops); 4296 + if (err) 4297 + goto unwind_ops; 4298 + 4299 + xe_vm_lock(vm, false); 4300 + 4301 + drm_gpuva_for_each_op(__op, ops) { 4302 + struct xe_vma_op *op = gpuva_op_to_vma_op(__op); 4303 + struct xe_vma *vma; 4304 + 4305 + if (__op->op == DRM_GPUVA_OP_UNMAP) { 4306 + vma = gpuva_to_vma(op->base.unmap.va); 4307 + /* There should be no unmap for madvise */ 4308 + if (is_madvise) 4309 + XE_WARN_ON("UNEXPECTED UNMAP"); 4310 + 4311 + xe_vma_destroy(vma, NULL); 4312 + } else if (__op->op == DRM_GPUVA_OP_REMAP) { 4313 + vma = gpuva_to_vma(op->base.remap.unmap->va); 4314 + /* In case of madvise ops Store attributes for REMAP UNMAPPED 4315 + * VMA, so they can be assigned to newly MAP created vma. 4316 + */ 4317 + if (is_madvise) 4318 + tmp_attr = vma->attr; 4319 + 4320 + xe_vma_destroy(gpuva_to_vma(op->base.remap.unmap->va), NULL); 4321 + } else if (__op->op == DRM_GPUVA_OP_MAP) { 4322 + vma = op->map.vma; 4323 + /* In case of madvise call, MAP will always be follwed by REMAP. 4324 + * Therefore temp_attr will always have sane values, making it safe to 4325 + * copy them to new vma. 
4326 + */ 4327 + if (is_madvise) 4328 + vma->attr = tmp_attr; 4329 + } 4330 + } 4331 + 4332 + xe_vm_unlock(vm); 4333 + drm_gpuva_ops_free(&vm->gpuvm, ops); 4334 + return 0; 4335 + 4336 + unwind_ops: 4337 + vm_bind_ioctl_ops_unwind(vm, &ops, 1); 4338 + free_ops: 4339 + drm_gpuva_ops_free(&vm->gpuvm, ops); 4340 + return err; 4341 + } 4342 + 4343 + /** 4344 + * xe_vm_alloc_madvise_vma - Allocate VMA's with madvise ops 4345 + * @vm: Pointer to the xe_vm structure 4346 + * @start: Starting input address 4347 + * @range: Size of the input range 4348 + * 4349 + * This function splits existing vma to create new vma for user provided input range 4350 + * 4351 + * Return: 0 if success 4352 + */ 4353 + int xe_vm_alloc_madvise_vma(struct xe_vm *vm, uint64_t start, uint64_t range) 4354 + { 4355 + struct drm_gpuvm_map_req map_req = { 4356 + .map.va.addr = start, 4357 + .map.va.range = range, 4358 + }; 4359 + 4360 + lockdep_assert_held_write(&vm->lock); 4361 + 4362 + vm_dbg(&vm->xe->drm, "MADVISE_OPS_CREATE: addr=0x%016llx, size=0x%016llx", start, range); 4363 + 4364 + return xe_vm_alloc_vma(vm, &map_req, true); 4365 + } 4366 + 4367 + /** 4368 + * xe_vm_alloc_cpu_addr_mirror_vma - Allocate CPU addr mirror vma 4369 + * @vm: Pointer to the xe_vm structure 4370 + * @start: Starting input address 4371 + * @range: Size of the input range 4372 + * 4373 + * This function splits/merges existing vma to create new vma for user provided input range 4374 + * 4375 + * Return: 0 if success 4376 + */ 4377 + int xe_vm_alloc_cpu_addr_mirror_vma(struct xe_vm *vm, uint64_t start, uint64_t range) 4378 + { 4379 + struct drm_gpuvm_map_req map_req = { 4380 + .map.va.addr = start, 4381 + .map.va.range = range, 4382 + }; 4383 + 4384 + lockdep_assert_held_write(&vm->lock); 4385 + 4386 + vm_dbg(&vm->xe->drm, "CPU_ADDR_MIRROR_VMA_OPS_CREATE: addr=0x%016llx, size=0x%016llx", 4387 + start, range); 4388 + 4389 + return xe_vm_alloc_vma(vm, &map_req, false); 4350 4390 }
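The second loop in xe_vm_alloc_vma() relies on an ordering guarantee for madvise ops: a REMAP saves the attributes of the VMA being destroyed into tmp_attr, and the MAP that follows copies them into the newly created VMA. The sketch below is a minimal userspace model of that hand-off, not the driver code. The names model_op_type, model_attr, model_op and model_carry_attrs are hypothetical, and the explicit error return for a MAP without a prior REMAP is an assumption added for illustration (the driver asserts on this instead).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for DRM_GPUVA_OP_REMAP / DRM_GPUVA_OP_MAP. */
enum model_op_type { MODEL_OP_REMAP, MODEL_OP_MAP };

/* Reduced, hypothetical stand-in for struct xe_vma_mem_attr. */
struct model_attr {
	unsigned int atomic_access;
	unsigned short pat_index;
};

struct model_op {
	enum model_op_type type;
	struct model_attr *attr; /* attributes of the VMA this op touches */
};

/*
 * Walk the op list once. A REMAP saves the attributes of the VMA that is
 * about to be destroyed; the MAP that follows copies them into the new VMA.
 * Returns 0 on success, -1 if a MAP appears without a preceding REMAP
 * (an assumption of this model; the driver uses assertions here).
 */
static int model_carry_attrs(struct model_op *ops, size_t num_ops)
{
	struct model_attr saved = { 0 };
	int have_saved = 0;
	size_t i;

	for (i = 0; i < num_ops; i++) {
		if (ops[i].type == MODEL_OP_REMAP) {
			saved = *ops[i].attr;
			have_saved = 1;
		} else {
			if (!have_saved)
				return -1;
			*ops[i].attr = saved;
			have_saved = 0;
		}
	}

	return 0;
}
```

Keeping the saved attributes in a single temporary only works because each MAP is paired with exactly one earlier REMAP, which is the invariant the driver checks with xe_assert() in its first loop.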

+14 -17

drivers/gpu/drm/xe/xe_vm.h

··· 26 26 struct xe_svm_range; 27 27 struct drm_exec; 28 28 29 - struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags); 29 + struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags, struct xe_file *xef); 30 30 31 31 struct xe_vm *xe_vm_lookup(struct xe_file *xef, u32 id); 32 32 int xe_vma_cmp_vma_cb(const void *key, const struct rb_node *node); ··· 65 65 66 66 struct xe_vma * 67 67 xe_vm_find_overlapping_vma(struct xe_vm *vm, u64 start, u64 range); 68 + 69 + bool xe_vma_has_default_mem_attrs(struct xe_vma *vma); 68 70 69 71 /** 70 72 * xe_vm_has_scratch() - Whether the vm is configured for scratch PTEs ··· 173 171 174 172 struct xe_vma *xe_vm_find_vma_by_addr(struct xe_vm *vm, u64 page_addr); 175 173 174 + int xe_vma_need_vram_for_atomic(struct xe_device *xe, struct xe_vma *vma, bool is_atomic); 175 + 176 + int xe_vm_alloc_madvise_vma(struct xe_vm *vm, uint64_t addr, uint64_t size); 177 + 178 + int xe_vm_alloc_cpu_addr_mirror_vma(struct xe_vm *vm, uint64_t addr, uint64_t size); 179 + 176 180 /** 177 181 * to_userptr_vma() - Return a pointer to an embedding userptr vma 178 182 * @vma: Pointer to the embedded struct xe_vma ··· 199 191 struct drm_file *file); 200 192 int xe_vm_bind_ioctl(struct drm_device *dev, void *data, 201 193 struct drm_file *file); 202 - 194 + int xe_vm_query_vmas_attrs_ioctl(struct drm_device *dev, void *data, struct drm_file *file); 203 195 void xe_vm_close_and_put(struct xe_vm *vm); 204 196 205 197 static inline bool xe_vm_in_fault_mode(struct xe_vm *vm) ··· 236 228 struct dma_fence *xe_vm_range_unbind(struct xe_vm *vm, 237 229 struct xe_svm_range *range); 238 230 239 - int xe_vm_range_tilemask_tlb_invalidation(struct xe_vm *vm, u64 start, 240 - u64 end, u8 tile_mask); 231 + int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start, 232 + u64 end, u8 tile_mask); 241 233 242 234 int xe_vm_invalidate_vma(struct xe_vma *vma); 243 235 ··· 323 315 * Register this task as currently making bos resident for the vm. 
Intended 324 316 * to avoid eviction by the same task of shared bos bound to the vm. 325 317 * Call with the vm's resv lock held. 326 - * 327 - * Return: A pin cookie that should be used for xe_vm_clear_validating(). 328 318 */ 329 - static inline struct pin_cookie xe_vm_set_validating(struct xe_vm *vm, 330 - bool allow_res_evict) 319 + static inline void xe_vm_set_validating(struct xe_vm *vm, bool allow_res_evict) 331 320 { 332 - struct pin_cookie cookie = {}; 333 - 334 321 if (vm && !allow_res_evict) { 335 322 xe_vm_assert_held(vm); 336 - cookie = lockdep_pin_lock(&xe_vm_resv(vm)->lock.base); 337 323 /* Pairs with READ_ONCE in xe_vm_is_validating() */ 338 324 WRITE_ONCE(vm->validating, current); 339 325 } 340 - 341 - return cookie; 342 326 } 343 327 344 328 /** ··· 338 338 * @vm: Pointer to the vm or NULL 339 339 * @allow_res_evict: Eviction from @vm was allowed. Must be set to the same 340 340 * value as for xe_vm_set_validation(). 341 - * @cookie: Cookie obtained from xe_vm_set_validating(). 342 341 * 343 342 * Register this task as currently making bos resident for the vm. Intended 344 343 * to avoid eviction by the same task of shared bos bound to the vm. 345 344 * Call with the vm's resv lock held. 346 345 */ 347 - static inline void xe_vm_clear_validating(struct xe_vm *vm, bool allow_res_evict, 348 - struct pin_cookie cookie) 346 + static inline void xe_vm_clear_validating(struct xe_vm *vm, bool allow_res_evict) 349 347 { 350 348 if (vm && !allow_res_evict) { 351 - lockdep_unpin_lock(&xe_vm_resv(vm)->lock.base, cookie); 352 349 /* Pairs with READ_ONCE in xe_vm_is_validating() */ 353 350 WRITE_ONCE(vm->validating, NULL); 354 351 }

+445

drivers/gpu/drm/xe/xe_vm_madvise.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #include "xe_vm_madvise.h" 7 + 8 + #include <linux/nospec.h> 9 + #include <drm/xe_drm.h> 10 + 11 + #include "xe_bo.h" 12 + #include "xe_pat.h" 13 + #include "xe_pt.h" 14 + #include "xe_svm.h" 15 + 16 + struct xe_vmas_in_madvise_range { 17 + u64 addr; 18 + u64 range; 19 + struct xe_vma **vmas; 20 + int num_vmas; 21 + bool has_svm_vmas; 22 + bool has_bo_vmas; 23 + bool has_userptr_vmas; 24 + }; 25 + 26 + static int get_vmas(struct xe_vm *vm, struct xe_vmas_in_madvise_range *madvise_range) 27 + { 28 + u64 addr = madvise_range->addr; 29 + u64 range = madvise_range->range; 30 + 31 + struct xe_vma **__vmas; 32 + struct drm_gpuva *gpuva; 33 + int max_vmas = 8; 34 + 35 + lockdep_assert_held(&vm->lock); 36 + 37 + madvise_range->num_vmas = 0; 38 + madvise_range->vmas = kmalloc_array(max_vmas, sizeof(*madvise_range->vmas), GFP_KERNEL); 39 + if (!madvise_range->vmas) 40 + return -ENOMEM; 41 + 42 + vm_dbg(&vm->xe->drm, "VMA's in range: start=0x%016llx, end=0x%016llx", addr, addr + range); 43 + 44 + drm_gpuvm_for_each_va_range(gpuva, &vm->gpuvm, addr, addr + range) { 45 + struct xe_vma *vma = gpuva_to_vma(gpuva); 46 + 47 + if (xe_vma_bo(vma)) 48 + madvise_range->has_bo_vmas = true; 49 + else if (xe_vma_is_cpu_addr_mirror(vma)) 50 + madvise_range->has_svm_vmas = true; 51 + else if (xe_vma_is_userptr(vma)) 52 + madvise_range->has_userptr_vmas = true; 53 + 54 + if (madvise_range->num_vmas == max_vmas) { 55 + max_vmas <<= 1; 56 + __vmas = krealloc(madvise_range->vmas, 57 + max_vmas * sizeof(*madvise_range->vmas), 58 + GFP_KERNEL); 59 + if (!__vmas) { 60 + kfree(madvise_range->vmas); 61 + return -ENOMEM; 62 + } 63 + madvise_range->vmas = __vmas; 64 + } 65 + 66 + madvise_range->vmas[madvise_range->num_vmas] = vma; 67 + (madvise_range->num_vmas)++; 68 + } 69 + 70 + if (!madvise_range->num_vmas) 71 + kfree(madvise_range->vmas); 72 + 73 + vm_dbg(&vm->xe->drm, "madvise_range-num_vmas = 
%d\n", madvise_range->num_vmas); 74 + 75 + return 0; 76 + } 77 + 78 + static void madvise_preferred_mem_loc(struct xe_device *xe, struct xe_vm *vm, 79 + struct xe_vma **vmas, int num_vmas, 80 + struct drm_xe_madvise *op) 81 + { 82 + int i; 83 + 84 + xe_assert(vm->xe, op->type == DRM_XE_MEM_RANGE_ATTR_PREFERRED_LOC); 85 + 86 + for (i = 0; i < num_vmas; i++) { 87 + /*TODO: Extend attributes to bo based vmas */ 88 + if ((vmas[i]->attr.preferred_loc.devmem_fd == op->preferred_mem_loc.devmem_fd && 89 + vmas[i]->attr.preferred_loc.migration_policy == 90 + op->preferred_mem_loc.migration_policy) || 91 + !xe_vma_is_cpu_addr_mirror(vmas[i])) { 92 + vmas[i]->skip_invalidation = true; 93 + } else { 94 + vmas[i]->skip_invalidation = false; 95 + vmas[i]->attr.preferred_loc.devmem_fd = op->preferred_mem_loc.devmem_fd; 96 + /* Till multi-device support is not added migration_policy 97 + * is of no use and can be ignored. 98 + */ 99 + vmas[i]->attr.preferred_loc.migration_policy = 100 + op->preferred_mem_loc.migration_policy; 101 + } 102 + } 103 + } 104 + 105 + static void madvise_atomic(struct xe_device *xe, struct xe_vm *vm, 106 + struct xe_vma **vmas, int num_vmas, 107 + struct drm_xe_madvise *op) 108 + { 109 + struct xe_bo *bo; 110 + int i; 111 + 112 + xe_assert(vm->xe, op->type == DRM_XE_MEM_RANGE_ATTR_ATOMIC); 113 + xe_assert(vm->xe, op->atomic.val <= DRM_XE_ATOMIC_CPU); 114 + 115 + for (i = 0; i < num_vmas; i++) { 116 + if (xe_vma_is_userptr(vmas[i]) && 117 + !(op->atomic.val == DRM_XE_ATOMIC_DEVICE && 118 + xe->info.has_device_atomics_on_smem)) { 119 + vmas[i]->skip_invalidation = true; 120 + continue; 121 + } 122 + 123 + if (vmas[i]->attr.atomic_access == op->atomic.val) { 124 + vmas[i]->skip_invalidation = true; 125 + } else { 126 + vmas[i]->skip_invalidation = false; 127 + vmas[i]->attr.atomic_access = op->atomic.val; 128 + } 129 + 130 + vmas[i]->attr.atomic_access = op->atomic.val; 131 + 132 + bo = xe_vma_bo(vmas[i]); 133 + if (!bo || bo->attr.atomic_access == 
op->atomic.val) 134 + continue; 135 + 136 + vmas[i]->skip_invalidation = false; 137 + xe_bo_assert_held(bo); 138 + bo->attr.atomic_access = op->atomic.val; 139 + 140 + /* Invalidate cpu page table, so bo can migrate to smem in next access */ 141 + if (xe_bo_is_vram(bo) && 142 + (bo->attr.atomic_access == DRM_XE_ATOMIC_CPU || 143 + bo->attr.atomic_access == DRM_XE_ATOMIC_GLOBAL)) 144 + ttm_bo_unmap_virtual(&bo->ttm); 145 + } 146 + } 147 + 148 + static void madvise_pat_index(struct xe_device *xe, struct xe_vm *vm, 149 + struct xe_vma **vmas, int num_vmas, 150 + struct drm_xe_madvise *op) 151 + { 152 + int i; 153 + 154 + xe_assert(vm->xe, op->type == DRM_XE_MEM_RANGE_ATTR_PAT); 155 + 156 + for (i = 0; i < num_vmas; i++) { 157 + if (vmas[i]->attr.pat_index == op->pat_index.val) { 158 + vmas[i]->skip_invalidation = true; 159 + } else { 160 + vmas[i]->skip_invalidation = false; 161 + vmas[i]->attr.pat_index = op->pat_index.val; 162 + } 163 + } 164 + } 165 + 166 + typedef void (*madvise_func)(struct xe_device *xe, struct xe_vm *vm, 167 + struct xe_vma **vmas, int num_vmas, 168 + struct drm_xe_madvise *op); 169 + 170 + static const madvise_func madvise_funcs[] = { 171 + [DRM_XE_MEM_RANGE_ATTR_PREFERRED_LOC] = madvise_preferred_mem_loc, 172 + [DRM_XE_MEM_RANGE_ATTR_ATOMIC] = madvise_atomic, 173 + [DRM_XE_MEM_RANGE_ATTR_PAT] = madvise_pat_index, 174 + }; 175 + 176 + static u8 xe_zap_ptes_in_madvise_range(struct xe_vm *vm, u64 start, u64 end) 177 + { 178 + struct drm_gpuva *gpuva; 179 + struct xe_tile *tile; 180 + u8 id, tile_mask = 0; 181 + 182 + lockdep_assert_held_write(&vm->lock); 183 + 184 + /* Wait for pending binds */ 185 + if (dma_resv_wait_timeout(xe_vm_resv(vm), DMA_RESV_USAGE_BOOKKEEP, 186 + false, MAX_SCHEDULE_TIMEOUT) <= 0) 187 + XE_WARN_ON(1); 188 + 189 + drm_gpuvm_for_each_va_range(gpuva, &vm->gpuvm, start, end) { 190 + struct xe_vma *vma = gpuva_to_vma(gpuva); 191 + 192 + if (vma->skip_invalidation || xe_vma_is_null(vma)) 193 + continue; 194 + 195 + if 
(xe_vma_is_cpu_addr_mirror(vma)) { 196 + tile_mask |= xe_svm_ranges_zap_ptes_in_range(vm, 197 + xe_vma_start(vma), 198 + xe_vma_end(vma)); 199 + } else { 200 + for_each_tile(tile, vm->xe, id) { 201 + if (xe_pt_zap_ptes(tile, vma)) { 202 + tile_mask |= BIT(id); 203 + 204 + /* 205 + * WRITE_ONCE pairs with READ_ONCE 206 + * in xe_vm_has_valid_gpu_mapping() 207 + */ 208 + WRITE_ONCE(vma->tile_invalidated, 209 + vma->tile_invalidated | BIT(id)); 210 + } 211 + } 212 + } 213 + } 214 + 215 + return tile_mask; 216 + } 217 + 218 + static int xe_vm_invalidate_madvise_range(struct xe_vm *vm, u64 start, u64 end) 219 + { 220 + u8 tile_mask = xe_zap_ptes_in_madvise_range(vm, start, end); 221 + 222 + if (!tile_mask) 223 + return 0; 224 + 225 + xe_device_wmb(vm->xe); 226 + 227 + return xe_vm_range_tilemask_tlb_inval(vm, start, end, tile_mask); 228 + } 229 + 230 + static bool madvise_args_are_sane(struct xe_device *xe, const struct drm_xe_madvise *args) 231 + { 232 + if (XE_IOCTL_DBG(xe, !args)) 233 + return false; 234 + 235 + if (XE_IOCTL_DBG(xe, !IS_ALIGNED(args->start, SZ_4K))) 236 + return false; 237 + 238 + if (XE_IOCTL_DBG(xe, !IS_ALIGNED(args->range, SZ_4K))) 239 + return false; 240 + 241 + if (XE_IOCTL_DBG(xe, args->range < SZ_4K)) 242 + return false; 243 + 244 + switch (args->type) { 245 + case DRM_XE_MEM_RANGE_ATTR_PREFERRED_LOC: 246 + { 247 + s32 fd = (s32)args->preferred_mem_loc.devmem_fd; 248 + 249 + if (XE_IOCTL_DBG(xe, fd < DRM_XE_PREFERRED_LOC_DEFAULT_SYSTEM)) 250 + return false; 251 + 252 + if (XE_IOCTL_DBG(xe, args->preferred_mem_loc.migration_policy > 253 + DRM_XE_MIGRATE_ONLY_SYSTEM_PAGES)) 254 + return false; 255 + 256 + if (XE_IOCTL_DBG(xe, args->preferred_mem_loc.pad)) 257 + return false; 258 + 259 + if (XE_IOCTL_DBG(xe, args->atomic.reserved)) 260 + return false; 261 + break; 262 + } 263 + case DRM_XE_MEM_RANGE_ATTR_ATOMIC: 264 + if (XE_IOCTL_DBG(xe, args->atomic.val > DRM_XE_ATOMIC_CPU)) 265 + return false; 266 + 267 + if (XE_IOCTL_DBG(xe, 
args->atomic.pad)) 268 + return false; 269 + 270 + if (XE_IOCTL_DBG(xe, args->atomic.reserved)) 271 + return false; 272 + 273 + break; 274 + case DRM_XE_MEM_RANGE_ATTR_PAT: 275 + { 276 + u16 coh_mode = xe_pat_index_get_coh_mode(xe, args->pat_index.val); 277 + 278 + if (XE_IOCTL_DBG(xe, !coh_mode)) 279 + return false; 280 + 281 + if (XE_WARN_ON(coh_mode > XE_COH_AT_LEAST_1WAY)) 282 + return false; 283 + 284 + if (XE_IOCTL_DBG(xe, args->pat_index.pad)) 285 + return false; 286 + 287 + if (XE_IOCTL_DBG(xe, args->pat_index.reserved)) 288 + return false; 289 + break; 290 + } 291 + default: 292 + if (XE_IOCTL_DBG(xe, 1)) 293 + return false; 294 + } 295 + 296 + if (XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1])) 297 + return false; 298 + 299 + return true; 300 + } 301 + 302 + static bool check_bo_args_are_sane(struct xe_vm *vm, struct xe_vma **vmas, 303 + int num_vmas, u32 atomic_val) 304 + { 305 + struct xe_device *xe = vm->xe; 306 + struct xe_bo *bo; 307 + int i; 308 + 309 + for (i = 0; i < num_vmas; i++) { 310 + bo = xe_vma_bo(vmas[i]); 311 + if (!bo) 312 + continue; 313 + /* 314 + * NOTE: The following atomic checks are platform-specific. For example, 315 + * if a device supports CXL atomics, these may not be necessary or 316 + * may behave differently. 
317 + */ 318 + if (XE_IOCTL_DBG(xe, atomic_val == DRM_XE_ATOMIC_CPU && 319 + !(bo->flags & XE_BO_FLAG_SYSTEM))) 320 + return false; 321 + 322 + if (XE_IOCTL_DBG(xe, atomic_val == DRM_XE_ATOMIC_DEVICE && 323 + !(bo->flags & XE_BO_FLAG_VRAM0) && 324 + !(bo->flags & XE_BO_FLAG_VRAM1) && 325 + !(bo->flags & XE_BO_FLAG_SYSTEM && 326 + xe->info.has_device_atomics_on_smem))) 327 + return false; 328 + 329 + if (XE_IOCTL_DBG(xe, atomic_val == DRM_XE_ATOMIC_GLOBAL && 330 + (!(bo->flags & XE_BO_FLAG_SYSTEM) || 331 + (!(bo->flags & XE_BO_FLAG_VRAM0) && 332 + !(bo->flags & XE_BO_FLAG_VRAM1))))) 333 + return false; 334 + } 335 + return true; 336 + } 337 + /** 338 + * xe_vm_madvise_ioctl - Handle MADVise ioctl for a VM 339 + * @dev: DRM device pointer 340 + * @data: Pointer to ioctl data (drm_xe_madvise*) 341 + * @file: DRM file pointer 342 + * 343 + * Handles the MADVISE ioctl to provide memory advice for vma's within 344 + * input range. 345 + * 346 + * Return: 0 on success or a negative error code on failure. 
347 + */ 348 + int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *file) 349 + { 350 + struct xe_device *xe = to_xe_device(dev); 351 + struct xe_file *xef = to_xe_file(file); 352 + struct drm_xe_madvise *args = data; 353 + struct xe_vmas_in_madvise_range madvise_range = {.addr = args->start, 354 + .range = args->range, }; 355 + struct xe_vm *vm; 356 + struct drm_exec exec; 357 + int err, attr_type; 358 + 359 + vm = xe_vm_lookup(xef, args->vm_id); 360 + if (XE_IOCTL_DBG(xe, !vm)) 361 + return -EINVAL; 362 + 363 + if (!madvise_args_are_sane(vm->xe, args)) { 364 + err = -EINVAL; 365 + goto put_vm; 366 + } 367 + 368 + xe_svm_flush(vm); 369 + 370 + err = down_write_killable(&vm->lock); 371 + if (err) 372 + goto put_vm; 373 + 374 + if (XE_IOCTL_DBG(xe, xe_vm_is_closed_or_banned(vm))) { 375 + err = -ENOENT; 376 + goto unlock_vm; 377 + } 378 + 379 + err = xe_vm_alloc_madvise_vma(vm, args->start, args->range); 380 + if (err) 381 + goto unlock_vm; 382 + 383 + err = get_vmas(vm, &madvise_range); 384 + if (err || !madvise_range.num_vmas) 385 + goto unlock_vm; 386 + 387 + if (madvise_range.has_bo_vmas) { 388 + if (args->type == DRM_XE_MEM_RANGE_ATTR_ATOMIC) { 389 + if (!check_bo_args_are_sane(vm, madvise_range.vmas, 390 + madvise_range.num_vmas, 391 + args->atomic.val)) { 392 + err = -EINVAL; 393 + goto unlock_vm; 394 + } 395 + } 396 + 397 + drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES | DRM_EXEC_INTERRUPTIBLE_WAIT, 0); 398 + drm_exec_until_all_locked(&exec) { 399 + for (int i = 0; i < madvise_range.num_vmas; i++) { 400 + struct xe_bo *bo = xe_vma_bo(madvise_range.vmas[i]); 401 + 402 + if (!bo) 403 + continue; 404 + err = drm_exec_lock_obj(&exec, &bo->ttm.base); 405 + drm_exec_retry_on_contention(&exec); 406 + if (err) 407 + goto err_fini; 408 + } 409 + } 410 + } 411 + 412 + if (madvise_range.has_userptr_vmas) { 413 + err = down_read_interruptible(&vm->userptr.notifier_lock); 414 + if (err) 415 + goto err_fini; 416 + } 417 + 418 + if 
(madvise_range.has_svm_vmas) { 419 + err = down_read_interruptible(&vm->svm.gpusvm.notifier_lock); 420 + if (err) 421 + goto unlock_userptr; 422 + } 423 + 424 + attr_type = array_index_nospec(args->type, ARRAY_SIZE(madvise_funcs)); 425 + madvise_funcs[attr_type](xe, vm, madvise_range.vmas, madvise_range.num_vmas, args); 426 + 427 + err = xe_vm_invalidate_madvise_range(vm, args->start, args->start + args->range); 428 + 429 + if (madvise_range.has_svm_vmas) 430 + xe_svm_notifier_unlock(vm); 431 + 432 + unlock_userptr: 433 + if (madvise_range.has_userptr_vmas) 434 + up_read(&vm->userptr.notifier_lock); 435 + err_fini: 436 + if (madvise_range.has_bo_vmas) 437 + drm_exec_fini(&exec); 438 + kfree(madvise_range.vmas); 439 + madvise_range.vmas = NULL; 440 + unlock_vm: 441 + up_write(&vm->lock); 442 + put_vm: 443 + xe_vm_put(vm); 444 + return err; 445 + }
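The attribute setters in this file (for example madvise_pat_index()) mark a VMA with skip_invalidation when the requested value equals the current one, and xe_zap_ptes_in_madvise_range() later skips those VMAs. The following is a minimal userspace sketch of that two-phase pattern, assuming a reduced VMA with only a PAT index. model_vma, model_set_pat_index and model_count_invalidations are hypothetical names, and counting stands in for the actual PTE zap and TLB invalidation.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, hypothetical stand-in for struct xe_vma. */
struct model_vma {
	unsigned short pat_index;
	bool skip_invalidation;
};

/* Phase 1: apply the attribute and record whether anything changed. */
static void model_set_pat_index(struct model_vma *vmas, int num_vmas,
				unsigned short val)
{
	int i;

	for (i = 0; i < num_vmas; i++) {
		if (vmas[i].pat_index == val) {
			vmas[i].skip_invalidation = true;
		} else {
			vmas[i].skip_invalidation = false;
			vmas[i].pat_index = val;
		}
	}
}

/* Phase 2: only VMAs whose attribute changed need invalidation. */
static int model_count_invalidations(const struct model_vma *vmas,
				     int num_vmas)
{
	int i, count = 0;

	for (i = 0; i < num_vmas; i++) {
		if (!vmas[i].skip_invalidation)
			count++;
	}

	return count;
}
```

Splitting the work this way means a repeated madvise call with the same value touches no page tables, which is presumably why the flag lives on the VMA rather than being recomputed during the zap.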

+15

drivers/gpu/drm/xe/xe_vm_madvise.h

··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2025 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_VM_MADVISE_H_ 7 + #define _XE_VM_MADVISE_H_ 8 + 9 + struct drm_device; 10 + struct drm_file; 11 + 12 + int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, 13 + struct drm_file *file); 14 + 15 + #endif

+53 -4

drivers/gpu/drm/xe/xe_vm_types.h

··· 77 77 #endif 78 78 }; 79 79 80 + /** 81 + * struct xe_vma_mem_attr - memory attributes associated with vma 82 + */ 83 + struct xe_vma_mem_attr { 84 + /** @preferred_loc: perferred memory_location */ 85 + struct { 86 + /** @preferred_loc.migration_policy: Pages migration policy */ 87 + u32 migration_policy; 88 + 89 + /** 90 + * @preferred_loc.devmem_fd: used for determining pagemap_fd 91 + * requested by user DRM_XE_PREFERRED_LOC_DEFAULT_SYSTEM and 92 + * DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE mean system memory or 93 + * closest device memory respectively. 94 + */ 95 + u32 devmem_fd; 96 + } preferred_loc; 97 + 98 + /** 99 + * @atomic_access: The atomic access type for the vma 100 + * See %DRM_XE_VMA_ATOMIC_UNDEFINED, %DRM_XE_VMA_ATOMIC_DEVICE, 101 + * %DRM_XE_VMA_ATOMIC_GLOBAL, and %DRM_XE_VMA_ATOMIC_CPU for possible 102 + * values. These are defined in uapi/drm/xe_drm.h. 103 + */ 104 + u32 atomic_access; 105 + 106 + /** 107 + * @default_pat_index: The pat index for VMA set during first bind by user. 108 + */ 109 + u16 default_pat_index; 110 + 111 + /** 112 + * @pat_index: The pat index to use when encoding the PTEs for this vma. 113 + * same as default_pat_index unless overwritten by madvise. 114 + */ 115 + u16 pat_index; 116 + }; 117 + 80 118 struct xe_vma { 81 119 /** @gpuva: Base GPUVA object */ 82 120 struct drm_gpuva gpuva; ··· 164 126 u8 tile_staged; 165 127 166 128 /** 167 - * @pat_index: The pat index to use when encoding the PTEs for this vma. 129 + * @skip_invalidation: Used in madvise to avoid invalidation 130 + * if mem attributes doesn't change 168 131 */ 169 - u16 pat_index; 132 + bool skip_invalidation; 170 133 171 134 /** 172 135 * @ufence: The user fence that was provided with MAP. 173 136 * Needs to be signalled before UNMAP can be processed. 174 137 */ 175 138 struct xe_user_fence *ufence; 139 + 140 + /** 141 + * @attr: The attributes of vma which determines the migration policy 142 + * and encoding of the PTEs for this vma. 
143 + */ 144 + struct xe_vma_mem_attr attr; 176 145 }; 177 146 178 147 /** ··· 440 395 struct xarray range; 441 396 /** @ranges_count: number of svm ranges to map */ 442 397 u32 ranges_count; 443 - /** @region: memory region to prefetch to */ 444 - u32 region; 398 + /** 399 + * @tile: Pointer to the tile structure containing memory to prefetch. 400 + * NULL if prefetch requested region is smem 401 + */ 402 + struct xe_tile *tile; 445 403 }; 446 404 447 405 /** enum xe_vma_op_flags - flags for VMA operation */ ··· 510 462 struct xe_vm_pgtable_update_ops pt_update_ops[XE_MAX_TILES_PER_DEVICE]; 511 463 /** @flag: signify the properties within xe_vma_ops*/ 512 464 #define XE_VMA_OPS_FLAG_HAS_SVM_PREFETCH BIT(0) 465 + #define XE_VMA_OPS_FLAG_MADVISE BIT(1) 513 466 u32 flags; 514 467 #ifdef TEST_VM_OPS_ERROR 515 468 /** @inject_error: inject error to test error handling */

+151 -60

drivers/gpu/drm/xe/xe_vram.c

··· 3 3 * Copyright © 2021-2024 Intel Corporation 4 4 */ 5 5 6 + #include <kunit/visibility.h> 6 7 #include <linux/pci.h> 7 8 8 9 #include <drm/drm_managed.h> ··· 20 19 #include "xe_mmio.h" 21 20 #include "xe_module.h" 22 21 #include "xe_sriov.h" 22 + #include "xe_ttm_vram_mgr.h" 23 23 #include "xe_vram.h" 24 + #include "xe_vram_types.h" 24 25 25 26 #define BAR_SIZE_SHIFT 20 26 27 ··· 139 136 return true; 140 137 } 141 138 142 - static int determine_lmem_bar_size(struct xe_device *xe) 139 + static int determine_lmem_bar_size(struct xe_device *xe, struct xe_vram_region *lmem_bar) 143 140 { 144 141 struct pci_dev *pdev = to_pci_dev(xe->drm.dev); 145 142 ··· 150 147 151 148 resize_vram_bar(xe); 152 149 153 - xe->mem.vram.io_start = pci_resource_start(pdev, LMEM_BAR); 154 - xe->mem.vram.io_size = pci_resource_len(pdev, LMEM_BAR); 155 - if (!xe->mem.vram.io_size) 150 + lmem_bar->io_start = pci_resource_start(pdev, LMEM_BAR); 151 + lmem_bar->io_size = pci_resource_len(pdev, LMEM_BAR); 152 + if (!lmem_bar->io_size) 156 153 return -EIO; 157 154 158 155 /* XXX: Need to change when xe link code is ready */ 159 - xe->mem.vram.dpa_base = 0; 156 + lmem_bar->dpa_base = 0; 160 157 161 158 /* set up a map to the total memory area. 
*/ 162 - xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.io_size); 159 + lmem_bar->mapping = devm_ioremap_wc(&pdev->dev, lmem_bar->io_start, lmem_bar->io_size); 163 160 164 161 return 0; 165 162 } ··· 281 278 struct xe_tile *tile; 282 279 int id; 283 280 284 - if (xe->mem.vram.mapping) 285 - iounmap(xe->mem.vram.mapping); 286 - 287 - xe->mem.vram.mapping = NULL; 281 + xe->mem.vram->mapping = NULL; 288 282 289 283 for_each_tile(tile, xe, id) 290 - tile->mem.vram.mapping = NULL; 284 + tile->mem.vram->mapping = NULL; 285 + } 286 + 287 + struct xe_vram_region *xe_vram_region_alloc(struct xe_device *xe, u8 id, u32 placement) 288 + { 289 + struct xe_vram_region *vram; 290 + struct drm_device *drm = &xe->drm; 291 + 292 + xe_assert(xe, id < xe->info.tile_count); 293 + 294 + vram = drmm_kzalloc(drm, sizeof(*vram), GFP_KERNEL); 295 + if (!vram) 296 + return NULL; 297 + 298 + vram->xe = xe; 299 + vram->id = id; 300 + vram->placement = placement; 301 + #if defined(CONFIG_DRM_XE_PAGEMAP) 302 + vram->migrate = xe->tiles[id].migrate; 303 + #endif 304 + return vram; 305 + } 306 + 307 + static void print_vram_region_info(struct xe_device *xe, struct xe_vram_region *vram) 308 + { 309 + struct drm_device *drm = &xe->drm; 310 + 311 + if (vram->io_size < vram->usable_size) 312 + drm_info(drm, "Small BAR device\n"); 313 + 314 + drm_info(drm, 315 + "VRAM[%u]: Actual physical size %pa, usable size exclude stolen %pa, CPU accessible size %pa\n", 316 + vram->id, &vram->actual_physical_size, &vram->usable_size, &vram->io_size); 317 + drm_info(drm, "VRAM[%u]: DPA range: [%pa-%llx], io range: [%pa-%llx]\n", 318 + vram->id, &vram->dpa_base, vram->dpa_base + (u64)vram->actual_physical_size, 319 + &vram->io_start, vram->io_start + (u64)vram->io_size); 320 + } 321 + 322 + static int vram_region_init(struct xe_device *xe, struct xe_vram_region *vram, 323 + struct xe_vram_region *lmem_bar, u64 offset, u64 usable_size, 324 + u64 region_size, resource_size_t remain_io_size) 325 + 
{ 326 + /* Check if VRAM region is already initialized */ 327 + if (vram->mapping) 328 + return 0; 329 + 330 + vram->actual_physical_size = region_size; 331 + vram->io_start = lmem_bar->io_start + offset; 332 + vram->io_size = min_t(u64, usable_size, remain_io_size); 333 + 334 + if (!vram->io_size) { 335 + drm_err(&xe->drm, "Tile without any CPU visible VRAM. Aborting.\n"); 336 + return -ENODEV; 337 + } 338 + 339 + vram->dpa_base = lmem_bar->dpa_base + offset; 340 + vram->mapping = lmem_bar->mapping + offset; 341 + vram->usable_size = usable_size; 342 + 343 + print_vram_region_info(xe, vram); 344 + 345 + return 0; 291 346 } 292 347 293 348 /** ··· 359 298 int xe_vram_probe(struct xe_device *xe) 360 299 { 361 300 struct xe_tile *tile; 362 - resource_size_t io_size; 301 + struct xe_vram_region lmem_bar; 302 + resource_size_t remain_io_size; 363 303 u64 available_size = 0; 364 304 u64 total_size = 0; 365 - u64 tile_offset; 366 - u64 tile_size; 367 - u64 vram_size; 368 305 int err; 369 306 u8 id; 370 307 371 308 if (!IS_DGFX(xe)) 372 309 return 0; 373 310 374 - /* Get the size of the root tile's vram for later accessibility comparison */ 375 - tile = xe_device_get_root_tile(xe); 376 - err = tile_vram_size(tile, &vram_size, &tile_size, &tile_offset); 311 + err = determine_lmem_bar_size(xe, &lmem_bar); 377 312 if (err) 378 313 return err; 314 + drm_info(&xe->drm, "VISIBLE VRAM: %pa, %pa\n", &lmem_bar.io_start, &lmem_bar.io_size); 379 315 380 - err = determine_lmem_bar_size(xe); 381 - if (err) 382 - return err; 316 + remain_io_size = lmem_bar.io_size; 383 317 384 - drm_info(&xe->drm, "VISIBLE VRAM: %pa, %pa\n", &xe->mem.vram.io_start, 385 - &xe->mem.vram.io_size); 386 - 387 - io_size = xe->mem.vram.io_size; 388 - 389 - /* tile specific ranges */ 390 318 for_each_tile(tile, xe, id) { 391 - err = tile_vram_size(tile, &vram_size, &tile_size, &tile_offset); 319 + u64 region_size; 320 + u64 usable_size; 321 + u64 tile_offset; 322 + 323 + err = tile_vram_size(tile, 
&usable_size, &region_size, &tile_offset);
         if (err)
             return err;

-        tile->mem.vram.actual_physical_size = tile_size;
-        tile->mem.vram.io_start = xe->mem.vram.io_start + tile_offset;
-        tile->mem.vram.io_size = min_t(u64, vram_size, io_size);
+        total_size += region_size;
+        available_size += usable_size;

-        if (!tile->mem.vram.io_size) {
-            drm_err(&xe->drm, "Tile without any CPU visible VRAM. Aborting.\n");
-            return -ENODEV;
-        }
+        err = vram_region_init(xe, tile->mem.vram, &lmem_bar, tile_offset, usable_size,
+                               region_size, remain_io_size);
+        if (err)
+            return err;

-        tile->mem.vram.dpa_base = xe->mem.vram.dpa_base + tile_offset;
-        tile->mem.vram.usable_size = vram_size;
-        tile->mem.vram.mapping = xe->mem.vram.mapping + tile_offset;
-
-        if (tile->mem.vram.io_size < tile->mem.vram.usable_size)
-            drm_info(&xe->drm, "Small BAR device\n");
-        drm_info(&xe->drm, "VRAM[%u, %u]: Actual physical size %pa, usable size exclude stolen %pa, CPU accessible size %pa\n", id,
-                 tile->id, &tile->mem.vram.actual_physical_size, &tile->mem.vram.usable_size, &tile->mem.vram.io_size);
-        drm_info(&xe->drm, "VRAM[%u, %u]: DPA range: [%pa-%llx], io range: [%pa-%llx]\n", id, tile->id,
-                 &tile->mem.vram.dpa_base, tile->mem.vram.dpa_base + (u64)tile->mem.vram.actual_physical_size,
-                 &tile->mem.vram.io_start, tile->mem.vram.io_start + (u64)tile->mem.vram.io_size);
-
-        /* calculate total size using tile size to get the correct HW sizing */
-        total_size += tile_size;
-        available_size += vram_size;
-
-        if (total_size > xe->mem.vram.io_size) {
+        if (total_size > lmem_bar.io_size) {
             drm_info(&xe->drm, "VRAM: %pa is larger than resource %pa\n",
-                     &total_size, &xe->mem.vram.io_size);
+                     &total_size, &lmem_bar.io_size);
         }

-        io_size -= min_t(u64, tile_size, io_size);
+        remain_io_size -= min_t(u64, tile->mem.vram->actual_physical_size, remain_io_size);
     }

-    xe->mem.vram.actual_physical_size = total_size;
-
-    drm_info(&xe->drm, "Total VRAM: %pa, %pa\n", &xe->mem.vram.io_start,
-             &xe->mem.vram.actual_physical_size);
-    drm_info(&xe->drm, "Available VRAM: %pa, %pa\n", &xe->mem.vram.io_start,
-             &available_size);
+    err = vram_region_init(xe, xe->mem.vram, &lmem_bar, 0, available_size, total_size,
+                           lmem_bar.io_size);
+    if (err)
+        return err;

     return devm_add_action_or_reset(xe->drm.dev, vram_fini, xe);
 }
+
+/**
+ * xe_vram_region_io_start - Get the IO start of a VRAM region
+ * @vram: the VRAM region
+ *
+ * Return: the IO start of the VRAM region, or 0 if not valid
+ */
+resource_size_t xe_vram_region_io_start(const struct xe_vram_region *vram)
+{
+    return vram ? vram->io_start : 0;
+}
+
+/**
+ * xe_vram_region_io_size - Get the IO size of a VRAM region
+ * @vram: the VRAM region
+ *
+ * Return: the IO size of the VRAM region, or 0 if not valid
+ */
+resource_size_t xe_vram_region_io_size(const struct xe_vram_region *vram)
+{
+    return vram ? vram->io_size : 0;
+}
+
+/**
+ * xe_vram_region_dpa_base - Get the DPA base of a VRAM region
+ * @vram: the VRAM region
+ *
+ * Return: the DPA base of the VRAM region, or 0 if not valid
+ */
+resource_size_t xe_vram_region_dpa_base(const struct xe_vram_region *vram)
+{
+    return vram ? vram->dpa_base : 0;
+}
+
+/**
+ * xe_vram_region_usable_size - Get the usable size of a VRAM region
+ * @vram: the VRAM region
+ *
+ * Return: the usable size of the VRAM region, or 0 if not valid
+ */
+resource_size_t xe_vram_region_usable_size(const struct xe_vram_region *vram)
+{
+    return vram ? vram->usable_size : 0;
+}
+
+/**
+ * xe_vram_region_actual_physical_size - Get the actual physical size of a VRAM region
+ * @vram: the VRAM region
+ *
+ * Return: the actual physical size of the VRAM region, or 0 if not valid
+ */
+resource_size_t xe_vram_region_actual_physical_size(const struct xe_vram_region *vram)
+{
+    return vram ? vram->actual_physical_size : 0;
+}
+EXPORT_SYMBOL_IF_KUNIT(xe_vram_region_actual_physical_size);
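The per-tile loop above hands each region whatever is left of the BAR window (`remain_io_size`) and then shrinks that remainder by the region's physical size. The following is a minimal userspace sketch of that bookkeeping, not the driver code: `struct sketch_region` and `sketch_split_bar()` are hypothetical names, and clamping `io_size` to `min(usable_size, remaining)` is an assumption carried over from the removed `min_t(u64, vram_size, io_size)` line, since the body of `vram_region_init()` is not shown in this hunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the size fields of struct xe_vram_region. */
struct sketch_region {
    uint64_t actual_physical_size; /* including reserved (stolen) memory */
    uint64_t usable_size;          /* excluding reserved memory */
    uint64_t io_size;              /* CPU-visible part, filled in below */
};

static uint64_t sketch_min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Distribute a BAR window of bar_io_size bytes over n regions in order.
 * Returns how many regions ended up "small-bar", meaning that only part
 * of their usable memory is CPU accessible (io_size < usable_size).
 */
static size_t sketch_split_bar(struct sketch_region *regions, size_t n,
                               uint64_t bar_io_size)
{
    uint64_t remain_io_size = bar_io_size;
    size_t small_bar = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Assumption: a region sees at most what is left of the BAR. */
        regions[i].io_size = sketch_min_u64(regions[i].usable_size,
                                            remain_io_size);
        if (regions[i].io_size < regions[i].usable_size)
            small_bar++;

        /* Same shape as the remain_io_size update in the hunk above. */
        remain_io_size -= sketch_min_u64(regions[i].actual_physical_size,
                                         remain_io_size);
    }

    return small_bar;
}
```

With two regions of physical size 8 and usable size 7 behind a BAR of 10 units, the first region is fully visible and the second gets only the remaining 2 units, so it would be reported as small-bar.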

+11

drivers/gpu/drm/xe/xe_vram.h

···
 #ifndef _XE_VRAM_H_
 #define _XE_VRAM_H_

+#include <linux/types.h>
+
 struct xe_device;
+struct xe_vram_region;

 int xe_vram_probe(struct xe_device *xe);
+
+struct xe_vram_region *xe_vram_region_alloc(struct xe_device *xe, u8 id, u32 placement);
+
+resource_size_t xe_vram_region_io_start(const struct xe_vram_region *vram);
+resource_size_t xe_vram_region_io_size(const struct xe_vram_region *vram);
+resource_size_t xe_vram_region_dpa_base(const struct xe_vram_region *vram);
+resource_size_t xe_vram_region_usable_size(const struct xe_vram_region *vram);
+resource_size_t xe_vram_region_actual_physical_size(const struct xe_vram_region *vram);

 #endif

+2 -2

drivers/gpu/drm/xe/xe_vram_freq.c

···
                 char *buf)
 {
     struct xe_tile *tile = dev_to_tile(dev);
-    u32 val, mbox;
+    u32 val = 0, mbox;
     int err;

     mbox = REG_FIELD_PREP(PCODE_MB_COMMAND, PCODE_FREQUENCY_CONFIG)
···
                 char *buf)
 {
     struct xe_tile *tile = dev_to_tile(dev);
-    u32 val, mbox;
+    u32 val = 0, mbox;
     int err;

     mbox = REG_FIELD_PREP(PCODE_MB_COMMAND, PCODE_FREQUENCY_CONFIG)

+85

drivers/gpu/drm/xe/xe_vram_types.h

···
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_VRAM_TYPES_H_
+#define _XE_VRAM_TYPES_H_
+
+#if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
+#include <drm/drm_pagemap.h>
+#endif
+
+#include "xe_ttm_vram_mgr_types.h"
+
+struct xe_device;
+struct xe_migrate;
+
+/**
+ * struct xe_vram_region - memory region structure
+ * This is used to describe a memory region in xe
+ * device, such as HBM memory or CXL extension memory.
+ */
+struct xe_vram_region {
+    /** @xe: Back pointer to xe device */
+    struct xe_device *xe;
+    /**
+     * @id: VRAM region instance id
+     *
+     * The value should be unique for VRAM region.
+     */
+    u8 id;
+    /** @io_start: IO start address of this VRAM instance */
+    resource_size_t io_start;
+    /**
+     * @io_size: IO size of this VRAM instance
+     *
+     * This represents how much of this VRAM we can access
+     * via the CPU through the VRAM BAR. This can be smaller
+     * than @usable_size, in which case only part of VRAM is CPU
+     * accessible (typically the first 256M). This
+     * configuration is known as small-bar.
+     */
+    resource_size_t io_size;
+    /** @dpa_base: This memory regions's DPA (device physical address) base */
+    resource_size_t dpa_base;
+    /**
+     * @usable_size: usable size of VRAM
+     *
+     * Usable size of VRAM excluding reserved portions
+     * (e.g stolen mem)
+     */
+    resource_size_t usable_size;
+    /**
+     * @actual_physical_size: Actual VRAM size
+     *
+     * Actual VRAM size including reserved portions
+     * (e.g stolen mem)
+     */
+    resource_size_t actual_physical_size;
+    /** @mapping: pointer to VRAM mappable space */
+    void __iomem *mapping;
+    /** @ttm: VRAM TTM manager */
+    struct xe_ttm_vram_mgr ttm;
+    /** @placement: TTM placement dedicated for this region */
+    u32 placement;
+#if IS_ENABLED(CONFIG_DRM_XE_PAGEMAP)
+    /** @migrate: Back pointer to migrate */
+    struct xe_migrate *migrate;
+    /** @pagemap: Used to remap device memory as ZONE_DEVICE */
+    struct dev_pagemap pagemap;
+    /**
+     * @dpagemap: The struct drm_pagemap of the ZONE_DEVICE memory
+     * pages of this tile.
+     */
+    struct drm_pagemap dpagemap;
+    /**
+     * @hpa_base: base host physical address
+     *
+     * This is generated when remap device memory as ZONE_DEVICE
+     */
+    resource_size_t hpa_base;
+#endif
+};
+
+#endif

+18 -7

drivers/gpu/drm/xe/xe_wa.c

···
       XE_RTP_RULES(GRAPHICS_VERSION(2004), ENGINE_CLASS(RENDER)),
       XE_RTP_ACTIONS(SET(HALF_SLICE_CHICKEN7, CLEAR_OPTIMIZATION_DISABLE))
     },
+    { XE_RTP_NAME("13012615864"),
+      XE_RTP_RULES(GRAPHICS_VERSION(2004),
+                   FUNC(xe_rtp_match_first_render_or_compute)),
+      XE_RTP_ACTIONS(SET(TDL_TSL_CHICKEN, RES_CHK_SPR_DIS))
+    },

     /* Xe2_HPG */

···
                    FUNC(xe_rtp_match_first_render_or_compute)),
       XE_RTP_ACTIONS(SET(TDL_TSL_CHICKEN, STK_ID_RESTRICT))
     },
+    { XE_RTP_NAME("13012615864"),
+      XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2002),
+                   FUNC(xe_rtp_match_first_render_or_compute)),
+      XE_RTP_ACTIONS(SET(TDL_TSL_CHICKEN, RES_CHK_SPR_DIS))
+    },

     /* Xe2_LPM */

···
       XE_RTP_ACTIONS(SET(TDL_CHICKEN, QID_WAIT_FOR_THREAD_NOT_RUN_DISABLE))
     },
     { XE_RTP_NAME("13012615864"),
-      XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3001),
+      XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3001), OR,
+                   GRAPHICS_VERSION(3003),
                    FUNC(xe_rtp_match_first_render_or_compute)),
       XE_RTP_ACTIONS(SET(TDL_TSL_CHICKEN, RES_CHK_SPR_DIS))
     },
···
 }

 /**
- * xe_wa_process_oob - process OOB workaround table
+ * xe_wa_process_gt_oob - process GT OOB workaround table
  * @gt: GT instance to process workarounds for
  *
  * Process OOB workaround table for this platform, marking in @gt the
  * workarounds that are active.
  */
-void xe_wa_process_oob(struct xe_gt *gt)
+void xe_wa_process_gt_oob(struct xe_gt *gt)
 {
     struct xe_rtp_process_ctx ctx = XE_RTP_PROCESS_CTX_INITIALIZER(gt);

···
 }

 /**
- * xe_wa_init - initialize gt with workaround bookkeeping
+ * xe_wa_gt_init - initialize gt with workaround bookkeeping
  * @gt: GT instance to initialize
  *
  * Returns 0 for success, negative error code otherwise.
  */
-int xe_wa_init(struct xe_gt *gt)
+int xe_wa_gt_init(struct xe_gt *gt)
 {
     struct xe_device *xe = gt_to_xe(gt);
     size_t n_oob, n_lrc, n_engine, n_gt, total;
···

     return 0;
 }
-ALLOW_ERROR_INJECTION(xe_wa_init, ERRNO); /* See xe_pci_probe() */
+ALLOW_ERROR_INJECTION(xe_wa_gt_init, ERRNO); /* See xe_pci_probe() */

 void xe_wa_device_dump(struct xe_device *xe, struct drm_printer *p)
 {
···
     if (IS_SRIOV_VF(tile->xe))
         return;

-    if (XE_WA(tile->primary_gt, 22010954014))
+    if (XE_GT_WA(tile->primary_gt, 22010954014))
         xe_mmio_rmw32(mmio, XEHP_CLOCK_GATE_DIS, 0, SGSI_SIDECLK_DIS);
 }

+4 -4

drivers/gpu/drm/xe/xe_wa.h

···
 struct xe_tile;

 int xe_wa_device_init(struct xe_device *xe);
-int xe_wa_init(struct xe_gt *gt);
+int xe_wa_gt_init(struct xe_gt *gt);
 void xe_wa_process_device_oob(struct xe_device *xe);
-void xe_wa_process_oob(struct xe_gt *gt);
+void xe_wa_process_gt_oob(struct xe_gt *gt);
 void xe_wa_process_gt(struct xe_gt *gt);
 void xe_wa_process_engine(struct xe_hw_engine *hwe);
 void xe_wa_process_lrc(struct xe_hw_engine *hwe);
···
 void xe_wa_dump(struct xe_gt *gt, struct drm_printer *p);

 /**
- * XE_WA - Out-of-band workarounds, to be queried and called as needed.
+ * XE_GT_WA - Out-of-band GT workarounds, to be queried and called as needed.
  * @gt__: gt instance
  * @id__: XE_OOB_<id__>, as generated by build system in generated/xe_wa_oob.h
  */
-#define XE_WA(gt__, id__) ({ \
+#define XE_GT_WA(gt__, id__) ({ \
     xe_gt_assert(gt__, (gt__)->wa_active.oob_initialized); \
     test_bit(XE_WA_OOB_ ## id__, (gt__)->wa_active.oob); \
 })

+10 -1

drivers/gpu/drm/xe/xe_wa_oob.rules

···
 1607983814	GRAPHICS_VERSION_RANGE(1200, 1210)
+16010904313	GRAPHICS_VERSION_RANGE(1200, 1210)
+18022495364	GRAPHICS_VERSION_RANGE(1200, 1210)
 22012773006	GRAPHICS_VERSION_RANGE(1200, 1250)
 14014475959	GRAPHICS_VERSION_RANGE(1270, 1271), GRAPHICS_STEP(A0, B0)
 		PLATFORM(DG2)
···
 16023588340	GRAPHICS_VERSION(2001), FUNC(xe_rtp_match_not_sriov_vf)
 14019789679	GRAPHICS_VERSION(1255)
 		GRAPHICS_VERSION_RANGE(1270, 2004)
-no_media_l3	MEDIA_VERSION(3000)
+no_media_l3	MEDIA_VERSION_RANGE(3000, 3002)
 14022866841	GRAPHICS_VERSION(3000), GRAPHICS_STEP(A0, B0)
 		MEDIA_VERSION(3000), MEDIA_STEP(A0, B0)
 16021333562	GRAPHICS_VERSION_RANGE(1200, 1274)
···
 		MEDIA_VERSION_RANGE(1300, 3000)
 		MEDIA_VERSION(3002)
 		GRAPHICS_VERSION(3003)
+14020001231	GRAPHICS_VERSION_RANGE(2001,2004), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(3000), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(3002), FUNC(xe_rtp_match_psmi_enabled)
+16023683509	MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(3000), MEDIA_STEP(A0, B0), FUNC(xe_rtp_match_psmi_enabled)

 # SoC workaround - currently applies to all platforms with the following
 # primary GT GMDID
 14022085890	GRAPHICS_VERSION(2001)

 15015404425_disable	PLATFORM(PANTHERLAKE), MEDIA_STEP(B0, FOREVER)
+16026007364	MEDIA_VERSION(3000)

+4

include/drm/drm_device.h

···
  * Recovery methods for wedged device in order of less to more side-effects.
  * To be used with drm_dev_wedged_event() as recovery @method. Callers can
  * use any one, multiple (or'd) or none depending on their needs.
+ *
+ * Refer to "Device Wedging" chapter in Documentation/gpu/drm-uapi.rst for more
+ * details.
  */
 #define DRM_WEDGE_RECOVERY_NONE		BIT(0)	/* optional telemetry collection */
 #define DRM_WEDGE_RECOVERY_REBIND	BIT(1)	/* unbind + bind driver */
 #define DRM_WEDGE_RECOVERY_BUS_RESET	BIT(2)	/* unbind + reset bus device + bind */
+#define DRM_WEDGE_RECOVERY_VENDOR	BIT(3)	/* vendor specific recovery method */

 /**
  * struct drm_wedge_task_info - information about the guilty task of a wedge dev
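The recovery bits above are meant to be or'd together and reported to userspace as `WEDGED=<method1>[,..,<methodN>]`, ordered from fewer to more side-effects. The following is a minimal userspace sketch of that mapping, assuming the method names from the table in Documentation/gpu/drm-uapi.rst. `sketch_wedged_string()` and the `SKETCH_*` macros are hypothetical names that only mirror the kernel definitions, and reporting an empty or unrecognised mask as `unknown` is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdio.h>

/* Mirror the DRM_WEDGE_RECOVERY_* bit positions from the hunk above. */
#define SKETCH_WEDGE_RECOVERY_NONE      (1UL << 0)
#define SKETCH_WEDGE_RECOVERY_REBIND    (1UL << 1)
#define SKETCH_WEDGE_RECOVERY_BUS_RESET (1UL << 2)
#define SKETCH_WEDGE_RECOVERY_VENDOR    (1UL << 3)
#define SKETCH_WEDGE_RECOVERY_ALL       0xfUL

/* Indexed by bit position, so the order is less to more side-effects. */
static const char *const sketch_wedge_names[] = {
    "none", "rebind", "bus-reset", "vendor-specific",
};

/*
 * Hypothetical helper: format the uevent string for a method mask.
 * Returns 0 on success, -1 if buf is too small.
 */
static int sketch_wedged_string(unsigned long methods, char *buf, size_t len)
{
    size_t off;
    unsigned int bit;
    int first = 1;
    int n;

    /* Assumption: an empty or unrecognised mask is reported as "unknown". */
    if (methods == 0 || (methods & ~SKETCH_WEDGE_RECOVERY_ALL)) {
        n = snprintf(buf, len, "WEDGED=unknown");
        return (n < 0 || (size_t)n >= len) ? -1 : 0;
    }

    n = snprintf(buf, len, "WEDGED=");
    if (n < 0 || (size_t)n >= len)
        return -1;
    off = (size_t)n;

    for (bit = 0; bit < 4; bit++) {
        if (!(methods & (1UL << bit)))
            continue;
        /* Comma-separate every method after the first one. */
        n = snprintf(buf + off, len - off, "%s%s",
                     first ? "" : ",", sketch_wedge_names[bit]);
        if (n < 0 || (size_t)n >= len - off)
            return -1;
        off += (size_t)n;
        first = 0;
    }

    return 0;
}
```

Because the names are indexed by bit position, a mask of rebind plus bus-reset always comes out as `WEDGED=rebind,bus-reset`, whatever order the caller or'd the bits in.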

+2 -2

include/drm/drm_gpusvm.h

···
 struct drm_gpusvm_ops;
 struct drm_gpusvm_range;
 struct drm_pagemap;
-struct drm_pagemap_device_addr;
+struct drm_pagemap_addr;

 /**
  * struct drm_gpusvm_ops - Operations structure for GPU SVM
···
     struct interval_tree_node itree;
     struct list_head entry;
     unsigned long notifier_seq;
-    struct drm_pagemap_device_addr *dma_addr;
+    struct drm_pagemap_addr *dma_addr;
     struct drm_pagemap *dpagemap;
     struct drm_gpusvm_range_flags flags;
 };

+28 -22

include/drm/drm_pagemap.h

···
 #include <linux/hmm.h>
 #include <linux/types.h>

+#define NR_PAGES(order) (1U << (order))
+
 struct drm_pagemap;
 struct drm_pagemap_zdd;
 struct device;
···
 };

 /**
- * struct drm_pagemap_device_addr - Device address representation.
+ * struct drm_pagemap_addr - Address representation.
  * @addr: The dma address or driver-defined address for driver private interconnects.
  * @proto: The interconnect protocol.
  * @order: The page order of the device mapping. (Size is PAGE_SIZE << order).
···
  * Note: There is room for improvement here. We should be able to pack into
  * 64 bits.
  */
-struct drm_pagemap_device_addr {
+struct drm_pagemap_addr {
     dma_addr_t addr;
     u64 proto : 54;
     u64 order : 8;
···
 };

 /**
- * drm_pagemap_device_addr_encode() - Encode a dma address with metadata
+ * drm_pagemap_addr_encode() - Encode a dma address with metadata
  * @addr: The dma address or driver-defined address for driver private interconnects.
  * @proto: The interconnect protocol.
  * @order: The page order of the dma mapping. (Size is PAGE_SIZE << order).
  * @dir: The DMA direction.
  *
- * Return: A struct drm_pagemap_device_addr encoding the above information.
+ * Return: A struct drm_pagemap_addr encoding the above information.
  */
-static inline struct drm_pagemap_device_addr
-drm_pagemap_device_addr_encode(dma_addr_t addr,
-                               enum drm_interconnect_protocol proto,
-                               unsigned int order,
-                               enum dma_data_direction dir)
+static inline struct drm_pagemap_addr
+drm_pagemap_addr_encode(dma_addr_t addr,
+                        enum drm_interconnect_protocol proto,
+                        unsigned int order,
+                        enum dma_data_direction dir)
 {
-    return (struct drm_pagemap_device_addr) {
+    return (struct drm_pagemap_addr) {
         .addr = addr,
         .proto = proto,
         .order = order,
···
      * @order: The page order of the device mapping. (Size is PAGE_SIZE << order).
      * @dir: The transfer direction.
      */
-    struct drm_pagemap_device_addr (*device_map)(struct drm_pagemap *dpagemap,
-                                                 struct device *dev,
-                                                 struct page *page,
-                                                 unsigned int order,
-                                                 enum dma_data_direction dir);
+    struct drm_pagemap_addr (*device_map)(struct drm_pagemap *dpagemap,
+                                          struct device *dev,
+                                          struct page *page,
+                                          unsigned int order,
+                                          enum dma_data_direction dir);

     /**
      * @device_unmap: Unmap a device address previously obtained using @device_map.
···
      */
     void (*device_unmap)(struct drm_pagemap *dpagemap,
                          struct device *dev,
-                         struct drm_pagemap_device_addr addr);
+                         struct drm_pagemap_addr addr);

     /**
      * @populate_mm: Populate part of the mm with @dpagemap memory,
···
     /**
      * @copy_to_devmem: Copy to device memory (required for migration)
      * @pages: Pointer to array of device memory pages (destination)
-     * @dma_addr: Pointer to array of DMA addresses (source)
+     * @pagemap_addr: Pointer to array of DMA information (source)
      * @npages: Number of pages to copy
      *
-     * Copy pages to device memory.
+     * Copy pages to device memory. If the order of a @pagemap_addr entry
+     * is greater than 0, the entry is populated but subsequent entries
+     * within the range of that order are not populated.
      *
      * Return: 0 on success, a negative error code on failure.
      */
     int (*copy_to_devmem)(struct page **pages,
-                          dma_addr_t *dma_addr,
+                          struct drm_pagemap_addr *pagemap_addr,
                           unsigned long npages);

     /**
      * @copy_to_ram: Copy to system RAM (required for migration)
      * @pages: Pointer to array of device memory pages (source)
-     * @dma_addr: Pointer to array of DMA addresses (destination)
+     * @pagemap_addr: Pointer to array of DMA information (destination)
      * @npages: Number of pages to copy
      *
-     * Copy pages to system RAM.
+     * Copy pages to system RAM. If the order of a @pagemap_addr entry
+     * is greater than 0, the entry is populated but subsequent entries
+     * within the range of that order are not populated.
      *
      * Return: 0 on success, a negative error code on failure.
      */
     int (*copy_to_ram)(struct page **pages,
-                       dma_addr_t *dma_addr,
+                       struct drm_pagemap_addr *pagemap_addr,
                        unsigned long npages);
 };

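The new `copy_to_devmem` / `copy_to_ram` contract says that an entry with a non-zero order stands for `NR_PAGES(order)` pages and that the entries it covers are left unpopulated. The following is a minimal userspace sketch of how a callback might walk such an array, not code from any driver. `struct sketch_pagemap_addr` is a hypothetical stand-in with the same bitfield widths as `struct drm_pagemap_addr`, `sketch_walk()` is an invented helper, and clamping a chunk that would run past `npages` is a defensive assumption of this sketch.

```c
#include <stdint.h>

#define NR_PAGES(order) (1U << (order))

/* Hypothetical userspace mirror of struct drm_pagemap_addr. */
struct sketch_pagemap_addr {
    uint64_t addr;       /* DMA or driver-defined address */
    uint64_t proto : 54; /* interconnect protocol */
    uint64_t order : 8;  /* chunk covers NR_PAGES(order) pages */
    uint64_t dir : 2;    /* DMA direction */
};

/*
 * Walk an array of npages entries, visiting only the populated ones.
 * Returns the number of chunks visited and stores the number of pages
 * they cover in *pages_covered.
 */
static unsigned long sketch_walk(const struct sketch_pagemap_addr *addr,
                                 unsigned long npages,
                                 unsigned long *pages_covered)
{
    unsigned long i = 0, chunks = 0, covered = 0;

    while (i < npages) {
        unsigned long n = NR_PAGES(addr[i].order);

        /* Assumption: never step past the end of the array. */
        if (n > npages - i)
            n = npages - i;

        /* A real callback would copy n pages starting at addr[i].addr. */
        chunks++;
        covered += n;

        /* Entries i + 1 .. i + n - 1 are not populated, so skip them. */
        i += n;
    }

    *pages_covered = covered;
    return chunks;
}
```

For an 8-entry array whose entry 0 has order 2 and entry 6 has order 1, the walk visits entries 0, 4, 5 and 6: four chunks covering all eight pages.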

+5

include/drm/intel/pciids.h

···
 #define __PCIIDS_H__

 #ifdef __KERNEL__
+#define INTEL_PCI_DEVICE(_id, _info) { \
+    PCI_DEVICE(PCI_VENDOR_ID_INTEL, (_id)), \
+    .driver_data = (kernel_ulong_t)(_info), \
+}
+
 #define INTEL_VGA_DEVICE(_id, _info) { \
     PCI_DEVICE(PCI_VENDOR_ID_INTEL, (_id)), \
     .class = PCI_BASE_CLASS_DISPLAY << 16, .class_mask = 0xff << 16, \

+281 -1

include/uapi/drm/xe_drm.h

···
  * - &DRM_IOCTL_XE_EXEC
  * - &DRM_IOCTL_XE_WAIT_USER_FENCE
  * - &DRM_IOCTL_XE_OBSERVATION
+ * - &DRM_IOCTL_XE_MADVISE
+ * - &DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS
  */

 /*
···
 #define DRM_XE_EXEC			0x09
 #define DRM_XE_WAIT_USER_FENCE		0x0a
 #define DRM_XE_OBSERVATION		0x0b
+#define DRM_XE_MADVISE			0x0c
+#define DRM_XE_VM_QUERY_MEM_RANGE_ATTRS	0x0d

 /* Must be kept compact -- no holes */

···
 #define DRM_IOCTL_XE_EXEC			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
 #define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
 #define DRM_IOCTL_XE_OBSERVATION		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OBSERVATION, struct drm_xe_observation_param)
+#define DRM_IOCTL_XE_MADVISE			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_MADVISE, struct drm_xe_madvise)
+#define DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_QUERY_MEM_RANGE_ATTRS, struct drm_xe_vm_query_mem_range_attr)

 /**
  * DOC: Xe IOCTL Extensions
···
  * gem creation
  *
  * The @flags can be:
- *  - %DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING
+ *  - %DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING - Modify the GEM object
+ *    allocation strategy by deferring physical memory allocation
+ *    until the object is either bound to a virtual memory region via
+ *    VM_BIND or accessed by the CPU. As a result, no backing memory is
+ *    reserved at the time of GEM object creation.
  *  - %DRM_XE_GEM_CREATE_FLAG_SCANOUT
  *  - %DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM - When using VRAM as a
  *    possible placement, ensure that the corresponding VRAM allocation
···
  *    valid on VMs with DRM_XE_VM_CREATE_FLAG_FAULT_MODE set. The CPU address
  *    mirror flag are only valid for DRM_XE_VM_BIND_OP_MAP operations, the BO
  *    handle MBZ, and the BO offset MBZ.
+ *
+ * The @prefetch_mem_region_instance for %DRM_XE_VM_BIND_OP_PREFETCH can also be:
+ *  - %DRM_XE_CONSULT_MEM_ADVISE_PREF_LOC, which ensures prefetching occurs in
+ *    the memory region advised by madvise.
  */
 struct drm_xe_vm_bind_op {
     /** @extensions: Pointer to the first extension struct, if any */
···
     /** @flags: Bind flags */
     __u32 flags;

+#define DRM_XE_CONSULT_MEM_ADVISE_PREF_LOC	-1
     /**
      * @prefetch_mem_region_instance: Memory region to prefetch VMA to.
      * It is a region instance, not a mask.
···
      * Sampling rates are specified in GPU clock cycles.
      */
     __u64 sampling_rates[];
+};
+
+/**
+ * struct drm_xe_madvise - Input of &DRM_IOCTL_XE_MADVISE
+ *
+ * This structure is used to set memory attributes for a virtual address range
+ * in a VM. The type of attribute is specified by @type, and the corresponding
+ * union member is used to provide additional parameters for @type.
+ *
+ * Supported attribute types:
+ *  - DRM_XE_MEM_RANGE_ATTR_PREFERRED_LOC: Set preferred memory location.
+ *  - DRM_XE_MEM_RANGE_ATTR_ATOMIC: Set atomic access policy.
+ *  - DRM_XE_MEM_RANGE_ATTR_PAT: Set page attribute table index.
+ *
+ * Example:
+ *
+ * .. code-block:: C
+ *
+ *    struct drm_xe_madvise madvise = {
+ *        .vm_id = vm_id,
+ *        .start = 0x100000,
+ *        .range = 0x2000,
+ *        .type = DRM_XE_MEM_RANGE_ATTR_ATOMIC,
+ *        .atomic_val = DRM_XE_ATOMIC_DEVICE,
+ *    };
+ *
+ *    ioctl(fd, DRM_IOCTL_XE_MADVISE, &madvise);
+ *
+ */
+struct drm_xe_madvise {
+    /** @extensions: Pointer to the first extension struct, if any */
+    __u64 extensions;
+
+    /** @start: start of the virtual address range */
+    __u64 start;
+
+    /** @range: size of the virtual address range */
+    __u64 range;
+
+    /** @vm_id: vm_id of the virtual range */
+    __u32 vm_id;
+
+#define DRM_XE_MEM_RANGE_ATTR_PREFERRED_LOC	0
+#define DRM_XE_MEM_RANGE_ATTR_ATOMIC		1
+#define DRM_XE_MEM_RANGE_ATTR_PAT		2
+    /** @type: type of attribute */
+    __u32 type;
+
+    union {
+        /**
+         * @preferred_mem_loc: preferred memory location
+         *
+         * Used when @type == DRM_XE_MEM_RANGE_ATTR_PREFERRED_LOC
+         *
+         * Supported values for @preferred_mem_loc.devmem_fd:
+         * - DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE: set vram of fault tile as preferred loc
+         * - DRM_XE_PREFERRED_LOC_DEFAULT_SYSTEM: set smem as preferred loc
+         *
+         * Supported values for @preferred_mem_loc.migration_policy:
+         * - DRM_XE_MIGRATE_ALL_PAGES
+         * - DRM_XE_MIGRATE_ONLY_SYSTEM_PAGES
+         */
+        struct {
+#define DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE	0
+#define DRM_XE_PREFERRED_LOC_DEFAULT_SYSTEM	-1
+            /** @preferred_mem_loc.devmem_fd: fd for preferred loc */
+            __u32 devmem_fd;
+
+#define DRM_XE_MIGRATE_ALL_PAGES		0
+#define DRM_XE_MIGRATE_ONLY_SYSTEM_PAGES	1
+            /** @preferred_mem_loc.migration_policy: Page migration policy */
+            __u16 migration_policy;
+
+            /** @preferred_mem_loc.pad : MBZ */
+            __u16 pad;
+
+            /** @preferred_mem_loc.reserved : Reserved */
+            __u64 reserved;
+        } preferred_mem_loc;
+
+        /**
+         * @atomic: Atomic access policy
+         *
+         * Used when @type == DRM_XE_MEM_RANGE_ATTR_ATOMIC.
+         *
+         * Supported values for @atomic.val:
+         * - DRM_XE_ATOMIC_UNDEFINED: Undefined or default behaviour.
+         *   Support both GPU and CPU atomic operations for system allocator.
+         *   Support GPU atomic operations for normal(bo) allocator.
+         * - DRM_XE_ATOMIC_DEVICE: Support GPU atomic operations.
+         * - DRM_XE_ATOMIC_GLOBAL: Support both GPU and CPU atomic operations.
+         * - DRM_XE_ATOMIC_CPU: Support CPU atomic only, no GPU atomics supported.
+         */
+        struct {
+#define DRM_XE_ATOMIC_UNDEFINED	0
+#define DRM_XE_ATOMIC_DEVICE	1
+#define DRM_XE_ATOMIC_GLOBAL	2
+#define DRM_XE_ATOMIC_CPU	3
+            /** @atomic.val: value of atomic operation */
+            __u32 val;
+
+            /** @atomic.pad: MBZ */
+            __u32 pad;
+
+            /** @atomic.reserved: Reserved */
+            __u64 reserved;
+        } atomic;
+
+        /**
+         * @pat_index: Page attribute table index
+         *
+         * Used when @type == DRM_XE_MEM_RANGE_ATTR_PAT.
+         */
+        struct {
+            /** @pat_index.val: PAT index value */
+            __u32 val;
+
+            /** @pat_index.pad: MBZ */
+            __u32 pad;
+
+            /** @pat_index.reserved: Reserved */
+            __u64 reserved;
+        } pat_index;
+    };
+
+    /** @reserved: Reserved */
+    __u64 reserved[2];
+};
+
+/**
+ * struct drm_xe_mem_range_attr - Output of &DRM_IOCTL_XE_VM_QUERY_MEM_RANGES_ATTRS
+ *
+ * This structure is provided by userspace and filled by KMD in response to the
+ * DRM_IOCTL_XE_VM_QUERY_MEM_RANGES_ATTRS ioctl. It describes memory attributes of
+ * a memory ranges within a user specified address range in a VM.
+ *
+ * The structure includes information such as atomic access policy,
+ * page attribute table (PAT) index, and preferred memory location.
+ * Userspace allocates an array of these structures and passes a pointer to the
+ * ioctl to retrieve attributes for each memory ranges
+ *
+ * @extensions: Pointer to the first extension struct, if any
+ * @start: Start address of the memory range
+ * @end: End address of the virtual memory range
+ *
+ */
+struct drm_xe_mem_range_attr {
+    /** @extensions: Pointer to the first extension struct, if any */
+    __u64 extensions;
+
+    /** @start: start of the memory range */
+    __u64 start;
+
+    /** @end: end of the memory range */
+    __u64 end;
+
+    /** @preferred_mem_loc: preferred memory location */
+    struct {
+        /** @preferred_mem_loc.devmem_fd: fd for preferred loc */
+        __u32 devmem_fd;
+
+        /** @preferred_mem_loc.migration_policy: Page migration policy */
+        __u32 migration_policy;
+    } preferred_mem_loc;
+
+    /** @atomic: Atomic access policy */
+    struct {
+        /** @atomic.val: atomic attribute */
+        __u32 val;
+
+        /** @atomic.reserved: Reserved */
+        __u32 reserved;
+    } atomic;
+
+    /** @pat_index: Page attribute table index */
+    struct {
+        /** @pat_index.val: PAT index */
+        __u32 val;
+
+        /** @pat_index.reserved: Reserved */
+        __u32 reserved;
+    } pat_index;
+
+    /** @reserved: Reserved */
+    __u64 reserved[2];
+};
+
+/**
+ * struct drm_xe_vm_query_mem_range_attr - Input of &DRM_IOCTL_XE_VM_QUERY_MEM_ATTRIBUTES
+ *
+ * This structure is used to query memory attributes of memory regions
+ * within a user specified address range in a VM. It provides detailed
+ * information about each memory range, including atomic access policy,
+ * page attribute table (PAT) index, and preferred memory location.
+ *
+ * Userspace first calls the ioctl with @num_mem_ranges = 0,
+ * @sizeof_mem_ranges_attr = 0 and @vector_of_vma_mem_attr = NULL to retrieve
+ * the number of memory regions and size of each memory range attribute.
+ * Then, it allocates a buffer of that size and calls the ioctl again to fill
+ * the buffer with memory range attributes.
+ *
+ * If second call fails with -ENOSPC, it means memory ranges changed between
+ * first call and now, retry IOCTL again with @num_mem_ranges = 0,
+ * @sizeof_mem_ranges_attr = 0 and @vector_of_vma_mem_attr = NULL followed by
+ * Second ioctl call.
+ *
+ * Example:
+ *
+ * .. code-block:: C
+ *
+ *    struct drm_xe_vm_query_mem_range_attr query = {
+ *        .vm_id = vm_id,
+ *        .start = 0x100000,
+ *        .range = 0x2000,
+ *    };
+ *
+ *    // First ioctl call to get num of mem regions and sizeof each attribute
+ *    ioctl(fd, DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS, &query);
+ *
+ *    // Allocate buffer for the memory region attributes
+ *    void *ptr = malloc(query.num_mem_ranges * query.sizeof_mem_range_attr);
+ *    void *ptr_start = ptr;
+ *
+ *    query.vector_of_mem_attr = (uintptr_t)ptr;
+ *
+ *    // Second ioctl call to actually fill the memory attributes
+ *    ioctl(fd, DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS, &query);
+ *
+ *    // Iterate over the returned memory region attributes
+ *    for (unsigned int i = 0; i < query.num_mem_ranges; ++i) {
+ *        struct drm_xe_mem_range_attr *attr = (struct drm_xe_mem_range_attr *)ptr;
+ *
+ *        // Do something with attr
+ *
+ *        // Move pointer by one entry
+ *        ptr += query.sizeof_mem_range_attr;
+ *    }
+ *
+ *    free(ptr_start);
+ */
+struct drm_xe_vm_query_mem_range_attr {
+    /** @extensions: Pointer to the first extension struct, if any */
+    __u64 extensions;
+
+    /** @vm_id: vm_id of the virtual range */
+    __u32 vm_id;
+
+    /** @num_mem_ranges: number of mem_ranges in range */
+    __u32 num_mem_ranges;
+
+    /** @start: start of the virtual address range */
+    __u64 start;
+
+    /** @range: size of the virtual address range */
+    __u64 range;
+
+    /** @sizeof_mem_range_attr: size of struct drm_xe_mem_range_attr */
+    __u64 sizeof_mem_range_attr;
+
+    /** @vector_of_mem_attr: userptr to array of struct drm_xe_mem_range_attr */
+    __u64 vector_of_mem_attr;
+
+    /** @reserved: Reserved */
+    __u64 reserved[2];
+
 };

 #if defined(__cplusplus)
