Merge tag 'drm-misc-next-2023-11-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

+8 -3

Documentation/accel/qaic/aic100.rst

@@ -36,8 +36,9 @@
 
 AIC100 does not implement FLR (function level reset).
 
-AIC100 implements MSI but does not implement MSI-X. AIC100 requires 17 MSIs to
-operate (1 for MHI, 16 for the DMA Bridge).
+AIC100 implements MSI but does not implement MSI-X. AIC100 prefers 17 MSIs to
+operate (1 for MHI, 16 for the DMA Bridge). Falling back to 1 MSI is possible in
+scenarios where reserving 32 MSIs isn't feasible.
 
 As a PCIe device, AIC100 utilizes BARs to provide host interfaces to the device
 hardware. AIC100 provides 3, 64-bit BARs.
@@ -220,9 +221,13 @@
 +----------------+---------+----------+----------------------------------------+
 | QAIC_DEBUG     | 18 & 19 | AMSS     | Not used.                              |
 +----------------+---------+----------+----------------------------------------+
-| QAIC_TIMESYNC  | 20 & 21 | SBL/AMSS | Used to synchronize timestamps in the  |
+| QAIC_TIMESYNC  | 20 & 21 | SBL      | Used to synchronize timestamps in the  |
 |                |         |          | device side logs with the host time    |
 |                |         |          | source.                                |
++----------------+---------+----------+----------------------------------------+
+| QAIC_TIMESYNC  | 22 & 23 | AMSS     | Used to periodically synchronize       |
+| _PERIODIC      |         |          | timestamps in the device side logs with|
+|                |         |          | the host time source.                  |
 +----------------+---------+----------+----------------------------------------+
 
 DMA Bridge
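The fallback described in the first hunk can be modelled in a few lines of C. This is a userspace sketch under stated assumptions, not qaic driver code: `msi_round_up` and `pick_msi_count` are hypothetical helpers invented for illustration. It assumes that plain MSI grants vectors in power-of-two blocks (which is why a preference for 17 vectors becomes a reservation of 32) and that the only fallback is one shared vector.

```c
#define AIC100_MSI_WANTED 17u /* 1 for MHI + 16 for the DMA Bridge */

/*
 * Round a vector count up to the next power of two.
 * Hypothetical helper: models the power-of-two granularity of plain MSI.
 */
static unsigned int msi_round_up(unsigned int wanted)
{
	unsigned int n = 1;

	while (n < wanted)
		n <<= 1;
	return n;
}

/*
 * Decide how many vectors to use given how many the platform can grant.
 * Returns the full block (32) when available, 1 for single MSI mode,
 * or 0 when no vector can be granted at all.
 */
static unsigned int pick_msi_count(unsigned int grantable)
{
	unsigned int full = msi_round_up(AIC100_MSI_WANTED);

	if (grantable >= full)
		return full;
	if (grantable >= 1)
		return 1;
	return 0;
}
```

In this model a platform that can grant 16 vectors still ends up in single MSI mode, because 16 is not enough to cover the 17 interrupt sources.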

+28

Documentation/accel/qaic/qaic.rst

@@ -10,6 +10,9 @@
 Interrupts
 ==========
 
+IRQ Storm Mitigation
+--------------------
+
 While the AIC100 DMA Bridge hardware implements an IRQ storm mitigation
 mechanism, it is still possible for an IRQ storm to occur. A storm can happen
 if the workload is particularly quick, and the host is responsive. If the host
@@ -34,6 +37,26 @@
 generates 100k IRQs per second (per /proc/interrupts) is reduced to roughly 64
 IRQs over 5 minutes while keeping the host system stable, and having the same
 workload throughput performance (within run to run noise variation).
+
+Single MSI Mode
+---------------
+
+MultiMSI is not well supported on all systems; virtualized ones even less so
+(circa 2023). Between hypervisors masking the PCIe MSI capability structure to
+large memory requirements for vIOMMUs (required for supporting MultiMSI), it is
+useful to be able to fall back to a single MSI when needed.
+
+To support this fallback, we allow the case where only one MSI is able to be
+allocated, and share that one MSI between MHI and the DBCs. The device detects
+when only one MSI has been configured and directs the interrupts for the DBCs
+to the interrupt normally used for MHI. Unfortunately this means that the
+interrupt handlers for every DBC and MHI wake up for every interrupt that
+arrives; however, the DBC threaded irq handlers only are started when work to be
+done is detected (MHI will always start its threaded handler).
+
+If the DBC is configured to force MSI interrupts, this can circumvent the
+software IRQ storm mitigation mentioned above. Since the MSI is shared it is
+never disabled, allowing each new entry to the FIFO to trigger a new interrupt.
 
 
 Neural Network Control (NNC) Protocol
@@ -178,3 +201,8 @@
 
 Sets the polling interval in microseconds (us) when datapath polling is active.
 Takes effect at the next polling interval. Default is 100 (100 us).
+
+**timesync_delay_ms (unsigned int)**
+
+Sets the time interval in milliseconds (ms) between two consecutive timesync
+operations. Default is 1000 (1000 ms).
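The cost of Single MSI Mode described in this hunk (every handler runs on every interrupt, but DBC threads start only when there is work) can be illustrated with a small userspace model. This is a sketch, not the qaic implementation: `struct dbc_model`, `dbc_hardirq`, `mhi_hardirq` and `shared_msi_fire` are hypothetical names, and the "work pending" flag stands in for whatever check the real driver performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one DBC: only tracks whether work is queued. */
struct dbc_model {
	bool work_pending;
};

enum irq_model_result {
	IRQ_MODEL_HANDLED,     /* hard handler ran, nothing more to do */
	IRQ_MODEL_WAKE_THREAD, /* hard handler asks for its threaded part */
};

/* A DBC hard handler starts its thread only if it detects work. */
static enum irq_model_result dbc_hardirq(const struct dbc_model *dbc)
{
	return dbc->work_pending ? IRQ_MODEL_WAKE_THREAD : IRQ_MODEL_HANDLED;
}

/* Per the documentation, MHI always starts its threaded handler. */
static enum irq_model_result mhi_hardirq(void)
{
	return IRQ_MODEL_WAKE_THREAD;
}

/*
 * Simulate one arrival of the shared MSI: every hard handler runs,
 * and the return value is the number of threaded handlers started.
 */
static size_t shared_msi_fire(const struct dbc_model *dbcs, size_t n)
{
	size_t woken = 0;
	size_t i;

	if (mhi_hardirq() == IRQ_MODEL_WAKE_THREAD)
		woken++;
	for (i = 0; i < n; i++)
		if (dbc_hardirq(&dbcs[i]) == IRQ_MODEL_WAKE_THREAD)
			woken++;
	return woken;
}
```

With four modelled DBCs of which two have work queued, one shared interrupt runs five hard handlers but starts only three threads (MHI plus the two busy DBCs).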

+1

Documentation/devicetree/bindings/gpu/brcm,bcm-v3d.yaml

@@ -17,6 +17,7 @@
   compatible:
     enum:
       - brcm,2711-v3d
+      - brcm,2712-v3d
       - brcm,7268-v3d
       - brcm,7278-v3d
 

+6

Documentation/gpu/drm-kms-helpers.rst

@@ -363,6 +363,12 @@
 .. kernel-doc:: drivers/gpu/drm/drm_edid.c
    :export:
 
+.. kernel-doc:: include/drm/drm_eld.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_eld.c
+   :export:
+
 SCDC Helper Functions Reference
 ===============================
 

+6

Documentation/gpu/drm-mm.rst

@@ -552,6 +552,12 @@
 .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
    :doc: Overview
 
+Flow Control
+------------
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Flow Control
+
 Scheduler Function References
 -----------------------------
 

+17

Documentation/gpu/todo.rst

@@ -621,6 +621,23 @@
 
 Level: Intermediate
 
+Clean up and document former selftests suites
+---------------------------------------------
+
+Some KUnit test suites (drm_buddy, drm_cmdline_parser, drm_damage_helper,
+drm_format, drm_framebuffer, drm_dp_mst_helper, drm_mm, drm_plane_helper and
+drm_rect) are former selftests suites that have been converted over when KUnit
+was first introduced.
+
+These suites were fairly undocumented, and with different goals than what unit
+tests can be. Trying to identify what each test in these suites actually test
+for, whether that makes sense for a unit test, and either remove it if it
+doesn't or document it if it does would be of great help.
+
+Contact: Maxime Ripard <mripard@kernel.org>
+
+Level: Intermediate
+
 Enable trinity for DRM
 ----------------------
 

+3 -6

MAINTAINERS

@@ -6503,8 +6503,7 @@
 F:	drivers/gpu/drm/sun4i/sun8i*
 
 DRM DRIVER FOR ARM PL111 CLCD
-M:	Emma Anholt <emma@anholt.net>
-S:	Supported
+S:	Orphan
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 F:	drivers/gpu/drm/pl111/
 
@@ -6619,8 +6618,7 @@
 F:	drivers/gpu/drm/panel/panel-himax-hx8394.c
 
 DRM DRIVER FOR HX8357D PANELS
-M:	Emma Anholt <emma@anholt.net>
-S:	Maintained
+S:	Orphan
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 F:	Documentation/devicetree/bindings/display/himax,hx8357d.txt
 F:	drivers/gpu/drm/tiny/hx8357d.c
@@ -7213,8 +7211,8 @@
 F:	drivers/gpu/drm/omapdrm/
 
 DRM DRIVERS FOR V3D
-M:	Emma Anholt <emma@anholt.net>
 M:	Melissa Wen <mwen@igalia.com>
+M:	Maíra Canal <mcanal@igalia.com>
 S:	Supported
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 F:	Documentation/devicetree/bindings/gpu/brcm,bcm-v3d.yaml
@@ -7222,7 +7220,6 @@
 F:	include/uapi/drm/v3d_drm.h
 
 DRM DRIVERS FOR VC4
-M:	Emma Anholt <emma@anholt.net>
 M:	Maxime Ripard <mripard@kernel.org>
 S:	Supported
 T:	git git://github.com/anholt/linux

+6 -5

drivers/accel/ivpu/Kconfig

@@ -1,16 +1,17 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 config DRM_ACCEL_IVPU
-	tristate "Intel VPU for Meteor Lake and newer"
+	tristate "Intel NPU (Neural Processing Unit)"
 	depends on DRM_ACCEL
 	depends on X86_64 && !UML
 	depends on PCI && PCI_MSI
 	select FW_LOADER
-	select SHMEM
+	select DRM_GEM_SHMEM_HELPER
 	select GENERIC_ALLOCATOR
 	help
-	  Choose this option if you have a system that has an 14th generation Intel CPU
-	  or newer. VPU stands for Versatile Processing Unit and it's a CPU-integrated
-	  inference accelerator for Computer Vision and Deep Learning applications.
+	  Choose this option if you have a system with an 14th generation
+	  Intel CPU (Meteor Lake) or newer. Intel NPU (formerly called Intel VPU)
+	  is a CPU-integrated inference accelerator for Computer Vision
+	  and Deep Learning applications.
 
 	  If "M" is selected, the module will be called intel_vpu.

+57

drivers/accel/ivpu/ivpu_debugfs.c

@@ -14,6 +14,7 @@
 #include "ivpu_fw.h"
 #include "ivpu_fw_log.h"
 #include "ivpu_gem.h"
+#include "ivpu_hw.h"
 #include "ivpu_jsm_msg.h"
 #include "ivpu_pm.h"
 
@@ -115,6 +116,31 @@
 	{"reset_pending", reset_pending_show, 0},
 };
 
+static ssize_t
+dvfs_mode_fops_write(struct file *file, const char __user *user_buf, size_t size, loff_t *pos)
+{
+	struct ivpu_device *vdev = file->private_data;
+	struct ivpu_fw_info *fw = vdev->fw;
+	u32 dvfs_mode;
+	int ret;
+
+	ret = kstrtou32_from_user(user_buf, size, 0, &dvfs_mode);
+	if (ret < 0)
+		return ret;
+
+	fw->dvfs_mode = dvfs_mode;
+
+	ivpu_pm_schedule_recovery(vdev);
+
+	return size;
+}
+
+static const struct file_operations dvfs_mode_fops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.write = dvfs_mode_fops_write,
+};
+
 static int fw_log_show(struct seq_file *s, void *v)
 {
 	struct ivpu_device *vdev = s->private;
@@ -149,6 +175,30 @@
 	.read = seq_read,
 	.llseek = seq_lseek,
 	.release = single_release,
+};
+
+static ssize_t
+fw_profiling_freq_fops_write(struct file *file, const char __user *user_buf,
+			     size_t size, loff_t *pos)
+{
+	struct ivpu_device *vdev = file->private_data;
+	bool enable;
+	int ret;
+
+	ret = kstrtobool_from_user(user_buf, size, &enable);
+	if (ret < 0)
+		return ret;
+
+	ivpu_hw_profiling_freq_drive(vdev, enable);
+	ivpu_pm_schedule_recovery(vdev);
+
+	return size;
+}
+
+static const struct file_operations fw_profiling_freq_fops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.write = fw_profiling_freq_fops_write,
 };
 
 static ssize_t
@@ -280,6 +330,9 @@
 	debugfs_create_file("force_recovery", 0200, debugfs_root, vdev,
 			    &ivpu_force_recovery_fops);
 
+	debugfs_create_file("dvfs_mode", 0200, debugfs_root, vdev,
+			    &dvfs_mode_fops);
+
 	debugfs_create_file("fw_log", 0644, debugfs_root, vdev,
 			    &fw_log_fops);
 	debugfs_create_file("fw_trace_destination_mask", 0200, debugfs_root, vdev,
@@ -291,4 +344,8 @@
 
 	debugfs_create_file("reset_engine", 0200, debugfs_root, vdev,
 			    &ivpu_reset_engine_fops);
+
+	if (ivpu_hw_gen(vdev) >= IVPU_HW_40XX)
+		debugfs_create_file("fw_profiling_freq_drive", 0200,
+				    debugfs_root, vdev, &fw_profiling_freq_fops);
 }

+27 -22

drivers/accel/ivpu/ivpu_drv.c

@@ -31,8 +31,6 @@
 			   __stringify(DRM_IVPU_DRIVER_MINOR) "."
 #endif
 
-static const struct drm_driver driver;
-
 static struct lock_class_key submitted_jobs_xa_lock_class_key;
 
 int ivpu_dbg_mask;
@@ -41,7 +39,7 @@
 
 int ivpu_test_mode;
 module_param_named_unsafe(test_mode, ivpu_test_mode, int, 0644);
-MODULE_PARM_DESC(test_mode, "Test mode: 0 - normal operation, 1 - fw unit test, 2 - null hw");
+MODULE_PARM_DESC(test_mode, "Test mode mask. See IVPU_TEST_MODE_* macros.");
 
 u8 ivpu_pll_min_ratio;
 module_param_named(pll_min_ratio, ivpu_pll_min_ratio, byte, 0644);
@@ -93,8 +91,8 @@
 	ivpu_dbg(vdev, FILE, "file_priv release: ctx %u\n", file_priv->ctx.id);
 
 	ivpu_cmdq_release_all(file_priv);
-	ivpu_bo_remove_all_bos_from_context(&file_priv->ctx);
 	ivpu_jsm_context_release(vdev, file_priv->ctx.id);
+	ivpu_bo_remove_all_bos_from_context(vdev, &file_priv->ctx);
 	ivpu_mmu_user_context_fini(vdev, &file_priv->ctx);
 	drm_WARN_ON(&vdev->drm, xa_erase_irq(&vdev->context_xa, file_priv->ctx.id) != file_priv);
 	mutex_destroy(&file_priv->lock);
@@ -317,16 +315,14 @@
 	unsigned long timeout;
 	int ret;
 
-	if (ivpu_test_mode == IVPU_TEST_MODE_FW_TEST)
+	if (ivpu_test_mode & IVPU_TEST_MODE_FW_TEST)
 		return 0;
 
-	ivpu_ipc_consumer_add(vdev, &cons, IVPU_IPC_CHAN_BOOT_MSG);
+	ivpu_ipc_consumer_add(vdev, &cons, IVPU_IPC_CHAN_BOOT_MSG, NULL);
 
 	timeout = jiffies + msecs_to_jiffies(vdev->timeout.boot);
 	while (1) {
-		ret = ivpu_ipc_irq_handler(vdev);
-		if (ret)
-			break;
+		ivpu_ipc_irq_handler(vdev, NULL);
 		ret = ivpu_ipc_receive(vdev, &cons, &ipc_hdr, NULL, 0);
 		if (ret != -ETIMEDOUT || time_after_eq(jiffies, timeout))
 			break;
@@ -362,7 +358,7 @@
 	int ret;
 
 	/* Update boot params located at first 4KB of FW memory */
-	ivpu_fw_boot_params_setup(vdev, vdev->fw->mem->kvaddr);
+	ivpu_fw_boot_params_setup(vdev, ivpu_bo_vaddr(vdev->fw->mem));
 
 	ret = ivpu_hw_boot_fw(vdev);
 	if (ret) {
@@ -414,7 +410,9 @@
 
 	.open = ivpu_open,
 	.postclose = ivpu_postclose,
-	.gem_prime_import = ivpu_gem_prime_import,
+
+	.gem_create_object = ivpu_gem_create_object,
+	.gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
 
 	.ioctls = ivpu_drm_ioctls,
 	.num_ioctls = ARRAY_SIZE(ivpu_drm_ioctls),
@@ -426,6 +424,13 @@
 	.major = DRM_IVPU_DRIVER_MAJOR,
 	.minor = DRM_IVPU_DRIVER_MINOR,
 };
+
+static irqreturn_t ivpu_irq_thread_handler(int irq, void *arg)
+{
+	struct ivpu_device *vdev = arg;
+
+	return ivpu_ipc_irq_thread_handler(vdev);
+}
 
 static int ivpu_irq_init(struct ivpu_device *vdev)
 {
@@ -440,8 +445,8 @@
 
 	vdev->irq = pci_irq_vector(pdev, 0);
 
-	ret = devm_request_irq(vdev->drm.dev, vdev->irq, vdev->hw->ops->irq_handler,
-			       IRQF_NO_AUTOEN, DRIVER_NAME, vdev);
+	ret = devm_request_threaded_irq(vdev->drm.dev, vdev->irq, vdev->hw->ops->irq_handler,
+					ivpu_irq_thread_handler, IRQF_NO_AUTOEN, DRIVER_NAME, vdev);
 	if (ret)
 		ivpu_err(vdev, "Failed to request an IRQ %d\n", ret);
 
@@ -533,6 +538,11 @@
 	xa_init_flags(&vdev->context_xa, XA_FLAGS_ALLOC);
 	xa_init_flags(&vdev->submitted_jobs_xa, XA_FLAGS_ALLOC1);
 	lockdep_set_class(&vdev->submitted_jobs_xa.xa_lock, &submitted_jobs_xa_lock_class_key);
+	INIT_LIST_HEAD(&vdev->bo_list);
+
+	ret = drmm_mutex_init(&vdev->drm, &vdev->bo_list_lock);
+	if (ret)
+		goto err_xa_destroy;
 
 	ret = ivpu_pci_init(vdev);
 	if (ret)
@@ -550,7 +560,7 @@
 	/* Power up early so the rest of init code can access VPU registers */
 	ret = ivpu_hw_power_up(vdev);
 	if (ret)
-		goto err_xa_destroy;
+		goto err_power_down;
 
 	ret = ivpu_mmu_global_context_init(vdev);
 	if (ret)
@@ -574,20 +584,15 @@
 
 	ivpu_pm_init(vdev);
 
-	ret = ivpu_job_done_thread_init(vdev);
+	ret = ivpu_boot(vdev);
 	if (ret)
 		goto err_ipc_fini;
 
-	ret = ivpu_boot(vdev);
-	if (ret)
-		goto err_job_done_thread_fini;
-
+	ivpu_job_done_consumer_init(vdev);
 	ivpu_pm_enable(vdev);
 
 	return 0;
 
-err_job_done_thread_fini:
-	ivpu_job_done_thread_fini(vdev);
 err_ipc_fini:
 	ivpu_ipc_fini(vdev);
 err_fw_fini:
@@ -612,7 +617,7 @@
 	ivpu_shutdown(vdev);
 	if (IVPU_WA(d3hot_after_power_off))
 		pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
-	ivpu_job_done_thread_fini(vdev);
+	ivpu_job_done_consumer_fini(vdev);
 	ivpu_pm_cancel_recovery(vdev);
 
 	ivpu_ipc_fini(vdev);

+13 -5

drivers/accel/ivpu/ivpu_drv.h

@@ -17,9 +17,10 @@
 #include <uapi/drm/ivpu_accel.h>
 
 #include "ivpu_mmu_context.h"
+#include "ivpu_ipc.h"
 
 #define DRIVER_NAME "intel_vpu"
-#define DRIVER_DESC "Driver for Intel Versatile Processing Unit (VPU)"
+#define DRIVER_DESC "Driver for Intel NPU (Neural Processing Unit)"
 #define DRIVER_DATE "20230117"
 
 #define PCI_DEVICE_ID_MTL 0x7d1d
@@ -88,6 +89,7 @@
 	bool d3hot_after_power_off;
 	bool interrupt_clear_with_0;
 	bool disable_clock_relinquish;
+	bool disable_d0i3_msg;
 };
 
 struct ivpu_hw_info;
@@ -115,8 +117,11 @@
 	struct xarray context_xa;
 	struct xa_limit context_xa_limit;
 
+	struct mutex bo_list_lock; /* Protects bo_list */
+	struct list_head bo_list;
+
 	struct xarray submitted_jobs_xa;
-	struct task_struct *job_done_thread;
+	struct ivpu_ipc_consumer job_done_consumer;
 
 	atomic64_t unique_id_counter;
 
@@ -126,6 +131,7 @@
 		int tdr;
 		int reschedule_suspend;
 		int autosuspend;
+		int d0i3_entry_msg;
 	} timeout;
 };
 
@@ -148,9 +154,11 @@
 extern u8 ivpu_pll_max_ratio;
 extern bool ivpu_disable_mmu_cont_pages;
 
-#define IVPU_TEST_MODE_DISABLED 0
-#define IVPU_TEST_MODE_FW_TEST 1
-#define IVPU_TEST_MODE_NULL_HW 2
+#define IVPU_TEST_MODE_FW_TEST BIT(0)
+#define IVPU_TEST_MODE_NULL_HW BIT(1)
+#define IVPU_TEST_MODE_NULL_SUBMISSION BIT(2)
+#define IVPU_TEST_MODE_D0I3_MSG_DISABLE BIT(4)
+#define IVPU_TEST_MODE_D0I3_MSG_ENABLE BIT(5)
 extern int ivpu_test_mode;
 
 struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv);
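The last hunk turns `ivpu_test_mode` from an enumerated value into a bit mask, which is why callers elsewhere in this merge switch from `==` to `&`. The standalone sketch below shows the difference. The flag values mirror the header above, but `BIT()` is redefined locally and the two `fw_test_enabled_*` helpers are hypothetical, written only to contrast the old and new checks.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1u << (n))

/* Flag values as introduced by the hunk above. */
#define IVPU_TEST_MODE_FW_TEST          BIT(0)
#define IVPU_TEST_MODE_NULL_HW          BIT(1)
#define IVPU_TEST_MODE_NULL_SUBMISSION  BIT(2)
#define IVPU_TEST_MODE_D0I3_MSG_DISABLE BIT(4)
#define IVPU_TEST_MODE_D0I3_MSG_ENABLE  BIT(5)

/* Old-style check: true only when FW test is the sole mode selected. */
static bool fw_test_enabled_eq(unsigned int test_mode)
{
	return test_mode == IVPU_TEST_MODE_FW_TEST;
}

/* Mask-style check: true whenever the FW test bit is set. */
static bool fw_test_enabled_mask(unsigned int test_mode)
{
	return (test_mode & IVPU_TEST_MODE_FW_TEST) != 0;
}
```

With `test_mode` set to 5 (FW test plus null submission) the equality check misses the FW test request, while the mask check still sees it. Combining modes in this way is only meaningful once the values are distinct bits.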

+74 -5

drivers/accel/ivpu/ivpu_fw.c

@@ -33,11 +33,16 @@
 
 #define ADDR_TO_L2_CACHE_CFG(addr) ((addr) >> 31)
 
-#define IVPU_FW_CHECK_API(vdev, fw_hdr, name, min_major) \
+/* Check if FW API is compatible with the driver */
+#define IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, name, min_major) \
 	ivpu_fw_check_api(vdev, fw_hdr, #name, \
 			  VPU_##name##_API_VER_INDEX, \
 			  VPU_##name##_API_VER_MAJOR, \
 			  VPU_##name##_API_VER_MINOR, min_major)
+
+/* Check if API version is lower that the given version */
+#define IVPU_FW_CHECK_API_VER_LT(vdev, fw_hdr, name, major, minor) \
+	ivpu_fw_check_api_ver_lt(vdev, fw_hdr, #name, VPU_##name##_API_VER_INDEX, major, minor)
 
 static char *ivpu_firmware;
 module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
@@ -105,6 +110,19 @@
 	return 0;
 }
 
+static bool
+ivpu_fw_check_api_ver_lt(struct ivpu_device *vdev, const struct vpu_firmware_header *fw_hdr,
+			 const char *str, int index, u16 major, u16 minor)
+{
+	u16 fw_major = (u16)(fw_hdr->api_version[index] >> 16);
+	u16 fw_minor = (u16)(fw_hdr->api_version[index]);
+
+	if (fw_major < major || (fw_major == major && fw_minor < minor))
+		return true;
+
+	return false;
+}
+
 static int ivpu_fw_parse(struct ivpu_device *vdev)
 {
 	struct ivpu_fw_info *fw = vdev->fw;
@@ -164,9 +182,9 @@
 	ivpu_info(vdev, "Firmware: %s, version: %s", fw->name,
 		  (const char *)fw_hdr + VPU_FW_HEADER_SIZE);
 
-	if (IVPU_FW_CHECK_API(vdev, fw_hdr, BOOT, 3))
+	if (IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, BOOT, 3))
 		return -EINVAL;
-	if (IVPU_FW_CHECK_API(vdev, fw_hdr, JSM, 3))
+	if (IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, JSM, 3))
 		return -EINVAL;
 
 	fw->runtime_addr = runtime_addr;
@@ -182,6 +200,8 @@
 	fw->trace_destination_mask = VPU_TRACE_DESTINATION_VERBOSE_TRACING;
 	fw->trace_hw_component_mask = -1;
 
+	fw->dvfs_mode = 0;
+
 	ivpu_dbg(vdev, FW_BOOT, "Size: file %lu image %u runtime %u shavenn %u\n",
 		 fw->file->size, fw->image_size, fw->runtime_size, fw->shave_nn_size);
 	ivpu_dbg(vdev, FW_BOOT, "Address: runtime 0x%llx, load 0x%llx, entry point 0x%llx\n",
@@ -193,6 +213,24 @@
 static void ivpu_fw_release(struct ivpu_device *vdev)
 {
 	release_firmware(vdev->fw->file);
+}
+
+/* Initialize workarounds that depend on FW version */
+static void
+ivpu_fw_init_wa(struct ivpu_device *vdev)
+{
+	const struct vpu_firmware_header *fw_hdr = (const void *)vdev->fw->file->data;
+
+	if (IVPU_FW_CHECK_API_VER_LT(vdev, fw_hdr, BOOT, 3, 17) ||
+	    (ivpu_hw_gen(vdev) > IVPU_HW_37XX) ||
+	    (ivpu_test_mode & IVPU_TEST_MODE_D0I3_MSG_DISABLE))
+		vdev->wa.disable_d0i3_msg = true;
+
+	/* Force enable the feature for testing purposes */
+	if (ivpu_test_mode & IVPU_TEST_MODE_D0I3_MSG_ENABLE)
+		vdev->wa.disable_d0i3_msg = false;
+
+	IVPU_PRINT_WA(disable_d0i3_msg);
 }
 
 static int ivpu_fw_update_global_range(struct ivpu_device *vdev)
@@ -248,7 +286,7 @@
 
 	if (fw->shave_nn_size) {
 		fw->mem_shave_nn = ivpu_bo_alloc_internal(vdev, vdev->hw->ranges.shave.start,
-							  fw->shave_nn_size, DRM_IVPU_BO_UNCACHED);
+							  fw->shave_nn_size, DRM_IVPU_BO_WC);
 		if (!fw->mem_shave_nn) {
 			ivpu_err(vdev, "Failed to allocate shavenn buffer\n");
 			ret = -ENOMEM;
@@ -296,6 +334,8 @@
 	ret = ivpu_fw_parse(vdev);
 	if (ret)
 		goto err_fw_release;
+
+	ivpu_fw_init_wa(vdev);
 
 	ret = ivpu_fw_mem_init(vdev);
 	if (ret)
@@ -422,14 +462,31 @@
 		 boot_params->punit_telemetry_sram_size);
 	ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_telemetry_enable = 0x%x\n",
 		 boot_params->vpu_telemetry_enable);
+	ivpu_dbg(vdev, FW_BOOT, "boot_params.dvfs_mode = %u\n",
+		 boot_params->dvfs_mode);
+	ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_delayed_entry = %d\n",
+		 boot_params->d0i3_delayed_entry);
+	ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_residency_time_us = %lld\n",
+		 boot_params->d0i3_residency_time_us);
+	ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_entry_vpu_ts = %llu\n",
+		 boot_params->d0i3_entry_vpu_ts);
 }
 
 void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, struct vpu_boot_params *boot_params)
 {
 	struct ivpu_bo *ipc_mem_rx = vdev->ipc->mem_rx;
 
-	/* In case of warm boot we only have to reset the entrypoint addr */
+	/* In case of warm boot only update variable params */
 	if (!ivpu_fw_is_cold_boot(vdev)) {
+		boot_params->d0i3_residency_time_us =
+			ktime_us_delta(ktime_get_boottime(), vdev->hw->d0i3_entry_host_ts);
+		boot_params->d0i3_entry_vpu_ts = vdev->hw->d0i3_entry_vpu_ts;
+
+		ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_residency_time_us = %lld\n",
+			 boot_params->d0i3_residency_time_us);
+		ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_entry_vpu_ts = %llu\n",
+			 boot_params->d0i3_entry_vpu_ts);
+
 		boot_params->save_restore_ret_address = 0;
 		vdev->pm->is_warmboot = true;
 		wmb(); /* Flush WC buffers after writing save_restore_ret_address */
@@ -441,6 +498,13 @@
 	boot_params->magic = VPU_BOOT_PARAMS_MAGIC;
 	boot_params->vpu_id = to_pci_dev(vdev->drm.dev)->bus->number;
 	boot_params->frequency = ivpu_hw_reg_pll_freq_get(vdev);
+
+	/*
+	 * This param is a debug firmware feature. It switches default clock
+	 * to higher resolution one for fine-grained and more accurate firmware
+	 * task profiling.
+	 */
+	boot_params->perf_clk_frequency = ivpu_hw_profiling_freq_get(vdev);
 
 	/*
 	 * Uncached region of VPU address space, covers IPC buffers, job queues
@@ -493,6 +557,11 @@
 	boot_params->punit_telemetry_sram_base = ivpu_hw_reg_telemetry_offset_get(vdev);
 	boot_params->punit_telemetry_sram_size = ivpu_hw_reg_telemetry_size_get(vdev);
 	boot_params->vpu_telemetry_enable = ivpu_hw_reg_telemetry_enable_get(vdev);
+	boot_params->dvfs_mode = vdev->fw->dvfs_mode;
+	if (!IVPU_WA(disable_d0i3_msg))
+		boot_params->d0i3_delayed_entry = 1;
+	boot_params->d0i3_residency_time_us = 0;
+	boot_params->d0i3_entry_vpu_ts = 0;
 
 	wmb(); /* Flush WC buffers after writing bootparams */
 
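The version gate added in this file packs the API major number into the upper 16 bits and the minor into the lower 16 bits of one 32-bit word, then compares lexicographically. The sketch below reproduces that comparison and the workaround decision from `ivpu_fw_init_wa()` in plain userspace C so the edge cases can be checked in isolation. It is an illustration only: `api_ver_lt` and `d0i3_msg_disabled` are hypothetical names, the hardware generation test is reduced to a boolean, and the two flag values are copied from the `ivpu_drv.h` change in this merge.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the ivpu_drv.h hunk in this merge. */
#define TEST_MODE_D0I3_MSG_DISABLE (1u << 4)
#define TEST_MODE_D0I3_MSG_ENABLE  (1u << 5)

/* True when the packed (major << 16 | minor) version is below major.minor. */
static bool api_ver_lt(uint32_t packed, uint16_t major, uint16_t minor)
{
	uint16_t fw_major = (uint16_t)(packed >> 16);
	uint16_t fw_minor = (uint16_t)packed;

	return fw_major < major || (fw_major == major && fw_minor < minor);
}

/*
 * Mirror of the decision in ivpu_fw_init_wa(): the D0i3 entry message is
 * disabled for BOOT API below 3.17, for hardware newer than 37XX, or on
 * request; the ENABLE test flag overrides all of these.
 */
static bool d0i3_msg_disabled(uint32_t boot_api, bool hw_newer_than_37xx,
			      unsigned int test_mode)
{
	bool disabled = api_ver_lt(boot_api, 3, 17) || hw_newer_than_37xx ||
			(test_mode & TEST_MODE_D0I3_MSG_DISABLE) != 0;

	if (test_mode & TEST_MODE_D0I3_MSG_ENABLE)
		disabled = false;
	return disabled;
}
```

Note the ordering: because the ENABLE flag is applied last, setting both test flags at once leaves the message enabled.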

+1

drivers/accel/ivpu/ivpu_fw.h

@@ -27,6 +27,7 @@
 	u32 trace_level;
 	u32 trace_destination_mask;
 	u64 trace_hw_component_mask;
+	u32 dvfs_mode;
 };
 
 int ivpu_fw_init(struct ivpu_device *vdev);

+206 -486

drivers/accel/ivpu/ivpu_gem.c

··· 20 20 #include "ivpu_mmu.h" 21 21 #include "ivpu_mmu_context.h" 22 22 23 - MODULE_IMPORT_NS(DMA_BUF); 24 - 25 23 static const struct drm_gem_object_funcs ivpu_gem_funcs; 26 24 27 - static struct lock_class_key prime_bo_lock_class_key; 28 - 29 - static int __must_check prime_alloc_pages_locked(struct ivpu_bo *bo) 25 + static inline void ivpu_dbg_bo(struct ivpu_device *vdev, struct ivpu_bo *bo, const char *action) 30 26 { 31 - /* Pages are managed by the underlying dma-buf */ 32 - return 0; 33 - } 34 - 35 - static void prime_free_pages_locked(struct ivpu_bo *bo) 36 - { 37 - /* Pages are managed by the underlying dma-buf */ 38 - } 39 - 40 - static int prime_map_pages_locked(struct ivpu_bo *bo) 41 - { 42 - struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 43 - struct sg_table *sgt; 44 - 45 - sgt = dma_buf_map_attachment_unlocked(bo->base.import_attach, DMA_BIDIRECTIONAL); 46 - if (IS_ERR(sgt)) { 47 - ivpu_err(vdev, "Failed to map attachment: %ld\n", PTR_ERR(sgt)); 48 - return PTR_ERR(sgt); 49 - } 50 - 51 - bo->sgt = sgt; 52 - return 0; 53 - } 54 - 55 - static void prime_unmap_pages_locked(struct ivpu_bo *bo) 56 - { 57 - dma_buf_unmap_attachment_unlocked(bo->base.import_attach, bo->sgt, DMA_BIDIRECTIONAL); 58 - bo->sgt = NULL; 59 - } 60 - 61 - static const struct ivpu_bo_ops prime_ops = { 62 - .type = IVPU_BO_TYPE_PRIME, 63 - .name = "prime", 64 - .alloc_pages = prime_alloc_pages_locked, 65 - .free_pages = prime_free_pages_locked, 66 - .map_pages = prime_map_pages_locked, 67 - .unmap_pages = prime_unmap_pages_locked, 68 - }; 69 - 70 - static int __must_check shmem_alloc_pages_locked(struct ivpu_bo *bo) 71 - { 72 - int npages = ivpu_bo_size(bo) >> PAGE_SHIFT; 73 - struct page **pages; 74 - 75 - pages = drm_gem_get_pages(&bo->base); 76 - if (IS_ERR(pages)) 77 - return PTR_ERR(pages); 78 - 79 - if (bo->flags & DRM_IVPU_BO_WC) 80 - set_pages_array_wc(pages, npages); 81 - else if (bo->flags & DRM_IVPU_BO_UNCACHED) 82 - set_pages_array_uc(pages, npages); 83 - 84 - bo->pages 
= pages; 85 - return 0; 86 - } 87 - 88 - static void shmem_free_pages_locked(struct ivpu_bo *bo) 89 - { 90 - if (ivpu_bo_cache_mode(bo) != DRM_IVPU_BO_CACHED) 91 - set_pages_array_wb(bo->pages, ivpu_bo_size(bo) >> PAGE_SHIFT); 92 - 93 - drm_gem_put_pages(&bo->base, bo->pages, true, false); 94 - bo->pages = NULL; 95 - } 96 - 97 - static int ivpu_bo_map_pages_locked(struct ivpu_bo *bo) 98 - { 99 - int npages = ivpu_bo_size(bo) >> PAGE_SHIFT; 100 - struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 101 - struct sg_table *sgt; 102 - int ret; 103 - 104 - sgt = drm_prime_pages_to_sg(&vdev->drm, bo->pages, npages); 105 - if (IS_ERR(sgt)) { 106 - ivpu_err(vdev, "Failed to allocate sgtable\n"); 107 - return PTR_ERR(sgt); 108 - } 109 - 110 - ret = dma_map_sgtable(vdev->drm.dev, sgt, DMA_BIDIRECTIONAL, 0); 111 - if (ret) { 112 - ivpu_err(vdev, "Failed to map BO in IOMMU: %d\n", ret); 113 - goto err_free_sgt; 114 - } 115 - 116 - bo->sgt = sgt; 117 - return 0; 118 - 119 - err_free_sgt: 120 - kfree(sgt); 121 - return ret; 122 - } 123 - 124 - static void ivpu_bo_unmap_pages_locked(struct ivpu_bo *bo) 125 - { 126 - struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 127 - 128 - dma_unmap_sgtable(vdev->drm.dev, bo->sgt, DMA_BIDIRECTIONAL, 0); 129 - sg_free_table(bo->sgt); 130 - kfree(bo->sgt); 131 - bo->sgt = NULL; 132 - } 133 - 134 - static const struct ivpu_bo_ops shmem_ops = { 135 - .type = IVPU_BO_TYPE_SHMEM, 136 - .name = "shmem", 137 - .alloc_pages = shmem_alloc_pages_locked, 138 - .free_pages = shmem_free_pages_locked, 139 - .map_pages = ivpu_bo_map_pages_locked, 140 - .unmap_pages = ivpu_bo_unmap_pages_locked, 141 - }; 142 - 143 - static int __must_check internal_alloc_pages_locked(struct ivpu_bo *bo) 144 - { 145 - unsigned int i, npages = ivpu_bo_size(bo) >> PAGE_SHIFT; 146 - struct page **pages; 147 - int ret; 148 - 149 - pages = kvmalloc_array(npages, sizeof(*bo->pages), GFP_KERNEL); 150 - if (!pages) 151 - return -ENOMEM; 152 - 153 - for (i = 0; i < npages; i++) { 154 - 
pages[i] = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO); 155 - if (!pages[i]) { 156 - ret = -ENOMEM; 157 - goto err_free_pages; 158 - } 159 - cond_resched(); 160 - } 161 - 162 - bo->pages = pages; 163 - return 0; 164 - 165 - err_free_pages: 166 - while (i--) 167 - put_page(pages[i]); 168 - kvfree(pages); 169 - return ret; 170 - } 171 - 172 - static void internal_free_pages_locked(struct ivpu_bo *bo) 173 - { 174 - unsigned int i, npages = ivpu_bo_size(bo) >> PAGE_SHIFT; 175 - 176 - if (ivpu_bo_cache_mode(bo) != DRM_IVPU_BO_CACHED) 177 - set_pages_array_wb(bo->pages, ivpu_bo_size(bo) >> PAGE_SHIFT); 178 - 179 - for (i = 0; i < npages; i++) 180 - put_page(bo->pages[i]); 181 - 182 - kvfree(bo->pages); 183 - bo->pages = NULL; 184 - } 185 - 186 - static const struct ivpu_bo_ops internal_ops = { 187 - .type = IVPU_BO_TYPE_INTERNAL, 188 - .name = "internal", 189 - .alloc_pages = internal_alloc_pages_locked, 190 - .free_pages = internal_free_pages_locked, 191 - .map_pages = ivpu_bo_map_pages_locked, 192 - .unmap_pages = ivpu_bo_unmap_pages_locked, 193 - }; 194 - 195 - static int __must_check ivpu_bo_alloc_and_map_pages_locked(struct ivpu_bo *bo) 196 - { 197 - struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 198 - int ret; 199 - 200 - lockdep_assert_held(&bo->lock); 201 - drm_WARN_ON(&vdev->drm, bo->sgt); 202 - 203 - ret = bo->ops->alloc_pages(bo); 204 - if (ret) { 205 - ivpu_err(vdev, "Failed to allocate pages for BO: %d", ret); 206 - return ret; 207 - } 208 - 209 - ret = bo->ops->map_pages(bo); 210 - if (ret) { 211 - ivpu_err(vdev, "Failed to map pages for BO: %d", ret); 212 - goto err_free_pages; 213 - } 214 - return ret; 215 - 216 - err_free_pages: 217 - bo->ops->free_pages(bo); 218 - return ret; 219 - } 220 - 221 - static void ivpu_bo_unmap_and_free_pages(struct ivpu_bo *bo) 222 - { 223 - mutex_lock(&bo->lock); 224 - 225 - WARN_ON(!bo->sgt); 226 - bo->ops->unmap_pages(bo); 227 - WARN_ON(bo->sgt); 228 - bo->ops->free_pages(bo); 229 - WARN_ON(bo->pages); 230 - 231 - 
mutex_unlock(&bo->lock); 27 + if (bo->ctx) 28 + ivpu_dbg(vdev, BO, "%6s: size %zu has_pages %d dma_mapped %d handle %u ctx %d vpu_addr 0x%llx mmu_mapped %d\n", 29 + action, ivpu_bo_size(bo), (bool)bo->base.pages, (bool)bo->base.sgt, 30 + bo->handle, bo->ctx->id, bo->vpu_addr, bo->mmu_mapped); 31 + else 32 + ivpu_dbg(vdev, BO, "%6s: size %zu has_pages %d dma_mapped %d handle %u (not added to context)\n", 33 + action, ivpu_bo_size(bo), (bool)bo->base.pages, (bool)bo->base.sgt, 34 + bo->handle); 232 35 } 233 36 234 37 /* ··· 48 245 49 246 mutex_lock(&bo->lock); 50 247 51 - if (!bo->vpu_addr) { 52 - ivpu_err(vdev, "vpu_addr not set for BO ctx_id: %d handle: %d\n", 53 - bo->ctx->id, bo->handle); 248 + ivpu_dbg_bo(vdev, bo, "pin"); 249 + 250 + if (!bo->ctx) { 251 + ivpu_err(vdev, "vpu_addr not allocated for BO %d\n", bo->handle); 54 252 ret = -EINVAL; 55 253 goto unlock; 56 254 } 57 255 58 - if (!bo->sgt) { 59 - ret = ivpu_bo_alloc_and_map_pages_locked(bo); 60 - if (ret) 61 - goto unlock; 62 - } 63 - 64 256 if (!bo->mmu_mapped) { 65 - ret = ivpu_mmu_context_map_sgt(vdev, bo->ctx, bo->vpu_addr, bo->sgt, 257 + struct sg_table *sgt = drm_gem_shmem_get_pages_sgt(&bo->base); 258 + 259 + if (IS_ERR(sgt)) { 260 + ret = PTR_ERR(sgt); 261 + ivpu_err(vdev, "Failed to map BO in IOMMU: %d\n", ret); 262 + goto unlock; 263 + } 264 + 265 + ret = ivpu_mmu_context_map_sgt(vdev, bo->ctx, bo->vpu_addr, sgt, 66 266 ivpu_bo_is_snooped(bo)); 67 267 if (ret) { 68 268 ivpu_err(vdev, "Failed to map BO in MMU: %d\n", ret); ··· 87 281 struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 88 282 int ret; 89 283 90 - if (!range) { 91 - if (bo->flags & DRM_IVPU_BO_SHAVE_MEM) 92 - range = &vdev->hw->ranges.shave; 93 - else if (bo->flags & DRM_IVPU_BO_DMA_MEM) 94 - range = &vdev->hw->ranges.dma; 95 - else 96 - range = &vdev->hw->ranges.user; 97 - } 284 + mutex_lock(&bo->lock); 98 285 99 - mutex_lock(&ctx->lock); 100 - ret = ivpu_mmu_context_insert_node_locked(ctx, range, ivpu_bo_size(bo), &bo->mm_node); 286 
+ ret = ivpu_mmu_context_insert_node(ctx, range, ivpu_bo_size(bo), &bo->mm_node); 101 287 if (!ret) { 102 288 bo->ctx = ctx; 103 289 bo->vpu_addr = bo->mm_node.start; 104 - list_add_tail(&bo->ctx_node, &ctx->bo_list); 290 + } else { 291 + ivpu_err(vdev, "Failed to add BO to context %u: %d\n", ctx->id, ret); 105 292 } 106 - mutex_unlock(&ctx->lock); 293 + 294 + ivpu_dbg_bo(vdev, bo, "alloc"); 295 + 296 + mutex_unlock(&bo->lock); 107 297 108 298 return ret; 109 299 } 110 300 111 - static void ivpu_bo_free_vpu_addr(struct ivpu_bo *bo) 301 + static void ivpu_bo_unbind_locked(struct ivpu_bo *bo) 112 302 { 113 303 struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 114 - struct ivpu_mmu_context *ctx = bo->ctx; 115 304 116 - ivpu_dbg(vdev, BO, "remove from ctx: ctx %d vpu_addr 0x%llx allocated %d mmu_mapped %d\n", 117 - ctx->id, bo->vpu_addr, (bool)bo->sgt, bo->mmu_mapped); 305 + lockdep_assert_held(&bo->lock); 118 306 119 - mutex_lock(&bo->lock); 307 + ivpu_dbg_bo(vdev, bo, "unbind"); 308 + 309 + /* TODO: dma_unmap */ 120 310 121 311 if (bo->mmu_mapped) { 122 - drm_WARN_ON(&vdev->drm, !bo->sgt); 123 - ivpu_mmu_context_unmap_sgt(vdev, ctx, bo->vpu_addr, bo->sgt); 312 + drm_WARN_ON(&vdev->drm, !bo->ctx); 313 + drm_WARN_ON(&vdev->drm, !bo->vpu_addr); 314 + drm_WARN_ON(&vdev->drm, !bo->base.sgt); 315 + ivpu_mmu_context_unmap_sgt(vdev, bo->ctx, bo->vpu_addr, bo->base.sgt); 124 316 bo->mmu_mapped = false; 125 317 } 126 318 127 - mutex_lock(&ctx->lock); 128 - list_del(&bo->ctx_node); 129 - bo->vpu_addr = 0; 130 - bo->ctx = NULL; 131 - ivpu_mmu_context_remove_node_locked(ctx, &bo->mm_node); 132 - mutex_unlock(&ctx->lock); 319 + if (bo->ctx) { 320 + ivpu_mmu_context_remove_node(bo->ctx, &bo->mm_node); 321 + bo->vpu_addr = 0; 322 + bo->ctx = NULL; 323 + } 324 + } 133 325 326 + static void ivpu_bo_unbind(struct ivpu_bo *bo) 327 + { 328 + mutex_lock(&bo->lock); 329 + ivpu_bo_unbind_locked(bo); 134 330 mutex_unlock(&bo->lock); 135 331 } 136 332 137 - void 
ivpu_bo_remove_all_bos_from_context(struct ivpu_mmu_context *ctx) 333 + void ivpu_bo_remove_all_bos_from_context(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx) 138 334 { 139 - struct ivpu_bo *bo, *tmp; 335 + struct ivpu_bo *bo; 140 336 141 - list_for_each_entry_safe(bo, tmp, &ctx->bo_list, ctx_node) 142 - ivpu_bo_free_vpu_addr(bo); 337 + if (drm_WARN_ON(&vdev->drm, !ctx)) 338 + return; 339 + 340 + mutex_lock(&vdev->bo_list_lock); 341 + list_for_each_entry(bo, &vdev->bo_list, bo_list_node) { 342 + mutex_lock(&bo->lock); 343 + if (bo->ctx == ctx) 344 + ivpu_bo_unbind_locked(bo); 345 + mutex_unlock(&bo->lock); 346 + } 347 + mutex_unlock(&vdev->bo_list_lock); 348 + } 349 + 350 + struct drm_gem_object *ivpu_gem_create_object(struct drm_device *dev, size_t size) 351 + { 352 + struct ivpu_bo *bo; 353 + 354 + if (size == 0 || !PAGE_ALIGNED(size)) 355 + return ERR_PTR(-EINVAL); 356 + 357 + bo = kzalloc(sizeof(*bo), GFP_KERNEL); 358 + if (!bo) 359 + return ERR_PTR(-ENOMEM); 360 + 361 + bo->base.base.funcs = &ivpu_gem_funcs; 362 + bo->base.pages_mark_dirty_on_put = true; /* VPU can dirty a BO anytime */ 363 + 364 + INIT_LIST_HEAD(&bo->bo_list_node); 365 + mutex_init(&bo->lock); 366 + 367 + return &bo->base.base; 143 368 } 144 369 145 370 static struct ivpu_bo * 146 - ivpu_bo_alloc(struct ivpu_device *vdev, struct ivpu_mmu_context *mmu_context, 147 - u64 size, u32 flags, const struct ivpu_bo_ops *ops, 148 - const struct ivpu_addr_range *range, u64 user_ptr) 371 + ivpu_bo_create(struct ivpu_device *vdev, u64 size, u32 flags) 149 372 { 373 + struct drm_gem_shmem_object *shmem; 150 374 struct ivpu_bo *bo; 151 - int ret = 0; 152 - 153 - if (drm_WARN_ON(&vdev->drm, size == 0 || !PAGE_ALIGNED(size))) 154 - return ERR_PTR(-EINVAL); 155 375 156 376 switch (flags & DRM_IVPU_BO_CACHE_MASK) { 157 377 case DRM_IVPU_BO_CACHED: 158 - case DRM_IVPU_BO_UNCACHED: 159 378 case DRM_IVPU_BO_WC: 160 379 break; 161 380 default: 162 381 return ERR_PTR(-EINVAL); 163 382 } 164 383 165 - bo = 
kzalloc(sizeof(*bo), GFP_KERNEL); 166 - if (!bo) 167 - return ERR_PTR(-ENOMEM); 384 + shmem = drm_gem_shmem_create(&vdev->drm, size); 385 + if (IS_ERR(shmem)) 386 + return ERR_CAST(shmem); 168 387 169 - mutex_init(&bo->lock); 170 - bo->base.funcs = &ivpu_gem_funcs; 388 + bo = to_ivpu_bo(&shmem->base); 389 + bo->base.map_wc = flags & DRM_IVPU_BO_WC; 171 390 bo->flags = flags; 172 - bo->ops = ops; 173 - bo->user_ptr = user_ptr; 174 391 175 - if (ops->type == IVPU_BO_TYPE_SHMEM) 176 - ret = drm_gem_object_init(&vdev->drm, &bo->base, size); 177 - else 178 - drm_gem_private_object_init(&vdev->drm, &bo->base, size); 392 + mutex_lock(&vdev->bo_list_lock); 393 + list_add_tail(&bo->bo_list_node, &vdev->bo_list); 394 + mutex_unlock(&vdev->bo_list_lock); 179 395 180 - if (ret) { 181 - ivpu_err(vdev, "Failed to initialize drm object\n"); 182 - goto err_free; 183 - } 184 - 185 - if (flags & DRM_IVPU_BO_MAPPABLE) { 186 - ret = drm_gem_create_mmap_offset(&bo->base); 187 - if (ret) { 188 - ivpu_err(vdev, "Failed to allocate mmap offset\n"); 189 - goto err_release; 190 - } 191 - } 192 - 193 - if (mmu_context) { 194 - ret = ivpu_bo_alloc_vpu_addr(bo, mmu_context, range); 195 - if (ret) { 196 - ivpu_err(vdev, "Failed to add BO to context: %d\n", ret); 197 - goto err_release; 198 - } 199 - } 396 + ivpu_dbg(vdev, BO, "create: vpu_addr 0x%llx size %zu flags 0x%x\n", 397 + bo->vpu_addr, bo->base.base.size, flags); 200 398 201 399 return bo; 400 + } 202 401 203 - err_release: 204 - drm_gem_object_release(&bo->base); 205 - err_free: 206 - kfree(bo); 207 - return ERR_PTR(ret); 402 + static int ivpu_bo_open(struct drm_gem_object *obj, struct drm_file *file) 403 + { 404 + struct ivpu_file_priv *file_priv = file->driver_priv; 405 + struct ivpu_device *vdev = file_priv->vdev; 406 + struct ivpu_bo *bo = to_ivpu_bo(obj); 407 + struct ivpu_addr_range *range; 408 + 409 + if (bo->flags & DRM_IVPU_BO_SHAVE_MEM) 410 + range = &vdev->hw->ranges.shave; 411 + else if (bo->flags & DRM_IVPU_BO_DMA_MEM) 412 
+ range = &vdev->hw->ranges.dma; 413 + else 414 + range = &vdev->hw->ranges.user; 415 + 416 + return ivpu_bo_alloc_vpu_addr(bo, &file_priv->ctx, range); 208 417 } 209 418 210 419 static void ivpu_bo_free(struct drm_gem_object *obj) 211 420 { 421 + struct ivpu_device *vdev = to_ivpu_device(obj->dev); 212 422 struct ivpu_bo *bo = to_ivpu_bo(obj); 213 - struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 214 423 215 - if (bo->ctx) 216 - ivpu_dbg(vdev, BO, "free: ctx %d vpu_addr 0x%llx allocated %d mmu_mapped %d\n", 217 - bo->ctx->id, bo->vpu_addr, (bool)bo->sgt, bo->mmu_mapped); 218 - else 219 - ivpu_dbg(vdev, BO, "free: ctx (released) allocated %d mmu_mapped %d\n", 220 - (bool)bo->sgt, bo->mmu_mapped); 424 + mutex_lock(&vdev->bo_list_lock); 425 + list_del(&bo->bo_list_node); 426 + mutex_unlock(&vdev->bo_list_lock); 221 427 222 428 drm_WARN_ON(&vdev->drm, !dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_READ)); 223 429 224 - vunmap(bo->kvaddr); 430 + ivpu_dbg_bo(vdev, bo, "free"); 225 431 226 - if (bo->ctx) 227 - ivpu_bo_free_vpu_addr(bo); 228 - 229 - if (bo->sgt) 230 - ivpu_bo_unmap_and_free_pages(bo); 231 - 232 - if (bo->base.import_attach) 233 - drm_prime_gem_destroy(&bo->base, bo->sgt); 234 - 235 - drm_gem_object_release(&bo->base); 236 - 432 + ivpu_bo_unbind(bo); 237 433 mutex_destroy(&bo->lock); 238 - kfree(bo); 434 + 435 + drm_WARN_ON(obj->dev, bo->base.pages_use_count > 1); 436 + drm_gem_shmem_free(&bo->base); 239 437 } 240 438 241 - static int ivpu_bo_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) 242 - { 243 - struct ivpu_bo *bo = to_ivpu_bo(obj); 244 - struct ivpu_device *vdev = ivpu_bo_to_vdev(bo); 245 - 246 - ivpu_dbg(vdev, BO, "mmap: ctx %u handle %u vpu_addr 0x%llx size %zu type %s", 247 - bo->ctx->id, bo->handle, bo->vpu_addr, ivpu_bo_size(bo), bo->ops->name); 248 - 249 - if (obj->import_attach) { 250 - /* Drop the reference drm_gem_mmap_obj() acquired.*/ 251 - drm_gem_object_put(obj); 252 - vma->vm_private_data = NULL; 253 - return 
dma_buf_mmap(obj->dma_buf, vma, 0); 254 - } 255 - 256 - vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND); 257 - vma->vm_page_prot = ivpu_bo_pgprot(bo, vm_get_page_prot(vma->vm_flags)); 258 - 259 - return 0; 260 - } 261 - 262 - static struct sg_table *ivpu_bo_get_sg_table(struct drm_gem_object *obj) 263 - { 264 - struct ivpu_bo *bo = to_ivpu_bo(obj); 265 - loff_t npages = obj->size >> PAGE_SHIFT; 266 - int ret = 0; 267 - 268 - mutex_lock(&bo->lock); 269 - 270 - if (!bo->sgt) 271 - ret = ivpu_bo_alloc_and_map_pages_locked(bo); 272 - 273 - mutex_unlock(&bo->lock); 274 - 275 - if (ret) 276 - return ERR_PTR(ret); 277 - 278 - return drm_prime_pages_to_sg(obj->dev, bo->pages, npages); 279 - } 280 - 281 - static vm_fault_t ivpu_vm_fault(struct vm_fault *vmf) 282 - { 283 - struct vm_area_struct *vma = vmf->vma; 284 - struct drm_gem_object *obj = vma->vm_private_data; 285 - struct ivpu_bo *bo = to_ivpu_bo(obj); 286 - loff_t npages = obj->size >> PAGE_SHIFT; 287 - pgoff_t page_offset; 288 - struct page *page; 289 - vm_fault_t ret; 290 - int err; 291 - 292 - mutex_lock(&bo->lock); 293 - 294 - if (!bo->sgt) { 295 - err = ivpu_bo_alloc_and_map_pages_locked(bo); 296 - if (err) { 297 - ret = vmf_error(err); 298 - goto unlock; 299 - } 300 - } 301 - 302 - /* We don't use vmf->pgoff since that has the fake offset */ 303 - page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT; 304 - if (page_offset >= npages) { 305 - ret = VM_FAULT_SIGBUS; 306 - } else { 307 - page = bo->pages[page_offset]; 308 - ret = vmf_insert_pfn(vma, vmf->address, page_to_pfn(page)); 309 - } 310 - 311 - unlock: 312 - mutex_unlock(&bo->lock); 313 - 314 - return ret; 315 - } 316 - 317 - static const struct vm_operations_struct ivpu_vm_ops = { 318 - .fault = ivpu_vm_fault, 319 - .open = drm_gem_vm_open, 320 - .close = drm_gem_vm_close, 439 + static const struct dma_buf_ops ivpu_bo_dmabuf_ops = { 440 + .cache_sgt_mapping = true, 441 + .attach = drm_gem_map_attach, 442 + .detach = drm_gem_map_detach, 443 + 
.map_dma_buf = drm_gem_map_dma_buf, 444 + .unmap_dma_buf = drm_gem_unmap_dma_buf, 445 + .release = drm_gem_dmabuf_release, 446 + .mmap = drm_gem_dmabuf_mmap, 447 + .vmap = drm_gem_dmabuf_vmap, 448 + .vunmap = drm_gem_dmabuf_vunmap, 321 449 }; 450 + 451 + static struct dma_buf *ivpu_bo_export(struct drm_gem_object *obj, int flags) 452 + { 453 + struct drm_device *dev = obj->dev; 454 + struct dma_buf_export_info exp_info = { 455 + .exp_name = KBUILD_MODNAME, 456 + .owner = dev->driver->fops->owner, 457 + .ops = &ivpu_bo_dmabuf_ops, 458 + .size = obj->size, 459 + .flags = flags, 460 + .priv = obj, 461 + .resv = obj->resv, 462 + }; 463 + void *sgt; 464 + 465 + /* 466 + * Make sure that pages are allocated and dma-mapped before exporting the bo. 467 + * DMA-mapping is required if the bo will be imported to the same device. 468 + */ 469 + sgt = drm_gem_shmem_get_pages_sgt(to_drm_gem_shmem_obj(obj)); 470 + if (IS_ERR(sgt)) 471 + return sgt; 472 + 473 + return drm_gem_dmabuf_export(dev, &exp_info); 474 + } 322 475 323 476 static const struct drm_gem_object_funcs ivpu_gem_funcs = { 324 477 .free = ivpu_bo_free, 325 - .mmap = ivpu_bo_mmap, 326 - .vm_ops = &ivpu_vm_ops, 327 - .get_sg_table = ivpu_bo_get_sg_table, 478 + .open = ivpu_bo_open, 479 + .export = ivpu_bo_export, 480 + .print_info = drm_gem_shmem_object_print_info, 481 + .pin = drm_gem_shmem_object_pin, 482 + .unpin = drm_gem_shmem_object_unpin, 483 + .get_sg_table = drm_gem_shmem_object_get_sg_table, 484 + .vmap = drm_gem_shmem_object_vmap, 485 + .vunmap = drm_gem_shmem_object_vunmap, 486 + .mmap = drm_gem_shmem_object_mmap, 487 + .vm_ops = &drm_gem_shmem_vm_ops, 328 488 }; 329 489 330 - int 331 - ivpu_bo_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file) 490 + int ivpu_bo_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file) 332 491 { 333 492 struct ivpu_file_priv *file_priv = file->driver_priv; 334 493 struct ivpu_device *vdev = file_priv->vdev; ··· 308 537 if (size == 0) 309 
538 return -EINVAL; 310 539 311 - bo = ivpu_bo_alloc(vdev, &file_priv->ctx, size, args->flags, &shmem_ops, NULL, 0); 540 + bo = ivpu_bo_create(vdev, size, args->flags); 312 541 if (IS_ERR(bo)) { 313 542 ivpu_err(vdev, "Failed to create BO: %pe (ctx %u size %llu flags 0x%x)", 314 543 bo, file_priv->ctx.id, args->size, args->flags); 315 544 return PTR_ERR(bo); 316 545 } 317 546 318 - ret = drm_gem_handle_create(file, &bo->base, &bo->handle); 547 + ret = drm_gem_handle_create(file, &bo->base.base, &bo->handle); 319 548 if (!ret) { 320 549 args->vpu_addr = bo->vpu_addr; 321 550 args->handle = bo->handle; 322 551 } 323 552 324 - drm_gem_object_put(&bo->base); 325 - 326 - ivpu_dbg(vdev, BO, "alloc shmem: ctx %u vpu_addr 0x%llx size %zu flags 0x%x\n", 327 - file_priv->ctx.id, bo->vpu_addr, ivpu_bo_size(bo), bo->flags); 553 + drm_gem_object_put(&bo->base.base); 328 554 329 555 return ret; 330 556 } ··· 331 563 { 332 564 const struct ivpu_addr_range *range; 333 565 struct ivpu_addr_range fixed_range; 566 + struct iosys_map map; 334 567 struct ivpu_bo *bo; 335 - pgprot_t prot; 336 568 int ret; 337 569 338 570 drm_WARN_ON(&vdev->drm, !PAGE_ALIGNED(vpu_addr)); ··· 346 578 range = &vdev->hw->ranges.global; 347 579 } 348 580 349 - bo = ivpu_bo_alloc(vdev, &vdev->gctx, size, flags, &internal_ops, range, 0); 581 + bo = ivpu_bo_create(vdev, size, flags); 350 582 if (IS_ERR(bo)) { 351 583 ivpu_err(vdev, "Failed to create BO: %pe (vpu_addr 0x%llx size %llu flags 0x%x)", 352 584 bo, vpu_addr, size, flags); 353 585 return NULL; 354 586 } 355 587 588 + ret = ivpu_bo_alloc_vpu_addr(bo, &vdev->gctx, range); 589 + if (ret) 590 + goto err_put; 591 + 356 592 ret = ivpu_bo_pin(bo); 357 593 if (ret) 358 594 goto err_put; 359 595 360 - if (ivpu_bo_cache_mode(bo) != DRM_IVPU_BO_CACHED) 361 - drm_clflush_pages(bo->pages, ivpu_bo_size(bo) >> PAGE_SHIFT); 362 - 363 - if (bo->flags & DRM_IVPU_BO_WC) 364 - set_pages_array_wc(bo->pages, ivpu_bo_size(bo) >> PAGE_SHIFT); 365 - else if (bo->flags & 
DRM_IVPU_BO_UNCACHED) 366 - set_pages_array_uc(bo->pages, ivpu_bo_size(bo) >> PAGE_SHIFT); 367 - 368 - prot = ivpu_bo_pgprot(bo, PAGE_KERNEL); 369 - bo->kvaddr = vmap(bo->pages, ivpu_bo_size(bo) >> PAGE_SHIFT, VM_MAP, prot); 370 - if (!bo->kvaddr) { 371 - ivpu_err(vdev, "Failed to map BO into kernel virtual memory\n"); 596 + ret = drm_gem_shmem_vmap(&bo->base, &map); 597 + if (ret) 372 598 goto err_put; 373 - } 374 - 375 - ivpu_dbg(vdev, BO, "alloc internal: ctx 0 vpu_addr 0x%llx size %zu flags 0x%x\n", 376 - bo->vpu_addr, ivpu_bo_size(bo), flags); 377 599 378 600 return bo; 379 601 380 602 err_put: 381 - drm_gem_object_put(&bo->base); 603 + drm_gem_object_put(&bo->base.base); 382 604 return NULL; 383 605 } 384 606 385 607 void ivpu_bo_free_internal(struct ivpu_bo *bo) 386 608 { 387 - drm_gem_object_put(&bo->base); 388 - } 609 + struct iosys_map map = IOSYS_MAP_INIT_VADDR(bo->base.vaddr); 389 610 390 - struct drm_gem_object *ivpu_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) 391 - { 392 - struct ivpu_device *vdev = to_ivpu_device(dev); 393 - struct dma_buf_attachment *attach; 394 - struct ivpu_bo *bo; 395 - 396 - attach = dma_buf_attach(buf, dev->dev); 397 - if (IS_ERR(attach)) 398 - return ERR_CAST(attach); 399 - 400 - get_dma_buf(buf); 401 - 402 - bo = ivpu_bo_alloc(vdev, NULL, buf->size, DRM_IVPU_BO_MAPPABLE, &prime_ops, NULL, 0); 403 - if (IS_ERR(bo)) { 404 - ivpu_err(vdev, "Failed to import BO: %pe (size %lu)", bo, buf->size); 405 - goto err_detach; 406 - } 407 - 408 - lockdep_set_class(&bo->lock, &prime_bo_lock_class_key); 409 - 410 - bo->base.import_attach = attach; 411 - 412 - return &bo->base; 413 - 414 - err_detach: 415 - dma_buf_detach(buf, attach); 416 - dma_buf_put(buf); 417 - return ERR_CAST(bo); 611 + drm_gem_shmem_vunmap(&bo->base, &map); 612 + drm_gem_object_put(&bo->base.base); 418 613 } 419 614 420 615 int ivpu_bo_info_ioctl(struct drm_device *dev, void *data, struct drm_file *file) 421 616 { 422 - struct ivpu_file_priv *file_priv 
= file->driver_priv; 423 - struct ivpu_device *vdev = to_ivpu_device(dev); 424 617 struct drm_ivpu_bo_info *args = data; 425 618 struct drm_gem_object *obj; 426 619 struct ivpu_bo *bo; ··· 394 665 bo = to_ivpu_bo(obj); 395 666 396 667 mutex_lock(&bo->lock); 397 - 398 - if (!bo->ctx) { 399 - ret = ivpu_bo_alloc_vpu_addr(bo, &file_priv->ctx, NULL); 400 - if (ret) { 401 - ivpu_err(vdev, "Failed to allocate vpu_addr: %d\n", ret); 402 - goto unlock; 403 - } 404 - } 405 - 406 668 args->flags = bo->flags; 407 669 args->mmap_offset = drm_vma_node_offset_addr(&obj->vma_node); 408 670 args->vpu_addr = bo->vpu_addr; 409 671 args->size = obj->size; 410 - unlock: 411 672 mutex_unlock(&bo->lock); 673 + 412 674 drm_gem_object_put(obj); 413 675 return ret; 414 676 } ··· 434 714 { 435 715 unsigned long dma_refcount = 0; 436 716 437 - if (bo->base.dma_buf && bo->base.dma_buf->file) 438 - dma_refcount = atomic_long_read(&bo->base.dma_buf->file->f_count); 717 + mutex_lock(&bo->lock); 439 718 440 - drm_printf(p, "%5u %6d %16llx %10lu %10u %12lu %14s\n", 441 - bo->ctx->id, bo->handle, bo->vpu_addr, ivpu_bo_size(bo), 442 - kref_read(&bo->base.refcount), dma_refcount, bo->ops->name); 719 + if (bo->base.base.dma_buf && bo->base.base.dma_buf->file) 720 + dma_refcount = atomic_long_read(&bo->base.base.dma_buf->file->f_count); 721 + 722 + drm_printf(p, "%-3u %-6d 0x%-12llx %-10lu 0x%-8x %-4u %-8lu", 723 + bo->ctx->id, bo->handle, bo->vpu_addr, bo->base.base.size, 724 + bo->flags, kref_read(&bo->base.base.refcount), dma_refcount); 725 + 726 + if (bo->base.base.import_attach) 727 + drm_printf(p, " imported"); 728 + 729 + if (bo->base.pages) 730 + drm_printf(p, " has_pages"); 731 + 732 + if (bo->mmu_mapped) 733 + drm_printf(p, " mmu_mapped"); 734 + 735 + drm_printf(p, "\n"); 736 + 737 + mutex_unlock(&bo->lock); 443 738 } 444 739 445 740 void ivpu_bo_list(struct drm_device *dev, struct drm_printer *p) 446 741 { 447 742 struct ivpu_device *vdev = to_ivpu_device(dev); 448 - struct ivpu_file_priv 
*file_priv; 449 - unsigned long ctx_id; 450 743 struct ivpu_bo *bo; 451 744 452 - drm_printf(p, "%5s %6s %16s %10s %10s %12s %14s\n", 453 - "ctx", "handle", "vpu_addr", "size", "refcount", "dma_refcount", "type"); 745 + drm_printf(p, "%-3s %-6s %-14s %-10s %-10s %-4s %-8s %s\n", 746 + "ctx", "handle", "vpu_addr", "size", "flags", "refs", "dma_refs", "attribs"); 454 747 455 - mutex_lock(&vdev->gctx.lock); 456 - list_for_each_entry(bo, &vdev->gctx.bo_list, ctx_node) 748 + mutex_lock(&vdev->bo_list_lock); 749 + list_for_each_entry(bo, &vdev->bo_list, bo_list_node) 457 750 ivpu_bo_print_info(bo, p); 458 - mutex_unlock(&vdev->gctx.lock); 459 - 460 - xa_for_each(&vdev->context_xa, ctx_id, file_priv) { 461 - file_priv = ivpu_file_priv_get_by_ctx_id(vdev, ctx_id); 462 - if (!file_priv) 463 - continue; 464 - 465 - mutex_lock(&file_priv->ctx.lock); 466 - list_for_each_entry(bo, &file_priv->ctx.bo_list, ctx_node) 467 - ivpu_bo_print_info(bo, p); 468 - mutex_unlock(&file_priv->ctx.lock); 469 - 470 - ivpu_file_priv_put(&file_priv); 471 - } 751 + mutex_unlock(&vdev->bo_list_lock); 472 752 } 473 753 474 754 void ivpu_bo_list_print(struct drm_device *dev)

+16 -59

drivers/accel/ivpu/ivpu_gem.h

···
 #define __IVPU_GEM_H__

 #include <drm/drm_gem.h>
+#include <drm/drm_gem_shmem_helper.h>
 #include <drm/drm_mm.h>

-struct dma_buf;
-struct ivpu_bo_ops;
 struct ivpu_file_priv;

 struct ivpu_bo {
-	struct drm_gem_object base;
-	const struct ivpu_bo_ops *ops;
-
+	struct drm_gem_shmem_object base;
 	struct ivpu_mmu_context *ctx;
-	struct list_head ctx_node;
+	struct list_head bo_list_node;
 	struct drm_mm_node mm_node;

-	struct mutex lock; /* Protects: pages, sgt, mmu_mapped */
-	struct sg_table *sgt;
-	struct page **pages;
-	bool mmu_mapped;
-
-	void *kvaddr;
+	struct mutex lock; /* Protects: ctx, mmu_mapped, vpu_addr */
 	u64 vpu_addr;
 	u32 handle;
 	u32 flags;
-	uintptr_t user_ptr;
-	u32 job_status;
-};
-
-enum ivpu_bo_type {
-	IVPU_BO_TYPE_SHMEM = 1,
-	IVPU_BO_TYPE_INTERNAL,
-	IVPU_BO_TYPE_PRIME,
-};
-
-struct ivpu_bo_ops {
-	enum ivpu_bo_type type;
-	const char *name;
-	int (*alloc_pages)(struct ivpu_bo *bo);
-	void (*free_pages)(struct ivpu_bo *bo);
-	int (*map_pages)(struct ivpu_bo *bo);
-	void (*unmap_pages)(struct ivpu_bo *bo);
+	u32 job_status; /* Valid only for command buffer */
+	bool mmu_mapped;
 };

 int ivpu_bo_pin(struct ivpu_bo *bo);
-void ivpu_bo_remove_all_bos_from_context(struct ivpu_mmu_context *ctx);
-void ivpu_bo_list(struct drm_device *dev, struct drm_printer *p);
-void ivpu_bo_list_print(struct drm_device *dev);
+void ivpu_bo_remove_all_bos_from_context(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx);

-struct ivpu_bo *
-ivpu_bo_alloc_internal(struct ivpu_device *vdev, u64 vpu_addr, u64 size, u32 flags);
+struct drm_gem_object *ivpu_gem_create_object(struct drm_device *dev, size_t size);
+struct ivpu_bo *ivpu_bo_alloc_internal(struct ivpu_device *vdev, u64 vpu_addr, u64 size, u32 flags);
 void ivpu_bo_free_internal(struct ivpu_bo *bo);
-struct drm_gem_object *ivpu_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf);
-void ivpu_bo_unmap_sgt_and_remove_from_context(struct ivpu_bo *bo);

 int ivpu_bo_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
 int ivpu_bo_info_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
 int ivpu_bo_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file);

+void ivpu_bo_list(struct drm_device *dev, struct drm_printer *p);
+void ivpu_bo_list_print(struct drm_device *dev);
+
 static inline struct ivpu_bo *to_ivpu_bo(struct drm_gem_object *obj)
 {
-	return container_of(obj, struct ivpu_bo, base);
+	return container_of(obj, struct ivpu_bo, base.base);
 }

 static inline void *ivpu_bo_vaddr(struct ivpu_bo *bo)
 {
-	return bo->kvaddr;
+	return bo->base.vaddr;
 }

 static inline size_t ivpu_bo_size(struct ivpu_bo *bo)
 {
-	return bo->base.size;
-}
-
-static inline struct page *ivpu_bo_get_page(struct ivpu_bo *bo, u64 offset)
-{
-	if (offset > ivpu_bo_size(bo) || !bo->pages)
-		return NULL;
-
-	return bo->pages[offset / PAGE_SIZE];
+	return bo->base.base.size;
 }

 static inline u32 ivpu_bo_cache_mode(struct ivpu_bo *bo)
···
 	return ivpu_bo_cache_mode(bo) == DRM_IVPU_BO_CACHED;
 }

-static inline pgprot_t ivpu_bo_pgprot(struct ivpu_bo *bo, pgprot_t prot)
-{
-	if (bo->flags & DRM_IVPU_BO_WC)
-		return pgprot_writecombine(prot);
-
-	if (bo->flags & DRM_IVPU_BO_UNCACHED)
-		return pgprot_noncached(prot);
-
-	return prot;
-}
-
 static inline struct ivpu_device *ivpu_bo_to_vdev(struct ivpu_bo *bo)
 {
-	return to_ivpu_device(bo->base.dev);
+	return to_ivpu_device(bo->base.base.dev);
 }

 static inline void *ivpu_to_cpu_addr(struct ivpu_bo *bo, u32 vpu_addr)
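
Note on the to_ivpu_bo() change above: because struct ivpu_bo now embeds a drm_gem_shmem_object, which in turn embeds the drm_gem_object, the container_of() member path becomes base.base. The following is a minimal userspace sketch of that pattern, not driver code. The structure names (gem_object, gem_shmem_object, demo_bo), the helper to_demo_bo() and the simplified container_of() macro are all stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(); no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily reduced versions of the DRM structures. */
struct gem_object {
	size_t size;
};

struct gem_shmem_object {
	struct gem_object base;
	void *vaddr;
};

/* Hypothetical driver object: embeds (does not point to) the shmem object. */
struct demo_bo {
	struct gem_shmem_object base;
	unsigned int handle;
};

/* Two levels of embedding, so the member path is base.base. */
static inline struct demo_bo *to_demo_bo(struct gem_object *obj)
{
	return container_of(obj, struct demo_bo, base.base);
}
```

Given a struct demo_bo bo, to_demo_bo(&bo.base.base) returns &bo. The conversion only works because both levels are embedded by value; a pointer member at either level would break the offset arithmetic.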

+20

drivers/accel/ivpu/ivpu_hw.h

···
 	int (*power_down)(struct ivpu_device *vdev);
 	int (*reset)(struct ivpu_device *vdev);
 	bool (*is_idle)(struct ivpu_device *vdev);
+	int (*wait_for_idle)(struct ivpu_device *vdev);
 	void (*wdt_disable)(struct ivpu_device *vdev);
 	void (*diagnose_failure)(struct ivpu_device *vdev);
+	u32 (*profiling_freq_get)(struct ivpu_device *vdev);
+	void (*profiling_freq_drive)(struct ivpu_device *vdev, bool enable);
 	u32 (*reg_pll_freq_get)(struct ivpu_device *vdev);
 	u32 (*reg_telemetry_offset_get)(struct ivpu_device *vdev);
 	u32 (*reg_telemetry_size_get)(struct ivpu_device *vdev);
···
 	u32 sku;
 	u16 config;
 	int dma_bits;
+	ktime_t d0i3_entry_host_ts;
+	u64 d0i3_entry_vpu_ts;
 };

 extern const struct ivpu_hw_ops ivpu_hw_37xx_ops;
···
 	return vdev->hw->ops->is_idle(vdev);
 };

+static inline int ivpu_hw_wait_for_idle(struct ivpu_device *vdev)
+{
+	return vdev->hw->ops->wait_for_idle(vdev);
+};
+
 static inline int ivpu_hw_power_down(struct ivpu_device *vdev)
 {
 	ivpu_dbg(vdev, PM, "HW power down\n");
···
 static inline void ivpu_hw_wdt_disable(struct ivpu_device *vdev)
 {
 	vdev->hw->ops->wdt_disable(vdev);
+};
+
+static inline u32 ivpu_hw_profiling_freq_get(struct ivpu_device *vdev)
+{
+	return vdev->hw->ops->profiling_freq_get(vdev);
+};
+
+static inline void ivpu_hw_profiling_freq_drive(struct ivpu_device *vdev, bool enable)
+{
+	return vdev->hw->ops->profiling_freq_drive(vdev, enable);
 };

 /* Register indirect accesses */
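
Note on the new ops above: the profiling hooks follow the existing pattern of a per-generation function-pointer table plus thin inline wrappers. The sketch below is a userspace illustration of that dispatch under stated assumptions. All demo_* names are hypothetical, and the two frequency constants are placeholders rather than values taken from the driver. One backend reports a fixed frequency and ignores drive(), the other lets drive() select the value that get() later reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_device;

/* Per-generation callbacks, reduced to the two profiling hooks. */
struct demo_hw_ops {
	uint32_t (*profiling_freq_get)(struct demo_device *dev);
	void (*profiling_freq_drive)(struct demo_device *dev, bool enable);
};

struct demo_device {
	const struct demo_hw_ops *ops;
	uint32_t profiling_freq;	/* only used by the switchable backend */
};

/* Placeholder frequencies for illustration; not taken from the driver. */
#define DEMO_FREQ_DEFAULT	38400000u
#define DEMO_FREQ_HIGH		400000000u

/* Fixed backend: get() returns a constant, drive() is a no-op. */
static uint32_t fixed_freq_get(struct demo_device *dev)
{
	(void)dev;
	return DEMO_FREQ_DEFAULT;
}

static void fixed_freq_drive(struct demo_device *dev, bool enable)
{
	(void)dev;
	(void)enable;
}

/* Switchable backend: drive() picks the value that get() reports. */
static uint32_t switch_freq_get(struct demo_device *dev)
{
	return dev->profiling_freq;
}

static void switch_freq_drive(struct demo_device *dev, bool enable)
{
	dev->profiling_freq = enable ? DEMO_FREQ_HIGH : DEMO_FREQ_DEFAULT;
}

static const struct demo_hw_ops fixed_ops = {
	.profiling_freq_get = fixed_freq_get,
	.profiling_freq_drive = fixed_freq_drive,
};

static const struct demo_hw_ops switch_ops = {
	.profiling_freq_get = switch_freq_get,
	.profiling_freq_drive = switch_freq_drive,
};

/* Generic wrappers: callers never touch the ops table directly. */
static inline uint32_t demo_hw_profiling_freq_get(struct demo_device *dev)
{
	return dev->ops->profiling_freq_get(dev);
}

static inline void demo_hw_profiling_freq_drive(struct demo_device *dev, bool enable)
{
	dev->ops->profiling_freq_drive(dev, enable);
}
```

Every backend has to populate every hook, even with an empty body, because the wrappers call through the pointer unconditionally.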

+52 -18

drivers/accel/ivpu/ivpu_hw_37xx.c

···

 #define PLL_REF_CLK_FREQ	     (50 * 1000000)
 #define PLL_SIMULATION_FREQ	     (10 * 1000000)
+#define PLL_PROF_CLK_FREQ	     (38400 * 1000)
 #define PLL_DEFAULT_EPP_VALUE	     0x80

 #define TIM_SAFE_ENABLE		     0xf1d0dead
···
 #define TIMEOUT_US		     (150 * USEC_PER_MSEC)
 #define PWR_ISLAND_STATUS_TIMEOUT_US (5 * USEC_PER_MSEC)
 #define PLL_TIMEOUT_US		     (1500 * USEC_PER_MSEC)
-#define IDLE_TIMEOUT_US		     (500 * USEC_PER_MSEC)
+#define IDLE_TIMEOUT_US		     (5 * USEC_PER_MSEC)

 #define ICB_0_IRQ_MASK ((REG_FLD(VPU_37XX_HOST_SS_ICB_STATUS_0, HOST_IPC_FIFO_INT)) | \
 			(REG_FLD(VPU_37XX_HOST_SS_ICB_STATUS_0, MMU_IRQ_0_INT)) | \
···
 	vdev->timeout.tdr = 2000;
 	vdev->timeout.reschedule_suspend = 10;
 	vdev->timeout.autosuspend = 10;
+	vdev->timeout.d0i3_entry_msg = 5;
 }

 static int ivpu_pll_wait_for_cmd_send(struct ivpu_device *vdev)
···
 {
 	int ret;

-	ret = ivpu_hw_37xx_reset(vdev);
-	if (ret)
-		ivpu_warn(vdev, "Failed to reset HW: %d\n", ret);
-
 	ret = ivpu_hw_37xx_d0i3_disable(vdev);
 	if (ret)
 		ivpu_warn(vdev, "Failed to disable D0I3: %d\n", ret);
···
 	       REG_TEST_FLD(VPU_37XX_BUTTRESS_VPU_STATUS, IDLE, val);
 }

+static int ivpu_hw_37xx_wait_for_idle(struct ivpu_device *vdev)
+{
+	return REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_STATUS, IDLE, 0x1, IDLE_TIMEOUT_US);
+}
+
+static void ivpu_hw_37xx_save_d0i3_entry_timestamp(struct ivpu_device *vdev)
+{
+	vdev->hw->d0i3_entry_host_ts = ktime_get_boottime();
+	vdev->hw->d0i3_entry_vpu_ts = REGV_RD64(VPU_37XX_CPU_SS_TIM_PERF_FREE_CNT);
+}
+
 static int ivpu_hw_37xx_power_down(struct ivpu_device *vdev)
 {
 	int ret = 0;

-	if (!ivpu_hw_37xx_is_idle(vdev) && ivpu_hw_37xx_reset(vdev))
-		ivpu_err(vdev, "Failed to reset the VPU\n");
+	ivpu_hw_37xx_save_d0i3_entry_timestamp(vdev);
+
+	if (!ivpu_hw_37xx_is_idle(vdev)) {
+		ivpu_warn(vdev, "VPU not idle during power down\n");
+		if (ivpu_hw_37xx_reset(vdev))
+			ivpu_warn(vdev, "Failed to reset the VPU\n");
+	}

 	if (ivpu_pll_disable(vdev)) {
 		ivpu_err(vdev, "Failed to disable PLL\n");
···
 	val = REGV_RD32(VPU_37XX_CPU_SS_TIM_GEN_CONFIG);
 	val = REG_CLR_FLD(VPU_37XX_CPU_SS_TIM_GEN_CONFIG, WDOG_TO_INT_CLR, val);
 	REGV_WR32(VPU_37XX_CPU_SS_TIM_GEN_CONFIG, val);
+}
+
+static u32 ivpu_hw_37xx_profiling_freq_get(struct ivpu_device *vdev)
+{
+	return PLL_PROF_CLK_FREQ;
+}
+
+static void ivpu_hw_37xx_profiling_freq_drive(struct ivpu_device *vdev, bool enable)
+{
+	/* Profiling freq - is a debug feature. Unavailable on VPU 37XX. */
 }

 static u32 ivpu_hw_37xx_pll_to_freq(u32 ratio, u32 config)
···
 }

 /* Handler for IRQs from VPU core (irqV) */
-static u32 ivpu_hw_37xx_irqv_handler(struct ivpu_device *vdev, int irq)
+static bool ivpu_hw_37xx_irqv_handler(struct ivpu_device *vdev, int irq, bool *wake_thread)
 {
 	u32 status = REGV_RD32(VPU_37XX_HOST_SS_ICB_STATUS_0) & ICB_0_IRQ_MASK;
+
+	if (!status)
+		return false;

 	REGV_WR32(VPU_37XX_HOST_SS_ICB_CLEAR_0, status);

···
 		ivpu_mmu_irq_evtq_handler(vdev);

 	if (REG_TEST_FLD(VPU_37XX_HOST_SS_ICB_STATUS_0, HOST_IPC_FIFO_INT, status))
-		ivpu_ipc_irq_handler(vdev);
+		ivpu_ipc_irq_handler(vdev, wake_thread);

 	if (REG_TEST_FLD(VPU_37XX_HOST_SS_ICB_STATUS_0, MMU_IRQ_1_INT, status))
 		ivpu_dbg(vdev, IRQ, "MMU sync complete\n");
···
 	if (REG_TEST_FLD(VPU_37XX_HOST_SS_ICB_STATUS_0, NOC_FIREWALL_INT, status))
 		ivpu_hw_37xx_irq_noc_firewall_handler(vdev);

-	return status;
+	return true;
 }

 /* Handler for IRQs from Buttress core (irqB) */
-static u32 ivpu_hw_37xx_irqb_handler(struct ivpu_device *vdev, int irq)
+static bool ivpu_hw_37xx_irqb_handler(struct ivpu_device *vdev, int irq)
 {
 	u32 status = REGB_RD32(VPU_37XX_BUTTRESS_INTERRUPT_STAT) & BUTTRESS_IRQ_MASK;
 	bool schedule_recovery = false;

-	if (status == 0)
-		return 0;
+	if (!status)
+		return false;

 	if (REG_TEST_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, FREQ_CHANGE, status))
 		ivpu_dbg(vdev, IRQ, "FREQ_CHANGE irq: %08x",
···
 	if (schedule_recovery)
 		ivpu_pm_schedule_recovery(vdev);

-	return status;
+	return true;
 }

 static irqreturn_t ivpu_hw_37xx_irq_handler(int irq, void *ptr)
 {
 	struct ivpu_device *vdev = ptr;
-	u32 ret_irqv, ret_irqb;
+	bool irqv_handled, irqb_handled, wake_thread = false;

 	REGB_WR32(VPU_37XX_BUTTRESS_GLOBAL_INT_MASK, 0x1);

-	ret_irqv = ivpu_hw_37xx_irqv_handler(vdev, irq);
-	ret_irqb = ivpu_hw_37xx_irqb_handler(vdev, irq);
+	irqv_handled = ivpu_hw_37xx_irqv_handler(vdev, irq, &wake_thread);
+	irqb_handled = ivpu_hw_37xx_irqb_handler(vdev, irq);

 	/* Re-enable global interrupts to re-trigger MSI for pending interrupts */
 	REGB_WR32(VPU_37XX_BUTTRESS_GLOBAL_INT_MASK, 0x0);

-	return IRQ_RETVAL(ret_irqb | ret_irqv);
+	if (wake_thread)
+		return IRQ_WAKE_THREAD;
+	if (irqv_handled || irqb_handled)
+		return IRQ_HANDLED;
+	return IRQ_NONE;
 }

 static void ivpu_hw_37xx_diagnose_failure(struct ivpu_device *vdev)
···
 	.info_init = ivpu_hw_37xx_info_init,
 	.power_up = ivpu_hw_37xx_power_up,
 	.is_idle = ivpu_hw_37xx_is_idle,
+	.wait_for_idle = ivpu_hw_37xx_wait_for_idle,
 	.power_down = ivpu_hw_37xx_power_down,
 	.reset = ivpu_hw_37xx_reset,
 	.boot_fw = ivpu_hw_37xx_boot_fw,
 	.wdt_disable = ivpu_hw_37xx_wdt_disable,
 	.diagnose_failure = ivpu_hw_37xx_diagnose_failure,
+	.profiling_freq_get = ivpu_hw_37xx_profiling_freq_get,
+	.profiling_freq_drive = ivpu_hw_37xx_profiling_freq_drive,
 	.reg_pll_freq_get = ivpu_hw_37xx_reg_pll_freq_get,
 	.reg_telemetry_offset_get = ivpu_hw_37xx_reg_telemetry_offset_get,
 	.reg_telemetry_size_get = ivpu_hw_37xx_reg_telemetry_size_get,
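
Note on the reworked top-level handler above: the two sub-handlers now return a plain bool ("did I handle anything"), and the request to wake the threaded handler travels through a separate out-parameter. The final return value is then picked in priority order. A minimal userspace sketch of just that decision follows. The enum demo_irqreturn and the function demo_irq_result() are hypothetical stand-ins for the kernel's irqreturn_t handling, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's irqreturn_t values. */
enum demo_irqreturn {
	DEMO_IRQ_NONE = 0,		/* interrupt was not from this device */
	DEMO_IRQ_HANDLED = 1,		/* handled entirely in hard-IRQ context */
	DEMO_IRQ_WAKE_THREAD = 2,	/* ask the core to run the threaded handler */
};

/*
 * Combine the results of the two sub-handlers.
 * Waking the thread takes priority, then "handled", then "none".
 */
static enum demo_irqreturn demo_irq_result(bool irqv_handled, bool irqb_handled,
					   bool wake_thread)
{
	if (wake_thread)
		return DEMO_IRQ_WAKE_THREAD;
	if (irqv_handled || irqb_handled)
		return DEMO_IRQ_HANDLED;
	return DEMO_IRQ_NONE;
}
```

Keeping "handled" and "wake the thread" as separate booleans avoids OR-ing status words or irqreturn_t values together, which is what the removed code did in both the 37xx and 40xx handlers.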

+2

drivers/accel/ivpu/ivpu_hw_37xx_reg.h

···
 #define VPU_37XX_CPU_SS_TIM_GEN_CONFIG					0x06021008u
 #define VPU_37XX_CPU_SS_TIM_GEN_CONFIG_WDOG_TO_INT_CLR_MASK		BIT_MASK(9)

+#define VPU_37XX_CPU_SS_TIM_PERF_FREE_CNT				0x06029000u
+
 #define VPU_37XX_CPU_SS_DOORBELL_0					0x06300000u
 #define VPU_37XX_CPU_SS_DOORBELL_0_SET_MASK				BIT_MASK(0)

+48 -21

drivers/accel/ivpu/ivpu_hw_40xx.c

··· 39 39 #define TIMEOUT_US (150 * USEC_PER_MSEC) 40 40 #define PWR_ISLAND_STATUS_TIMEOUT_US (5 * USEC_PER_MSEC) 41 41 #define PLL_TIMEOUT_US (1500 * USEC_PER_MSEC) 42 + #define IDLE_TIMEOUT_US (5 * USEC_PER_MSEC) 42 43 43 44 #define WEIGHTS_DEFAULT 0xf711f711u 44 45 #define WEIGHTS_ATS_DEFAULT 0x0000f711u ··· 140 139 vdev->timeout.tdr = 2000000; 141 140 vdev->timeout.reschedule_suspend = 1000; 142 141 vdev->timeout.autosuspend = -1; 142 + vdev->timeout.d0i3_entry_msg = 500; 143 143 } else if (ivpu_is_simics(vdev)) { 144 144 vdev->timeout.boot = 50; 145 145 vdev->timeout.jsm = 500; 146 146 vdev->timeout.tdr = 10000; 147 147 vdev->timeout.reschedule_suspend = 10; 148 148 vdev->timeout.autosuspend = -1; 149 + vdev->timeout.d0i3_entry_msg = 100; 149 150 } else { 150 151 vdev->timeout.boot = 1000; 151 152 vdev->timeout.jsm = 500; 152 153 vdev->timeout.tdr = 2000; 153 154 vdev->timeout.reschedule_suspend = 10; 154 155 vdev->timeout.autosuspend = 10; 156 + vdev->timeout.d0i3_entry_msg = 5; 155 157 } 156 158 } 157 159 ··· 828 824 { 829 825 int ret; 830 826 831 - ret = ivpu_hw_40xx_reset(vdev); 832 - if (ret) { 833 - ivpu_err(vdev, "Failed to reset HW: %d\n", ret); 834 - return ret; 835 - } 836 - 837 827 ret = ivpu_hw_40xx_d0i3_disable(vdev); 838 828 if (ret) 839 829 ivpu_warn(vdev, "Failed to disable D0I3: %d\n", ret); ··· 896 898 REG_TEST_FLD(VPU_40XX_BUTTRESS_VPU_STATUS, IDLE, val); 897 899 } 898 900 901 + static int ivpu_hw_40xx_wait_for_idle(struct ivpu_device *vdev) 902 + { 903 + return REGB_POLL_FLD(VPU_40XX_BUTTRESS_VPU_STATUS, IDLE, 0x1, IDLE_TIMEOUT_US); 904 + } 905 + 906 + static void ivpu_hw_40xx_save_d0i3_entry_timestamp(struct ivpu_device *vdev) 907 + { 908 + vdev->hw->d0i3_entry_host_ts = ktime_get_boottime(); 909 + vdev->hw->d0i3_entry_vpu_ts = REGV_RD64(VPU_40XX_CPU_SS_TIM_PERF_EXT_FREE_CNT); 910 + } 911 + 899 912 static int ivpu_hw_40xx_power_down(struct ivpu_device *vdev) 900 913 { 901 914 int ret = 0; 915 + 916 + 
ivpu_hw_40xx_save_d0i3_entry_timestamp(vdev); 902 917 903 918 if (!ivpu_hw_40xx_is_idle(vdev) && ivpu_hw_40xx_reset(vdev)) 904 919 ivpu_warn(vdev, "Failed to reset the VPU\n"); ··· 942 931 val = REGV_RD32(VPU_40XX_CPU_SS_TIM_GEN_CONFIG); 943 932 val = REG_CLR_FLD(VPU_40XX_CPU_SS_TIM_GEN_CONFIG, WDOG_TO_INT_CLR, val); 944 933 REGV_WR32(VPU_40XX_CPU_SS_TIM_GEN_CONFIG, val); 934 + } 935 + 936 + static u32 ivpu_hw_40xx_profiling_freq_get(struct ivpu_device *vdev) 937 + { 938 + return vdev->hw->pll.profiling_freq; 939 + } 940 + 941 + static void ivpu_hw_40xx_profiling_freq_drive(struct ivpu_device *vdev, bool enable) 942 + { 943 + if (enable) 944 + vdev->hw->pll.profiling_freq = PLL_PROFILING_FREQ_HIGH; 945 + else 946 + vdev->hw->pll.profiling_freq = PLL_PROFILING_FREQ_DEFAULT; 945 947 } 946 948 947 949 /* Register indirect accesses */ ··· 1047 1023 } 1048 1024 1049 1025 /* Handler for IRQs from VPU core (irqV) */ 1050 - static irqreturn_t ivpu_hw_40xx_irqv_handler(struct ivpu_device *vdev, int irq) 1026 + static bool ivpu_hw_40xx_irqv_handler(struct ivpu_device *vdev, int irq, bool *wake_thread) 1051 1027 { 1052 1028 u32 status = REGV_RD32(VPU_40XX_HOST_SS_ICB_STATUS_0) & ICB_0_IRQ_MASK; 1053 - irqreturn_t ret = IRQ_NONE; 1054 1029 1055 1030 if (!status) 1056 - return IRQ_NONE; 1031 + return false; 1057 1032 1058 1033 REGV_WR32(VPU_40XX_HOST_SS_ICB_CLEAR_0, status); 1059 1034 ··· 1060 1037 ivpu_mmu_irq_evtq_handler(vdev); 1061 1038 1062 1039 if (REG_TEST_FLD(VPU_40XX_HOST_SS_ICB_STATUS_0, HOST_IPC_FIFO_INT, status)) 1063 - ret |= ivpu_ipc_irq_handler(vdev); 1040 + ivpu_ipc_irq_handler(vdev, wake_thread); 1064 1041 1065 1042 if (REG_TEST_FLD(VPU_40XX_HOST_SS_ICB_STATUS_0, MMU_IRQ_1_INT, status)) 1066 1043 ivpu_dbg(vdev, IRQ, "MMU sync complete\n"); ··· 1077 1054 if (REG_TEST_FLD(VPU_40XX_HOST_SS_ICB_STATUS_0, NOC_FIREWALL_INT, status)) 1078 1055 ivpu_hw_40xx_irq_noc_firewall_handler(vdev); 1079 1056 1080 - return ret; 1057 + return true; 1081 1058 } 1082 1059 1083 1060 
/* Handler for IRQs from Buttress core (irqB) */ 1084 - static irqreturn_t ivpu_hw_40xx_irqb_handler(struct ivpu_device *vdev, int irq) 1061 + static bool ivpu_hw_40xx_irqb_handler(struct ivpu_device *vdev, int irq) 1085 1062 { 1086 1063 bool schedule_recovery = false; 1087 1064 u32 status = REGB_RD32(VPU_40XX_BUTTRESS_INTERRUPT_STAT) & BUTTRESS_IRQ_MASK; 1088 1065 1089 - if (status == 0) 1090 - return IRQ_NONE; 1066 + if (!status) 1067 + return false; 1091 1068 1092 1069 if (REG_TEST_FLD(VPU_40XX_BUTTRESS_INTERRUPT_STAT, FREQ_CHANGE, status)) 1093 1070 ivpu_dbg(vdev, IRQ, "FREQ_CHANGE"); ··· 1139 1116 if (schedule_recovery) 1140 1117 ivpu_pm_schedule_recovery(vdev); 1141 1118 1142 - return IRQ_HANDLED; 1119 + return true; 1143 1120 } 1144 1121 1145 1122 static irqreturn_t ivpu_hw_40xx_irq_handler(int irq, void *ptr) 1146 1123 { 1124 + bool irqv_handled, irqb_handled, wake_thread = false; 1147 1125 struct ivpu_device *vdev = ptr; 1148 - irqreturn_t ret = IRQ_NONE; 1149 1126 1150 1127 REGB_WR32(VPU_40XX_BUTTRESS_GLOBAL_INT_MASK, 0x1); 1151 1128 1152 - ret |= ivpu_hw_40xx_irqv_handler(vdev, irq); 1153 - ret |= ivpu_hw_40xx_irqb_handler(vdev, irq); 1129 + irqv_handled = ivpu_hw_40xx_irqv_handler(vdev, irq, &wake_thread); 1130 + irqb_handled = ivpu_hw_40xx_irqb_handler(vdev, irq); 1154 1131 1155 1132 /* Re-enable global interrupts to re-trigger MSI for pending interrupts */ 1156 1133 REGB_WR32(VPU_40XX_BUTTRESS_GLOBAL_INT_MASK, 0x0); 1157 1134 1158 - if (ret & IRQ_WAKE_THREAD) 1135 + if (wake_thread) 1159 1136 return IRQ_WAKE_THREAD; 1160 - 1161 - return ret; 1137 + if (irqv_handled || irqb_handled) 1138 + return IRQ_HANDLED; 1139 + return IRQ_NONE; 1162 1140 } 1163 1141 1164 1142 static void ivpu_hw_40xx_diagnose_failure(struct ivpu_device *vdev) ··· 1209 1185 .info_init = ivpu_hw_40xx_info_init, 1210 1186 .power_up = ivpu_hw_40xx_power_up, 1211 1187 .is_idle = ivpu_hw_40xx_is_idle, 1188 + .wait_for_idle = ivpu_hw_40xx_wait_for_idle, 1212 1189 .power_down = 
ivpu_hw_40xx_power_down, 1213 1190 .reset = ivpu_hw_40xx_reset, 1214 1191 .boot_fw = ivpu_hw_40xx_boot_fw, 1215 1192 .wdt_disable = ivpu_hw_40xx_wdt_disable, 1216 1193 .diagnose_failure = ivpu_hw_40xx_diagnose_failure, 1194 + .profiling_freq_get = ivpu_hw_40xx_profiling_freq_get, 1195 + .profiling_freq_drive = ivpu_hw_40xx_profiling_freq_drive, 1217 1196 .reg_pll_freq_get = ivpu_hw_40xx_reg_pll_freq_get, 1218 1197 .reg_telemetry_offset_get = ivpu_hw_40xx_reg_telemetry_offset_get, 1219 1198 .reg_telemetry_size_get = ivpu_hw_40xx_reg_telemetry_size_get,
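
Note on the interrupt handler change above: the sub-handlers now return plain booleans, and the top-level handler chooses the final `irqreturn_t` itself. The following is a minimal userspace sketch of that decision only. The enum and the function name `sketch_irq_result` are stand-ins invented for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's irqreturn_t values; illustrative only. */
enum sketch_irqreturn {
	SKETCH_IRQ_NONE = 0,
	SKETCH_IRQ_HANDLED = 1,
	SKETCH_IRQ_WAKE_THREAD = 2,
};

/*
 * Mirrors the tail of ivpu_hw_40xx_irq_handler() in the hunk above:
 * a wake-up request takes priority, then "handled", otherwise "none".
 */
static enum sketch_irqreturn
sketch_irq_result(bool irqv_handled, bool irqb_handled, bool wake_thread)
{
	if (wake_thread)
		return SKETCH_IRQ_WAKE_THREAD;
	if (irqv_handled || irqb_handled)
		return SKETCH_IRQ_HANDLED;
	return SKETCH_IRQ_NONE;
}
```

With this shape, an interrupt that only queued callback messages still wakes the thread, while one where neither status register had a bit set is reported as not handled.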

+157 -94

drivers/accel/ivpu/ivpu_ipc.c

··· 5 5 6 6 #include <linux/genalloc.h> 7 7 #include <linux/highmem.h> 8 - #include <linux/kthread.h> 8 + #include <linux/pm_runtime.h> 9 9 #include <linux/wait.h> 10 10 11 11 #include "ivpu_drv.h" ··· 17 17 #include "ivpu_pm.h" 18 18 19 19 #define IPC_MAX_RX_MSG 128 20 - #define IS_KTHREAD() (get_current()->flags & PF_KTHREAD) 21 20 22 21 struct ivpu_ipc_tx_buf { 23 22 struct ivpu_ipc_hdr ipc; 24 23 struct vpu_jsm_msg jsm; 25 - }; 26 - 27 - struct ivpu_ipc_rx_msg { 28 - struct list_head link; 29 - struct ivpu_ipc_hdr *ipc_hdr; 30 - struct vpu_jsm_msg *jsm_msg; 31 24 }; 32 25 33 26 static void ivpu_ipc_msg_dump(struct ivpu_device *vdev, char *c, ··· 132 139 ivpu_hw_reg_ipc_tx_set(vdev, vpu_addr); 133 140 } 134 141 135 - void 136 - ivpu_ipc_consumer_add(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, u32 channel) 142 + static void 143 + ivpu_ipc_rx_msg_add(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, 144 + struct ivpu_ipc_hdr *ipc_hdr, struct vpu_jsm_msg *jsm_msg) 145 + { 146 + struct ivpu_ipc_info *ipc = vdev->ipc; 147 + struct ivpu_ipc_rx_msg *rx_msg; 148 + 149 + lockdep_assert_held(&ipc->cons_lock); 150 + lockdep_assert_irqs_disabled(); 151 + 152 + rx_msg = kzalloc(sizeof(*rx_msg), GFP_ATOMIC); 153 + if (!rx_msg) { 154 + ivpu_ipc_rx_mark_free(vdev, ipc_hdr, jsm_msg); 155 + return; 156 + } 157 + 158 + atomic_inc(&ipc->rx_msg_count); 159 + 160 + rx_msg->ipc_hdr = ipc_hdr; 161 + rx_msg->jsm_msg = jsm_msg; 162 + rx_msg->callback = cons->rx_callback; 163 + 164 + if (rx_msg->callback) { 165 + list_add_tail(&rx_msg->link, &ipc->cb_msg_list); 166 + } else { 167 + spin_lock(&cons->rx_lock); 168 + list_add_tail(&rx_msg->link, &cons->rx_msg_list); 169 + spin_unlock(&cons->rx_lock); 170 + wake_up(&cons->rx_msg_wq); 171 + } 172 + } 173 + 174 + static void 175 + ivpu_ipc_rx_msg_del(struct ivpu_device *vdev, struct ivpu_ipc_rx_msg *rx_msg) 176 + { 177 + list_del(&rx_msg->link); 178 + ivpu_ipc_rx_mark_free(vdev, rx_msg->ipc_hdr, rx_msg->jsm_msg); 179 + 
atomic_dec(&vdev->ipc->rx_msg_count); 180 + kfree(rx_msg); 181 + } 182 + 183 + void ivpu_ipc_consumer_add(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, 184 + u32 channel, ivpu_ipc_rx_callback_t rx_callback) 137 185 { 138 186 struct ivpu_ipc_info *ipc = vdev->ipc; 139 187 ··· 182 148 cons->channel = channel; 183 149 cons->tx_vpu_addr = 0; 184 150 cons->request_id = 0; 185 - spin_lock_init(&cons->rx_msg_lock); 151 + cons->aborted = false; 152 + cons->rx_callback = rx_callback; 153 + spin_lock_init(&cons->rx_lock); 186 154 INIT_LIST_HEAD(&cons->rx_msg_list); 187 155 init_waitqueue_head(&cons->rx_msg_wq); 188 156 189 - spin_lock_irq(&ipc->cons_list_lock); 157 + spin_lock_irq(&ipc->cons_lock); 190 158 list_add_tail(&cons->link, &ipc->cons_list); 191 - spin_unlock_irq(&ipc->cons_list_lock); 159 + spin_unlock_irq(&ipc->cons_lock); 192 160 } 193 161 194 162 void ivpu_ipc_consumer_del(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons) ··· 198 162 struct ivpu_ipc_info *ipc = vdev->ipc; 199 163 struct ivpu_ipc_rx_msg *rx_msg, *r; 200 164 201 - spin_lock_irq(&ipc->cons_list_lock); 165 + spin_lock_irq(&ipc->cons_lock); 202 166 list_del(&cons->link); 203 - spin_unlock_irq(&ipc->cons_list_lock); 167 + spin_unlock_irq(&ipc->cons_lock); 204 168 205 - spin_lock_irq(&cons->rx_msg_lock); 206 - list_for_each_entry_safe(rx_msg, r, &cons->rx_msg_list, link) { 207 - list_del(&rx_msg->link); 208 - ivpu_ipc_rx_mark_free(vdev, rx_msg->ipc_hdr, rx_msg->jsm_msg); 209 - atomic_dec(&ipc->rx_msg_count); 210 - kfree(rx_msg); 211 - } 212 - spin_unlock_irq(&cons->rx_msg_lock); 169 + spin_lock_irq(&cons->rx_lock); 170 + list_for_each_entry_safe(rx_msg, r, &cons->rx_msg_list, link) 171 + ivpu_ipc_rx_msg_del(vdev, rx_msg); 172 + spin_unlock_irq(&cons->rx_lock); 213 173 214 174 ivpu_ipc_tx_release(vdev, cons->tx_vpu_addr); 215 175 } ··· 234 202 return ret; 235 203 } 236 204 205 + static bool ivpu_ipc_rx_need_wakeup(struct ivpu_ipc_consumer *cons) 206 + { 207 + bool ret; 208 + 209 + 
spin_lock_irq(&cons->rx_lock); 210 + ret = !list_empty(&cons->rx_msg_list) || cons->aborted; 211 + spin_unlock_irq(&cons->rx_lock); 212 + 213 + return ret; 214 + } 215 + 237 216 int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, 238 217 struct ivpu_ipc_hdr *ipc_buf, 239 - struct vpu_jsm_msg *ipc_payload, unsigned long timeout_ms) 218 + struct vpu_jsm_msg *jsm_msg, unsigned long timeout_ms) 240 219 { 241 - struct ivpu_ipc_info *ipc = vdev->ipc; 242 220 struct ivpu_ipc_rx_msg *rx_msg; 243 221 int wait_ret, ret = 0; 244 222 245 - wait_ret = wait_event_timeout(cons->rx_msg_wq, 246 - (IS_KTHREAD() && kthread_should_stop()) || 247 - !list_empty(&cons->rx_msg_list), 248 - msecs_to_jiffies(timeout_ms)); 223 + if (drm_WARN_ONCE(&vdev->drm, cons->rx_callback, "Consumer works only in async mode\n")) 224 + return -EINVAL; 249 225 250 - if (IS_KTHREAD() && kthread_should_stop()) 251 - return -EINTR; 226 + wait_ret = wait_event_timeout(cons->rx_msg_wq, 227 + ivpu_ipc_rx_need_wakeup(cons), 228 + msecs_to_jiffies(timeout_ms)); 252 229 253 230 if (wait_ret == 0) 254 231 return -ETIMEDOUT; 255 232 256 - spin_lock_irq(&cons->rx_msg_lock); 233 + spin_lock_irq(&cons->rx_lock); 234 + if (cons->aborted) { 235 + spin_unlock_irq(&cons->rx_lock); 236 + return -ECANCELED; 237 + } 257 238 rx_msg = list_first_entry_or_null(&cons->rx_msg_list, struct ivpu_ipc_rx_msg, link); 258 239 if (!rx_msg) { 259 - spin_unlock_irq(&cons->rx_msg_lock); 240 + spin_unlock_irq(&cons->rx_lock); 260 241 return -EAGAIN; 261 242 } 262 - list_del(&rx_msg->link); 263 - spin_unlock_irq(&cons->rx_msg_lock); 264 243 265 244 if (ipc_buf) 266 245 memcpy(ipc_buf, rx_msg->ipc_hdr, sizeof(*ipc_buf)); 267 246 if (rx_msg->jsm_msg) { 268 - u32 size = min_t(int, rx_msg->ipc_hdr->data_size, sizeof(*ipc_payload)); 247 + u32 size = min_t(int, rx_msg->ipc_hdr->data_size, sizeof(*jsm_msg)); 269 248 270 249 if (rx_msg->jsm_msg->result != VPU_JSM_STATUS_SUCCESS) { 271 250 ivpu_dbg(vdev, IPC, "IPC resp result 
error: %d\n", rx_msg->jsm_msg->result); 272 251 ret = -EBADMSG; 273 252 } 274 253 275 - if (ipc_payload) 276 - memcpy(ipc_payload, rx_msg->jsm_msg, size); 254 + if (jsm_msg) 255 + memcpy(jsm_msg, rx_msg->jsm_msg, size); 277 256 } 278 257 279 - ivpu_ipc_rx_mark_free(vdev, rx_msg->ipc_hdr, rx_msg->jsm_msg); 280 - atomic_dec(&ipc->rx_msg_count); 281 - kfree(rx_msg); 282 - 258 + ivpu_ipc_rx_msg_del(vdev, rx_msg); 259 + spin_unlock_irq(&cons->rx_lock); 283 260 return ret; 284 261 } 285 262 ··· 301 260 struct ivpu_ipc_consumer cons; 302 261 int ret; 303 262 304 - ivpu_ipc_consumer_add(vdev, &cons, channel); 263 + ivpu_ipc_consumer_add(vdev, &cons, channel, NULL); 305 264 306 265 ret = ivpu_ipc_send(vdev, &cons, req); 307 266 if (ret) { ··· 326 285 return ret; 327 286 } 328 287 329 - int ivpu_ipc_send_receive(struct ivpu_device *vdev, struct vpu_jsm_msg *req, 330 - enum vpu_ipc_msg_type expected_resp_type, 331 - struct vpu_jsm_msg *resp, u32 channel, 332 - unsigned long timeout_ms) 288 + int ivpu_ipc_send_receive_active(struct ivpu_device *vdev, struct vpu_jsm_msg *req, 289 + enum vpu_ipc_msg_type expected_resp, struct vpu_jsm_msg *resp, 290 + u32 channel, unsigned long timeout_ms) 333 291 { 334 292 struct vpu_jsm_msg hb_req = { .type = VPU_JSM_MSG_QUERY_ENGINE_HB }; 335 293 struct vpu_jsm_msg hb_resp; 336 294 int ret, hb_ret; 337 295 338 - ret = ivpu_rpm_get(vdev); 339 - if (ret < 0) 340 - return ret; 296 + drm_WARN_ON(&vdev->drm, pm_runtime_status_suspended(vdev->drm.dev)); 341 297 342 - ret = ivpu_ipc_send_receive_internal(vdev, req, expected_resp_type, resp, 343 - channel, timeout_ms); 298 + ret = ivpu_ipc_send_receive_internal(vdev, req, expected_resp, resp, channel, timeout_ms); 344 299 if (ret != -ETIMEDOUT) 345 - goto rpm_put; 300 + return ret; 346 301 347 302 hb_ret = ivpu_ipc_send_receive_internal(vdev, &hb_req, VPU_JSM_MSG_QUERY_ENGINE_HB_DONE, 348 303 &hb_resp, VPU_IPC_CHAN_ASYNC_CMD, ··· 348 311 ivpu_pm_schedule_recovery(vdev); 349 312 } 350 313 351 - 
rpm_put: 314 + return ret; 315 + } 316 + 317 + int ivpu_ipc_send_receive(struct ivpu_device *vdev, struct vpu_jsm_msg *req, 318 + enum vpu_ipc_msg_type expected_resp, struct vpu_jsm_msg *resp, 319 + u32 channel, unsigned long timeout_ms) 320 + { 321 + int ret; 322 + 323 + ret = ivpu_rpm_get(vdev); 324 + if (ret < 0) 325 + return ret; 326 + 327 + ret = ivpu_ipc_send_receive_active(vdev, req, expected_resp, resp, channel, timeout_ms); 328 + 352 329 ivpu_rpm_put(vdev); 353 330 return ret; 354 331 } ··· 380 329 return false; 381 330 } 382 331 383 - static void 384 - ivpu_ipc_dispatch(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, 385 - struct ivpu_ipc_hdr *ipc_hdr, struct vpu_jsm_msg *jsm_msg) 386 - { 387 - struct ivpu_ipc_info *ipc = vdev->ipc; 388 - struct ivpu_ipc_rx_msg *rx_msg; 389 - unsigned long flags; 390 - 391 - lockdep_assert_held(&ipc->cons_list_lock); 392 - 393 - rx_msg = kzalloc(sizeof(*rx_msg), GFP_ATOMIC); 394 - if (!rx_msg) { 395 - ivpu_ipc_rx_mark_free(vdev, ipc_hdr, jsm_msg); 396 - return; 397 - } 398 - 399 - atomic_inc(&ipc->rx_msg_count); 400 - 401 - rx_msg->ipc_hdr = ipc_hdr; 402 - rx_msg->jsm_msg = jsm_msg; 403 - 404 - spin_lock_irqsave(&cons->rx_msg_lock, flags); 405 - list_add_tail(&rx_msg->link, &cons->rx_msg_list); 406 - spin_unlock_irqrestore(&cons->rx_msg_lock, flags); 407 - 408 - wake_up(&cons->rx_msg_wq); 409 - } 410 - 411 - int ivpu_ipc_irq_handler(struct ivpu_device *vdev) 332 + void ivpu_ipc_irq_handler(struct ivpu_device *vdev, bool *wake_thread) 412 333 { 413 334 struct ivpu_ipc_info *ipc = vdev->ipc; 414 335 struct ivpu_ipc_consumer *cons; ··· 398 375 vpu_addr = ivpu_hw_reg_ipc_rx_addr_get(vdev); 399 376 if (vpu_addr == REG_IO_ERROR) { 400 377 ivpu_err_ratelimited(vdev, "Failed to read IPC rx addr register\n"); 401 - return -EIO; 378 + return; 402 379 } 403 380 404 381 ipc_hdr = ivpu_to_cpu_addr(ipc->mem_rx, vpu_addr); ··· 428 405 } 429 406 430 407 dispatched = false; 431 - spin_lock_irqsave(&ipc->cons_list_lock, flags); 
408 + spin_lock_irqsave(&ipc->cons_lock, flags); 432 409 list_for_each_entry(cons, &ipc->cons_list, link) { 433 410 if (ivpu_ipc_match_consumer(vdev, cons, ipc_hdr, jsm_msg)) { 434 - ivpu_ipc_dispatch(vdev, cons, ipc_hdr, jsm_msg); 411 + ivpu_ipc_rx_msg_add(vdev, cons, ipc_hdr, jsm_msg); 435 412 dispatched = true; 436 413 break; 437 414 } 438 415 } 439 - spin_unlock_irqrestore(&ipc->cons_list_lock, flags); 416 + spin_unlock_irqrestore(&ipc->cons_lock, flags); 440 417 441 418 if (!dispatched) { 442 419 ivpu_dbg(vdev, IPC, "IPC RX msg 0x%x dropped (no consumer)\n", vpu_addr); ··· 444 421 } 445 422 } 446 423 447 - return 0; 424 + if (wake_thread) 425 + *wake_thread = !list_empty(&ipc->cb_msg_list); 426 + } 427 + 428 + irqreturn_t ivpu_ipc_irq_thread_handler(struct ivpu_device *vdev) 429 + { 430 + struct ivpu_ipc_info *ipc = vdev->ipc; 431 + struct ivpu_ipc_rx_msg *rx_msg, *r; 432 + struct list_head cb_msg_list; 433 + 434 + INIT_LIST_HEAD(&cb_msg_list); 435 + 436 + spin_lock_irq(&ipc->cons_lock); 437 + list_splice_tail_init(&ipc->cb_msg_list, &cb_msg_list); 438 + spin_unlock_irq(&ipc->cons_lock); 439 + 440 + list_for_each_entry_safe(rx_msg, r, &cb_msg_list, link) { 441 + rx_msg->callback(vdev, rx_msg->ipc_hdr, rx_msg->jsm_msg); 442 + ivpu_ipc_rx_msg_del(vdev, rx_msg); 443 + } 444 + 445 + return IRQ_HANDLED; 448 446 } 449 447 450 448 int ivpu_ipc_init(struct ivpu_device *vdev) ··· 500 456 goto err_free_rx; 501 457 } 502 458 459 + spin_lock_init(&ipc->cons_lock); 503 460 INIT_LIST_HEAD(&ipc->cons_list); 504 - spin_lock_init(&ipc->cons_list_lock); 461 + INIT_LIST_HEAD(&ipc->cb_msg_list); 505 462 drmm_mutex_init(&vdev->drm, &ipc->lock); 506 - 507 463 ivpu_ipc_reset(vdev); 508 464 return 0; 509 465 ··· 516 472 517 473 void ivpu_ipc_fini(struct ivpu_device *vdev) 518 474 { 475 + struct ivpu_ipc_info *ipc = vdev->ipc; 476 + 477 + drm_WARN_ON(&vdev->drm, ipc->on); 478 + drm_WARN_ON(&vdev->drm, !list_empty(&ipc->cons_list)); 479 + drm_WARN_ON(&vdev->drm, 
!list_empty(&ipc->cb_msg_list)); 480 + drm_WARN_ON(&vdev->drm, atomic_read(&ipc->rx_msg_count) > 0); 481 + 519 482 ivpu_ipc_mem_fini(vdev); 520 483 } 521 484 ··· 539 488 { 540 489 struct ivpu_ipc_info *ipc = vdev->ipc; 541 490 struct ivpu_ipc_consumer *cons, *c; 542 - unsigned long flags; 491 + struct ivpu_ipc_rx_msg *rx_msg, *r; 492 + 493 + drm_WARN_ON(&vdev->drm, !list_empty(&ipc->cb_msg_list)); 543 494 544 495 mutex_lock(&ipc->lock); 545 496 ipc->on = false; 546 497 mutex_unlock(&ipc->lock); 547 498 548 - spin_lock_irqsave(&ipc->cons_list_lock, flags); 549 - list_for_each_entry_safe(cons, c, &ipc->cons_list, link) 499 + spin_lock_irq(&ipc->cons_lock); 500 + list_for_each_entry_safe(cons, c, &ipc->cons_list, link) { 501 + spin_lock(&cons->rx_lock); 502 + if (!cons->rx_callback) 503 + cons->aborted = true; 504 + list_for_each_entry_safe(rx_msg, r, &cons->rx_msg_list, link) 505 + ivpu_ipc_rx_msg_del(vdev, rx_msg); 506 + spin_unlock(&cons->rx_lock); 550 507 wake_up(&cons->rx_msg_wq); 551 - spin_unlock_irqrestore(&ipc->cons_list_lock, flags); 508 + } 509 + spin_unlock_irq(&ipc->cons_lock); 510 + 511 + drm_WARN_ON(&vdev->drm, atomic_read(&ipc->rx_msg_count) > 0); 552 512 } 553 513 554 514 void ivpu_ipc_reset(struct ivpu_device *vdev) ··· 567 505 struct ivpu_ipc_info *ipc = vdev->ipc; 568 506 569 507 mutex_lock(&ipc->lock); 508 + drm_WARN_ON(&vdev->drm, ipc->on); 570 509 571 510 memset(ivpu_bo_vaddr(ipc->mem_tx), 0, ivpu_bo_size(ipc->mem_tx)); 572 511 memset(ivpu_bo_vaddr(ipc->mem_rx), 0, ivpu_bo_size(ipc->mem_rx));
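
Note on `ivpu_ipc_irq_thread_handler()` above: it moves everything queued on `cb_msg_list` onto a private list and only then runs the callbacks, so the callbacks execute without `cons_lock` held. Below is a rough single-threaded sketch of that detach-then-drain pattern. All `sketch_*` names are hypothetical, the queue is a simple singly linked FIFO rather than the kernel's `list_head`, and locking is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message node; the driver uses struct ivpu_ipc_rx_msg. */
struct sketch_msg {
	struct sketch_msg *next;
	int payload;
	void (*callback)(int payload, void *ctx);
};

/* Hypothetical FIFO standing in for ipc->cb_msg_list. */
struct sketch_queue {
	struct sketch_msg *head;
	struct sketch_msg **tail;
};

static void sketch_queue_init(struct sketch_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

static void sketch_queue_push(struct sketch_queue *q, struct sketch_msg *m)
{
	m->next = NULL;
	*q->tail = m;
	q->tail = &m->next;
}

/*
 * Detach everything queued so far, then run the callbacks on the
 * private copy. In the driver the detach step runs under cons_lock;
 * this sketch is single-threaded and takes no lock.
 * Returns the number of callbacks invoked.
 */
static int sketch_drain(struct sketch_queue *q, void *ctx)
{
	struct sketch_msg *m = q->head;
	int handled = 0;

	sketch_queue_init(q);	/* the shared queue is empty again */

	while (m) {
		struct sketch_msg *next = m->next;	/* read before the callback runs */

		m->callback(m->payload, ctx);
		handled++;
		m = next;
	}
	return handled;
}

/* Example callback: add the payload to the int that ctx points to. */
static void sketch_sum_cb(int payload, void *ctx)
{
	*(int *)ctx += payload;
}
```

Messages pushed while the drain is running land on the now-empty shared queue and are picked up by the next drain, which is the property the splice in the driver relies on.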

+25 -8

drivers/accel/ivpu/ivpu_ipc.h

··· 42 42 u8 status; 43 43 } __packed __aligned(IVPU_IPC_ALIGNMENT); 44 44 45 + typedef void (*ivpu_ipc_rx_callback_t)(struct ivpu_device *vdev, 46 + struct ivpu_ipc_hdr *ipc_hdr, 47 + struct vpu_jsm_msg *jsm_msg); 48 + 49 + struct ivpu_ipc_rx_msg { 50 + struct list_head link; 51 + struct ivpu_ipc_hdr *ipc_hdr; 52 + struct vpu_jsm_msg *jsm_msg; 53 + ivpu_ipc_rx_callback_t callback; 54 + }; 55 + 45 56 struct ivpu_ipc_consumer { 46 57 struct list_head link; 47 58 u32 channel; 48 59 u32 tx_vpu_addr; 49 60 u32 request_id; 61 + bool aborted; 62 + ivpu_ipc_rx_callback_t rx_callback; 50 63 51 - spinlock_t rx_msg_lock; /* Protects rx_msg_list */ 64 + spinlock_t rx_lock; /* Protects rx_msg_list and aborted */ 52 65 struct list_head rx_msg_list; 53 66 wait_queue_head_t rx_msg_wq; 54 67 }; ··· 73 60 74 61 atomic_t rx_msg_count; 75 62 76 - spinlock_t cons_list_lock; /* Protects cons_list */ 63 + spinlock_t cons_lock; /* Protects cons_list and cb_msg_list */ 77 64 struct list_head cons_list; 65 + struct list_head cb_msg_list; 78 66 79 67 atomic_t request_id; 80 68 struct mutex lock; /* Lock on status */ ··· 89 75 void ivpu_ipc_disable(struct ivpu_device *vdev); 90 76 void ivpu_ipc_reset(struct ivpu_device *vdev); 91 77 92 - int ivpu_ipc_irq_handler(struct ivpu_device *vdev); 78 + void ivpu_ipc_irq_handler(struct ivpu_device *vdev, bool *wake_thread); 79 + irqreturn_t ivpu_ipc_irq_thread_handler(struct ivpu_device *vdev); 93 80 94 81 void ivpu_ipc_consumer_add(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, 95 - u32 channel); 82 + u32 channel, ivpu_ipc_rx_callback_t callback); 96 83 void ivpu_ipc_consumer_del(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons); 97 84 98 85 int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, 99 - struct ivpu_ipc_hdr *ipc_buf, struct vpu_jsm_msg *ipc_payload, 86 + struct ivpu_ipc_hdr *ipc_buf, struct vpu_jsm_msg *jsm_msg, 100 87 unsigned long timeout_ms); 101 88 89 + int 
ivpu_ipc_send_receive_active(struct ivpu_device *vdev, struct vpu_jsm_msg *req, 90 + enum vpu_ipc_msg_type expected_resp, struct vpu_jsm_msg *resp, 91 + u32 channel, unsigned long timeout_ms); 102 92 int ivpu_ipc_send_receive(struct ivpu_device *vdev, struct vpu_jsm_msg *req, 103 - enum vpu_ipc_msg_type expected_resp_type, 104 - struct vpu_jsm_msg *resp, u32 channel, 105 - unsigned long timeout_ms); 93 + enum vpu_ipc_msg_type expected_resp, struct vpu_jsm_msg *resp, 94 + u32 channel, unsigned long timeout_ms); 106 95 107 96 #endif /* __IVPU_IPC_H__ */

+33 -70

drivers/accel/ivpu/ivpu_job.c

··· 7 7 8 8 #include <linux/bitfield.h> 9 9 #include <linux/highmem.h> 10 - #include <linux/kthread.h> 11 10 #include <linux/pci.h> 12 11 #include <linux/module.h> 13 12 #include <uapi/drm/ivpu_accel.h> ··· 22 23 #define JOB_ID_JOB_MASK GENMASK(7, 0) 23 24 #define JOB_ID_CONTEXT_MASK GENMASK(31, 8) 24 25 #define JOB_MAX_BUFFER_COUNT 65535 25 - 26 - static unsigned int ivpu_tdr_timeout_ms; 27 - module_param_named(tdr_timeout_ms, ivpu_tdr_timeout_ms, uint, 0644); 28 - MODULE_PARM_DESC(tdr_timeout_ms, "Timeout for device hang detection, in milliseconds, 0 - default"); 29 26 30 27 static void ivpu_cmdq_ring_db(struct ivpu_device *vdev, struct ivpu_cmdq *cmdq) 31 28 { ··· 191 196 entry->batch_buf_addr = job->cmd_buf_vpu_addr; 192 197 entry->job_id = job->job_id; 193 198 entry->flags = 0; 199 + if (unlikely(ivpu_test_mode & IVPU_TEST_MODE_NULL_SUBMISSION)) 200 + entry->flags = VPU_JOB_FLAGS_NULL_SUBMISSION_MASK; 194 201 wmb(); /* Ensure that tail is updated after filling entry */ 195 202 header->tail = next_entry; 196 203 wmb(); /* Flush WC buffer for jobq header */ ··· 261 264 262 265 for (i = 0; i < job->bo_count; i++) 263 266 if (job->bos[i]) 264 - drm_gem_object_put(&job->bos[i]->base); 267 + drm_gem_object_put(&job->bos[i]->base.base); 265 268 266 269 dma_fence_put(job->done_fence); 267 270 ivpu_file_priv_put(&job->file_priv); ··· 337 340 ivpu_dbg(vdev, JOB, "Job complete: id %3u ctx %2d engine %d status 0x%x\n", 338 341 job->job_id, job->file_priv->ctx.id, job->engine_idx, job_status); 339 342 343 + ivpu_stop_job_timeout_detection(vdev); 344 + 340 345 job_put(job); 341 346 return 0; 342 - } 343 - 344 - static void ivpu_job_done_message(struct ivpu_device *vdev, void *msg) 345 - { 346 - struct vpu_ipc_msg_payload_job_done *payload; 347 - struct vpu_jsm_msg *job_ret_msg = msg; 348 - int ret; 349 - 350 - payload = (struct vpu_ipc_msg_payload_job_done *)&job_ret_msg->payload; 351 - 352 - ret = ivpu_job_done(vdev, payload->job_id, payload->job_status); 353 - if (ret) 
354 - ivpu_err(vdev, "Failed to finish job %d: %d\n", payload->job_id, ret); 355 347 } 356 348 357 349 void ivpu_jobs_abort_all(struct ivpu_device *vdev) ··· 384 398 if (ret) 385 399 goto err_xa_erase; 386 400 401 + ivpu_start_job_timeout_detection(vdev); 402 + 387 403 ivpu_dbg(vdev, JOB, "Job submitted: id %3u addr 0x%llx ctx %2d engine %d next %d\n", 388 404 job->job_id, job->cmd_buf_vpu_addr, file_priv->ctx.id, 389 405 job->engine_idx, cmdq->jobq->header.tail); 390 406 391 - if (ivpu_test_mode == IVPU_TEST_MODE_NULL_HW) { 407 + if (ivpu_test_mode & IVPU_TEST_MODE_NULL_HW) { 392 408 ivpu_job_done(vdev, job->job_id, VPU_JSM_STATUS_SUCCESS); 393 409 cmdq->jobq->header.head = cmdq->jobq->header.tail; 394 410 wmb(); /* Flush WC buffer for jobq header */ ··· 436 448 } 437 449 438 450 bo = job->bos[CMD_BUF_IDX]; 439 - if (!dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_READ)) { 451 + if (!dma_resv_test_signaled(bo->base.base.resv, DMA_RESV_USAGE_READ)) { 440 452 ivpu_warn(vdev, "Buffer is already in use\n"); 441 453 return -EBUSY; 442 454 } ··· 456 468 } 457 469 458 470 for (i = 0; i < buf_count; i++) { 459 - ret = dma_resv_reserve_fences(job->bos[i]->base.resv, 1); 471 + ret = dma_resv_reserve_fences(job->bos[i]->base.base.resv, 1); 460 472 if (ret) { 461 473 ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret); 462 474 goto unlock_reservations; ··· 465 477 466 478 for (i = 0; i < buf_count; i++) { 467 479 usage = (i == CMD_BUF_IDX) ? 
DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_BOOKKEEP; 468 - dma_resv_add_fence(job->bos[i]->base.resv, job->done_fence, usage); 480 + dma_resv_add_fence(job->bos[i]->base.base.resv, job->done_fence, usage); 469 481 } 470 482 471 483 unlock_reservations: ··· 550 562 return ret; 551 563 } 552 564 553 - static int ivpu_job_done_thread(void *arg) 565 + static void 566 + ivpu_job_done_callback(struct ivpu_device *vdev, struct ivpu_ipc_hdr *ipc_hdr, 567 + struct vpu_jsm_msg *jsm_msg) 554 568 { 555 - struct ivpu_device *vdev = (struct ivpu_device *)arg; 556 - struct ivpu_ipc_consumer cons; 557 - struct vpu_jsm_msg jsm_msg; 558 - bool jobs_submitted; 559 - unsigned int timeout; 569 + struct vpu_ipc_msg_payload_job_done *payload; 560 570 int ret; 561 571 562 - ivpu_dbg(vdev, JOB, "Started %s\n", __func__); 563 - 564 - ivpu_ipc_consumer_add(vdev, &cons, VPU_IPC_CHAN_JOB_RET); 565 - 566 - while (!kthread_should_stop()) { 567 - timeout = ivpu_tdr_timeout_ms ? ivpu_tdr_timeout_ms : vdev->timeout.tdr; 568 - jobs_submitted = !xa_empty(&vdev->submitted_jobs_xa); 569 - ret = ivpu_ipc_receive(vdev, &cons, NULL, &jsm_msg, timeout); 570 - if (!ret) { 571 - ivpu_job_done_message(vdev, &jsm_msg); 572 - } else if (ret == -ETIMEDOUT) { 573 - if (jobs_submitted && !xa_empty(&vdev->submitted_jobs_xa)) { 574 - ivpu_err(vdev, "TDR detected, timeout %d ms", timeout); 575 - ivpu_hw_diagnose_failure(vdev); 576 - ivpu_pm_schedule_recovery(vdev); 577 - } 578 - } 572 + if (!jsm_msg) { 573 + ivpu_err(vdev, "IPC message has no JSM payload\n"); 574 + return; 579 575 } 580 576 581 - ivpu_ipc_consumer_del(vdev, &cons); 582 - 583 - ivpu_jobs_abort_all(vdev); 584 - 585 - ivpu_dbg(vdev, JOB, "Stopped %s\n", __func__); 586 - return 0; 587 - } 588 - 589 - int ivpu_job_done_thread_init(struct ivpu_device *vdev) 590 - { 591 - struct task_struct *thread; 592 - 593 - thread = kthread_run(&ivpu_job_done_thread, (void *)vdev, "ivpu_job_done_thread"); 594 - if (IS_ERR(thread)) { 595 - ivpu_err(vdev, "Failed to start job 
completion thread\n"); 596 - return -EIO; 577 + if (jsm_msg->result != VPU_JSM_STATUS_SUCCESS) { 578 + ivpu_err(vdev, "Invalid JSM message result: %d\n", jsm_msg->result); 579 + return; 597 580 } 598 581 599 - get_task_struct(thread); 600 - wake_up_process(thread); 601 - 602 - vdev->job_done_thread = thread; 603 - 604 - return 0; 582 + payload = (struct vpu_ipc_msg_payload_job_done *)&jsm_msg->payload; 583 + ret = ivpu_job_done(vdev, payload->job_id, payload->job_status); 584 + if (!ret && !xa_empty(&vdev->submitted_jobs_xa)) 585 + ivpu_start_job_timeout_detection(vdev); 605 586 } 606 587 607 - void ivpu_job_done_thread_fini(struct ivpu_device *vdev) 588 + void ivpu_job_done_consumer_init(struct ivpu_device *vdev) 608 589 { 609 - kthread_stop_put(vdev->job_done_thread); 590 + ivpu_ipc_consumer_add(vdev, &vdev->job_done_consumer, 591 + VPU_IPC_CHAN_JOB_RET, ivpu_job_done_callback); 592 + } 593 + 594 + void ivpu_job_done_consumer_fini(struct ivpu_device *vdev) 595 + { 596 + ivpu_ipc_consumer_del(vdev, &vdev->job_done_consumer); 610 597 }
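
Note on the `ivpu_test_mode` checks in this file: the comparison changes from `==` to `&`, which matters once more than one test-mode flag can be set at the same time. A small sketch of the difference follows, assuming hypothetical bit assignments; the real `IVPU_TEST_MODE_*` values are defined elsewhere in the driver and are not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit assignments, for illustration only. */
#define SKETCH_TEST_MODE_NULL_HW		(1u << 0)
#define SKETCH_TEST_MODE_NULL_SUBMISSION	(1u << 1)

/* Old-style check: true only when NULL_HW is the sole flag set. */
static bool sketch_null_hw_by_equality(unsigned int mode)
{
	return mode == SKETCH_TEST_MODE_NULL_HW;
}

/* New-style check: true whenever the NULL_HW bit is set. */
static bool sketch_null_hw_by_mask(unsigned int mode)
{
	return (mode & SKETCH_TEST_MODE_NULL_HW) != 0;
}
```

With both flags set, the equality form reports false while the mask form reports true, so the mask form is the one that composes with the new `IVPU_TEST_MODE_NULL_SUBMISSION` flag.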

+2 -2

drivers/accel/ivpu/ivpu_job.h

@@ -59,8 +59,8 @@
 void ivpu_cmdq_release_all(struct ivpu_file_priv *file_priv);
 void ivpu_cmdq_reset_all_contexts(struct ivpu_device *vdev);

-int ivpu_job_done_thread_init(struct ivpu_device *vdev);
-void ivpu_job_done_thread_fini(struct ivpu_device *vdev);
+void ivpu_job_done_consumer_init(struct ivpu_device *vdev);
+void ivpu_job_done_consumer_fini(struct ivpu_device *vdev);

 void ivpu_jobs_abort_all(struct ivpu_device *vdev);

+38

drivers/accel/ivpu/ivpu_jsm_msg.c

··· 4 4 */ 5 5 6 6 #include "ivpu_drv.h" 7 + #include "ivpu_hw.h" 7 8 #include "ivpu_ipc.h" 8 9 #include "ivpu_jsm_msg.h" 9 10 ··· 37 36 IVPU_CASE_TO_STR(VPU_JSM_MSG_DESTROY_CMD_QUEUE); 38 37 IVPU_CASE_TO_STR(VPU_JSM_MSG_SET_CONTEXT_SCHED_PROPERTIES); 39 38 IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_REGISTER_DB); 39 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_CMDQ); 40 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SUSPEND_CMDQ); 41 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_CMDQ_RSP); 42 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SUSPEND_CMDQ_DONE); 43 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG); 44 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG_RSP); 45 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION); 46 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_ENGINE_RESUME); 47 + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE); 48 + IVPU_CASE_TO_STR(VPU_JSM_MSG_STATE_DUMP); 49 + IVPU_CASE_TO_STR(VPU_JSM_MSG_STATE_DUMP_RSP); 40 50 IVPU_CASE_TO_STR(VPU_JSM_MSG_BLOB_DEINIT); 41 51 IVPU_CASE_TO_STR(VPU_JSM_MSG_DYNDBG_CONTROL); 42 52 IVPU_CASE_TO_STR(VPU_JSM_MSG_JOB_DONE); ··· 77 65 IVPU_CASE_TO_STR(VPU_JSM_MSG_SET_CONTEXT_SCHED_PROPERTIES_RSP); 78 66 IVPU_CASE_TO_STR(VPU_JSM_MSG_BLOB_DEINIT_DONE); 79 67 IVPU_CASE_TO_STR(VPU_JSM_MSG_DYNDBG_CONTROL_RSP); 68 + IVPU_CASE_TO_STR(VPU_JSM_MSG_PWR_D0I3_ENTER); 69 + IVPU_CASE_TO_STR(VPU_JSM_MSG_PWR_D0I3_ENTER_DONE); 70 + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_ENABLE); 71 + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_ENABLE_DONE); 72 + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_DISABLE); 73 + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_DISABLE_DONE); 80 74 } 81 75 #undef IVPU_CASE_TO_STR 82 76 ··· 260 242 261 243 return ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_SSID_RELEASE_DONE, &resp, 262 244 VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm); 245 + } 246 + 247 + int ivpu_jsm_pwr_d0i3_enter(struct ivpu_device *vdev) 248 + { 249 + struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_PWR_D0I3_ENTER }; 250 + struct vpu_jsm_msg resp; 251 + int ret; 252 + 253 + if (IVPU_WA(disable_d0i3_msg)) 
254 + return 0; 255 + 256 + req.payload.pwr_d0i3_enter.send_response = 1; 257 + 258 + ret = ivpu_ipc_send_receive_active(vdev, &req, VPU_JSM_MSG_PWR_D0I3_ENTER_DONE, 259 + &resp, VPU_IPC_CHAN_GEN_CMD, 260 + vdev->timeout.d0i3_entry_msg); 261 + if (ret) 262 + return ret; 263 + 264 + return ivpu_hw_wait_for_idle(vdev); 263 265 }

+1

drivers/accel/ivpu/ivpu_jsm_msg.h

@@ -22,4 +22,5 @@
 int ivpu_jsm_trace_set_config(struct ivpu_device *vdev, u32 trace_level, u32 trace_destination_mask,
 			      u64 trace_hw_component_mask);
 int ivpu_jsm_context_release(struct ivpu_device *vdev, u32 host_ssid);
+int ivpu_jsm_pwr_d0i3_enter(struct ivpu_device *vdev);
 #endif

+36 -8

drivers/accel/ivpu/ivpu_mmu.c

··· 230 230 (REG_FLD(IVPU_MMU_REG_GERROR, MSI_PRIQ_ABT)) | \ 231 231 (REG_FLD(IVPU_MMU_REG_GERROR, MSI_ABT))) 232 232 233 - static char *ivpu_mmu_event_to_str(u32 cmd) 233 + #define IVPU_MMU_CERROR_NONE 0x0 234 + #define IVPU_MMU_CERROR_ILL 0x1 235 + #define IVPU_MMU_CERROR_ABT 0x2 236 + #define IVPU_MMU_CERROR_ATC_INV_SYNC 0x3 237 + 238 + static const char *ivpu_mmu_event_to_str(u32 cmd) 234 239 { 235 240 switch (cmd) { 236 241 case IVPU_MMU_EVT_F_UUT: ··· 278 273 return "Fetch of VMS caused external abort"; 279 274 default: 280 275 return "Unknown CMDQ command"; 276 + } 277 + } 278 + 279 + static const char *ivpu_mmu_cmdq_err_to_str(u32 err) 280 + { 281 + switch (err) { 282 + case IVPU_MMU_CERROR_NONE: 283 + return "No CMDQ Error"; 284 + case IVPU_MMU_CERROR_ILL: 285 + return "Illegal command"; 286 + case IVPU_MMU_CERROR_ABT: 287 + return "External abort on CMDQ read"; 288 + case IVPU_MMU_CERROR_ATC_INV_SYNC: 289 + return "Sync failed to complete ATS invalidation"; 290 + default: 291 + return "Unknown CMDQ Error"; 281 292 } 282 293 } 283 294 ··· 500 479 u64 val; 501 480 int ret; 502 481 503 - val = FIELD_PREP(IVPU_MMU_CMD_OPCODE, CMD_SYNC) | 504 - FIELD_PREP(IVPU_MMU_CMD_SYNC_0_CS, 0x2) | 505 - FIELD_PREP(IVPU_MMU_CMD_SYNC_0_MSH, 0x3) | 506 - FIELD_PREP(IVPU_MMU_CMD_SYNC_0_MSI_ATTR, 0xf); 482 + val = FIELD_PREP(IVPU_MMU_CMD_OPCODE, CMD_SYNC); 507 483 508 484 ret = ivpu_mmu_cmdq_cmd_write(vdev, "SYNC", val, 0); 509 485 if (ret) ··· 510 492 REGV_WR32(IVPU_MMU_REG_CMDQ_PROD, q->prod); 511 493 512 494 ret = ivpu_mmu_cmdq_wait_for_cons(vdev); 513 - if (ret) 514 - ivpu_err(vdev, "Timed out waiting for consumer: %d\n", ret); 495 + if (ret) { 496 + u32 err; 497 + 498 + val = REGV_RD32(IVPU_MMU_REG_CMDQ_CONS); 499 + err = REG_GET_FLD(IVPU_MMU_REG_CMDQ_CONS, ERR, val); 500 + 501 + ivpu_err(vdev, "Timed out waiting for MMU consumer: %d, error: %s\n", ret, 502 + ivpu_mmu_cmdq_err_to_str(err)); 503 + } 515 504 516 505 return ret; 517 506 } ··· 775 750 776 751 ivpu_dbg(vdev, 
MMU, "Init..\n"); 777 752 778 - drmm_mutex_init(&vdev->drm, &mmu->lock); 779 753 ivpu_mmu_config_check(vdev); 754 + 755 + ret = drmm_mutex_init(&vdev->drm, &mmu->lock); 756 + if (ret) 757 + return ret; 780 758 781 759 ret = ivpu_mmu_structs_alloc(vdev); 782 760 if (ret)
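
Note on the new timeout message in the SYNC path above: the driver reads `IVPU_MMU_REG_CMDQ_CONS`, extracts its `ERR` field, and maps it to a string. The sketch below reproduces that decoding in userspace. The error codes and strings are taken from the hunk; the field position (bits 30:24) is an assumption based on the Arm SMMUv3 register layout and is not shown in this diff, and the `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed position of the ERR field in CMDQ_CONS (bits 30:24). */
#define SKETCH_CMDQ_CONS_ERR_SHIFT	24
#define SKETCH_CMDQ_CONS_ERR_MASK	(0x7fu << SKETCH_CMDQ_CONS_ERR_SHIFT)

/* Error codes as defined in the hunk above. */
#define SKETCH_CERROR_NONE		0x0
#define SKETCH_CERROR_ILL		0x1
#define SKETCH_CERROR_ABT		0x2
#define SKETCH_CERROR_ATC_INV_SYNC	0x3

/* Extract the ERR field from a raw CMDQ_CONS register value. */
static uint32_t sketch_cmdq_cons_err(uint32_t cons)
{
	return (cons & SKETCH_CMDQ_CONS_ERR_MASK) >> SKETCH_CMDQ_CONS_ERR_SHIFT;
}

/* Same mapping as ivpu_mmu_cmdq_err_to_str() in the hunk above. */
static const char *sketch_cmdq_err_to_str(uint32_t err)
{
	switch (err) {
	case SKETCH_CERROR_NONE:
		return "No CMDQ Error";
	case SKETCH_CERROR_ILL:
		return "Illegal command";
	case SKETCH_CERROR_ABT:
		return "External abort on CMDQ read";
	case SKETCH_CERROR_ATC_INV_SYNC:
		return "Sync failed to complete ATS invalidation";
	default:
		return "Unknown CMDQ Error";
	}
}
```

The low bits of the register carry the consumer index, so they are masked off before the code is looked up.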

+86 -67

drivers/accel/ivpu/ivpu_mmu_context.c

··· 5 5 6 6 #include <linux/bitfield.h> 7 7 #include <linux/highmem.h> 8 + #include <linux/set_memory.h> 9 + 10 + #include <drm/drm_cache.h> 8 11 9 12 #include "ivpu_drv.h" 10 13 #include "ivpu_hw.h" ··· 42 39 #define IVPU_MMU_ENTRY_MAPPED (IVPU_MMU_ENTRY_FLAG_AF | IVPU_MMU_ENTRY_FLAG_USER | \ 43 40 IVPU_MMU_ENTRY_FLAG_NG | IVPU_MMU_ENTRY_VALID) 44 41 42 + static void *ivpu_pgtable_alloc_page(struct ivpu_device *vdev, dma_addr_t *dma) 43 + { 44 + dma_addr_t dma_addr; 45 + struct page *page; 46 + void *cpu; 47 + 48 + page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO); 49 + if (!page) 50 + return NULL; 51 + 52 + set_pages_array_wc(&page, 1); 53 + 54 + dma_addr = dma_map_page(vdev->drm.dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL); 55 + if (dma_mapping_error(vdev->drm.dev, dma_addr)) 56 + goto err_free_page; 57 + 58 + cpu = vmap(&page, 1, VM_MAP, pgprot_writecombine(PAGE_KERNEL)); 59 + if (!cpu) 60 + goto err_dma_unmap_page; 61 + 62 + 63 + *dma = dma_addr; 64 + return cpu; 65 + 66 + err_dma_unmap_page: 67 + dma_unmap_page(vdev->drm.dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL); 68 + 69 + err_free_page: 70 + put_page(page); 71 + return NULL; 72 + } 73 + 74 + static void ivpu_pgtable_free_page(struct ivpu_device *vdev, u64 *cpu_addr, dma_addr_t dma_addr) 75 + { 76 + struct page *page; 77 + 78 + if (cpu_addr) { 79 + page = vmalloc_to_page(cpu_addr); 80 + vunmap(cpu_addr); 81 + dma_unmap_page(vdev->drm.dev, dma_addr & ~IVPU_MMU_ENTRY_FLAGS_MASK, PAGE_SIZE, 82 + DMA_BIDIRECTIONAL); 83 + set_pages_array_wb(&page, 1); 84 + put_page(page); 85 + } 86 + } 87 + 45 88 static int ivpu_mmu_pgtable_init(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable) 46 89 { 47 90 dma_addr_t pgd_dma; 48 91 49 - pgtable->pgd_dma_ptr = dma_alloc_coherent(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pgd_dma, 50 - GFP_KERNEL); 92 + pgtable->pgd_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pgd_dma); 51 93 if (!pgtable->pgd_dma_ptr) 52 94 return -ENOMEM; 53 95 54 96 pgtable->pgd_dma = pgd_dma; 55 
97 56 98 return 0; 57 - } 58 - 59 - static void ivpu_mmu_pgtable_free(struct ivpu_device *vdev, u64 *cpu_addr, dma_addr_t dma_addr) 60 - { 61 - if (cpu_addr) 62 - dma_free_coherent(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, cpu_addr, 63 - dma_addr & ~IVPU_MMU_ENTRY_FLAGS_MASK); 64 99 } 65 100 66 101 static void ivpu_mmu_pgtables_free(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable) ··· 125 84 pte_dma_ptr = pgtable->pte_ptrs[pgd_idx][pud_idx][pmd_idx]; 126 85 pte_dma = pgtable->pmd_ptrs[pgd_idx][pud_idx][pmd_idx]; 127 86 128 - ivpu_mmu_pgtable_free(vdev, pte_dma_ptr, pte_dma); 87 + ivpu_pgtable_free_page(vdev, pte_dma_ptr, pte_dma); 129 88 } 130 89 131 90 kfree(pgtable->pte_ptrs[pgd_idx][pud_idx]); 132 - ivpu_mmu_pgtable_free(vdev, pmd_dma_ptr, pmd_dma); 91 + ivpu_pgtable_free_page(vdev, pmd_dma_ptr, pmd_dma); 133 92 } 134 93 135 94 kfree(pgtable->pmd_ptrs[pgd_idx]); 136 95 kfree(pgtable->pte_ptrs[pgd_idx]); 137 - ivpu_mmu_pgtable_free(vdev, pud_dma_ptr, pud_dma); 96 + ivpu_pgtable_free_page(vdev, pud_dma_ptr, pud_dma); 138 97 } 139 98 140 - ivpu_mmu_pgtable_free(vdev, pgtable->pgd_dma_ptr, pgtable->pgd_dma); 99 + ivpu_pgtable_free_page(vdev, pgtable->pgd_dma_ptr, pgtable->pgd_dma); 141 100 } 142 101 143 102 static u64* ··· 149 108 if (pud_dma_ptr) 150 109 return pud_dma_ptr; 151 110 152 - pud_dma_ptr = dma_alloc_wc(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pud_dma, GFP_KERNEL); 111 + pud_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pud_dma); 153 112 if (!pud_dma_ptr) 154 113 return NULL; 155 114 ··· 172 131 kfree(pgtable->pmd_ptrs[pgd_idx]); 173 132 174 133 err_free_pud_dma_ptr: 175 - ivpu_mmu_pgtable_free(vdev, pud_dma_ptr, pud_dma); 134 + ivpu_pgtable_free_page(vdev, pud_dma_ptr, pud_dma); 176 135 return NULL; 177 136 } 178 137 ··· 186 145 if (pmd_dma_ptr) 187 146 return pmd_dma_ptr; 188 147 189 - pmd_dma_ptr = dma_alloc_wc(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pmd_dma, GFP_KERNEL); 148 + pmd_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pmd_dma); 190 149 if 
(!pmd_dma_ptr) 191 150 return NULL; 192 151 ··· 201 160 return pmd_dma_ptr; 202 161 203 162 err_free_pmd_dma_ptr: 204 - ivpu_mmu_pgtable_free(vdev, pmd_dma_ptr, pmd_dma); 163 + ivpu_pgtable_free_page(vdev, pmd_dma_ptr, pmd_dma); 205 164 return NULL; 206 165 } 207 166 ··· 215 174 if (pte_dma_ptr) 216 175 return pte_dma_ptr; 217 176 218 - pte_dma_ptr = dma_alloc_wc(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pte_dma, GFP_KERNEL); 177 + pte_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pte_dma); 219 178 if (!pte_dma_ptr) 220 179 return NULL; 221 180 ··· 290 249 ctx->pgtable.pte_ptrs[pgd_idx][pud_idx][pmd_idx][pte_idx] = IVPU_MMU_ENTRY_INVALID; 291 250 } 292 251 293 - static void 294 - ivpu_mmu_context_flush_page_tables(struct ivpu_mmu_context *ctx, u64 vpu_addr, size_t size) 295 - { 296 - struct ivpu_mmu_pgtable *pgtable = &ctx->pgtable; 297 - u64 end_addr = vpu_addr + size; 298 - 299 - /* Align to PMD entry (2 MB) */ 300 - vpu_addr &= ~(IVPU_MMU_PTE_MAP_SIZE - 1); 301 - 302 - while (vpu_addr < end_addr) { 303 - int pgd_idx = FIELD_GET(IVPU_MMU_PGD_INDEX_MASK, vpu_addr); 304 - u64 pud_end = (pgd_idx + 1) * (u64)IVPU_MMU_PUD_MAP_SIZE; 305 - 306 - while (vpu_addr < end_addr && vpu_addr < pud_end) { 307 - int pud_idx = FIELD_GET(IVPU_MMU_PUD_INDEX_MASK, vpu_addr); 308 - u64 pmd_end = (pud_idx + 1) * (u64)IVPU_MMU_PMD_MAP_SIZE; 309 - 310 - while (vpu_addr < end_addr && vpu_addr < pmd_end) { 311 - int pmd_idx = FIELD_GET(IVPU_MMU_PMD_INDEX_MASK, vpu_addr); 312 - 313 - clflush_cache_range(pgtable->pte_ptrs[pgd_idx][pud_idx][pmd_idx], 314 - IVPU_MMU_PGTABLE_SIZE); 315 - vpu_addr += IVPU_MMU_PTE_MAP_SIZE; 316 - } 317 - clflush_cache_range(pgtable->pmd_ptrs[pgd_idx][pud_idx], 318 - IVPU_MMU_PGTABLE_SIZE); 319 - } 320 - clflush_cache_range(pgtable->pud_ptrs[pgd_idx], IVPU_MMU_PGTABLE_SIZE); 321 - } 322 - clflush_cache_range(pgtable->pgd_dma_ptr, IVPU_MMU_PGTABLE_SIZE); 323 - } 324 - 325 252 static int 326 253 ivpu_mmu_context_map_pages(struct ivpu_device *vdev, struct ivpu_mmu_context 
*ctx, 327 254 u64 vpu_addr, dma_addr_t dma_addr, size_t size, u64 prot) ··· 336 327 u64 prot; 337 328 u64 i; 338 329 330 + if (drm_WARN_ON(&vdev->drm, !ctx)) 331 + return -EINVAL; 332 + 339 333 if (!IS_ALIGNED(vpu_addr, IVPU_MMU_PAGE_SIZE)) 340 334 return -EINVAL; 341 335 ··· 361 349 mutex_unlock(&ctx->lock); 362 350 return ret; 363 351 } 364 - ivpu_mmu_context_flush_page_tables(ctx, vpu_addr, size); 365 352 vpu_addr += size; 366 353 } 367 354 355 + /* Ensure page table modifications are flushed from wc buffers to memory */ 356 + wmb(); 368 357 mutex_unlock(&ctx->lock); 369 358 370 359 ret = ivpu_mmu_invalidate_tlb(vdev, ctx->id); ··· 382 369 int ret; 383 370 u64 i; 384 371 385 - if (!IS_ALIGNED(vpu_addr, IVPU_MMU_PAGE_SIZE)) 386 - ivpu_warn(vdev, "Unaligned vpu_addr: 0x%llx\n", vpu_addr); 372 + if (drm_WARN_ON(&vdev->drm, !ctx)) 373 + return; 387 374 388 375 mutex_lock(&ctx->lock); 389 376 ··· 391 378 size_t size = sg_dma_len(sg) + sg->offset; 392 379 393 380 ivpu_mmu_context_unmap_pages(ctx, vpu_addr, size); 394 - ivpu_mmu_context_flush_page_tables(ctx, vpu_addr, size); 395 381 vpu_addr += size; 396 382 } 397 383 384 + /* Ensure page table modifications are flushed from wc buffers to memory */ 385 + wmb(); 398 386 mutex_unlock(&ctx->lock); 399 387 400 388 ret = ivpu_mmu_invalidate_tlb(vdev, ctx->id); ··· 404 390 } 405 391 406 392 int 407 - ivpu_mmu_context_insert_node_locked(struct ivpu_mmu_context *ctx, 408 - const struct ivpu_addr_range *range, 409 - u64 size, struct drm_mm_node *node) 393 + ivpu_mmu_context_insert_node(struct ivpu_mmu_context *ctx, const struct ivpu_addr_range *range, 394 + u64 size, struct drm_mm_node *node) 410 395 { 411 - lockdep_assert_held(&ctx->lock); 396 + int ret; 412 397 398 + WARN_ON(!range); 399 + 400 + mutex_lock(&ctx->lock); 413 401 if (!ivpu_disable_mmu_cont_pages && size >= IVPU_MMU_CONT_PAGES_SIZE) { 414 - if (!drm_mm_insert_node_in_range(&ctx->mm, node, size, IVPU_MMU_CONT_PAGES_SIZE, 0, 415 - range->start, range->end, 
DRM_MM_INSERT_BEST)) 416 - return 0; 402 + ret = drm_mm_insert_node_in_range(&ctx->mm, node, size, IVPU_MMU_CONT_PAGES_SIZE, 0, 403 + range->start, range->end, DRM_MM_INSERT_BEST); 404 + if (!ret) 405 + goto unlock; 417 406 } 418 407 419 - return drm_mm_insert_node_in_range(&ctx->mm, node, size, IVPU_MMU_PAGE_SIZE, 0, 420 - range->start, range->end, DRM_MM_INSERT_BEST); 408 + ret = drm_mm_insert_node_in_range(&ctx->mm, node, size, IVPU_MMU_PAGE_SIZE, 0, 409 + range->start, range->end, DRM_MM_INSERT_BEST); 410 + unlock: 411 + mutex_unlock(&ctx->lock); 412 + return ret; 421 413 } 422 414 423 415 void 424 - ivpu_mmu_context_remove_node_locked(struct ivpu_mmu_context *ctx, struct drm_mm_node *node) 416 + ivpu_mmu_context_remove_node(struct ivpu_mmu_context *ctx, struct drm_mm_node *node) 425 417 { 426 - lockdep_assert_held(&ctx->lock); 427 - 418 + mutex_lock(&ctx->lock); 428 419 drm_mm_remove_node(node); 420 + mutex_unlock(&ctx->lock); 429 421 } 430 422 431 423 static int ··· 441 421 int ret; 442 422 443 423 mutex_init(&ctx->lock); 444 - INIT_LIST_HEAD(&ctx->bo_list); 445 424 446 425 ret = ivpu_mmu_pgtable_init(vdev, &ctx->pgtable); 447 426 if (ret) {
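The reworked `ivpu_mmu_context_insert_node()` in the hunk above takes `ctx->lock` itself, tries the contiguous-pages alignment first when the size is large enough, and falls back to plain page alignment if that fails. Below is a minimal userspace sketch of that try-then-fall-back shape. It assumes a toy first-fit placement in place of `drm_mm_insert_node_in_range()` and omits the locking; `SKETCH_PAGE_SIZE`, `SKETCH_CONT_PAGES_SIZE`, `place_aligned()` and `insert_with_fallback()` are hypothetical names, and the two sizes are illustrative rather than taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes for illustration only; not taken from the driver. */
#define SKETCH_PAGE_SIZE       0x1000ULL   /* 4 KiB */
#define SKETCH_CONT_PAGES_SIZE 0x10000ULL  /* 64 KiB */

/*
 * Toy stand-in for drm_mm_insert_node_in_range(): place `size` bytes at the
 * first `align`-aligned address inside [start, end). Returns 0 and stores
 * the address in *out on success, or -1 if the block does not fit.
 */
static int place_aligned(uint64_t start, uint64_t end, uint64_t size,
			 uint64_t align, uint64_t *out)
{
	uint64_t addr;

	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

	addr = (start + align - 1) & ~(align - 1);
	if (addr < start || addr > end || size > end - addr)
		return -1;

	*out = addr;
	return 0;
}

/* Try the large alignment first; fall back to page alignment on failure. */
static int insert_with_fallback(uint64_t start, uint64_t end, uint64_t size,
				uint64_t *out)
{
	if (size >= SKETCH_CONT_PAGES_SIZE &&
	    !place_aligned(start, end, size, SKETCH_CONT_PAGES_SIZE, out))
		return 0;

	return place_aligned(start, end, size, SKETCH_PAGE_SIZE, out);
}
```

In this sketch, a 64 KiB request in the range `[0x1000, 0x18000)` cannot be placed at a 64 KiB boundary, so the second attempt places it at the page-aligned address `0x1000`.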

+4 -7

drivers/accel/ivpu/ivpu_mmu_context.h

···
 };

 struct ivpu_mmu_context {
-	struct mutex lock; /* protects: mm, pgtable, bo_list */
+	struct mutex lock; /* Protects: mm, pgtable */
 	struct drm_mm mm;
 	struct ivpu_mmu_pgtable pgtable;
-	struct list_head bo_list;
 	u32 id;
 };

···
 void ivpu_mmu_user_context_fini(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx);
 void ivpu_mmu_user_context_mark_invalid(struct ivpu_device *vdev, u32 ssid);

-int ivpu_mmu_context_insert_node_locked(struct ivpu_mmu_context *ctx,
-					const struct ivpu_addr_range *range,
-					u64 size, struct drm_mm_node *node);
-void ivpu_mmu_context_remove_node_locked(struct ivpu_mmu_context *ctx,
-					 struct drm_mm_node *node);
+int ivpu_mmu_context_insert_node(struct ivpu_mmu_context *ctx, const struct ivpu_addr_range *range,
+				 u64 size, struct drm_mm_node *node);
+void ivpu_mmu_context_remove_node(struct ivpu_mmu_context *ctx, struct drm_mm_node *node);

 int ivpu_mmu_context_map_sgt(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx,
 			     u64 vpu_addr, struct sg_table *sgt, bool llc_coherent);

+58 -14

drivers/accel/ivpu/ivpu_pm.c

··· 15 15 #include "ivpu_fw.h" 16 16 #include "ivpu_ipc.h" 17 17 #include "ivpu_job.h" 18 + #include "ivpu_jsm_msg.h" 18 19 #include "ivpu_mmu.h" 19 20 #include "ivpu_pm.h" 20 21 21 22 static bool ivpu_disable_recovery; 22 23 module_param_named_unsafe(disable_recovery, ivpu_disable_recovery, bool, 0644); 23 24 MODULE_PARM_DESC(disable_recovery, "Disables recovery when VPU hang is detected"); 25 + 26 + static unsigned long ivpu_tdr_timeout_ms; 27 + module_param_named(tdr_timeout_ms, ivpu_tdr_timeout_ms, ulong, 0644); 28 + MODULE_PARM_DESC(tdr_timeout_ms, "Timeout for device hang detection, in milliseconds, 0 - default"); 24 29 25 30 #define PM_RESCHEDULE_LIMIT 5 26 31 ··· 74 69 ret = ivpu_hw_power_up(vdev); 75 70 if (ret) { 76 71 ivpu_err(vdev, "Failed to power up HW: %d\n", ret); 77 - return ret; 72 + goto err_power_down; 78 73 } 79 74 80 75 ret = ivpu_mmu_enable(vdev); 81 76 if (ret) { 82 77 ivpu_err(vdev, "Failed to resume MMU: %d\n", ret); 83 - ivpu_hw_power_down(vdev); 84 - return ret; 78 + goto err_power_down; 85 79 } 86 80 87 81 ret = ivpu_boot(vdev); 88 - if (ret) { 89 - ivpu_mmu_disable(vdev); 90 - ivpu_hw_power_down(vdev); 91 - if (!ivpu_fw_is_cold_boot(vdev)) { 92 - ivpu_warn(vdev, "Failed to resume the FW: %d. 
Retrying cold boot..\n", ret); 93 - ivpu_pm_prepare_cold_boot(vdev); 94 - goto retry; 95 - } else { 96 - ivpu_err(vdev, "Failed to resume the FW: %d\n", ret); 97 - } 82 + if (ret) 83 + goto err_mmu_disable; 84 + 85 + return 0; 86 + 87 + err_mmu_disable: 88 + ivpu_mmu_disable(vdev); 89 + err_power_down: 90 + ivpu_hw_power_down(vdev); 91 + 92 + if (!ivpu_fw_is_cold_boot(vdev)) { 93 + ivpu_pm_prepare_cold_boot(vdev); 94 + goto retry; 95 + } else { 96 + ivpu_err(vdev, "Failed to resume the FW: %d\n", ret); 98 97 } 99 98 100 99 return ret; ··· 145 136 } 146 137 } 147 138 139 + static void ivpu_job_timeout_work(struct work_struct *work) 140 + { 141 + struct ivpu_pm_info *pm = container_of(work, struct ivpu_pm_info, job_timeout_work.work); 142 + struct ivpu_device *vdev = pm->vdev; 143 + unsigned long timeout_ms = ivpu_tdr_timeout_ms ? ivpu_tdr_timeout_ms : vdev->timeout.tdr; 144 + 145 + ivpu_err(vdev, "TDR detected, timeout %lu ms", timeout_ms); 146 + ivpu_hw_diagnose_failure(vdev); 147 + 148 + ivpu_pm_schedule_recovery(vdev); 149 + } 150 + 151 + void ivpu_start_job_timeout_detection(struct ivpu_device *vdev) 152 + { 153 + unsigned long timeout_ms = ivpu_tdr_timeout_ms ? 
ivpu_tdr_timeout_ms : vdev->timeout.tdr; 154 + 155 + /* No-op if already queued */ 156 + queue_delayed_work(system_wq, &vdev->pm->job_timeout_work, msecs_to_jiffies(timeout_ms)); 157 + } 158 + 159 + void ivpu_stop_job_timeout_detection(struct ivpu_device *vdev) 160 + { 161 + cancel_delayed_work_sync(&vdev->pm->job_timeout_work); 162 + } 163 + 148 164 int ivpu_pm_suspend_cb(struct device *dev) 149 165 { 150 166 struct drm_device *drm = dev_get_drvdata(dev); ··· 186 152 return -EBUSY; 187 153 } 188 154 } 155 + 156 + ivpu_jsm_pwr_d0i3_enter(vdev); 189 157 190 158 ivpu_suspend(vdev); 191 159 ivpu_pm_prepare_warm_boot(vdev); ··· 224 188 { 225 189 struct drm_device *drm = dev_get_drvdata(dev); 226 190 struct ivpu_device *vdev = to_ivpu_device(drm); 191 + bool hw_is_idle = true; 227 192 int ret; 228 193 229 194 ivpu_dbg(vdev, PM, "Runtime suspend..\n"); ··· 237 200 return -EAGAIN; 238 201 } 239 202 203 + if (!vdev->pm->suspend_reschedule_counter) 204 + hw_is_idle = false; 205 + else if (ivpu_jsm_pwr_d0i3_enter(vdev)) 206 + hw_is_idle = false; 207 + 240 208 ret = ivpu_suspend(vdev); 241 209 if (ret) 242 210 ivpu_err(vdev, "Failed to set suspend VPU: %d\n", ret); 243 211 244 - if (!vdev->pm->suspend_reschedule_counter) { 212 + if (!hw_is_idle) { 245 213 ivpu_warn(vdev, "VPU failed to enter idle, force suspended.\n"); 246 214 ivpu_pm_prepare_cold_boot(vdev); 247 215 } else { ··· 346 304 347 305 atomic_set(&pm->in_reset, 0); 348 306 INIT_WORK(&pm->recovery_work, ivpu_pm_recovery_work); 307 + INIT_DELAYED_WORK(&pm->job_timeout_work, ivpu_job_timeout_work); 349 308 350 309 if (ivpu_disable_recovery) 351 310 delay = -1; ··· 361 318 362 319 void ivpu_pm_cancel_recovery(struct ivpu_device *vdev) 363 320 { 321 + drm_WARN_ON(&vdev->drm, delayed_work_pending(&vdev->pm->job_timeout_work)); 364 322 cancel_work_sync(&vdev->pm->recovery_work); 365 323 } 366 324
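Both `ivpu_job_timeout_work()` and `ivpu_start_job_timeout_detection()` in the hunk above pick the timeout the same way: a non-zero `tdr_timeout_ms` module parameter wins, otherwise the per-device default `vdev->timeout.tdr` applies. A minimal standalone sketch of that selection follows. The helper names `effective_tdr_timeout_ms()` and `tdr_timeout_selfcheck()` are hypothetical and do not exist in the driver, and the millisecond values are made up for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the driver). It mirrors the expression
 *   ivpu_tdr_timeout_ms ? ivpu_tdr_timeout_ms : vdev->timeout.tdr
 * where 0 for the module parameter means "use the device default".
 */
static unsigned long effective_tdr_timeout_ms(unsigned long param_ms,
					      unsigned long device_default_ms)
{
	return param_ms ? param_ms : device_default_ms;
}

/* Usage example with illustrative values. */
void tdr_timeout_selfcheck(void)
{
	/* Parameter left at 0: the device default is used. */
	assert(effective_tdr_timeout_ms(0, 2000) == 2000);
	/* Parameter set: it overrides the device default. */
	assert(effective_tdr_timeout_ms(500, 2000) == 500);
}
```

Because the parameter is registered with mode `0644`, it can presumably be changed at runtime, which would explain why the diff re-evaluates the expression in both functions rather than caching the result.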

+3

drivers/accel/ivpu/ivpu_pm.h

···

 struct ivpu_pm_info {
 	struct ivpu_device *vdev;
+	struct delayed_work job_timeout_work;
 	struct work_struct recovery_work;
 	atomic_t in_reset;
 	atomic_t reset_counter;
···
 void ivpu_rpm_put(struct ivpu_device *vdev);

 void ivpu_pm_schedule_recovery(struct ivpu_device *vdev);
+void ivpu_start_job_timeout_detection(struct ivpu_device *vdev);
+void ivpu_stop_job_timeout_detection(struct ivpu_device *vdev);

 #endif /* __IVPU_PM_H__ */

+81 -9

drivers/accel/ivpu/vpu_boot_api.h

··· 11 11 * The bellow values will be used to construct the version info this way: 12 12 * fw_bin_header->api_version[VPU_BOOT_API_VER_ID] = (VPU_BOOT_API_VER_MAJOR << 16) | 13 13 * VPU_BOOT_API_VER_MINOR; 14 - * VPU_BOOT_API_VER_PATCH will be ignored. KMD and compatibility is not affected if this changes. 14 + * VPU_BOOT_API_VER_PATCH will be ignored. KMD and compatibility is not affected if this changes 15 + * This information is collected by using vpuip_2/application/vpuFirmware/make_std_fw_image.py 16 + * If a header is missing this info we ignore the header, if a header is missing or contains 17 + * partial info a build error will be generated. 15 18 */ 16 19 17 20 /* ··· 27 24 * Minor version changes when API backward compatibility is preserved. 28 25 * Resets to 0 if Major version is incremented. 29 26 */ 30 - #define VPU_BOOT_API_VER_MINOR 12 27 + #define VPU_BOOT_API_VER_MINOR 20 31 28 32 29 /* 33 30 * API header changed (field names, documentation, formatting) but API itself has not been changed 34 31 */ 35 - #define VPU_BOOT_API_VER_PATCH 2 32 + #define VPU_BOOT_API_VER_PATCH 4 36 33 37 34 /* 38 35 * Index in the API version table ··· 66 63 /* Size of memory require for firmware execution */ 67 64 u32 runtime_size; 68 65 u32 shave_nn_fw_size; 66 + /* Size of primary preemption buffer. */ 67 + u32 preemption_buffer_1_size; 68 + /* Size of secondary preemption buffer. */ 69 + u32 preemption_buffer_2_size; 70 + /* Space reserved for future preemption-related fields. */ 71 + u32 preemption_reserved[6]; 69 72 }; 70 73 71 74 /* ··· 96 87 VPU_BOOT_L2_CACHE_CFG_UPA = 0, 97 88 VPU_BOOT_L2_CACHE_CFG_NN = 1, 98 89 VPU_BOOT_L2_CACHE_CFG_NUM = 2 90 + }; 91 + 92 + /** VPU MCA ECC signalling mode. 
By default, no signalling is used */ 93 + enum VPU_BOOT_MCA_ECC_SIGNAL_TYPE { 94 + VPU_BOOT_MCA_ECC_NONE = 0, 95 + VPU_BOOT_MCA_ECC_CORR = 1, 96 + VPU_BOOT_MCA_ECC_FATAL = 2, 97 + VPU_BOOT_MCA_ECC_BOTH = 3 99 98 }; 100 99 101 100 /** ··· 148 131 #define VPU_TRACE_PROC_BIT_ACT_SHV_3 22 149 132 #define VPU_TRACE_PROC_NO_OF_HW_DEVS 23 150 133 151 - /* KMB HW component IDs are sequential, so define first and last IDs. */ 152 - #define VPU_TRACE_PROC_BIT_KMB_FIRST VPU_TRACE_PROC_BIT_LRT 153 - #define VPU_TRACE_PROC_BIT_KMB_LAST VPU_TRACE_PROC_BIT_SHV_15 134 + /* VPU 30xx HW component IDs are sequential, so define first and last IDs. */ 135 + #define VPU_TRACE_PROC_BIT_30XX_FIRST VPU_TRACE_PROC_BIT_LRT 136 + #define VPU_TRACE_PROC_BIT_30XX_LAST VPU_TRACE_PROC_BIT_SHV_15 137 + #define VPU_TRACE_PROC_BIT_KMB_FIRST VPU_TRACE_PROC_BIT_30XX_FIRST 138 + #define VPU_TRACE_PROC_BIT_KMB_LAST VPU_TRACE_PROC_BIT_30XX_LAST 154 139 155 140 struct vpu_boot_l2_cache_config { 156 141 u8 use; ··· 166 147 u32 core_id; 167 148 u32 is_clear_op; 168 149 }; 150 + 151 + /* 152 + * When HW scheduling mode is enabled, a present period is defined. 153 + * It will be used by VPU to swap between normal and focus priorities 154 + * to prevent starving of normal priority band (when implemented). 155 + * Host must provide a valid value at boot time in 156 + * `vpu_focus_present_timer_ms`. If the value provided by the host is not within the 157 + * defined range a default value will be used. Here we define the min. and max. 158 + * allowed values and the and default value of the present period. Units are milliseconds. 159 + */ 160 + #define VPU_PRESENT_CALL_PERIOD_MS_DEFAULT 50 161 + #define VPU_PRESENT_CALL_PERIOD_MS_MIN 16 162 + #define VPU_PRESENT_CALL_PERIOD_MS_MAX 10000 163 + 164 + /** 165 + * Macros to enable various operation modes within the VPU. 166 + * To be defined as part of 32 bit mask. 
167 + */ 168 + #define VPU_OP_MODE_SURVIVABILITY 0x1 169 169 170 170 struct vpu_boot_params { 171 171 u32 magic; ··· 256 218 * the threshold will not be logged); applies to every enabled logging 257 219 * destination and loggable HW component. See 'mvLog_t' enum for acceptable 258 220 * values. 221 + * TODO: EISW-33556: Move log level definition (mvLog_t) to this file. 259 222 */ 260 223 u32 default_trace_level; 261 224 u32 boot_type; ··· 288 249 u32 temp_sensor_period_ms; 289 250 /** PLL ratio for efficient clock frequency */ 290 251 u32 pn_freq_pll_ratio; 291 - u32 pad4[28]; 252 + /** DVFS Mode: Default: 0, Max Performance: 1, On Demand: 2, Power Save: 3 */ 253 + u32 dvfs_mode; 254 + /** 255 + * Depending on DVFS Mode: 256 + * On-demand: Default if 0. 257 + * Bit 0-7 - uint8_t: Highest residency percent 258 + * Bit 8-15 - uint8_t: High residency percent 259 + * Bit 16-23 - uint8_t: Low residency percent 260 + * Bit 24-31 - uint8_t: Lowest residency percent 261 + * Bit 32-35 - unsigned 4b: PLL Ratio increase amount on highest residency 262 + * Bit 36-39 - unsigned 4b: PLL Ratio increase amount on high residency 263 + * Bit 40-43 - unsigned 4b: PLL Ratio decrease amount on low residency 264 + * Bit 44-47 - unsigned 4b: PLL Ratio decrease amount on lowest frequency 265 + * Bit 48-55 - uint8_t: Period (ms) for residency decisions 266 + * Bit 56-63 - uint8_t: Averaging windows (as multiples of period. Max: 30 decimal) 267 + * Power Save/Max Performance: Unused 268 + */ 269 + u64 dvfs_param; 270 + /** 271 + * D0i3 delayed entry 272 + * Bit0: Disable CPU state save on D0i2 entry flow. 273 + * 0: Every D0i2 entry saves state. Save state IPC message ignored. 274 + * 1: IPC message required to save state on D0i3 entry flow. 275 + */ 276 + u32 d0i3_delayed_entry; 277 + /* Time spent by VPU in D0i3 state */ 278 + u64 d0i3_residency_time_us; 279 + /* Value of VPU perf counter at the time of entering D0i3 state . 
*/ 280 + u64 d0i3_entry_vpu_ts; 281 + u32 pad4[20]; 292 282 /* Warm boot information: 0x400 - 0x43F */ 293 283 u32 warm_boot_sections_count; 294 284 u32 warm_boot_start_address_reference; ··· 342 274 u32 vpu_scheduling_mode; 343 275 /* Present call period in milliseconds. */ 344 276 u32 vpu_focus_present_timer_ms; 345 - /* Unused/reserved: 0x478 - 0xFFF */ 346 - u32 pad6[738]; 277 + /* VPU ECC Signaling */ 278 + u32 vpu_uses_ecc_mca_signal; 279 + /* Values defined by VPU_OP_MODE* macros */ 280 + u32 vpu_operation_mode; 281 + /* Unused/reserved: 0x480 - 0xFFF */ 282 + u32 pad6[736]; 347 283 }; 348 284 349 285 /*
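The comment on `dvfs_param` in the hunk above describes a 64-bit value whose bit ranges carry the on-demand DVFS settings. The following sketch shows one way a host could pack those fields, assuming the bit positions listed in that comment. The struct `dvfs_on_demand`, its member names, and the helpers `dvfs_param_pack()` and `dvfs_param_period_ms()` are hypothetical and are not defined by the header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the on-demand fields listed in the comment. */
struct dvfs_on_demand {
	uint8_t highest_residency_pct; /* bits 0-7 */
	uint8_t high_residency_pct;    /* bits 8-15 */
	uint8_t low_residency_pct;     /* bits 16-23 */
	uint8_t lowest_residency_pct;  /* bits 24-31 */
	uint8_t pll_inc_highest;       /* bits 32-35, 4-bit value */
	uint8_t pll_inc_high;          /* bits 36-39, 4-bit value */
	uint8_t pll_dec_low;           /* bits 40-43, 4-bit value */
	uint8_t pll_dec_lowest;        /* bits 44-47, 4-bit value */
	uint8_t period_ms;             /* bits 48-55 */
	uint8_t avg_windows;           /* bits 56-63, at most 30 */
};

/* Pack the fields into the 64-bit layout described for dvfs_param. */
static uint64_t dvfs_param_pack(const struct dvfs_on_demand *p)
{
	assert(p->avg_windows <= 30);

	return (uint64_t)p->highest_residency_pct |
	       (uint64_t)p->high_residency_pct << 8 |
	       (uint64_t)p->low_residency_pct << 16 |
	       (uint64_t)p->lowest_residency_pct << 24 |
	       (uint64_t)(p->pll_inc_highest & 0xF) << 32 |
	       (uint64_t)(p->pll_inc_high & 0xF) << 36 |
	       (uint64_t)(p->pll_dec_low & 0xF) << 40 |
	       (uint64_t)(p->pll_dec_lowest & 0xF) << 44 |
	       (uint64_t)p->period_ms << 48 |
	       (uint64_t)p->avg_windows << 56;
}

/* Extract the decision period (bits 48-55) from a packed value. */
static uint8_t dvfs_param_period_ms(uint64_t param)
{
	return (uint8_t)(param >> 48);
}
```

The 4-bit PLL fields are masked before shifting so an out-of-range input cannot spill into a neighbouring field. Whether firmware rejects or truncates such values is not stated in the header.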

+294 -15

drivers/accel/ivpu/vpu_jsm_api.h

··· 22 22 /* 23 23 * Minor version changes when API backward compatibility is preserved. 24 24 */ 25 - #define VPU_JSM_API_VER_MINOR 0 25 + #define VPU_JSM_API_VER_MINOR 15 26 26 27 27 /* 28 28 * API header changed (field names, documentation, formatting) but API itself has not been changed 29 29 */ 30 - #define VPU_JSM_API_VER_PATCH 1 30 + #define VPU_JSM_API_VER_PATCH 0 31 31 32 32 /* 33 33 * Index in the API version table ··· 84 84 * Job flags bit masks. 85 85 */ 86 86 #define VPU_JOB_FLAGS_NULL_SUBMISSION_MASK 0x00000001 87 + #define VPU_JOB_FLAGS_PRIVATE_DATA_MASK 0xFF000000 87 88 88 89 /* 89 90 * Sizes of the reserved areas in jobs, in bytes. 90 91 */ 91 - #define VPU_JOB_RESERVED_BYTES 16 92 + #define VPU_JOB_RESERVED_BYTES 8 93 + 92 94 /* 93 95 * Sizes of the reserved areas in job queues, in bytes. 94 96 */ ··· 111 109 #define VPU_DYNDBG_CMD_MAX_LEN 96 112 110 113 111 /* 112 + * For HWS command queue scheduling, we can prioritise command queues inside the 113 + * same process with a relative in-process priority. Valid values for relative 114 + * priority are given below - max and min. 115 + */ 116 + #define VPU_HWS_COMMAND_QUEUE_MAX_IN_PROCESS_PRIORITY 7 117 + #define VPU_HWS_COMMAND_QUEUE_MIN_IN_PROCESS_PRIORITY -7 118 + 119 + /* 120 + * For HWS priority scheduling, we can have multiple realtime priority bands. 121 + * They are numbered 0 to a MAX. 122 + */ 123 + #define VPU_HWS_MAX_REALTIME_PRIORITY_LEVEL 31U 124 + 125 + /* 114 126 * Job format. 
115 127 */ 116 128 struct vpu_job_queue_entry { ··· 133 117 u32 flags; /**< Flags bit field, see VPU_JOB_FLAGS_* above */ 134 118 u64 root_page_table_addr; /**< Address of root page table to use for this job */ 135 119 u64 root_page_table_update_counter; /**< Page tables update events counter */ 136 - u64 preemption_buffer_address; /**< Address of the preemption buffer to use for this job */ 137 - u64 preemption_buffer_size; /**< Size of the preemption buffer to use for this job */ 120 + u64 primary_preempt_buf_addr; 121 + /**< Address of the primary preemption buffer to use for this job */ 122 + u32 primary_preempt_buf_size; 123 + /**< Size of the primary preemption buffer to use for this job */ 124 + u32 secondary_preempt_buf_size; 125 + /**< Size of secondary preemption buffer to use for this job */ 126 + u64 secondary_preempt_buf_addr; 127 + /**< Address of secondary preemption buffer to use for this job */ 138 128 u8 reserved_0[VPU_JOB_RESERVED_BYTES]; 139 129 }; 140 130 ··· 172 150 VPU_TRACE_ENTITY_TYPE_DESTINATION = 1, 173 151 /** Loggable HW component (HW entity that can be logged). */ 174 152 VPU_TRACE_ENTITY_TYPE_HW_COMPONENT = 2, 153 + }; 154 + 155 + /* 156 + * HWS specific log buffer header details. 157 + * Total size is 32 bytes. 158 + */ 159 + struct vpu_hws_log_buffer_header { 160 + /* Written by VPU after adding a log entry. Initialised by host to 0. */ 161 + u32 first_free_entry_index; 162 + /* Incremented by VPU every time the VPU overwrites the 0th entry; 163 + * initialised by host to 0. 164 + */ 165 + u32 wraparound_count; 166 + /* 167 + * This is the number of buffers that can be stored in the log buffer provided by the host. 168 + * It is written by host before passing buffer to VPU. VPU should consider it read-only. 169 + */ 170 + u64 num_of_entries; 171 + u64 reserved[2]; 172 + }; 173 + 174 + /* 175 + * HWS specific log buffer entry details. 176 + * Total size is 32 bytes. 
177 + */ 178 + struct vpu_hws_log_buffer_entry { 179 + /* VPU timestamp must be an invariant timer tick (not impacted by DVFS) */ 180 + u64 vpu_timestamp; 181 + /* 182 + * Operation type: 183 + * 0 - context state change 184 + * 1 - queue new work 185 + * 2 - queue unwait sync object 186 + * 3 - queue no more work 187 + * 4 - queue wait sync object 188 + */ 189 + u32 operation_type; 190 + u32 reserved; 191 + /* Operation data depends on operation type */ 192 + u64 operation_data[2]; 175 193 }; 176 194 177 195 /* ··· 290 228 * deallocated or reassigned to another context. 291 229 */ 292 230 VPU_JSM_MSG_HWS_REGISTER_DB = 0x1117, 231 + /** Control command: Log buffer setting */ 232 + VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG = 0x1118, 233 + /* Control command: Suspend command queue. */ 234 + VPU_JSM_MSG_HWS_SUSPEND_CMDQ = 0x1119, 235 + /* Control command: Resume command queue */ 236 + VPU_JSM_MSG_HWS_RESUME_CMDQ = 0x111a, 237 + /* Control command: Resume engine after reset */ 238 + VPU_JSM_MSG_HWS_ENGINE_RESUME = 0x111b, 239 + /* Control command: Enable survivability/DCT mode */ 240 + VPU_JSM_MSG_DCT_ENABLE = 0x111c, 241 + /* Control command: Disable survivability/DCT mode */ 242 + VPU_JSM_MSG_DCT_DISABLE = 0x111d, 243 + /** 244 + * Dump VPU state. To be used for debug purposes only. 245 + * NOTE: Please introduce new ASYNC commands before this one. * 246 + */ 247 + VPU_JSM_MSG_STATE_DUMP = 0x11FF, 293 248 /* IPC Host -> Device, General commands */ 294 249 VPU_JSM_MSG_GENERAL_CMD = 0x1200, 295 250 VPU_JSM_MSG_BLOB_DEINIT = VPU_JSM_MSG_GENERAL_CMD, ··· 315 236 * Linux command: `echo '<dyndbg_cmd>' > <debugfs>/dynamic_debug/control`. 
316 237 */ 317 238 VPU_JSM_MSG_DYNDBG_CONTROL = 0x1201, 239 + /** 240 + * Perform the save procedure for the D0i3 entry 241 + */ 242 + VPU_JSM_MSG_PWR_D0I3_ENTER = 0x1202, 318 243 /* IPC Device -> Host, Job completion */ 319 244 VPU_JSM_MSG_JOB_DONE = 0x2100, 320 245 /* IPC Device -> Host, Async command completion */ ··· 387 304 VPU_JSM_MSG_DESTROY_CMD_QUEUE_RSP = 0x2216, 388 305 /** Response to control command: Set context scheduling properties */ 389 306 VPU_JSM_MSG_SET_CONTEXT_SCHED_PROPERTIES_RSP = 0x2217, 307 + /** Response to control command: Log buffer setting */ 308 + VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG_RSP = 0x2218, 309 + /* IPC Device -> Host, HWS notify index entry of log buffer written */ 310 + VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION = 0x2219, 311 + /* IPC Device -> Host, HWS completion of a context suspend request */ 312 + VPU_JSM_MSG_HWS_SUSPEND_CMDQ_DONE = 0x221a, 313 + /* Response to control command: Resume command queue */ 314 + VPU_JSM_MSG_HWS_RESUME_CMDQ_RSP = 0x221b, 315 + /* Response to control command: Resume engine command response */ 316 + VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE = 0x221c, 317 + /* Response to control command: Enable survivability/DCT mode */ 318 + VPU_JSM_MSG_DCT_ENABLE_DONE = 0x221d, 319 + /* Response to control command: Disable survivability/DCT mode */ 320 + VPU_JSM_MSG_DCT_DISABLE_DONE = 0x221e, 321 + /** 322 + * Response to state dump control command. 323 + * NOTE: Please introduce new ASYNC responses before this one. * 324 + */ 325 + VPU_JSM_MSG_STATE_DUMP_RSP = 0x22FF, 390 326 /* IPC Device -> Host, General command completion */ 391 327 VPU_JSM_MSG_GENERAL_CMD_DONE = 0x2300, 392 328 VPU_JSM_MSG_BLOB_DEINIT_DONE = VPU_JSM_MSG_GENERAL_CMD_DONE, 393 329 /** Response to VPU_JSM_MSG_DYNDBG_CONTROL. 
*/ 394 330 VPU_JSM_MSG_DYNDBG_CONTROL_RSP = 0x2301, 331 + /** 332 + * Acknowledgment of completion of the save procedure initiated by 333 + * VPU_JSM_MSG_PWR_D0I3_ENTER 334 + */ 335 + VPU_JSM_MSG_PWR_D0I3_ENTER_DONE = 0x2302, 395 336 }; 396 337 397 338 enum vpu_ipc_msg_status { VPU_JSM_MSG_FREE, VPU_JSM_MSG_ALLOCATED }; ··· 700 593 * Default quantum in 100ns units for scheduling across processes 701 594 * within a priority band 702 595 */ 703 - u64 process_quantum[VPU_HWS_NUM_PRIORITY_BANDS]; 596 + u32 process_quantum[VPU_HWS_NUM_PRIORITY_BANDS]; 704 597 /* 705 598 * Default grace period in 100ns units for processes that preempt each 706 599 * other within a priority band 707 600 */ 708 - u64 process_grace_period[VPU_HWS_NUM_PRIORITY_BANDS]; 601 + u32 process_grace_period[VPU_HWS_NUM_PRIORITY_BANDS]; 709 602 /* 710 603 * For normal priority band, specifies the target VPU percentage 711 604 * in situations when it's starved by the focus band. ··· 715 608 u32 reserved_0; 716 609 }; 717 610 718 - /* HWS create command queue request */ 611 + /* 612 + * @brief HWS create command queue request. 613 + * Host will create a command queue via this command. 614 + * Note: Cmdq group is a handle of an object which 615 + * may contain one or more command queues. 616 + * @see VPU_JSM_MSG_CREATE_CMD_QUEUE 617 + * @see VPU_JSM_MSG_CREATE_CMD_QUEUE_RSP 618 + */ 719 619 struct vpu_ipc_msg_payload_hws_create_cmdq { 720 620 /* Process id */ 721 621 u64 process_id; 722 622 /* Host SSID */ 723 623 u32 host_ssid; 724 - /* Zero Padding */ 725 - u32 reserved; 624 + /* Engine for which queue is being created */ 625 + u32 engine_idx; 626 + /* 627 + * Cmdq group may be set to 0 or equal to 628 + * cmdq_id while each priority band contains 629 + * only single engine instances. 
630 + */ 631 + u64 cmdq_group; 726 632 /* Command queue id */ 727 633 u64 cmdq_id; 728 634 /* Command queue base */ 729 635 u64 cmdq_base; 730 636 /* Command queue size */ 731 637 u32 cmdq_size; 732 - /* Reserved */ 638 + /* Zero padding */ 733 639 u32 reserved_0; 734 640 }; 735 641 736 - /* HWS create command queue response */ 642 + /* 643 + * @brief HWS create command queue response. 644 + * @see VPU_JSM_MSG_CREATE_CMD_QUEUE 645 + * @see VPU_JSM_MSG_CREATE_CMD_QUEUE_RSP 646 + */ 737 647 struct vpu_ipc_msg_payload_hws_create_cmdq_rsp { 738 648 /* Process id */ 739 649 u64 process_id; 740 650 /* Host SSID */ 741 651 u32 host_ssid; 742 - /* Zero Padding */ 743 - u32 reserved; 652 + /* Engine for which queue is being created */ 653 + u32 engine_idx; 654 + /* Command queue group */ 655 + u64 cmdq_group; 744 656 /* Command queue id */ 745 657 u64 cmdq_id; 746 658 }; ··· 787 661 /* Inside realtime band assigns a further priority */ 788 662 u32 realtime_priority_level; 789 663 /* Priority relative to other contexts in the same process */ 790 - u32 in_process_priority; 664 + s32 in_process_priority; 791 665 /* Zero padding / Reserved */ 792 666 u32 reserved_1; 793 667 /* Context quantum relative to other contexts of same priority in the same process */ ··· 818 692 u64 cmdq_base; 819 693 /* Size of the command queue in bytes. */ 820 694 u64 cmdq_size; 695 + }; 696 + 697 + /* 698 + * @brief Structure to set another buffer to be used for scheduling-related logging. 699 + * The size of the logging buffer and the number of entries is defined as part of the 700 + * buffer itself as described next. 701 + * The log buffer received from the host is made up of; 702 + * - header: 32 bytes in size, as shown in 'struct vpu_hws_log_buffer_header'. 703 + * The header contains the number of log entries in the buffer. 704 + * - log entry: 0 to n-1, each log entry is 32 bytes in size, as shown in 705 + * 'struct vpu_hws_log_buffer_entry'. 
706 + * The entry contains the VPU timestamp, operation type and data. 707 + * The host should provide the notify index value of log buffer to VPU. This is a 708 + * value defined within the log buffer and when written to will generate the 709 + * scheduling log notification. 710 + * The host should set engine_idx and vpu_log_buffer_va to 0 to disable logging 711 + * for a particular engine. 712 + * VPU will handle one log buffer for each of supported engines. 713 + * VPU should allow the logging to consume one host_ssid. 714 + * @see VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG 715 + * @see VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG_RSP 716 + * @see VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION 717 + */ 718 + struct vpu_ipc_msg_payload_hws_set_scheduling_log { 719 + /* Engine ordinal */ 720 + u32 engine_idx; 721 + /* Host SSID */ 722 + u32 host_ssid; 723 + /* 724 + * VPU log buffer virtual address. 725 + * Set to 0 to disable logging for this engine. 726 + */ 727 + u64 vpu_log_buffer_va; 728 + /* 729 + * Notify index of log buffer. VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION 730 + * is generated when an event log is written to this index. 731 + */ 732 + u64 notify_index; 733 + }; 734 + 735 + /* 736 + * @brief The scheduling log notification is generated by VPU when it writes 737 + * an event into the log buffer at the notify_index. VPU notifies host with 738 + * VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION. This is an asynchronous 739 + * message from VPU to host. 740 + * @see VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION 741 + * @see VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG 742 + */ 743 + struct vpu_ipc_msg_payload_hws_scheduling_log_notification { 744 + /* Engine ordinal */ 745 + u32 engine_idx; 746 + /* Zero Padding */ 747 + u32 reserved_0; 748 + }; 749 + 750 + /* 751 + * @brief HWS suspend command queue request and done structure. 
752 + * Host will request the suspend of contexts and VPU will; 753 + * - Suspend all work on this context 754 + * - Preempt any running work 755 + * - Asynchronously perform the above and return success immediately once 756 + * all items above are started successfully 757 + * - Notify the host of completion of these operations via 758 + * VPU_JSM_MSG_HWS_SUSPEND_CMDQ_DONE 759 + * - Reject any other context operations on a context with an in-flight 760 + * suspend request running 761 + * Same structure used when VPU notifies host of completion of a context suspend 762 + * request. The ids and suspend fence value reported in this command will match 763 + * the one in the request from the host to suspend the context. Once suspend is 764 + * complete, VPU will not access any data relating to this command queue until 765 + * it is resumed. 766 + * @see VPU_JSM_MSG_HWS_SUSPEND_CMDQ 767 + * @see VPU_JSM_MSG_HWS_SUSPEND_CMDQ_DONE 768 + */ 769 + struct vpu_ipc_msg_payload_hws_suspend_cmdq { 770 + /* Host SSID */ 771 + u32 host_ssid; 772 + /* Zero Padding */ 773 + u32 reserved_0; 774 + /* Command queue id */ 775 + u64 cmdq_id; 776 + /* 777 + * Suspend fence value - reported by the VPU suspend context 778 + * completed once suspend is complete. 779 + */ 780 + u64 suspend_fence_value; 781 + }; 782 + 783 + /* 784 + * @brief HWS Resume command queue request / response structure. 785 + * Host will request the resume of a context; 786 + * - VPU will resume all work on this context 787 + * - Scheduler will allow this context to be scheduled 788 + * @see VPU_JSM_MSG_HWS_RESUME_CMDQ 789 + * @see VPU_JSM_MSG_HWS_RESUME_CMDQ_RSP 790 + */ 791 + struct vpu_ipc_msg_payload_hws_resume_cmdq { 792 + /* Host SSID */ 793 + u32 host_ssid; 794 + /* Zero Padding */ 795 + u32 reserved_0; 796 + /* Command queue id */ 797 + u64 cmdq_id; 798 + }; 799 + 800 + /* 801 + * @brief HWS Resume engine request / response structure. 
802 + * After a HWS engine reset, all scheduling is stopped on VPU until a engine resume. 803 + * Host shall send this command to resume scheduling of any valid queue. 804 + * @see VPU_JSM_MSG_HWS_RESUME_ENGINE 805 + * @see VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE 806 + */ 807 + struct vpu_ipc_msg_payload_hws_resume_engine { 808 + /* Engine to be resumed */ 809 + u32 engine_idx; 810 + /* Reserved */ 811 + u32 reserved_0; 821 812 }; 822 813 823 814 /** ··· 1181 938 char dyndbg_cmd[VPU_DYNDBG_CMD_MAX_LEN]; 1182 939 }; 1183 940 941 + /** 942 + * Payload for VPU_JSM_MSG_PWR_D0I3_ENTER 943 + * 944 + * This is a bi-directional payload. 945 + */ 946 + struct vpu_ipc_msg_payload_pwr_d0i3_enter { 947 + /** 948 + * 0: VPU_JSM_MSG_PWR_D0I3_ENTER_DONE is not sent to the host driver 949 + * The driver will poll for D0i2 Idle state transitions. 950 + * 1: VPU_JSM_MSG_PWR_D0I3_ENTER_DONE is sent after VPU state save is complete 951 + */ 952 + u32 send_response; 953 + u32 reserved_0; 954 + }; 955 + 956 + /** 957 + * Payload for VPU_JSM_MSG_DCT_ENABLE message. 958 + * 959 + * Default values for DCT active/inactive times are 5.3ms and 30ms respectively, 960 + * corresponding to a 85% duty cycle. This payload allows the host to tune these 961 + * values according to application requirements. 962 + */ 963 + struct vpu_ipc_msg_payload_pwr_dct_control { 964 + /** Duty cycle active time in microseconds */ 965 + u32 dct_active_us; 966 + /** Duty cycle inactive time in microseconds */ 967 + u32 dct_inactive_us; 968 + }; 969 + 1184 970 /* 1185 971 * Payloads union, used to define complete message format. 
1186 972 */ ··· 1246 974 struct vpu_ipc_msg_payload_hws_destroy_cmdq hws_destroy_cmdq; 1247 975 struct vpu_ipc_msg_payload_hws_set_context_sched_properties 1248 976 hws_set_context_sched_properties; 977 + struct vpu_ipc_msg_payload_hws_set_scheduling_log hws_set_scheduling_log; 978 + struct vpu_ipc_msg_payload_hws_scheduling_log_notification hws_scheduling_log_notification; 979 + struct vpu_ipc_msg_payload_hws_suspend_cmdq hws_suspend_cmdq; 980 + struct vpu_ipc_msg_payload_hws_resume_cmdq hws_resume_cmdq; 981 + struct vpu_ipc_msg_payload_hws_resume_engine hws_resume_engine; 982 + struct vpu_ipc_msg_payload_pwr_d0i3_enter pwr_d0i3_enter; 983 + struct vpu_ipc_msg_payload_pwr_dct_control pwr_dct_control; 1249 984 }; 1250 985 1251 986 /*
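
Note on the DCT payload above: the comment quotes defaults of 5.3 ms active and 30 ms inactive next to an 85% figure. The sketch below only works through that arithmetic, assuming both fields are plain microsecond counts as the field comments say. The struct and helper names are hypothetical and are not part of the firmware API. With the quoted defaults the active share comes out near 15% and the inactive share near 85%, so the percentage in the comment appears to describe the inactive portion of the period.

```c
#include <stdint.h>

/* Illustrative mirror of the two payload fields; not the driver's header. */
struct dct_control_sketch {
	uint32_t dct_active_us;   /* duty cycle active time, microseconds */
	uint32_t dct_inactive_us; /* duty cycle inactive time, microseconds */
};

/* Share of one period spent active, in tenths of a percent, rounded down. */
static uint32_t dct_active_permille(const struct dct_control_sketch *p)
{
	uint64_t period = (uint64_t)p->dct_active_us + p->dct_inactive_us;

	if (!period)
		return 0;

	return (uint32_t)(((uint64_t)p->dct_active_us * 1000) / period);
}

/* Share of one period spent inactive, in tenths of a percent, rounded down. */
static uint32_t dct_inactive_permille(const struct dct_control_sketch *p)
{
	uint64_t period = (uint64_t)p->dct_active_us + p->dct_inactive_us;

	if (!period)
		return 0;

	return (uint32_t)(((uint64_t)p->dct_inactive_us * 1000) / period);
}
```

Calling both helpers with 5300 and 30000 reproduces the defaults named in the comment.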

+2 -1

drivers/accel/qaic/Makefile

··· 9 9 mhi_controller.o \ 10 10 qaic_control.o \ 11 11 qaic_data.o \ 12 - qaic_drv.o 12 + qaic_drv.o \ 13 + qaic_timesync.o

+39 -3

drivers/accel/qaic/mhi_controller.c

··· 348 348 .local_elements = 0, 349 349 .event_ring = 0, 350 350 .dir = DMA_TO_DEVICE, 351 - .ee_mask = MHI_CH_EE_SBL | MHI_CH_EE_AMSS, 351 + .ee_mask = MHI_CH_EE_SBL, 352 352 .pollcfg = 0, 353 353 .doorbell = MHI_DB_BRST_DISABLE, 354 354 .lpm_notify = false, ··· 364 364 .local_elements = 0, 365 365 .event_ring = 0, 366 366 .dir = DMA_FROM_DEVICE, 367 - .ee_mask = MHI_CH_EE_SBL | MHI_CH_EE_AMSS, 367 + .ee_mask = MHI_CH_EE_SBL, 368 + .pollcfg = 0, 369 + .doorbell = MHI_DB_BRST_DISABLE, 370 + .lpm_notify = false, 371 + .offload_channel = false, 372 + .doorbell_mode_switch = false, 373 + .auto_queue = false, 374 + .wake_capable = false, 375 + }, 376 + { 377 + .name = "QAIC_TIMESYNC_PERIODIC", 378 + .num = 22, 379 + .num_elements = 32, 380 + .local_elements = 0, 381 + .event_ring = 0, 382 + .dir = DMA_TO_DEVICE, 383 + .ee_mask = MHI_CH_EE_AMSS, 384 + .pollcfg = 0, 385 + .doorbell = MHI_DB_BRST_DISABLE, 386 + .lpm_notify = false, 387 + .offload_channel = false, 388 + .doorbell_mode_switch = false, 389 + .auto_queue = false, 390 + .wake_capable = false, 391 + }, 392 + { 393 + .num = 23, 394 + .name = "QAIC_TIMESYNC_PERIODIC", 395 + .num_elements = 32, 396 + .local_elements = 0, 397 + .event_ring = 0, 398 + .dir = DMA_FROM_DEVICE, 399 + .ee_mask = MHI_CH_EE_AMSS, 368 400 .pollcfg = 0, 369 401 .doorbell = MHI_DB_BRST_DISABLE, 370 402 .lpm_notify = false, ··· 500 468 } 501 469 502 470 struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar, 503 - int mhi_irq) 471 + int mhi_irq, bool shared_msi) 504 472 { 505 473 struct mhi_controller *mhi_cntrl; 506 474 int ret; ··· 532 500 return ERR_PTR(-ENOMEM); 533 501 534 502 mhi_cntrl->irq[0] = mhi_irq; 503 + 504 + if (shared_msi) /* MSI shared with data path, no IRQF_NO_SUSPEND */ 505 + mhi_cntrl->irq_flags = IRQF_SHARED; 506 + 535 507 mhi_cntrl->fw_image = "qcom/aic100/sbl.bin"; 536 508 537 509 /* use latest configured timeout */

+1 -1

drivers/accel/qaic/mhi_controller.h

··· 8 8 #define MHICONTROLLERQAIC_H_ 9 9 10 10 struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar, 11 - int mhi_irq); 11 + int mhi_irq, bool shared_msi); 12 12 void qaic_mhi_free_controller(struct mhi_controller *mhi_cntrl, bool link_up); 13 13 void qaic_mhi_start_reset(struct mhi_controller *mhi_cntrl); 14 14 void qaic_mhi_reset_done(struct mhi_controller *mhi_cntrl);

+6

drivers/accel/qaic/qaic.h

··· 123 123 struct srcu_struct dev_lock; 124 124 /* true: Device under reset; false: Device not under reset */ 125 125 bool in_reset; 126 + /* true: single MSI is used to operate device */ 127 + bool single_msi; 126 128 /* 127 129 * true: A tx MHI transaction has failed and a rx buffer is still queued 128 130 * in control device. Such a buffer is considered lost rx buffer ··· 139 137 u32 (*gen_crc)(void *msg); 140 138 /* Validate the CRC of a control message */ 141 139 bool (*valid_crc)(void *msg); 140 + /* MHI "QAIC_TIMESYNC" channel device */ 141 + struct mhi_device *qts_ch; 142 + /* Work queue for tasks related to MHI "QAIC_TIMESYNC" channel */ 143 + struct workqueue_struct *qts_wq; 142 144 }; 143 145 144 146 struct qaic_drm_device {

+1 -1

drivers/accel/qaic/qaic_control.c

··· 1138 1138 if (!list_is_first(&wrapper->list, &wrappers->list)) 1139 1139 kref_put(&wrapper->ref_count, free_wrapper); 1140 1140 1141 - wrapper = add_wrapper(wrappers, offsetof(struct wrapper_msg, trans) + sizeof(*out_trans)); 1141 + wrapper = add_wrapper(wrappers, sizeof(*wrapper)); 1142 1142 1143 1143 if (!wrapper) 1144 1144 return -ENOMEM;

+61 -68

drivers/accel/qaic/qaic_data.c

··· 51 51 }) 52 52 #define NUM_EVENTS 128 53 53 #define NUM_DELAYS 10 54 + #define fifo_at(base, offset) ((base) + (offset) * get_dbc_req_elem_size()) 54 55 55 56 static unsigned int wait_exec_default_timeout_ms = 5000; /* 5 sec default */ 56 57 module_param(wait_exec_default_timeout_ms, uint, 0600); ··· 1059 1058 return ret; 1060 1059 } 1061 1060 1061 + static inline u32 fifo_space_avail(u32 head, u32 tail, u32 q_size) 1062 + { 1063 + u32 avail = head - tail - 1; 1064 + 1065 + if (head <= tail) 1066 + avail += q_size; 1067 + 1068 + return avail; 1069 + } 1070 + 1062 1071 static inline int copy_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, u32 dbc_id, 1063 1072 u32 head, u32 *ptail) 1064 1073 { ··· 1077 1066 u32 tail = *ptail; 1078 1067 u32 avail; 1079 1068 1080 - avail = head - tail; 1081 - if (head <= tail) 1082 - avail += dbc->nelem; 1083 - 1084 - --avail; 1085 - 1069 + avail = fifo_space_avail(head, tail, dbc->nelem); 1086 1070 if (avail < slice->nents) 1087 1071 return -EAGAIN; 1088 1072 1089 1073 if (tail + slice->nents > dbc->nelem) { 1090 1074 avail = dbc->nelem - tail; 1091 1075 avail = min_t(u32, avail, slice->nents); 1092 - memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1093 - sizeof(*reqs) * avail); 1076 + memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * avail); 1094 1077 reqs += avail; 1095 1078 avail = slice->nents - avail; 1096 1079 if (avail) 1097 1080 memcpy(dbc->req_q_base, reqs, sizeof(*reqs) * avail); 1098 1081 } else { 1099 - memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1100 - sizeof(*reqs) * slice->nents); 1082 + memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * slice->nents); 1101 1083 } 1102 1084 1103 1085 *ptail = (tail + slice->nents) % dbc->nelem; ··· 1098 1094 return 0; 1099 1095 } 1100 1096 1101 - /* 1102 - * Based on the value of resize we may only need to transmit first_n 1103 - * entries and the last entry, with last_bytes to send from the last entry. 
1104 - * Note that first_n could be 0. 1105 - */ 1106 1097 static inline int copy_partial_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, 1107 - u64 resize, u32 dbc_id, u32 head, u32 *ptail) 1098 + u64 resize, struct dma_bridge_chan *dbc, u32 head, 1099 + u32 *ptail) 1108 1100 { 1109 - struct dma_bridge_chan *dbc = &qdev->dbc[dbc_id]; 1110 1101 struct dbc_req *reqs = slice->reqs; 1111 1102 struct dbc_req *last_req; 1112 1103 u32 tail = *ptail; 1113 - u64 total_bytes; 1114 1104 u64 last_bytes; 1115 1105 u32 first_n; 1116 1106 u32 avail; 1117 - int ret; 1118 - int i; 1119 1107 1120 - avail = head - tail; 1121 - if (head <= tail) 1122 - avail += dbc->nelem; 1108 + avail = fifo_space_avail(head, tail, dbc->nelem); 1123 1109 1124 - --avail; 1125 - 1126 - total_bytes = 0; 1127 - for (i = 0; i < slice->nents; i++) { 1128 - total_bytes += le32_to_cpu(reqs[i].len); 1129 - if (total_bytes >= resize) 1110 + /* 1111 + * After this for loop is complete, first_n represents the index 1112 + * of the last DMA request of this slice that needs to be 1113 + * transferred after resizing and last_bytes represents DMA size 1114 + * of that request. 1115 + */ 1116 + last_bytes = resize; 1117 + for (first_n = 0; first_n < slice->nents; first_n++) 1118 + if (last_bytes > le32_to_cpu(reqs[first_n].len)) 1119 + last_bytes -= le32_to_cpu(reqs[first_n].len); 1120 + else 1130 1121 break; 1131 - } 1132 - 1133 - if (total_bytes < resize) { 1134 - /* User space should have used the full buffer path. */ 1135 - ret = -EINVAL; 1136 - return ret; 1137 - } 1138 - 1139 - first_n = i; 1140 - last_bytes = i ? 
resize + le32_to_cpu(reqs[i].len) - total_bytes : resize; 1141 1122 1142 1123 if (avail < (first_n + 1)) 1143 1124 return -EAGAIN; ··· 1131 1142 if (tail + first_n > dbc->nelem) { 1132 1143 avail = dbc->nelem - tail; 1133 1144 avail = min_t(u32, avail, first_n); 1134 - memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1135 - sizeof(*reqs) * avail); 1145 + memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * avail); 1136 1146 last_req = reqs + avail; 1137 1147 avail = first_n - avail; 1138 1148 if (avail) 1139 1149 memcpy(dbc->req_q_base, last_req, sizeof(*reqs) * avail); 1140 1150 } else { 1141 - memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1142 - sizeof(*reqs) * first_n); 1151 + memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * first_n); 1143 1152 } 1144 1153 } 1145 1154 1146 - /* Copy over the last entry. Here we need to adjust len to the left over 1155 + /* 1156 + * Copy over the last entry. Here we need to adjust len to the left over 1147 1157 * size, and set src and dst to the entry it is copied to. 
1148 1158 */ 1149 - last_req = dbc->req_q_base + (tail + first_n) % dbc->nelem * get_dbc_req_elem_size(); 1159 + last_req = fifo_at(dbc->req_q_base, (tail + first_n) % dbc->nelem); 1150 1160 memcpy(last_req, reqs + slice->nents - 1, sizeof(*reqs)); 1151 1161 1152 1162 /* ··· 1156 1168 last_req->len = cpu_to_le32((u32)last_bytes); 1157 1169 last_req->src_addr = reqs[first_n].src_addr; 1158 1170 last_req->dest_addr = reqs[first_n].dest_addr; 1171 + if (!last_bytes) 1172 + /* Disable DMA transfer */ 1173 + last_req->cmd = GENMASK(7, 2) & reqs[first_n].cmd; 1159 1174 1160 1175 *ptail = (tail + first_n + 1) % dbc->nelem; 1161 1176 ··· 1218 1227 bo->req_id = dbc->next_req_id++; 1219 1228 1220 1229 list_for_each_entry(slice, &bo->slices, slice) { 1221 - /* 1222 - * If this slice does not fall under the given 1223 - * resize then skip this slice and continue the loop 1224 - */ 1225 - if (is_partial && pexec[i].resize && pexec[i].resize <= slice->offset) 1226 - continue; 1227 - 1228 1230 for (j = 0; j < slice->nents; j++) 1229 1231 slice->reqs[j].req_id = cpu_to_le16(bo->req_id); 1230 1232 1231 - /* 1232 - * If it is a partial execute ioctl call then check if 1233 - * resize has cut this slice short then do a partial copy 1234 - * else do complete copy 1235 - */ 1236 - if (is_partial && pexec[i].resize && 1237 - pexec[i].resize < slice->offset + slice->size) 1233 + if (is_partial && (!pexec[i].resize || pexec[i].resize <= slice->offset)) 1234 + /* Configure the slice for no DMA transfer */ 1235 + ret = copy_partial_exec_reqs(qdev, slice, 0, dbc, head, tail); 1236 + else if (is_partial && pexec[i].resize < slice->offset + slice->size) 1237 + /* Configure the slice to be partially DMA transferred */ 1238 1238 ret = copy_partial_exec_reqs(qdev, slice, 1239 - pexec[i].resize - slice->offset, 1240 - dbc->id, head, tail); 1239 + pexec[i].resize - slice->offset, dbc, 1240 + head, tail); 1241 1241 else 1242 1242 ret = copy_exec_reqs(qdev, slice, dbc->id, head, tail); 1243 1243 if 
(ret) { ··· 1448 1466 1449 1467 rcu_id = srcu_read_lock(&dbc->ch_lock); 1450 1468 1469 + if (datapath_polling) { 1470 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1471 + /* 1472 + * Normally datapath_polling will not have irqs enabled, but 1473 + * when running with only one MSI the interrupt is shared with 1474 + * MHI so it cannot be disabled. Return ASAP instead. 1475 + */ 1476 + return IRQ_HANDLED; 1477 + } 1478 + 1451 1479 if (!dbc->usr) { 1452 1480 srcu_read_unlock(&dbc->ch_lock, rcu_id); 1453 1481 return IRQ_HANDLED; ··· 1480 1488 return IRQ_NONE; 1481 1489 } 1482 1490 1483 - disable_irq_nosync(irq); 1491 + if (!dbc->qdev->single_msi) 1492 + disable_irq_nosync(irq); 1484 1493 srcu_read_unlock(&dbc->ch_lock, rcu_id); 1485 1494 return IRQ_WAKE_THREAD; 1486 1495 } ··· 1552 1559 u32 tail; 1553 1560 1554 1561 rcu_id = srcu_read_lock(&dbc->ch_lock); 1562 + qdev = dbc->qdev; 1555 1563 1556 1564 head = readl(dbc->dbc_base + RSPHP_OFF); 1557 1565 if (head == U32_MAX) /* PCI link error */ 1558 1566 goto error_out; 1559 1567 1560 - qdev = dbc->qdev; 1561 1568 read_fifo: 1562 1569 1563 1570 if (!event_count) { ··· 1638 1645 goto read_fifo; 1639 1646 1640 1647 normal_out: 1641 - if (likely(!datapath_polling)) 1648 + if (!qdev->single_msi && likely(!datapath_polling)) 1642 1649 enable_irq(irq); 1643 - else 1650 + else if (unlikely(datapath_polling)) 1644 1651 schedule_work(&dbc->poll_work); 1645 1652 /* checking the fifo and enabling irqs is a race, missed event check */ 1646 1653 tail = readl(dbc->dbc_base + RSPTP_OFF); 1647 1654 if (tail != U32_MAX && head != tail) { 1648 - if (likely(!datapath_polling)) 1655 + if (!qdev->single_msi && likely(!datapath_polling)) 1649 1656 disable_irq_nosync(irq); 1650 1657 goto read_fifo; 1651 1658 } ··· 1654 1661 1655 1662 error_out: 1656 1663 srcu_read_unlock(&dbc->ch_lock, rcu_id); 1657 - if (likely(!datapath_polling)) 1664 + if (!qdev->single_msi && likely(!datapath_polling)) 1658 1665 enable_irq(irq); 1659 - else 1666 + else if 
(unlikely(datapath_polling)) 1660 1667 schedule_work(&dbc->poll_work); 1661 1668 1662 1669 return IRQ_HANDLED;
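
Note on the qaic_data.c changes above: the new fifo_space_avail() helper and the rewritten loop in copy_partial_exec_reqs() can be tried outside the kernel. The sketch below is a userspace model under stated assumptions: lengths are host-endian (the driver converts with le32_to_cpu), and split_resize() and struct resize_split are hypothetical names introduced here. As in the driver, the caller is assumed to pass a resize value smaller than the total slice size, so first_n stays a valid index.

```c
#include <stdint.h>

/*
 * Free slots in a ring of q_size entries. One slot is always left unused so
 * that head == tail can only mean "empty". The unsigned wraparound of
 * head - tail - 1 is intentional and is corrected by adding q_size.
 */
static uint32_t fifo_space_avail(uint32_t head, uint32_t tail, uint32_t q_size)
{
	uint32_t avail = head - tail - 1;

	if (head <= tail)
		avail += q_size;

	return avail;
}

/* Result of cutting a slice short at "resize" bytes (hypothetical type). */
struct resize_split {
	uint32_t first_n;    /* index of the last request still transferred */
	uint64_t last_bytes; /* bytes to transfer from that request */
};

/* Same walk as the patched loop: subtract whole requests until one no longer fits. */
static struct resize_split split_resize(const uint32_t *len, uint32_t nents, uint64_t resize)
{
	struct resize_split s = { 0, resize };

	for (s.first_n = 0; s.first_n < nents; s.first_n++) {
		if (s.last_bytes > len[s.first_n])
			s.last_bytes -= len[s.first_n];
		else
			break;
	}

	return s;
}
```

A resize of 0 yields first_n = 0 and last_bytes = 0, which is the case the patch uses to queue a request with the DMA transfer disabled.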

+37 -11

drivers/accel/qaic/qaic_drv.c

··· 27 27 28 28 #include "mhi_controller.h" 29 29 #include "qaic.h" 30 + #include "qaic_timesync.h" 30 31 31 32 MODULE_IMPORT_NS(DMA_BUF); 32 33 ··· 325 324 cleanup_srcu_struct(&qdev->dev_lock); 326 325 pci_set_drvdata(qdev->pdev, NULL); 327 326 destroy_workqueue(qdev->cntl_wq); 327 + destroy_workqueue(qdev->qts_wq); 328 328 } 329 329 330 330 static struct qaic_device *create_qdev(struct pci_dev *pdev, const struct pci_device_id *id) ··· 348 346 qdev->cntl_wq = alloc_workqueue("qaic_cntl", WQ_UNBOUND, 0); 349 347 if (!qdev->cntl_wq) 350 348 return NULL; 349 + 350 + qdev->qts_wq = alloc_workqueue("qaic_ts", WQ_UNBOUND, 0); 351 + if (!qdev->qts_wq) { 352 + destroy_workqueue(qdev->cntl_wq); 353 + return NULL; 354 + } 351 355 352 356 pci_set_drvdata(pdev, qdev); 353 357 qdev->pdev = pdev; ··· 432 424 int i; 433 425 434 426 /* Managed release since we use pcim_enable_device */ 435 - ret = pci_alloc_irq_vectors(pdev, 1, 32, PCI_IRQ_MSI); 436 - if (ret < 0) 437 - return ret; 427 + ret = pci_alloc_irq_vectors(pdev, 32, 32, PCI_IRQ_MSI); 428 + if (ret == -ENOSPC) { 429 + ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI); 430 + if (ret < 0) 431 + return ret; 438 432 439 - if (ret < 32) { 440 - pci_err(pdev, "%s: Requested 32 MSIs. Obtained %d MSIs which is less than the 32 required.\n", 441 - __func__, ret); 442 - return -ENODEV; 433 + /* 434 + * Operate in one MSI mode. All interrupts will be directed to 435 + * MSI0; every interrupt will wake up all the interrupt handlers 436 + * (MHI and DBC[0-15]). Since the interrupt is now shared, it is 437 + * not disabled during DBC threaded handler, but only one thread 438 + * will be allowed to run per DBC, so while it can be 439 + * interrupted, it shouldn't race with itself. 440 + */ 441 + qdev->single_msi = true; 442 + pci_info(pdev, "Allocating 32 MSIs failed, operating in 1 MSI mode. 
Performance may be impacted.\n"); 443 + } else if (ret < 0) { 444 + return ret; 443 445 } 444 446 445 447 mhi_irq = pci_irq_vector(pdev, 0); ··· 457 439 return mhi_irq; 458 440 459 441 for (i = 0; i < qdev->num_dbc; ++i) { 460 - ret = devm_request_threaded_irq(&pdev->dev, pci_irq_vector(pdev, i + 1), 442 + ret = devm_request_threaded_irq(&pdev->dev, 443 + pci_irq_vector(pdev, qdev->single_msi ? 0 : i + 1), 461 444 dbc_irq_handler, dbc_irq_threaded_fn, IRQF_SHARED, 462 445 "qaic_dbc", &qdev->dbc[i]); 463 446 if (ret) 464 447 return ret; 465 448 466 449 if (datapath_polling) { 467 - qdev->dbc[i].irq = pci_irq_vector(pdev, i + 1); 468 - disable_irq_nosync(qdev->dbc[i].irq); 450 + qdev->dbc[i].irq = pci_irq_vector(pdev, qdev->single_msi ? 0 : i + 1); 451 + if (!qdev->single_msi) 452 + disable_irq_nosync(qdev->dbc[i].irq); 469 453 INIT_WORK(&qdev->dbc[i].poll_work, irq_polling_work); 470 454 } 471 455 } ··· 499 479 goto cleanup_qdev; 500 480 } 501 481 502 - qdev->mhi_cntrl = qaic_mhi_register_controller(pdev, qdev->bar_0, mhi_irq); 482 + qdev->mhi_cntrl = qaic_mhi_register_controller(pdev, qdev->bar_0, mhi_irq, 483 + qdev->single_msi); 503 484 if (IS_ERR(qdev->mhi_cntrl)) { 504 485 ret = PTR_ERR(qdev->mhi_cntrl); 505 486 goto cleanup_qdev; ··· 607 586 goto free_pci; 608 587 } 609 588 589 + ret = qaic_timesync_init(); 590 + if (ret) 591 + pr_debug("qaic: qaic_timesync_init failed %d\n", ret); 592 + 610 593 return 0; 611 594 612 595 free_pci: ··· 636 611 * reinitializing the link_up state after the cleanup is done. 637 612 */ 638 613 link_up = true; 614 + qaic_timesync_deinit(); 639 615 mhi_driver_unregister(&qaic_mhi_driver); 640 616 pci_unregister_driver(&qaic_pci_driver); 641 617 }
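
Note on the qaic_drv.c changes above: the fallback order and the vector selection can be modelled without PCI hardware. This is a minimal sketch, not the driver code. alloc_vectors_fn is a hypothetical stand-in for the real allocator (it returns the number of vectors granted or a negative errno), and the two allocators at the bottom are test doubles invented for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical allocator interface: vectors granted, or a negative errno. */
typedef int (*alloc_vectors_fn)(int min_vecs, int max_vecs);

/*
 * Try the full set of 32 vectors first. Only -ENOSPC triggers the retry with a
 * single vector; any other failure is returned unchanged.
 */
static int pick_msi_mode(alloc_vectors_fn alloc, bool *single_msi)
{
	int ret = alloc(32, 32);

	*single_msi = false;
	if (ret == -ENOSPC) {
		ret = alloc(1, 1);
		if (ret < 0)
			return ret;
		*single_msi = true;
	} else if (ret < 0) {
		return ret;
	}

	return 0;
}

/* Vector used by a DBC: vector 0 is shared with MHI in single MSI mode. */
static int dbc_vector_index(bool single_msi, int dbc_id)
{
	return single_msi ? 0 : dbc_id + 1;
}

/* Test double: a platform that grants whatever is requested. */
static int alloc_any(int min_vecs, int max_vecs)
{
	(void)min_vecs;
	return max_vecs;
}

/* Test double: a platform that can only provide one vector. */
static int alloc_one_only(int min_vecs, int max_vecs)
{
	(void)max_vecs;
	return min_vecs > 1 ? -ENOSPC : 1;
}
```

Because every DBC handler ends up registered on vector 0 in the fallback case, each handler has to tolerate being woken for interrupts that belong to another channel, which is why the patch keeps IRQF_SHARED and skips the disable/enable calls.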

+395

drivers/accel/qaic/qaic_timesync.c

··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved. */ 3 + 4 + #include <linux/io.h> 5 + #include <linux/kernel.h> 6 + #include <linux/math64.h> 7 + #include <linux/mhi.h> 8 + #include <linux/mod_devicetable.h> 9 + #include <linux/module.h> 10 + #include <linux/time64.h> 11 + #include <linux/timer.h> 12 + 13 + #include "qaic.h" 14 + #include "qaic_timesync.h" 15 + 16 + #define QTIMER_REG_OFFSET 0xa28 17 + #define QAIC_TIMESYNC_SIGNATURE 0x55aa 18 + #define QAIC_CONV_QTIMER_TO_US(qtimer) (mul_u64_u32_div(qtimer, 10, 192)) 19 + 20 + static unsigned int timesync_delay_ms = 1000; /* 1 sec default */ 21 + module_param(timesync_delay_ms, uint, 0600); 22 + MODULE_PARM_DESC(timesync_delay_ms, "Delay in ms between two consecutive timesync operations"); 23 + 24 + enum qts_msg_type { 25 + QAIC_TS_CMD_TO_HOST, 26 + QAIC_TS_SYNC_REQ, 27 + QAIC_TS_ACK_TO_HOST, 28 + QAIC_TS_MSG_TYPE_MAX 29 + }; 30 + 31 + /** 32 + * struct qts_hdr - Timesync message header structure. 33 + * @signature: Unique signature to identify the timesync message. 34 + * @reserved_1: Reserved for future use. 35 + * @reserved_2: Reserved for future use. 36 + * @msg_type: sub-type of the timesync message. 37 + * @reserved_3: Reserved for future use. 38 + */ 39 + struct qts_hdr { 40 + __le16 signature; 41 + __le16 reserved_1; 42 + u8 reserved_2; 43 + u8 msg_type; 44 + __le16 reserved_3; 45 + } __packed; 46 + 47 + /** 48 + * struct qts_timeval - Structure to carry time information. 49 + * @tv_sec: Seconds part of the time. 50 + * @tv_usec: uS (microseconds) part of the time. 51 + */ 52 + struct qts_timeval { 53 + __le64 tv_sec; 54 + __le64 tv_usec; 55 + } __packed; 56 + 57 + /** 58 + * struct qts_host_time_sync_msg_data - Structure to denote the timesync message. 59 + * @header: Header of the timesync message. 60 + * @data: Time information. 
61 + */ 62 + struct qts_host_time_sync_msg_data { 63 + struct qts_hdr header; 64 + struct qts_timeval data; 65 + } __packed; 66 + 67 + /** 68 + * struct mqts_dev - MHI QAIC Timesync Control device. 69 + * @qdev: Pointer to the root device struct driven by QAIC driver. 70 + * @mhi_dev: Pointer to associated MHI device. 71 + * @timer: Timer handle used for timesync. 72 + * @qtimer_addr: Device QTimer register pointer. 73 + * @buff_in_use: atomic variable to track if the sync_msg buffer is in use. 74 + * @dev: Device pointer to qdev->pdev->dev stored for easy access. 75 + * @sync_msg: Buffer used to send timesync message over MHI. 76 + */ 77 + struct mqts_dev { 78 + struct qaic_device *qdev; 79 + struct mhi_device *mhi_dev; 80 + struct timer_list timer; 81 + void __iomem *qtimer_addr; 82 + atomic_t buff_in_use; 83 + struct device *dev; 84 + struct qts_host_time_sync_msg_data *sync_msg; 85 + }; 86 + 87 + struct qts_resp_msg { 88 + struct qts_hdr hdr; 89 + } __packed; 90 + 91 + struct qts_resp { 92 + struct qts_resp_msg data; 93 + struct work_struct work; 94 + struct qaic_device *qdev; 95 + }; 96 + 97 + #ifdef readq 98 + static u64 read_qtimer(const volatile void __iomem *addr) 99 + { 100 + return readq(addr); 101 + } 102 + #else 103 + static u64 read_qtimer(const volatile void __iomem *addr) 104 + { 105 + u64 low, high; 106 + 107 + low = readl(addr); 108 + high = readl(addr + sizeof(u32)); 109 + return low | (high << 32); 110 + } 111 + #endif 112 + 113 + static void qaic_timesync_ul_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) 114 + { 115 + struct mqts_dev *mqtsdev = dev_get_drvdata(&mhi_dev->dev); 116 + 117 + dev_dbg(mqtsdev->dev, "%s status: %d xfer_len: %zu\n", __func__, 118 + mhi_result->transaction_status, mhi_result->bytes_xferd); 119 + 120 + atomic_set(&mqtsdev->buff_in_use, 0); 121 + } 122 + 123 + static void qaic_timesync_dl_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) 124 + { 125 + struct mqts_dev *mqtsdev = 
dev_get_drvdata(&mhi_dev->dev); 126 + 127 + dev_err(mqtsdev->dev, "%s no data expected on dl channel\n", __func__); 128 + } 129 + 130 + static void qaic_timesync_timer(struct timer_list *t) 131 + { 132 + struct mqts_dev *mqtsdev = from_timer(mqtsdev, t, timer); 133 + struct qts_host_time_sync_msg_data *sync_msg; 134 + u64 device_qtimer_us; 135 + u64 device_qtimer; 136 + u64 host_time_us; 137 + u64 offset_us; 138 + u64 host_sec; 139 + int ret; 140 + 141 + if (atomic_read(&mqtsdev->buff_in_use)) { 142 + dev_dbg(mqtsdev->dev, "%s buffer not free, schedule next cycle\n", __func__); 143 + goto mod_timer; 144 + } 145 + atomic_set(&mqtsdev->buff_in_use, 1); 146 + 147 + sync_msg = mqtsdev->sync_msg; 148 + sync_msg->header.signature = cpu_to_le16(QAIC_TIMESYNC_SIGNATURE); 149 + sync_msg->header.msg_type = QAIC_TS_SYNC_REQ; 150 + /* Read host UTC time and convert to uS*/ 151 + host_time_us = div_u64(ktime_get_real_ns(), NSEC_PER_USEC); 152 + device_qtimer = read_qtimer(mqtsdev->qtimer_addr); 153 + device_qtimer_us = QAIC_CONV_QTIMER_TO_US(device_qtimer); 154 + /* Offset between host UTC and device time */ 155 + offset_us = host_time_us - device_qtimer_us; 156 + 157 + host_sec = div_u64(offset_us, USEC_PER_SEC); 158 + sync_msg->data.tv_usec = cpu_to_le64(offset_us - host_sec * USEC_PER_SEC); 159 + sync_msg->data.tv_sec = cpu_to_le64(host_sec); 160 + ret = mhi_queue_buf(mqtsdev->mhi_dev, DMA_TO_DEVICE, sync_msg, sizeof(*sync_msg), MHI_EOT); 161 + if (ret && (ret != -EAGAIN)) { 162 + dev_err(mqtsdev->dev, "%s unable to queue to mhi:%d\n", __func__, ret); 163 + return; 164 + } else if (ret == -EAGAIN) { 165 + atomic_set(&mqtsdev->buff_in_use, 0); 166 + } 167 + 168 + mod_timer: 169 + ret = mod_timer(t, jiffies + msecs_to_jiffies(timesync_delay_ms)); 170 + if (ret) 171 + dev_err(mqtsdev->dev, "%s mod_timer error:%d\n", __func__, ret); 172 + } 173 + 174 + static int qaic_timesync_probe(struct mhi_device *mhi_dev, const struct mhi_device_id *id) 175 + { 176 + struct qaic_device 
*qdev = pci_get_drvdata(to_pci_dev(mhi_dev->mhi_cntrl->cntrl_dev)); 177 + struct mqts_dev *mqtsdev; 178 + struct timer_list *timer; 179 + int ret; 180 + 181 + mqtsdev = kzalloc(sizeof(*mqtsdev), GFP_KERNEL); 182 + if (!mqtsdev) { 183 + ret = -ENOMEM; 184 + goto out; 185 + } 186 + 187 + timer = &mqtsdev->timer; 188 + mqtsdev->mhi_dev = mhi_dev; 189 + mqtsdev->qdev = qdev; 190 + mqtsdev->dev = &qdev->pdev->dev; 191 + 192 + mqtsdev->sync_msg = kzalloc(sizeof(*mqtsdev->sync_msg), GFP_KERNEL); 193 + if (!mqtsdev->sync_msg) { 194 + ret = -ENOMEM; 195 + goto free_mqts_dev; 196 + } 197 + atomic_set(&mqtsdev->buff_in_use, 0); 198 + 199 + ret = mhi_prepare_for_transfer(mhi_dev); 200 + if (ret) 201 + goto free_sync_msg; 202 + 203 + /* Qtimer register pointer */ 204 + mqtsdev->qtimer_addr = qdev->bar_0 + QTIMER_REG_OFFSET; 205 + timer_setup(timer, qaic_timesync_timer, 0); 206 + timer->expires = jiffies + msecs_to_jiffies(timesync_delay_ms); 207 + add_timer(timer); 208 + dev_set_drvdata(&mhi_dev->dev, mqtsdev); 209 + 210 + return 0; 211 + 212 + free_sync_msg: 213 + kfree(mqtsdev->sync_msg); 214 + free_mqts_dev: 215 + kfree(mqtsdev); 216 + out: 217 + return ret; 218 + }; 219 + 220 + static void qaic_timesync_remove(struct mhi_device *mhi_dev) 221 + { 222 + struct mqts_dev *mqtsdev = dev_get_drvdata(&mhi_dev->dev); 223 + 224 + del_timer_sync(&mqtsdev->timer); 225 + mhi_unprepare_from_transfer(mqtsdev->mhi_dev); 226 + kfree(mqtsdev->sync_msg); 227 + kfree(mqtsdev); 228 + } 229 + 230 + static const struct mhi_device_id qaic_timesync_match_table[] = { 231 + { .chan = "QAIC_TIMESYNC_PERIODIC"}, 232 + {}, 233 + }; 234 + 235 + MODULE_DEVICE_TABLE(mhi, qaic_timesync_match_table); 236 + 237 + static struct mhi_driver qaic_timesync_driver = { 238 + .id_table = qaic_timesync_match_table, 239 + .remove = qaic_timesync_remove, 240 + .probe = qaic_timesync_probe, 241 + .ul_xfer_cb = qaic_timesync_ul_xfer_cb, 242 + .dl_xfer_cb = qaic_timesync_dl_xfer_cb, 243 + .driver = { 244 + .name = 
"qaic_timesync_periodic", 245 + }, 246 + }; 247 + 248 + static void qaic_boot_timesync_worker(struct work_struct *work) 249 + { 250 + struct qts_resp *resp = container_of(work, struct qts_resp, work); 251 + struct qts_host_time_sync_msg_data *req; 252 + struct qts_resp_msg data = resp->data; 253 + struct qaic_device *qdev = resp->qdev; 254 + struct mhi_device *mhi_dev; 255 + struct timespec64 ts; 256 + int ret; 257 + 258 + mhi_dev = qdev->qts_ch; 259 + /* Queue the response message beforehand to avoid race conditions */ 260 + ret = mhi_queue_buf(mhi_dev, DMA_FROM_DEVICE, &resp->data, sizeof(resp->data), MHI_EOT); 261 + if (ret) { 262 + kfree(resp); 263 + dev_warn(&mhi_dev->dev, "Failed to re-queue response buffer %d\n", ret); 264 + return; 265 + } 266 + 267 + switch (data.hdr.msg_type) { 268 + case QAIC_TS_CMD_TO_HOST: 269 + req = kzalloc(sizeof(*req), GFP_KERNEL); 270 + if (!req) 271 + break; 272 + 273 + req->header = data.hdr; 274 + req->header.msg_type = QAIC_TS_SYNC_REQ; 275 + ktime_get_real_ts64(&ts); 276 + req->data.tv_sec = cpu_to_le64(ts.tv_sec); 277 + req->data.tv_usec = cpu_to_le64(div_u64(ts.tv_nsec, NSEC_PER_USEC)); 278 + 279 + ret = mhi_queue_buf(mhi_dev, DMA_TO_DEVICE, req, sizeof(*req), MHI_EOT); 280 + if (ret) { 281 + kfree(req); 282 + dev_dbg(&mhi_dev->dev, "Failed to send request message. 
Error %d\n", ret); 283 + } 284 + break; 285 + case QAIC_TS_ACK_TO_HOST: 286 + dev_dbg(&mhi_dev->dev, "ACK received from device\n"); 287 + break; 288 + default: 289 + dev_err(&mhi_dev->dev, "Invalid message type %u.\n", data.hdr.msg_type); 290 + } 291 + } 292 + 293 + static int qaic_boot_timesync_queue_resp(struct mhi_device *mhi_dev, struct qaic_device *qdev) 294 + { 295 + struct qts_resp *resp; 296 + int ret; 297 + 298 + resp = kzalloc(sizeof(*resp), GFP_KERNEL); 299 + if (!resp) 300 + return -ENOMEM; 301 + 302 + resp->qdev = qdev; 303 + INIT_WORK(&resp->work, qaic_boot_timesync_worker); 304 + 305 + ret = mhi_queue_buf(mhi_dev, DMA_FROM_DEVICE, &resp->data, sizeof(resp->data), MHI_EOT); 306 + if (ret) { 307 + kfree(resp); 308 + dev_warn(&mhi_dev->dev, "Failed to queue response buffer %d\n", ret); 309 + return ret; 310 + } 311 + 312 + return 0; 313 + } 314 + 315 + static void qaic_boot_timesync_remove(struct mhi_device *mhi_dev) 316 + { 317 + struct qaic_device *qdev; 318 + 319 + qdev = dev_get_drvdata(&mhi_dev->dev); 320 + mhi_unprepare_from_transfer(qdev->qts_ch); 321 + qdev->qts_ch = NULL; 322 + } 323 + 324 + static int qaic_boot_timesync_probe(struct mhi_device *mhi_dev, const struct mhi_device_id *id) 325 + { 326 + struct qaic_device *qdev = pci_get_drvdata(to_pci_dev(mhi_dev->mhi_cntrl->cntrl_dev)); 327 + int ret; 328 + 329 + ret = mhi_prepare_for_transfer(mhi_dev); 330 + if (ret) 331 + return ret; 332 + 333 + qdev->qts_ch = mhi_dev; 334 + dev_set_drvdata(&mhi_dev->dev, qdev); 335 + 336 + ret = qaic_boot_timesync_queue_resp(mhi_dev, qdev); 337 + if (ret) { 338 + dev_set_drvdata(&mhi_dev->dev, NULL); 339 + qdev->qts_ch = NULL; 340 + mhi_unprepare_from_transfer(mhi_dev); 341 + } 342 + 343 + return ret; 344 + } 345 + 346 + static void qaic_boot_timesync_ul_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) 347 + { 348 + kfree(mhi_result->buf_addr); 349 + } 350 + 351 + static void qaic_boot_timesync_dl_xfer_cb(struct mhi_device *mhi_dev, struct 
mhi_result *mhi_result) 352 + { 353 + struct qts_resp *resp = container_of(mhi_result->buf_addr, struct qts_resp, data); 354 + 355 + if (mhi_result->transaction_status || mhi_result->bytes_xferd != sizeof(resp->data)) { 356 + kfree(resp); 357 + return; 358 + } 359 + 360 + queue_work(resp->qdev->qts_wq, &resp->work); 361 + } 362 + 363 + static const struct mhi_device_id qaic_boot_timesync_match_table[] = { 364 + { .chan = "QAIC_TIMESYNC"}, 365 + {}, 366 + }; 367 + 368 + static struct mhi_driver qaic_boot_timesync_driver = { 369 + .id_table = qaic_boot_timesync_match_table, 370 + .remove = qaic_boot_timesync_remove, 371 + .probe = qaic_boot_timesync_probe, 372 + .ul_xfer_cb = qaic_boot_timesync_ul_xfer_cb, 373 + .dl_xfer_cb = qaic_boot_timesync_dl_xfer_cb, 374 + .driver = { 375 + .name = "qaic_timesync", 376 + }, 377 + }; 378 + 379 + int qaic_timesync_init(void) 380 + { 381 + int ret; 382 + 383 + ret = mhi_driver_register(&qaic_timesync_driver); 384 + if (ret) 385 + return ret; 386 + ret = mhi_driver_register(&qaic_boot_timesync_driver); 387 + 388 + return ret; 389 + } 390 + 391 + void qaic_timesync_deinit(void) 392 + { 393 + mhi_driver_unregister(&qaic_boot_timesync_driver); 394 + mhi_driver_unregister(&qaic_timesync_driver); 395 + }
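
Note on the periodic timesync path above: QAIC_CONV_QTIMER_TO_US multiplies the raw counter by 10 and divides by 192, which suggests a 19.2 MHz tick rate. That rate is an inference from the macro, not something the patch states. The sketch below is a portable userspace model with hypothetical helper names. The kernel uses mul_u64_u32_div() for the scaling; here the same result is obtained by scaling the quotient and remainder separately, which is exact and cannot overflow 64 bits.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* floor(ticks * 10 / 192) without a wide intermediate product. */
static uint64_t qtimer_to_us(uint64_t ticks)
{
	return (ticks / 192) * 10 + ((ticks % 192) * 10) / 192;
}

/* Compose a 64-bit counter from two 32-bit reads, as the non-readq path does. */
static uint64_t combine_halves(uint32_t low, uint32_t high)
{
	return (uint64_t)low | ((uint64_t)high << 32);
}

/* Seconds and microseconds carried in the sync message (host-endian here). */
struct ts_offset {
	uint64_t tv_sec;
	uint64_t tv_usec;
};

/* Offset between host wall-clock time and device time, split as in the timer callback. */
static struct ts_offset split_offset(uint64_t host_time_us, uint64_t device_us)
{
	uint64_t offset_us = host_time_us - device_us;
	struct ts_offset o;

	o.tv_sec = offset_us / USEC_PER_SEC;
	o.tv_usec = offset_us - o.tv_sec * USEC_PER_SEC;

	return o;
}
```

The driver additionally converts both fields with cpu_to_le64() before queueing the message; that step is omitted here because the sketch never leaves host memory.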

+11

drivers/accel/qaic/qaic_timesync.h

··· 1 + /* SPDX-License-Identifier: GPL-2.0-only 2 + * 3 + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved. 4 + */ 5 + 6 + #ifndef __QAIC_TIMESYNC_H__ 7 + #define __QAIC_TIMESYNC_H__ 8 + 9 + int qaic_timesync_init(void); 10 + void qaic_timesync_deinit(void); 11 + #endif /* __QAIC_TIMESYNC_H__ */

+6 -6

drivers/gpu/drm/Kconfig

··· 75 75 config DRM_KUNIT_TEST 76 76 tristate "KUnit tests for DRM" if !KUNIT_ALL_TESTS 77 77 depends on DRM && KUNIT 78 - select PRIME_NUMBERS 78 + select DRM_BUDDY 79 79 select DRM_DISPLAY_DP_HELPER 80 80 select DRM_DISPLAY_HELPER 81 - select DRM_LIB_RANDOM 82 - select DRM_KMS_HELPER 83 - select DRM_BUDDY 84 - select DRM_EXPORT_FOR_TESTS if m 85 - select DRM_KUNIT_TEST_HELPERS 86 81 select DRM_EXEC 82 + select DRM_EXPORT_FOR_TESTS if m 83 + select DRM_KMS_HELPER 84 + select DRM_KUNIT_TEST_HELPERS 85 + select DRM_LIB_RANDOM 86 + select PRIME_NUMBERS 87 87 default KUNIT_ALL_TESTS 88 88 help 89 89 This builds unit tests for DRM. This option is not useful for

+1

drivers/gpu/drm/Makefile

··· 22 22 drm_drv.o \ 23 23 drm_dumb_buffers.o \ 24 24 drm_edid.o \ 25 + drm_eld.o \ 25 26 drm_encoder.o \ 26 27 drm_file.o \ 27 28 drm_fourcc.o \

+1 -1

drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c

··· 290 290 for (i = 0; i < adev->gfx.num_compute_rings; i++) { 291 291 struct amdgpu_ring *ring = &adev->gfx.compute_ring[i]; 292 292 293 - if (!(ring && ring->sched.thread)) 293 + if (!(ring && drm_sched_wqueue_ready(&ring->sched))) 294 294 continue; 295 295 296 296 /* stop scheduler and drain ring. */

+8 -7

drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c

··· 1665 1665 for (i = 0; i < AMDGPU_MAX_RINGS; i++) { 1666 1666 struct amdgpu_ring *ring = adev->rings[i]; 1667 1667 1668 - if (!ring || !ring->sched.thread) 1668 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 1669 1669 continue; 1670 - kthread_park(ring->sched.thread); 1670 + drm_sched_wqueue_stop(&ring->sched); 1671 1671 } 1672 1672 1673 1673 seq_puts(m, "run ib test:\n"); ··· 1681 1681 for (i = 0; i < AMDGPU_MAX_RINGS; i++) { 1682 1682 struct amdgpu_ring *ring = adev->rings[i]; 1683 1683 1684 - if (!ring || !ring->sched.thread) 1684 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 1685 1685 continue; 1686 - kthread_unpark(ring->sched.thread); 1686 + drm_sched_wqueue_start(&ring->sched); 1687 1687 } 1688 1688 1689 1689 up_write(&adev->reset_domain->sem); ··· 1903 1903 1904 1904 ring = adev->rings[val]; 1905 1905 1906 - if (!ring || !ring->funcs->preempt_ib || !ring->sched.thread) 1906 + if (!ring || !ring->funcs->preempt_ib || 1907 + !drm_sched_wqueue_ready(&ring->sched)) 1907 1908 return -EINVAL; 1908 1909 1909 1910 /* the last preemption failed */ ··· 1922 1921 goto pro_end; 1923 1922 1924 1923 /* stop the scheduler */ 1925 - kthread_park(ring->sched.thread); 1924 + drm_sched_wqueue_stop(&ring->sched); 1926 1925 1927 1926 /* preempt the IB */ 1928 1927 r = amdgpu_ring_preempt_ib(ring); ··· 1956 1955 1957 1956 failure: 1958 1957 /* restart the scheduler */ 1959 - kthread_unpark(ring->sched.thread); 1958 + drm_sched_wqueue_start(&ring->sched); 1960 1959 1961 1960 up_read(&adev->reset_domain->sem); 1962 1961

+7 -7

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

··· 2573 2573 break; 2574 2574 } 2575 2575 2576 - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, 2576 + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL, 2577 2577 DRM_SCHED_PRIORITY_COUNT, 2578 2578 ring->num_hw_submission, 0, 2579 2579 timeout, adev->reset_domain->wq, ··· 4964 4964 for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { 4965 4965 struct amdgpu_ring *ring = adev->rings[i]; 4966 4966 4967 - if (!ring || !ring->sched.thread) 4967 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 4968 4968 continue; 4969 4969 4970 4970 spin_lock(&ring->sched.job_list_lock); ··· 5103 5103 for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { 5104 5104 struct amdgpu_ring *ring = adev->rings[i]; 5105 5105 5106 - if (!ring || !ring->sched.thread) 5106 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 5107 5107 continue; 5108 5108 5109 5109 /* Clear job fence from fence drv to avoid force_completion ··· 5592 5592 for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { 5593 5593 struct amdgpu_ring *ring = tmp_adev->rings[i]; 5594 5594 5595 - if (!ring || !ring->sched.thread) 5595 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 5596 5596 continue; 5597 5597 5598 5598 drm_sched_stop(&ring->sched, job ? 
&job->base : NULL); ··· 5668 5668 for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { 5669 5669 struct amdgpu_ring *ring = tmp_adev->rings[i]; 5670 5670 5671 - if (!ring || !ring->sched.thread) 5671 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 5672 5672 continue; 5673 5673 5674 5674 drm_sched_start(&ring->sched, true); ··· 5991 5991 for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { 5992 5992 struct amdgpu_ring *ring = adev->rings[i]; 5993 5993 5994 - if (!ring || !ring->sched.thread) 5994 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 5995 5995 continue; 5996 5996 5997 5997 drm_sched_stop(&ring->sched, NULL); ··· 6119 6119 for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { 6120 6120 struct amdgpu_ring *ring = adev->rings[i]; 6121 6121 6122 - if (!ring || !ring->sched.thread) 6122 + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) 6123 6123 continue; 6124 6124 6125 6125 drm_sched_start(&ring->sched, true);

+1 -1

drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

··· 115 115 if (!entity) 116 116 return 0; 117 117 118 - return drm_sched_job_init(&(*job)->base, entity, owner); 118 + return drm_sched_job_init(&(*job)->base, entity, 1, owner); 119 119 } 120 120 121 121 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,

+1

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

··· 87 87 #include <drm/drm_blend.h> 88 88 #include <drm/drm_fourcc.h> 89 89 #include <drm/drm_edid.h> 90 + #include <drm/drm_eld.h> 90 91 #include <drm/drm_vblank.h> 91 92 #include <drm/drm_audio_component.h> 92 93 #include <drm/drm_gem_atomic_helper.h>

+2 -2

drivers/gpu/drm/drm_atomic_helper.c

··· 2382 2382 EXPORT_SYMBOL(drm_atomic_helper_setup_commit); 2383 2383 2384 2384 /** 2385 - * drm_atomic_helper_wait_for_dependencies - wait for required preceeding commits 2385 + * drm_atomic_helper_wait_for_dependencies - wait for required preceding commits 2386 2386 * @old_state: atomic state object with old state structures 2387 2387 * 2388 - * This function waits for all preceeding commits that touch the same CRTC as 2388 + * This function waits for all preceding commits that touch the same CRTC as 2389 2389 * @old_state to both be committed to the hardware (as signalled by 2390 2390 * drm_atomic_helper_commit_hw_done()) and executed by the hardware (as signalled 2391 2391 * by calling drm_crtc_send_vblank_event() on the &drm_crtc_state.event).

+1 -11

drivers/gpu/drm/drm_client.c

··· 5 5 6 6 #include <linux/iosys-map.h> 7 7 #include <linux/list.h> 8 - #include <linux/module.h> 9 8 #include <linux/mutex.h> 10 9 #include <linux/seq_file.h> 11 10 #include <linux/slab.h> ··· 83 84 if (!drm_core_check_feature(dev, DRIVER_MODESET) || !dev->driver->dumb_create) 84 85 return -EOPNOTSUPP; 85 86 86 - if (funcs && !try_module_get(funcs->owner)) 87 - return -ENODEV; 88 - 89 87 client->dev = dev; 90 88 client->name = name; 91 89 client->funcs = funcs; 92 90 93 91 ret = drm_client_modeset_create(client); 94 92 if (ret) 95 - goto err_put_module; 93 + return ret; 96 94 97 95 ret = drm_client_open(client); 98 96 if (ret) ··· 101 105 102 106 err_free: 103 107 drm_client_modeset_free(client); 104 - err_put_module: 105 - if (funcs) 106 - module_put(funcs->owner); 107 - 108 108 return ret; 109 109 } 110 110 EXPORT_SYMBOL(drm_client_init); ··· 169 177 drm_client_modeset_free(client); 170 178 drm_client_close(client); 171 179 drm_dev_put(dev); 172 - if (client->funcs) 173 - module_put(client->funcs->owner); 174 180 } 175 181 EXPORT_SYMBOL(drm_client_release); 176 182

+6

drivers/gpu/drm/drm_connector.c

··· 1198 1198 * drm_connector_set_path_property(), in the case of DP MST with the 1199 1199 * path property the MST manager created. Userspace cannot change this 1200 1200 * property. 1201 + * 1202 + * In the case of DP MST, the property has the format 1203 + * ``mst:<parent>-<ports>`` where ``<parent>`` is the KMS object ID of the 1204 + * parent connector and ``<ports>`` is a hyphen-separated list of DP MST 1205 + * port numbers. Note, KMS object IDs are not guaranteed to be stable 1206 + * across reboots. 1201 1207 * TILE: 1202 1208 * Connector tile group property to indicate how a set of DRM connector 1203 1209 * compose together into one logical screen. This is used by both high-res

+2

drivers/gpu/drm/drm_crtc_internal.h

··· 222 222 void *data, struct drm_file *file_priv); 223 223 int drm_mode_rmfb_ioctl(struct drm_device *dev, 224 224 void *data, struct drm_file *file_priv); 225 + int drm_mode_closefb_ioctl(struct drm_device *dev, 226 + void *data, struct drm_file *file_priv); 225 227 int drm_mode_getfb(struct drm_device *dev, 226 228 void *data, struct drm_file *file_priv); 227 229 int drm_mode_getfb2_ioctl(struct drm_device *dev,

+31 -12

drivers/gpu/drm/drm_edid.c

··· 41 41 #include <drm/drm_displayid.h> 42 42 #include <drm/drm_drv.h> 43 43 #include <drm/drm_edid.h> 44 + #include <drm/drm_eld.h> 44 45 #include <drm/drm_encoder.h> 45 46 #include <drm/drm_print.h> 46 47 47 48 #include "drm_crtc_internal.h" 49 + #include "drm_internal.h" 48 50 49 51 static int oui(u8 first, u8 second, u8 third) 50 52 { ··· 5512 5510 } 5513 5511 5514 5512 /* 5513 + * Get 3-byte SAD buffer from struct cea_sad. 5514 + */ 5515 + void drm_edid_cta_sad_get(const struct cea_sad *cta_sad, u8 *sad) 5516 + { 5517 + sad[0] = cta_sad->format << 3 | cta_sad->channels; 5518 + sad[1] = cta_sad->freq; 5519 + sad[2] = cta_sad->byte2; 5520 + } 5521 + 5522 + /* 5523 + * Set struct cea_sad from 3-byte SAD buffer. 5524 + */ 5525 + void drm_edid_cta_sad_set(struct cea_sad *cta_sad, const u8 *sad) 5526 + { 5527 + cta_sad->format = (sad[0] & 0x78) >> 3; 5528 + cta_sad->channels = sad[0] & 0x07; 5529 + cta_sad->freq = sad[1] & 0x7f; 5530 + cta_sad->byte2 = sad[2]; 5531 + } 5532 + 5533 + /* 5515 5534 * drm_edid_to_eld - build ELD from EDID 5516 5535 * @connector: connector corresponding to the HDMI/DP sink 5517 5536 * @drm_edid: EDID to parse ··· 5616 5593 } 5617 5594 5618 5595 static int _drm_edid_to_sad(const struct drm_edid *drm_edid, 5619 - struct cea_sad **sads) 5596 + struct cea_sad **psads) 5620 5597 { 5621 5598 const struct cea_db *db; 5622 5599 struct cea_db_iter iter; ··· 5625 5602 cea_db_iter_edid_begin(drm_edid, &iter); 5626 5603 cea_db_iter_for_each(db, &iter) { 5627 5604 if (cea_db_tag(db) == CTA_DB_AUDIO) { 5628 - int j; 5605 + struct cea_sad *sads; 5606 + int i; 5629 5607 5630 5608 count = cea_db_payload_len(db) / 3; /* SAD is 3B */ 5631 - *sads = kcalloc(count, sizeof(**sads), GFP_KERNEL); 5632 - if (!*sads) 5609 + sads = kcalloc(count, sizeof(*sads), GFP_KERNEL); 5610 + *psads = sads; 5611 + if (!sads) 5633 5612 return -ENOMEM; 5634 - for (j = 0; j < count; j++) { 5635 - const u8 *sad = &db->data[j * 3]; 5636 - 5637 - (*sads)[j].format = (sad[0] & 
0x78) >> 3; 5638 - (*sads)[j].channels = sad[0] & 0x7; 5639 - (*sads)[j].freq = sad[1] & 0x7F; 5640 - (*sads)[j].byte2 = sad[2]; 5641 - } 5613 + for (i = 0; i < count; i++) 5614 + drm_edid_cta_sad_set(&sads[i], &db->data[i * 3]); 5642 5615 break; 5643 5616 } 5644 5617 }

+55

drivers/gpu/drm/drm_eld.c

··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2023 Intel Corporation 4 + */ 5 + 6 + #include <drm/drm_edid.h> 7 + #include <drm/drm_eld.h> 8 + 9 + #include "drm_internal.h" 10 + 11 + /** 12 + * drm_eld_sad_get - get SAD from ELD to struct cea_sad 13 + * @eld: ELD buffer 14 + * @sad_index: SAD index 15 + * @cta_sad: destination struct cea_sad 16 + * 17 + * @return: 0 on success, or negative on errors 18 + */ 19 + int drm_eld_sad_get(const u8 *eld, int sad_index, struct cea_sad *cta_sad) 20 + { 21 + const u8 *sad; 22 + 23 + if (sad_index >= drm_eld_sad_count(eld)) 24 + return -EINVAL; 25 + 26 + sad = eld + DRM_ELD_CEA_SAD(drm_eld_mnl(eld), sad_index); 27 + 28 + drm_edid_cta_sad_set(cta_sad, sad); 29 + 30 + return 0; 31 + } 32 + EXPORT_SYMBOL(drm_eld_sad_get); 33 + 34 + /** 35 + * drm_eld_sad_set - set SAD to ELD from struct cea_sad 36 + * @eld: ELD buffer 37 + * @sad_index: SAD index 38 + * @cta_sad: source struct cea_sad 39 + * 40 + * @return: 0 on success, or negative on errors 41 + */ 42 + int drm_eld_sad_set(u8 *eld, int sad_index, const struct cea_sad *cta_sad) 43 + { 44 + u8 *sad; 45 + 46 + if (sad_index >= drm_eld_sad_count(eld)) 47 + return -EINVAL; 48 + 49 + sad = eld + DRM_ELD_CEA_SAD(drm_eld_mnl(eld), sad_index); 50 + 51 + drm_edid_cta_sad_get(cta_sad, sad); 52 + 53 + return 0; 54 + } 55 + EXPORT_SYMBOL(drm_eld_sad_set);
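Note on the SAD helpers: drm_eld_sad_get() fills a struct cea_sad by calling drm_edid_cta_sad_set(), and drm_eld_sad_set() writes the ELD bytes by calling drm_edid_cta_sad_get(). The names look crossed because each helper is named after the object it acts on. The 3-byte packing itself can be sketched in userspace as follows. This is only an illustration of the bit layout shown in the diff: `struct sad`, `sad_unpack()` and `sad_pack()` are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct cea_sad. */
struct sad {
	uint8_t format;   /* audio format code, bits 6:3 of byte 0 */
	uint8_t channels; /* channel field, bits 2:0 of byte 0 */
	uint8_t freq;     /* sample-rate bitmask, bits 6:0 of byte 1 */
	uint8_t byte2;    /* format-dependent third byte */
};

/* Mirrors the masks used by drm_edid_cta_sad_set() in the diff. */
static void sad_unpack(struct sad *s, const uint8_t *raw)
{
	assert(s && raw);
	s->format = (raw[0] & 0x78) >> 3;
	s->channels = raw[0] & 0x07;
	s->freq = raw[1] & 0x7f;
	s->byte2 = raw[2];
}

/* Mirrors drm_edid_cta_sad_get(): struct fields back into 3 raw bytes. */
static void sad_pack(const struct sad *s, uint8_t *raw)
{
	assert(s && raw);
	raw[0] = (uint8_t)(s->format << 3 | s->channels);
	raw[1] = s->freq;
	raw[2] = s->byte2;
}
```

Unpacking masks off the reserved top bits of bytes 0 and 1, so a pack after an unpack reproduces the input only when those reserved bits were already zero.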

+1 -1

drivers/gpu/drm/drm_file.c

··· 913 913 unsigned u; 914 914 915 915 for (u = 0; u < ARRAY_SIZE(units) - 1; u++) { 916 - if (sz < SZ_1K) 916 + if (sz == 0 || !IS_ALIGNED(sz, SZ_1K)) 917 917 break; 918 918 sz = div_u64(sz, SZ_1K); 919 919 }
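Note on the drm_file.c change: the old condition stopped scaling once the value dropped below 1 KiB, so a size such as 1536 bytes was divided once and lost its remainder. The new condition scales only while the value is non-zero and an exact multiple of 1 KiB, so the printed number is always exact. A minimal userspace sketch of that loop follows. `pick_unit()` is a hypothetical name, and the three-entry unit table is an assumption, since the hunk does not show the table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Scale *sz down by 1024 for as long as no precision is lost and
 * return the matching unit suffix. The unit table is an assumption.
 */
static const char *pick_unit(uint64_t *sz)
{
	static const char *const units[] = { "", " KiB", " MiB" };
	unsigned int u;

	assert(sz);
	for (u = 0; u < sizeof(units) / sizeof(units[0]) - 1; u++) {
		/* Same test as the new hunk: stop on zero or unaligned. */
		if (*sz == 0 || (*sz % 1024) != 0)
			break;
		*sz /= 1024;
	}
	return units[u];
}
```

With this rule, 2048 becomes "2 KiB" while 1536 stays "1536" with no suffix, where the previous rule would have reported it as "1 KiB".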

+7 -20

drivers/gpu/drm/drm_flip_work.c

··· 27 27 #include <drm/drm_print.h> 28 28 #include <drm/drm_util.h> 29 29 30 - /** 31 - * drm_flip_work_allocate_task - allocate a flip-work task 32 - * @data: data associated to the task 33 - * @flags: allocator flags 34 - * 35 - * Allocate a drm_flip_task object and attach private data to it. 36 - */ 37 - struct drm_flip_task *drm_flip_work_allocate_task(void *data, gfp_t flags) 30 + struct drm_flip_task { 31 + struct list_head node; 32 + void *data; 33 + }; 34 + 35 + static struct drm_flip_task *drm_flip_work_allocate_task(void *data, gfp_t flags) 38 36 { 39 37 struct drm_flip_task *task; 40 38 ··· 42 44 43 45 return task; 44 46 } 45 - EXPORT_SYMBOL(drm_flip_work_allocate_task); 46 47 47 - /** 48 - * drm_flip_work_queue_task - queue a specific task 49 - * @work: the flip-work 50 - * @task: the task to handle 51 - * 52 - * Queues task, that will later be run (passed back to drm_flip_func_t 53 - * func) on a work queue after drm_flip_work_commit() is called. 54 - */ 55 - void drm_flip_work_queue_task(struct drm_flip_work *work, 56 - struct drm_flip_task *task) 48 + static void drm_flip_work_queue_task(struct drm_flip_work *work, struct drm_flip_task *task) 57 49 { 58 50 unsigned long flags; 59 51 ··· 51 63 list_add_tail(&task->node, &work->queued); 52 64 spin_unlock_irqrestore(&work->lock, flags); 53 65 } 54 - EXPORT_SYMBOL(drm_flip_work_queue_task); 55 66 56 67 /** 57 68 * drm_flip_work_queue - queue work

+162 -53

drivers/gpu/drm/drm_format_helper.c

··· 20 20 #include <drm/drm_print.h> 21 21 #include <drm/drm_rect.h> 22 22 23 + /** 24 + * drm_format_conv_state_init - Initialize format-conversion state 25 + * @state: The state to initialize 26 + * 27 + * Clears all fields in struct drm_format_conv_state. The state will 28 + * be empty with no preallocated resources. 29 + */ 30 + void drm_format_conv_state_init(struct drm_format_conv_state *state) 31 + { 32 + state->tmp.mem = NULL; 33 + state->tmp.size = 0; 34 + state->tmp.preallocated = false; 35 + } 36 + EXPORT_SYMBOL(drm_format_conv_state_init); 37 + 38 + /** 39 + * drm_format_conv_state_copy - Copy format-conversion state 40 + * @state: Destination state 41 + * @old_state: Source state 42 + * 43 + * Copies format-conversion state from @old_state to @state; except for 44 + * temporary storage. 45 + */ 46 + void drm_format_conv_state_copy(struct drm_format_conv_state *state, 47 + const struct drm_format_conv_state *old_state) 48 + { 49 + /* 50 + * So far, there's only temporary storage here, which we don't 51 + * duplicate. Just clear the fields. 52 + */ 53 + state->tmp.mem = NULL; 54 + state->tmp.size = 0; 55 + state->tmp.preallocated = false; 56 + } 57 + EXPORT_SYMBOL(drm_format_conv_state_copy); 58 + 59 + /** 60 + * drm_format_conv_state_reserve - Allocates storage for format conversion 61 + * @state: The format-conversion state 62 + * @new_size: The minimum allocation size 63 + * @flags: Flags for kmalloc() 64 + * 65 + * Allocates at least @new_size bytes and returns a pointer to the memory 66 + * range. After calling this function, previously returned memory blocks 67 + * are invalid. It's best to collect all memory requirements of a format 68 + * conversion and call this function once to allocate the range. 69 + * 70 + * Returns: 71 + * A pointer to the allocated memory range, or NULL otherwise. 
72 + */ 73 + void *drm_format_conv_state_reserve(struct drm_format_conv_state *state, 74 + size_t new_size, gfp_t flags) 75 + { 76 + void *mem; 77 + 78 + if (new_size <= state->tmp.size) 79 + goto out; 80 + else if (state->tmp.preallocated) 81 + return NULL; 82 + 83 + mem = krealloc(state->tmp.mem, new_size, flags); 84 + if (!mem) 85 + return NULL; 86 + 87 + state->tmp.mem = mem; 88 + state->tmp.size = new_size; 89 + 90 + out: 91 + return state->tmp.mem; 92 + } 93 + EXPORT_SYMBOL(drm_format_conv_state_reserve); 94 + 95 + /** 96 + * drm_format_conv_state_release - Releases an format-conversion storage 97 + * @state: The format-conversion state 98 + * 99 + * Releases the memory range references by the format-conversion state. 100 + * After this call, all pointers to the memory are invalid. Prefer 101 + * drm_format_conv_state_init() for cleaning up and unloading a driver. 102 + */ 103 + void drm_format_conv_state_release(struct drm_format_conv_state *state) 104 + { 105 + if (state->tmp.preallocated) 106 + return; 107 + 108 + kfree(state->tmp.mem); 109 + state->tmp.mem = NULL; 110 + state->tmp.size = 0; 111 + } 112 + EXPORT_SYMBOL(drm_format_conv_state_release); 113 + 23 114 static unsigned int clip_offset(const struct drm_rect *clip, unsigned int pitch, unsigned int cpp) 24 115 { 25 116 return clip->y1 * pitch + clip->x1 * cpp; ··· 136 45 static int __drm_fb_xfrm(void *dst, unsigned long dst_pitch, unsigned long dst_pixsize, 137 46 const void *vaddr, const struct drm_framebuffer *fb, 138 47 const struct drm_rect *clip, bool vaddr_cached_hint, 48 + struct drm_format_conv_state *state, 139 49 void (*xfrm_line)(void *dbuf, const void *sbuf, unsigned int npixels)) 140 50 { 141 51 unsigned long linepixels = drm_rect_width(clip); ··· 152 60 * one line at a time. 
153 61 */ 154 62 if (!vaddr_cached_hint) { 155 - stmp = kmalloc(sbuf_len, GFP_KERNEL); 63 + stmp = drm_format_conv_state_reserve(state, sbuf_len, GFP_KERNEL); 156 64 if (!stmp) 157 65 return -ENOMEM; 158 66 } ··· 171 79 dst += dst_pitch; 172 80 } 173 81 174 - kfree(stmp); 175 - 176 82 return 0; 177 83 } 178 84 ··· 178 88 static int __drm_fb_xfrm_toio(void __iomem *dst, unsigned long dst_pitch, unsigned long dst_pixsize, 179 89 const void *vaddr, const struct drm_framebuffer *fb, 180 90 const struct drm_rect *clip, bool vaddr_cached_hint, 91 + struct drm_format_conv_state *state, 181 92 void (*xfrm_line)(void *dbuf, const void *sbuf, unsigned int npixels)) 182 93 { 183 94 unsigned long linepixels = drm_rect_width(clip); ··· 192 101 void *dbuf; 193 102 194 103 if (vaddr_cached_hint) { 195 - dbuf = kmalloc(dbuf_len, GFP_KERNEL); 104 + dbuf = drm_format_conv_state_reserve(state, dbuf_len, GFP_KERNEL); 196 105 } else { 197 - dbuf = kmalloc(stmp_off + sbuf_len, GFP_KERNEL); 106 + dbuf = drm_format_conv_state_reserve(state, stmp_off + sbuf_len, GFP_KERNEL); 198 107 stmp = dbuf + stmp_off; 199 108 } 200 109 if (!dbuf) ··· 215 124 dst += dst_pitch; 216 125 } 217 126 218 - kfree(dbuf); 219 - 220 127 return 0; 221 128 } 222 129 ··· 223 134 const unsigned int *dst_pitch, const u8 *dst_pixsize, 224 135 const struct iosys_map *src, const struct drm_framebuffer *fb, 225 136 const struct drm_rect *clip, bool vaddr_cached_hint, 137 + struct drm_format_conv_state *state, 226 138 void (*xfrm_line)(void *dbuf, const void *sbuf, unsigned int npixels)) 227 139 { 228 140 static const unsigned int default_dst_pitch[DRM_FORMAT_MAX_PLANES] = { ··· 236 146 /* TODO: handle src in I/O memory here */ 237 147 if (dst[0].is_iomem) 238 148 return __drm_fb_xfrm_toio(dst[0].vaddr_iomem, dst_pitch[0], dst_pixsize[0], 239 - src[0].vaddr, fb, clip, vaddr_cached_hint, xfrm_line); 149 + src[0].vaddr, fb, clip, vaddr_cached_hint, state, 150 + xfrm_line); 240 151 else 241 152 return 
__drm_fb_xfrm(dst[0].vaddr, dst_pitch[0], dst_pixsize[0], 242 - src[0].vaddr, fb, clip, vaddr_cached_hint, xfrm_line); 153 + src[0].vaddr, fb, clip, vaddr_cached_hint, state, 154 + xfrm_line); 243 155 } 244 156 245 157 /** ··· 327 235 * @fb: DRM framebuffer 328 236 * @clip: Clip rectangle area to copy 329 237 * @cached: Source buffer is mapped cached (eg. not write-combined) 238 + * @state: Transform and conversion state 330 239 * 331 240 * This function copies parts of a framebuffer to display memory and swaps per-pixel 332 241 * bytes during the process. Destination and framebuffer formats must match. The ··· 342 249 */ 343 250 void drm_fb_swab(struct iosys_map *dst, const unsigned int *dst_pitch, 344 251 const struct iosys_map *src, const struct drm_framebuffer *fb, 345 - const struct drm_rect *clip, bool cached) 252 + const struct drm_rect *clip, bool cached, 253 + struct drm_format_conv_state *state) 346 254 { 347 255 const struct drm_format_info *format = fb->format; 348 256 u8 cpp = DIV_ROUND_UP(drm_format_info_bpp(format, 0), 8); ··· 362 268 return; 363 269 } 364 270 365 - drm_fb_xfrm(dst, dst_pitch, &cpp, src, fb, clip, cached, swab_line); 271 + drm_fb_xfrm(dst, dst_pitch, &cpp, src, fb, clip, cached, state, swab_line); 366 272 } 367 273 EXPORT_SYMBOL(drm_fb_swab); 368 274 ··· 389 295 * @src: Array of XRGB8888 source buffers 390 296 * @fb: DRM framebuffer 391 297 * @clip: Clip rectangle area to copy 298 + * @state: Transform and conversion state 392 299 * 393 300 * This function copies parts of a framebuffer to display memory and converts the 394 301 * color format during the process. Destination and framebuffer formats must match. 
The ··· 404 309 */ 405 310 void drm_fb_xrgb8888_to_rgb332(struct iosys_map *dst, const unsigned int *dst_pitch, 406 311 const struct iosys_map *src, const struct drm_framebuffer *fb, 407 - const struct drm_rect *clip) 312 + const struct drm_rect *clip, struct drm_format_conv_state *state) 408 313 { 409 314 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 410 315 1, 411 316 }; 412 317 413 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 318 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 414 319 drm_fb_xrgb8888_to_rgb332_line); 415 320 } 416 321 EXPORT_SYMBOL(drm_fb_xrgb8888_to_rgb332); ··· 459 364 * @src: Array of XRGB8888 source buffer 460 365 * @fb: DRM framebuffer 461 366 * @clip: Clip rectangle area to copy 367 + * @state: Transform and conversion state 462 368 * @swab: Swap bytes 463 369 * 464 370 * This function copies parts of a framebuffer to display memory and converts the ··· 475 379 */ 476 380 void drm_fb_xrgb8888_to_rgb565(struct iosys_map *dst, const unsigned int *dst_pitch, 477 381 const struct iosys_map *src, const struct drm_framebuffer *fb, 478 - const struct drm_rect *clip, bool swab) 382 + const struct drm_rect *clip, struct drm_format_conv_state *state, 383 + bool swab) 479 384 { 480 385 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 481 386 2, ··· 489 392 else 490 393 xfrm_line = drm_fb_xrgb8888_to_rgb565_line; 491 394 492 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, xfrm_line); 395 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, xfrm_line); 493 396 } 494 397 EXPORT_SYMBOL(drm_fb_xrgb8888_to_rgb565); 495 398 ··· 518 421 * @src: Array of XRGB8888 source buffer 519 422 * @fb: DRM framebuffer 520 423 * @clip: Clip rectangle area to copy 424 + * @state: Transform and conversion state 521 425 * 522 426 * This function copies parts of a framebuffer to display memory and converts 523 427 * the color format during the process. 
The parameters @dst, @dst_pitch and ··· 534 436 */ 535 437 void drm_fb_xrgb8888_to_xrgb1555(struct iosys_map *dst, const unsigned int *dst_pitch, 536 438 const struct iosys_map *src, const struct drm_framebuffer *fb, 537 - const struct drm_rect *clip) 439 + const struct drm_rect *clip, struct drm_format_conv_state *state) 538 440 { 539 441 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 540 442 2, 541 443 }; 542 444 543 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 445 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 544 446 drm_fb_xrgb8888_to_xrgb1555_line); 545 447 } 546 448 EXPORT_SYMBOL(drm_fb_xrgb8888_to_xrgb1555); ··· 571 473 * @src: Array of XRGB8888 source buffer 572 474 * @fb: DRM framebuffer 573 475 * @clip: Clip rectangle area to copy 476 + * @state: Transform and conversion state 574 477 * 575 478 * This function copies parts of a framebuffer to display memory and converts 576 479 * the color format during the process. The parameters @dst, @dst_pitch and ··· 587 488 */ 588 489 void drm_fb_xrgb8888_to_argb1555(struct iosys_map *dst, const unsigned int *dst_pitch, 589 490 const struct iosys_map *src, const struct drm_framebuffer *fb, 590 - const struct drm_rect *clip) 491 + const struct drm_rect *clip, struct drm_format_conv_state *state) 591 492 { 592 493 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 593 494 2, 594 495 }; 595 496 596 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 497 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 597 498 drm_fb_xrgb8888_to_argb1555_line); 598 499 } 599 500 EXPORT_SYMBOL(drm_fb_xrgb8888_to_argb1555); ··· 624 525 * @src: Array of XRGB8888 source buffer 625 526 * @fb: DRM framebuffer 626 527 * @clip: Clip rectangle area to copy 528 + * @state: Transform and conversion state 627 529 * 628 530 * This function copies parts of a framebuffer to display memory and converts 629 531 * the color format during the process. 
The parameters @dst, @dst_pitch and ··· 640 540 */ 641 541 void drm_fb_xrgb8888_to_rgba5551(struct iosys_map *dst, const unsigned int *dst_pitch, 642 542 const struct iosys_map *src, const struct drm_framebuffer *fb, 643 - const struct drm_rect *clip) 543 + const struct drm_rect *clip, struct drm_format_conv_state *state) 644 544 { 645 545 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 646 546 2, 647 547 }; 648 548 649 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 549 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 650 550 drm_fb_xrgb8888_to_rgba5551_line); 651 551 } 652 552 EXPORT_SYMBOL(drm_fb_xrgb8888_to_rgba5551); ··· 675 575 * @src: Array of XRGB8888 source buffers 676 576 * @fb: DRM framebuffer 677 577 * @clip: Clip rectangle area to copy 578 + * @state: Transform and conversion state 678 579 * 679 580 * This function copies parts of a framebuffer to display memory and converts the 680 581 * color format during the process. Destination and framebuffer formats must match. The ··· 691 590 */ 692 591 void drm_fb_xrgb8888_to_rgb888(struct iosys_map *dst, const unsigned int *dst_pitch, 693 592 const struct iosys_map *src, const struct drm_framebuffer *fb, 694 - const struct drm_rect *clip) 593 + const struct drm_rect *clip, struct drm_format_conv_state *state) 695 594 { 696 595 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 697 596 3, 698 597 }; 699 598 700 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 599 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 701 600 drm_fb_xrgb8888_to_rgb888_line); 702 601 } 703 602 EXPORT_SYMBOL(drm_fb_xrgb8888_to_rgb888); ··· 724 623 * @src: Array of XRGB8888 source buffer 725 624 * @fb: DRM framebuffer 726 625 * @clip: Clip rectangle area to copy 626 + * @state: Transform and conversion state 727 627 * 728 628 * This function copies parts of a framebuffer to display memory and converts the 729 629 * color format during the process. 
The parameters @dst, @dst_pitch and @src refer ··· 740 638 */ 741 639 void drm_fb_xrgb8888_to_argb8888(struct iosys_map *dst, const unsigned int *dst_pitch, 742 640 const struct iosys_map *src, const struct drm_framebuffer *fb, 743 - const struct drm_rect *clip) 641 + const struct drm_rect *clip, struct drm_format_conv_state *state) 744 642 { 745 643 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 746 644 4, 747 645 }; 748 646 749 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 647 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 750 648 drm_fb_xrgb8888_to_argb8888_line); 751 649 } 752 650 EXPORT_SYMBOL(drm_fb_xrgb8888_to_argb8888); ··· 771 669 static void drm_fb_xrgb8888_to_abgr8888(struct iosys_map *dst, const unsigned int *dst_pitch, 772 670 const struct iosys_map *src, 773 671 const struct drm_framebuffer *fb, 774 - const struct drm_rect *clip) 672 + const struct drm_rect *clip, 673 + struct drm_format_conv_state *state) 775 674 { 776 675 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 777 676 4, 778 677 }; 779 678 780 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 679 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 781 680 drm_fb_xrgb8888_to_abgr8888_line); 782 681 } 783 682 ··· 802 699 static void drm_fb_xrgb8888_to_xbgr8888(struct iosys_map *dst, const unsigned int *dst_pitch, 803 700 const struct iosys_map *src, 804 701 const struct drm_framebuffer *fb, 805 - const struct drm_rect *clip) 702 + const struct drm_rect *clip, 703 + struct drm_format_conv_state *state) 806 704 { 807 705 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 808 706 4, 809 707 }; 810 708 811 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 709 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 812 710 drm_fb_xrgb8888_to_xbgr8888_line); 813 711 } 814 712 ··· 839 735 * @src: Array of XRGB8888 source buffers 840 736 * @fb: DRM framebuffer 841 737 * 
@clip: Clip rectangle area to copy 738 + * @state: Transform and conversion state 842 739 * 843 740 * This function copies parts of a framebuffer to display memory and converts the 844 741 * color format during the process. Destination and framebuffer formats must match. The ··· 855 750 */ 856 751 void drm_fb_xrgb8888_to_xrgb2101010(struct iosys_map *dst, const unsigned int *dst_pitch, 857 752 const struct iosys_map *src, const struct drm_framebuffer *fb, 858 - const struct drm_rect *clip) 753 + const struct drm_rect *clip, 754 + struct drm_format_conv_state *state) 859 755 { 860 756 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 861 757 4, 862 758 }; 863 759 864 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 760 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 865 761 drm_fb_xrgb8888_to_xrgb2101010_line); 866 762 } 867 763 EXPORT_SYMBOL(drm_fb_xrgb8888_to_xrgb2101010); ··· 894 788 * @src: Array of XRGB8888 source buffers 895 789 * @fb: DRM framebuffer 896 790 * @clip: Clip rectangle area to copy 791 + * @state: Transform and conversion state 897 792 * 898 793 * This function copies parts of a framebuffer to display memory and converts 899 794 * the color format during the process. 
The parameters @dst, @dst_pitch and ··· 910 803 */ 911 804 void drm_fb_xrgb8888_to_argb2101010(struct iosys_map *dst, const unsigned int *dst_pitch, 912 805 const struct iosys_map *src, const struct drm_framebuffer *fb, 913 - const struct drm_rect *clip) 806 + const struct drm_rect *clip, 807 + struct drm_format_conv_state *state) 914 808 { 915 809 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 916 810 4, 917 811 }; 918 812 919 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 813 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 920 814 drm_fb_xrgb8888_to_argb2101010_line); 921 815 } 922 816 EXPORT_SYMBOL(drm_fb_xrgb8888_to_argb2101010); ··· 947 839 * @src: Array of XRGB8888 source buffers 948 840 * @fb: DRM framebuffer 949 841 * @clip: Clip rectangle area to copy 842 + * @state: Transform and conversion state 950 843 * 951 844 * This function copies parts of a framebuffer to display memory and converts the 952 845 * color format during the process. Destination and framebuffer formats must match. The ··· 967 858 */ 968 859 void drm_fb_xrgb8888_to_gray8(struct iosys_map *dst, const unsigned int *dst_pitch, 969 860 const struct iosys_map *src, const struct drm_framebuffer *fb, 970 - const struct drm_rect *clip) 861 + const struct drm_rect *clip, struct drm_format_conv_state *state) 971 862 { 972 863 static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 973 864 1, 974 865 }; 975 866 976 - drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, 867 + drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state, 977 868 drm_fb_xrgb8888_to_gray8_line); 978 869 } 979 870 EXPORT_SYMBOL(drm_fb_xrgb8888_to_gray8); ··· 987 878 * @src: The framebuffer memory to copy from 988 879 * @fb: The framebuffer to copy from 989 880 * @clip: Clip rectangle area to copy 881 + * @state: Transform and conversion state 990 882 * 991 883 * This function copies parts of a framebuffer to display memory. 
If the 992 884 * formats of the display and the framebuffer mismatch, the blit function ··· 1006 896 */ 1007 897 int drm_fb_blit(struct iosys_map *dst, const unsigned int *dst_pitch, uint32_t dst_format, 1008 898 const struct iosys_map *src, const struct drm_framebuffer *fb, 1009 - const struct drm_rect *clip) 899 + const struct drm_rect *clip, struct drm_format_conv_state *state) 1010 900 { 1011 901 uint32_t fb_format = fb->format->format; 1012 902 ··· 1014 904 drm_fb_memcpy(dst, dst_pitch, src, fb, clip); 1015 905 return 0; 1016 906 } else if (fb_format == (dst_format | DRM_FORMAT_BIG_ENDIAN)) { 1017 - drm_fb_swab(dst, dst_pitch, src, fb, clip, false); 907 + drm_fb_swab(dst, dst_pitch, src, fb, clip, false, state); 1018 908 return 0; 1019 909 } else if (fb_format == (dst_format & ~DRM_FORMAT_BIG_ENDIAN)) { 1020 - drm_fb_swab(dst, dst_pitch, src, fb, clip, false); 910 + drm_fb_swab(dst, dst_pitch, src, fb, clip, false, state); 1021 911 return 0; 1022 912 } else if (fb_format == DRM_FORMAT_XRGB8888) { 1023 913 if (dst_format == DRM_FORMAT_RGB565) { 1024 - drm_fb_xrgb8888_to_rgb565(dst, dst_pitch, src, fb, clip, false); 914 + drm_fb_xrgb8888_to_rgb565(dst, dst_pitch, src, fb, clip, state, false); 1025 915 return 0; 1026 916 } else if (dst_format == DRM_FORMAT_XRGB1555) { 1027 - drm_fb_xrgb8888_to_xrgb1555(dst, dst_pitch, src, fb, clip); 917 + drm_fb_xrgb8888_to_xrgb1555(dst, dst_pitch, src, fb, clip, state); 1028 918 return 0; 1029 919 } else if (dst_format == DRM_FORMAT_ARGB1555) { 1030 - drm_fb_xrgb8888_to_argb1555(dst, dst_pitch, src, fb, clip); 920 + drm_fb_xrgb8888_to_argb1555(dst, dst_pitch, src, fb, clip, state); 1031 921 return 0; 1032 922 } else if (dst_format == DRM_FORMAT_RGBA5551) { 1033 - drm_fb_xrgb8888_to_rgba5551(dst, dst_pitch, src, fb, clip); 923 + drm_fb_xrgb8888_to_rgba5551(dst, dst_pitch, src, fb, clip, state); 1034 924 return 0; 1035 925 } else if (dst_format == DRM_FORMAT_RGB888) { 1036 - drm_fb_xrgb8888_to_rgb888(dst, dst_pitch, src, fb, 
clip); 926 + drm_fb_xrgb8888_to_rgb888(dst, dst_pitch, src, fb, clip, state); 1037 927 return 0; 1038 928 } else if (dst_format == DRM_FORMAT_ARGB8888) { 1039 - drm_fb_xrgb8888_to_argb8888(dst, dst_pitch, src, fb, clip); 929 + drm_fb_xrgb8888_to_argb8888(dst, dst_pitch, src, fb, clip, state); 1040 930 return 0; 1041 931 } else if (dst_format == DRM_FORMAT_XBGR8888) { 1042 - drm_fb_xrgb8888_to_xbgr8888(dst, dst_pitch, src, fb, clip); 932 + drm_fb_xrgb8888_to_xbgr8888(dst, dst_pitch, src, fb, clip, state); 1043 933 return 0; 1044 934 } else if (dst_format == DRM_FORMAT_ABGR8888) { 1045 - drm_fb_xrgb8888_to_abgr8888(dst, dst_pitch, src, fb, clip); 935 + drm_fb_xrgb8888_to_abgr8888(dst, dst_pitch, src, fb, clip, state); 1046 936 return 0; 1047 937 } else if (dst_format == DRM_FORMAT_XRGB2101010) { 1048 - drm_fb_xrgb8888_to_xrgb2101010(dst, dst_pitch, src, fb, clip); 938 + drm_fb_xrgb8888_to_xrgb2101010(dst, dst_pitch, src, fb, clip, state); 1049 939 return 0; 1050 940 } else if (dst_format == DRM_FORMAT_ARGB2101010) { 1051 - drm_fb_xrgb8888_to_argb2101010(dst, dst_pitch, src, fb, clip); 941 + drm_fb_xrgb8888_to_argb2101010(dst, dst_pitch, src, fb, clip, state); 1052 942 return 0; 1053 943 } else if (dst_format == DRM_FORMAT_BGRX8888) { 1054 - drm_fb_swab(dst, dst_pitch, src, fb, clip, false); 944 + drm_fb_swab(dst, dst_pitch, src, fb, clip, false, state); 1055 945 return 0; 1056 946 } 1057 947 } ··· 1088 978 * @src: Array of XRGB8888 source buffers 1089 979 * @fb: DRM framebuffer 1090 980 * @clip: Clip rectangle area to copy 981 + * @state: Transform and conversion state 1091 982 * 1092 983 * This function copies parts of a framebuffer to display memory and converts the 1093 984 * color format during the process. Destination and framebuffer formats must match. 
The ··· 1113 1002 */ 1114 1003 void drm_fb_xrgb8888_to_mono(struct iosys_map *dst, const unsigned int *dst_pitch, 1115 1004 const struct iosys_map *src, const struct drm_framebuffer *fb, 1116 - const struct drm_rect *clip) 1005 + const struct drm_rect *clip, struct drm_format_conv_state *state) 1117 1006 { 1118 1007 static const unsigned int default_dst_pitch[DRM_FORMAT_MAX_PLANES] = { 1119 1008 0, 0, 0, 0 ··· 1153 1042 * Allocate a buffer to be used for both copying from the cma 1154 1043 * memory and to store the intermediate grayscale line pixels. 1155 1044 */ 1156 - src32 = kmalloc(len_src32 + linepixels, GFP_KERNEL); 1045 + src32 = drm_format_conv_state_reserve(state, len_src32 + linepixels, GFP_KERNEL); 1157 1046 if (!src32) 1158 1047 return; 1159 1048 ··· 1167 1056 vaddr += fb->pitches[0]; 1168 1057 mono += dst_pitch_0; 1169 1058 } 1170 - 1171 - kfree(src32); 1172 1059 } 1173 1060 EXPORT_SYMBOL(drm_fb_xrgb8888_to_mono); 1174 1061
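The conversion helpers above now take their scratch memory from the `state` argument instead of calling kmalloc() and kfree() on every invocation. The following is a minimal userspace sketch of that reserve-and-reuse idea, not the kernel implementation: `struct conv_state` and the `conv_state_*` functions are hypothetical names, malloc() stands in for the kernel allocator, and the grow policy shown here is an assumption.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct drm_format_conv_state. */
struct conv_state {
	void *mem;	/* scratch buffer, owned by the state */
	size_t size;	/* current capacity in bytes */
};

static void conv_state_init(struct conv_state *s)
{
	s->mem = NULL;
	s->size = 0;
}

/*
 * Return a buffer of at least new_size bytes. A buffer that is already
 * large enough is reused; otherwise it is replaced by a bigger one.
 * Contents are not preserved across a grow, which is acceptable for
 * per-line scratch data (assumption of this sketch).
 */
static void *conv_state_reserve(struct conv_state *s, size_t new_size)
{
	void *mem;

	if (new_size <= s->size)
		return s->mem;

	mem = malloc(new_size);
	if (!mem)
		return NULL;	/* the old buffer stays valid on failure */

	free(s->mem);
	s->mem = mem;
	s->size = new_size;
	return mem;
}

/* Free the buffer once, when the owning state goes away. */
static void conv_state_release(struct conv_state *s)
{
	free(s->mem);
	conv_state_init(s);
}
```

In this model a caller that converts one frame after another pays for an allocation only when a larger buffer is needed, and the matching free happens once, when the state is released.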

+53 -22

drivers/gpu/drm/drm_framebuffer.c

··· 394 394
 	}
 }

+static int drm_mode_closefb(struct drm_framebuffer *fb,
+			    struct drm_file *file_priv)
+{
+	struct drm_framebuffer *fbl;
+	bool found = false;
+
+	mutex_lock(&file_priv->fbs_lock);
+	list_for_each_entry(fbl, &file_priv->fbs, filp_head)
+		if (fb == fbl)
+			found = true;
+
+	if (!found) {
+		mutex_unlock(&file_priv->fbs_lock);
+		return -ENOENT;
+	}
+
+	list_del_init(&fb->filp_head);
+	mutex_unlock(&file_priv->fbs_lock);
+
+	/* Drop the reference that was stored in the fbs list */
+	drm_framebuffer_put(fb);
+
+	return 0;
+}
+
 /**
  * drm_mode_rmfb - remove an FB from the configuration
  * @dev: drm device
··· 435 410
 int drm_mode_rmfb(struct drm_device *dev, u32 fb_id,
 		  struct drm_file *file_priv)
 {
-	struct drm_framebuffer *fb = NULL;
-	struct drm_framebuffer *fbl = NULL;
-	int found = 0;
+	struct drm_framebuffer *fb;
+	int ret;

 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		return -EOPNOTSUPP;
··· 445 421
 	if (!fb)
 		return -ENOENT;

-	mutex_lock(&file_priv->fbs_lock);
-	list_for_each_entry(fbl, &file_priv->fbs, filp_head)
-		if (fb == fbl)
-			found = 1;
-	if (!found) {
-		mutex_unlock(&file_priv->fbs_lock);
-		goto fail_unref;
+	ret = drm_mode_closefb(fb, file_priv);
+	if (ret != 0) {
+		drm_framebuffer_put(fb);
+		return ret;
 	}

-	list_del_init(&fb->filp_head);
-	mutex_unlock(&file_priv->fbs_lock);
-
-	/* drop the reference we picked up in framebuffer lookup */
-	drm_framebuffer_put(fb);
-
 	/*
-	 * we now own the reference that was stored in the fbs list
-	 *
 	 * drm_framebuffer_remove may fail with -EINTR on pending signals,
 	 * so run this in a separate stack as there's no way to correctly
 	 * handle this after the fb is already removed from the lookup table.
··· 470 457
 		drm_framebuffer_put(fb);

 	return 0;
-
-fail_unref:
-	drm_framebuffer_put(fb);
-	return -ENOENT;
 }

 int drm_mode_rmfb_ioctl(struct drm_device *dev,
··· 478 469
 	uint32_t *fb_id = data;

 	return drm_mode_rmfb(dev, *fb_id, file_priv);
+}
+
+int drm_mode_closefb_ioctl(struct drm_device *dev,
+			   void *data, struct drm_file *file_priv)
+{
+	struct drm_mode_closefb *r = data;
+	struct drm_framebuffer *fb;
+	int ret;
+
+	if (!drm_core_check_feature(dev, DRIVER_MODESET))
+		return -EOPNOTSUPP;
+
+	if (r->pad)
+		return -EINVAL;
+
+	fb = drm_framebuffer_lookup(dev, file_priv, r->fb_id);
+	if (!fb)
+		return -ENOENT;
+
+	ret = drm_mode_closefb(fb, file_priv);
+	drm_framebuffer_put(fb);
+	return ret;
 }

 /**
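The reference handling in the new closefb path can be easy to misread in diff form, so here is a small userspace model of it. It is only a sketch under stated assumptions: `struct fb`, `struct file_ctx`, `fb_put()` and the `model_*` functions are hypothetical stand-ins, a plain integer replaces the kernel refcount, and a fixed array replaces the per-file `fbs` list and its lock.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_FBS 8

/* Hypothetical stand-ins for struct drm_framebuffer and struct drm_file. */
struct fb {
	int refcount;
};

struct file_ctx {
	struct fb *fbs[MAX_FBS];	/* each non-NULL slot holds one reference */
};

static void fb_put(struct fb *fb)
{
	fb->refcount--;
}

/*
 * Model of the shared helper: succeed only if this file owns the
 * framebuffer, then unlink it and drop the reference held by the list.
 */
static int model_closefb(struct file_ctx *file, struct fb *fb)
{
	size_t i;

	if (!fb)
		return -ENOENT;

	for (i = 0; i < MAX_FBS; i++) {
		if (file->fbs[i] == fb) {
			file->fbs[i] = NULL;
			fb_put(fb);
			return 0;
		}
	}
	return -ENOENT;
}

/* Model of the ioctl path: take a lookup reference, close, drop it again. */
static int model_closefb_ioctl(struct file_ctx *file, struct fb *fb,
			       unsigned int pad)
{
	int ret;

	if (pad)
		return -EINVAL;	/* padding must be zero */
	if (!fb)
		return -ENOENT;

	fb->refcount++;		/* stands in for drm_framebuffer_lookup() */
	ret = model_closefb(file, fb);
	fb_put(fb);		/* drop the lookup reference */
	return ret;
}
```

In this model, a framebuffer that still has another holder keeps a non-zero count after the close, while a file that does not own the framebuffer gets -ENOENT and leaves the count untouched.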

+9

drivers/gpu/drm/drm_gem_atomic_helper.c

··· 218 218
 __drm_gem_duplicate_shadow_plane_state(struct drm_plane *plane,
 				       struct drm_shadow_plane_state *new_shadow_plane_state)
 {
+	struct drm_plane_state *plane_state = plane->state;
+	struct drm_shadow_plane_state *shadow_plane_state =
+		to_drm_shadow_plane_state(plane_state);
+
 	__drm_atomic_helper_plane_duplicate_state(plane, &new_shadow_plane_state->base);
+
+	drm_format_conv_state_copy(&shadow_plane_state->fmtcnv_state,
+				   &new_shadow_plane_state->fmtcnv_state);
 }
 EXPORT_SYMBOL(__drm_gem_duplicate_shadow_plane_state);

··· 273 266
  */
 void __drm_gem_destroy_shadow_plane_state(struct drm_shadow_plane_state *shadow_plane_state)
 {
+	drm_format_conv_state_release(&shadow_plane_state->fmtcnv_state);
 	__drm_atomic_helper_plane_destroy_state(&shadow_plane_state->base);
 }
 EXPORT_SYMBOL(__drm_gem_destroy_shadow_plane_state);

··· 310 302
 				  struct drm_shadow_plane_state *shadow_plane_state)
 {
 	__drm_atomic_helper_plane_reset(plane, &shadow_plane_state->base);
+	drm_format_conv_state_init(&shadow_plane_state->fmtcnv_state);
 }
 EXPORT_SYMBOL(__drm_gem_reset_shadow_plane);

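The three hooks touched above tie the conversion state to the lifetime of the shadow plane state: init on reset, copy on duplicate, release on destroy. A rough userspace sketch of that ownership pattern follows. All names (`struct shadow_state`, `shadow_reset()`, `shadow_duplicate()`, `shadow_destroy()`) are hypothetical, and the behaviour of the duplicate step, where the new state starts without a buffer, is an assumption of this sketch rather than something the diff shows.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical stand-ins; field and function names are illustrative only. */
struct conv_state {
	void *mem;
	size_t size;
};

struct shadow_state {
	struct conv_state fmtcnv;
};

/* reset: a fresh plane state starts without a scratch buffer */
static void shadow_reset(struct shadow_state *st)
{
	st->fmtcnv.mem = NULL;
	st->fmtcnv.size = 0;
}

/*
 * duplicate: ASSUMPTION of this sketch: the new state does not inherit
 * the old buffer and would allocate its own on first use.
 */
static void shadow_duplicate(struct shadow_state *new_st,
			     const struct shadow_state *old_st)
{
	(void)old_st;
	new_st->fmtcnv.mem = NULL;
	new_st->fmtcnv.size = 0;
}

/* destroy: the buffer is freed together with the state that owns it */
static void shadow_destroy(struct shadow_state *st)
{
	free(st->fmtcnv.mem);
	st->fmtcnv.mem = NULL;
	st->fmtcnv.size = 0;
}
```

With each state owning at most one buffer, destroying the old state after a commit cannot invalidate memory that the new state relies on.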

+1060 -71

drivers/gpu/drm/drm_gpuvm.c

··· 61 61 * contained within struct drm_gpuva already. Hence, for inserting &drm_gpuva 62 62 * entries from within dma-fence signalling critical sections it is enough to 63 63 * pre-allocate the &drm_gpuva structures. 64 + * 65 + * &drm_gem_objects which are private to a single VM can share a common 66 + * &dma_resv in order to improve locking efficiency (e.g. with &drm_exec). 67 + * For this purpose drivers must pass a &drm_gem_object to drm_gpuvm_init(), in 68 + * the following called 'resv object', which serves as the container of the 69 + * GPUVM's shared &dma_resv. This resv object can be a driver specific 70 + * &drm_gem_object, such as the &drm_gem_object containing the root page table, 71 + * but it can also be a 'dummy' object, which can be allocated with 72 + * drm_gpuvm_resv_object_alloc(). 73 + * 74 + * In order to connect a struct drm_gpuva its backing &drm_gem_object each 75 + * &drm_gem_object maintains a list of &drm_gpuvm_bo structures, and each 76 + * &drm_gpuvm_bo contains a list of &drm_gpuva structures. 77 + * 78 + * A &drm_gpuvm_bo is an abstraction that represents a combination of a 79 + * &drm_gpuvm and a &drm_gem_object. Every such combination should be unique. 80 + * This is ensured by the API through drm_gpuvm_bo_obtain() and 81 + * drm_gpuvm_bo_obtain_prealloc() which first look into the corresponding 82 + * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this 83 + * particular combination. If not existent a new instance is created and linked 84 + * to the &drm_gem_object. 85 + * 86 + * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used 87 + * as entry for the &drm_gpuvm's lists of external and evicted objects. Those 88 + * lists are maintained in order to accelerate locking of dma-resv locks and 89 + * validation of evicted objects bound in a &drm_gpuvm. For instance, all 90 + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling 91 + * drm_gpuvm_exec_lock(). 
Once locked drivers can call drm_gpuvm_validate() in 92 + * order to validate all evicted &drm_gem_objects. It is also possible to lock 93 + * additional &drm_gem_objects by providing the corresponding parameters to 94 + * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making 95 + * use of helper functions such as drm_gpuvm_prepare_range() or 96 + * drm_gpuvm_prepare_objects(). 97 + * 98 + * Every bound &drm_gem_object is treated as external object when its &dma_resv 99 + * structure is different than the &drm_gpuvm's common &dma_resv structure. 64 100 */ 65 101 66 102 /** ··· 422 386 /** 423 387 * DOC: Locking 424 388 * 425 - * Generally, the GPU VA manager does not take care of locking itself, it is 426 - * the drivers responsibility to take care about locking. Drivers might want to 427 - * protect the following operations: inserting, removing and iterating 428 - * &drm_gpuva objects as well as generating all kinds of operations, such as 429 - * split / merge or prefetch. 389 + * In terms of managing &drm_gpuva entries DRM GPUVM does not take care of 390 + * locking itself, it is the drivers responsibility to take care about locking. 391 + * Drivers might want to protect the following operations: inserting, removing 392 + * and iterating &drm_gpuva objects as well as generating all kinds of 393 + * operations, such as split / merge or prefetch. 430 394 * 431 - * The GPU VA manager also does not take care of the locking of the backing 432 - * &drm_gem_object buffers GPU VA lists by itself; drivers are responsible to 433 - * enforce mutual exclusion using either the GEMs dma_resv lock or alternatively 434 - * a driver specific external lock. For the latter see also 435 - * drm_gem_gpuva_set_lock(). 
395 + * DRM GPUVM also does not take care of the locking of the backing 396 + * &drm_gem_object buffers GPU VA lists and &drm_gpuvm_bo abstractions by 397 + * itself; drivers are responsible to enforce mutual exclusion using either the 398 + * GEMs dma_resv lock or alternatively a driver specific external lock. For the 399 + * latter see also drm_gem_gpuva_set_lock(). 436 400 * 437 - * However, the GPU VA manager contains lockdep checks to ensure callers of its 438 - * API hold the corresponding lock whenever the &drm_gem_objects GPU VA list is 439 - * accessed by functions such as drm_gpuva_link() or drm_gpuva_unlink(). 401 + * However, DRM GPUVM contains lockdep checks to ensure callers of its API hold 402 + * the corresponding lock whenever the &drm_gem_objects GPU VA list is accessed 403 + * by functions such as drm_gpuva_link() or drm_gpuva_unlink(), but also 404 + * drm_gpuvm_bo_obtain() and drm_gpuvm_bo_put(). 405 + * 406 + * The latter is required since on creation and destruction of a &drm_gpuvm_bo 407 + * the &drm_gpuvm_bo is attached / removed from the &drm_gem_objects gpuva list. 408 + * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and 409 + * &drm_gem_object must be able to observe previous creations and destructions 410 + * of &drm_gpuvm_bos in order to keep instances unique. 411 + * 412 + * The &drm_gpuvm's lists for keeping track of external and evicted objects are 413 + * protected against concurrent insertion / removal and iteration internally. 414 + * 415 + * However, drivers still need ensure to protect concurrent calls to functions 416 + * iterating those lists, namely drm_gpuvm_prepare_objects() and 417 + * drm_gpuvm_validate(). 418 + * 419 + * Alternatively, drivers can set the &DRM_GPUVM_RESV_PROTECTED flag to indicate 420 + * that the corresponding &dma_resv locks are held in order to protect the 421 + * lists. 
If &DRM_GPUVM_RESV_PROTECTED is set, internal locking is disabled and 422 + * the corresponding lockdep checks are enabled. This is an optimization for 423 + * drivers which are capable of taking the corresponding &dma_resv locks and 424 + * hence do not require internal locking. 440 425 */ 441 426 442 427 /** ··· 487 430 * { 488 431 * struct drm_gpuva_ops *ops; 489 432 * struct drm_gpuva_op *op 433 + * struct drm_gpuvm_bo *vm_bo; 490 434 * 491 435 * driver_lock_va_space(); 492 436 * ops = drm_gpuvm_sm_map_ops_create(gpuvm, addr, range, 493 437 * obj, offset); 494 438 * if (IS_ERR(ops)) 495 439 * return PTR_ERR(ops); 440 + * 441 + * vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj); 442 + * if (IS_ERR(vm_bo)) 443 + * return PTR_ERR(vm_bo); 496 444 * 497 445 * drm_gpuva_for_each_op(op, ops) { 498 446 * struct drm_gpuva *va; ··· 511 449 * 512 450 * driver_vm_map(); 513 451 * drm_gpuva_map(gpuvm, va, &op->map); 514 - * drm_gpuva_link(va); 452 + * drm_gpuva_link(va, vm_bo); 515 453 * 516 454 * break; 517 455 * case DRM_GPUVA_OP_REMAP: { ··· 538 476 * driver_vm_remap(); 539 477 * drm_gpuva_remap(prev, next, &op->remap); 540 478 * 541 - * drm_gpuva_unlink(va); 542 479 * if (prev) 543 - * drm_gpuva_link(prev); 480 + * drm_gpuva_link(prev, va->vm_bo); 544 481 * if (next) 545 - * drm_gpuva_link(next); 482 + * drm_gpuva_link(next, va->vm_bo); 483 + * drm_gpuva_unlink(va); 546 484 * 547 485 * break; 548 486 * } ··· 558 496 * break; 559 497 * } 560 498 * } 499 + * drm_gpuvm_bo_put(vm_bo); 561 500 * driver_unlock_va_space(); 562 501 * 563 502 * return 0; ··· 568 505 * 569 506 * struct driver_context { 570 507 * struct drm_gpuvm *gpuvm; 508 + * struct drm_gpuvm_bo *vm_bo; 571 509 * struct drm_gpuva *new_va; 572 510 * struct drm_gpuva *prev_va; 573 511 * struct drm_gpuva *next_va; ··· 589 525 * struct drm_gem_object *obj, u64 offset) 590 526 * { 591 527 * struct driver_context ctx; 528 + * struct drm_gpuvm_bo *vm_bo; 592 529 * struct drm_gpuva_ops *ops; 593 530 * struct drm_gpuva_op *op; 
594 531 * int ret = 0; ··· 599 534 * ctx.new_va = kzalloc(sizeof(*ctx.new_va), GFP_KERNEL); 600 535 * ctx.prev_va = kzalloc(sizeof(*ctx.prev_va), GFP_KERNEL); 601 536 * ctx.next_va = kzalloc(sizeof(*ctx.next_va), GFP_KERNEL); 602 - * if (!ctx.new_va || !ctx.prev_va || !ctx.next_va) { 537 + * ctx.vm_bo = drm_gpuvm_bo_create(gpuvm, obj); 538 + * if (!ctx.new_va || !ctx.prev_va || !ctx.next_va || !vm_bo) { 603 539 * ret = -ENOMEM; 604 540 * goto out; 605 541 * } 542 + * 543 + * // Typically protected with a driver specific GEM gpuva lock 544 + * // used in the fence signaling path for drm_gpuva_link() and 545 + * // drm_gpuva_unlink(), hence pre-allocate. 546 + * ctx.vm_bo = drm_gpuvm_bo_obtain_prealloc(ctx.vm_bo); 606 547 * 607 548 * driver_lock_va_space(); 608 549 * ret = drm_gpuvm_sm_map(gpuvm, &ctx, addr, range, obj, offset); 609 550 * driver_unlock_va_space(); 610 551 * 611 552 * out: 553 + * drm_gpuvm_bo_put(ctx.vm_bo); 612 554 * kfree(ctx.new_va); 613 555 * kfree(ctx.prev_va); 614 556 * kfree(ctx.next_va); ··· 628 556 * 629 557 * drm_gpuva_map(ctx->vm, ctx->new_va, &op->map); 630 558 * 631 - * drm_gpuva_link(ctx->new_va); 559 + * drm_gpuva_link(ctx->new_va, ctx->vm_bo); 632 560 * 633 561 * // prevent the new GPUVA from being freed in 634 562 * // driver_mapping_create() ··· 640 568 * int driver_gpuva_remap(struct drm_gpuva_op *op, void *__ctx) 641 569 * { 642 570 * struct driver_context *ctx = __ctx; 571 + * struct drm_gpuva *va = op->remap.unmap->va; 643 572 * 644 573 * drm_gpuva_remap(ctx->prev_va, ctx->next_va, &op->remap); 645 574 * 646 - * drm_gpuva_unlink(op->remap.unmap->va); 647 - * kfree(op->remap.unmap->va); 648 - * 649 575 * if (op->remap.prev) { 650 - * drm_gpuva_link(ctx->prev_va); 576 + * drm_gpuva_link(ctx->prev_va, va->vm_bo); 651 577 * ctx->prev_va = NULL; 652 578 * } 653 579 * 654 580 * if (op->remap.next) { 655 - * drm_gpuva_link(ctx->next_va); 581 + * drm_gpuva_link(ctx->next_va, va->vm_bo); 656 582 * ctx->next_va = NULL; 657 583 * } 584 + * 
585 + * drm_gpuva_unlink(va); 586 + * kfree(va); 658 587 * 659 588 * return 0; 660 589 * } ··· 669 596 * return 0; 670 597 * } 671 598 */ 599 + 600 + /** 601 + * get_next_vm_bo_from_list() - get the next vm_bo element 602 + * @__gpuvm: the &drm_gpuvm 603 + * @__list_name: the name of the list we're iterating on 604 + * @__local_list: a pointer to the local list used to store already iterated items 605 + * @__prev_vm_bo: the previous element we got from get_next_vm_bo_from_list() 606 + * 607 + * This helper is here to provide lockless list iteration. Lockless as in, the 608 + * iterator releases the lock immediately after picking the first element from 609 + * the list, so list insertion deletion can happen concurrently. 610 + * 611 + * Elements popped from the original list are kept in a local list, so removal 612 + * and is_empty checks can still happen while we're iterating the list. 613 + */ 614 + #define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo) \ 615 + ({ \ 616 + struct drm_gpuvm_bo *__vm_bo = NULL; \ 617 + \ 618 + drm_gpuvm_bo_put(__prev_vm_bo); \ 619 + \ 620 + spin_lock(&(__gpuvm)->__list_name.lock); \ 621 + if (!(__gpuvm)->__list_name.local_list) \ 622 + (__gpuvm)->__list_name.local_list = __local_list; \ 623 + else \ 624 + drm_WARN_ON((__gpuvm)->drm, \ 625 + (__gpuvm)->__list_name.local_list != __local_list); \ 626 + \ 627 + while (!list_empty(&(__gpuvm)->__list_name.list)) { \ 628 + __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \ 629 + struct drm_gpuvm_bo, \ 630 + list.entry.__list_name); \ 631 + if (kref_get_unless_zero(&__vm_bo->kref)) { \ 632 + list_move_tail(&(__vm_bo)->list.entry.__list_name, \ 633 + __local_list); \ 634 + break; \ 635 + } else { \ 636 + list_del_init(&(__vm_bo)->list.entry.__list_name); \ 637 + __vm_bo = NULL; \ 638 + } \ 639 + } \ 640 + spin_unlock(&(__gpuvm)->__list_name.lock); \ 641 + \ 642 + __vm_bo; \ 643 + }) 644 + 645 + /** 646 + * for_each_vm_bo_in_list() - internal vm_bo list 
iterator 647 + * @__gpuvm: the &drm_gpuvm 648 + * @__list_name: the name of the list we're iterating on 649 + * @__local_list: a pointer to the local list used to store already iterated items 650 + * @__vm_bo: the struct drm_gpuvm_bo to assign in each iteration step 651 + * 652 + * This helper is here to provide lockless list iteration. Lockless as in, the 653 + * iterator releases the lock immediately after picking the first element from the 654 + * list, hence list insertion and deletion can happen concurrently. 655 + * 656 + * It is not allowed to re-assign the vm_bo pointer from inside this loop. 657 + * 658 + * Typical use: 659 + * 660 + * struct drm_gpuvm_bo *vm_bo; 661 + * LIST_HEAD(my_local_list); 662 + * 663 + * ret = 0; 664 + * for_each_vm_bo_in_list(gpuvm, <list_name>, &my_local_list, vm_bo) { 665 + * ret = do_something_with_vm_bo(..., vm_bo); 666 + * if (ret) 667 + * break; 668 + * } 669 + * // Drop ref in case we break out of the loop. 670 + * drm_gpuvm_bo_put(vm_bo); 671 + * restore_vm_bo_list(gpuvm, <list_name>, &my_local_list); 672 + * 673 + * 674 + * Only used for internal list iterations, not meant to be exposed to the outside 675 + * world. 676 + */ 677 + #define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo) \ 678 + for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name, \ 679 + __local_list, NULL); \ 680 + __vm_bo; \ 681 + __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name, \ 682 + __local_list, __vm_bo)) 683 + 684 + static void 685 + __restore_vm_bo_list(struct drm_gpuvm *gpuvm, spinlock_t *lock, 686 + struct list_head *list, struct list_head **local_list) 687 + { 688 + /* Merge back the two lists, moving local list elements to the 689 + * head to preserve previous ordering, in case it matters. 
690 + */ 691 + spin_lock(lock); 692 + if (*local_list) { 693 + list_splice(*local_list, list); 694 + *local_list = NULL; 695 + } 696 + spin_unlock(lock); 697 + } 698 + 699 + /** 700 + * restore_vm_bo_list() - move vm_bo elements back to their original list 701 + * @__gpuvm: the &drm_gpuvm 702 + * @__list_name: the name of the list we're iterating on 703 + * 704 + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list() 705 + * to restore the original state and let new iterations take place. 706 + */ 707 + #define restore_vm_bo_list(__gpuvm, __list_name) \ 708 + __restore_vm_bo_list((__gpuvm), &(__gpuvm)->__list_name.lock, \ 709 + &(__gpuvm)->__list_name.list, \ 710 + &(__gpuvm)->__list_name.local_list) 711 + 712 + static void 713 + cond_spin_lock(spinlock_t *lock, bool cond) 714 + { 715 + if (cond) 716 + spin_lock(lock); 717 + } 718 + 719 + static void 720 + cond_spin_unlock(spinlock_t *lock, bool cond) 721 + { 722 + if (cond) 723 + spin_unlock(lock); 724 + } 725 + 726 + static void 727 + __drm_gpuvm_bo_list_add(struct drm_gpuvm *gpuvm, spinlock_t *lock, 728 + struct list_head *entry, struct list_head *list) 729 + { 730 + cond_spin_lock(lock, !!lock); 731 + if (list_empty(entry)) 732 + list_add_tail(entry, list); 733 + cond_spin_unlock(lock, !!lock); 734 + } 735 + 736 + /** 737 + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list 738 + * @__vm_bo: the &drm_gpuvm_bo 739 + * @__list_name: the name of the list to insert into 740 + * @__lock: whether to lock with the internal spinlock 741 + * 742 + * Inserts the given @__vm_bo into the list specified by @__list_name. 743 + */ 744 + #define drm_gpuvm_bo_list_add(__vm_bo, __list_name, __lock) \ 745 + __drm_gpuvm_bo_list_add((__vm_bo)->vm, \ 746 + __lock ? 
&(__vm_bo)->vm->__list_name.lock : \ 747 + NULL, \ 748 + &(__vm_bo)->list.entry.__list_name, \ 749 + &(__vm_bo)->vm->__list_name.list) 750 + 751 + static void 752 + __drm_gpuvm_bo_list_del(struct drm_gpuvm *gpuvm, spinlock_t *lock, 753 + struct list_head *entry, bool init) 754 + { 755 + cond_spin_lock(lock, !!lock); 756 + if (init) { 757 + if (!list_empty(entry)) 758 + list_del_init(entry); 759 + } else { 760 + list_del(entry); 761 + } 762 + cond_spin_unlock(lock, !!lock); 763 + } 764 + 765 + /** 766 + * drm_gpuvm_bo_list_del_init() - remove a vm_bo from the given list 767 + * @__vm_bo: the &drm_gpuvm_bo 768 + * @__list_name: the name of the list to insert into 769 + * @__lock: whether to lock with the internal spinlock 770 + * 771 + * Removes the given @__vm_bo from the list specified by @__list_name. 772 + */ 773 + #define drm_gpuvm_bo_list_del_init(__vm_bo, __list_name, __lock) \ 774 + __drm_gpuvm_bo_list_del((__vm_bo)->vm, \ 775 + __lock ? &(__vm_bo)->vm->__list_name.lock : \ 776 + NULL, \ 777 + &(__vm_bo)->list.entry.__list_name, \ 778 + true) 779 + 780 + /** 781 + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list 782 + * @__vm_bo: the &drm_gpuvm_bo 783 + * @__list_name: the name of the list to insert into 784 + * @__lock: whether to lock with the internal spinlock 785 + * 786 + * Removes the given @__vm_bo from the list specified by @__list_name. 787 + */ 788 + #define drm_gpuvm_bo_list_del(__vm_bo, __list_name, __lock) \ 789 + __drm_gpuvm_bo_list_del((__vm_bo)->vm, \ 790 + __lock ? 
&(__vm_bo)->vm->__list_name.lock : \ 791 + NULL, \ 792 + &(__vm_bo)->list.entry.__list_name, \ 793 + false) 672 794 673 795 #define to_drm_gpuva(__node) container_of((__node), struct drm_gpuva, rb.node) 674 796 ··· 886 618 { 887 619 u64 end; 888 620 889 - return WARN(check_add_overflow(addr, range, &end), 890 - "GPUVA address limited to %zu bytes.\n", sizeof(end)); 621 + return check_add_overflow(addr, range, &end); 622 + } 623 + 624 + static bool 625 + drm_gpuvm_warn_check_overflow(struct drm_gpuvm *gpuvm, u64 addr, u64 range) 626 + { 627 + return drm_WARN(gpuvm->drm, drm_gpuvm_check_overflow(addr, range), 628 + "GPUVA address limited to %zu bytes.\n", sizeof(addr)); 891 629 } 892 630 893 631 static bool ··· 917 643 return krange && addr < kend && kstart < end; 918 644 } 919 645 920 - static bool 646 + /** 647 + * drm_gpuvm_range_valid() - checks whether the given range is valid for the 648 + * given &drm_gpuvm 649 + * @gpuvm: the GPUVM to check the range for 650 + * @addr: the base address 651 + * @range: the range starting from the base address 652 + * 653 + * Checks whether the range is within the GPUVM's managed boundaries. 
654 + * 655 + * Returns: true for a valid range, false otherwise 656 + */ 657 + bool 921 658 drm_gpuvm_range_valid(struct drm_gpuvm *gpuvm, 922 659 u64 addr, u64 range) 923 660 { ··· 936 651 drm_gpuvm_in_mm_range(gpuvm, addr, range) && 937 652 !drm_gpuvm_in_kernel_node(gpuvm, addr, range); 938 653 } 654 + EXPORT_SYMBOL_GPL(drm_gpuvm_range_valid); 655 + 656 + static void 657 + drm_gpuvm_gem_object_free(struct drm_gem_object *obj) 658 + { 659 + drm_gem_object_release(obj); 660 + kfree(obj); 661 + } 662 + 663 + static const struct drm_gem_object_funcs drm_gpuvm_object_funcs = { 664 + .free = drm_gpuvm_gem_object_free, 665 + }; 666 + 667 + /** 668 + * drm_gpuvm_resv_object_alloc() - allocate a dummy &drm_gem_object 669 + * @drm: the drivers &drm_device 670 + * 671 + * Allocates a dummy &drm_gem_object which can be passed to drm_gpuvm_init() in 672 + * order to serve as root GEM object providing the &drm_resv shared across 673 + * &drm_gem_objects local to a single GPUVM. 674 + * 675 + * Returns: the &drm_gem_object on success, NULL on failure 676 + */ 677 + struct drm_gem_object * 678 + drm_gpuvm_resv_object_alloc(struct drm_device *drm) 679 + { 680 + struct drm_gem_object *obj; 681 + 682 + obj = kzalloc(sizeof(*obj), GFP_KERNEL); 683 + if (!obj) 684 + return NULL; 685 + 686 + obj->funcs = &drm_gpuvm_object_funcs; 687 + drm_gem_private_object_init(drm, obj, 0); 688 + 689 + return obj; 690 + } 691 + EXPORT_SYMBOL_GPL(drm_gpuvm_resv_object_alloc); 939 692 940 693 /** 941 694 * drm_gpuvm_init() - initialize a &drm_gpuvm 942 695 * @gpuvm: pointer to the &drm_gpuvm to initialize 943 696 * @name: the name of the GPU VA space 697 + * @flags: the &drm_gpuvm_flags for this GPUVM 698 + * @drm: the &drm_device this VM resides in 699 + * @r_obj: the resv &drm_gem_object providing the GPUVM's common &dma_resv 944 700 * @start_offset: the start offset of the GPU VA space 945 701 * @range: the size of the GPU VA space 946 702 * @reserve_offset: the start of the kernel reserved GPU VA 
area ··· 994 668 * &name is expected to be managed by the surrounding driver structures. 995 669 */ 996 670 void 997 - drm_gpuvm_init(struct drm_gpuvm *gpuvm, 998 - const char *name, 671 + drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, 672 + enum drm_gpuvm_flags flags, 673 + struct drm_device *drm, 674 + struct drm_gem_object *r_obj, 999 675 u64 start_offset, u64 range, 1000 676 u64 reserve_offset, u64 reserve_range, 1001 677 const struct drm_gpuvm_ops *ops) ··· 1005 677 gpuvm->rb.tree = RB_ROOT_CACHED; 1006 678 INIT_LIST_HEAD(&gpuvm->rb.list); 1007 679 1008 - drm_gpuvm_check_overflow(start_offset, range); 680 + INIT_LIST_HEAD(&gpuvm->extobj.list); 681 + spin_lock_init(&gpuvm->extobj.lock); 682 + 683 + INIT_LIST_HEAD(&gpuvm->evict.list); 684 + spin_lock_init(&gpuvm->evict.lock); 685 + 686 + kref_init(&gpuvm->kref); 687 + 688 + gpuvm->name = name ? name : "unknown"; 689 + gpuvm->flags = flags; 690 + gpuvm->ops = ops; 691 + gpuvm->drm = drm; 692 + gpuvm->r_obj = r_obj; 693 + 694 + drm_gem_object_get(r_obj); 695 + 696 + drm_gpuvm_warn_check_overflow(gpuvm, start_offset, range); 1009 697 gpuvm->mm_start = start_offset; 1010 698 gpuvm->mm_range = range; 1011 699 1012 - gpuvm->name = name ? 
name : "unknown"; 1013 - gpuvm->ops = ops; 1014 - 1015 700 memset(&gpuvm->kernel_alloc_node, 0, sizeof(struct drm_gpuva)); 1016 - 1017 701 if (reserve_range) { 1018 702 gpuvm->kernel_alloc_node.va.addr = reserve_offset; 1019 703 gpuvm->kernel_alloc_node.va.range = reserve_range; 1020 704 1021 - if (likely(!drm_gpuvm_check_overflow(reserve_offset, 1022 - reserve_range))) 705 + if (likely(!drm_gpuvm_warn_check_overflow(gpuvm, reserve_offset, 706 + reserve_range))) 1023 707 __drm_gpuva_insert(gpuvm, &gpuvm->kernel_alloc_node); 1024 708 } 1025 709 } 1026 710 EXPORT_SYMBOL_GPL(drm_gpuvm_init); 1027 711 1028 - /** 1029 - * drm_gpuvm_destroy() - cleanup a &drm_gpuvm 1030 - * @gpuvm: pointer to the &drm_gpuvm to clean up 1031 - * 1032 - * Note that it is a bug to call this function on a manager that still 1033 - * holds GPU VA mappings. 1034 - */ 1035 - void 1036 - drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) 712 + static void 713 + drm_gpuvm_fini(struct drm_gpuvm *gpuvm) 1037 714 { 1038 715 gpuvm->name = NULL; 1039 716 1040 717 if (gpuvm->kernel_alloc_node.va.range) 1041 718 __drm_gpuva_remove(&gpuvm->kernel_alloc_node); 1042 719 1043 - WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root), 1044 - "GPUVA tree is not empty, potentially leaking memory."); 720 + drm_WARN(gpuvm->drm, !RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root), 721 + "GPUVA tree is not empty, potentially leaking memory.\n"); 722 + 723 + drm_WARN(gpuvm->drm, !list_empty(&gpuvm->extobj.list), 724 + "Extobj list should be empty.\n"); 725 + drm_WARN(gpuvm->drm, !list_empty(&gpuvm->evict.list), 726 + "Evict list should be empty.\n"); 727 + 728 + drm_gem_object_put(gpuvm->r_obj); 1045 729 } 1046 - EXPORT_SYMBOL_GPL(drm_gpuvm_destroy); 730 + 731 + static void 732 + drm_gpuvm_free(struct kref *kref) 733 + { 734 + struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref); 735 + 736 + drm_gpuvm_fini(gpuvm); 737 + 738 + if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free)) 739 + return; 740 + 741 + gpuvm->ops->vm_free(gpuvm); 
742 + } 743 + 744 + /** 745 + * drm_gpuvm_put() - drop a struct drm_gpuvm reference 746 + * @gpuvm: the &drm_gpuvm to release the reference of 747 + * 748 + * This releases a reference to @gpuvm. 749 + * 750 + * This function may be called from atomic context. 751 + */ 752 + void 753 + drm_gpuvm_put(struct drm_gpuvm *gpuvm) 754 + { 755 + if (gpuvm) 756 + kref_put(&gpuvm->kref, drm_gpuvm_free); 757 + } 758 + EXPORT_SYMBOL_GPL(drm_gpuvm_put); 759 + 760 + static int 761 + __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, 762 + struct drm_exec *exec, 763 + unsigned int num_fences) 764 + { 765 + struct drm_gpuvm_bo *vm_bo; 766 + LIST_HEAD(extobjs); 767 + int ret = 0; 768 + 769 + for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) { 770 + ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); 771 + if (ret) 772 + break; 773 + } 774 + /* Drop ref in case we break out of the loop. */ 775 + drm_gpuvm_bo_put(vm_bo); 776 + restore_vm_bo_list(gpuvm, extobj); 777 + 778 + return ret; 779 + } 780 + 781 + static int 782 + drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm, 783 + struct drm_exec *exec, 784 + unsigned int num_fences) 785 + { 786 + struct drm_gpuvm_bo *vm_bo; 787 + int ret = 0; 788 + 789 + drm_gpuvm_resv_assert_held(gpuvm); 790 + list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) { 791 + ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); 792 + if (ret) 793 + break; 794 + 795 + if (vm_bo->evicted) 796 + drm_gpuvm_bo_list_add(vm_bo, evict, false); 797 + } 798 + 799 + return ret; 800 + } 801 + 802 + /** 803 + * drm_gpuvm_prepare_objects() - prepare all assoiciated BOs 804 + * @gpuvm: the &drm_gpuvm 805 + * @exec: the &drm_exec locking context 806 + * @num_fences: the amount of &dma_fences to reserve 807 + * 808 + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given 809 + * &drm_gpuvm contains mappings of. 
810 + * 811 + * Using this function directly, it is the drivers responsibility to call 812 + * drm_exec_init() and drm_exec_fini() accordingly. 813 + * 814 + * Note: This function is safe against concurrent insertion and removal of 815 + * external objects, however it is not safe against concurrent usage itself. 816 + * 817 + * Drivers need to make sure to protect this case with either an outer VM lock 818 + * or by calling drm_gpuvm_prepare_vm() before this function within the 819 + * drm_exec_until_all_locked() loop, such that the GPUVM's dma-resv lock ensures 820 + * mutual exclusion. 821 + * 822 + * Returns: 0 on success, negative error code on failure. 823 + */ 824 + int 825 + drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, 826 + struct drm_exec *exec, 827 + unsigned int num_fences) 828 + { 829 + if (drm_gpuvm_resv_protected(gpuvm)) 830 + return drm_gpuvm_prepare_objects_locked(gpuvm, exec, 831 + num_fences); 832 + else 833 + return __drm_gpuvm_prepare_objects(gpuvm, exec, num_fences); 834 + } 835 + EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects); 836 + 837 + /** 838 + * drm_gpuvm_prepare_range() - prepare all BOs mapped within a given range 839 + * @gpuvm: the &drm_gpuvm 840 + * @exec: the &drm_exec locking context 841 + * @addr: the start address within the VA space 842 + * @range: the range to iterate within the VA space 843 + * @num_fences: the amount of &dma_fences to reserve 844 + * 845 + * Calls drm_exec_prepare_obj() for all &drm_gem_objects mapped between @addr 846 + * and @addr + @range. 847 + * 848 + * Returns: 0 on success, negative error code on failure. 
849 + */ 850 + int 851 + drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec, 852 + u64 addr, u64 range, unsigned int num_fences) 853 + { 854 + struct drm_gpuva *va; 855 + u64 end = addr + range; 856 + int ret; 857 + 858 + drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) { 859 + struct drm_gem_object *obj = va->gem.obj; 860 + 861 + ret = drm_exec_prepare_obj(exec, obj, num_fences); 862 + if (ret) 863 + return ret; 864 + } 865 + 866 + return 0; 867 + } 868 + EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range); 869 + 870 + /** 871 + * drm_gpuvm_exec_lock() - lock all dma-resv of all assoiciated BOs 872 + * @vm_exec: the &drm_gpuvm_exec wrapper 873 + * 874 + * Acquires all dma-resv locks of all &drm_gem_objects the given 875 + * &drm_gpuvm contains mappings of. 876 + * 877 + * Addionally, when calling this function with struct drm_gpuvm_exec::extra 878 + * being set the driver receives the given @fn callback to lock additional 879 + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers 880 + * would call drm_exec_prepare_obj() from within this callback. 881 + * 882 + * Returns: 0 on success, negative error code on failure. 
883 + */ 884 + int 885 + drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec) 886 + { 887 + struct drm_gpuvm *gpuvm = vm_exec->vm; 888 + struct drm_exec *exec = &vm_exec->exec; 889 + unsigned int num_fences = vm_exec->num_fences; 890 + int ret; 891 + 892 + drm_exec_init(exec, vm_exec->flags); 893 + 894 + drm_exec_until_all_locked(exec) { 895 + ret = drm_gpuvm_prepare_vm(gpuvm, exec, num_fences); 896 + drm_exec_retry_on_contention(exec); 897 + if (ret) 898 + goto err; 899 + 900 + ret = drm_gpuvm_prepare_objects(gpuvm, exec, num_fences); 901 + drm_exec_retry_on_contention(exec); 902 + if (ret) 903 + goto err; 904 + 905 + if (vm_exec->extra.fn) { 906 + ret = vm_exec->extra.fn(vm_exec); 907 + drm_exec_retry_on_contention(exec); 908 + if (ret) 909 + goto err; 910 + } 911 + } 912 + 913 + return 0; 914 + 915 + err: 916 + drm_exec_fini(exec); 917 + return ret; 918 + } 919 + EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock); 920 + 921 + static int 922 + fn_lock_array(struct drm_gpuvm_exec *vm_exec) 923 + { 924 + struct { 925 + struct drm_gem_object **objs; 926 + unsigned int num_objs; 927 + } *args = vm_exec->extra.priv; 928 + 929 + return drm_exec_prepare_array(&vm_exec->exec, args->objs, 930 + args->num_objs, vm_exec->num_fences); 931 + } 932 + 933 + /** 934 + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all assoiciated BOs 935 + * @vm_exec: the &drm_gpuvm_exec wrapper 936 + * @objs: additional &drm_gem_objects to lock 937 + * @num_objs: the number of additional &drm_gem_objects to lock 938 + * 939 + * Acquires all dma-resv locks of all &drm_gem_objects the given &drm_gpuvm 940 + * contains mappings of, plus the ones given through @objs. 941 + * 942 + * Returns: 0 on success, negative error code on failure. 
943 + */ 944 + int 945 + drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec, 946 + struct drm_gem_object **objs, 947 + unsigned int num_objs) 948 + { 949 + struct { 950 + struct drm_gem_object **objs; 951 + unsigned int num_objs; 952 + } args; 953 + 954 + args.objs = objs; 955 + args.num_objs = num_objs; 956 + 957 + vm_exec->extra.fn = fn_lock_array; 958 + vm_exec->extra.priv = &args; 959 + 960 + return drm_gpuvm_exec_lock(vm_exec); 961 + } 962 + EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array); 963 + 964 + /** 965 + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range 966 + * @vm_exec: the &drm_gpuvm_exec wrapper 967 + * @addr: the start address within the VA space 968 + * @range: the range to iterate within the VA space 969 + * 970 + * Acquires all dma-resv locks of all &drm_gem_objects mapped between @addr and 971 + * @addr + @range. 972 + * 973 + * Returns: 0 on success, negative error code on failure. 974 + */ 975 + int 976 + drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec, 977 + u64 addr, u64 range) 978 + { 979 + struct drm_gpuvm *gpuvm = vm_exec->vm; 980 + struct drm_exec *exec = &vm_exec->exec; 981 + int ret; 982 + 983 + drm_exec_init(exec, vm_exec->flags); 984 + 985 + drm_exec_until_all_locked(exec) { 986 + ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range, 987 + vm_exec->num_fences); 988 + drm_exec_retry_on_contention(exec); 989 + if (ret) 990 + goto err; 991 + } 992 + 993 + return ret; 994 + 995 + err: 996 + drm_exec_fini(exec); 997 + return ret; 998 + } 999 + EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range); 1000 + 1001 + static int 1002 + __drm_gpuvm_validate(struct drm_gpuvm *gpuvm, struct drm_exec *exec) 1003 + { 1004 + const struct drm_gpuvm_ops *ops = gpuvm->ops; 1005 + struct drm_gpuvm_bo *vm_bo; 1006 + LIST_HEAD(evict); 1007 + int ret = 0; 1008 + 1009 + for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) { 1010 + ret = ops->vm_bo_validate(vm_bo, exec); 1011 + if (ret) 1012 + break; 1013 + } 1014 + /* Drop ref in 
case we break out of the loop. */ 1015 + drm_gpuvm_bo_put(vm_bo); 1016 + restore_vm_bo_list(gpuvm, evict); 1017 + 1018 + return ret; 1019 + } 1020 + 1021 + static int 1022 + drm_gpuvm_validate_locked(struct drm_gpuvm *gpuvm, struct drm_exec *exec) 1023 + { 1024 + const struct drm_gpuvm_ops *ops = gpuvm->ops; 1025 + struct drm_gpuvm_bo *vm_bo, *next; 1026 + int ret = 0; 1027 + 1028 + drm_gpuvm_resv_assert_held(gpuvm); 1029 + 1030 + list_for_each_entry_safe(vm_bo, next, &gpuvm->evict.list, 1031 + list.entry.evict) { 1032 + ret = ops->vm_bo_validate(vm_bo, exec); 1033 + if (ret) 1034 + break; 1035 + 1036 + dma_resv_assert_held(vm_bo->obj->resv); 1037 + if (!vm_bo->evicted) 1038 + drm_gpuvm_bo_list_del_init(vm_bo, evict, false); 1039 + } 1040 + 1041 + return ret; 1042 + } 1043 + 1044 + /** 1045 + * drm_gpuvm_validate() - validate all BOs marked as evicted 1046 + * @gpuvm: the &drm_gpuvm to validate evicted BOs 1047 + * @exec: the &drm_exec instance used for locking the GPUVM 1048 + * 1049 + * Calls the &drm_gpuvm_ops::vm_bo_validate callback for all evicted buffer 1050 + * objects being mapped in the given &drm_gpuvm. 1051 + * 1052 + * Returns: 0 on success, negative error code on failure. 
1053 + */ 1054 + int 1055 + drm_gpuvm_validate(struct drm_gpuvm *gpuvm, struct drm_exec *exec) 1056 + { 1057 + const struct drm_gpuvm_ops *ops = gpuvm->ops; 1058 + 1059 + if (unlikely(!ops || !ops->vm_bo_validate)) 1060 + return -EOPNOTSUPP; 1061 + 1062 + if (drm_gpuvm_resv_protected(gpuvm)) 1063 + return drm_gpuvm_validate_locked(gpuvm, exec); 1064 + else 1065 + return __drm_gpuvm_validate(gpuvm, exec); 1066 + } 1067 + EXPORT_SYMBOL_GPL(drm_gpuvm_validate); 1068 + 1069 + /** 1070 + * drm_gpuvm_resv_add_fence - add fence to private and all extobj 1071 + * dma-resv 1072 + * @gpuvm: the &drm_gpuvm to add a fence to 1073 + * @exec: the &drm_exec locking context 1074 + * @fence: fence to add 1075 + * @private_usage: private dma-resv usage 1076 + * @extobj_usage: extobj dma-resv usage 1077 + */ 1078 + void 1079 + drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm, 1080 + struct drm_exec *exec, 1081 + struct dma_fence *fence, 1082 + enum dma_resv_usage private_usage, 1083 + enum dma_resv_usage extobj_usage) 1084 + { 1085 + struct drm_gem_object *obj; 1086 + unsigned long index; 1087 + 1088 + drm_exec_for_each_locked_object(exec, index, obj) { 1089 + dma_resv_assert_held(obj->resv); 1090 + dma_resv_add_fence(obj->resv, fence, 1091 + drm_gpuvm_is_extobj(gpuvm, obj) ? 1092 + extobj_usage : private_usage); 1093 + } 1094 + } 1095 + EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence); 1096 + 1097 + /** 1098 + * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo 1099 + * @gpuvm: The &drm_gpuvm the @obj is mapped in. 1100 + * @obj: The &drm_gem_object being mapped in the @gpuvm. 1101 + * 1102 + * If provided by the driver, this function uses the &drm_gpuvm_ops 1103 + * vm_bo_alloc() callback to allocate. 
1104 + * 1105 + * Returns: a pointer to the &drm_gpuvm_bo on success, NULL on failure 1106 + */ 1107 + struct drm_gpuvm_bo * 1108 + drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm, 1109 + struct drm_gem_object *obj) 1110 + { 1111 + const struct drm_gpuvm_ops *ops = gpuvm->ops; 1112 + struct drm_gpuvm_bo *vm_bo; 1113 + 1114 + if (ops && ops->vm_bo_alloc) 1115 + vm_bo = ops->vm_bo_alloc(); 1116 + else 1117 + vm_bo = kzalloc(sizeof(*vm_bo), GFP_KERNEL); 1118 + 1119 + if (unlikely(!vm_bo)) 1120 + return NULL; 1121 + 1122 + vm_bo->vm = drm_gpuvm_get(gpuvm); 1123 + vm_bo->obj = obj; 1124 + drm_gem_object_get(obj); 1125 + 1126 + kref_init(&vm_bo->kref); 1127 + INIT_LIST_HEAD(&vm_bo->list.gpuva); 1128 + INIT_LIST_HEAD(&vm_bo->list.entry.gem); 1129 + 1130 + INIT_LIST_HEAD(&vm_bo->list.entry.extobj); 1131 + INIT_LIST_HEAD(&vm_bo->list.entry.evict); 1132 + 1133 + return vm_bo; 1134 + } 1135 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_create); 1136 + 1137 + static void 1138 + drm_gpuvm_bo_destroy(struct kref *kref) 1139 + { 1140 + struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo, 1141 + kref); 1142 + struct drm_gpuvm *gpuvm = vm_bo->vm; 1143 + const struct drm_gpuvm_ops *ops = gpuvm->ops; 1144 + struct drm_gem_object *obj = vm_bo->obj; 1145 + bool lock = !drm_gpuvm_resv_protected(gpuvm); 1146 + 1147 + if (!lock) 1148 + drm_gpuvm_resv_assert_held(gpuvm); 1149 + 1150 + drm_gpuvm_bo_list_del(vm_bo, extobj, lock); 1151 + drm_gpuvm_bo_list_del(vm_bo, evict, lock); 1152 + 1153 + drm_gem_gpuva_assert_lock_held(obj); 1154 + list_del(&vm_bo->list.entry.gem); 1155 + 1156 + if (ops && ops->vm_bo_free) 1157 + ops->vm_bo_free(vm_bo); 1158 + else 1159 + kfree(vm_bo); 1160 + 1161 + drm_gpuvm_put(gpuvm); 1162 + drm_gem_object_put(obj); 1163 + } 1164 + 1165 + /** 1166 + * drm_gpuvm_bo_put() - drop a struct drm_gpuvm_bo reference 1167 + * @vm_bo: the &drm_gpuvm_bo to release the reference of 1168 + * 1169 + * This releases a reference to @vm_bo. 
1170 + * 1171 + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which 1172 + * includes removing it from the GEMs gpuva list. Hence, if a call to this 1173 + * function can potentially let the reference count drop to zero the caller must 1174 + * hold the dma-resv or driver specific GEM gpuva lock. 1175 + * 1176 + * This function may only be called from non-atomic context. 1177 + */ 1178 + void 1179 + drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo) 1180 + { 1181 + might_sleep(); 1182 + 1183 + if (vm_bo) 1184 + kref_put(&vm_bo->kref, drm_gpuvm_bo_destroy); 1185 + } 1186 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put); 1187 + 1188 + static struct drm_gpuvm_bo * 1189 + __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm, 1190 + struct drm_gem_object *obj) 1191 + { 1192 + struct drm_gpuvm_bo *vm_bo; 1193 + 1194 + drm_gem_gpuva_assert_lock_held(obj); 1195 + drm_gem_for_each_gpuvm_bo(vm_bo, obj) 1196 + if (vm_bo->vm == gpuvm) 1197 + return vm_bo; 1198 + 1199 + return NULL; 1200 + } 1201 + 1202 + /** 1203 + * drm_gpuvm_bo_find() - find the &drm_gpuvm_bo for the given 1204 + * &drm_gpuvm and &drm_gem_object 1205 + * @gpuvm: The &drm_gpuvm the @obj is mapped in. 1206 + * @obj: The &drm_gem_object being mapped in the @gpuvm. 1207 + * 1208 + * Find the &drm_gpuvm_bo representing the combination of the given 1209 + * &drm_gpuvm and &drm_gem_object. If found, increases the reference 1210 + * count of the &drm_gpuvm_bo accordingly. 1211 + * 1212 + * Returns: a pointer to the &drm_gpuvm_bo on success, NULL on failure 1213 + */ 1214 + struct drm_gpuvm_bo * 1215 + drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm, 1216 + struct drm_gem_object *obj) 1217 + { 1218 + struct drm_gpuvm_bo *vm_bo = __drm_gpuvm_bo_find(gpuvm, obj); 1219 + 1220 + return vm_bo ? 
drm_gpuvm_bo_get(vm_bo) : NULL; 1221 + } 1222 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_find); 1223 + 1224 + /** 1225 + * drm_gpuvm_bo_obtain() - obtains and instance of the &drm_gpuvm_bo for the 1226 + * given &drm_gpuvm and &drm_gem_object 1227 + * @gpuvm: The &drm_gpuvm the @obj is mapped in. 1228 + * @obj: The &drm_gem_object being mapped in the @gpuvm. 1229 + * 1230 + * Find the &drm_gpuvm_bo representing the combination of the given 1231 + * &drm_gpuvm and &drm_gem_object. If found, increases the reference 1232 + * count of the &drm_gpuvm_bo accordingly. If not found, allocates a new 1233 + * &drm_gpuvm_bo. 1234 + * 1235 + * A new &drm_gpuvm_bo is added to the GEMs gpuva list. 1236 + * 1237 + * Returns: a pointer to the &drm_gpuvm_bo on success, an ERR_PTR on failure 1238 + */ 1239 + struct drm_gpuvm_bo * 1240 + drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm, 1241 + struct drm_gem_object *obj) 1242 + { 1243 + struct drm_gpuvm_bo *vm_bo; 1244 + 1245 + vm_bo = drm_gpuvm_bo_find(gpuvm, obj); 1246 + if (vm_bo) 1247 + return vm_bo; 1248 + 1249 + vm_bo = drm_gpuvm_bo_create(gpuvm, obj); 1250 + if (!vm_bo) 1251 + return ERR_PTR(-ENOMEM); 1252 + 1253 + drm_gem_gpuva_assert_lock_held(obj); 1254 + list_add_tail(&vm_bo->list.entry.gem, &obj->gpuva.list); 1255 + 1256 + return vm_bo; 1257 + } 1258 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain); 1259 + 1260 + /** 1261 + * drm_gpuvm_bo_obtain_prealloc() - obtains and instance of the &drm_gpuvm_bo 1262 + * for the given &drm_gpuvm and &drm_gem_object 1263 + * @__vm_bo: A pre-allocated struct drm_gpuvm_bo. 1264 + * 1265 + * Find the &drm_gpuvm_bo representing the combination of the given 1266 + * &drm_gpuvm and &drm_gem_object. If found, increases the reference 1267 + * count of the found &drm_gpuvm_bo accordingly, while the @__vm_bo reference 1268 + * count is decreased. If not found @__vm_bo is returned without further 1269 + * increase of the reference count. 1270 + * 1271 + * A new &drm_gpuvm_bo is added to the GEMs gpuva list. 
1272 + * 1273 + * Returns: a pointer to the found &drm_gpuvm_bo or @__vm_bo if no existing 1274 + * &drm_gpuvm_bo was found 1275 + */ 1276 + struct drm_gpuvm_bo * 1277 + drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo) 1278 + { 1279 + struct drm_gpuvm *gpuvm = __vm_bo->vm; 1280 + struct drm_gem_object *obj = __vm_bo->obj; 1281 + struct drm_gpuvm_bo *vm_bo; 1282 + 1283 + vm_bo = drm_gpuvm_bo_find(gpuvm, obj); 1284 + if (vm_bo) { 1285 + drm_gpuvm_bo_put(__vm_bo); 1286 + return vm_bo; 1287 + } 1288 + 1289 + drm_gem_gpuva_assert_lock_held(obj); 1290 + list_add_tail(&__vm_bo->list.entry.gem, &obj->gpuva.list); 1291 + 1292 + return __vm_bo; 1293 + } 1294 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc); 1295 + 1296 + /** 1297 + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's 1298 + * extobj list 1299 + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's the extobj list. 1300 + * 1301 + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if not on the list 1302 + * already and if the corresponding &drm_gem_object is an external object, 1303 + * actually. 1304 + */ 1305 + void 1306 + drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo) 1307 + { 1308 + struct drm_gpuvm *gpuvm = vm_bo->vm; 1309 + bool lock = !drm_gpuvm_resv_protected(gpuvm); 1310 + 1311 + if (!lock) 1312 + drm_gpuvm_resv_assert_held(gpuvm); 1313 + 1314 + if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj)) 1315 + drm_gpuvm_bo_list_add(vm_bo, extobj, lock); 1316 + } 1317 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add); 1318 + 1319 + /** 1320 + * drm_gpuvm_bo_evict() - add / remove a &drm_gpuvm_bo to / from the &drm_gpuvms 1321 + * evicted list 1322 + * @vm_bo: the &drm_gpuvm_bo to add or remove 1323 + * @evict: indicates whether the object is evicted 1324 + * 1325 + * Adds a &drm_gpuvm_bo to or removes it from the &drm_gpuvms evicted list. 
1326 + */ 1327 + void 1328 + drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict) 1329 + { 1330 + struct drm_gpuvm *gpuvm = vm_bo->vm; 1331 + struct drm_gem_object *obj = vm_bo->obj; 1332 + bool lock = !drm_gpuvm_resv_protected(gpuvm); 1333 + 1334 + dma_resv_assert_held(obj->resv); 1335 + vm_bo->evicted = evict; 1336 + 1337 + /* Can't add external objects to the evicted list directly if not using 1338 + * internal spinlocks, since in this case the evicted list is protected 1339 + * with the VM's common dma-resv lock. 1340 + */ 1341 + if (drm_gpuvm_is_extobj(gpuvm, obj) && !lock) 1342 + return; 1343 + 1344 + if (evict) 1345 + drm_gpuvm_bo_list_add(vm_bo, evict, lock); 1346 + else 1347 + drm_gpuvm_bo_list_del_init(vm_bo, evict, lock); 1348 + } 1349 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict); 1047 1350 1048 1351 static int 1049 1352 __drm_gpuva_insert(struct drm_gpuvm *gpuvm, ··· 1723 764 { 1724 765 u64 addr = va->va.addr; 1725 766 u64 range = va->va.range; 767 + int ret; 1726 768 1727 769 if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range))) 1728 770 return -EINVAL; 1729 771 1730 - return __drm_gpuva_insert(gpuvm, va); 772 + ret = __drm_gpuva_insert(gpuvm, va); 773 + if (likely(!ret)) 774 + /* Take a reference of the GPUVM for the successfully inserted 775 + * drm_gpuva. We can't take the reference in 776 + * __drm_gpuva_insert() itself, since we don't want to increse 777 + * the reference count for the GPUVM's kernel_alloc_node. 
778 + */ 779 + drm_gpuvm_get(gpuvm); 780 + 781 + return ret; 1731 782 } 1732 783 EXPORT_SYMBOL_GPL(drm_gpuva_insert); 1733 784 ··· 1764 795 struct drm_gpuvm *gpuvm = va->vm; 1765 796 1766 797 if (unlikely(va == &gpuvm->kernel_alloc_node)) { 1767 - WARN(1, "Can't destroy kernel reserved node.\n"); 798 + drm_WARN(gpuvm->drm, 1, 799 + "Can't destroy kernel reserved node.\n"); 1768 800 return; 1769 801 } 1770 802 1771 803 __drm_gpuva_remove(va); 804 + drm_gpuvm_put(va->vm); 1772 805 } 1773 806 EXPORT_SYMBOL_GPL(drm_gpuva_remove); 1774 807 1775 808 /** 1776 809 * drm_gpuva_link() - link a &drm_gpuva 1777 810 * @va: the &drm_gpuva to link 811 + * @vm_bo: the &drm_gpuvm_bo to add the &drm_gpuva to 1778 812 * 1779 - * This adds the given &va to the GPU VA list of the &drm_gem_object it is 1780 - * associated with. 813 + * This adds the given &va to the GPU VA list of the &drm_gpuvm_bo and the 814 + * &drm_gpuvm_bo to the &drm_gem_object it is associated with. 815 + * 816 + * For every &drm_gpuva entry added to the &drm_gpuvm_bo an additional 817 + * reference of the latter is taken. 1781 818 * 1782 819 * This function expects the caller to protect the GEM's GPUVA list against 1783 - * concurrent access using the GEMs dma_resv lock. 820 + * concurrent access using either the GEMs dma_resv lock or a driver specific 821 + * lock set through drm_gem_gpuva_set_lock(). 
1784 822 */ 1785 823 void 1786 - drm_gpuva_link(struct drm_gpuva *va) 824 + drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuvm_bo *vm_bo) 1787 825 { 1788 826 struct drm_gem_object *obj = va->gem.obj; 827 + struct drm_gpuvm *gpuvm = va->vm; 1789 828 1790 829 if (unlikely(!obj)) 1791 830 return; 1792 831 1793 - drm_gem_gpuva_assert_lock_held(obj); 832 + drm_WARN_ON(gpuvm->drm, obj != vm_bo->obj); 1794 833 1795 - list_add_tail(&va->gem.entry, &obj->gpuva.list); 834 + va->vm_bo = drm_gpuvm_bo_get(vm_bo); 835 + 836 + drm_gem_gpuva_assert_lock_held(obj); 837 + list_add_tail(&va->gem.entry, &vm_bo->list.gpuva); 1796 838 } 1797 839 EXPORT_SYMBOL_GPL(drm_gpuva_link); 1798 840 ··· 1814 834 * This removes the given &va from the GPU VA list of the &drm_gem_object it is 1815 835 * associated with. 1816 836 * 837 + * This removes the given &va from the GPU VA list of the &drm_gpuvm_bo and 838 + * the &drm_gpuvm_bo from the &drm_gem_object it is associated with in case 839 + * this call unlinks the last &drm_gpuva from the &drm_gpuvm_bo. 840 + * 841 + * For every &drm_gpuva entry removed from the &drm_gpuvm_bo a reference of 842 + * the latter is dropped. 843 + * 1817 844 * This function expects the caller to protect the GEM's GPUVA list against 1818 - * concurrent access using the GEMs dma_resv lock. 845 + * concurrent access using either the GEMs dma_resv lock or a driver specific 846 + * lock set through drm_gem_gpuva_set_lock(). 
1819 847 */ 1820 848 void 1821 849 drm_gpuva_unlink(struct drm_gpuva *va) 1822 850 { 1823 851 struct drm_gem_object *obj = va->gem.obj; 852 + struct drm_gpuvm_bo *vm_bo = va->vm_bo; 1824 853 1825 854 if (unlikely(!obj)) 1826 855 return; 1827 856 1828 857 drm_gem_gpuva_assert_lock_held(obj); 1829 - 1830 858 list_del_init(&va->gem.entry); 859 + 860 + va->vm_bo = NULL; 861 + drm_gpuvm_bo_put(vm_bo); 1831 862 } 1832 863 EXPORT_SYMBOL_GPL(drm_gpuva_unlink); 1833 864 ··· 1983 992 struct drm_gpuva *next, 1984 993 struct drm_gpuva_op_remap *op) 1985 994 { 1986 - struct drm_gpuva *curr = op->unmap->va; 1987 - struct drm_gpuvm *gpuvm = curr->vm; 995 + struct drm_gpuva *va = op->unmap->va; 996 + struct drm_gpuvm *gpuvm = va->vm; 1988 997 1989 - drm_gpuva_remove(curr); 998 + drm_gpuva_remove(va); 1990 999 1991 1000 if (op->prev) { 1992 1001 drm_gpuva_init_from_op(prev, op->prev); ··· 2628 1637 EXPORT_SYMBOL_GPL(drm_gpuvm_prefetch_ops_create); 2629 1638 2630 1639 /** 2631 - * drm_gpuvm_gem_unmap_ops_create() - creates the &drm_gpuva_ops to unmap a GEM 2632 - * @gpuvm: the &drm_gpuvm representing the GPU VA space 2633 - * @obj: the &drm_gem_object to unmap 1640 + * drm_gpuvm_bo_unmap_ops_create() - creates the &drm_gpuva_ops to unmap a GEM 1641 + * @vm_bo: the &drm_gpuvm_bo abstraction 2634 1642 * 2635 1643 * This function creates a list of operations to perform unmapping for every 2636 1644 * GPUVA attached to a GEM. 
··· 2646 1656 * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure 2647 1657 */ 2648 1658 struct drm_gpuva_ops * 2649 - drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm, 2650 - struct drm_gem_object *obj) 1659 + drm_gpuvm_bo_unmap_ops_create(struct drm_gpuvm_bo *vm_bo) 2651 1660 { 2652 1661 struct drm_gpuva_ops *ops; 2653 1662 struct drm_gpuva_op *op; 2654 1663 struct drm_gpuva *va; 2655 1664 int ret; 2656 1665 2657 - drm_gem_gpuva_assert_lock_held(obj); 1666 + drm_gem_gpuva_assert_lock_held(vm_bo->obj); 2658 1667 2659 1668 ops = kzalloc(sizeof(*ops), GFP_KERNEL); 2660 1669 if (!ops) ··· 2661 1672 2662 1673 INIT_LIST_HEAD(&ops->list); 2663 1674 2664 - drm_gem_for_each_gpuva(va, obj) { 2665 - op = gpuva_op_alloc(gpuvm); 1675 + drm_gpuvm_bo_for_each_va(va, vm_bo) { 1676 + op = gpuva_op_alloc(vm_bo->vm); 2666 1677 if (!op) { 2667 1678 ret = -ENOMEM; 2668 1679 goto err_free_ops; ··· 2676 1687 return ops; 2677 1688 2678 1689 err_free_ops: 2679 - drm_gpuva_ops_free(gpuvm, ops); 1690 + drm_gpuva_ops_free(vm_bo->vm, ops); 2680 1691 return ERR_PTR(ret); 2681 1692 } 2682 - EXPORT_SYMBOL_GPL(drm_gpuvm_gem_unmap_ops_create); 1693 + EXPORT_SYMBOL_GPL(drm_gpuvm_bo_unmap_ops_create); 2683 1694 2684 1695 /** 2685 1696 * drm_gpuva_ops_free() - free the given &drm_gpuva_ops
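Reviewer note: the drm_gpuvm_bo_find(), drm_gpuvm_bo_obtain() and drm_gpuvm_bo_put() additions above follow a find-or-create pattern with reference counting. The following is a minimal userspace sketch of that pattern, not the kernel implementation. All names in it (`struct vm`, `struct obj`, `struct vm_bo`, `vm_bo_find`, `vm_bo_obtain`, `vm_bo_put`) are hypothetical stand-ins. It assumes a plain counter instead of struct kref and a singly linked list instead of the GEM's gpuva list, it ignores locking, and it returns NULL on allocation failure where the kernel function returns an ERR_PTR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel types. */
struct vm { int id; };

struct obj;

struct vm_bo {
	struct vm *vm;       /* the VM half of the (VM, object) pair */
	struct obj *obj;     /* the object half of the pair */
	unsigned int refs;   /* plain counter standing in for struct kref */
	struct vm_bo *next;  /* singly linked stand-in for the gpuva list */
};

struct obj { struct vm_bo *vm_bos; };

/* Modeled on drm_gpuvm_bo_find(): a hit is returned with an extra reference. */
static struct vm_bo *vm_bo_find(struct vm *vm, struct obj *obj)
{
	for (struct vm_bo *it = obj->vm_bos; it; it = it->next) {
		if (it->vm == vm) {
			it->refs++;
			return it;
		}
	}
	return NULL;
}

/* Modeled on drm_gpuvm_bo_obtain(): reuse an existing entry, else create one. */
static struct vm_bo *vm_bo_obtain(struct vm *vm, struct obj *obj)
{
	struct vm_bo *vm_bo = vm_bo_find(vm, obj);

	if (vm_bo)
		return vm_bo;

	vm_bo = calloc(1, sizeof(*vm_bo));
	if (!vm_bo)
		return NULL; /* the kernel function returns ERR_PTR(-ENOMEM) */

	vm_bo->vm = vm;
	vm_bo->obj = obj;
	vm_bo->refs = 1;
	/* Prepend for brevity; list order does not matter in this sketch. */
	vm_bo->next = obj->vm_bos;
	obj->vm_bos = vm_bo;
	return vm_bo;
}

/*
 * Modeled on drm_gpuvm_bo_put(): drop one reference and, on the last one,
 * unlink the entry from the object's list and free it.
 * Returns 1 if the entry was destroyed, 0 otherwise.
 */
static int vm_bo_put(struct vm_bo *vm_bo)
{
	struct vm_bo **pp;

	if (!vm_bo)
		return 0;

	assert(vm_bo->refs > 0);
	if (--vm_bo->refs)
		return 0;

	for (pp = &vm_bo->obj->vm_bos; *pp != vm_bo; pp = &(*pp)->next)
		;
	*pp = vm_bo->next;
	free(vm_bo);
	return 1;
}
```

In this model, two obtain calls for the same (VM, object) pair return the same entry with two references, and only the second put unlinks and frees it. That mirrors why the kernel comment requires the caller to hold the GEM gpuva lock whenever a put may drop the last reference.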

+6

drivers/gpu/drm/drm_internal.h

··· 22 22 */ 23 23 24 24 #include <linux/kthread.h> 25 + #include <linux/types.h> 25 26 26 27 #include <drm/drm_ioctl.h> 27 28 #include <drm/drm_vblank.h> ··· 32 31 33 32 #define DRM_IF_VERSION(maj, min) (maj << 16 | min) 34 33 34 + struct cea_sad; 35 35 struct dentry; 36 36 struct dma_buf; 37 37 struct iosys_map; ··· 269 267 void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent, 270 268 const struct drm_framebuffer *fb); 271 269 void drm_framebuffer_debugfs_init(struct drm_device *dev); 270 + 271 + /* drm_edid.c */ 272 + void drm_edid_cta_sad_get(const struct cea_sad *cta_sad, u8 *sad); 273 + void drm_edid_cta_sad_set(struct cea_sad *cta_sad, const u8 *sad);

+1

drivers/gpu/drm/drm_ioctl.c

··· 675 675 DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB, drm_mode_addfb_ioctl, 0), 676 676 DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB2, drm_mode_addfb2_ioctl, 0), 677 677 DRM_IOCTL_DEF(DRM_IOCTL_MODE_RMFB, drm_mode_rmfb_ioctl, 0), 678 + DRM_IOCTL_DEF(DRM_IOCTL_MODE_CLOSEFB, drm_mode_closefb_ioctl, 0), 678 679 DRM_IOCTL_DEF(DRM_IOCTL_MODE_PAGE_FLIP, drm_mode_page_flip_ioctl, DRM_MASTER), 679 680 DRM_IOCTL_DEF(DRM_IOCTL_MODE_DIRTYFB, drm_mode_dirtyfb_ioctl, DRM_MASTER), 680 681 DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_DUMB, drm_mode_create_dumb_ioctl, 0),

+12 -7

drivers/gpu/drm/drm_mipi_dbi.c

··· 197 197 * @fb: The source framebuffer 198 198 * @clip: Clipping rectangle of the area to be copied 199 199 * @swap: When true, swap MSB/LSB of 16-bit values 200 + * @fmtcnv_state: Format-conversion state 200 201 * 201 202 * Returns: 202 203 * Zero on success, negative error code on failure. 203 204 */ 204 205 int mipi_dbi_buf_copy(void *dst, struct iosys_map *src, struct drm_framebuffer *fb, 205 - struct drm_rect *clip, bool swap) 206 + struct drm_rect *clip, bool swap, 207 + struct drm_format_conv_state *fmtcnv_state) 206 208 { 207 209 struct drm_gem_object *gem = drm_gem_fb_get_obj(fb, 0); 208 210 struct iosys_map dst_map = IOSYS_MAP_INIT_VADDR(dst); ··· 217 215 switch (fb->format->format) { 218 216 case DRM_FORMAT_RGB565: 219 217 if (swap) 220 - drm_fb_swab(&dst_map, NULL, src, fb, clip, !gem->import_attach); 218 + drm_fb_swab(&dst_map, NULL, src, fb, clip, !gem->import_attach, 219 + fmtcnv_state); 221 220 else 222 221 drm_fb_memcpy(&dst_map, NULL, src, fb, clip); 223 222 break; 224 223 case DRM_FORMAT_XRGB8888: 225 - drm_fb_xrgb8888_to_rgb565(&dst_map, NULL, src, fb, clip, swap); 224 + drm_fb_xrgb8888_to_rgb565(&dst_map, NULL, src, fb, clip, fmtcnv_state, swap); 226 225 break; 227 226 default: 228 227 drm_err_once(fb->dev, "Format is not supported: %p4cc\n", ··· 255 252 } 256 253 257 254 static void mipi_dbi_fb_dirty(struct iosys_map *src, struct drm_framebuffer *fb, 258 - struct drm_rect *rect) 255 + struct drm_rect *rect, struct drm_format_conv_state *fmtcnv_state) 259 256 { 260 257 struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(fb->dev); 261 258 unsigned int height = rect->y2 - rect->y1; ··· 273 270 if (!dbi->dc || !full || swap || 274 271 fb->format->format == DRM_FORMAT_XRGB8888) { 275 272 tr = dbidev->tx_buf; 276 - ret = mipi_dbi_buf_copy(tr, src, fb, rect, swap); 273 + ret = mipi_dbi_buf_copy(tr, src, fb, rect, swap, fmtcnv_state); 277 274 if (ret) 278 275 goto err_msg; 279 276 } else { ··· 335 332 return; 336 333 337 334 if 
(drm_atomic_helper_damage_merged(old_state, state, &rect)) 338 - mipi_dbi_fb_dirty(&shadow_plane_state->data[0], fb, &rect); 335 + mipi_dbi_fb_dirty(&shadow_plane_state->data[0], fb, &rect, 336 + &shadow_plane_state->fmtcnv_state); 339 337 340 338 drm_dev_exit(idx); 341 339 } ··· 372 368 if (!drm_dev_enter(&dbidev->drm, &idx)) 373 369 return; 374 370 375 - mipi_dbi_fb_dirty(&shadow_plane_state->data[0], fb, &rect); 371 + mipi_dbi_fb_dirty(&shadow_plane_state->data[0], fb, &rect, 372 + &shadow_plane_state->fmtcnv_state); 376 373 backlight_enable(dbidev->backlight); 377 374 378 375 drm_dev_exit(idx);

+1 -1

drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c

··· 535 535 536 536 ret = drm_sched_job_init(&submit->sched_job, 537 537 &ctx->sched_entity[args->pipe], 538 - submit->ctx); 538 + 1, submit->ctx); 539 539 if (ret) 540 540 goto err_submit_put; 541 541

+1 -1

drivers/gpu/drm/etnaviv/etnaviv_gpu.c

··· 1917 1917 u32 idle, mask; 1918 1918 1919 1919 /* If there are any jobs in the HW queue, we're not idle */ 1920 - if (atomic_read(&gpu->sched.hw_rq_count)) 1920 + if (atomic_read(&gpu->sched.credit_count)) 1921 1921 return -EBUSY; 1922 1922 1923 1923 /* Check whether the hardware (except FE and MC) is idle */

+1 -1

drivers/gpu/drm/etnaviv/etnaviv_sched.c

··· 134 134 { 135 135 int ret; 136 136 137 - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, 137 + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL, 138 138 DRM_SCHED_PRIORITY_COUNT, 139 139 etnaviv_hw_jobs_limit, etnaviv_job_hang_limit, 140 140 msecs_to_jiffies(500), NULL, NULL,

+18 -12

drivers/gpu/drm/gud/gud_pipe.c

··· 51 51 52 52 static size_t gud_xrgb8888_to_r124(u8 *dst, const struct drm_format_info *format, 53 53 void *src, struct drm_framebuffer *fb, 54 - struct drm_rect *rect) 54 + struct drm_rect *rect, 55 + struct drm_format_conv_state *fmtcnv_state) 55 56 { 56 57 unsigned int block_width = drm_format_info_block_width(format, 0); 57 58 unsigned int bits_per_pixel = 8 / block_width; ··· 76 75 77 76 iosys_map_set_vaddr(&dst_map, buf); 78 77 iosys_map_set_vaddr(&vmap, src); 79 - drm_fb_xrgb8888_to_gray8(&dst_map, NULL, &vmap, fb, rect); 78 + drm_fb_xrgb8888_to_gray8(&dst_map, NULL, &vmap, fb, rect, fmtcnv_state); 80 79 pix8 = buf; 81 80 82 81 for (y = 0; y < height; y++) { ··· 153 152 static int gud_prep_flush(struct gud_device *gdrm, struct drm_framebuffer *fb, 154 153 const struct iosys_map *src, bool cached_reads, 155 154 const struct drm_format_info *format, struct drm_rect *rect, 156 - struct gud_set_buffer_req *req) 155 + struct gud_set_buffer_req *req, 156 + struct drm_format_conv_state *fmtcnv_state) 157 157 { 158 158 u8 compression = gdrm->compression; 159 159 struct iosys_map dst; ··· 180 178 */ 181 179 if (format != fb->format) { 182 180 if (format->format == GUD_DRM_FORMAT_R1) { 183 - len = gud_xrgb8888_to_r124(buf, format, vaddr, fb, rect); 181 + len = gud_xrgb8888_to_r124(buf, format, vaddr, fb, rect, fmtcnv_state); 184 182 if (!len) 185 183 return -ENOMEM; 186 184 } else if (format->format == DRM_FORMAT_R8) { 187 - drm_fb_xrgb8888_to_gray8(&dst, NULL, src, fb, rect); 185 + drm_fb_xrgb8888_to_gray8(&dst, NULL, src, fb, rect, fmtcnv_state); 188 186 } else if (format->format == DRM_FORMAT_RGB332) { 189 - drm_fb_xrgb8888_to_rgb332(&dst, NULL, src, fb, rect); 187 + drm_fb_xrgb8888_to_rgb332(&dst, NULL, src, fb, rect, fmtcnv_state); 190 188 } else if (format->format == DRM_FORMAT_RGB565) { 191 - drm_fb_xrgb8888_to_rgb565(&dst, NULL, src, fb, rect, 189 + drm_fb_xrgb8888_to_rgb565(&dst, NULL, src, fb, rect, fmtcnv_state, 192 190 gud_is_big_endian()); 193 191 } 
else if (format->format == DRM_FORMAT_RGB888) { 194 - drm_fb_xrgb8888_to_rgb888(&dst, NULL, src, fb, rect); 192 + drm_fb_xrgb8888_to_rgb888(&dst, NULL, src, fb, rect, fmtcnv_state); 195 193 } else { 196 194 len = gud_xrgb8888_to_color(buf, format, vaddr, fb, rect); 197 195 } 198 196 } else if (gud_is_big_endian() && format->cpp[0] > 1) { 199 - drm_fb_swab(&dst, NULL, src, fb, rect, cached_reads); 197 + drm_fb_swab(&dst, NULL, src, fb, rect, cached_reads, fmtcnv_state); 200 198 } else if (compression && cached_reads && pitch == fb->pitches[0]) { 201 199 /* can compress directly from the framebuffer */ 202 200 buf = vaddr + rect->y1 * pitch; ··· 268 266 269 267 static int gud_flush_rect(struct gud_device *gdrm, struct drm_framebuffer *fb, 270 268 const struct iosys_map *src, bool cached_reads, 271 - const struct drm_format_info *format, struct drm_rect *rect) 269 + const struct drm_format_info *format, struct drm_rect *rect, 270 + struct drm_format_conv_state *fmtcnv_state) 272 271 { 273 272 struct gud_set_buffer_req req; 274 273 size_t len, trlen; ··· 277 274 278 275 drm_dbg(&gdrm->drm, "Flushing [FB:%d] " DRM_RECT_FMT "\n", fb->base.id, DRM_RECT_ARG(rect)); 279 276 280 - ret = gud_prep_flush(gdrm, fb, src, cached_reads, format, rect, &req); 277 + ret = gud_prep_flush(gdrm, fb, src, cached_reads, format, rect, &req, fmtcnv_state); 281 278 if (ret) 282 279 return ret; 283 280 ··· 321 318 const struct iosys_map *src, bool cached_reads, 322 319 struct drm_rect *damage) 323 320 { 321 + struct drm_format_conv_state fmtcnv_state = DRM_FORMAT_CONV_STATE_INIT; 324 322 const struct drm_format_info *format; 325 323 unsigned int i, lines; 326 324 size_t pitch; ··· 344 340 rect.y1 += i * lines; 345 341 rect.y2 = min_t(u32, rect.y1 + lines, damage->y2); 346 342 347 - ret = gud_flush_rect(gdrm, fb, src, cached_reads, format, &rect); 343 + ret = gud_flush_rect(gdrm, fb, src, cached_reads, format, &rect, &fmtcnv_state); 348 344 if (ret) { 349 345 if (ret != -ENODEV && ret != 
-ECONNRESET && 350 346 ret != -ESHUTDOWN && ret != -EPROTO) ··· 354 350 break; 355 351 } 356 352 } 353 + 354 + drm_format_conv_state_release(&fmtcnv_state); 357 355 } 358 356 359 357 void gud_flush_work(struct work_struct *work)
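Reviewer note: the gud_pipe.c hunks above initialize one struct drm_format_conv_state per flush, pass it down to every conversion helper, and release it once at the end. The sketch below illustrates why that shape is useful, under the assumption that the state carries a scratch buffer that is grown on demand and reused between calls. The names `struct conv_state`, `CONV_STATE_INIT`, `conv_state_reserve` and `conv_state_release` are hypothetical userspace stand-ins and not the DRM API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reusable format-conversion state. */
struct conv_state {
	void *buf;    /* scratch buffer shared by successive conversions */
	size_t size;  /* current capacity of buf in bytes */
};

#define CONV_STATE_INIT { NULL, 0 }

/*
 * Return a scratch buffer of at least `need` bytes. The buffer only grows;
 * a smaller request reuses the existing allocation. Returns NULL if the
 * allocation fails, in which case the old buffer is left untouched.
 */
static void *conv_state_reserve(struct conv_state *st, size_t need)
{
	assert(st != NULL);

	if (need > st->size) {
		void *p = realloc(st->buf, need);

		if (!p)
			return NULL;
		st->buf = p;
		st->size = need;
	}
	return st->buf;
}

/* Free the scratch buffer once, after the last conversion. */
static void conv_state_release(struct conv_state *st)
{
	free(st->buf);
	st->buf = NULL;
	st->size = 0;
}
```

With this shape, a loop that flushes a damage area in several slices pays for at most a few allocations instead of one per slice, which matches the init-once, release-once placement in the hunk above.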

+1

drivers/gpu/drm/i915/display/intel_audio.c

··· 25 25 #include <linux/kernel.h> 26 26 27 27 #include <drm/drm_edid.h> 28 + #include <drm/drm_eld.h> 28 29 #include <drm/i915_component.h> 29 30 30 31 #include "i915_drv.h"

+1

drivers/gpu/drm/i915/display/intel_crtc_state_dump.c

··· 4 4 */ 5 5 6 6 #include <drm/drm_edid.h> 7 + #include <drm/drm_eld.h> 7 8 8 9 #include "i915_drv.h" 9 10 #include "intel_crtc_state_dump.h"

+1

drivers/gpu/drm/i915/display/intel_sdvo.c

··· 35 35 #include <drm/drm_atomic_helper.h> 36 36 #include <drm/drm_crtc.h> 37 37 #include <drm/drm_edid.h> 38 + #include <drm/drm_eld.h> 38 39 39 40 #include "i915_drv.h" 40 41 #include "i915_reg.h"

+1 -1

drivers/gpu/drm/lima/lima_device.c

··· 514 514 515 515 /* check any task running */ 516 516 for (i = 0; i < lima_pipe_num; i++) { 517 - if (atomic_read(&ldev->pipe[i].base.hw_rq_count)) 517 + if (atomic_read(&ldev->pipe[i].base.credit_count)) 518 518 return -EBUSY; 519 519 } 520 520

+2 -2

drivers/gpu/drm/lima/lima_sched.c

··· 123 123 for (i = 0; i < num_bos; i++) 124 124 drm_gem_object_get(&bos[i]->base.base); 125 125 126 - err = drm_sched_job_init(&task->base, &context->base, vm); 126 + err = drm_sched_job_init(&task->base, &context->base, 1, vm); 127 127 if (err) { 128 128 kfree(task->bos); 129 129 return err; ··· 488 488 489 489 INIT_WORK(&pipe->recover_work, lima_sched_recover_work); 490 490 491 - return drm_sched_init(&pipe->base, &lima_sched_ops, 491 + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 492 492 DRM_SCHED_PRIORITY_COUNT, 493 493 1, 494 494 lima_job_hang_limit,

+4 -2

drivers/gpu/drm/msm/adreno/adreno_device.c

··· 841 841 */ 842 842 for (i = 0; i < gpu->nr_rings; i++) { 843 843 struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; 844 - kthread_park(sched->thread); 844 + 845 + drm_sched_wqueue_stop(sched); 845 846 } 846 847 } 847 848 ··· 852 851 853 852 for (i = 0; i < gpu->nr_rings; i++) { 854 853 struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; 855 - kthread_unpark(sched->thread); 854 + 855 + drm_sched_wqueue_start(sched); 856 856 } 857 857 } 858 858

+1 -1

drivers/gpu/drm/msm/msm_gem_submit.c

··· 48 48 return ERR_PTR(ret); 49 49 } 50 50 51 - ret = drm_sched_job_init(&submit->base, queue->entity, queue); 51 + ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue); 52 52 if (ret) { 53 53 kfree(submit->hw_fence); 54 54 kfree(submit);

+1 -1

drivers/gpu/drm/msm/msm_ringbuffer.c

··· 94 94 /* currently managing hangcheck ourselves: */ 95 95 sched_timeout = MAX_SCHEDULE_TIMEOUT; 96 96 97 - ret = drm_sched_init(&ring->sched, &msm_sched_ops, 97 + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL, 98 98 DRM_SCHED_PRIORITY_COUNT, 99 99 num_hw_submissions, 0, sched_timeout, 100 100 NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);

+1

drivers/gpu/drm/nouveau/dispnv50/disp.c

··· 38 38 #include <drm/drm_atomic.h> 39 39 #include <drm/drm_atomic_helper.h> 40 40 #include <drm/drm_edid.h> 41 + #include <drm/drm_eld.h> 41 42 #include <drm/drm_fb_helper.h> 42 43 #include <drm/drm_probe_helper.h> 43 44 #include <drm/drm_vblank.h>

+9 -2

drivers/gpu/drm/nouveau/nouveau_bo.c

··· 148 148 * If nouveau_bo_new() allocated this buffer, the GEM object was never 149 149 * initialized, so don't attempt to release it. 150 150 */ 151 - if (bo->base.dev) 151 + if (bo->base.dev) { 152 + /* Gem objects not being shared with other VMs get their 153 + * dma_resv from a root GEM object. 154 + */ 155 + if (nvbo->no_share) 156 + drm_gem_object_put(nvbo->r_obj); 157 + 152 158 drm_gem_object_release(&bo->base); 153 - else 159 + } else { 154 160 dma_resv_fini(&bo->base._resv); 161 + } 155 162 156 163 kfree(nvbo); 157 164 }

+5

drivers/gpu/drm/nouveau/nouveau_bo.h

··· 26 26 struct list_head entry; 27 27 int pbbo_index; 28 28 bool validate_mapped; 29 + 30 + /* Root GEM object we derive the dma_resv of in case this BO is not 31 + * shared between VMs. 32 + */ 33 + struct drm_gem_object *r_obj; 29 34 bool no_share; 30 35 31 36 /* GPU address space is independent of CPU word size */

+4 -1

drivers/gpu/drm/nouveau/nouveau_drm.c

··· 190 190 static void 191 191 nouveau_cli_fini(struct nouveau_cli *cli) 192 192 { 193 + struct nouveau_uvmm *uvmm = nouveau_cli_uvmm_locked(cli); 194 + 193 195 /* All our channels are dead now, which means all the fences they 194 196 * own are signalled, and all callback functions have been called. 195 197 * ··· 201 199 WARN_ON(!list_empty(&cli->worker)); 202 200 203 201 usif_client_fini(cli); 204 - nouveau_uvmm_fini(&cli->uvmm); 202 + if (uvmm) 203 + nouveau_uvmm_fini(uvmm); 205 204 nouveau_sched_entity_fini(&cli->sched_entity); 206 205 nouveau_vmm_fini(&cli->svm); 207 206 nouveau_vmm_fini(&cli->vmm);

+5 -5

drivers/gpu/drm/nouveau/nouveau_drv.h

··· 93 93 struct nvif_mmu mmu; 94 94 struct nouveau_vmm vmm; 95 95 struct nouveau_vmm svm; 96 - struct nouveau_uvmm uvmm; 96 + struct { 97 + struct nouveau_uvmm *ptr; 98 + bool disabled; 99 + } uvmm; 97 100 98 101 struct nouveau_sched_entity sched_entity; 99 102 ··· 124 121 static inline struct nouveau_uvmm * 125 122 nouveau_cli_uvmm(struct nouveau_cli *cli) 126 123 { 127 - if (!cli || !cli->uvmm.vmm.cli) 128 - return NULL; 129 - 130 - return &cli->uvmm; 124 + return cli ? cli->uvmm.ptr : NULL; 131 125 } 132 126 133 127 static inline struct nouveau_uvmm *

+8 -2

drivers/gpu/drm/nouveau/nouveau_gem.c

··· 111 111 if (vmm->vmm.object.oclass < NVIF_CLASS_VMM_NV50) 112 112 return 0; 113 113 114 - if (nvbo->no_share && uvmm && &uvmm->resv != nvbo->bo.base.resv) 114 + if (nvbo->no_share && uvmm && 115 + drm_gpuvm_resv(&uvmm->base) != nvbo->bo.base.resv) 115 116 return -EPERM; 116 117 117 118 ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL); ··· 246 245 if (unlikely(!uvmm)) 247 246 return -EINVAL; 248 247 249 - resv = &uvmm->resv; 248 + resv = drm_gpuvm_resv(&uvmm->base); 250 249 } 251 250 252 251 if (!(domain & (NOUVEAU_GEM_DOMAIN_VRAM | NOUVEAU_GEM_DOMAIN_GART))) ··· 288 287 NOUVEAU_GEM_DOMAIN_GART; 289 288 if (drm->client.device.info.family >= NV_DEVICE_INFO_V0_TESLA) 290 289 nvbo->valid_domains &= domain; 290 + 291 + if (nvbo->no_share) { 292 + nvbo->r_obj = drm_gpuvm_resv_obj(&uvmm->base); 293 + drm_gem_object_get(nvbo->r_obj); 294 + } 291 295 292 296 *pnvbo = nvbo; 293 297 return 0;

+2 -2

drivers/gpu/drm/nouveau/nouveau_sched.c

··· 89 89 90 90 } 91 91 92 - ret = drm_sched_job_init(&job->base, &entity->base, NULL); 92 + ret = drm_sched_job_init(&job->base, &entity->base, 1, NULL); 93 93 if (ret) 94 94 goto err_free_chains; 95 95 ··· 435 435 if (!drm->sched_wq) 436 436 return -ENOMEM; 437 437 438 - return drm_sched_init(sched, &nouveau_sched_ops, 438 + return drm_sched_init(sched, &nouveau_sched_ops, NULL, 439 439 DRM_SCHED_PRIORITY_COUNT, 440 440 NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit, 441 441 NULL, NULL, "nouveau_sched", drm->dev->dev);

+101 -73

drivers/gpu/drm/nouveau/nouveau_uvmm.c

··· 62 62 enum vm_bind_op op; 63 63 u32 flags; 64 64 65 + struct drm_gpuvm_bo *vm_bo; 66 + 65 67 struct { 66 68 u64 addr; 67 69 u64 range; ··· 931 929 static int 932 930 nouveau_uvmm_validate_range(struct nouveau_uvmm *uvmm, u64 addr, u64 range) 933 931 { 934 - u64 end = addr + range; 935 - u64 kernel_managed_end = uvmm->kernel_managed_addr + 936 - uvmm->kernel_managed_size; 937 - 938 932 if (addr & ~PAGE_MASK) 939 933 return -EINVAL; 940 934 941 935 if (range & ~PAGE_MASK) 942 936 return -EINVAL; 943 937 944 - if (end <= addr) 945 - return -EINVAL; 946 - 947 - if (addr < NOUVEAU_VA_SPACE_START || 948 - end > NOUVEAU_VA_SPACE_END) 949 - return -EINVAL; 950 - 951 - if (addr < kernel_managed_end && 952 - end > uvmm->kernel_managed_addr) 938 + if (!drm_gpuvm_range_valid(&uvmm->base, addr, range)) 953 939 return -EINVAL; 954 940 955 941 return 0; ··· 1103 1113 } 1104 1114 1105 1115 static void 1106 - bind_link_gpuvas(struct drm_gpuva_ops *ops, struct nouveau_uvma_prealloc *new) 1116 + bind_link_gpuvas(struct bind_job_op *bop) 1107 1117 { 1118 + struct nouveau_uvma_prealloc *new = &bop->new; 1119 + struct drm_gpuvm_bo *vm_bo = bop->vm_bo; 1120 + struct drm_gpuva_ops *ops = bop->ops; 1108 1121 struct drm_gpuva_op *op; 1109 1122 1110 1123 drm_gpuva_for_each_op(op, ops) { 1111 1124 switch (op->op) { 1112 1125 case DRM_GPUVA_OP_MAP: 1113 - drm_gpuva_link(&new->map->va); 1126 + drm_gpuva_link(&new->map->va, vm_bo); 1114 1127 break; 1115 - case DRM_GPUVA_OP_REMAP: 1128 + case DRM_GPUVA_OP_REMAP: { 1129 + struct drm_gpuva *va = op->remap.unmap->va; 1130 + 1116 1131 if (op->remap.prev) 1117 - drm_gpuva_link(&new->prev->va); 1132 + drm_gpuva_link(&new->prev->va, va->vm_bo); 1118 1133 if (op->remap.next) 1119 - drm_gpuva_link(&new->next->va); 1120 - drm_gpuva_unlink(op->remap.unmap->va); 1134 + drm_gpuva_link(&new->next->va, va->vm_bo); 1135 + drm_gpuva_unlink(va); 1121 1136 break; 1137 + } 1122 1138 case DRM_GPUVA_OP_UNMAP: 1123 1139 drm_gpuva_unlink(op->unmap.va); 1124 1140 
break; ··· 1146 1150 1147 1151 list_for_each_op(op, &bind_job->ops) { 1148 1152 if (op->op == OP_MAP) { 1149 - op->gem.obj = drm_gem_object_lookup(job->file_priv, 1150 - op->gem.handle); 1151 - if (!op->gem.obj) 1153 + struct drm_gem_object *obj = op->gem.obj = 1154 + drm_gem_object_lookup(job->file_priv, 1155 + op->gem.handle); 1156 + if (!obj) 1152 1157 return -ENOENT; 1158 + 1159 + dma_resv_lock(obj->resv, NULL); 1160 + op->vm_bo = drm_gpuvm_bo_obtain(&uvmm->base, obj); 1161 + dma_resv_unlock(obj->resv); 1162 + if (IS_ERR(op->vm_bo)) 1163 + return PTR_ERR(op->vm_bo); 1153 1164 } 1154 1165 1155 1166 ret = bind_validate_op(job, op); ··· 1367 1364 case OP_UNMAP_SPARSE: 1368 1365 case OP_MAP: 1369 1366 case OP_UNMAP: 1370 - bind_link_gpuvas(op->ops, &op->new); 1367 + bind_link_gpuvas(op); 1371 1368 break; 1372 1369 default: 1373 1370 break; ··· 1514 1511 if (!IS_ERR_OR_NULL(op->ops)) 1515 1512 drm_gpuva_ops_free(&uvmm->base, op->ops); 1516 1513 1514 + if (!IS_ERR_OR_NULL(op->vm_bo)) { 1515 + dma_resv_lock(obj->resv, NULL); 1516 + drm_gpuvm_bo_put(op->vm_bo); 1517 + dma_resv_unlock(obj->resv); 1518 + } 1519 + 1517 1520 if (obj) 1518 1521 drm_gem_object_put(obj); 1519 1522 } ··· 1657 1648 return ret; 1658 1649 } 1659 1650 1660 - int 1661 - nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, 1662 - void *data, 1663 - struct drm_file *file_priv) 1664 - { 1665 - struct nouveau_cli *cli = nouveau_cli(file_priv); 1666 - struct drm_nouveau_vm_init *init = data; 1667 - 1668 - return nouveau_uvmm_init(&cli->uvmm, cli, init->kernel_managed_addr, 1669 - init->kernel_managed_size); 1670 - } 1671 - 1672 1651 static int 1673 1652 nouveau_uvmm_vm_bind(struct nouveau_uvmm_bind_job_args *args) 1674 1653 { ··· 1773 1776 nouveau_uvmm_bo_map_all(struct nouveau_bo *nvbo, struct nouveau_mem *mem) 1774 1777 { 1775 1778 struct drm_gem_object *obj = &nvbo->bo.base; 1779 + struct drm_gpuvm_bo *vm_bo; 1776 1780 struct drm_gpuva *va; 1777 1781 1778 1782 dma_resv_assert_held(obj->resv); 1779 1783 
1780 - drm_gem_for_each_gpuva(va, obj) { 1781 - struct nouveau_uvma *uvma = uvma_from_va(va); 1784 + drm_gem_for_each_gpuvm_bo(vm_bo, obj) { 1785 + drm_gpuvm_bo_for_each_va(va, vm_bo) { 1786 + struct nouveau_uvma *uvma = uvma_from_va(va); 1782 1787 1783 - nouveau_uvma_map(uvma, mem); 1784 - drm_gpuva_invalidate(va, false); 1788 + nouveau_uvma_map(uvma, mem); 1789 + drm_gpuva_invalidate(va, false); 1790 + } 1785 1791 } 1786 1792 } 1787 1793 ··· 1792 1792 nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo) 1793 1793 { 1794 1794 struct drm_gem_object *obj = &nvbo->bo.base; 1795 + struct drm_gpuvm_bo *vm_bo; 1795 1796 struct drm_gpuva *va; 1796 1797 1797 1798 dma_resv_assert_held(obj->resv); 1798 1799 1799 - drm_gem_for_each_gpuva(va, obj) { 1800 - struct nouveau_uvma *uvma = uvma_from_va(va); 1800 + drm_gem_for_each_gpuvm_bo(vm_bo, obj) { 1801 + drm_gpuvm_bo_for_each_va(va, vm_bo) { 1802 + struct nouveau_uvma *uvma = uvma_from_va(va); 1801 1803 1802 - nouveau_uvma_unmap(uvma); 1803 - drm_gpuva_invalidate(va, true); 1804 + nouveau_uvma_unmap(uvma); 1805 + drm_gpuva_invalidate(va, true); 1806 + } 1804 1807 } 1805 1808 } 1806 1809 1807 - int 1808 - nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli, 1809 - u64 kernel_managed_addr, u64 kernel_managed_size) 1810 + static void 1811 + nouveau_uvmm_free(struct drm_gpuvm *gpuvm) 1810 1812 { 1811 - int ret; 1812 - u64 kernel_managed_end = kernel_managed_addr + kernel_managed_size; 1813 + struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm); 1813 1814 1814 - mutex_init(&uvmm->mutex); 1815 - dma_resv_init(&uvmm->resv); 1816 - mt_init_flags(&uvmm->region_mt, MT_FLAGS_LOCK_EXTERN); 1817 - mt_set_external_lock(&uvmm->region_mt, &uvmm->mutex); 1815 + kfree(uvmm); 1816 + } 1817 + 1818 + static const struct drm_gpuvm_ops gpuvm_ops = { 1819 + .vm_free = nouveau_uvmm_free, 1820 + }; 1821 + 1822 + int 1823 + nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, 1824 + void *data, 1825 + struct drm_file *file_priv) 1826 + { 
1827 + struct nouveau_uvmm *uvmm; 1828 + struct nouveau_cli *cli = nouveau_cli(file_priv); 1829 + struct drm_device *drm = cli->drm->dev; 1830 + struct drm_gem_object *r_obj; 1831 + struct drm_nouveau_vm_init *init = data; 1832 + u64 kernel_managed_end; 1833 + int ret; 1834 + 1835 + if (check_add_overflow(init->kernel_managed_addr, 1836 + init->kernel_managed_size, 1837 + &kernel_managed_end)) 1838 + return -EINVAL; 1839 + 1840 + if (kernel_managed_end > NOUVEAU_VA_SPACE_END) 1841 + return -EINVAL; 1818 1842 1819 1843 mutex_lock(&cli->mutex); 1820 1844 ··· 1847 1823 goto out_unlock; 1848 1824 } 1849 1825 1850 - if (kernel_managed_end <= kernel_managed_addr) { 1851 - ret = -EINVAL; 1826 + uvmm = kzalloc(sizeof(*uvmm), GFP_KERNEL); 1827 + if (!uvmm) { 1828 + ret = -ENOMEM; 1852 1829 goto out_unlock; 1853 1830 } 1854 1831 1855 - if (kernel_managed_end > NOUVEAU_VA_SPACE_END) { 1856 - ret = -EINVAL; 1832 + r_obj = drm_gpuvm_resv_object_alloc(drm); 1833 + if (!r_obj) { 1834 + kfree(uvmm); 1835 + ret = -ENOMEM; 1857 1836 goto out_unlock; 1858 1837 } 1859 1838 1860 - uvmm->kernel_managed_addr = kernel_managed_addr; 1861 - uvmm->kernel_managed_size = kernel_managed_size; 1839 + mutex_init(&uvmm->mutex); 1840 + mt_init_flags(&uvmm->region_mt, MT_FLAGS_LOCK_EXTERN); 1841 + mt_set_external_lock(&uvmm->region_mt, &uvmm->mutex); 1862 1842 1863 - drm_gpuvm_init(&uvmm->base, cli->name, 1843 + drm_gpuvm_init(&uvmm->base, cli->name, 0, drm, r_obj, 1864 1844 NOUVEAU_VA_SPACE_START, 1865 1845 NOUVEAU_VA_SPACE_END, 1866 - kernel_managed_addr, kernel_managed_size, 1867 - NULL); 1846 + init->kernel_managed_addr, 1847 + init->kernel_managed_size, 1848 + &gpuvm_ops); 1849 + /* GPUVM takes care from here on. 
*/ 1850 + drm_gem_object_put(r_obj); 1868 1851 1869 1852 ret = nvif_vmm_ctor(&cli->mmu, "uvmm", 1870 1853 cli->vmm.vmm.object.oclass, RAW, 1871 - kernel_managed_addr, kernel_managed_size, 1872 - NULL, 0, &cli->uvmm.vmm.vmm); 1854 + init->kernel_managed_addr, 1855 + init->kernel_managed_size, 1856 + NULL, 0, &uvmm->vmm.vmm); 1873 1857 if (ret) 1874 - goto out_free_gpuva_mgr; 1858 + goto out_gpuvm_fini; 1875 1859 1876 - cli->uvmm.vmm.cli = cli; 1860 + uvmm->vmm.cli = cli; 1861 + cli->uvmm.ptr = uvmm; 1877 1862 mutex_unlock(&cli->mutex); 1878 1863 1879 1864 return 0; 1880 1865 1881 - out_free_gpuva_mgr: 1882 - drm_gpuvm_destroy(&uvmm->base); 1866 + out_gpuvm_fini: 1867 + drm_gpuvm_put(&uvmm->base); 1883 1868 out_unlock: 1884 1869 mutex_unlock(&cli->mutex); 1885 1870 return ret; ··· 1902 1869 struct nouveau_cli *cli = uvmm->vmm.cli; 1903 1870 struct nouveau_sched_entity *entity = &cli->sched_entity; 1904 1871 struct drm_gpuva *va, *next; 1905 - 1906 - if (!cli) 1907 - return; 1908 1872 1909 1873 rmb(); /* for list_empty to work without lock */ 1910 1874 wait_event(entity->job.wq, list_empty(&entity->job.list.head)); ··· 1940 1910 1941 1911 mutex_lock(&cli->mutex); 1942 1912 nouveau_vmm_fini(&uvmm->vmm); 1943 - drm_gpuvm_destroy(&uvmm->base); 1913 + drm_gpuvm_put(&uvmm->base); 1944 1914 mutex_unlock(&cli->mutex); 1945 - 1946 - dma_resv_fini(&uvmm->resv); 1947 1915 }
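
Note on the range check in the hunk above: the reworked nouveau_uvmm_ioctl_vm_init() computes the end of the kernel-managed range with check_add_overflow() and compares it against NOUVEAU_VA_SPACE_END before taking cli->mutex. A minimal userspace sketch of that check follows. It assumes the GCC/Clang builtin __builtin_add_overflow as a stand-in for the kernel macro. VA_SPACE_END and validate_kernel_managed_range() are hypothetical names, and the bound's value is chosen for illustration only.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical upper bound of the VA space, for illustration only. */
#define VA_SPACE_END (1ULL << 47)

/*
 * Returns 0 if addr + size does not wrap and ends at or below
 * VA_SPACE_END, -EINVAL otherwise.
 */
static int validate_kernel_managed_range(uint64_t addr, uint64_t size)
{
	uint64_t end;

	/* The builtin returns true when addr + size wraps around. */
	if (__builtin_add_overflow(addr, size, &end))
		return -EINVAL;

	if (end > VA_SPACE_END)
		return -EINVAL;

	return 0;
}
```

One difference from the removed `kernel_managed_end <= kernel_managed_addr` test is visible in the sketch: an overflow check on its own accepts a zero-sized range, whereas the old comparison rejected it.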

-8

drivers/gpu/drm/nouveau/nouveau_uvmm.h

··· 12 12 struct nouveau_vmm vmm; 13 13 struct maple_tree region_mt; 14 14 struct mutex mutex; 15 - struct dma_resv resv; 16 - 17 - u64 kernel_managed_addr; 18 - u64 kernel_managed_size; 19 - 20 - bool disabled; 21 15 }; 22 16 23 17 struct nouveau_uvma_region { ··· 76 82 77 83 #define to_uvmm_bind_job(job) container_of((job), struct nouveau_uvmm_bind_job, base) 78 84 79 - int nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli, 80 - u64 kernel_managed_addr, u64 kernel_managed_size); 81 85 void nouveau_uvmm_fini(struct nouveau_uvmm *uvmm); 82 86 83 87 void nouveau_uvmm_bo_map_all(struct nouveau_bo *nvbov, struct nouveau_mem *mem);

+4 -5

drivers/gpu/drm/omapdrm/omap_drv.c

··· 69 69 { 70 70 struct drm_device *dev = old_state->dev; 71 71 struct omap_drm_private *priv = dev->dev_private; 72 - bool fence_cookie = dma_fence_begin_signalling(); 73 72 74 73 dispc_runtime_get(priv->dispc); 75 74 ··· 91 92 omap_atomic_wait_for_completion(dev, old_state); 92 93 93 94 drm_atomic_helper_commit_planes(dev, old_state, 0); 95 + 96 + drm_atomic_helper_commit_hw_done(old_state); 94 97 } else { 95 98 /* 96 99 * OMAP3 DSS seems to have issues with the work-around above, ··· 102 101 drm_atomic_helper_commit_planes(dev, old_state, 0); 103 102 104 103 drm_atomic_helper_commit_modeset_enables(dev, old_state); 104 + 105 + drm_atomic_helper_commit_hw_done(old_state); 105 106 } 106 - 107 - drm_atomic_helper_commit_hw_done(old_state); 108 - 109 - dma_fence_end_signalling(fence_cookie); 110 107 111 108 /* 112 109 * Wait for completion of the page flips to ensure that old buffers

+57 -1

drivers/gpu/drm/panel/panel-edp.c

··· 973 973 }, 974 974 .delay = { 975 975 .hpd_absent = 200, 976 + .unprepare = 500, 977 + .enable = 50, 976 978 }, 977 979 }; 978 980 ··· 1803 1801 .enable = 50, 1804 1802 }; 1805 1803 1804 + static const struct panel_delay delay_200_500_e80 = { 1805 + .hpd_absent = 200, 1806 + .unprepare = 500, 1807 + .enable = 80, 1808 + }; 1809 + 1806 1810 static const struct panel_delay delay_200_500_e80_d50 = { 1807 1811 .hpd_absent = 200, 1808 1812 .unprepare = 500, ··· 1825 1817 static const struct panel_delay delay_200_500_e200 = { 1826 1818 .hpd_absent = 200, 1827 1819 .unprepare = 500, 1820 + .enable = 200, 1821 + }; 1822 + 1823 + static const struct panel_delay delay_200_500_e200_d10 = { 1824 + .hpd_absent = 200, 1825 + .unprepare = 500, 1826 + .enable = 200, 1827 + .disable = 10, 1828 + }; 1829 + 1830 + static const struct panel_delay delay_200_150_e200 = { 1831 + .hpd_absent = 200, 1832 + .unprepare = 150, 1828 1833 .enable = 200, 1829 1834 }; 1830 1835 ··· 1861 1840 EDP_PANEL_ENTRY('A', 'U', 'O', 0x145c, &delay_200_500_e50, "B116XAB01.4"), 1862 1841 EDP_PANEL_ENTRY('A', 'U', 'O', 0x1e9b, &delay_200_500_e50, "B133UAN02.1"), 1863 1842 EDP_PANEL_ENTRY('A', 'U', 'O', 0x1ea5, &delay_200_500_e50, "B116XAK01.6"), 1864 - EDP_PANEL_ENTRY('A', 'U', 'O', 0x405c, &auo_b116xak01.delay, "B116XAK01"), 1843 + EDP_PANEL_ENTRY('A', 'U', 'O', 0x208d, &delay_200_500_e50, "B140HTN02.1"), 1844 + EDP_PANEL_ENTRY('A', 'U', 'O', 0x235c, &delay_200_500_e50, "B116XTN02.3"), 1845 + EDP_PANEL_ENTRY('A', 'U', 'O', 0x239b, &delay_200_500_e50, "B116XAN06.1"), 1846 + EDP_PANEL_ENTRY('A', 'U', 'O', 0x255c, &delay_200_500_e50, "B116XTN02.5"), 1847 + EDP_PANEL_ENTRY('A', 'U', 'O', 0x403d, &delay_200_500_e50, "B140HAN04.0"), 1848 + EDP_PANEL_ENTRY('A', 'U', 'O', 0x405c, &auo_b116xak01.delay, "B116XAK01.0"), 1865 1849 EDP_PANEL_ENTRY('A', 'U', 'O', 0x582d, &delay_200_500_e50, "B133UAN01.0"), 1866 1850 EDP_PANEL_ENTRY('A', 'U', 'O', 0x615c, &delay_200_500_e50, "B116XAN06.1"), 1851 + EDP_PANEL_ENTRY('A', 
'U', 'O', 0x635c, &delay_200_500_e50, "B116XAN06.3"), 1852 + EDP_PANEL_ENTRY('A', 'U', 'O', 0x639c, &delay_200_500_e50, "B140HAK02.7"), 1867 1853 EDP_PANEL_ENTRY('A', 'U', 'O', 0x8594, &delay_200_500_e50, "B133UAN01.0"), 1854 + EDP_PANEL_ENTRY('A', 'U', 'O', 0xf390, &delay_200_500_e50, "B140XTN07.7"), 1868 1855 1856 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0715, &delay_200_150_e200, "NT116WHM-N21"), 1857 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0731, &delay_200_500_e80, "NT116WHM-N42"), 1858 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0741, &delay_200_500_e200, "NT116WHM-N44"), 1869 1859 EDP_PANEL_ENTRY('B', 'O', 'E', 0x0786, &delay_200_500_p2e80, "NV116WHM-T01"), 1870 1860 EDP_PANEL_ENTRY('B', 'O', 'E', 0x07d1, &boe_nv133fhm_n61.delay, "NV133FHM-N61"), 1861 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x07f6, &delay_200_500_e200, "NT140FHM-N44"), 1871 1862 EDP_PANEL_ENTRY('B', 'O', 'E', 0x082d, &boe_nv133fhm_n61.delay, "NV133FHM-N62"), 1863 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x08b2, &delay_200_500_e200, "NT140WHM-N49"), 1864 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x09c3, &delay_200_500_e50, "NT116WHM-N21,836X2"), 1872 1865 EDP_PANEL_ENTRY('B', 'O', 'E', 0x094b, &delay_200_500_e50, "NT116WHM-N21"), 1866 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0951, &delay_200_500_e80, "NV116WHM-N47"), 1873 1867 EDP_PANEL_ENTRY('B', 'O', 'E', 0x095f, &delay_200_500_e50, "NE135FBM-N41 v8.1"), 1868 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0979, &delay_200_500_e50, "NV116WHM-N49 V8.0"), 1874 1869 EDP_PANEL_ENTRY('B', 'O', 'E', 0x098d, &boe_nv110wtm_n61.delay, "NV110WTM-N61"), 1870 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x09ae, &delay_200_500_e200, "NT140FHM-N45"), 1875 1871 EDP_PANEL_ENTRY('B', 'O', 'E', 0x09dd, &delay_200_500_e50, "NT116WHM-N21"), 1876 1872 EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a5d, &delay_200_500_e50, "NV116WHM-N45"), 1877 1873 EDP_PANEL_ENTRY('B', 'O', 'E', 0x0ac5, &delay_200_500_e50, "NV116WHM-N4C"), 1874 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b43, &delay_200_500_e200, "NV140FHM-T09"), 1875 + EDP_PANEL_ENTRY('B', 'O', 'E', 
0x0b56, &delay_200_500_e80, "NT140FHM-N47"), 1876 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0c20, &delay_200_500_e80, "NT140FHM-N47"), 1878 1877 1878 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x1132, &delay_200_500_e80_d50, "N116BGE-EA2"), 1879 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x1138, &innolux_n116bca_ea1.delay, "N116BCA-EA1-RC4"), 1879 1880 EDP_PANEL_ENTRY('C', 'M', 'N', 0x1139, &delay_200_500_e80_d50, "N116BGE-EA2"), 1881 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x1145, &delay_200_500_e80_d50, "N116BCN-EB1"), 1880 1882 EDP_PANEL_ENTRY('C', 'M', 'N', 0x114c, &innolux_n116bca_ea1.delay, "N116BCA-EA1"), 1881 1883 EDP_PANEL_ENTRY('C', 'M', 'N', 0x1152, &delay_200_500_e80_d50, "N116BCN-EA1"), 1882 1884 EDP_PANEL_ENTRY('C', 'M', 'N', 0x1153, &delay_200_500_e80_d50, "N116BGE-EA2"), 1883 1885 EDP_PANEL_ENTRY('C', 'M', 'N', 0x1154, &delay_200_500_e80_d50, "N116BCA-EA2"), 1886 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x1157, &delay_200_500_e80_d50, "N116BGE-EA2"), 1887 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x115b, &delay_200_500_e80_d50, "N116BCN-EB1"), 1884 1888 EDP_PANEL_ENTRY('C', 'M', 'N', 0x1247, &delay_200_500_e80_d50, "N120ACA-EA1"), 1889 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x142b, &delay_200_500_e80_d50, "N140HCA-EAC"), 1890 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x144f, &delay_200_500_e80_d50, "N140HGA-EA1"), 1891 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x1468, &delay_200_500_e80, "N140HGA-EA1"), 1892 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x14e5, &delay_200_500_e80_d50, "N140HGA-EA1"), 1885 1893 EDP_PANEL_ENTRY('C', 'M', 'N', 0x14d4, &delay_200_500_e80_d50, "N140HCA-EAC"), 1894 + EDP_PANEL_ENTRY('C', 'M', 'N', 0x14d6, &delay_200_500_e80_d50, "N140BGA-EA4"), 1886 1895 1896 + EDP_PANEL_ENTRY('H', 'K', 'C', 0x2d5c, &delay_200_500_e200, "MB116AN01-2"), 1897 + 1898 + EDP_PANEL_ENTRY('I', 'V', 'O', 0x048e, &delay_200_500_e200_d10, "M116NWR6 R5"), 1887 1899 EDP_PANEL_ENTRY('I', 'V', 'O', 0x057d, &delay_200_500_e200, "R140NWF5 RH"), 1888 1900 EDP_PANEL_ENTRY('I', 'V', 'O', 0x854a, &delay_200_500_p2e100, "M133NW4J"), 1889 1901 
EDP_PANEL_ENTRY('I', 'V', 'O', 0x854b, &delay_200_500_p2e100, "R133NW4K-R0"), 1902 + EDP_PANEL_ENTRY('I', 'V', 'O', 0x8c4d, &delay_200_150_e200, "R140NWFM R1"), 1890 1903 1891 1904 EDP_PANEL_ENTRY('K', 'D', 'B', 0x0624, &kingdisplay_kd116n21_30nv_a010.delay, "116N21-30NV-A010"), 1905 + EDP_PANEL_ENTRY('K', 'D', 'C', 0x0809, &delay_200_500_e50, "KD116N2930A15"), 1892 1906 EDP_PANEL_ENTRY('K', 'D', 'B', 0x1120, &delay_200_500_e80_d50, "116N29-30NK-C007"), 1893 1907 1894 1908 EDP_PANEL_ENTRY('S', 'H', 'P', 0x1511, &delay_200_500_e50, "LQ140M1JW48"),

+1 -1

drivers/gpu/drm/panel/panel-novatek-nt35510.c

··· 1023 1023 .hdisplay = 480, 1024 1024 .hsync_start = 480 + 2, /* HFP = 2 */ 1025 1025 .hsync_end = 480 + 2 + 0, /* HSync = 0 */ 1026 - .htotal = 480 + 2 + 0 + 5, /* HFP = 5 */ 1026 + .htotal = 480 + 2 + 0 + 5, /* HBP = 5 */ 1027 1027 .vdisplay = 800, 1028 1028 .vsync_start = 800 + 2, /* VFP = 2 */ 1029 1029 .vsync_end = 800 + 2 + 0, /* VSync = 0 */

+74 -4

drivers/gpu/drm/panfrost/panfrost_device.c

··· 403 403 panfrost_job_enable_interrupts(pfdev); 404 404 } 405 405 406 - static int panfrost_device_resume(struct device *dev) 406 + static int panfrost_device_runtime_resume(struct device *dev) 407 407 { 408 408 struct panfrost_device *pfdev = dev_get_drvdata(dev); 409 409 ··· 413 413 return 0; 414 414 } 415 415 416 - static int panfrost_device_suspend(struct device *dev) 416 + static int panfrost_device_runtime_suspend(struct device *dev) 417 417 { 418 418 struct panfrost_device *pfdev = dev_get_drvdata(dev); 419 419 ··· 426 426 return 0; 427 427 } 428 428 429 - EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend, 430 - panfrost_device_resume, NULL); 429 + static int panfrost_device_resume(struct device *dev) 430 + { 431 + struct panfrost_device *pfdev = dev_get_drvdata(dev); 432 + int ret; 433 + 434 + if (pfdev->comp->pm_features & BIT(GPU_PM_VREG_OFF)) { 435 + unsigned long freq = pfdev->pfdevfreq.fast_rate; 436 + struct dev_pm_opp *opp; 437 + 438 + opp = dev_pm_opp_find_freq_ceil(dev, &freq); 439 + if (IS_ERR(opp)) 440 + return PTR_ERR(opp); 441 + dev_pm_opp_set_opp(dev, opp); 442 + dev_pm_opp_put(opp); 443 + } 444 + 445 + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { 446 + ret = clk_enable(pfdev->clock); 447 + if (ret) 448 + goto err_clk; 449 + 450 + if (pfdev->bus_clock) { 451 + ret = clk_enable(pfdev->bus_clock); 452 + if (ret) 453 + goto err_bus_clk; 454 + } 455 + } 456 + 457 + ret = pm_runtime_force_resume(dev); 458 + if (ret) 459 + goto err_resume; 460 + 461 + return 0; 462 + 463 + err_resume: 464 + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock) 465 + clk_disable(pfdev->bus_clock); 466 + err_bus_clk: 467 + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) 468 + clk_disable(pfdev->clock); 469 + err_clk: 470 + if (pfdev->comp->pm_features & BIT(GPU_PM_VREG_OFF)) 471 + dev_pm_opp_set_opp(dev, NULL); 472 + return ret; 473 + } 474 + 475 + static int panfrost_device_suspend(struct device *dev) 476 + { 477 
+ struct panfrost_device *pfdev = dev_get_drvdata(dev); 478 + int ret; 479 + 480 + ret = pm_runtime_force_suspend(dev); 481 + if (ret) 482 + return ret; 483 + 484 + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { 485 + if (pfdev->bus_clock) 486 + clk_disable(pfdev->bus_clock); 487 + 488 + clk_disable(pfdev->clock); 489 + } 490 + 491 + if (pfdev->comp->pm_features & BIT(GPU_PM_VREG_OFF)) 492 + dev_pm_opp_set_opp(dev, NULL); 493 + 494 + return 0; 495 + } 496 + 497 + EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = { 498 + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, panfrost_device_runtime_resume, NULL) 499 + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume) 500 + };

+13

drivers/gpu/drm/panfrost/panfrost_device.h

··· 25 25 #define NUM_JOB_SLOTS 3 26 26 #define MAX_PM_DOMAINS 5 27 27 28 + /** 29 + * enum panfrost_gpu_pm - Supported kernel power management features 30 + * @GPU_PM_CLK_DIS: Allow disabling clocks during system suspend 31 + * @GPU_PM_VREG_OFF: Allow turning off regulators during system suspend 32 + */ 33 + enum panfrost_gpu_pm { 34 + GPU_PM_CLK_DIS, 35 + GPU_PM_VREG_OFF, 36 + }; 37 + 28 38 struct panfrost_features { 29 39 u16 id; 30 40 u16 revision; ··· 85 75 86 76 /* Vendor implementation quirks callback */ 87 77 void (*vendor_quirk)(struct panfrost_device *pfdev); 78 + 79 + /* Allowed PM features */ 80 + u8 pm_features; 88 81 }; 89 82 90 83 struct panfrost_device {

+4 -1

drivers/gpu/drm/panfrost/panfrost_drv.c

··· 274 274 275 275 ret = drm_sched_job_init(&job->base, 276 276 &file_priv->sched_entity[slot], 277 - NULL); 277 + 1, NULL); 278 278 if (ret) 279 279 goto out_put_job; 280 280 ··· 734 734 .supply_names = mediatek_mt8183_b_supplies, 735 735 .num_pm_domains = ARRAY_SIZE(mediatek_mt8183_pm_domains), 736 736 .pm_domain_names = mediatek_mt8183_pm_domains, 737 + .pm_features = BIT(GPU_PM_CLK_DIS) | BIT(GPU_PM_VREG_OFF), 737 738 }; 738 739 739 740 static const char * const mediatek_mt8186_pm_domains[] = { "core0", "core1" }; ··· 743 742 .supply_names = mediatek_mt8183_b_supplies, 744 743 .num_pm_domains = ARRAY_SIZE(mediatek_mt8186_pm_domains), 745 744 .pm_domain_names = mediatek_mt8186_pm_domains, 745 + .pm_features = BIT(GPU_PM_CLK_DIS) | BIT(GPU_PM_VREG_OFF), 746 746 }; 747 747 748 748 static const char * const mediatek_mt8192_supplies[] = { "mali", NULL }; ··· 754 752 .supply_names = mediatek_mt8192_supplies, 755 753 .num_pm_domains = ARRAY_SIZE(mediatek_mt8192_pm_domains), 756 754 .pm_domain_names = mediatek_mt8192_pm_domains, 755 + .pm_features = BIT(GPU_PM_CLK_DIS) | BIT(GPU_PM_VREG_OFF), 757 756 }; 758 757 759 758 static const struct of_device_id dt_match[] = {

+2 -10

drivers/gpu/drm/panfrost/panfrost_dump.c

··· 220 220 221 221 iter.hdr->bomap.data[0] = bomap - bomap_start; 222 222 223 - for_each_sgtable_page(bo->base.sgt, &page_iter, 0) { 224 - struct page *page = sg_page_iter_page(&page_iter); 225 - 226 - if (!IS_ERR(page)) { 227 - *bomap++ = page_to_phys(page); 228 - } else { 229 - dev_err(pfdev->dev, "Panfrost Dump: wrong page\n"); 230 - *bomap++ = 0; 231 - } 232 - } 223 + for_each_sgtable_page(bo->base.sgt, &page_iter, 0) 224 + *bomap++ = page_to_phys(sg_page_iter_page(&page_iter)); 233 225 234 226 iter.hdr->bomap.iova = mapping->mmnode.start << PAGE_SHIFT; 235 227

+60 -25

drivers/gpu/drm/panfrost/panfrost_gpu.c

···
60 60
61 61 gpu_write(pfdev, GPU_INT_MASK, 0);
62 62 gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_RESET_COMPLETED);
63 - gpu_write(pfdev, GPU_CMD, GPU_CMD_SOFT_RESET);
64 63
64 + gpu_write(pfdev, GPU_CMD, GPU_CMD_SOFT_RESET);
65 65 ret = readl_relaxed_poll_timeout(pfdev->iomem + GPU_INT_RAWSTAT,
66 - val, val & GPU_IRQ_RESET_COMPLETED, 100, 10000);
66 + val, val & GPU_IRQ_RESET_COMPLETED, 10, 10000);
67 67
68 68 if (ret) {
69 - dev_err(pfdev->dev, "gpu soft reset timed out\n");
70 - return ret;
69 + dev_err(pfdev->dev, "gpu soft reset timed out, attempting hard reset\n");
70 +
71 + gpu_write(pfdev, GPU_CMD, GPU_CMD_HARD_RESET);
72 + ret = readl_relaxed_poll_timeout(pfdev->iomem + GPU_INT_RAWSTAT, val,
73 + val & GPU_IRQ_RESET_COMPLETED, 100, 10000);
74 + if (ret) {
75 + dev_err(pfdev->dev, "gpu hard reset timed out\n");
76 + return ret;
77 + }
71 78 }
72 79
73 80 gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
···
369 362 return ((u64)hi << 32) | lo;
370 363 }
371 364
365 + static u64 panfrost_get_core_mask(struct panfrost_device *pfdev)
366 + {
367 + u64 core_mask;
368 +
369 + if (pfdev->features.l2_present == 1)
370 + return U64_MAX;
371 +
372 + /*
373 + * Only support one core group now.
374 + * ~(l2_present - 1) unsets all bits in l2_present except
375 + * the bottom bit. (l2_present - 2) has all the bits in
376 + * the first core group set. AND them together to generate
377 + * a mask of cores in the first core group.
378 + */
379 + core_mask = ~(pfdev->features.l2_present - 1) &
380 + (pfdev->features.l2_present - 2);
381 + dev_info_once(pfdev->dev, "using only 1st core group (%lu cores from %lu)\n",
382 + hweight64(core_mask),
383 + hweight64(pfdev->features.shader_present));
384 +
385 + return core_mask;
386 + }
387 +
372 388 void panfrost_gpu_power_on(struct panfrost_device *pfdev)
373 389 {
374 390 int ret;
375 391 u32 val;
376 - u64 core_mask = U64_MAX;
392 + u64 core_mask;
377 393
378 394 panfrost_gpu_init_quirks(pfdev);
395 + core_mask = panfrost_get_core_mask(pfdev);
379 396
380 - if (pfdev->features.l2_present != 1) {
381 - /*
382 - * Only support one core group now.
383 - * ~(l2_present - 1) unsets all bits in l2_present except
384 - * the bottom bit. (l2_present - 2) has all the bits in
385 - * the first core group set. AND them together to generate
386 - * a mask of cores in the first core group.
387 - */
388 - core_mask = ~(pfdev->features.l2_present - 1) &
389 - (pfdev->features.l2_present - 2);
390 - dev_info_once(pfdev->dev, "using only 1st core group (%lu cores from %lu)\n",
391 - hweight64(core_mask),
392 - hweight64(pfdev->features.shader_present));
393 - }
394 397 gpu_write(pfdev, L2_PWRON_LO, pfdev->features.l2_present & core_mask);
395 398 ret = readl_relaxed_poll_timeout(pfdev->iomem + L2_READY_LO,
396 399 val, val == (pfdev->features.l2_present & core_mask),
397 - 100, 20000);
400 + 10, 20000);
398 401 if (ret)
399 402 dev_err(pfdev->dev, "error powering up gpu L2");
400 403
···
412 395 pfdev->features.shader_present & core_mask);
413 396 ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_READY_LO,
414 397 val, val == (pfdev->features.shader_present & core_mask),
415 - 100, 20000);
398 + 10, 20000);
416 399 if (ret)
417 400 dev_err(pfdev->dev, "error powering up gpu shader");
418 401
419 402 gpu_write(pfdev, TILER_PWRON_LO, pfdev->features.tiler_present);
420 403 ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_READY_LO,
421 - val, val == pfdev->features.tiler_present, 100, 1000);
404 + val, val == pfdev->features.tiler_present, 10, 1000);
422 405 if (ret)
423 406 dev_err(pfdev->dev, "error powering up gpu tiler");
424 407 }
425 408
426 409 void panfrost_gpu_power_off(struct panfrost_device *pfdev)
427 410 {
428 - gpu_write(pfdev, TILER_PWROFF_LO, 0);
429 - gpu_write(pfdev, SHADER_PWROFF_LO, 0);
430 - gpu_write(pfdev, L2_PWROFF_LO, 0);
411 + u64 core_mask = panfrost_get_core_mask(pfdev);
412 + int ret;
413 + u32 val;
414 +
415 + gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
416 + ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
417 + val, !val, 1, 1000);
418 + if (ret)
419 + dev_err(pfdev->dev, "shader power transition timeout");
420 +
421 + gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
422 + ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
423 + val, !val, 1, 1000);
424 + if (ret)
425 + dev_err(pfdev->dev, "tiler power transition timeout");
426 +
427 + gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
428 + ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
429 + val, !val, 0, 1000);
430 + if (ret)
431 + dev_err(pfdev->dev, "l2 power transition timeout");
431 432 }
432 433
433 434 int panfrost_gpu_init(struct panfrost_device *pfdev)
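
Note on the core-mask arithmetic in panfrost_get_core_mask() above: the expression is easier to follow with concrete values. What follows is a minimal userspace sketch, not driver code. The name `first_core_group_mask()` is hypothetical, the sample `l2_present` values are illustrative assumptions rather than values read from real hardware, and the non-zero assertion is an assumption of the sketch only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the mask computation shown in the
 * diff. l2_present is treated as a bitmask with one bit per L2 slice,
 * where each set bit marks the first core of a core group.
 */
static uint64_t first_core_group_mask(uint64_t l2_present)
{
	/* Assumption of this sketch only: at least one L2 slice exists. */
	assert(l2_present != 0);

	/* A single L2 slice means a single core group: keep every core. */
	if (l2_present == 1)
		return UINT64_MAX;

	/*
	 * Same expression as in the diff: the AND keeps the bits from
	 * bit 0 up to, but excluding, the second-lowest set bit of
	 * l2_present, i.e. the cores of the first core group.
	 */
	return ~(l2_present - 1) & (l2_present - 2);
}
```

Under these assumptions, an `l2_present` of 0x11 (slices at bits 0 and 4) gives 0xf, selecting cores 0 to 3, and 0x101 gives 0xff. ANDing this mask with `shader_present` or `l2_present`, as the power-on and power-off paths do, then restricts the register writes to the first core group.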

+2 -2

drivers/gpu/drm/panfrost/panfrost_job.c

···
852 852 js->queue[j].fence_context = dma_fence_context_alloc(1);
853 853
854 854 ret = drm_sched_init(&js->queue[j].sched,
855 - &panfrost_sched_ops,
855 + &panfrost_sched_ops, NULL,
856 856 DRM_SCHED_PRIORITY_COUNT,
857 857 nentries, 0,
858 858 msecs_to_jiffies(JOB_TIMEOUT_MS),
···
963 963
964 964 for (i = 0; i < NUM_JOB_SLOTS; i++) {
965 965 /* If there are any jobs in the HW queue, we're not idle */
966 - if (atomic_read(&js->queue[i].sched.hw_rq_count))
966 + if (atomic_read(&js->queue[i].sched.credit_count))
967 967 return false;
968 968 }
969 969

+1

drivers/gpu/drm/panfrost/panfrost_regs.h

···
44 44 GPU_IRQ_MULTIPLE_FAULT)
45 45 #define GPU_CMD 0x30
46 46 #define GPU_CMD_SOFT_RESET 0x01
47 + #define GPU_CMD_HARD_RESET 0x02
47 48 #define GPU_CMD_PERFCNT_CLEAR 0x03
48 49 #define GPU_CMD_PERFCNT_SAMPLE 0x04
49 50 #define GPU_CMD_CYCLE_COUNT_START 0x05

+1

drivers/gpu/drm/radeon/radeon_audio.c

···
26 26 #include <linux/component.h>
27 27
28 28 #include <drm/drm_crtc.h>
29 + #include <drm/drm_eld.h>
29 30 #include "dce6_afmt.h"
30 31 #include "evergreen_hdmi.h"
31 32 #include "radeon.h"

+1 -1

drivers/gpu/drm/scheduler/gpu_scheduler_trace.h

···
51 51 __assign_str(name, sched_job->sched->name);
52 52 __entry->job_count = spsc_queue_count(&entity->job_queue);
53 53 __entry->hw_job_count = atomic_read(
54 - &sched_job->sched->hw_rq_count);
54 + &sched_job->sched->credit_count);
55 55 ),
56 56 TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
57 57 __entry->entity, __entry->id,

+2 -2

drivers/gpu/drm/scheduler/sched_entity.c

···
370 370 container_of(cb, struct drm_sched_entity, cb);
371 371
372 372 drm_sched_entity_clear_dep(f, cb);
373 - drm_sched_wakeup_if_can_queue(entity->rq->sched);
373 + drm_sched_wakeup(entity->rq->sched, entity);
374 374 }
375 375
376 376 /**
···
602 602 if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
603 603 drm_sched_rq_update_fifo(entity, submit_ts);
604 604
605 - drm_sched_wakeup_if_can_queue(entity->rq->sched);
605 + drm_sched_wakeup(entity->rq->sched, entity);
606 606 }
607 607 }
608 608 EXPORT_SYMBOL(drm_sched_entity_push_job);

+342 -144

drivers/gpu/drm/scheduler/sched_main.c

··· 48 48 * through the jobs entity pointer. 49 49 */ 50 50 51 - #include <linux/kthread.h> 51 + /** 52 + * DOC: Flow Control 53 + * 54 + * The DRM GPU scheduler provides a flow control mechanism to regulate the rate 55 + * in which the jobs fetched from scheduler entities are executed. 56 + * 57 + * In this context the &drm_gpu_scheduler keeps track of a driver specified 58 + * credit limit representing the capacity of this scheduler and a credit count; 59 + * every &drm_sched_job carries a driver specified number of credits. 60 + * 61 + * Once a job is executed (but not yet finished), the job's credits contribute 62 + * to the scheduler's credit count until the job is finished. If by executing 63 + * one more job the scheduler's credit count would exceed the scheduler's 64 + * credit limit, the job won't be executed. Instead, the scheduler will wait 65 + * until the credit count has decreased enough to not overflow its credit limit. 66 + * This implies waiting for previously executed jobs. 67 + * 68 + * Optionally, drivers may register a callback (update_job_credits) provided by 69 + * struct drm_sched_backend_ops to update the job's credits dynamically. The 70 + * scheduler executes this callback every time the scheduler considers a job for 71 + * execution and subsequently checks whether the job fits the scheduler's credit 72 + * limit. 
73 + */ 74 + 52 75 #include <linux/wait.h> 53 76 #include <linux/sched.h> 54 77 #include <linux/completion.h> ··· 98 75 */ 99 76 MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO (default)."); 100 77 module_param_named(sched_policy, drm_sched_policy, int, 0444); 78 + 79 + static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched) 80 + { 81 + u32 credits; 82 + 83 + drm_WARN_ON(sched, check_sub_overflow(sched->credit_limit, 84 + atomic_read(&sched->credit_count), 85 + &credits)); 86 + 87 + return credits; 88 + } 89 + 90 + /** 91 + * drm_sched_can_queue -- Can we queue more to the hardware? 92 + * @sched: scheduler instance 93 + * @entity: the scheduler entity 94 + * 95 + * Return true if we can push at least one more job from @entity, false 96 + * otherwise. 97 + */ 98 + static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, 99 + struct drm_sched_entity *entity) 100 + { 101 + struct drm_sched_job *s_job; 102 + 103 + s_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); 104 + if (!s_job) 105 + return false; 106 + 107 + if (sched->ops->update_job_credits) { 108 + s_job->credits = sched->ops->update_job_credits(s_job); 109 + 110 + drm_WARN(sched, !s_job->credits, 111 + "Jobs with zero credits bypass job-flow control.\n"); 112 + } 113 + 114 + /* If a job exceeds the credit limit, truncate it to the credit limit 115 + * itself to guarantee forward progress. 
116 + */ 117 + if (drm_WARN(sched, s_job->credits > sched->credit_limit, 118 + "Jobs may not exceed the credit limit, truncate.\n")) 119 + s_job->credits = sched->credit_limit; 120 + 121 + return drm_sched_available_credits(sched) >= s_job->credits; 122 + } 101 123 102 124 static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, 103 125 const struct rb_node *b) ··· 255 187 /** 256 188 * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run 257 189 * 190 + * @sched: the gpu scheduler 258 191 * @rq: scheduler run queue to check. 259 192 * 260 - * Try to find a ready entity, returns NULL if none found. 193 + * Try to find the next ready entity. 194 + * 195 + * Return an entity if one is found; return an error-pointer (!NULL) if an 196 + * entity was ready, but the scheduler had insufficient credits to accommodate 197 + * its job; return NULL, if no ready entity was found. 261 198 */ 262 199 static struct drm_sched_entity * 263 - drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq) 200 + drm_sched_rq_select_entity_rr(struct drm_gpu_scheduler *sched, 201 + struct drm_sched_rq *rq) 264 202 { 265 203 struct drm_sched_entity *entity; 266 204 ··· 276 202 if (entity) { 277 203 list_for_each_entry_continue(entity, &rq->entities, list) { 278 204 if (drm_sched_entity_is_ready(entity)) { 205 + /* If we can't queue yet, preserve the current 206 + * entity in terms of fairness. 207 + */ 208 + if (!drm_sched_can_queue(sched, entity)) { 209 + spin_unlock(&rq->lock); 210 + return ERR_PTR(-ENOSPC); 211 + } 212 + 279 213 rq->current_entity = entity; 280 214 reinit_completion(&entity->entity_idle); 281 215 spin_unlock(&rq->lock); ··· 293 211 } 294 212 295 213 list_for_each_entry(entity, &rq->entities, list) { 296 - 297 214 if (drm_sched_entity_is_ready(entity)) { 215 + /* If we can't queue yet, preserve the current entity in 216 + * terms of fairness. 
217 + */ 218 + if (!drm_sched_can_queue(sched, entity)) { 219 + spin_unlock(&rq->lock); 220 + return ERR_PTR(-ENOSPC); 221 + } 222 + 298 223 rq->current_entity = entity; 299 224 reinit_completion(&entity->entity_idle); 300 225 spin_unlock(&rq->lock); ··· 320 231 /** 321 232 * drm_sched_rq_select_entity_fifo - Select an entity which provides a job to run 322 233 * 234 + * @sched: the gpu scheduler 323 235 * @rq: scheduler run queue to check. 324 236 * 325 - * Find oldest waiting ready entity, returns NULL if none found. 237 + * Find oldest waiting ready entity. 238 + * 239 + * Return an entity if one is found; return an error-pointer (!NULL) if an 240 + * entity was ready, but the scheduler had insufficient credits to accommodate 241 + * its job; return NULL, if no ready entity was found. 326 242 */ 327 243 static struct drm_sched_entity * 328 - drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) 244 + drm_sched_rq_select_entity_fifo(struct drm_gpu_scheduler *sched, 245 + struct drm_sched_rq *rq) 329 246 { 330 247 struct rb_node *rb; 331 248 ··· 341 246 342 247 entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node); 343 248 if (drm_sched_entity_is_ready(entity)) { 249 + /* If we can't queue yet, preserve the current entity in 250 + * terms of fairness. 251 + */ 252 + if (!drm_sched_can_queue(sched, entity)) { 253 + spin_unlock(&rq->lock); 254 + return ERR_PTR(-ENOSPC); 255 + } 256 + 344 257 rq->current_entity = entity; 345 258 reinit_completion(&entity->entity_idle); 346 259 break; ··· 357 254 spin_unlock(&rq->lock); 358 255 359 256 return rb ? 
rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL; 257 + } 258 + 259 + /** 260 + * drm_sched_run_job_queue - enqueue run-job work 261 + * @sched: scheduler instance 262 + */ 263 + static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched) 264 + { 265 + if (!READ_ONCE(sched->pause_submit)) 266 + queue_work(sched->submit_wq, &sched->work_run_job); 267 + } 268 + 269 + /** 270 + * __drm_sched_run_free_queue - enqueue free-job work 271 + * @sched: scheduler instance 272 + */ 273 + static void __drm_sched_run_free_queue(struct drm_gpu_scheduler *sched) 274 + { 275 + if (!READ_ONCE(sched->pause_submit)) 276 + queue_work(sched->submit_wq, &sched->work_free_job); 277 + } 278 + 279 + /** 280 + * drm_sched_run_free_queue - enqueue free-job work if ready 281 + * @sched: scheduler instance 282 + */ 283 + static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched) 284 + { 285 + struct drm_sched_job *job; 286 + 287 + spin_lock(&sched->job_list_lock); 288 + job = list_first_entry_or_null(&sched->pending_list, 289 + struct drm_sched_job, list); 290 + if (job && dma_fence_is_signaled(&job->s_fence->finished)) 291 + __drm_sched_run_free_queue(sched); 292 + spin_unlock(&sched->job_list_lock); 360 293 } 361 294 362 295 /** ··· 406 267 struct drm_sched_fence *s_fence = s_job->s_fence; 407 268 struct drm_gpu_scheduler *sched = s_fence->sched; 408 269 409 - atomic_dec(&sched->hw_rq_count); 270 + atomic_sub(s_job->credits, &sched->credit_count); 410 271 atomic_dec(sched->score); 411 272 412 273 trace_drm_sched_process_job(s_fence); ··· 414 275 dma_fence_get(&s_fence->finished); 415 276 drm_sched_fence_finished(s_fence, result); 416 277 dma_fence_put(&s_fence->finished); 417 - wake_up_interruptible(&sched->wake_up_worker); 278 + __drm_sched_run_free_queue(sched); 418 279 } 419 280 420 281 /** ··· 438 299 */ 439 300 static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched) 440 301 { 302 + lockdep_assert_held(&sched->job_list_lock); 303 + 441 304 if 
(sched->timeout != MAX_SCHEDULE_TIMEOUT && 442 305 !list_empty(&sched->pending_list)) 443 - queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); 306 + mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); 444 307 } 308 + 309 + static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched) 310 + { 311 + spin_lock(&sched->job_list_lock); 312 + drm_sched_start_timeout(sched); 313 + spin_unlock(&sched->job_list_lock); 314 + } 315 + 316 + /** 317 + * drm_sched_tdr_queue_imm: - immediately start job timeout handler 318 + * 319 + * @sched: scheduler for which the timeout handling should be started. 320 + * 321 + * Start timeout handling immediately for the named scheduler. 322 + */ 323 + void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched) 324 + { 325 + spin_lock(&sched->job_list_lock); 326 + sched->timeout = 0; 327 + drm_sched_start_timeout(sched); 328 + spin_unlock(&sched->job_list_lock); 329 + } 330 + EXPORT_SYMBOL(drm_sched_tdr_queue_imm); 445 331 446 332 /** 447 333 * drm_sched_fault - immediately start timeout handler ··· 552 388 553 389 sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); 554 390 555 - /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ 391 + /* Protects against concurrent deletion in drm_sched_get_finished_job */ 556 392 spin_lock(&sched->job_list_lock); 557 393 job = list_first_entry_or_null(&sched->pending_list, 558 394 struct drm_sched_job, list); ··· 580 416 spin_unlock(&sched->job_list_lock); 581 417 } 582 418 583 - if (status != DRM_GPU_SCHED_STAT_ENODEV) { 584 - spin_lock(&sched->job_list_lock); 585 - drm_sched_start_timeout(sched); 586 - spin_unlock(&sched->job_list_lock); 587 - } 419 + if (status != DRM_GPU_SCHED_STAT_ENODEV) 420 + drm_sched_start_timeout_unlocked(sched); 588 421 } 589 422 590 423 /** ··· 600 439 { 601 440 struct drm_sched_job *s_job, *tmp; 602 441 603 - kthread_park(sched->thread); 442 + drm_sched_wqueue_stop(sched); 604 443 
605 444 /* 606 445 * Reinsert back the bad job here - now it's safe as 607 - * drm_sched_get_cleanup_job cannot race against us and release the 446 + * drm_sched_get_finished_job cannot race against us and release the 608 447 * bad job at this point - we parked (waited for) any in progress 609 - * (earlier) cleanups and drm_sched_get_cleanup_job will not be called 448 + * (earlier) cleanups and drm_sched_get_finished_job will not be called 610 449 * now until the scheduler thread is unparked. 611 450 */ 612 451 if (bad && bad->sched == sched) ··· 629 468 &s_job->cb)) { 630 469 dma_fence_put(s_job->s_fence->parent); 631 470 s_job->s_fence->parent = NULL; 632 - atomic_dec(&sched->hw_rq_count); 471 + atomic_sub(s_job->credits, &sched->credit_count); 633 472 } else { 634 473 /* 635 474 * remove job from pending_list. ··· 690 529 list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) { 691 530 struct dma_fence *fence = s_job->s_fence->parent; 692 531 693 - atomic_inc(&sched->hw_rq_count); 532 + atomic_add(s_job->credits, &sched->credit_count); 694 533 695 534 if (!full_recovery) 696 535 continue; ··· 707 546 drm_sched_job_done(s_job, -ECANCELED); 708 547 } 709 548 710 - if (full_recovery) { 711 - spin_lock(&sched->job_list_lock); 712 - drm_sched_start_timeout(sched); 713 - spin_unlock(&sched->job_list_lock); 714 - } 549 + if (full_recovery) 550 + drm_sched_start_timeout_unlocked(sched); 715 551 716 - kthread_unpark(sched->thread); 552 + drm_sched_wqueue_start(sched); 717 553 } 718 554 EXPORT_SYMBOL(drm_sched_start); 719 555 ··· 771 613 * drm_sched_job_init - init a scheduler job 772 614 * @job: scheduler job to init 773 615 * @entity: scheduler entity to use 616 + * @credits: the number of credits this job contributes to the schedulers 617 + * credit limit 774 618 * @owner: job owner for debugging 775 619 * 776 620 * Refer to drm_sched_entity_push_job() documentation ··· 790 630 */ 791 631 int drm_sched_job_init(struct drm_sched_job *job, 792 632 struct 
drm_sched_entity *entity, 793 - void *owner) 633 + u32 credits, void *owner) 794 634 { 795 635 if (!entity->rq) { 796 636 /* This will most likely be followed by missing frames ··· 801 641 return -ENOENT; 802 642 } 803 643 644 + if (unlikely(!credits)) { 645 + pr_err("*ERROR* %s: credits cannot be 0!\n", __func__); 646 + return -EINVAL; 647 + } 648 + 804 649 job->entity = entity; 650 + job->credits = credits; 805 651 job->s_fence = drm_sched_fence_alloc(entity, owner); 806 652 if (!job->s_fence) 807 653 return -ENOMEM; ··· 1020 854 EXPORT_SYMBOL(drm_sched_job_cleanup); 1021 855 1022 856 /** 1023 - * drm_sched_can_queue -- Can we queue more to the hardware? 857 + * drm_sched_wakeup - Wake up the scheduler if it is ready to queue 1024 858 * @sched: scheduler instance 1025 - * 1026 - * Return true if we can push more jobs to the hw, otherwise false. 1027 - */ 1028 - static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched) 1029 - { 1030 - return atomic_read(&sched->hw_rq_count) < 1031 - sched->hw_submission_limit; 1032 - } 1033 - 1034 - /** 1035 - * drm_sched_wakeup_if_can_queue - Wake up the scheduler 1036 - * @sched: scheduler instance 859 + * @entity: the scheduler entity 1037 860 * 1038 861 * Wake up the scheduler if we can queue jobs. 1039 862 */ 1040 - void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched) 863 + void drm_sched_wakeup(struct drm_gpu_scheduler *sched, 864 + struct drm_sched_entity *entity) 1041 865 { 1042 - if (drm_sched_can_queue(sched)) 1043 - wake_up_interruptible(&sched->wake_up_worker); 866 + if (drm_sched_entity_is_ready(entity)) 867 + if (drm_sched_can_queue(sched, entity)) 868 + drm_sched_run_job_queue(sched); 1044 869 } 1045 870 1046 871 /** ··· 1039 882 * 1040 883 * @sched: scheduler instance 1041 884 * 1042 - * Returns the entity to process or NULL if none are found. 885 + * Return an entity to process or NULL if none are found. 
886 + * 887 + * Note, that we break out of the for-loop when "entity" is non-null, which can 888 + * also be an error-pointer--this assures we don't process lower priority 889 + * run-queues. See comments in the respectively called functions. 1043 890 */ 1044 891 static struct drm_sched_entity * 1045 892 drm_sched_select_entity(struct drm_gpu_scheduler *sched) ··· 1051 890 struct drm_sched_entity *entity; 1052 891 int i; 1053 892 1054 - if (!drm_sched_can_queue(sched)) 1055 - return NULL; 1056 - 1057 893 /* Kernel run queue has higher priority than normal run queue*/ 1058 894 for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { 1059 895 entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ? 1060 - drm_sched_rq_select_entity_fifo(sched->sched_rq[i]) : 1061 - drm_sched_rq_select_entity_rr(sched->sched_rq[i]); 896 + drm_sched_rq_select_entity_fifo(sched, sched->sched_rq[i]) : 897 + drm_sched_rq_select_entity_rr(sched, sched->sched_rq[i]); 1062 898 if (entity) 1063 899 break; 1064 900 } 1065 901 1066 - return entity; 902 + return IS_ERR(entity) ? NULL : entity; 1067 903 } 1068 904 1069 905 /** 1070 - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed 906 + * drm_sched_get_finished_job - fetch the next finished job to be destroyed 1071 907 * 1072 908 * @sched: scheduler instance 1073 909 * ··· 1072 914 * ready for it to be destroyed. 
1073 915 */ 1074 916 static struct drm_sched_job * 1075 - drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) 917 + drm_sched_get_finished_job(struct drm_gpu_scheduler *sched) 1076 918 { 1077 919 struct drm_sched_job *job, *next; 1078 920 ··· 1092 934 typeof(*next), list); 1093 935 1094 936 if (next) { 1095 - next->s_fence->scheduled.timestamp = 1096 - dma_fence_timestamp(&job->s_fence->finished); 937 + if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, 938 + &next->s_fence->scheduled.flags)) 939 + next->s_fence->scheduled.timestamp = 940 + dma_fence_timestamp(&job->s_fence->finished); 1097 941 /* start TO timer for next job */ 1098 942 drm_sched_start_timeout(sched); 1099 943 } ··· 1145 985 EXPORT_SYMBOL(drm_sched_pick_best); 1146 986 1147 987 /** 1148 - * drm_sched_blocked - check if the scheduler is blocked 988 + * drm_sched_free_job_work - worker to call free_job 1149 989 * 1150 - * @sched: scheduler instance 1151 - * 1152 - * Returns true if blocked, otherwise false. 990 + * @w: free job work 1153 991 */ 1154 - static bool drm_sched_blocked(struct drm_gpu_scheduler *sched) 992 + static void drm_sched_free_job_work(struct work_struct *w) 1155 993 { 1156 - if (kthread_should_park()) { 1157 - kthread_parkme(); 1158 - return true; 1159 - } 994 + struct drm_gpu_scheduler *sched = 995 + container_of(w, struct drm_gpu_scheduler, work_free_job); 996 + struct drm_sched_job *job; 1160 997 1161 - return false; 998 + if (READ_ONCE(sched->pause_submit)) 999 + return; 1000 + 1001 + job = drm_sched_get_finished_job(sched); 1002 + if (job) 1003 + sched->ops->free_job(job); 1004 + 1005 + drm_sched_run_free_queue(sched); 1006 + drm_sched_run_job_queue(sched); 1162 1007 } 1163 1008 1164 1009 /** 1165 - * drm_sched_main - main scheduler thread 1010 + * drm_sched_run_job_work - worker to call run_job 1166 1011 * 1167 - * @param: scheduler instance 1168 - * 1169 - * Returns 0. 
1012 + * @w: run job work 1170 1013 */ 1171 - static int drm_sched_main(void *param) 1014 + static void drm_sched_run_job_work(struct work_struct *w) 1172 1015 { 1173 - struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param; 1016 + struct drm_gpu_scheduler *sched = 1017 + container_of(w, struct drm_gpu_scheduler, work_run_job); 1018 + struct drm_sched_entity *entity; 1019 + struct dma_fence *fence; 1020 + struct drm_sched_fence *s_fence; 1021 + struct drm_sched_job *sched_job; 1174 1022 int r; 1175 1023 1176 - sched_set_fifo_low(current); 1024 + if (READ_ONCE(sched->pause_submit)) 1025 + return; 1177 1026 1178 - while (!kthread_should_stop()) { 1179 - struct drm_sched_entity *entity = NULL; 1180 - struct drm_sched_fence *s_fence; 1181 - struct drm_sched_job *sched_job; 1182 - struct dma_fence *fence; 1183 - struct drm_sched_job *cleanup_job = NULL; 1027 + entity = drm_sched_select_entity(sched); 1028 + if (!entity) 1029 + return; 1184 1030 1185 - wait_event_interruptible(sched->wake_up_worker, 1186 - (cleanup_job = drm_sched_get_cleanup_job(sched)) || 1187 - (!drm_sched_blocked(sched) && 1188 - (entity = drm_sched_select_entity(sched))) || 1189 - kthread_should_stop()); 1190 - 1191 - if (cleanup_job) 1192 - sched->ops->free_job(cleanup_job); 1193 - 1194 - if (!entity) 1195 - continue; 1196 - 1197 - sched_job = drm_sched_entity_pop_job(entity); 1198 - 1199 - if (!sched_job) { 1200 - complete_all(&entity->entity_idle); 1201 - continue; 1202 - } 1203 - 1204 - s_fence = sched_job->s_fence; 1205 - 1206 - atomic_inc(&sched->hw_rq_count); 1207 - drm_sched_job_begin(sched_job); 1208 - 1209 - trace_drm_run_job(sched_job, entity); 1210 - fence = sched->ops->run_job(sched_job); 1031 + sched_job = drm_sched_entity_pop_job(entity); 1032 + if (!sched_job) { 1211 1033 complete_all(&entity->entity_idle); 1212 - drm_sched_fence_scheduled(s_fence, fence); 1213 - 1214 - if (!IS_ERR_OR_NULL(fence)) { 1215 - /* Drop for original kref_init of the fence */ 1216 - 
dma_fence_put(fence); 1217 - 1218 - r = dma_fence_add_callback(fence, &sched_job->cb, 1219 - drm_sched_job_done_cb); 1220 - if (r == -ENOENT) 1221 - drm_sched_job_done(sched_job, fence->error); 1222 - else if (r) 1223 - DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", 1224 - r); 1225 - } else { 1226 - drm_sched_job_done(sched_job, IS_ERR(fence) ? 1227 - PTR_ERR(fence) : 0); 1228 - } 1229 - 1230 - wake_up(&sched->job_scheduled); 1034 + return; /* No more work */ 1231 1035 } 1232 - return 0; 1036 + 1037 + s_fence = sched_job->s_fence; 1038 + 1039 + atomic_add(sched_job->credits, &sched->credit_count); 1040 + drm_sched_job_begin(sched_job); 1041 + 1042 + trace_drm_run_job(sched_job, entity); 1043 + fence = sched->ops->run_job(sched_job); 1044 + complete_all(&entity->entity_idle); 1045 + drm_sched_fence_scheduled(s_fence, fence); 1046 + 1047 + if (!IS_ERR_OR_NULL(fence)) { 1048 + /* Drop for original kref_init of the fence */ 1049 + dma_fence_put(fence); 1050 + 1051 + r = dma_fence_add_callback(fence, &sched_job->cb, 1052 + drm_sched_job_done_cb); 1053 + if (r == -ENOENT) 1054 + drm_sched_job_done(sched_job, fence->error); 1055 + else if (r) 1056 + DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r); 1057 + } else { 1058 + drm_sched_job_done(sched_job, IS_ERR(fence) ? 1059 + PTR_ERR(fence) : 0); 1060 + } 1061 + 1062 + wake_up(&sched->job_scheduled); 1063 + drm_sched_run_job_queue(sched); 1233 1064 } 1234 1065 1235 1066 /** ··· 1228 1077 * 1229 1078 * @sched: scheduler instance 1230 1079 * @ops: backend operations for this scheduler 1080 + * @submit_wq: workqueue to use for submission. 
If NULL, an ordered wq is 1081 + * allocated and used 1231 1082 * @num_rqs: number of runqueues, one for each priority, up to DRM_SCHED_PRIORITY_COUNT 1232 - * @hw_submission: number of hw submissions that can be in flight 1083 + * @credit_limit: the number of credits this scheduler can hold from all jobs 1233 1084 * @hang_limit: number of times to allow a job to hang before dropping it 1234 1085 * @timeout: timeout value in jiffies for the scheduler 1235 1086 * @timeout_wq: workqueue to use for timeout work. If NULL, the system_wq is ··· 1244 1091 */ 1245 1092 int drm_sched_init(struct drm_gpu_scheduler *sched, 1246 1093 const struct drm_sched_backend_ops *ops, 1247 - u32 num_rqs, uint32_t hw_submission, unsigned int hang_limit, 1094 + struct workqueue_struct *submit_wq, 1095 + u32 num_rqs, u32 credit_limit, unsigned int hang_limit, 1248 1096 long timeout, struct workqueue_struct *timeout_wq, 1249 1097 atomic_t *score, const char *name, struct device *dev) 1250 1098 { 1251 1099 int i, ret; 1252 1100 1253 1101 sched->ops = ops; 1254 - sched->hw_submission_limit = hw_submission; 1102 + sched->credit_limit = credit_limit; 1255 1103 sched->name = name; 1256 1104 sched->timeout = timeout; 1257 1105 sched->timeout_wq = timeout_wq ? 
: system_wq; ··· 1275 1121 return 0; 1276 1122 } 1277 1123 1124 + if (submit_wq) { 1125 + sched->submit_wq = submit_wq; 1126 + sched->own_submit_wq = false; 1127 + } else { 1128 + sched->submit_wq = alloc_ordered_workqueue(name, 0); 1129 + if (!sched->submit_wq) 1130 + return -ENOMEM; 1131 + 1132 + sched->own_submit_wq = true; 1133 + } 1134 + ret = -ENOMEM; 1278 1135 sched->sched_rq = kmalloc_array(num_rqs, sizeof(*sched->sched_rq), 1279 1136 GFP_KERNEL | __GFP_ZERO); 1280 - if (!sched->sched_rq) { 1281 - drm_err(sched, "%s: out of memory for sched_rq\n", __func__); 1282 - return -ENOMEM; 1283 - } 1137 + if (!sched->sched_rq) 1138 + goto Out_free; 1284 1139 sched->num_rqs = num_rqs; 1285 - ret = -ENOMEM; 1286 1140 for (i = DRM_SCHED_PRIORITY_MIN; i < sched->num_rqs; i++) { 1287 1141 sched->sched_rq[i] = kzalloc(sizeof(*sched->sched_rq[i]), GFP_KERNEL); 1288 1142 if (!sched->sched_rq[i]) ··· 1298 1136 drm_sched_rq_init(sched, sched->sched_rq[i]); 1299 1137 } 1300 1138 1301 - init_waitqueue_head(&sched->wake_up_worker); 1302 1139 init_waitqueue_head(&sched->job_scheduled); 1303 1140 INIT_LIST_HEAD(&sched->pending_list); 1304 1141 spin_lock_init(&sched->job_list_lock); 1305 - atomic_set(&sched->hw_rq_count, 0); 1142 + atomic_set(&sched->credit_count, 0); 1306 1143 INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout); 1144 + INIT_WORK(&sched->work_run_job, drm_sched_run_job_work); 1145 + INIT_WORK(&sched->work_free_job, drm_sched_free_job_work); 1307 1146 atomic_set(&sched->_score, 0); 1308 1147 atomic64_set(&sched->job_id_count, 0); 1309 - 1310 - /* Each scheduler will run on a seperate kernel thread */ 1311 - sched->thread = kthread_run(drm_sched_main, sched, sched->name); 1312 - if (IS_ERR(sched->thread)) { 1313 - ret = PTR_ERR(sched->thread); 1314 - sched->thread = NULL; 1315 - DRM_DEV_ERROR(sched->dev, "Failed to create scheduler for %s.\n", name); 1316 - goto Out_unroll; 1317 - } 1148 + sched->pause_submit = false; 1318 1149 1319 1150 sched->ready = true; 
1320 1151 return 0; 1321 1152 Out_unroll: 1322 1153 for (--i ; i >= DRM_SCHED_PRIORITY_MIN; i--) 1323 1154 kfree(sched->sched_rq[i]); 1155 + Out_free: 1324 1156 kfree(sched->sched_rq); 1325 1157 sched->sched_rq = NULL; 1158 + if (sched->own_submit_wq) 1159 + destroy_workqueue(sched->submit_wq); 1326 1160 drm_err(sched, "%s: Failed to setup GPU scheduler--out of memory\n", __func__); 1327 1161 return ret; 1328 1162 } ··· 1336 1178 struct drm_sched_entity *s_entity; 1337 1179 int i; 1338 1180 1339 - if (sched->thread) 1340 - kthread_stop(sched->thread); 1181 + drm_sched_wqueue_stop(sched); 1341 1182 1342 1183 for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { 1343 1184 struct drm_sched_rq *rq = sched->sched_rq[i]; ··· 1359 1202 /* Confirm no work left behind accessing device structures */ 1360 1203 cancel_delayed_work_sync(&sched->work_tdr); 1361 1204 1205 + if (sched->own_submit_wq) 1206 + destroy_workqueue(sched->submit_wq); 1362 1207 sched->ready = false; 1363 1208 kfree(sched->sched_rq); 1364 1209 sched->sched_rq = NULL; ··· 1411 1252 } 1412 1253 } 1413 1254 EXPORT_SYMBOL(drm_sched_increase_karma); 1255 + 1256 + /** 1257 + * drm_sched_wqueue_ready - Is the scheduler ready for submission 1258 + * 1259 + * @sched: scheduler instance 1260 + * 1261 + * Returns true if submission is ready 1262 + */ 1263 + bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched) 1264 + { 1265 + return sched->ready; 1266 + } 1267 + EXPORT_SYMBOL(drm_sched_wqueue_ready); 1268 + 1269 + /** 1270 + * drm_sched_wqueue_stop - stop scheduler submission 1271 + * 1272 + * @sched: scheduler instance 1273 + */ 1274 + void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched) 1275 + { 1276 + WRITE_ONCE(sched->pause_submit, true); 1277 + cancel_work_sync(&sched->work_run_job); 1278 + cancel_work_sync(&sched->work_free_job); 1279 + } 1280 + EXPORT_SYMBOL(drm_sched_wqueue_stop); 1281 + 1282 + /** 1283 + * drm_sched_wqueue_start - start scheduler submission 1284 + * 1285 + * @sched: 
scheduler instance 1286 + */ 1287 + void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched) 1288 + { 1289 + WRITE_ONCE(sched->pause_submit, false); 1290 + queue_work(sched->submit_wq, &sched->work_run_job); 1291 + queue_work(sched->submit_wq, &sched->work_free_job); 1292 + } 1293 + EXPORT_SYMBOL(drm_sched_wqueue_start);
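
Note on the credit accounting introduced in sched_main.c above: the "Flow Control" comment and drm_sched_can_queue() describe a simple budget scheme that can be modelled outside the kernel. What follows is a simplified, single-threaded sketch under stated assumptions. Every `toy_*` name is hypothetical, plain integers stand in for the scheduler's atomic counter, and the entity queues, fences and workqueues are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the scheduler and job structures. */
struct toy_sched {
	uint32_t credit_limit;	/* capacity chosen by the driver */
	uint32_t credit_count;	/* credits held by running jobs */
};

struct toy_job {
	uint32_t credits;	/* cost of this job, must be non-zero */
};

/* Credits still free; mirrors the limit-minus-count subtraction in the diff. */
static uint32_t toy_available(const struct toy_sched *s)
{
	return s->credit_limit - s->credit_count;
}

/*
 * Decide whether the job fits. As in the diff, a job larger than the
 * whole limit is truncated to the limit so it can still make progress.
 */
static bool toy_can_queue(const struct toy_sched *s, struct toy_job *j)
{
	if (j->credits > s->credit_limit)
		j->credits = s->credit_limit;

	return toy_available(s) >= j->credits;
}

/* Start a job if it fits: its credits count against the limit. */
static bool toy_run(struct toy_sched *s, struct toy_job *j)
{
	if (!toy_can_queue(s, j))
		return false;

	s->credit_count += j->credits;
	return true;
}

/* Finish a job: its credits are handed back to the scheduler. */
static void toy_done(struct toy_sched *s, const struct toy_job *j)
{
	assert(s->credit_count >= j->credits);
	s->credit_count -= j->credits;
}
```

In this model, with a limit of 4, a 3-credit job starts immediately, a following 2-credit job is refused until the first one finishes, and a 9-credit job is truncated to 4 credits and can only start once nothing else is running. The kernel code additionally lets a driver recompute a job's credits through the optional update_job_credits callback before this check.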

+32 -6

drivers/gpu/drm/solomon/ssd130x.c

···
808 808   static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb,
809 809   				const struct iosys_map *vmap,
810 810   				struct drm_rect *rect,
811 -     				u8 *buf, u8 *data_array)
811 +     				u8 *buf, u8 *data_array,
812 +     				struct drm_format_conv_state *fmtcnv_state)
812 813   {
813 814   	struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
814 815   	struct iosys_map dst;
···
827 826   		return ret;
828 827
829 828   	iosys_map_set_vaddr(&dst, buf);
830 -     	drm_fb_xrgb8888_to_mono(&dst, &dst_pitch, vmap, fb, rect);
829 +     	drm_fb_xrgb8888_to_mono(&dst, &dst_pitch, vmap, fb, rect, fmtcnv_state);
831 830
832 831   	drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
833 832
···
839 838   static int ssd132x_fb_blit_rect(struct drm_framebuffer *fb,
840 839   				const struct iosys_map *vmap,
841 840   				struct drm_rect *rect, u8 *buf,
842 -     				u8 *data_array)
841 +     				u8 *data_array,
842 +     				struct drm_format_conv_state *fmtcnv_state)
843 843   {
844 844   	struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
845 845   	unsigned int dst_pitch = drm_rect_width(rect);
···
857 855   		return ret;
858 856
859 857   	iosys_map_set_vaddr(&dst, buf);
860 -     	drm_fb_xrgb8888_to_gray8(&dst, &dst_pitch, vmap, fb, rect);
858 +     	drm_fb_xrgb8888_to_gray8(&dst, &dst_pitch, vmap, fb, rect, fmtcnv_state);
861 859
862 860   	drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
863 861
···
873 871   	struct ssd130x_device *ssd130x = drm_to_ssd130x(drm);
874 872   	struct drm_plane_state *plane_state = drm_atomic_get_new_plane_state(state, plane);
875 873   	struct ssd130x_plane_state *ssd130x_state = to_ssd130x_plane_state(plane_state);
874 +     	struct drm_shadow_plane_state *shadow_plane_state = &ssd130x_state->base;
876 875   	struct drm_crtc *crtc = plane_state->crtc;
877 876   	struct drm_crtc_state *crtc_state = NULL;
878 877   	const struct drm_format_info *fi;
···
898 895
899 896   	pitch = drm_format_info_min_pitch(fi, 0, ssd130x->width);
900 897
898 +     	if (plane_state->fb->format != fi) {
899 +     		void *buf;
900 +
901 +     		/* format conversion necessary; reserve buffer */
902 +     		buf = drm_format_conv_state_reserve(&shadow_plane_state->fmtcnv_state,
903 +     						    pitch, GFP_KERNEL);
904 +     		if (!buf)
905 +     			return -ENOMEM;
906 +     	}
907 +
901 908   	ssd130x_state->buffer = kcalloc(pitch, ssd130x->height, GFP_KERNEL);
902 909   	if (!ssd130x_state->buffer)
903 910   		return -ENOMEM;
···
922 909   	struct ssd130x_device *ssd130x = drm_to_ssd130x(drm);
923 910   	struct drm_plane_state *plane_state = drm_atomic_get_new_plane_state(state, plane);
924 911   	struct ssd130x_plane_state *ssd130x_state = to_ssd130x_plane_state(plane_state);
912 +     	struct drm_shadow_plane_state *shadow_plane_state = &ssd130x_state->base;
925 913   	struct drm_crtc *crtc = plane_state->crtc;
926 914   	struct drm_crtc_state *crtc_state = NULL;
927 915   	const struct drm_format_info *fi;
···
946 932   		return -EINVAL;
947 933
948 934   	pitch = drm_format_info_min_pitch(fi, 0, ssd130x->width);
935 +
936 +     	if (plane_state->fb->format != fi) {
937 +     		void *buf;
938 +
939 +     		/* format conversion necessary; reserve buffer */
940 +     		buf = drm_format_conv_state_reserve(&shadow_plane_state->fmtcnv_state,
941 +     						    pitch, GFP_KERNEL);
942 +     		if (!buf)
943 +     			return -ENOMEM;
944 +     	}
949 945
950 946   	ssd130x_state->buffer = kcalloc(pitch, ssd130x->height, GFP_KERNEL);
951 947   	if (!ssd130x_state->buffer)
···
992 968
993 969   		ssd130x_fb_blit_rect(fb, &shadow_plane_state->data[0], &dst_clip,
994 970   				     ssd130x_plane_state->buffer,
995 -     				     ssd130x_crtc_state->data_array);
971 +     				     ssd130x_crtc_state->data_array,
972 +     				     &shadow_plane_state->fmtcnv_state);
996 973   	}
997 974
998 975   	drm_dev_exit(idx);
···
1027 1002
1028 1003 		ssd132x_fb_blit_rect(fb, &shadow_plane_state->data[0], &dst_clip,
1029 1004 				     ssd130x_plane_state->buffer,
1030 -    				     ssd130x_crtc_state->data_array);
1005 +    				     ssd130x_crtc_state->data_array,
1006 +    				     &shadow_plane_state->fmtcnv_state);
1031 1007 	}
1032 1008
1033 1009 	drm_dev_exit(idx);
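
Note on the ssd130x hunks: the conversion buffer is reserved in the atomic-check path, where returning -ENOMEM is still allowed, and the same state is then handed to the blit helpers at update time. The following is only a userspace sketch of that reserve-then-reuse idea. `conv_state`, `conv_state_reserve()` and `conv_state_release()` are hypothetical names, not the kernel's `drm_format_conv_state` API, and the grow-on-demand behavior shown here is an assumption about how such a helper could work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-plane conversion state. */
struct conv_state {
	void *mem;	/* scratch buffer, NULL until first reserve */
	size_t size;	/* current capacity in bytes */
};

/*
 * Return a scratch buffer of at least new_size bytes. An existing buffer
 * is reused when it is already large enough; otherwise it is grown.
 * Returns NULL if the allocation fails, leaving the state unchanged.
 */
static void *conv_state_reserve(struct conv_state *state, size_t new_size)
{
	void *mem;

	if (new_size <= state->size)
		return state->mem;

	mem = realloc(state->mem, new_size);
	if (!mem)
		return NULL;

	state->mem = mem;
	state->size = new_size;
	return mem;
}

/* Free the scratch buffer and reset the state to empty. */
static void conv_state_release(struct conv_state *state)
{
	free(state->mem);
	state->mem = NULL;
	state->size = 0;
}
```

In this model the "check" step would call `conv_state_reserve()` once with the worst-case line size, so the later "update" step can convert without having to handle an allocation failure.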

+1

drivers/gpu/drm/tegra/hdmi.c

···
24 24   #include <drm/drm_atomic_helper.h>
25 25   #include <drm/drm_crtc.h>
26 26   #include <drm/drm_debugfs.h>
27 +    #include <drm/drm_eld.h>
27 28   #include <drm/drm_file.h>
28 29   #include <drm/drm_fourcc.h>
29 30   #include <drm/drm_probe_helper.h>

+1

drivers/gpu/drm/tegra/sor.c

···
20 20   #include <drm/display/drm_scdc_helper.h>
21 21   #include <drm/drm_atomic_helper.h>
22 22   #include <drm/drm_debugfs.h>
23 +    #include <drm/drm_eld.h>
23 24   #include <drm/drm_file.h>
24 25   #include <drm/drm_panel.h>
25 26   #include <drm/drm_simple_kms_helper.h>

+2 -2

drivers/gpu/drm/tests/Makefile

···
9 9     	drm_connector_test.o \
10 10   	drm_damage_helper_test.o \
11 11   	drm_dp_mst_helper_test.o \
12 +    	drm_exec_test.o \
12 13   	drm_format_helper_test.o \
13 14   	drm_format_test.o \
14 15   	drm_framebuffer_test.o \
···
18 17   	drm_modes_test.o \
19 18   	drm_plane_helper_test.o \
20 19   	drm_probe_helper_test.o \
21 -    	drm_rect_test.o \
22 -    	drm_exec_test.o
20 +    	drm_rect_test.o
23 21
24 22   CFLAGS_drm_mm_test.o := $(DISABLE_STRUCTLEAK_PLUGIN)

-465

drivers/gpu/drm/tests/drm_buddy_test.c

··· 13 13 14 14 #include "../lib/drm_random.h" 15 15 16 - #define TIMEOUT(name__) \ 17 - unsigned long name__ = jiffies + MAX_SCHEDULE_TIMEOUT 18 - 19 - static unsigned int random_seed; 20 - 21 16 static inline u64 get_size(int order, u64 chunk_size) 22 17 { 23 18 return (1 << order) * chunk_size; 24 - } 25 - 26 - __printf(2, 3) 27 - static bool __timeout(unsigned long timeout, const char *fmt, ...) 28 - { 29 - va_list va; 30 - 31 - if (!signal_pending(current)) { 32 - cond_resched(); 33 - if (time_before(jiffies, timeout)) 34 - return false; 35 - } 36 - 37 - if (fmt) { 38 - va_start(va, fmt); 39 - vprintk(fmt, va); 40 - va_end(va); 41 - } 42 - 43 - return true; 44 - } 45 - 46 - static void __dump_block(struct kunit *test, struct drm_buddy *mm, 47 - struct drm_buddy_block *block, bool buddy) 48 - { 49 - kunit_err(test, "block info: header=%llx, state=%u, order=%d, offset=%llx size=%llx root=%d buddy=%d\n", 50 - block->header, drm_buddy_block_state(block), 51 - drm_buddy_block_order(block), drm_buddy_block_offset(block), 52 - drm_buddy_block_size(mm, block), !block->parent, buddy); 53 - } 54 - 55 - static void dump_block(struct kunit *test, struct drm_buddy *mm, 56 - struct drm_buddy_block *block) 57 - { 58 - struct drm_buddy_block *buddy; 59 - 60 - __dump_block(test, mm, block, false); 61 - 62 - buddy = drm_get_buddy(block); 63 - if (buddy) 64 - __dump_block(test, mm, buddy, true); 65 - } 66 - 67 - static int check_block(struct kunit *test, struct drm_buddy *mm, 68 - struct drm_buddy_block *block) 69 - { 70 - struct drm_buddy_block *buddy; 71 - unsigned int block_state; 72 - u64 block_size; 73 - u64 offset; 74 - int err = 0; 75 - 76 - block_state = drm_buddy_block_state(block); 77 - 78 - if (block_state != DRM_BUDDY_ALLOCATED && 79 - block_state != DRM_BUDDY_FREE && block_state != DRM_BUDDY_SPLIT) { 80 - kunit_err(test, "block state mismatch\n"); 81 - err = -EINVAL; 82 - } 83 - 84 - block_size = drm_buddy_block_size(mm, block); 85 - offset = 
drm_buddy_block_offset(block); 86 - 87 - if (block_size < mm->chunk_size) { 88 - kunit_err(test, "block size smaller than min size\n"); 89 - err = -EINVAL; 90 - } 91 - 92 - /* We can't use is_power_of_2() for a u64 on 32-bit systems. */ 93 - if (block_size & (block_size - 1)) { 94 - kunit_err(test, "block size not power of two\n"); 95 - err = -EINVAL; 96 - } 97 - 98 - if (!IS_ALIGNED(block_size, mm->chunk_size)) { 99 - kunit_err(test, "block size not aligned to min size\n"); 100 - err = -EINVAL; 101 - } 102 - 103 - if (!IS_ALIGNED(offset, mm->chunk_size)) { 104 - kunit_err(test, "block offset not aligned to min size\n"); 105 - err = -EINVAL; 106 - } 107 - 108 - if (!IS_ALIGNED(offset, block_size)) { 109 - kunit_err(test, "block offset not aligned to block size\n"); 110 - err = -EINVAL; 111 - } 112 - 113 - buddy = drm_get_buddy(block); 114 - 115 - if (!buddy && block->parent) { 116 - kunit_err(test, "buddy has gone fishing\n"); 117 - err = -EINVAL; 118 - } 119 - 120 - if (buddy) { 121 - if (drm_buddy_block_offset(buddy) != (offset ^ block_size)) { 122 - kunit_err(test, "buddy has wrong offset\n"); 123 - err = -EINVAL; 124 - } 125 - 126 - if (drm_buddy_block_size(mm, buddy) != block_size) { 127 - kunit_err(test, "buddy size mismatch\n"); 128 - err = -EINVAL; 129 - } 130 - 131 - if (drm_buddy_block_state(buddy) == block_state && 132 - block_state == DRM_BUDDY_FREE) { 133 - kunit_err(test, "block and its buddy are free\n"); 134 - err = -EINVAL; 135 - } 136 - } 137 - 138 - return err; 139 - } 140 - 141 - static int check_blocks(struct kunit *test, struct drm_buddy *mm, 142 - struct list_head *blocks, u64 expected_size, bool is_contiguous) 143 - { 144 - struct drm_buddy_block *block; 145 - struct drm_buddy_block *prev; 146 - u64 total; 147 - int err = 0; 148 - 149 - block = NULL; 150 - prev = NULL; 151 - total = 0; 152 - 153 - list_for_each_entry(block, blocks, link) { 154 - err = check_block(test, mm, block); 155 - 156 - if (!drm_buddy_block_is_allocated(block)) { 157 - 
kunit_err(test, "block not allocated\n"); 158 - err = -EINVAL; 159 - } 160 - 161 - if (is_contiguous && prev) { 162 - u64 prev_block_size; 163 - u64 prev_offset; 164 - u64 offset; 165 - 166 - prev_offset = drm_buddy_block_offset(prev); 167 - prev_block_size = drm_buddy_block_size(mm, prev); 168 - offset = drm_buddy_block_offset(block); 169 - 170 - if (offset != (prev_offset + prev_block_size)) { 171 - kunit_err(test, "block offset mismatch\n"); 172 - err = -EINVAL; 173 - } 174 - } 175 - 176 - if (err) 177 - break; 178 - 179 - total += drm_buddy_block_size(mm, block); 180 - prev = block; 181 - } 182 - 183 - if (!err) { 184 - if (total != expected_size) { 185 - kunit_err(test, "size mismatch, expected=%llx, found=%llx\n", 186 - expected_size, total); 187 - err = -EINVAL; 188 - } 189 - return err; 190 - } 191 - 192 - if (prev) { 193 - kunit_err(test, "prev block, dump:\n"); 194 - dump_block(test, mm, prev); 195 - } 196 - 197 - kunit_err(test, "bad block, dump:\n"); 198 - dump_block(test, mm, block); 199 - 200 - return err; 201 - } 202 - 203 - static int check_mm(struct kunit *test, struct drm_buddy *mm) 204 - { 205 - struct drm_buddy_block *root; 206 - struct drm_buddy_block *prev; 207 - unsigned int i; 208 - u64 total; 209 - int err = 0; 210 - 211 - if (!mm->n_roots) { 212 - kunit_err(test, "n_roots is zero\n"); 213 - return -EINVAL; 214 - } 215 - 216 - if (mm->n_roots != hweight64(mm->size)) { 217 - kunit_err(test, "n_roots mismatch, n_roots=%u, expected=%lu\n", 218 - mm->n_roots, hweight64(mm->size)); 219 - return -EINVAL; 220 - } 221 - 222 - root = NULL; 223 - prev = NULL; 224 - total = 0; 225 - 226 - for (i = 0; i < mm->n_roots; ++i) { 227 - struct drm_buddy_block *block; 228 - unsigned int order; 229 - 230 - root = mm->roots[i]; 231 - if (!root) { 232 - kunit_err(test, "root(%u) is NULL\n", i); 233 - err = -EINVAL; 234 - break; 235 - } 236 - 237 - err = check_block(test, mm, root); 238 - 239 - if (!drm_buddy_block_is_free(root)) { 240 - kunit_err(test, "root not 
free\n"); 241 - err = -EINVAL; 242 - } 243 - 244 - order = drm_buddy_block_order(root); 245 - 246 - if (!i) { 247 - if (order != mm->max_order) { 248 - kunit_err(test, "max order root missing\n"); 249 - err = -EINVAL; 250 - } 251 - } 252 - 253 - if (prev) { 254 - u64 prev_block_size; 255 - u64 prev_offset; 256 - u64 offset; 257 - 258 - prev_offset = drm_buddy_block_offset(prev); 259 - prev_block_size = drm_buddy_block_size(mm, prev); 260 - offset = drm_buddy_block_offset(root); 261 - 262 - if (offset != (prev_offset + prev_block_size)) { 263 - kunit_err(test, "root offset mismatch\n"); 264 - err = -EINVAL; 265 - } 266 - } 267 - 268 - block = list_first_entry_or_null(&mm->free_list[order], 269 - struct drm_buddy_block, link); 270 - if (block != root) { 271 - kunit_err(test, "root mismatch at order=%u\n", order); 272 - err = -EINVAL; 273 - } 274 - 275 - if (err) 276 - break; 277 - 278 - prev = root; 279 - total += drm_buddy_block_size(mm, root); 280 - } 281 - 282 - if (!err) { 283 - if (total != mm->size) { 284 - kunit_err(test, "expected mm size=%llx, found=%llx\n", 285 - mm->size, total); 286 - err = -EINVAL; 287 - } 288 - return err; 289 - } 290 - 291 - if (prev) { 292 - kunit_err(test, "prev root(%u), dump:\n", i - 1); 293 - dump_block(test, mm, prev); 294 - } 295 - 296 - if (root) { 297 - kunit_err(test, "bad root(%u), dump:\n", i); 298 - dump_block(test, mm, root); 299 - } 300 - 301 - return err; 302 - } 303 - 304 - static void mm_config(u64 *size, u64 *chunk_size) 305 - { 306 - DRM_RND_STATE(prng, random_seed); 307 - u32 s, ms; 308 - 309 - /* Nothing fancy, just try to get an interesting bit pattern */ 310 - 311 - prandom_seed_state(&prng, random_seed); 312 - 313 - /* Let size be a random number of pages up to 8 GB (2M pages) */ 314 - s = 1 + drm_prandom_u32_max_state((BIT(33 - 12)) - 1, &prng); 315 - /* Let the chunk size be a random power of 2 less than size */ 316 - ms = BIT(drm_prandom_u32_max_state(ilog2(s), &prng)); 317 - /* Round size down to the chunk 
size */ 318 - s &= -ms; 319 - 320 - /* Convert from pages to bytes */ 321 - *chunk_size = (u64)ms << 12; 322 - *size = (u64)s << 12; 323 19 } 324 20 325 21 static void drm_test_buddy_alloc_pathological(struct kunit *test) ··· 96 400 97 401 list_splice_tail(&holes, &blocks); 98 402 drm_buddy_free_list(&mm, &blocks); 99 - drm_buddy_fini(&mm); 100 - } 101 - 102 - static void drm_test_buddy_alloc_smoke(struct kunit *test) 103 - { 104 - u64 mm_size, chunk_size, start = 0; 105 - unsigned long flags = 0; 106 - struct drm_buddy mm; 107 - int *order; 108 - int i; 109 - 110 - DRM_RND_STATE(prng, random_seed); 111 - TIMEOUT(end_time); 112 - 113 - mm_config(&mm_size, &chunk_size); 114 - 115 - KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_init(&mm, mm_size, chunk_size), 116 - "buddy_init failed\n"); 117 - 118 - order = drm_random_order(mm.max_order + 1, &prng); 119 - KUNIT_ASSERT_TRUE(test, order); 120 - 121 - for (i = 0; i <= mm.max_order; ++i) { 122 - struct drm_buddy_block *block; 123 - int max_order = order[i]; 124 - bool timeout = false; 125 - LIST_HEAD(blocks); 126 - u64 total, size; 127 - LIST_HEAD(tmp); 128 - int order, err; 129 - 130 - KUNIT_ASSERT_FALSE_MSG(test, check_mm(test, &mm), 131 - "pre-mm check failed, abort\n"); 132 - 133 - order = max_order; 134 - total = 0; 135 - 136 - do { 137 - retry: 138 - size = get_size(order, chunk_size); 139 - err = drm_buddy_alloc_blocks(&mm, start, mm_size, size, size, &tmp, flags); 140 - if (err) { 141 - if (err == -ENOMEM) { 142 - KUNIT_FAIL(test, "buddy_alloc hit -ENOMEM with order=%d\n", 143 - order); 144 - } else { 145 - if (order--) { 146 - err = 0; 147 - goto retry; 148 - } 149 - 150 - KUNIT_FAIL(test, "buddy_alloc with order=%d failed\n", 151 - order); 152 - } 153 - 154 - break; 155 - } 156 - 157 - block = list_first_entry_or_null(&tmp, struct drm_buddy_block, link); 158 - KUNIT_ASSERT_TRUE_MSG(test, block, "alloc_blocks has no blocks\n"); 159 - 160 - list_move_tail(&block->link, &blocks); 161 - KUNIT_EXPECT_EQ_MSG(test, 
drm_buddy_block_order(block), order, 162 - "buddy_alloc order mismatch\n"); 163 - 164 - total += drm_buddy_block_size(&mm, block); 165 - 166 - if (__timeout(end_time, NULL)) { 167 - timeout = true; 168 - break; 169 - } 170 - } while (total < mm.size); 171 - 172 - if (!err) 173 - err = check_blocks(test, &mm, &blocks, total, false); 174 - 175 - drm_buddy_free_list(&mm, &blocks); 176 - 177 - if (!err) { 178 - KUNIT_EXPECT_FALSE_MSG(test, check_mm(test, &mm), 179 - "post-mm check failed\n"); 180 - } 181 - 182 - if (err || timeout) 183 - break; 184 - 185 - cond_resched(); 186 - } 187 - 188 - kfree(order); 189 403 drm_buddy_fini(&mm); 190 404 } 191 405 ··· 240 634 drm_buddy_fini(&mm); 241 635 } 242 636 243 - static void drm_test_buddy_alloc_range(struct kunit *test) 244 - { 245 - unsigned long flags = DRM_BUDDY_RANGE_ALLOCATION; 246 - u64 offset, size, rem, chunk_size, end; 247 - unsigned long page_num; 248 - struct drm_buddy mm; 249 - LIST_HEAD(blocks); 250 - 251 - mm_config(&size, &chunk_size); 252 - 253 - KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_init(&mm, size, chunk_size), 254 - "buddy_init failed"); 255 - 256 - KUNIT_ASSERT_FALSE_MSG(test, check_mm(test, &mm), 257 - "pre-mm check failed, abort!"); 258 - 259 - rem = mm.size; 260 - offset = 0; 261 - 262 - for_each_prime_number_from(page_num, 1, ULONG_MAX - 1) { 263 - struct drm_buddy_block *block; 264 - LIST_HEAD(tmp); 265 - 266 - size = min(page_num * mm.chunk_size, rem); 267 - end = offset + size; 268 - 269 - KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, offset, end, 270 - size, mm.chunk_size, 271 - &tmp, flags), 272 - "alloc_range with offset=%llx, size=%llx failed\n", offset, size); 273 - 274 - block = list_first_entry_or_null(&tmp, struct drm_buddy_block, link); 275 - KUNIT_ASSERT_TRUE_MSG(test, block, "alloc_range has no blocks\n"); 276 - 277 - KUNIT_ASSERT_EQ_MSG(test, drm_buddy_block_offset(block), offset, 278 - "alloc_range start offset mismatch, found=%llx, expected=%llx\n", 279 - 
drm_buddy_block_offset(block), offset); 280 - 281 - KUNIT_ASSERT_FALSE(test, check_blocks(test, &mm, &tmp, size, true)); 282 - 283 - list_splice_tail(&tmp, &blocks); 284 - 285 - offset += size; 286 - 287 - rem -= size; 288 - if (!rem) 289 - break; 290 - 291 - cond_resched(); 292 - } 293 - 294 - drm_buddy_free_list(&mm, &blocks); 295 - 296 - KUNIT_EXPECT_FALSE_MSG(test, check_mm(test, &mm), "post-mm check failed\n"); 297 - 298 - drm_buddy_fini(&mm); 299 - } 300 - 301 637 static void drm_test_buddy_alloc_limit(struct kunit *test) 302 638 { 303 639 u64 size = U64_MAX, start = 0; ··· 275 727 drm_buddy_fini(&mm); 276 728 } 277 729 278 - static int drm_buddy_suite_init(struct kunit_suite *suite) 279 - { 280 - while (!random_seed) 281 - random_seed = get_random_u32(); 282 - 283 - kunit_info(suite, "Testing DRM buddy manager, with random_seed=0x%x\n", random_seed); 284 - 285 - return 0; 286 - } 287 - 288 730 static struct kunit_case drm_buddy_tests[] = { 289 731 KUNIT_CASE(drm_test_buddy_alloc_limit), 290 - KUNIT_CASE(drm_test_buddy_alloc_range), 291 732 KUNIT_CASE(drm_test_buddy_alloc_optimistic), 292 733 KUNIT_CASE(drm_test_buddy_alloc_pessimistic), 293 - KUNIT_CASE(drm_test_buddy_alloc_smoke), 294 734 KUNIT_CASE(drm_test_buddy_alloc_pathological), 295 735 {} 296 736 }; 297 737 298 738 static struct kunit_suite drm_buddy_test_suite = { 299 739 .name = "drm_buddy", 300 - .suite_init = drm_buddy_suite_init, 301 740 .test_cases = drm_buddy_tests, 302 741 }; 303 742

+44 -28

drivers/gpu/drm/tests/drm_format_helper_test.c

··· 20 20 21 21 #define TEST_USE_DEFAULT_PITCH 0 22 22 23 + static unsigned char fmtcnv_state_mem[PAGE_SIZE]; 24 + static struct drm_format_conv_state fmtcnv_state = 25 + DRM_FORMAT_CONV_STATE_INIT_PREALLOCATED(fmtcnv_state_mem, sizeof(fmtcnv_state_mem)); 26 + 23 27 struct convert_to_gray8_result { 24 28 unsigned int dst_pitch; 25 29 const u8 expected[TEST_BUF_SIZE]; ··· 634 630 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 635 631 NULL : &result->dst_pitch; 636 632 637 - drm_fb_xrgb8888_to_gray8(&dst, dst_pitch, &src, &fb, &params->clip); 638 - 633 + drm_fb_xrgb8888_to_gray8(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 639 634 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 640 635 } 641 636 ··· 667 664 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 668 665 NULL : &result->dst_pitch; 669 666 670 - drm_fb_xrgb8888_to_rgb332(&dst, dst_pitch, &src, &fb, &params->clip); 667 + drm_fb_xrgb8888_to_rgb332(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 671 668 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 672 669 } 673 670 ··· 700 697 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 
701 698 NULL : &result->dst_pitch; 702 699 703 - drm_fb_xrgb8888_to_rgb565(&dst, dst_pitch, &src, &fb, &params->clip, false); 700 + drm_fb_xrgb8888_to_rgb565(&dst, dst_pitch, &src, &fb, &params->clip, 701 + &fmtcnv_state, false); 704 702 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 705 703 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 706 704 707 705 buf = dst.vaddr; /* restore original value of buf */ 708 - drm_fb_xrgb8888_to_rgb565(&dst, &result->dst_pitch, &src, &fb, &params->clip, true); 706 + drm_fb_xrgb8888_to_rgb565(&dst, &result->dst_pitch, &src, &fb, &params->clip, 707 + &fmtcnv_state, true); 709 708 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 710 709 KUNIT_EXPECT_MEMEQ(test, buf, result->expected_swab, dst_size); 711 710 ··· 716 711 717 712 int blit_result = 0; 718 713 719 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_RGB565, &src, &fb, &params->clip); 714 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_RGB565, &src, &fb, &params->clip, 715 + &fmtcnv_state); 720 716 721 717 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 722 718 ··· 754 748 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 
755 749 NULL : &result->dst_pitch; 756 750 757 - drm_fb_xrgb8888_to_xrgb1555(&dst, dst_pitch, &src, &fb, &params->clip); 751 + drm_fb_xrgb8888_to_xrgb1555(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 758 752 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 759 753 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 760 754 ··· 763 757 764 758 int blit_result = 0; 765 759 766 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XRGB1555, &src, &fb, &params->clip); 760 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XRGB1555, &src, &fb, &params->clip, 761 + &fmtcnv_state); 767 762 768 763 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 769 764 ··· 801 794 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 802 795 NULL : &result->dst_pitch; 803 796 804 - drm_fb_xrgb8888_to_argb1555(&dst, dst_pitch, &src, &fb, &params->clip); 797 + drm_fb_xrgb8888_to_argb1555(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 805 798 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 806 799 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 807 800 ··· 810 803 811 804 int blit_result = 0; 812 805 813 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_ARGB1555, &src, &fb, &params->clip); 806 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_ARGB1555, &src, &fb, &params->clip, 807 + &fmtcnv_state); 814 808 815 809 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 816 810 ··· 848 840 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 
849 841 NULL : &result->dst_pitch; 850 842 851 - drm_fb_xrgb8888_to_rgba5551(&dst, dst_pitch, &src, &fb, &params->clip); 843 + drm_fb_xrgb8888_to_rgba5551(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 852 844 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 853 845 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 854 846 ··· 857 849 858 850 int blit_result = 0; 859 851 860 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_RGBA5551, &src, &fb, &params->clip); 852 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_RGBA5551, &src, &fb, &params->clip, 853 + &fmtcnv_state); 861 854 862 855 buf = le16buf_to_cpu(test, (__force const __le16 *)buf, dst_size / sizeof(__le16)); 863 856 ··· 899 890 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 900 891 NULL : &result->dst_pitch; 901 892 902 - drm_fb_xrgb8888_to_rgb888(&dst, dst_pitch, &src, &fb, &params->clip); 893 + drm_fb_xrgb8888_to_rgb888(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 903 894 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 904 895 905 896 buf = dst.vaddr; /* restore original value of buf */ ··· 907 898 908 899 int blit_result = 0; 909 900 910 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_RGB888, &src, &fb, &params->clip); 901 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_RGB888, &src, &fb, &params->clip, 902 + &fmtcnv_state); 911 903 912 904 KUNIT_EXPECT_FALSE(test, blit_result); 913 905 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); ··· 943 933 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 
944 934 NULL : &result->dst_pitch; 945 935 946 - drm_fb_xrgb8888_to_argb8888(&dst, dst_pitch, &src, &fb, &params->clip); 936 + drm_fb_xrgb8888_to_argb8888(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 947 937 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 948 938 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 949 939 ··· 952 942 953 943 int blit_result = 0; 954 944 955 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_ARGB8888, &src, &fb, &params->clip); 945 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_ARGB8888, &src, &fb, &params->clip, 946 + &fmtcnv_state); 956 947 957 948 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 958 949 ··· 990 979 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 991 980 NULL : &result->dst_pitch; 992 981 993 - drm_fb_xrgb8888_to_xrgb2101010(&dst, dst_pitch, &src, &fb, &params->clip); 982 + drm_fb_xrgb8888_to_xrgb2101010(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 994 983 buf = le32buf_to_cpu(test, buf, dst_size / sizeof(u32)); 995 984 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 996 985 ··· 1000 989 int blit_result = 0; 1001 990 1002 991 blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XRGB2101010, &src, &fb, 1003 - &params->clip); 992 + &params->clip, &fmtcnv_state); 1004 993 1005 994 KUNIT_EXPECT_FALSE(test, blit_result); 1006 995 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); ··· 1035 1024 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 
1036 1025 NULL : &result->dst_pitch; 1037 1026 1038 - drm_fb_xrgb8888_to_argb2101010(&dst, dst_pitch, &src, &fb, &params->clip); 1027 + drm_fb_xrgb8888_to_argb2101010(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 1039 1028 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 1040 1029 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 1041 1030 ··· 1045 1034 int blit_result = 0; 1046 1035 1047 1036 blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_ARGB2101010, &src, &fb, 1048 - &params->clip); 1037 + &params->clip, &fmtcnv_state); 1049 1038 1050 1039 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 1051 1040 ··· 1082 1071 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 1083 1072 NULL : &result->dst_pitch; 1084 1073 1085 - drm_fb_xrgb8888_to_mono(&dst, dst_pitch, &src, &fb, &params->clip); 1074 + drm_fb_xrgb8888_to_mono(&dst, dst_pitch, &src, &fb, &params->clip, &fmtcnv_state); 1086 1075 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 1087 1076 } 1088 1077 ··· 1115 1104 const unsigned int *dst_pitch = (result->dst_pitch == TEST_USE_DEFAULT_PITCH) ? 
1116 1105 NULL : &result->dst_pitch; 1117 1106 1118 - drm_fb_swab(&dst, dst_pitch, &src, &fb, &params->clip, false); 1107 + drm_fb_swab(&dst, dst_pitch, &src, &fb, &params->clip, false, &fmtcnv_state); 1119 1108 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 1120 1109 KUNIT_EXPECT_MEMEQ(test, buf, result->expected, dst_size); 1121 1110 ··· 1125 1114 int blit_result; 1126 1115 1127 1116 blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XRGB8888 | DRM_FORMAT_BIG_ENDIAN, 1128 - &src, &fb, &params->clip); 1117 + &src, &fb, &params->clip, &fmtcnv_state); 1129 1118 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 1130 1119 1131 1120 KUNIT_EXPECT_FALSE(test, blit_result); ··· 1134 1123 buf = dst.vaddr; 1135 1124 memset(buf, 0, dst_size); 1136 1125 1137 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_BGRX8888, &src, &fb, &params->clip); 1126 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_BGRX8888, &src, &fb, &params->clip, 1127 + &fmtcnv_state); 1138 1128 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 1139 1129 1140 1130 KUNIT_EXPECT_FALSE(test, blit_result); ··· 1149 1137 mock_format.format |= DRM_FORMAT_BIG_ENDIAN; 1150 1138 fb.format = &mock_format; 1151 1139 1152 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XRGB8888, &src, &fb, &params->clip); 1140 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XRGB8888, &src, &fb, &params->clip, 1141 + &fmtcnv_state); 1153 1142 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 1154 1143 1155 1144 KUNIT_EXPECT_FALSE(test, blit_result); ··· 1188 1175 1189 1176 int blit_result = 0; 1190 1177 1191 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_ABGR8888, &src, &fb, &params->clip); 1178 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_ABGR8888, &src, &fb, &params->clip, 1179 + &fmtcnv_state); 1192 1180 1193 1181 buf = le32buf_to_cpu(test, (__force const 
__le32 *)buf, dst_size / sizeof(u32)); 1194 1182 ··· 1228 1214 1229 1215 int blit_result = 0; 1230 1216 1231 - blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XBGR8888, &src, &fb, &params->clip); 1217 + blit_result = drm_fb_blit(&dst, dst_pitch, DRM_FORMAT_XBGR8888, &src, &fb, &params->clip, 1218 + &fmtcnv_state); 1232 1219 1233 1220 buf = le32buf_to_cpu(test, (__force const __le32 *)buf, dst_size / sizeof(u32)); 1234 1221 ··· 1832 1817 1833 1818 int blit_result; 1834 1819 1835 - blit_result = drm_fb_blit(dst, dst_pitches, params->format, src, &fb, &params->clip); 1820 + blit_result = drm_fb_blit(dst, dst_pitches, params->format, src, &fb, &params->clip, 1821 + &fmtcnv_state); 1836 1822 1837 1823 KUNIT_EXPECT_FALSE(test, blit_result); 1838 1824 for (size_t i = 0; i < fb.format->num_planes; i++) {
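
Note on the test changes: every converted call now receives one file-scope `fmtcnv_state` that is initialized over a static `PAGE_SIZE` array, so the test cases never allocate scratch memory themselves. A rough userspace model of a state backed by caller-owned memory is sketched below. It assumes that such a state hands out its fixed buffer when the request fits and refuses anything larger; `fixed_conv_state`, `FIXED_CONV_STATE_INIT` and `fixed_conv_state_reserve()` are hypothetical names chosen for illustration, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a conversion state backed by caller-owned memory. */
struct fixed_conv_state {
	void *mem;	/* fixed scratch buffer, never reallocated */
	size_t size;	/* capacity of mem in bytes */
};

/* Static initializer in the shape of a "preallocated" init macro. */
#define FIXED_CONV_STATE_INIT(mem_, size_) { .mem = (mem_), .size = (size_) }

/* One shared buffer and state for all test cases in the file. */
static unsigned char scratch[4096];
static struct fixed_conv_state test_state =
	FIXED_CONV_STATE_INIT(scratch, sizeof(scratch));

/* Hand out the fixed buffer if the request fits; never allocate. */
static void *fixed_conv_state_reserve(struct fixed_conv_state *state, size_t size)
{
	return size <= state->size ? state->mem : NULL;
}
```

A fixed buffer keeps the tests free of allocation-failure paths, at the cost of capping the largest line that a single conversion can stage.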

-1904

drivers/gpu/drm/tests/drm_mm_test.c

··· 17 17 18 18 #include "../lib/drm_random.h" 19 19 20 - static unsigned int random_seed; 21 - static unsigned int max_iterations = 8192; 22 - static unsigned int max_prime = 128; 23 - 24 20 enum { 25 21 BEST, 26 22 BOTTOMUP, ··· 32 36 [BOTTOMUP] = { "bottom-up", DRM_MM_INSERT_LOW }, 33 37 [TOPDOWN] = { "top-down", DRM_MM_INSERT_HIGH }, 34 38 [EVICT] = { "evict", DRM_MM_INSERT_EVICT }, 35 - {} 36 - }, evict_modes[] = { 37 - { "bottom-up", DRM_MM_INSERT_LOW }, 38 - { "top-down", DRM_MM_INSERT_HIGH }, 39 39 {} 40 40 }; 41 41 ··· 87 95 } 88 96 89 97 return ok; 90 - } 91 - 92 - static bool assert_continuous(struct kunit *test, const struct drm_mm *mm, u64 size) 93 - { 94 - struct drm_mm_node *node, *check, *found; 95 - unsigned long n; 96 - u64 addr; 97 - 98 - if (!assert_no_holes(test, mm)) 99 - return false; 100 - 101 - n = 0; 102 - addr = 0; 103 - drm_mm_for_each_node(node, mm) { 104 - if (node->start != addr) { 105 - KUNIT_FAIL(test, "node[%ld] list out of order, expected %llx found %llx\n", 106 - n, addr, node->start); 107 - return false; 108 - } 109 - 110 - if (node->size != size) { 111 - KUNIT_FAIL(test, "node[%ld].size incorrect, expected %llx, found %llx\n", 112 - n, size, node->size); 113 - return false; 114 - } 115 - 116 - if (drm_mm_hole_follows(node)) { 117 - KUNIT_FAIL(test, "node[%ld] is followed by a hole!\n", n); 118 - return false; 119 - } 120 - 121 - found = NULL; 122 - drm_mm_for_each_node_in_range(check, mm, addr, addr + size) { 123 - if (node != check) { 124 - KUNIT_FAIL(test, 125 - "lookup return wrong node, expected start %llx, found %llx\n", 126 - node->start, check->start); 127 - return false; 128 - } 129 - found = check; 130 - } 131 - if (!found) { 132 - KUNIT_FAIL(test, "lookup failed for node %llx + %llx\n", addr, size); 133 - return false; 134 - } 135 - 136 - addr += size; 137 - n++; 138 - } 139 - 140 - return true; 141 98 } 142 99 143 100 static u64 misalignment(struct drm_mm_node *node, u64 alignment) ··· 211 270 nodes[0].start, 
nodes[0].size); 212 271 } 213 272 214 - static struct drm_mm_node *set_node(struct drm_mm_node *node, 215 - u64 start, u64 size) 216 - { 217 - node->start = start; 218 - node->size = size; 219 - return node; 220 - } 221 - 222 - static bool expect_reserve_fail(struct kunit *test, struct drm_mm *mm, struct drm_mm_node *node) 223 - { 224 - int err; 225 - 226 - err = drm_mm_reserve_node(mm, node); 227 - if (likely(err == -ENOSPC)) 228 - return true; 229 - 230 - if (!err) { 231 - KUNIT_FAIL(test, "impossible reserve succeeded, node %llu + %llu\n", 232 - node->start, node->size); 233 - drm_mm_remove_node(node); 234 - } else { 235 - KUNIT_FAIL(test, 236 - "impossible reserve failed with wrong error %d [expected %d], node %llu + %llu\n", 237 - err, -ENOSPC, node->start, node->size); 238 - } 239 - return false; 240 - } 241 - 242 - static bool noinline_for_stack check_reserve_boundaries(struct kunit *test, struct drm_mm *mm, 243 - unsigned int count, 244 - u64 size) 245 - { 246 - const struct boundary { 247 - u64 start, size; 248 - const char *name; 249 - } boundaries[] = { 250 - #define B(st, sz) { (st), (sz), "{ " #st ", " #sz "}" } 251 - B(0, 0), 252 - B(-size, 0), 253 - B(size, 0), 254 - B(size * count, 0), 255 - B(-size, size), 256 - B(-size, -size), 257 - B(-size, 2 * size), 258 - B(0, -size), 259 - B(size, -size), 260 - B(count * size, size), 261 - B(count * size, -size), 262 - B(count * size, count * size), 263 - B(count * size, -count * size), 264 - B(count * size, -(count + 1) * size), 265 - B((count + 1) * size, size), 266 - B((count + 1) * size, -size), 267 - B((count + 1) * size, -2 * size), 268 - #undef B 269 - }; 270 - struct drm_mm_node tmp = {}; 271 - int n; 272 - 273 - for (n = 0; n < ARRAY_SIZE(boundaries); n++) { 274 - if (!expect_reserve_fail(test, mm, set_node(&tmp, boundaries[n].start, 275 - boundaries[n].size))) { 276 - KUNIT_FAIL(test, "boundary[%d:%s] failed, count=%u, size=%lld\n", 277 - n, boundaries[n].name, count, size); 278 - return false; 279 
-		}
-	}
-
-	return true;
-}
-
-static int __drm_test_mm_reserve(struct kunit *test, unsigned int count, u64 size)
-{
-	DRM_RND_STATE(prng, random_seed);
-	struct drm_mm mm;
-	struct drm_mm_node tmp, *nodes, *node, *next;
-	unsigned int *order, n, m, o = 0;
-	int ret, err;
-
-	/* For exercising drm_mm_reserve_node(), we want to check that
-	 * reservations outside of the drm_mm range are rejected, and to
-	 * overlapping and otherwise already occupied ranges. Afterwards,
-	 * the tree and nodes should be intact.
-	 */
-
-	DRM_MM_BUG_ON(!count);
-	DRM_MM_BUG_ON(!size);
-
-	ret = -ENOMEM;
-	order = drm_random_order(count, &prng);
-	if (!order)
-		goto err;
-
-	nodes = vzalloc(array_size(count, sizeof(*nodes)));
-	KUNIT_ASSERT_TRUE(test, nodes);
-
-	ret = -EINVAL;
-	drm_mm_init(&mm, 0, count * size);
-
-	if (!check_reserve_boundaries(test, &mm, count, size))
-		goto out;
-
-	for (n = 0; n < count; n++) {
-		nodes[n].start = order[n] * size;
-		nodes[n].size = size;
-
-		err = drm_mm_reserve_node(&mm, &nodes[n]);
-		if (err) {
-			KUNIT_FAIL(test, "reserve failed, step %d, start %llu\n",
-				   n, nodes[n].start);
-			ret = err;
-			goto out;
-		}
-
-		if (!drm_mm_node_allocated(&nodes[n])) {
-			KUNIT_FAIL(test, "reserved node not allocated! step %d, start %llu\n",
-				   n, nodes[n].start);
-			goto out;
-		}
-
-		if (!expect_reserve_fail(test, &mm, &nodes[n]))
-			goto out;
-	}
-
-	/* After random insertion the nodes should be in order */
-	if (!assert_continuous(test, &mm, size))
-		goto out;
-
-	/* Repeated use should then fail */
-	drm_random_reorder(order, count, &prng);
-	for (n = 0; n < count; n++) {
-		if (!expect_reserve_fail(test, &mm, set_node(&tmp, order[n] * size, 1)))
-			goto out;
-
-		/* Remove and reinsert should work */
-		drm_mm_remove_node(&nodes[order[n]]);
-		err = drm_mm_reserve_node(&mm, &nodes[order[n]]);
-		if (err) {
-			KUNIT_FAIL(test, "reserve failed, step %d, start %llu\n",
-				   n, nodes[n].start);
-			ret = err;
-			goto out;
-		}
-	}
-
-	if (!assert_continuous(test, &mm, size))
-		goto out;
-
-	/* Overlapping use should then fail */
-	for (n = 0; n < count; n++) {
-		if (!expect_reserve_fail(test, &mm, set_node(&tmp, 0, size * count)))
-			goto out;
-	}
-	for (n = 0; n < count; n++) {
-		if (!expect_reserve_fail(test, &mm, set_node(&tmp, size * n, size * (count - n))))
-			goto out;
-	}
-
-	/* Remove several, reinsert, check full */
-	for_each_prime_number(n, min(max_prime, count)) {
-		for (m = 0; m < n; m++) {
-			node = &nodes[order[(o + m) % count]];
-			drm_mm_remove_node(node);
-		}
-
-		for (m = 0; m < n; m++) {
-			node = &nodes[order[(o + m) % count]];
-			err = drm_mm_reserve_node(&mm, node);
-			if (err) {
-				KUNIT_FAIL(test, "reserve failed, step %d/%d, start %llu\n",
-					   m, n, node->start);
-				ret = err;
-				goto out;
-			}
-		}
-
-		o += n;
-
-		if (!assert_continuous(test, &mm, size))
-			goto out;
-	}
-
-	ret = 0;
-out:
-	drm_mm_for_each_node_safe(node, next, &mm)
-		drm_mm_remove_node(node);
-	drm_mm_takedown(&mm);
-	vfree(nodes);
-	kfree(order);
-err:
-	return ret;
-}
-
-static void drm_test_mm_reserve(struct kunit *test)
-{
-	const unsigned int count = min_t(unsigned int, BIT(10), max_iterations);
-	int n;
-
-	for_each_prime_number_from(n, 1, 54) {
-		u64 size = BIT_ULL(n);
-
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_reserve(test, count, size - 1));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_reserve(test, count, size));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_reserve(test, count, size + 1));
-
-		cond_resched();
-	}
-}
-
 static bool expect_insert(struct kunit *test, struct drm_mm *mm,
 			  struct drm_mm_node *node, u64 size, u64 alignment, unsigned long color,
 			  const struct insert_mode *mode)
···
 	}

 	return true;
-}
-
-static bool expect_insert_fail(struct kunit *test, struct drm_mm *mm, u64 size)
-{
-	struct drm_mm_node tmp = {};
-	int err;
-
-	err = drm_mm_insert_node(mm, &tmp, size);
-	if (likely(err == -ENOSPC))
-		return true;
-
-	if (!err) {
-		KUNIT_FAIL(test, "impossible insert succeeded, node %llu + %llu\n",
-			   tmp.start, tmp.size);
-		drm_mm_remove_node(&tmp);
-	} else {
-		KUNIT_FAIL(test,
-			   "impossible insert failed with wrong error %d [expected %d], size %llu\n",
-			   err, -ENOSPC, size);
-	}
-	return false;
-}
-
-static int __drm_test_mm_insert(struct kunit *test, unsigned int count, u64 size, bool replace)
-{
-	DRM_RND_STATE(prng, random_seed);
-	const struct insert_mode *mode;
-	struct drm_mm mm;
-	struct drm_mm_node *nodes, *node, *next;
-	unsigned int *order, n, m, o = 0;
-	int ret;
-
-	/* Fill a range with lots of nodes, check it doesn't fail too early */
-
-	DRM_MM_BUG_ON(!count);
-	DRM_MM_BUG_ON(!size);
-
-	ret = -ENOMEM;
-	nodes = vmalloc(array_size(count, sizeof(*nodes)));
-	KUNIT_ASSERT_TRUE(test, nodes);
-
-	order = drm_random_order(count, &prng);
-	if (!order)
-		goto err_nodes;
-
-	ret = -EINVAL;
-	drm_mm_init(&mm, 0, count * size);
-
-	for (mode = insert_modes; mode->name; mode++) {
-		for (n = 0; n < count; n++) {
-			struct drm_mm_node tmp;
-
-			node = replace ? &tmp : &nodes[n];
-			memset(node, 0, sizeof(*node));
-			if (!expect_insert(test, &mm, node, size, 0, n, mode)) {
-				KUNIT_FAIL(test, "%s insert failed, size %llu step %d\n",
-					   mode->name, size, n);
-				goto out;
-			}
-
-			if (replace) {
-				drm_mm_replace_node(&tmp, &nodes[n]);
-				if (drm_mm_node_allocated(&tmp)) {
-					KUNIT_FAIL(test,
-						   "replaced old-node still allocated! step %d\n",
-						   n);
-					goto out;
-				}
-
-				if (!assert_node(test, &nodes[n], &mm, size, 0, n)) {
-					KUNIT_FAIL(test,
-						   "replaced node did not inherit parameters, size %llu step %d\n",
-						   size, n);
-					goto out;
-				}
-
-				if (tmp.start != nodes[n].start) {
-					KUNIT_FAIL(test,
-						   "replaced node mismatch location expected [%llx + %llx], found [%llx + %llx]\n",
-						   tmp.start, size, nodes[n].start, nodes[n].size);
-					goto out;
-				}
-			}
-		}
-
-		/* After random insertion the nodes should be in order */
-		if (!assert_continuous(test, &mm, size))
-			goto out;
-
-		/* Repeated use should then fail */
-		if (!expect_insert_fail(test, &mm, size))
-			goto out;
-
-		/* Remove one and reinsert, as the only hole it should refill itself */
-		for (n = 0; n < count; n++) {
-			u64 addr = nodes[n].start;
-
-			drm_mm_remove_node(&nodes[n]);
-			if (!expect_insert(test, &mm, &nodes[n], size, 0, n, mode)) {
-				KUNIT_FAIL(test, "%s reinsert failed, size %llu step %d\n",
-					   mode->name, size, n);
-				goto out;
-			}
-
-			if (nodes[n].start != addr) {
-				KUNIT_FAIL(test,
-					   "%s reinsert node moved, step %d, expected %llx, found %llx\n",
-					   mode->name, n, addr, nodes[n].start);
-				goto out;
-			}
-
-			if (!assert_continuous(test, &mm, size))
-				goto out;
-		}
-
-		/* Remove several, reinsert, check full */
-		for_each_prime_number(n, min(max_prime, count)) {
-			for (m = 0; m < n; m++) {
-				node = &nodes[order[(o + m) % count]];
-				drm_mm_remove_node(node);
-			}
-
-			for (m = 0; m < n; m++) {
-				node = &nodes[order[(o + m) % count]];
-				if (!expect_insert(test, &mm, node, size, 0, n, mode)) {
-					KUNIT_FAIL(test,
-						   "%s multiple reinsert failed, size %llu step %d\n",
-						   mode->name, size, n);
-					goto out;
-				}
-			}
-
-			o += n;
-
-			if (!assert_continuous(test, &mm, size))
-				goto out;
-
-			if (!expect_insert_fail(test, &mm, size))
-				goto out;
-		}
-
-		drm_mm_for_each_node_safe(node, next, &mm)
-			drm_mm_remove_node(node);
-		DRM_MM_BUG_ON(!drm_mm_clean(&mm));
-
-		cond_resched();
-	}
-
-	ret = 0;
-out:
-	drm_mm_for_each_node_safe(node, next, &mm)
-		drm_mm_remove_node(node);
-	drm_mm_takedown(&mm);
-	kfree(order);
-err_nodes:
-	vfree(nodes);
-	return ret;
-}
-
-static void drm_test_mm_insert(struct kunit *test)
-{
-	const unsigned int count = min_t(unsigned int, BIT(10), max_iterations);
-	unsigned int n;
-
-	for_each_prime_number_from(n, 1, 54) {
-		u64 size = BIT_ULL(n);
-
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert(test, count, size - 1, false));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert(test, count, size, false));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert(test, count, size + 1, false));
-
-		cond_resched();
-	}
-}
-
-static void drm_test_mm_replace(struct kunit *test)
-{
-	const unsigned int count = min_t(unsigned int, BIT(10), max_iterations);
-	unsigned int n;
-
-	/* Reuse __drm_test_mm_insert to exercise replacement by inserting a dummy node,
-	 * then replacing it with the intended node. We want to check that
-	 * the tree is intact and all the information we need is carried
-	 * across to the target node.
-	 */
-
-	for_each_prime_number_from(n, 1, 54) {
-		u64 size = BIT_ULL(n);
-
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert(test, count, size - 1, true));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert(test, count, size, true));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert(test, count, size + 1, true));
-
-		cond_resched();
-	}
-}
-
-static bool expect_insert_in_range(struct kunit *test, struct drm_mm *mm, struct drm_mm_node *node,
-				   u64 size, u64 alignment, unsigned long color,
-				   u64 range_start, u64 range_end, const struct insert_mode *mode)
-{
-	int err;
-
-	err = drm_mm_insert_node_in_range(mm, node,
-					  size, alignment, color,
-					  range_start, range_end,
-					  mode->mode);
-	if (err) {
-		KUNIT_FAIL(test,
-			   "insert (size=%llu, alignment=%llu, color=%lu, mode=%s) nto range [%llx, %llx] failed with err=%d\n",
-			   size, alignment, color, mode->name,
-			   range_start, range_end, err);
-		return false;
-	}
-
-	if (!assert_node(test, node, mm, size, alignment, color)) {
-		drm_mm_remove_node(node);
-		return false;
-	}
-
-	return true;
-}
-
-static bool expect_insert_in_range_fail(struct kunit *test, struct drm_mm *mm,
-					u64 size, u64 range_start, u64 range_end)
-{
-	struct drm_mm_node tmp = {};
-	int err;
-
-	err = drm_mm_insert_node_in_range(mm, &tmp, size, 0, 0, range_start, range_end,
-					  0);
-	if (likely(err == -ENOSPC))
-		return true;
-
-	if (!err) {
-		KUNIT_FAIL(test,
-			   "impossible insert succeeded, node %llx + %llu, range [%llx, %llx]\n",
-			   tmp.start, tmp.size, range_start, range_end);
-		drm_mm_remove_node(&tmp);
-	} else {
-		KUNIT_FAIL(test,
-			   "impossible insert failed with wrong error %d [expected %d], size %llu, range [%llx, %llx]\n",
-			   err, -ENOSPC, size, range_start, range_end);
-	}
-
-	return false;
-}
-
-static bool assert_contiguous_in_range(struct kunit *test, struct drm_mm *mm,
-				       u64 size, u64 start, u64 end)
-{
-	struct drm_mm_node *node;
-	unsigned int n;
-
-	if (!expect_insert_in_range_fail(test, mm, size, start, end))
-		return false;
-
-	n = div64_u64(start + size - 1, size);
-	drm_mm_for_each_node(node, mm) {
-		if (node->start < start || node->start + node->size > end) {
-			KUNIT_FAIL(test,
-				   "node %d out of range, address [%llx + %llu], range [%llx, %llx]\n",
-				   n, node->start, node->start + node->size, start, end);
-			return false;
-		}
-
-		if (node->start != n * size) {
-			KUNIT_FAIL(test, "node %d out of order, expected start %llx, found %llx\n",
-				   n, n * size, node->start);
-			return false;
-		}
-
-		if (node->size != size) {
-			KUNIT_FAIL(test, "node %d has wrong size, expected size %llx, found %llx\n",
-				   n, size, node->size);
-			return false;
-		}
-
-		if (drm_mm_hole_follows(node) && drm_mm_hole_node_end(node) < end) {
-			KUNIT_FAIL(test, "node %d is followed by a hole!\n", n);
-			return false;
-		}
-
-		n++;
-	}
-
-	if (start > 0) {
-		node = __drm_mm_interval_first(mm, 0, start - 1);
-		if (drm_mm_node_allocated(node)) {
-			KUNIT_FAIL(test, "node before start: node=%llx+%llu, start=%llx\n",
-				   node->start, node->size, start);
-			return false;
-		}
-	}
-
-	if (end < U64_MAX) {
-		node = __drm_mm_interval_first(mm, end, U64_MAX);
-		if (drm_mm_node_allocated(node)) {
-			KUNIT_FAIL(test, "node after end: node=%llx+%llu, end=%llx\n",
-				   node->start, node->size, end);
-			return false;
-		}
-	}
-
-	return true;
-}
-
-static int __drm_test_mm_insert_range(struct kunit *test, unsigned int count, u64 size,
-				      u64 start, u64 end)
-{
-	const struct insert_mode *mode;
-	struct drm_mm mm;
-	struct drm_mm_node *nodes, *node, *next;
-	unsigned int n, start_n, end_n;
-	int ret;
-
-	DRM_MM_BUG_ON(!count);
-	DRM_MM_BUG_ON(!size);
-	DRM_MM_BUG_ON(end <= start);
-
-	/* Very similar to __drm_test_mm_insert(), but now instead of populating the
-	 * full range of the drm_mm, we try to fill a small portion of it.
-	 */
-
-	ret = -ENOMEM;
-	nodes = vzalloc(array_size(count, sizeof(*nodes)));
-	KUNIT_ASSERT_TRUE(test, nodes);
-
-	ret = -EINVAL;
-	drm_mm_init(&mm, 0, count * size);
-
-	start_n = div64_u64(start + size - 1, size);
-	end_n = div64_u64(end - size, size);
-
-	for (mode = insert_modes; mode->name; mode++) {
-		for (n = start_n; n <= end_n; n++) {
-			if (!expect_insert_in_range(test, &mm, &nodes[n], size, size, n,
-						    start, end, mode)) {
-				KUNIT_FAIL(test,
-					   "%s insert failed, size %llu, step %d [%d, %d], range [%llx, %llx]\n",
-					   mode->name, size, n, start_n, end_n, start, end);
-				goto out;
-			}
-		}
-
-		if (!assert_contiguous_in_range(test, &mm, size, start, end)) {
-			KUNIT_FAIL(test,
-				   "%s: range [%llx, %llx] not full after initialisation, size=%llu\n",
-				   mode->name, start, end, size);
-			goto out;
-		}
-
-		/* Remove one and reinsert, it should refill itself */
-		for (n = start_n; n <= end_n; n++) {
-			u64 addr = nodes[n].start;
-
-			drm_mm_remove_node(&nodes[n]);
-			if (!expect_insert_in_range(test, &mm, &nodes[n], size, size, n,
-						    start, end, mode)) {
-				KUNIT_FAIL(test, "%s reinsert failed, step %d\n", mode->name, n);
-				goto out;
-			}
-
-			if (nodes[n].start != addr) {
-				KUNIT_FAIL(test,
-					   "%s reinsert node moved, step %d, expected %llx, found %llx\n",
-					   mode->name, n, addr, nodes[n].start);
-				goto out;
-			}
-		}
-
-		if (!assert_contiguous_in_range(test, &mm, size, start, end)) {
-			KUNIT_FAIL(test,
-				   "%s: range [%llx, %llx] not full after reinsertion, size=%llu\n",
-				   mode->name, start, end, size);
-			goto out;
-		}
-
-		drm_mm_for_each_node_safe(node, next, &mm)
-			drm_mm_remove_node(node);
-		DRM_MM_BUG_ON(!drm_mm_clean(&mm));
-
-		cond_resched();
-	}
-
-	ret = 0;
-out:
-	drm_mm_for_each_node_safe(node, next, &mm)
-		drm_mm_remove_node(node);
-	drm_mm_takedown(&mm);
-	vfree(nodes);
-	return ret;
-}
-
-static int insert_outside_range(struct kunit *test)
-{
-	struct drm_mm mm;
-	const unsigned int start = 1024;
-	const unsigned int end = 2048;
-	const unsigned int size = end - start;
-
-	drm_mm_init(&mm, start, size);
-
-	if (!expect_insert_in_range_fail(test, &mm, 1, 0, start))
-		return -EINVAL;
-
-	if (!expect_insert_in_range_fail(test, &mm, size,
-					 start - size / 2, start + (size + 1) / 2))
-		return -EINVAL;
-
-	if (!expect_insert_in_range_fail(test, &mm, size,
-					 end - (size + 1) / 2, end + size / 2))
-		return -EINVAL;
-
-	if (!expect_insert_in_range_fail(test, &mm, 1, end, end + size))
-		return -EINVAL;
-
-	drm_mm_takedown(&mm);
-	return 0;
-}
-
-static void drm_test_mm_insert_range(struct kunit *test)
-{
-	const unsigned int count = min_t(unsigned int, BIT(13), max_iterations);
-	unsigned int n;
-
-	/* Check that requests outside the bounds of drm_mm are rejected. */
-	KUNIT_ASSERT_FALSE(test, insert_outside_range(test));
-
-	for_each_prime_number_from(n, 1, 50) {
-		const u64 size = BIT_ULL(n);
-		const u64 max = count * size;
-
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert_range(test, count, size, 0, max));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert_range(test, count, size, 1, max));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert_range(test, count, size, 0, max - 1));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert_range(test, count, size, 0, max / 2));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert_range(test, count, size,
-								    max / 2, max));
-		KUNIT_ASSERT_FALSE(test, __drm_test_mm_insert_range(test, count, size,
-								    max / 4 + 1, 3 * max / 4 - 1));
-
-		cond_resched();
-	}
-}
-
-static int prepare_frag(struct kunit *test, struct drm_mm *mm, struct drm_mm_node *nodes,
-			unsigned int num_insert, const struct insert_mode *mode)
-{
-	unsigned int size = 4096;
-	unsigned int i;
-
-	for (i = 0; i < num_insert; i++) {
-		if (!expect_insert(test, mm, &nodes[i], size, 0, i, mode) != 0) {
-			KUNIT_FAIL(test, "%s insert failed\n", mode->name);
-			return -EINVAL;
-		}
-	}
-
-	/* introduce fragmentation by freeing every other node */
-	for (i = 0; i < num_insert; i++) {
-		if (i % 2 == 0)
-			drm_mm_remove_node(&nodes[i]);
-	}
-
-	return 0;
-}
-
-static u64 get_insert_time(struct kunit *test, struct drm_mm *mm,
-			   unsigned int num_insert, struct drm_mm_node *nodes,
-			   const struct insert_mode *mode)
-{
-	unsigned int size = 8192;
-	ktime_t start;
-	unsigned int i;
-
-	start = ktime_get();
-	for (i = 0; i < num_insert; i++) {
-		if (!expect_insert(test, mm, &nodes[i], size, 0, i, mode) != 0) {
-			KUNIT_FAIL(test, "%s insert failed\n", mode->name);
-			return 0;
-		}
-	}
-
-	return ktime_to_ns(ktime_sub(ktime_get(), start));
-}
-
-static void drm_test_mm_frag(struct kunit *test)
-{
-	struct drm_mm mm;
-	const struct insert_mode *mode;
-	struct drm_mm_node *nodes, *node, *next;
-	unsigned int insert_size = 10000;
-	unsigned int scale_factor = 4;
-
-	/* We need 4 * insert_size nodes to hold intermediate allocated
-	 * drm_mm nodes.
-	 * 1 times for prepare_frag()
-	 * 1 times for get_insert_time()
-	 * 2 times for get_insert_time()
-	 */
-	nodes = vzalloc(array_size(insert_size * 4, sizeof(*nodes)));
-	KUNIT_ASSERT_TRUE(test, nodes);
-
-	/* For BOTTOMUP and TOPDOWN, we first fragment the
-	 * address space using prepare_frag() and then try to verify
-	 * that insertions scale quadratically from 10k to 20k insertions
-	 */
-	drm_mm_init(&mm, 1, U64_MAX - 2);
-	for (mode = insert_modes; mode->name; mode++) {
-		u64 insert_time1, insert_time2;
-
-		if (mode->mode != DRM_MM_INSERT_LOW &&
-		    mode->mode != DRM_MM_INSERT_HIGH)
-			continue;
-
-		if (prepare_frag(test, &mm, nodes, insert_size, mode))
-			goto err;
-
-		insert_time1 = get_insert_time(test, &mm, insert_size,
-					       nodes + insert_size, mode);
-		if (insert_time1 == 0)
-			goto err;
-
-		insert_time2 = get_insert_time(test, &mm, (insert_size * 2),
-					       nodes + insert_size * 2, mode);
-		if (insert_time2 == 0)
-			goto err;
-
-		kunit_info(test, "%s fragmented insert of %u and %u insertions took %llu and %llu nsecs\n",
-			   mode->name, insert_size, insert_size * 2, insert_time1, insert_time2);
-
-		if (insert_time2 > (scale_factor * insert_time1)) {
-			KUNIT_FAIL(test, "%s fragmented insert took %llu nsecs more\n",
-				   mode->name, insert_time2 - (scale_factor * insert_time1));
-			goto err;
-		}
-
-		drm_mm_for_each_node_safe(node, next, &mm)
-			drm_mm_remove_node(node);
-	}
-
-err:
-	drm_mm_for_each_node_safe(node, next, &mm)
-		drm_mm_remove_node(node);
-	drm_mm_takedown(&mm);
-	vfree(nodes);
-}
-
-static void drm_test_mm_align(struct kunit *test)
-{
-	const struct insert_mode *mode;
-	const unsigned int max_count = min(8192u, max_prime);
-	struct drm_mm mm;
-	struct drm_mm_node *nodes, *node, *next;
-	unsigned int prime;
-
-	/* For each of the possible insertion modes, we pick a few
-	 * arbitrary alignments and check that the inserted node
-	 * meets our requirements.
-	 */
-
-	nodes = vzalloc(array_size(max_count, sizeof(*nodes)));
-	KUNIT_ASSERT_TRUE(test, nodes);
-
-	drm_mm_init(&mm, 1, U64_MAX - 2);
-
-	for (mode = insert_modes; mode->name; mode++) {
-		unsigned int i = 0;
-
-		for_each_prime_number_from(prime, 1, max_count) {
-			u64 size = next_prime_number(prime);
-
-			if (!expect_insert(test, &mm, &nodes[i], size, prime, i, mode)) {
-				KUNIT_FAIL(test, "%s insert failed with alignment=%d",
-					   mode->name, prime);
-				goto out;
-			}
-
-			i++;
-		}
-
-		drm_mm_for_each_node_safe(node, next, &mm)
-			drm_mm_remove_node(node);
-		DRM_MM_BUG_ON(!drm_mm_clean(&mm));
-
-		cond_resched();
-	}
-
-out:
-	drm_mm_for_each_node_safe(node, next, &mm)
-		drm_mm_remove_node(node);
-	drm_mm_takedown(&mm);
-	vfree(nodes);
 }

 static void drm_test_mm_align_pot(struct kunit *test, int max)
···
 static void drm_test_mm_align64(struct kunit *test)
 {
 	drm_test_mm_align_pot(test, 64);
-}
-
-static void show_scan(struct kunit *test, const struct drm_mm_scan *scan)
-{
-	kunit_info(test, "scan: hit [%llx, %llx], size=%lld, align=%lld, color=%ld\n",
-		   scan->hit_start, scan->hit_end, scan->size, scan->alignment, scan->color);
-}
-
-static void show_holes(struct kunit *test, const struct drm_mm *mm, int count)
-{
-	u64 hole_start, hole_end;
-	struct drm_mm_node *hole;
-
-	drm_mm_for_each_hole(hole, mm, hole_start, hole_end) {
-		struct drm_mm_node *next = list_next_entry(hole, node_list);
-		const char *node1 = NULL, *node2 = NULL;
-
-		if (drm_mm_node_allocated(hole))
-			node1 = kasprintf(GFP_KERNEL, "[%llx + %lld, color=%ld], ",
-					  hole->start, hole->size, hole->color);
-
-		if (drm_mm_node_allocated(next))
-			node2 = kasprintf(GFP_KERNEL, ", [%llx + %lld, color=%ld]",
-					  next->start, next->size, next->color);
-
-		kunit_info(test, "%sHole [%llx - %llx, size %lld]%s\n", node1,
-			   hole_start, hole_end, hole_end - hole_start, node2);
-
-		kfree(node2);
-		kfree(node1);
-
-		if (!--count)
-			break;
-	}
-}
-
-struct evict_node {
-	struct drm_mm_node node;
-	struct list_head link;
-};
-
-static bool evict_nodes(struct kunit *test, struct drm_mm_scan *scan,
-			struct evict_node *nodes, unsigned int *order, unsigned int count,
-			bool use_color, struct list_head *evict_list)
-{
-	struct evict_node *e, *en;
-	unsigned int i;
-
-	for (i = 0; i < count; i++) {
-		e = &nodes[order ? order[i] : i];
-		list_add(&e->link, evict_list);
-		if (drm_mm_scan_add_block(scan, &e->node))
-			break;
-	}
-	list_for_each_entry_safe(e, en, evict_list, link) {
-		if (!drm_mm_scan_remove_block(scan, &e->node))
-			list_del(&e->link);
-	}
-	if (list_empty(evict_list)) {
-		KUNIT_FAIL(test,
-			   "Failed to find eviction: size=%lld [avail=%d], align=%lld (color=%lu)\n",
-			   scan->size, count, scan->alignment, scan->color);
-		return false;
-	}
-
-	list_for_each_entry(e, evict_list, link)
-		drm_mm_remove_node(&e->node);
-
-	if (use_color) {
-		struct drm_mm_node *node;
-
-		while ((node = drm_mm_scan_color_evict(scan))) {
-			e = container_of(node, typeof(*e), node);
-			drm_mm_remove_node(&e->node);
-			list_add(&e->link, evict_list);
-		}
-	} else {
-		if (drm_mm_scan_color_evict(scan)) {
-			KUNIT_FAIL(test,
-				   "drm_mm_scan_color_evict unexpectedly reported overlapping nodes!\n");
-			return false;
-		}
-	}
-
-	return true;
-}
-
-static bool evict_nothing(struct kunit *test, struct drm_mm *mm,
-			  unsigned int total_size, struct evict_node *nodes)
-{
-	struct drm_mm_scan scan;
-	LIST_HEAD(evict_list);
-	struct evict_node *e;
-	struct drm_mm_node *node;
-	unsigned int n;
-
-	drm_mm_scan_init(&scan, mm, 1, 0, 0, 0);
-	for (n = 0; n < total_size; n++) {
-		e = &nodes[n];
-		list_add(&e->link, &evict_list);
-		drm_mm_scan_add_block(&scan, &e->node);
-	}
-	list_for_each_entry(e, &evict_list, link)
-		drm_mm_scan_remove_block(&scan, &e->node);
-
-	for (n = 0; n < total_size; n++) {
-		e = &nodes[n];
-
-		if (!drm_mm_node_allocated(&e->node)) {
-			KUNIT_FAIL(test, "node[%d] no longer allocated!\n", n);
-			return false;
-		}
-
-		e->link.next = NULL;
-	}
-
-	drm_mm_for_each_node(node, mm) {
-		e = container_of(node, typeof(*e), node);
-		e->link.next = &e->link;
-	}
-
-	for (n = 0; n < total_size; n++) {
-		e = &nodes[n];
-
-		if (!e->link.next) {
-			KUNIT_FAIL(test, "node[%d] no longer connected!\n", n);
-			return false;
-		}
-	}
-
-	return assert_continuous(test, mm, nodes[0].node.size);
-}
-
-static bool evict_everything(struct kunit *test, struct drm_mm *mm,
-			     unsigned int total_size, struct evict_node *nodes)
-{
-	struct drm_mm_scan scan;
-	LIST_HEAD(evict_list);
-	struct evict_node *e;
-	unsigned int n;
-	int err;
-
-	drm_mm_scan_init(&scan, mm, total_size, 0, 0, 0);
-	for (n = 0; n < total_size; n++) {
-		e = &nodes[n];
-		list_add(&e->link, &evict_list);
-		if (drm_mm_scan_add_block(&scan, &e->node))
-			break;
-	}
-
-	err = 0;
-	list_for_each_entry(e, &evict_list, link) {
-		if (!drm_mm_scan_remove_block(&scan, &e->node)) {
-			if (!err) {
-				KUNIT_FAIL(test, "Node %lld not marked for eviction!\n",
-					   e->node.start);
-				err = -EINVAL;
-			}
-		}
-	}
-	if (err)
-		return false;
-
-	list_for_each_entry(e, &evict_list, link)
-		drm_mm_remove_node(&e->node);
-
-	if (!assert_one_hole(test, mm, 0, total_size))
-		return false;
-
-	list_for_each_entry(e, &evict_list, link) {
-		err = drm_mm_reserve_node(mm, &e->node);
-		if (err) {
-			KUNIT_FAIL(test, "Failed to reinsert node after eviction: start=%llx\n",
-				   e->node.start);
-			return false;
-		}
-	}
-
-	return assert_continuous(test, mm, nodes[0].node.size);
-}
-
-static int evict_something(struct kunit *test, struct drm_mm *mm,
-			   u64 range_start, u64 range_end, struct evict_node *nodes,
-			   unsigned int *order, unsigned int count, unsigned int size,
-			   unsigned int alignment, const struct insert_mode *mode)
-{
-	struct drm_mm_scan scan;
-	LIST_HEAD(evict_list);
-	struct evict_node *e;
-	struct drm_mm_node tmp;
-	int err;
-
-	drm_mm_scan_init_with_range(&scan, mm, size, alignment, 0, range_start,
-				    range_end, mode->mode);
-	if (!evict_nodes(test, &scan, nodes, order, count, false, &evict_list))
-		return -EINVAL;
-
-	memset(&tmp, 0, sizeof(tmp));
-	err = drm_mm_insert_node_generic(mm, &tmp, size, alignment, 0,
-					 DRM_MM_INSERT_EVICT);
-	if (err) {
-		KUNIT_FAIL(test, "Failed to insert into eviction hole: size=%d, align=%d\n",
-			   size, alignment);
-		show_scan(test, &scan);
-		show_holes(test, mm, 3);
-		return err;
-	}
-
-	if (tmp.start < range_start || tmp.start + tmp.size > range_end) {
-		KUNIT_FAIL(test,
-			   "Inserted [address=%llu + %llu] did not fit into the request range [%llu, %llu]\n",
-			   tmp.start, tmp.size, range_start, range_end);
-		err = -EINVAL;
-	}
-
-	if (!assert_node(test, &tmp, mm, size, alignment, 0) ||
-	    drm_mm_hole_follows(&tmp)) {
-		KUNIT_FAIL(test,
-			   "Inserted did not fill the eviction hole: size=%lld [%d], align=%d [rem=%lld], start=%llx, hole-follows?=%d\n",
-			   tmp.size, size, alignment, misalignment(&tmp, alignment),
-			   tmp.start, drm_mm_hole_follows(&tmp));
-		err = -EINVAL;
-	}
-
-	drm_mm_remove_node(&tmp);
-	if (err)
-		return err;
-
-	list_for_each_entry(e, &evict_list, link) {
-		err = drm_mm_reserve_node(mm, &e->node);
-		if (err) {
-			KUNIT_FAIL(test, "Failed to reinsert node after eviction: start=%llx\n",
-				   e->node.start);
-			return err;
-		}
-	}
-
-	if (!assert_continuous(test, mm, nodes[0].node.size)) {
-		KUNIT_FAIL(test, "range is no longer continuous\n");
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-static void drm_test_mm_evict(struct kunit *test)
-{
-	DRM_RND_STATE(prng, random_seed);
-	const unsigned int size = 8192;
-	const struct insert_mode *mode;
-	struct drm_mm mm;
-	struct evict_node *nodes;
-	struct drm_mm_node *node, *next;
-	unsigned int *order, n;
-
-	/* Here we populate a full drm_mm and then try and insert a new node
-	 * by evicting other nodes in a random order. The drm_mm_scan should
-	 * pick the first matching hole it finds from the random list. We
-	 * repeat that for different allocation strategies, alignments and
-	 * sizes to try and stress the hole finder.
-	 */
-
-	nodes = vzalloc(array_size(size, sizeof(*nodes)));
-	KUNIT_ASSERT_TRUE(test, nodes);
-
-	order = drm_random_order(size, &prng);
-	if (!order)
-		goto err_nodes;
-
-	drm_mm_init(&mm, 0, size);
-	for (n = 0; n < size; n++) {
-		if (drm_mm_insert_node(&mm, &nodes[n].node, 1)) {
-			KUNIT_FAIL(test, "insert failed, step %d\n", n);
-			goto out;
-		}
-	}
-
-	/* First check that using the scanner doesn't break the mm */
-	if (!evict_nothing(test, &mm, size, nodes)) {
-		KUNIT_FAIL(test, "evict_nothing() failed\n");
-		goto out;
-	}
-	if (!evict_everything(test, &mm, size, nodes)) {
-		KUNIT_FAIL(test, "evict_everything() failed\n");
-		goto out;
-	}
-
-	for (mode = evict_modes; mode->name; mode++) {
-		for (n = 1; n <= size; n <<= 1) {
-			drm_random_reorder(order, size, &prng);
-			if (evict_something(test, &mm, 0, U64_MAX, nodes, order, size, n, 1,
-					    mode)) {
-				KUNIT_FAIL(test, "%s evict_something(size=%u) failed\n",
-					   mode->name, n);
-				goto out;
-			}
-		}
-
-		for (n = 1; n < size; n <<= 1) {
-			drm_random_reorder(order, size, &prng);
-			if (evict_something(test, &mm, 0, U64_MAX, nodes, order, size,
-					    size / 2, n, mode)) {
-				KUNIT_FAIL(test,
-					   "%s evict_something(size=%u, alignment=%u) failed\n",
-					   mode->name, size / 2, n);
-				goto out;
-			}
-		}
-
-		for_each_prime_number_from(n, 1, min(size, max_prime)) {
-			unsigned int nsize = (size - n + 1) / 2;
-
-			DRM_MM_BUG_ON(!nsize);
-
-			drm_random_reorder(order, size, &prng);
-			if (evict_something(test, &mm, 0, U64_MAX, nodes, order, size,
-					    nsize, n, mode)) {
-				KUNIT_FAIL(test,
-					   "%s evict_something(size=%u, alignment=%u) failed\n",
-					   mode->name, nsize, n);
-				goto out;
-			}
-		}
-
-		cond_resched();
-	}
-
-out:
-	drm_mm_for_each_node_safe(node, next, &mm)
-		drm_mm_remove_node(node);
-	drm_mm_takedown(&mm);
-	kfree(order);
-err_nodes:
-	vfree(nodes);
-}
-
-static void drm_test_mm_evict_range(struct kunit *test)
-{
-	DRM_RND_STATE(prng, random_seed);
-	const unsigned int size = 8192;
-	const unsigned int range_size = size / 2;
-	const unsigned int range_start = size / 4;
-	const unsigned int range_end = range_start + range_size;
-	const struct insert_mode *mode;
-	struct drm_mm mm;
-	struct evict_node *nodes;
-	struct drm_mm_node *node, *next;
-	unsigned int *order, n;
-
-	/* Like drm_test_mm_evict() but now we are limiting the search to a
-	 * small portion of the full drm_mm.
-	 */
-
-	nodes = vzalloc(array_size(size, sizeof(*nodes)));
-	KUNIT_ASSERT_TRUE(test, nodes);
-
-	order = drm_random_order(size, &prng);
-	if (!order)
-		goto err_nodes;
-
-	drm_mm_init(&mm, 0, size);
-	for (n = 0; n < size; n++) {
-		if (drm_mm_insert_node(&mm, &nodes[n].node, 1)) {
-			KUNIT_FAIL(test, "insert failed, step %d\n", n);
-			goto out;
-		}
-	}
-
-	for (mode = evict_modes; mode->name; mode++) {
-		for (n = 1; n <= range_size; n <<= 1) {
-			drm_random_reorder(order, size, &prng);
-			if (evict_something(test, &mm, range_start, range_end, nodes,
-					    order, size, n, 1, mode)) {
-				KUNIT_FAIL(test,
-					   "%s evict_something(size=%u) failed with range [%u, %u]\n",
-					   mode->name, n, range_start, range_end);
-				goto out;
-			}
-		}
-
-		for (n = 1; n <= range_size; n <<= 1) {
-			drm_random_reorder(order, size, &prng);
-			if (evict_something(test, &mm, range_start, range_end, nodes,
-					    order, size, range_size / 2, n, mode)) {
-				KUNIT_FAIL(test,
-					   "%s evict_something(size=%u, alignment=%u) failed with range [%u, %u]\n",
-					   mode->name, range_size / 2, n, range_start, range_end);
-				goto out;
-			}
-		}
-
-		for_each_prime_number_from(n, 1, min(range_size, max_prime)) {
-			unsigned int nsize = (range_size - n + 1) / 2;
-
-			DRM_MM_BUG_ON(!nsize);
-
-			drm_random_reorder(order, size, &prng);
-			if (evict_something(test, &mm, range_start, range_end, nodes,
-					    order, size, nsize, n, mode)) {
-				KUNIT_FAIL(test,
-					   "%s evict_something(size=%u, alignment=%u) failed with range [%u, %u]\n",
-					   mode->name, nsize, n, range_start, range_end);
-				goto out;
-			}
-		}
-
-		cond_resched();
-	}
-
-out:
-	drm_mm_for_each_node_safe(node, next, &mm)
-		drm_mm_remove_node(node);
-	drm_mm_takedown(&mm);
-	kfree(order);
-err_nodes:
-	vfree(nodes);
-}
-
static unsigned int node_index(const struct drm_mm_node *node) 702 - { 703 - return div64_u64(node->start, node->size); 704 - } 705 - 706 - static void drm_test_mm_topdown(struct kunit *test) 707 - { 708 - const struct insert_mode *topdown = &insert_modes[TOPDOWN]; 709 - 710 - DRM_RND_STATE(prng, random_seed); 711 - const unsigned int count = 8192; 712 - unsigned int size; 713 - unsigned long *bitmap; 714 - struct drm_mm mm; 715 - struct drm_mm_node *nodes, *node, *next; 716 - unsigned int *order, n, m, o = 0; 717 - 718 - /* When allocating top-down, we expect to be returned a node 719 - * from a suitable hole at the top of the drm_mm. We check that 720 - * the returned node does match the highest available slot. 721 - */ 722 - 723 - nodes = vzalloc(array_size(count, sizeof(*nodes))); 724 - KUNIT_ASSERT_TRUE(test, nodes); 725 - 726 - bitmap = bitmap_zalloc(count, GFP_KERNEL); 727 - if (!bitmap) 728 - goto err_nodes; 729 - 730 - order = drm_random_order(count, &prng); 731 - if (!order) 732 - goto err_bitmap; 733 - 734 - for (size = 1; size <= 64; size <<= 1) { 735 - drm_mm_init(&mm, 0, size * count); 736 - for (n = 0; n < count; n++) { 737 - if (!expect_insert(test, &mm, &nodes[n], size, 0, n, topdown)) { 738 - KUNIT_FAIL(test, "insert failed, size %u step %d\n", size, n); 739 - goto out; 740 - } 741 - 742 - if (drm_mm_hole_follows(&nodes[n])) { 743 - KUNIT_FAIL(test, 744 - "hole after topdown insert %d, start=%llx\n, size=%u", 745 - n, nodes[n].start, size); 746 - goto out; 747 - } 748 - 749 - if (!assert_one_hole(test, &mm, 0, size * (count - n - 1))) 750 - goto out; 751 - } 752 - 753 - if (!assert_continuous(test, &mm, size)) 754 - goto out; 755 - 756 - drm_random_reorder(order, count, &prng); 757 - for_each_prime_number_from(n, 1, min(count, max_prime)) { 758 - for (m = 0; m < n; m++) { 759 - node = &nodes[order[(o + m) % count]]; 760 - drm_mm_remove_node(node); 761 - __set_bit(node_index(node), bitmap); 762 - } 763 - 764 - for (m = 0; m < n; m++) { 765 - 
unsigned int last; 766 - 767 - node = &nodes[order[(o + m) % count]]; 768 - if (!expect_insert(test, &mm, node, size, 0, 0, topdown)) { 769 - KUNIT_FAIL(test, "insert failed, step %d/%d\n", m, n); 770 - goto out; 771 - } 772 - 773 - if (drm_mm_hole_follows(node)) { 774 - KUNIT_FAIL(test, 775 - "hole after topdown insert %d/%d, start=%llx\n", 776 - m, n, node->start); 777 - goto out; 778 - } 779 - 780 - last = find_last_bit(bitmap, count); 781 - if (node_index(node) != last) { 782 - KUNIT_FAIL(test, 783 - "node %d/%d, size %d, not inserted into upmost hole, expected %d, found %d\n", 784 - m, n, size, last, node_index(node)); 785 - goto out; 786 - } 787 - 788 - __clear_bit(last, bitmap); 789 - } 790 - 791 - DRM_MM_BUG_ON(find_first_bit(bitmap, count) != count); 792 - 793 - o += n; 794 - } 795 - 796 - drm_mm_for_each_node_safe(node, next, &mm) 797 - drm_mm_remove_node(node); 798 - DRM_MM_BUG_ON(!drm_mm_clean(&mm)); 799 - cond_resched(); 800 - } 801 - 802 - out: 803 - drm_mm_for_each_node_safe(node, next, &mm) 804 - drm_mm_remove_node(node); 805 - drm_mm_takedown(&mm); 806 - kfree(order); 807 - err_bitmap: 808 - bitmap_free(bitmap); 809 - err_nodes: 810 - vfree(nodes); 811 - } 812 - 813 - static void drm_test_mm_bottomup(struct kunit *test) 814 - { 815 - const struct insert_mode *bottomup = &insert_modes[BOTTOMUP]; 816 - 817 - DRM_RND_STATE(prng, random_seed); 818 - const unsigned int count = 8192; 819 - unsigned int size; 820 - unsigned long *bitmap; 821 - struct drm_mm mm; 822 - struct drm_mm_node *nodes, *node, *next; 823 - unsigned int *order, n, m, o = 0; 824 - 825 - /* Like drm_test_mm_topdown, but instead of searching for the last hole, 826 - * we search for the first. 
827 - */ 828 - 829 - nodes = vzalloc(array_size(count, sizeof(*nodes))); 830 - KUNIT_ASSERT_TRUE(test, nodes); 831 - 832 - bitmap = bitmap_zalloc(count, GFP_KERNEL); 833 - if (!bitmap) 834 - goto err_nodes; 835 - 836 - order = drm_random_order(count, &prng); 837 - if (!order) 838 - goto err_bitmap; 839 - 840 - for (size = 1; size <= 64; size <<= 1) { 841 - drm_mm_init(&mm, 0, size * count); 842 - for (n = 0; n < count; n++) { 843 - if (!expect_insert(test, &mm, &nodes[n], size, 0, n, bottomup)) { 844 - KUNIT_FAIL(test, 845 - "bottomup insert failed, size %u step %d\n", size, n); 846 - goto out; 847 - } 848 - 849 - if (!assert_one_hole(test, &mm, size * (n + 1), size * count)) 850 - goto out; 851 - } 852 - 853 - if (!assert_continuous(test, &mm, size)) 854 - goto out; 855 - 856 - drm_random_reorder(order, count, &prng); 857 - for_each_prime_number_from(n, 1, min(count, max_prime)) { 858 - for (m = 0; m < n; m++) { 859 - node = &nodes[order[(o + m) % count]]; 860 - drm_mm_remove_node(node); 861 - __set_bit(node_index(node), bitmap); 862 - } 863 - 864 - for (m = 0; m < n; m++) { 865 - unsigned int first; 866 - 867 - node = &nodes[order[(o + m) % count]]; 868 - if (!expect_insert(test, &mm, node, size, 0, 0, bottomup)) { 869 - KUNIT_FAIL(test, "insert failed, step %d/%d\n", m, n); 870 - goto out; 871 - } 872 - 873 - first = find_first_bit(bitmap, count); 874 - if (node_index(node) != first) { 875 - KUNIT_FAIL(test, 876 - "node %d/%d not inserted into bottom hole, expected %d, found %d\n", 877 - m, n, first, node_index(node)); 878 - goto out; 879 - } 880 - __clear_bit(first, bitmap); 881 - } 882 - 883 - DRM_MM_BUG_ON(find_first_bit(bitmap, count) != count); 884 - 885 - o += n; 886 - } 887 - 888 - drm_mm_for_each_node_safe(node, next, &mm) 889 - drm_mm_remove_node(node); 890 - DRM_MM_BUG_ON(!drm_mm_clean(&mm)); 891 - cond_resched(); 892 - } 893 - 894 - out: 895 - drm_mm_for_each_node_safe(node, next, &mm) 896 - drm_mm_remove_node(node); 897 - drm_mm_takedown(&mm); 898 - 
kfree(order); 899 - err_bitmap: 900 - bitmap_free(bitmap); 901 - err_nodes: 902 - vfree(nodes); 903 1145 } 904 1146 905 1147 static void drm_test_mm_once(struct kunit *test, unsigned int mode) ··· 335 1817 drm_test_mm_once(test, DRM_MM_INSERT_HIGH); 336 1818 } 337 1819 338 - static void separate_adjacent_colors(const struct drm_mm_node *node, 339 - unsigned long color, u64 *start, u64 *end) 340 - { 341 - if (drm_mm_node_allocated(node) && node->color != color) 342 - ++*start; 343 - 344 - node = list_next_entry(node, node_list); 345 - if (drm_mm_node_allocated(node) && node->color != color) 346 - --*end; 347 - } 348 - 349 - static bool colors_abutt(struct kunit *test, const struct drm_mm_node *node) 350 - { 351 - if (!drm_mm_hole_follows(node) && 352 - drm_mm_node_allocated(list_next_entry(node, node_list))) { 353 - KUNIT_FAIL(test, "colors abutt; %ld [%llx + %llx] is next to %ld [%llx + %llx]!\n", 354 - node->color, node->start, node->size, 355 - list_next_entry(node, node_list)->color, 356 - list_next_entry(node, node_list)->start, 357 - list_next_entry(node, node_list)->size); 358 - return true; 359 - } 360 - 361 - return false; 362 - } 363 - 364 - static void drm_test_mm_color(struct kunit *test) 365 - { 366 - const unsigned int count = min(4096u, max_iterations); 367 - const struct insert_mode *mode; 368 - struct drm_mm mm; 369 - struct drm_mm_node *node, *nn; 370 - unsigned int n; 371 - 372 - /* Color adjustment complicates everything. First we just check 373 - * that when we insert a node we apply any color_adjustment callback. 374 - * The callback we use should ensure that there is a gap between 375 - * any two nodes, and so after each insertion we check that those 376 - * holes are inserted and that they are preserved. 
377 - */ 378 - 379 - drm_mm_init(&mm, 0, U64_MAX); 380 - 381 - for (n = 1; n <= count; n++) { 382 - node = kzalloc(sizeof(*node), GFP_KERNEL); 383 - if (!node) 384 - goto out; 385 - 386 - if (!expect_insert(test, &mm, node, n, 0, n, &insert_modes[0])) { 387 - KUNIT_FAIL(test, "insert failed, step %d\n", n); 388 - kfree(node); 389 - goto out; 390 - } 391 - } 392 - 393 - drm_mm_for_each_node_safe(node, nn, &mm) { 394 - if (node->color != node->size) { 395 - KUNIT_FAIL(test, "invalid color stored: expected %lld, found %ld\n", 396 - node->size, node->color); 397 - 398 - goto out; 399 - } 400 - 401 - drm_mm_remove_node(node); 402 - kfree(node); 403 - } 404 - 405 - /* Now, let's start experimenting with applying a color callback */ 406 - mm.color_adjust = separate_adjacent_colors; 407 - for (mode = insert_modes; mode->name; mode++) { 408 - u64 last; 409 - 410 - node = kzalloc(sizeof(*node), GFP_KERNEL); 411 - if (!node) 412 - goto out; 413 - 414 - node->size = 1 + 2 * count; 415 - node->color = node->size; 416 - 417 - if (drm_mm_reserve_node(&mm, node)) { 418 - KUNIT_FAIL(test, "initial reserve failed!\n"); 419 - goto out; 420 - } 421 - 422 - last = node->start + node->size; 423 - 424 - for (n = 1; n <= count; n++) { 425 - int rem; 426 - 427 - node = kzalloc(sizeof(*node), GFP_KERNEL); 428 - if (!node) 429 - goto out; 430 - 431 - node->start = last; 432 - node->size = n + count; 433 - node->color = node->size; 434 - 435 - if (drm_mm_reserve_node(&mm, node) != -ENOSPC) { 436 - KUNIT_FAIL(test, "reserve %d did not report color overlap!", n); 437 - goto out; 438 - } 439 - 440 - node->start += n + 1; 441 - rem = misalignment(node, n + count); 442 - node->start += n + count - rem; 443 - 444 - if (drm_mm_reserve_node(&mm, node)) { 445 - KUNIT_FAIL(test, "reserve %d failed", n); 446 - goto out; 447 - } 448 - 449 - last = node->start + node->size; 450 - } 451 - 452 - for (n = 1; n <= count; n++) { 453 - node = kzalloc(sizeof(*node), GFP_KERNEL); 454 - if (!node) 455 - goto out; 
456 - 457 - if (!expect_insert(test, &mm, node, n, n, n, mode)) { 458 - KUNIT_FAIL(test, "%s insert failed, step %d\n", mode->name, n); 459 - kfree(node); 460 - goto out; 461 - } 462 - } 463 - 464 - drm_mm_for_each_node_safe(node, nn, &mm) { 465 - u64 rem; 466 - 467 - if (node->color != node->size) { 468 - KUNIT_FAIL(test, 469 - "%s invalid color stored: expected %lld, found %ld\n", 470 - mode->name, node->size, node->color); 471 - 472 - goto out; 473 - } 474 - 475 - if (colors_abutt(test, node)) 476 - goto out; 477 - 478 - div64_u64_rem(node->start, node->size, &rem); 479 - if (rem) { 480 - KUNIT_FAIL(test, 481 - "%s colored node misaligned, start=%llx expected alignment=%lld [rem=%lld]\n", 482 - mode->name, node->start, node->size, rem); 483 - goto out; 484 - } 485 - 486 - drm_mm_remove_node(node); 487 - kfree(node); 488 - } 489 - 490 - cond_resched(); 491 - } 492 - 493 - out: 494 - drm_mm_for_each_node_safe(node, nn, &mm) { 495 - drm_mm_remove_node(node); 496 - kfree(node); 497 - } 498 - drm_mm_takedown(&mm); 499 - } 500 - 501 - static int evict_color(struct kunit *test, struct drm_mm *mm, u64 range_start, 502 - u64 range_end, struct evict_node *nodes, unsigned int *order, 503 - unsigned int count, unsigned int size, unsigned int alignment, 504 - unsigned long color, const struct insert_mode *mode) 505 - { 506 - struct drm_mm_scan scan; 507 - LIST_HEAD(evict_list); 508 - struct evict_node *e; 509 - struct drm_mm_node tmp; 510 - int err; 511 - 512 - drm_mm_scan_init_with_range(&scan, mm, size, alignment, color, range_start, 513 - range_end, mode->mode); 514 - if (!evict_nodes(test, &scan, nodes, order, count, true, &evict_list)) 515 - return -EINVAL; 516 - 517 - memset(&tmp, 0, sizeof(tmp)); 518 - err = drm_mm_insert_node_generic(mm, &tmp, size, alignment, color, 519 - DRM_MM_INSERT_EVICT); 520 - if (err) { 521 - KUNIT_FAIL(test, 522 - "Failed to insert into eviction hole: size=%d, align=%d, color=%lu, err=%d\n", 523 - size, alignment, color, err); 524 - 
show_scan(test, &scan); 525 - show_holes(test, mm, 3); 526 - return err; 527 - } 528 - 529 - if (tmp.start < range_start || tmp.start + tmp.size > range_end) { 530 - KUNIT_FAIL(test, 531 - "Inserted [address=%llu + %llu] did not fit into the request range [%llu, %llu]\n", 532 - tmp.start, tmp.size, range_start, range_end); 533 - err = -EINVAL; 534 - } 535 - 536 - if (colors_abutt(test, &tmp)) 537 - err = -EINVAL; 538 - 539 - if (!assert_node(test, &tmp, mm, size, alignment, color)) { 540 - KUNIT_FAIL(test, 541 - "Inserted did not fit the eviction hole: size=%lld [%d], align=%d [rem=%lld], start=%llx\n", 542 - tmp.size, size, alignment, misalignment(&tmp, alignment), tmp.start); 543 - err = -EINVAL; 544 - } 545 - 546 - drm_mm_remove_node(&tmp); 547 - if (err) 548 - return err; 549 - 550 - list_for_each_entry(e, &evict_list, link) { 551 - err = drm_mm_reserve_node(mm, &e->node); 552 - if (err) { 553 - KUNIT_FAIL(test, "Failed to reinsert node after eviction: start=%llx\n", 554 - e->node.start); 555 - return err; 556 - } 557 - } 558 - 559 - cond_resched(); 560 - return 0; 561 - } 562 - 563 - static void drm_test_mm_color_evict(struct kunit *test) 564 - { 565 - DRM_RND_STATE(prng, random_seed); 566 - const unsigned int total_size = min(8192u, max_iterations); 567 - const struct insert_mode *mode; 568 - unsigned long color = 0; 569 - struct drm_mm mm; 570 - struct evict_node *nodes; 571 - struct drm_mm_node *node, *next; 572 - unsigned int *order, n; 573 - 574 - /* Check that the drm_mm_scan also honours color adjustment when 575 - * choosing its victims to create a hole. Our color_adjust does not 576 - * allow two nodes to be placed together without an intervening hole 577 - * enlarging the set of victims that must be evicted. 
578 - */ 579 - 580 - nodes = vzalloc(array_size(total_size, sizeof(*nodes))); 581 - KUNIT_ASSERT_TRUE(test, nodes); 582 - 583 - order = drm_random_order(total_size, &prng); 584 - if (!order) 585 - goto err_nodes; 586 - 587 - drm_mm_init(&mm, 0, 2 * total_size - 1); 588 - mm.color_adjust = separate_adjacent_colors; 589 - for (n = 0; n < total_size; n++) { 590 - if (!expect_insert(test, &mm, &nodes[n].node, 591 - 1, 0, color++, 592 - &insert_modes[0])) { 593 - KUNIT_FAIL(test, "insert failed, step %d\n", n); 594 - goto out; 595 - } 596 - } 597 - 598 - for (mode = evict_modes; mode->name; mode++) { 599 - for (n = 1; n <= total_size; n <<= 1) { 600 - drm_random_reorder(order, total_size, &prng); 601 - if (evict_color(test, &mm, 0, U64_MAX, nodes, order, total_size, 602 - n, 1, color++, mode)) { 603 - KUNIT_FAIL(test, "%s evict_color(size=%u) failed\n", mode->name, n); 604 - goto out; 605 - } 606 - } 607 - 608 - for (n = 1; n < total_size; n <<= 1) { 609 - drm_random_reorder(order, total_size, &prng); 610 - if (evict_color(test, &mm, 0, U64_MAX, nodes, order, total_size, 611 - total_size / 2, n, color++, mode)) { 612 - KUNIT_FAIL(test, "%s evict_color(size=%u, alignment=%u) failed\n", 613 - mode->name, total_size / 2, n); 614 - goto out; 615 - } 616 - } 617 - 618 - for_each_prime_number_from(n, 1, min(total_size, max_prime)) { 619 - unsigned int nsize = (total_size - n + 1) / 2; 620 - 621 - DRM_MM_BUG_ON(!nsize); 622 - 623 - drm_random_reorder(order, total_size, &prng); 624 - if (evict_color(test, &mm, 0, U64_MAX, nodes, order, total_size, 625 - nsize, n, color++, mode)) { 626 - KUNIT_FAIL(test, "%s evict_color(size=%u, alignment=%u) failed\n", 627 - mode->name, nsize, n); 628 - goto out; 629 - } 630 - } 631 - 632 - cond_resched(); 633 - } 634 - 635 - out: 636 - drm_mm_for_each_node_safe(node, next, &mm) 637 - drm_mm_remove_node(node); 638 - drm_mm_takedown(&mm); 639 - kfree(order); 640 - err_nodes: 641 - vfree(nodes); 642 - } 643 - 644 - static void 
drm_test_mm_color_evict_range(struct kunit *test) 645 - { 646 - DRM_RND_STATE(prng, random_seed); 647 - const unsigned int total_size = 8192; 648 - const unsigned int range_size = total_size / 2; 649 - const unsigned int range_start = total_size / 4; 650 - const unsigned int range_end = range_start + range_size; 651 - const struct insert_mode *mode; 652 - unsigned long color = 0; 653 - struct drm_mm mm; 654 - struct evict_node *nodes; 655 - struct drm_mm_node *node, *next; 656 - unsigned int *order, n; 657 - 658 - /* Like drm_test_mm_color_evict(), but limited to small portion of the full 659 - * drm_mm range. 660 - */ 661 - 662 - nodes = vzalloc(array_size(total_size, sizeof(*nodes))); 663 - KUNIT_ASSERT_TRUE(test, nodes); 664 - 665 - order = drm_random_order(total_size, &prng); 666 - if (!order) 667 - goto err_nodes; 668 - 669 - drm_mm_init(&mm, 0, 2 * total_size - 1); 670 - mm.color_adjust = separate_adjacent_colors; 671 - for (n = 0; n < total_size; n++) { 672 - if (!expect_insert(test, &mm, &nodes[n].node, 673 - 1, 0, color++, 674 - &insert_modes[0])) { 675 - KUNIT_FAIL(test, "insert failed, step %d\n", n); 676 - goto out; 677 - } 678 - } 679 - 680 - for (mode = evict_modes; mode->name; mode++) { 681 - for (n = 1; n <= range_size; n <<= 1) { 682 - drm_random_reorder(order, range_size, &prng); 683 - if (evict_color(test, &mm, range_start, range_end, nodes, order, 684 - total_size, n, 1, color++, mode)) { 685 - KUNIT_FAIL(test, 686 - "%s evict_color(size=%u) failed for range [%x, %x]\n", 687 - mode->name, n, range_start, range_end); 688 - goto out; 689 - } 690 - } 691 - 692 - for (n = 1; n < range_size; n <<= 1) { 693 - drm_random_reorder(order, total_size, &prng); 694 - if (evict_color(test, &mm, range_start, range_end, nodes, order, 695 - total_size, range_size / 2, n, color++, mode)) { 696 - KUNIT_FAIL(test, 697 - "%s evict_color(size=%u, alignment=%u) failed for range [%x, %x]\n", 698 - mode->name, total_size / 2, n, range_start, range_end); 699 - goto out; 
700 - } 701 - } 702 - 703 - for_each_prime_number_from(n, 1, min(range_size, max_prime)) { 704 - unsigned int nsize = (range_size - n + 1) / 2; 705 - 706 - DRM_MM_BUG_ON(!nsize); 707 - 708 - drm_random_reorder(order, total_size, &prng); 709 - if (evict_color(test, &mm, range_start, range_end, nodes, order, 710 - total_size, nsize, n, color++, mode)) { 711 - KUNIT_FAIL(test, 712 - "%s evict_color(size=%u, alignment=%u) failed for range [%x, %x]\n", 713 - mode->name, nsize, n, range_start, range_end); 714 - goto out; 715 - } 716 - } 717 - 718 - cond_resched(); 719 - } 720 - 721 - out: 722 - drm_mm_for_each_node_safe(node, next, &mm) 723 - drm_mm_remove_node(node); 724 - drm_mm_takedown(&mm); 725 - kfree(order); 726 - err_nodes: 727 - vfree(nodes); 728 - } 729 - 730 - static int drm_mm_suite_init(struct kunit_suite *suite) 731 - { 732 - while (!random_seed) 733 - random_seed = get_random_u32(); 734 - 735 - kunit_info(suite, 736 - "Testing DRM range manager, with random_seed=0x%x max_iterations=%u max_prime=%u\n", 737 - random_seed, max_iterations, max_prime); 738 - 739 - return 0; 740 - } 741 - 742 - module_param(random_seed, uint, 0400); 743 - module_param(max_iterations, uint, 0400); 744 - module_param(max_prime, uint, 0400); 745 - 746 1820 static struct kunit_case drm_mm_tests[] = { 747 1821 KUNIT_CASE(drm_test_mm_init), 748 1822 KUNIT_CASE(drm_test_mm_debug), 749 - KUNIT_CASE(drm_test_mm_reserve), 750 - KUNIT_CASE(drm_test_mm_insert), 751 - KUNIT_CASE(drm_test_mm_replace), 752 - KUNIT_CASE(drm_test_mm_insert_range), 753 - KUNIT_CASE(drm_test_mm_frag), 754 - KUNIT_CASE(drm_test_mm_align), 755 1823 KUNIT_CASE(drm_test_mm_align32), 756 1824 KUNIT_CASE(drm_test_mm_align64), 757 - KUNIT_CASE(drm_test_mm_evict), 758 - KUNIT_CASE(drm_test_mm_evict_range), 759 - KUNIT_CASE(drm_test_mm_topdown), 760 - KUNIT_CASE(drm_test_mm_bottomup), 761 1825 KUNIT_CASE(drm_test_mm_lowest), 762 1826 KUNIT_CASE(drm_test_mm_highest), 763 - KUNIT_CASE(drm_test_mm_color), 764 - 
KUNIT_CASE(drm_test_mm_color_evict), 765 - KUNIT_CASE(drm_test_mm_color_evict_range), 766 1827 {} 767 1828 }; 768 1829 769 1830 static struct kunit_suite drm_mm_test_suite = { 770 1831 .name = "drm_mm", 771 - .suite_init = drm_mm_suite_init, 772 1832 .test_cases = drm_mm_tests, 773 1833 }; 774 1834

-4

drivers/gpu/drm/tidss/tidss_kms.c

···
   4    4    * Author: Tomi Valkeinen <tomi.valkeinen@ti.com>
   5    5    */
   6    6
   7      - #include <linux/dma-fence.h>
   8      -
   9    7   #include <drm/drm_atomic.h>
  10    8   #include <drm/drm_atomic_helper.h>
  11    9   #include <drm/drm_bridge.h>
···
  23   25   {
  24   26   	struct drm_device *ddev = old_state->dev;
  25   27   	struct tidss_device *tidss = to_tidss(ddev);
  26      - 	bool fence_cookie = dma_fence_begin_signalling();
  27   28
  28   29   	dev_dbg(ddev->dev, "%s\n", __func__);
  29   30
···
  33   36   	drm_atomic_helper_commit_modeset_enables(ddev, old_state);
  34   37
  35   38   	drm_atomic_helper_commit_hw_done(old_state);
  36      - 	dma_fence_end_signalling(fence_cookie);
  37   39   	drm_atomic_helper_wait_for_flip_done(ddev, old_state);
  38   40
  39   41   	drm_atomic_helper_cleanup_planes(ddev, old_state);

+2 -1

drivers/gpu/drm/tiny/cirrus.c

···
 411  411   		unsigned int offset = drm_fb_clip_offset(pitch, format, &damage);
 412  412   		struct iosys_map dst = IOSYS_MAP_INIT_OFFSET(&vaddr, offset);
 413  413
 414      - 		drm_fb_blit(&dst, &pitch, format->format, shadow_plane_state->data, fb, &damage);
      414 + 		drm_fb_blit(&dst, &pitch, format->format, shadow_plane_state->data, fb,
      415 + 			    &damage, &shadow_plane_state->fmtcnv_state);
 415  416   	}
 416  417
 417  418   	drm_dev_exit(idx);

+6 -4

drivers/gpu/drm/tiny/ili9225.c

···
  78   78   }
  79   79
  80   80   static void ili9225_fb_dirty(struct iosys_map *src, struct drm_framebuffer *fb,
  81      - 			     struct drm_rect *rect)
       81 + 			     struct drm_rect *rect, struct drm_format_conv_state *fmtcnv_state)
  82   82   {
  83   83   	struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(fb->dev);
  84   84   	unsigned int height = rect->y2 - rect->y1;
···
  98   98   	if (!dbi->dc || !full || swap ||
  99   99   	    fb->format->format == DRM_FORMAT_XRGB8888) {
 100  100   		tr = dbidev->tx_buf;
 101      - 		ret = mipi_dbi_buf_copy(tr, src, fb, rect, swap);
      101 + 		ret = mipi_dbi_buf_copy(tr, src, fb, rect, swap, fmtcnv_state);
 102  102   		if (ret)
 103  103   			goto err_msg;
 104  104   	} else {
···
 171  171   		return;
 172  172
 173  173   	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
 174      - 		ili9225_fb_dirty(&shadow_plane_state->data[0], fb, &rect);
      174 + 		ili9225_fb_dirty(&shadow_plane_state->data[0], fb, &rect,
      175 + 				 &shadow_plane_state->fmtcnv_state);
 175  176
 176  177   	drm_dev_exit(idx);
 177  178   }
···
 282  281
 283  282   	ili9225_command(dbi, ILI9225_DISPLAY_CONTROL_1, 0x1017);
 284  283
 285      - 	ili9225_fb_dirty(&shadow_plane_state->data[0], fb, &rect);
      284 + 	ili9225_fb_dirty(&shadow_plane_state->data[0], fb, &rect,
      285 + 			 &shadow_plane_state->fmtcnv_state);
 286  286
 287  287   out_exit:
 288  288   	drm_dev_exit(idx);

+15 -1

drivers/gpu/drm/tiny/ofdrm.c

···
 758  758   static int ofdrm_primary_plane_helper_atomic_check(struct drm_plane *plane,
 759  759   						   struct drm_atomic_state *new_state)
 760  760   {
      761 + 	struct drm_device *dev = plane->dev;
      762 + 	struct ofdrm_device *odev = ofdrm_device_of_dev(dev);
 761  763   	struct drm_plane_state *new_plane_state = drm_atomic_get_new_plane_state(new_state, plane);
      764 + 	struct drm_shadow_plane_state *new_shadow_plane_state =
      765 + 		to_drm_shadow_plane_state(new_plane_state);
 762  766   	struct drm_framebuffer *new_fb = new_plane_state->fb;
 763  767   	struct drm_crtc *new_crtc = new_plane_state->crtc;
 764  768   	struct drm_crtc_state *new_crtc_state = NULL;
···
 780  776   		return ret;
 781  777   	else if (!new_plane_state->visible)
 782  778   		return 0;
      779 +
      780 + 	if (new_fb->format != odev->format) {
      781 + 		void *buf;
      782 +
      783 + 		/* format conversion necessary; reserve buffer */
      784 + 		buf = drm_format_conv_state_reserve(&new_shadow_plane_state->fmtcnv_state,
      785 + 						    odev->pitch, GFP_KERNEL);
      786 + 		if (!buf)
      787 + 			return -ENOMEM;
      788 + 	}
 783  789
 784  790   	new_crtc_state = drm_atomic_get_new_crtc_state(new_state, new_plane_state->crtc);
 785  791
···
 831  817
 832  818   		iosys_map_incr(&dst, drm_fb_clip_offset(dst_pitch, dst_format, &dst_clip));
 833  819   		drm_fb_blit(&dst, &dst_pitch, dst_format->format, shadow_plane_state->data, fb,
 834      - 			    &damage);
      820 + 			    &damage, &shadow_plane_state->fmtcnv_state);
 835  821   	}
 836  822
 837  823   	drm_dev_exit(idx);

+7 -3

drivers/gpu/drm/tiny/repaper.c

···
 509  509   	epd->factored_stage_time = epd->stage_time * factor10x / 10;
 510  510   }
 511  511
 512      - static int repaper_fb_dirty(struct drm_framebuffer *fb)
      512 + static int repaper_fb_dirty(struct drm_framebuffer *fb,
      513 + 			    struct drm_format_conv_state *fmtcnv_state)
 513  514   {
 514  515   	struct drm_gem_dma_object *dma_obj = drm_fb_dma_get_gem_obj(fb, 0);
 515  516   	struct repaper_epd *epd = drm_to_epd(fb->dev);
···
 546  545
 547  546   	iosys_map_set_vaddr(&dst, buf);
 548  547   	iosys_map_set_vaddr(&vmap, dma_obj->vaddr);
 549      - 	drm_fb_xrgb8888_to_mono(&dst, &dst_pitch, &vmap, fb, &clip);
      548 + 	drm_fb_xrgb8888_to_mono(&dst, &dst_pitch, &vmap, fb, &clip, fmtcnv_state);
 550  549
 551  550   	drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
 552  551
···
 831  830   				struct drm_plane_state *old_state)
 832  831   {
 833  832   	struct drm_plane_state *state = pipe->plane.state;
      833 + 	struct drm_format_conv_state fmtcnv_state = DRM_FORMAT_CONV_STATE_INIT;
 834  834   	struct drm_rect rect;
 835  835
 836  836   	if (!pipe->crtc.state->active)
 837  837   		return;
 838  838
 839  839   	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
 840      - 		repaper_fb_dirty(state->fb);
      840 + 		repaper_fb_dirty(state->fb, &fmtcnv_state);
      841 +
      842 + 	drm_format_conv_state_release(&fmtcnv_state);
 841  843   }
 842  844
 843  845   static const struct drm_simple_display_pipe_funcs repaper_pipe_funcs = {

+41 -2

drivers/gpu/drm/tiny/simpledrm.c

···
  19   19   #include <drm/drm_drv.h>
  20   20   #include <drm/drm_fbdev_generic.h>
  21   21   #include <drm/drm_format_helper.h>
       22 + #include <drm/drm_framebuffer.h>
  22   23   #include <drm/drm_gem_atomic_helper.h>
  23   24   #include <drm/drm_gem_framebuffer_helper.h>
  24   25   #include <drm/drm_gem_shmem_helper.h>
···
 580  579   	DRM_FORMAT_MOD_INVALID
 581  580   };
 582  581
      582 + static int simpledrm_primary_plane_helper_atomic_check(struct drm_plane *plane,
      583 + 						       struct drm_atomic_state *state)
      584 + {
      585 + 	struct drm_plane_state *new_plane_state = drm_atomic_get_new_plane_state(state, plane);
      586 + 	struct drm_shadow_plane_state *new_shadow_plane_state =
      587 + 		to_drm_shadow_plane_state(new_plane_state);
      588 + 	struct drm_framebuffer *new_fb = new_plane_state->fb;
      589 + 	struct drm_crtc *new_crtc = new_plane_state->crtc;
      590 + 	struct drm_crtc_state *new_crtc_state = NULL;
      591 + 	struct drm_device *dev = plane->dev;
      592 + 	struct simpledrm_device *sdev = simpledrm_device_of_dev(dev);
      593 + 	int ret;
      594 +
      595 + 	if (new_crtc)
      596 + 		new_crtc_state = drm_atomic_get_new_crtc_state(state, new_crtc);
      597 +
      598 + 	ret = drm_atomic_helper_check_plane_state(new_plane_state, new_crtc_state,
      599 + 						  DRM_PLANE_NO_SCALING,
      600 + 						  DRM_PLANE_NO_SCALING,
      601 + 						  false, false);
      602 + 	if (ret)
      603 + 		return ret;
      604 + 	else if (!new_plane_state->visible)
      605 + 		return 0;
      606 +
      607 + 	if (new_fb->format != sdev->format) {
      608 + 		void *buf;
      609 +
      610 + 		/* format conversion necessary; reserve buffer */
      611 + 		buf = drm_format_conv_state_reserve(&new_shadow_plane_state->fmtcnv_state,
      612 + 						    sdev->pitch, GFP_KERNEL);
      613 + 		if (!buf)
      614 + 			return -ENOMEM;
      615 + 	}
      616 +
      617 + 	return 0;
      618 + }
      619 +
 583  620   static void simpledrm_primary_plane_helper_atomic_update(struct drm_plane *plane,
 584  621   							 struct drm_atomic_state *state)
 585  622   {
···
 648  609
 649  610   		iosys_map_incr(&dst, drm_fb_clip_offset(sdev->pitch, sdev->format, &dst_clip));
 650  611   		drm_fb_blit(&dst, &sdev->pitch, sdev->format->format, shadow_plane_state->data,
 651      - 			    fb, &damage);
      612 + 			    fb, &damage, &shadow_plane_state->fmtcnv_state);
 652  613   	}
 653  614
 654  615   	drm_dev_exit(idx);
···
 674  635
 675  636   static const struct drm_plane_helper_funcs simpledrm_primary_plane_helper_funcs = {
 676  637   	DRM_GEM_SHADOW_PLANE_HELPER_FUNCS,
 677      - 	.atomic_check = drm_plane_helper_atomic_check,
      638 + 	.atomic_check = simpledrm_primary_plane_helper_atomic_check,
 678  639   	.atomic_update = simpledrm_primary_plane_helper_atomic_update,
 679  640   	.atomic_disable = simpledrm_primary_plane_helper_atomic_disable,
 680  641   };

+11 -8

drivers/gpu/drm/tiny/st7586.c

···
  64   64
  65   65   static void st7586_xrgb8888_to_gray332(u8 *dst, void *vaddr,
  66   66   				       struct drm_framebuffer *fb,
  67      - 				       struct drm_rect *clip)
       67 + 				       struct drm_rect *clip,
       68 + 				       struct drm_format_conv_state *fmtcnv_state)
  68   69   {
  69   70   	size_t len = (clip->x2 - clip->x1) * (clip->y2 - clip->y1);
  70   71   	unsigned int x, y;
···
  78   77
  79   78   	iosys_map_set_vaddr(&dst_map, buf);
  80   79   	iosys_map_set_vaddr(&vmap, vaddr);
  81      - 	drm_fb_xrgb8888_to_gray8(&dst_map, NULL, &vmap, fb, clip);
       80 + 	drm_fb_xrgb8888_to_gray8(&dst_map, NULL, &vmap, fb, clip, fmtcnv_state);
  82   81   	src = buf;
  83   82
  84   83   	for (y = clip->y1; y < clip->y2; y++) {
···
  94   93   }
  95   94
  96   95   static int st7586_buf_copy(void *dst, struct iosys_map *src, struct drm_framebuffer *fb,
  97      - 			   struct drm_rect *clip)
       96 + 			   struct drm_rect *clip, struct drm_format_conv_state *fmtcnv_state)
  98   97   {
  99   98   	int ret;
 100   99
···
 102  101   	if (ret)
 103  102   		return ret;
 104  103
 105      - 	st7586_xrgb8888_to_gray332(dst, src->vaddr, fb, clip);
      104 + 	st7586_xrgb8888_to_gray332(dst, src->vaddr, fb, clip, fmtcnv_state);
 106  105
 107  106   	drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
 108  107
···
 110  109   }
 111  110
 112  111   static void st7586_fb_dirty(struct iosys_map *src, struct drm_framebuffer *fb,
 113      - 			    struct drm_rect *rect)
      112 + 			    struct drm_rect *rect, struct drm_format_conv_state *fmtcnv_state)
 114  113   {
 115  114   	struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(fb->dev);
 116  115   	struct mipi_dbi *dbi = &dbidev->dbi;
···
 122  121
 123  122   	DRM_DEBUG_KMS("Flushing [FB:%d] " DRM_RECT_FMT "\n", fb->base.id, DRM_RECT_ARG(rect));
 124  123
 125      - 	ret = st7586_buf_copy(dbidev->tx_buf, src, fb, rect);
      124 + 	ret = st7586_buf_copy(dbidev->tx_buf, src, fb, rect, fmtcnv_state);
 126  125   	if (ret)
 127  126   		goto err_msg;
 128  127
···
 161  160   		return;
 162  161
 163  162   	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
 164      - 		st7586_fb_dirty(&shadow_plane_state->data[0], fb, &rect);
      163 + 		st7586_fb_dirty(&shadow_plane_state->data[0], fb, &rect,
      164 + 				&shadow_plane_state->fmtcnv_state);
 165  165
 166  166   	drm_dev_exit(idx);
 167  167   }
···
 240  238
 241  239   	msleep(100);
 242  240
 243      - 	st7586_fb_dirty(&shadow_plane_state->data[0], fb, &rect);
      241 + 	st7586_fb_dirty(&shadow_plane_state->data[0], fb, &rect,
      242 + 			&shadow_plane_state->fmtcnv_state);
 244  243
 245  244   	mipi_dbi_command(dbi, MIPI_DCS_SET_DISPLAY_ON);
 246  245   out_exit:

+2 -1

drivers/gpu/drm/v3d/Makefile

···
  11   11   	v3d_mmu.o \
  12   12   	v3d_perfmon.o \
  13   13   	v3d_trace_points.o \
  14      - 	v3d_sched.o
       14 + 	v3d_sched.o \
       15 + 	v3d_sysfs.o
  15   16
  16   17   v3d-$(CONFIG_DEBUG_FS) += v3d_debugfs.o
  17   18

+97 -73

drivers/gpu/drm/v3d/v3d_debugfs.c

··· 12 12 #include "v3d_drv.h" 13 13 #include "v3d_regs.h" 14 14 15 - #define REGDEF(reg) { reg, #reg } 15 + #define REGDEF(min_ver, max_ver, reg) { min_ver, max_ver, reg, #reg } 16 16 struct v3d_reg_def { 17 + u32 min_ver; 18 + u32 max_ver; 17 19 u32 reg; 18 20 const char *name; 19 21 }; 20 22 21 23 static const struct v3d_reg_def v3d_hub_reg_defs[] = { 22 - REGDEF(V3D_HUB_AXICFG), 23 - REGDEF(V3D_HUB_UIFCFG), 24 - REGDEF(V3D_HUB_IDENT0), 25 - REGDEF(V3D_HUB_IDENT1), 26 - REGDEF(V3D_HUB_IDENT2), 27 - REGDEF(V3D_HUB_IDENT3), 28 - REGDEF(V3D_HUB_INT_STS), 29 - REGDEF(V3D_HUB_INT_MSK_STS), 24 + REGDEF(33, 42, V3D_HUB_AXICFG), 25 + REGDEF(33, 71, V3D_HUB_UIFCFG), 26 + REGDEF(33, 71, V3D_HUB_IDENT0), 27 + REGDEF(33, 71, V3D_HUB_IDENT1), 28 + REGDEF(33, 71, V3D_HUB_IDENT2), 29 + REGDEF(33, 71, V3D_HUB_IDENT3), 30 + REGDEF(33, 71, V3D_HUB_INT_STS), 31 + REGDEF(33, 71, V3D_HUB_INT_MSK_STS), 30 32 31 - REGDEF(V3D_MMU_CTL), 32 - REGDEF(V3D_MMU_VIO_ADDR), 33 - REGDEF(V3D_MMU_VIO_ID), 34 - REGDEF(V3D_MMU_DEBUG_INFO), 33 + REGDEF(33, 71, V3D_MMU_CTL), 34 + REGDEF(33, 71, V3D_MMU_VIO_ADDR), 35 + REGDEF(33, 71, V3D_MMU_VIO_ID), 36 + REGDEF(33, 71, V3D_MMU_DEBUG_INFO), 37 + 38 + REGDEF(71, 71, V3D_GMP_STATUS(71)), 39 + REGDEF(71, 71, V3D_GMP_CFG(71)), 40 + REGDEF(71, 71, V3D_GMP_VIO_ADDR(71)), 35 41 }; 36 42 37 43 static const struct v3d_reg_def v3d_gca_reg_defs[] = { 38 - REGDEF(V3D_GCA_SAFE_SHUTDOWN), 39 - REGDEF(V3D_GCA_SAFE_SHUTDOWN_ACK), 44 + REGDEF(33, 33, V3D_GCA_SAFE_SHUTDOWN), 45 + REGDEF(33, 33, V3D_GCA_SAFE_SHUTDOWN_ACK), 40 46 }; 41 47 42 48 static const struct v3d_reg_def v3d_core_reg_defs[] = { 43 - REGDEF(V3D_CTL_IDENT0), 44 - REGDEF(V3D_CTL_IDENT1), 45 - REGDEF(V3D_CTL_IDENT2), 46 - REGDEF(V3D_CTL_MISCCFG), 47 - REGDEF(V3D_CTL_INT_STS), 48 - REGDEF(V3D_CTL_INT_MSK_STS), 49 - REGDEF(V3D_CLE_CT0CS), 50 - REGDEF(V3D_CLE_CT0CA), 51 - REGDEF(V3D_CLE_CT0EA), 52 - REGDEF(V3D_CLE_CT1CS), 53 - REGDEF(V3D_CLE_CT1CA), 54 - REGDEF(V3D_CLE_CT1EA), 49 + REGDEF(33, 71, 
V3D_CTL_IDENT0), 50 + REGDEF(33, 71, V3D_CTL_IDENT1), 51 + REGDEF(33, 71, V3D_CTL_IDENT2), 52 + REGDEF(33, 71, V3D_CTL_MISCCFG), 53 + REGDEF(33, 71, V3D_CTL_INT_STS), 54 + REGDEF(33, 71, V3D_CTL_INT_MSK_STS), 55 + REGDEF(33, 71, V3D_CLE_CT0CS), 56 + REGDEF(33, 71, V3D_CLE_CT0CA), 57 + REGDEF(33, 71, V3D_CLE_CT0EA), 58 + REGDEF(33, 71, V3D_CLE_CT1CS), 59 + REGDEF(33, 71, V3D_CLE_CT1CA), 60 + REGDEF(33, 71, V3D_CLE_CT1EA), 55 61 56 - REGDEF(V3D_PTB_BPCA), 57 - REGDEF(V3D_PTB_BPCS), 62 + REGDEF(33, 71, V3D_PTB_BPCA), 63 + REGDEF(33, 71, V3D_PTB_BPCS), 58 64 59 - REGDEF(V3D_GMP_STATUS), 60 - REGDEF(V3D_GMP_CFG), 61 - REGDEF(V3D_GMP_VIO_ADDR), 65 + REGDEF(33, 41, V3D_GMP_STATUS(33)), 66 + REGDEF(33, 41, V3D_GMP_CFG(33)), 67 + REGDEF(33, 41, V3D_GMP_VIO_ADDR(33)), 62 68 63 - REGDEF(V3D_ERR_FDBGO), 64 - REGDEF(V3D_ERR_FDBGB), 65 - REGDEF(V3D_ERR_FDBGS), 66 - REGDEF(V3D_ERR_STAT), 69 + REGDEF(33, 71, V3D_ERR_FDBGO), 70 + REGDEF(33, 71, V3D_ERR_FDBGB), 71 + REGDEF(33, 71, V3D_ERR_FDBGS), 72 + REGDEF(33, 71, V3D_ERR_STAT), 67 73 }; 68 74 69 75 static const struct v3d_reg_def v3d_csd_reg_defs[] = { 70 - REGDEF(V3D_CSD_STATUS), 71 - REGDEF(V3D_CSD_CURRENT_CFG0), 72 - REGDEF(V3D_CSD_CURRENT_CFG1), 73 - REGDEF(V3D_CSD_CURRENT_CFG2), 74 - REGDEF(V3D_CSD_CURRENT_CFG3), 75 - REGDEF(V3D_CSD_CURRENT_CFG4), 76 - REGDEF(V3D_CSD_CURRENT_CFG5), 77 - REGDEF(V3D_CSD_CURRENT_CFG6), 76 + REGDEF(41, 71, V3D_CSD_STATUS), 77 + REGDEF(41, 41, V3D_CSD_CURRENT_CFG0(41)), 78 + REGDEF(41, 41, V3D_CSD_CURRENT_CFG1(41)), 79 + REGDEF(41, 41, V3D_CSD_CURRENT_CFG2(41)), 80 + REGDEF(41, 41, V3D_CSD_CURRENT_CFG3(41)), 81 + REGDEF(41, 41, V3D_CSD_CURRENT_CFG4(41)), 82 + REGDEF(41, 41, V3D_CSD_CURRENT_CFG5(41)), 83 + REGDEF(41, 41, V3D_CSD_CURRENT_CFG6(41)), 84 + REGDEF(71, 71, V3D_CSD_CURRENT_CFG0(71)), 85 + REGDEF(71, 71, V3D_CSD_CURRENT_CFG1(71)), 86 + REGDEF(71, 71, V3D_CSD_CURRENT_CFG2(71)), 87 + REGDEF(71, 71, V3D_CSD_CURRENT_CFG3(71)), 88 + REGDEF(71, 71, V3D_CSD_CURRENT_CFG4(71)), 89 + REGDEF(71, 71, 
V3D_CSD_CURRENT_CFG5(71)), 90 + REGDEF(71, 71, V3D_CSD_CURRENT_CFG6(71)), 91 + REGDEF(71, 71, V3D_V7_CSD_CURRENT_CFG7), 78 92 }; 79 93 80 94 static int v3d_v3d_debugfs_regs(struct seq_file *m, void *unused) ··· 99 85 int i, core; 100 86 101 87 for (i = 0; i < ARRAY_SIZE(v3d_hub_reg_defs); i++) { 102 - seq_printf(m, "%s (0x%04x): 0x%08x\n", 103 - v3d_hub_reg_defs[i].name, v3d_hub_reg_defs[i].reg, 104 - V3D_READ(v3d_hub_reg_defs[i].reg)); 88 + const struct v3d_reg_def *def = &v3d_hub_reg_defs[i]; 89 + 90 + if (v3d->ver >= def->min_ver && v3d->ver <= def->max_ver) { 91 + seq_printf(m, "%s (0x%04x): 0x%08x\n", 92 + def->name, def->reg, V3D_READ(def->reg)); 93 + } 105 94 } 106 95 107 - if (v3d->ver < 41) { 108 - for (i = 0; i < ARRAY_SIZE(v3d_gca_reg_defs); i++) { 96 + for (i = 0; i < ARRAY_SIZE(v3d_gca_reg_defs); i++) { 97 + const struct v3d_reg_def *def = &v3d_gca_reg_defs[i]; 98 + 99 + if (v3d->ver >= def->min_ver && v3d->ver <= def->max_ver) { 109 100 seq_printf(m, "%s (0x%04x): 0x%08x\n", 110 - v3d_gca_reg_defs[i].name, 111 - v3d_gca_reg_defs[i].reg, 112 - V3D_GCA_READ(v3d_gca_reg_defs[i].reg)); 101 + def->name, def->reg, V3D_GCA_READ(def->reg)); 113 102 } 114 103 } 115 104 116 105 for (core = 0; core < v3d->cores; core++) { 117 106 for (i = 0; i < ARRAY_SIZE(v3d_core_reg_defs); i++) { 118 - seq_printf(m, "core %d %s (0x%04x): 0x%08x\n", 119 - core, 120 - v3d_core_reg_defs[i].name, 121 - v3d_core_reg_defs[i].reg, 122 - V3D_CORE_READ(core, 123 - v3d_core_reg_defs[i].reg)); 107 + const struct v3d_reg_def *def = &v3d_core_reg_defs[i]; 108 + 109 + if (v3d->ver >= def->min_ver && v3d->ver <= def->max_ver) { 110 + seq_printf(m, "core %d %s (0x%04x): 0x%08x\n", 111 + core, def->name, def->reg, 112 + V3D_CORE_READ(core, def->reg)); 113 + } 124 114 } 125 115 126 - if (v3d_has_csd(v3d)) { 127 - for (i = 0; i < ARRAY_SIZE(v3d_csd_reg_defs); i++) { 116 + for (i = 0; i < ARRAY_SIZE(v3d_csd_reg_defs); i++) { 117 + const struct v3d_reg_def *def = &v3d_csd_reg_defs[i]; 118 + 119 + 
if (v3d->ver >= def->min_ver && v3d->ver <= def->max_ver) { 128 120 seq_printf(m, "core %d %s (0x%04x): 0x%08x\n", 129 - core, 130 - v3d_csd_reg_defs[i].name, 131 - v3d_csd_reg_defs[i].reg, 132 - V3D_CORE_READ(core, 133 - v3d_csd_reg_defs[i].reg)); 121 + core, def->name, def->reg, 122 + V3D_CORE_READ(core, def->reg)); 134 123 } 135 124 } 136 125 } ··· 164 147 str_yes_no(ident2 & V3D_HUB_IDENT2_WITH_MMU)); 165 148 seq_printf(m, "TFU: %s\n", 166 149 str_yes_no(ident1 & V3D_HUB_IDENT1_WITH_TFU)); 167 - seq_printf(m, "TSY: %s\n", 168 - str_yes_no(ident1 & V3D_HUB_IDENT1_WITH_TSY)); 150 + if (v3d->ver <= 42) { 151 + seq_printf(m, "TSY: %s\n", 152 + str_yes_no(ident1 & V3D_HUB_IDENT1_WITH_TSY)); 153 + } 169 154 seq_printf(m, "MSO: %s\n", 170 155 str_yes_no(ident1 & V3D_HUB_IDENT1_WITH_MSO)); 171 156 seq_printf(m, "L3C: %s (%dkb)\n", ··· 196 177 seq_printf(m, " QPUs: %d\n", nslc * qups); 197 178 seq_printf(m, " Semaphores: %d\n", 198 179 V3D_GET_FIELD(ident1, V3D_IDENT1_NSEM)); 199 - seq_printf(m, " BCG int: %d\n", 200 - (ident2 & V3D_IDENT2_BCG_INT) != 0); 201 - seq_printf(m, " Override TMU: %d\n", 202 - (misccfg & V3D_MISCCFG_OVRTMUOUT) != 0); 180 + if (v3d->ver <= 42) { 181 + seq_printf(m, " BCG int: %d\n", 182 + (ident2 & V3D_IDENT2_BCG_INT) != 0); 183 + } 184 + if (v3d->ver < 40) { 185 + seq_printf(m, " Override TMU: %d\n", 186 + (misccfg & V3D_MISCCFG_OVRTMUOUT) != 0); 187 + } 203 188 } 204 189 205 190 return 0; ··· 235 212 int measure_ms = 1000; 236 213 237 214 if (v3d->ver >= 40) { 215 + int cycle_count_reg = V3D_PCTR_CYCLE_COUNT(v3d->ver); 238 216 V3D_CORE_WRITE(core, V3D_V4_PCTR_0_SRC_0_3, 239 - V3D_SET_FIELD(V3D_PCTR_CYCLE_COUNT, 217 + V3D_SET_FIELD(cycle_count_reg, 240 218 V3D_PCTR_S0)); 241 219 V3D_CORE_WRITE(core, V3D_V4_PCTR_0_CLR, 1); 242 220 V3D_CORE_WRITE(core, V3D_V4_PCTR_0_EN, 1); 243 221 } else { 244 222 V3D_CORE_WRITE(core, V3D_V3_PCTR_0_PCTRS0, 245 - V3D_PCTR_CYCLE_COUNT); 223 + V3D_PCTR_CYCLE_COUNT(v3d->ver)); 246 224 V3D_CORE_WRITE(core, 
V3D_V3_PCTR_0_CLR, 1); 247 225 V3D_CORE_WRITE(core, V3D_V3_PCTR_0_EN, 248 226 V3D_V3_PCTR_0_EN_ENABLE |
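Note on the v3d_debugfs.c change above: each table entry now carries an inclusive [min_ver, max_ver] range, and the dump loops skip entries whose range excludes v3d->ver. The following is a minimal standalone sketch of that filter, not the driver code. The DEMO_* offsets are placeholders and reg_visible()/count_visible() are hypothetical helper names; only the version ranges are taken from the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as struct v3d_reg_def in the diff. */
struct reg_def {
	uint32_t min_ver;
	uint32_t max_ver;
	uint32_t reg;
	const char *name;
};

#define REGDEF(min_ver, max_ver, reg) { min_ver, max_ver, reg, #reg }

/* Placeholder offsets for illustration only (not the real register map). */
#define DEMO_HUB_AXICFG   0x0000
#define DEMO_HUB_UIFCFG   0x0004
#define DEMO_GMP_STATUS_7 0x0008

static const struct reg_def demo_regs[] = {
	REGDEF(33, 42, DEMO_HUB_AXICFG),   /* dropped on 7.1 */
	REGDEF(33, 71, DEMO_HUB_UIFCFG),   /* present on every version */
	REGDEF(71, 71, DEMO_GMP_STATUS_7), /* 7.1 only */
};

/* Inclusive range check, as in the debugfs loops. */
static int reg_visible(const struct reg_def *def, uint32_t ver)
{
	return ver >= def->min_ver && ver <= def->max_ver;
}

/* Number of table entries that would be dumped for a given version. */
static size_t count_visible(uint32_t ver)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(demo_regs) / sizeof(demo_regs[0]); i++)
		if (reg_visible(&demo_regs[i], ver))
			n++;
	return n;
}
```

Because the range lives in the table, the separate `v3d->ver < 41` and v3d_has_csd() guards around the GCA and CSD loops become unnecessary, which matches their removal in the diff.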

+45 -1

drivers/gpu/drm/v3d/v3d_drv.c

···
 #include <linux/module.h>
 #include <linux/of_platform.h>
 #include <linux/platform_device.h>
+#include <linux/sched/clock.h>
 #include <linux/reset.h>

 #include <drm/drm_drv.h>
···
 	v3d_priv->v3d = v3d;

 	for (i = 0; i < V3D_MAX_QUEUES; i++) {
+		v3d_priv->enabled_ns[i] = 0;
+		v3d_priv->start_ns[i] = 0;
+		v3d_priv->jobs_sent[i] = 0;
+
 		sched = &v3d->queue[i].sched;
 		drm_sched_entity_init(&v3d_priv->sched_entity[i],
 				      DRM_SCHED_PRIORITY_NORMAL, &sched,
···
 	kfree(v3d_priv);
 }

-DEFINE_DRM_GEM_FOPS(v3d_drm_fops);
+static void v3d_show_fdinfo(struct drm_printer *p, struct drm_file *file)
+{
+	struct v3d_file_priv *file_priv = file->driver_priv;
+	u64 timestamp = local_clock();
+	enum v3d_queue queue;
+
+	for (queue = 0; queue < V3D_MAX_QUEUES; queue++) {
+		/* Note that, in case of a GPU reset, the time spent during an
+		 * attempt of executing the job is not computed in the runtime.
+		 */
+		drm_printf(p, "drm-engine-%s: \t%llu ns\n",
+			   v3d_queue_to_string(queue),
+			   file_priv->start_ns[queue] ? file_priv->enabled_ns[queue]
+						      + timestamp - file_priv->start_ns[queue]
+						      : file_priv->enabled_ns[queue]);
+
+		/* Note that we only count jobs that completed. Therefore, jobs
+		 * that were resubmitted due to a GPU reset are not computed.
+		 */
+		drm_printf(p, "v3d-jobs-%s: \t%llu jobs\n",
+			   v3d_queue_to_string(queue), file_priv->jobs_sent[queue]);
+	}
+}
+
+static const struct file_operations v3d_drm_fops = {
+	.owner = THIS_MODULE,
+	DRM_GEM_FOPS,
+	.show_fdinfo = drm_show_fdinfo,
+};

 /* DRM_AUTH is required on SUBMIT_CL for now, while we don't have GMP
  * protection between clients. Note that render nodes would be
···
 	.ioctls = v3d_drm_ioctls,
 	.num_ioctls = ARRAY_SIZE(v3d_drm_ioctls),
 	.fops = &v3d_drm_fops,
+	.show_fdinfo = v3d_show_fdinfo,

 	.name = DRIVER_NAME,
 	.desc = DRIVER_DESC,
···

 static const struct of_device_id v3d_of_match[] = {
 	{ .compatible = "brcm,2711-v3d" },
+	{ .compatible = "brcm,2712-v3d" },
 	{ .compatible = "brcm,7268-v3d" },
 	{ .compatible = "brcm,7278-v3d" },
 	{},
···
 	if (ret)
 		goto irq_disable;

+	ret = v3d_sysfs_init(dev);
+	if (ret)
+		goto drm_unregister;
+
 	return 0;

+drm_unregister:
+	drm_dev_unregister(drm);
 irq_disable:
 	v3d_irq_disable(v3d);
 gem_destroy:
···
 {
 	struct drm_device *drm = platform_get_drvdata(pdev);
 	struct v3d_dev *v3d = to_v3d_dev(drm);
+	struct device *dev = &pdev->dev;
+
+	v3d_sysfs_destroy(dev);

 	drm_dev_unregister(drm);

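Note on v3d_show_fdinfo() above: the value printed for each drm-engine-<queue> key is the accumulated time of completed jobs plus, if a job is currently running, the time elapsed since it started. A minimal standalone sketch of that expression follows. The struct and function names (fdinfo_stats, engine_busy_ns) are hypothetical, and the caller passes `now` in place of local_clock().

```c
#include <stdint.h>

/* Per-queue counters, named after the fields added to struct v3d_file_priv. */
struct fdinfo_stats {
	uint64_t start_ns;   /* non-zero while a job is running on the queue */
	uint64_t enabled_ns; /* busy time accumulated from completed jobs */
};

/* Busy time to report at timestamp `now`, mirroring the ternary in the diff. */
static uint64_t engine_busy_ns(const struct fdinfo_stats *s, uint64_t now)
{
	if (s->start_ns)
		return s->enabled_ns + now - s->start_ns;
	return s->enabled_ns;
}
```

Since start_ns == 0 doubles as the "idle" marker, a job whose start timestamp happened to be exactly 0 would be reported as idle. With a monotonically increasing clock this seems unlikely to matter in practice.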

+31

drivers/gpu/drm/v3d/v3d_drv.h

···

 #define V3D_MAX_QUEUES (V3D_CACHE_CLEAN + 1)

+static inline char *v3d_queue_to_string(enum v3d_queue queue)
+{
+	switch (queue) {
+	case V3D_BIN: return "bin";
+	case V3D_RENDER: return "render";
+	case V3D_TFU: return "tfu";
+	case V3D_CSD: return "csd";
+	case V3D_CACHE_CLEAN: return "cache_clean";
+	}
+	return "UNKNOWN";
+}
+
 struct v3d_queue_state {
 	struct drm_gpu_scheduler sched;

 	u64 fence_context;
 	u64 emit_seqno;
+
+	u64 start_ns;
+	u64 enabled_ns;
+	u64 jobs_sent;
 };

 /* Performance monitor object. The perform lifetime is controlled by userspace
···
 	} perfmon;

 	struct drm_sched_entity sched_entity[V3D_MAX_QUEUES];
+
+	u64 start_ns[V3D_MAX_QUEUES];
+
+	u64 enabled_ns[V3D_MAX_QUEUES];
+
+	u64 jobs_sent[V3D_MAX_QUEUES];
 };

 struct v3d_bo {
···
 	 * NULL otherwise.
 	 */
 	struct v3d_perfmon *perfmon;
+
+	/* File descriptor of the process that submitted the job that could be used
+	 * for collecting stats by process of GPU usage.
+	 */
+	struct drm_file *file;

 	/* Callback for the freeing of the job on refcount going to 0. */
 	void (*free)(struct kref *ref);
···
 				 struct drm_file *file_priv);
 int v3d_perfmon_get_values_ioctl(struct drm_device *dev, void *data,
 				 struct drm_file *file_priv);
+
+/* v3d_sysfs.c */
+int v3d_sysfs_init(struct device *dev);
+void v3d_sysfs_destroy(struct device *dev);

+11 -4

drivers/gpu/drm/v3d/v3d_gem.c

···
 static void
 v3d_idle_axi(struct v3d_dev *v3d, int core)
 {
-	V3D_CORE_WRITE(core, V3D_GMP_CFG, V3D_GMP_CFG_STOP_REQ);
+	V3D_CORE_WRITE(core, V3D_GMP_CFG(v3d->ver), V3D_GMP_CFG_STOP_REQ);

-	if (wait_for((V3D_CORE_READ(core, V3D_GMP_STATUS) &
+	if (wait_for((V3D_CORE_READ(core, V3D_GMP_STATUS(v3d->ver)) &
 		      (V3D_GMP_STATUS_RD_COUNT_MASK |
 		       V3D_GMP_STATUS_WR_COUNT_MASK |
 		       V3D_GMP_STATUS_CFG_BUSY)) == 0, 100)) {
···
 	job = *container;
 	job->v3d = v3d;
 	job->free = free;
+	job->file = file_priv;

 	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
-				 v3d_priv);
+				 1, v3d_priv);
 	if (ret)
 		goto fail;

···
 	u32 pt_size = 4096 * 1024;
 	int ret, i;

-	for (i = 0; i < V3D_MAX_QUEUES; i++)
+	for (i = 0; i < V3D_MAX_QUEUES; i++) {
 		v3d->queue[i].fence_context = dma_fence_context_alloc(1);
+		v3d->queue[i].start_ns = 0;
+		v3d->queue[i].enabled_ns = 0;
+		v3d->queue[i].jobs_sent = 0;
+	}

 	spin_lock_init(&v3d->mm_lock);
 	spin_lock_init(&v3d->job_lock);
···
 	 */
 	WARN_ON(v3d->bin_job);
 	WARN_ON(v3d->render_job);
+	WARN_ON(v3d->tfu_job);
+	WARN_ON(v3d->csd_job);

 	drm_mm_takedown(&v3d->mm);


+74 -19

drivers/gpu/drm/v3d/v3d_irq.c

··· 14 14 */ 15 15 16 16 #include <linux/platform_device.h> 17 + #include <linux/sched/clock.h> 17 18 18 19 #include "v3d_drv.h" 19 20 #include "v3d_regs.h" 20 21 #include "v3d_trace.h" 21 22 22 - #define V3D_CORE_IRQS ((u32)(V3D_INT_OUTOMEM | \ 23 - V3D_INT_FLDONE | \ 24 - V3D_INT_FRDONE | \ 25 - V3D_INT_CSDDONE | \ 26 - V3D_INT_GMPV)) 23 + #define V3D_CORE_IRQS(ver) ((u32)(V3D_INT_OUTOMEM | \ 24 + V3D_INT_FLDONE | \ 25 + V3D_INT_FRDONE | \ 26 + V3D_INT_CSDDONE(ver) | \ 27 + (ver < 71 ? V3D_INT_GMPV : 0))) 27 28 28 - #define V3D_HUB_IRQS ((u32)(V3D_HUB_INT_MMU_WRV | \ 29 - V3D_HUB_INT_MMU_PTI | \ 30 - V3D_HUB_INT_MMU_CAP | \ 31 - V3D_HUB_INT_TFUC)) 29 + #define V3D_HUB_IRQS(ver) ((u32)(V3D_HUB_INT_MMU_WRV | \ 30 + V3D_HUB_INT_MMU_PTI | \ 31 + V3D_HUB_INT_MMU_CAP | \ 32 + V3D_HUB_INT_TFUC | \ 33 + (ver >= 71 ? V3D_V7_HUB_INT_GMPV : 0))) 32 34 33 35 static irqreturn_t 34 36 v3d_hub_irq(int irq, void *arg); ··· 102 100 if (intsts & V3D_INT_FLDONE) { 103 101 struct v3d_fence *fence = 104 102 to_v3d_fence(v3d->bin_job->base.irq_fence); 103 + struct v3d_file_priv *file = v3d->bin_job->base.file->driver_priv; 104 + u64 runtime = local_clock() - file->start_ns[V3D_BIN]; 105 + 106 + file->enabled_ns[V3D_BIN] += local_clock() - file->start_ns[V3D_BIN]; 107 + file->jobs_sent[V3D_BIN]++; 108 + v3d->queue[V3D_BIN].jobs_sent++; 109 + 110 + file->start_ns[V3D_BIN] = 0; 111 + v3d->queue[V3D_BIN].start_ns = 0; 112 + 113 + file->enabled_ns[V3D_BIN] += runtime; 114 + v3d->queue[V3D_BIN].enabled_ns += runtime; 105 115 106 116 trace_v3d_bcl_irq(&v3d->drm, fence->seqno); 107 117 dma_fence_signal(&fence->base); ··· 123 109 if (intsts & V3D_INT_FRDONE) { 124 110 struct v3d_fence *fence = 125 111 to_v3d_fence(v3d->render_job->base.irq_fence); 112 + struct v3d_file_priv *file = v3d->render_job->base.file->driver_priv; 113 + u64 runtime = local_clock() - file->start_ns[V3D_RENDER]; 114 + 115 + file->enabled_ns[V3D_RENDER] += local_clock() - file->start_ns[V3D_RENDER]; 116 + 
file->jobs_sent[V3D_RENDER]++; 117 + v3d->queue[V3D_RENDER].jobs_sent++; 118 + 119 + file->start_ns[V3D_RENDER] = 0; 120 + v3d->queue[V3D_RENDER].start_ns = 0; 121 + 122 + file->enabled_ns[V3D_RENDER] += runtime; 123 + v3d->queue[V3D_RENDER].enabled_ns += runtime; 126 124 127 125 trace_v3d_rcl_irq(&v3d->drm, fence->seqno); 128 126 dma_fence_signal(&fence->base); 129 127 status = IRQ_HANDLED; 130 128 } 131 129 132 - if (intsts & V3D_INT_CSDDONE) { 130 + if (intsts & V3D_INT_CSDDONE(v3d->ver)) { 133 131 struct v3d_fence *fence = 134 132 to_v3d_fence(v3d->csd_job->base.irq_fence); 133 + struct v3d_file_priv *file = v3d->csd_job->base.file->driver_priv; 134 + u64 runtime = local_clock() - file->start_ns[V3D_CSD]; 135 + 136 + file->enabled_ns[V3D_CSD] += local_clock() - file->start_ns[V3D_CSD]; 137 + file->jobs_sent[V3D_CSD]++; 138 + v3d->queue[V3D_CSD].jobs_sent++; 139 + 140 + file->start_ns[V3D_CSD] = 0; 141 + v3d->queue[V3D_CSD].start_ns = 0; 142 + 143 + file->enabled_ns[V3D_CSD] += runtime; 144 + v3d->queue[V3D_CSD].enabled_ns += runtime; 135 145 136 146 trace_v3d_csd_irq(&v3d->drm, fence->seqno); 137 147 dma_fence_signal(&fence->base); ··· 165 127 /* We shouldn't be triggering these if we have GMP in 166 128 * always-allowed mode. 
167 129 */ 168 - if (intsts & V3D_INT_GMPV) 130 + if (v3d->ver < 71 && (intsts & V3D_INT_GMPV)) 169 131 dev_err(v3d->drm.dev, "GMP violation\n"); 170 132 171 133 /* V3D 4.2 wires the hub and core IRQs together, so if we & ··· 192 154 if (intsts & V3D_HUB_INT_TFUC) { 193 155 struct v3d_fence *fence = 194 156 to_v3d_fence(v3d->tfu_job->base.irq_fence); 157 + struct v3d_file_priv *file = v3d->tfu_job->base.file->driver_priv; 158 + u64 runtime = local_clock() - file->start_ns[V3D_TFU]; 159 + 160 + file->enabled_ns[V3D_TFU] += local_clock() - file->start_ns[V3D_TFU]; 161 + file->jobs_sent[V3D_TFU]++; 162 + v3d->queue[V3D_TFU].jobs_sent++; 163 + 164 + file->start_ns[V3D_TFU] = 0; 165 + v3d->queue[V3D_TFU].start_ns = 0; 166 + 167 + file->enabled_ns[V3D_TFU] += runtime; 168 + v3d->queue[V3D_TFU].enabled_ns += runtime; 195 169 196 170 trace_v3d_tfu_irq(&v3d->drm, fence->seqno); 197 171 dma_fence_signal(&fence->base); ··· 247 197 status = IRQ_HANDLED; 248 198 } 249 199 200 + if (v3d->ver >= 71 && (intsts & V3D_V7_HUB_INT_GMPV)) { 201 + dev_err(v3d->drm.dev, "GMP Violation\n"); 202 + status = IRQ_HANDLED; 203 + } 204 + 250 205 return status; 251 206 } 252 207 ··· 266 211 * for us. 267 212 */ 268 213 for (core = 0; core < v3d->cores; core++) 269 - V3D_CORE_WRITE(core, V3D_CTL_INT_CLR, V3D_CORE_IRQS); 270 - V3D_WRITE(V3D_HUB_INT_CLR, V3D_HUB_IRQS); 214 + V3D_CORE_WRITE(core, V3D_CTL_INT_CLR, V3D_CORE_IRQS(v3d->ver)); 215 + V3D_WRITE(V3D_HUB_INT_CLR, V3D_HUB_IRQS(v3d->ver)); 271 216 272 217 irq1 = platform_get_irq_optional(v3d_to_pdev(v3d), 1); 273 218 if (irq1 == -EPROBE_DEFER) ··· 311 256 312 257 /* Enable our set of interrupts, masking out any others. 
*/ 313 258 for (core = 0; core < v3d->cores; core++) { 314 - V3D_CORE_WRITE(core, V3D_CTL_INT_MSK_SET, ~V3D_CORE_IRQS); 315 - V3D_CORE_WRITE(core, V3D_CTL_INT_MSK_CLR, V3D_CORE_IRQS); 259 + V3D_CORE_WRITE(core, V3D_CTL_INT_MSK_SET, ~V3D_CORE_IRQS(v3d->ver)); 260 + V3D_CORE_WRITE(core, V3D_CTL_INT_MSK_CLR, V3D_CORE_IRQS(v3d->ver)); 316 261 } 317 262 318 - V3D_WRITE(V3D_HUB_INT_MSK_SET, ~V3D_HUB_IRQS); 319 - V3D_WRITE(V3D_HUB_INT_MSK_CLR, V3D_HUB_IRQS); 263 + V3D_WRITE(V3D_HUB_INT_MSK_SET, ~V3D_HUB_IRQS(v3d->ver)); 264 + V3D_WRITE(V3D_HUB_INT_MSK_CLR, V3D_HUB_IRQS(v3d->ver)); 320 265 } 321 266 322 267 void ··· 331 276 332 277 /* Clear any pending interrupts we might have left. */ 333 278 for (core = 0; core < v3d->cores; core++) 334 - V3D_CORE_WRITE(core, V3D_CTL_INT_CLR, V3D_CORE_IRQS); 335 - V3D_WRITE(V3D_HUB_INT_CLR, V3D_HUB_IRQS); 279 + V3D_CORE_WRITE(core, V3D_CTL_INT_CLR, V3D_CORE_IRQS(v3d->ver)); 280 + V3D_WRITE(V3D_HUB_INT_CLR, V3D_HUB_IRQS(v3d->ver)); 336 281 337 282 cancel_work_sync(&v3d->overflow_mem_work); 338 283 }
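Note on the completion accounting in v3d_irq.c above: each "done" branch computes the runtime from start_ns, bumps the job counter, and clears start_ns. As the hunks read, file->enabled_ns[...] is incremented twice per completion (once with `local_clock() - file->start_ns[...]` and once with `runtime`), while v3d->queue[...].enabled_ns is incremented once. The sketch below shows the start/complete bookkeeping with a single accumulation. That single accumulation is an assumption about the intended behaviour, not something this diff states, and the names (job_stats, job_start, job_done) are hypothetical.

```c
#include <stdint.h>

/* One queue's counters, named after the fields added in the diff. */
struct job_stats {
	uint64_t start_ns;   /* timestamp of the running job, 0 when idle */
	uint64_t enabled_ns; /* total busy time of completed jobs */
	uint64_t jobs_sent;  /* number of completed jobs */
};

/* Called when a job is handed to the hardware (the run_job path). */
static void job_start(struct job_stats *s, uint64_t now)
{
	s->start_ns = now;
}

/* Called from the completion interrupt for that queue. */
static void job_done(struct job_stats *s, uint64_t now)
{
	uint64_t runtime = now - s->start_ns;

	s->enabled_ns += runtime; /* accumulate exactly once */
	s->jobs_sent++;
	s->start_ns = 0;          /* mark the queue idle again */
}
```

With double accumulation, the per-file drm-engine-* values would grow at twice the rate of the per-device queue counters for the same jobs.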

+52 -38

drivers/gpu/drm/v3d/v3d_regs.h

··· 57 57 #define V3D_HUB_INT_MSK_STS 0x0005c 58 58 #define V3D_HUB_INT_MSK_SET 0x00060 59 59 #define V3D_HUB_INT_MSK_CLR 0x00064 60 + # define V3D_V7_HUB_INT_GMPV BIT(6) 60 61 # define V3D_HUB_INT_MMU_WRV BIT(5) 61 62 # define V3D_HUB_INT_MMU_PTI BIT(4) 62 63 # define V3D_HUB_INT_MMU_CAP BIT(3) ··· 65 64 # define V3D_HUB_INT_TFUC BIT(1) 66 65 # define V3D_HUB_INT_TFUF BIT(0) 67 66 67 + /* GCA registers only exist in V3D < 41 */ 68 68 #define V3D_GCA_CACHE_CTRL 0x0000c 69 69 # define V3D_GCA_CACHE_CTRL_FLUSH BIT(0) 70 70 ··· 88 86 # define V3D_TOP_GR_BRIDGE_SW_INIT_1 0x0000c 89 87 # define V3D_TOP_GR_BRIDGE_SW_INIT_1_V3D_CLK_108_SW_INIT BIT(0) 90 88 91 - #define V3D_TFU_CS 0x00400 89 + #define V3D_TFU_CS(ver) ((ver >= 71) ? 0x00700 : 0x00400) 90 + 92 91 /* Stops current job, empties input fifo. */ 93 92 # define V3D_TFU_CS_TFURST BIT(31) 94 93 # define V3D_TFU_CS_CVTCT_MASK V3D_MASK(23, 16) ··· 98 95 # define V3D_TFU_CS_NFREE_SHIFT 8 99 96 # define V3D_TFU_CS_BUSY BIT(0) 100 97 101 - #define V3D_TFU_SU 0x00404 98 + #define V3D_TFU_SU(ver) ((ver >= 71) ? 0x00704 : 0x00404) 102 99 /* Interrupt when FINTTHR input slots are free (0 = disabled) */ 103 100 # define V3D_TFU_SU_FINTTHR_MASK V3D_MASK(13, 8) 104 101 # define V3D_TFU_SU_FINTTHR_SHIFT 8 ··· 109 106 # define V3D_TFU_SU_THROTTLE_MASK V3D_MASK(1, 0) 110 107 # define V3D_TFU_SU_THROTTLE_SHIFT 0 111 108 112 - #define V3D_TFU_ICFG 0x00408 109 + #define V3D_TFU_ICFG(ver) ((ver >= 71) ? 0x00708 : 0x00408) 113 110 /* Interrupt when the conversion is complete. */ 114 111 # define V3D_TFU_ICFG_IOC BIT(0) 115 112 116 113 /* Input Image Address */ 117 - #define V3D_TFU_IIA 0x0040c 114 + #define V3D_TFU_IIA(ver) ((ver >= 71) ? 0x0070c : 0x0040c) 118 115 /* Input Chroma Address */ 119 - #define V3D_TFU_ICA 0x00410 116 + #define V3D_TFU_ICA(ver) ((ver >= 71) ? 0x00710 : 0x00410) 120 117 /* Input Image Stride */ 121 - #define V3D_TFU_IIS 0x00414 118 + #define V3D_TFU_IIS(ver) ((ver >= 71) ? 
0x00714 : 0x00414) 122 119 /* Input Image U-Plane Address */ 123 - #define V3D_TFU_IUA 0x00418 120 + #define V3D_TFU_IUA(ver) ((ver >= 71) ? 0x00718 : 0x00418) 121 + /* Image output config (VD 7.x only) */ 122 + #define V3D_V7_TFU_IOC 0x0071c 124 123 /* Output Image Address */ 125 - #define V3D_TFU_IOA 0x0041c 124 + #define V3D_TFU_IOA(ver) ((ver >= 71) ? 0x00720 : 0x0041c) 126 125 /* Image Output Size */ 127 - #define V3D_TFU_IOS 0x00420 126 + #define V3D_TFU_IOS(ver) ((ver >= 71) ? 0x00724 : 0x00420) 128 127 /* TFU YUV Coefficient 0 */ 129 - #define V3D_TFU_COEF0 0x00424 130 - /* Use these regs instead of the defaults. */ 128 + #define V3D_TFU_COEF0(ver) ((ver >= 71) ? 0x00728 : 0x00424) 129 + /* Use these regs instead of the defaults (V3D 4.x only) */ 131 130 # define V3D_TFU_COEF0_USECOEF BIT(31) 132 131 /* TFU YUV Coefficient 1 */ 133 - #define V3D_TFU_COEF1 0x00428 132 + #define V3D_TFU_COEF1(ver) ((ver >= 71) ? 0x0072c : 0x00428) 134 133 /* TFU YUV Coefficient 2 */ 135 - #define V3D_TFU_COEF2 0x0042c 134 + #define V3D_TFU_COEF2(ver) ((ver >= 71) ? 0x00730 : 0x0042c) 136 135 /* TFU YUV Coefficient 3 */ 137 - #define V3D_TFU_COEF3 0x00430 136 + #define V3D_TFU_COEF3(ver) ((ver >= 71) ? 0x00734 : 0x00430) 138 137 138 + /* V3D 4.x only */ 139 139 #define V3D_TFU_CRC 0x00434 140 140 141 141 /* Per-MMU registers. */ 142 142 143 143 #define V3D_MMUC_CONTROL 0x01000 144 - # define V3D_MMUC_CONTROL_CLEAR BIT(3) 144 + #define V3D_MMUC_CONTROL_CLEAR(ver) ((ver >= 71) ? BIT(11) : BIT(3)) 145 145 # define V3D_MMUC_CONTROL_FLUSHING BIT(2) 146 146 # define V3D_MMUC_CONTROL_FLUSH BIT(1) 147 147 # define V3D_MMUC_CONTROL_ENABLE BIT(0) ··· 252 246 253 247 #define V3D_CTL_L2TCACTL 0x00030 254 248 # define V3D_L2TCACTL_TMUWCF BIT(8) 255 - # define V3D_L2TCACTL_L2T_NO_WM BIT(4) 256 249 /* Invalidates cache lines. */ 257 250 # define V3D_L2TCACTL_FLM_FLUSH 0 258 251 /* Removes cachelines without writing dirty lines back. 
*/ ··· 272 267 #define V3D_CTL_INT_MSK_CLR 0x00064 273 268 # define V3D_INT_QPU_MASK V3D_MASK(27, 16) 274 269 # define V3D_INT_QPU_SHIFT 16 275 - # define V3D_INT_CSDDONE BIT(7) 276 - # define V3D_INT_PCTR BIT(6) 270 + #define V3D_INT_CSDDONE(ver) ((ver >= 71) ? BIT(6) : BIT(7)) 271 + #define V3D_INT_PCTR(ver) ((ver >= 71) ? BIT(5) : BIT(6)) 277 272 # define V3D_INT_GMPV BIT(5) 278 273 # define V3D_INT_TRFB BIT(4) 279 274 # define V3D_INT_SPILLUSE BIT(3) ··· 355 350 #define V3D_V4_PCTR_0_SRC_X(x) (V3D_V4_PCTR_0_SRC_0_3 + \ 356 351 4 * (x)) 357 352 # define V3D_PCTR_S0_MASK V3D_MASK(6, 0) 353 + # define V3D_V7_PCTR_S0_MASK V3D_MASK(7, 0) 358 354 # define V3D_PCTR_S0_SHIFT 0 359 355 # define V3D_PCTR_S1_MASK V3D_MASK(14, 8) 356 + # define V3D_V7_PCTR_S1_MASK V3D_MASK(15, 8) 360 357 # define V3D_PCTR_S1_SHIFT 8 361 358 # define V3D_PCTR_S2_MASK V3D_MASK(22, 16) 359 + # define V3D_V7_PCTR_S2_MASK V3D_MASK(23, 16) 362 360 # define V3D_PCTR_S2_SHIFT 16 363 361 # define V3D_PCTR_S3_MASK V3D_MASK(30, 24) 362 + # define V3D_V7_PCTR_S3_MASK V3D_MASK(31, 24) 364 363 # define V3D_PCTR_S3_SHIFT 24 365 - # define V3D_PCTR_CYCLE_COUNT 32 364 + #define V3D_PCTR_CYCLE_COUNT(ver) ((ver >= 71) ? 0 : 32) 366 365 367 366 /* Output values of the counters. */ 368 367 #define V3D_PCTR_0_PCTR0 0x00680 369 368 #define V3D_PCTR_0_PCTR31 0x006fc 370 369 #define V3D_PCTR_0_PCTRX(x) (V3D_PCTR_0_PCTR0 + \ 371 370 4 * (x)) 372 - #define V3D_GMP_STATUS 0x00800 371 + #define V3D_GMP_STATUS(ver) ((ver >= 71) ? 0x00600 : 0x00800) 373 372 # define V3D_GMP_STATUS_GMPRST BIT(31) 374 373 # define V3D_GMP_STATUS_WR_COUNT_MASK V3D_MASK(30, 24) 375 374 # define V3D_GMP_STATUS_WR_COUNT_SHIFT 24 ··· 386 377 # define V3D_GMP_STATUS_INVPROT BIT(1) 387 378 # define V3D_GMP_STATUS_VIO BIT(0) 388 379 389 - #define V3D_GMP_CFG 0x00804 380 + #define V3D_GMP_CFG(ver) ((ver >= 71) ? 
0x00604 : 0x00804) 390 381 # define V3D_GMP_CFG_LBURSTEN BIT(3) 391 382 # define V3D_GMP_CFG_PGCRSEN BIT() 392 383 # define V3D_GMP_CFG_STOP_REQ BIT(1) 393 384 # define V3D_GMP_CFG_PROT_ENABLE BIT(0) 394 385 395 - #define V3D_GMP_VIO_ADDR 0x00808 386 + #define V3D_GMP_VIO_ADDR(ver) ((ver >= 71) ? 0x00608 : 0x00808) 396 387 #define V3D_GMP_VIO_TYPE 0x0080c 397 388 #define V3D_GMP_TABLE_ADDR 0x00810 398 389 #define V3D_GMP_CLEAR_LOAD 0x00814 ··· 407 398 # define V3D_CSD_STATUS_HAVE_CURRENT_DISPATCH BIT(1) 408 399 # define V3D_CSD_STATUS_HAVE_QUEUED_DISPATCH BIT(0) 409 400 410 - #define V3D_CSD_QUEUED_CFG0 0x00904 401 + #define V3D_CSD_QUEUED_CFG0(ver) ((ver >= 71) ? 0x00930 : 0x00904) 411 402 # define V3D_CSD_QUEUED_CFG0_NUM_WGS_X_MASK V3D_MASK(31, 16) 412 403 # define V3D_CSD_QUEUED_CFG0_NUM_WGS_X_SHIFT 16 413 404 # define V3D_CSD_QUEUED_CFG0_WG_X_OFFSET_MASK V3D_MASK(15, 0) 414 405 # define V3D_CSD_QUEUED_CFG0_WG_X_OFFSET_SHIFT 0 415 406 416 - #define V3D_CSD_QUEUED_CFG1 0x00908 407 + #define V3D_CSD_QUEUED_CFG1(ver) ((ver >= 71) ? 0x00934 : 0x00908) 417 408 # define V3D_CSD_QUEUED_CFG1_NUM_WGS_Y_MASK V3D_MASK(31, 16) 418 409 # define V3D_CSD_QUEUED_CFG1_NUM_WGS_Y_SHIFT 16 419 410 # define V3D_CSD_QUEUED_CFG1_WG_Y_OFFSET_MASK V3D_MASK(15, 0) 420 411 # define V3D_CSD_QUEUED_CFG1_WG_Y_OFFSET_SHIFT 0 421 412 422 - #define V3D_CSD_QUEUED_CFG2 0x0090c 413 + #define V3D_CSD_QUEUED_CFG2(ver) ((ver >= 71) ? 0x00938 : 0x0090c) 423 414 # define V3D_CSD_QUEUED_CFG2_NUM_WGS_Z_MASK V3D_MASK(31, 16) 424 415 # define V3D_CSD_QUEUED_CFG2_NUM_WGS_Z_SHIFT 16 425 416 # define V3D_CSD_QUEUED_CFG2_WG_Z_OFFSET_MASK V3D_MASK(15, 0) 426 417 # define V3D_CSD_QUEUED_CFG2_WG_Z_OFFSET_SHIFT 0 427 418 428 - #define V3D_CSD_QUEUED_CFG3 0x00910 419 + #define V3D_CSD_QUEUED_CFG3(ver) ((ver >= 71) ? 
0x0093c : 0x00910) 429 420 # define V3D_CSD_QUEUED_CFG3_OVERLAP_WITH_PREV BIT(26) 430 421 # define V3D_CSD_QUEUED_CFG3_MAX_SG_ID_MASK V3D_MASK(25, 20) 431 422 # define V3D_CSD_QUEUED_CFG3_MAX_SG_ID_SHIFT 20 ··· 437 428 # define V3D_CSD_QUEUED_CFG3_WG_SIZE_SHIFT 0 438 429 439 430 /* Number of batches, minus 1 */ 440 - #define V3D_CSD_QUEUED_CFG4 0x00914 431 + #define V3D_CSD_QUEUED_CFG4(ver) ((ver >= 71) ? 0x00940 : 0x00914) 441 432 442 433 /* Shader address, pnan, singleseg, threading, like a shader record. */ 443 - #define V3D_CSD_QUEUED_CFG5 0x00918 434 + #define V3D_CSD_QUEUED_CFG5(ver) ((ver >= 71) ? 0x00944 : 0x00918) 444 435 445 436 /* Uniforms address (4 byte aligned) */ 446 - #define V3D_CSD_QUEUED_CFG6 0x0091c 437 + #define V3D_CSD_QUEUED_CFG6(ver) ((ver >= 71) ? 0x00948 : 0x0091c) 447 438 448 - #define V3D_CSD_CURRENT_CFG0 0x00920 449 - #define V3D_CSD_CURRENT_CFG1 0x00924 450 - #define V3D_CSD_CURRENT_CFG2 0x00928 451 - #define V3D_CSD_CURRENT_CFG3 0x0092c 452 - #define V3D_CSD_CURRENT_CFG4 0x00930 453 - #define V3D_CSD_CURRENT_CFG5 0x00934 454 - #define V3D_CSD_CURRENT_CFG6 0x00938 439 + /* V3D 7.x+ only */ 440 + #define V3D_V7_CSD_QUEUED_CFG7 0x0094c 455 441 456 - #define V3D_CSD_CURRENT_ID0 0x0093c 442 + #define V3D_CSD_CURRENT_CFG0(ver) ((ver >= 71) ? 0x00958 : 0x00920) 443 + #define V3D_CSD_CURRENT_CFG1(ver) ((ver >= 71) ? 0x0095c : 0x00924) 444 + #define V3D_CSD_CURRENT_CFG2(ver) ((ver >= 71) ? 0x00960 : 0x00928) 445 + #define V3D_CSD_CURRENT_CFG3(ver) ((ver >= 71) ? 0x00964 : 0x0092c) 446 + #define V3D_CSD_CURRENT_CFG4(ver) ((ver >= 71) ? 0x00968 : 0x00930) 447 + #define V3D_CSD_CURRENT_CFG5(ver) ((ver >= 71) ? 0x0096c : 0x00934) 448 + #define V3D_CSD_CURRENT_CFG6(ver) ((ver >= 71) ? 0x00970 : 0x00938) 449 + /* V3D 7.x+ only */ 450 + #define V3D_V7_CSD_CURRENT_CFG7 0x00974 451 + 452 + #define V3D_CSD_CURRENT_ID0(ver) ((ver >= 71) ? 
0x00978 : 0x0093c) 457 453 # define V3D_CSD_CURRENT_ID0_WG_X_MASK V3D_MASK(31, 16) 458 454 # define V3D_CSD_CURRENT_ID0_WG_X_SHIFT 16 459 455 # define V3D_CSD_CURRENT_ID0_WG_IN_SG_MASK V3D_MASK(11, 8) ··· 466 452 # define V3D_CSD_CURRENT_ID0_L_IDX_MASK V3D_MASK(7, 0) 467 453 # define V3D_CSD_CURRENT_ID0_L_IDX_SHIFT 0 468 454 469 - #define V3D_CSD_CURRENT_ID1 0x00940 455 + #define V3D_CSD_CURRENT_ID1(ver) ((ver >= 71) ? 0x0097c : 0x00940) 470 456 # define V3D_CSD_CURRENT_ID0_WG_Z_MASK V3D_MASK(31, 16) 471 457 # define V3D_CSD_CURRENT_ID0_WG_Z_SHIFT 16 472 458 # define V3D_CSD_CURRENT_ID0_WG_Y_MASK V3D_MASK(15, 0)
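Note on the v3d_regs.h change above: registers whose offset moved on V3D 7.1 become function-like macros that take the hardware version. The TFU block is a useful example, because 7.1 inserts a new register (V3D_V7_TFU_IOC) between IUA and IOA, so the later registers shift by an extra 4 bytes. A minimal standalone sketch follows, using offsets copied from the diff; the two wrapper functions are hypothetical and exist only to make the macros easy to check.

```c
#include <stdint.h>

/* Offsets copied from the v3d_regs.h hunk. */
#define V3D_TFU_IUA(ver) ((ver >= 71) ? 0x00718 : 0x00418)
#define V3D_V7_TFU_IOC   0x0071c /* exists on 7.x only */
#define V3D_TFU_IOA(ver) ((ver >= 71) ? 0x00720 : 0x0041c)
#define V3D_TFU_IOS(ver) ((ver >= 71) ? 0x00724 : 0x00420)

/* Hypothetical wrappers: resolve an offset for a given version at run time. */
static uint32_t tfu_iua_offset(int ver)
{
	return V3D_TFU_IUA(ver);
}

static uint32_t tfu_ioa_offset(int ver)
{
	return V3D_TFU_IOA(ver);
}
```

The macro bodies use `ver` without parentheses, so an argument such as `a | b` would not be grouped as one might expect. Every call site in this diff passes `v3d->ver` or a literal, for which this does not arise.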

+59 -22

drivers/gpu/drm/v3d/v3d_sched.c

··· 18 18 * semaphores to interlock between them. 19 19 */ 20 20 21 + #include <linux/sched/clock.h> 21 22 #include <linux/kthread.h> 22 23 23 24 #include "v3d_drv.h" ··· 77 76 { 78 77 struct v3d_bin_job *job = to_bin_job(sched_job); 79 78 struct v3d_dev *v3d = job->base.v3d; 79 + struct v3d_file_priv *file = job->base.file->driver_priv; 80 80 struct drm_device *dev = &v3d->drm; 81 81 struct dma_fence *fence; 82 82 unsigned long irqflags; ··· 109 107 trace_v3d_submit_cl(dev, false, to_v3d_fence(fence)->seqno, 110 108 job->start, job->end); 111 109 110 + file->start_ns[V3D_BIN] = local_clock(); 111 + v3d->queue[V3D_BIN].start_ns = file->start_ns[V3D_BIN]; 112 + 112 113 v3d_switch_perfmon(v3d, &job->base); 113 114 114 115 /* Set the current and end address of the control list. ··· 136 131 { 137 132 struct v3d_render_job *job = to_render_job(sched_job); 138 133 struct v3d_dev *v3d = job->base.v3d; 134 + struct v3d_file_priv *file = job->base.file->driver_priv; 139 135 struct drm_device *dev = &v3d->drm; 140 136 struct dma_fence *fence; 141 137 ··· 164 158 trace_v3d_submit_cl(dev, true, to_v3d_fence(fence)->seqno, 165 159 job->start, job->end); 166 160 161 + file->start_ns[V3D_RENDER] = local_clock(); 162 + v3d->queue[V3D_RENDER].start_ns = file->start_ns[V3D_RENDER]; 163 + 167 164 v3d_switch_perfmon(v3d, &job->base); 168 165 169 166 /* XXX: Set the QCFG */ ··· 185 176 { 186 177 struct v3d_tfu_job *job = to_tfu_job(sched_job); 187 178 struct v3d_dev *v3d = job->base.v3d; 179 + struct v3d_file_priv *file = job->base.file->driver_priv; 188 180 struct drm_device *dev = &v3d->drm; 189 181 struct dma_fence *fence; 190 182 ··· 200 190 201 191 trace_v3d_submit_tfu(dev, to_v3d_fence(fence)->seqno); 202 192 203 - V3D_WRITE(V3D_TFU_IIA, job->args.iia); 204 - V3D_WRITE(V3D_TFU_IIS, job->args.iis); 205 - V3D_WRITE(V3D_TFU_ICA, job->args.ica); 206 - V3D_WRITE(V3D_TFU_IUA, job->args.iua); 207 - V3D_WRITE(V3D_TFU_IOA, job->args.ioa); 208 - V3D_WRITE(V3D_TFU_IOS, job->args.ios); 209 - 
V3D_WRITE(V3D_TFU_COEF0, job->args.coef[0]); 210 - if (job->args.coef[0] & V3D_TFU_COEF0_USECOEF) { 211 - V3D_WRITE(V3D_TFU_COEF1, job->args.coef[1]); 212 - V3D_WRITE(V3D_TFU_COEF2, job->args.coef[2]); 213 - V3D_WRITE(V3D_TFU_COEF3, job->args.coef[3]); 193 + file->start_ns[V3D_TFU] = local_clock(); 194 + v3d->queue[V3D_TFU].start_ns = file->start_ns[V3D_TFU]; 195 + 196 + V3D_WRITE(V3D_TFU_IIA(v3d->ver), job->args.iia); 197 + V3D_WRITE(V3D_TFU_IIS(v3d->ver), job->args.iis); 198 + V3D_WRITE(V3D_TFU_ICA(v3d->ver), job->args.ica); 199 + V3D_WRITE(V3D_TFU_IUA(v3d->ver), job->args.iua); 200 + V3D_WRITE(V3D_TFU_IOA(v3d->ver), job->args.ioa); 201 + if (v3d->ver >= 71) 202 + V3D_WRITE(V3D_V7_TFU_IOC, job->args.v71.ioc); 203 + V3D_WRITE(V3D_TFU_IOS(v3d->ver), job->args.ios); 204 + V3D_WRITE(V3D_TFU_COEF0(v3d->ver), job->args.coef[0]); 205 + if (v3d->ver >= 71 || (job->args.coef[0] & V3D_TFU_COEF0_USECOEF)) { 206 + V3D_WRITE(V3D_TFU_COEF1(v3d->ver), job->args.coef[1]); 207 + V3D_WRITE(V3D_TFU_COEF2(v3d->ver), job->args.coef[2]); 208 + V3D_WRITE(V3D_TFU_COEF3(v3d->ver), job->args.coef[3]); 214 209 } 215 210 /* ICFG kicks off the job. 
*/ 216 - V3D_WRITE(V3D_TFU_ICFG, job->args.icfg | V3D_TFU_ICFG_IOC); 211 + V3D_WRITE(V3D_TFU_ICFG(v3d->ver), job->args.icfg | V3D_TFU_ICFG_IOC); 217 212 218 213 return fence; 219 214 } ··· 228 213 { 229 214 struct v3d_csd_job *job = to_csd_job(sched_job); 230 215 struct v3d_dev *v3d = job->base.v3d; 216 + struct v3d_file_priv *file = job->base.file->driver_priv; 231 217 struct drm_device *dev = &v3d->drm; 232 218 struct dma_fence *fence; 233 - int i; 219 + int i, csd_cfg0_reg, csd_cfg_reg_count; 234 220 235 221 v3d->csd_job = job; 236 222 ··· 247 231 248 232 trace_v3d_submit_csd(dev, to_v3d_fence(fence)->seqno); 249 233 234 + file->start_ns[V3D_CSD] = local_clock(); 235 + v3d->queue[V3D_CSD].start_ns = file->start_ns[V3D_CSD]; 236 + 250 237 v3d_switch_perfmon(v3d, &job->base); 251 238 252 - for (i = 1; i <= 6; i++) 253 - V3D_CORE_WRITE(0, V3D_CSD_QUEUED_CFG0 + 4 * i, job->args.cfg[i]); 239 + csd_cfg0_reg = V3D_CSD_QUEUED_CFG0(v3d->ver); 240 + csd_cfg_reg_count = v3d->ver < 71 ? 6 : 7; 241 + for (i = 1; i <= csd_cfg_reg_count; i++) 242 + V3D_CORE_WRITE(0, csd_cfg0_reg + 4 * i, job->args.cfg[i]); 254 243 /* CFG0 write kicks off the job. 
*/ 255 - V3D_CORE_WRITE(0, V3D_CSD_QUEUED_CFG0, job->args.cfg[0]); 244 + V3D_CORE_WRITE(0, csd_cfg0_reg, job->args.cfg[0]); 256 245 257 246 return fence; 258 247 } ··· 267 246 { 268 247 struct v3d_job *job = to_v3d_job(sched_job); 269 248 struct v3d_dev *v3d = job->v3d; 249 + struct v3d_file_priv *file = job->file->driver_priv; 250 + u64 runtime; 251 + 252 + file->start_ns[V3D_CACHE_CLEAN] = local_clock(); 253 + v3d->queue[V3D_CACHE_CLEAN].start_ns = file->start_ns[V3D_CACHE_CLEAN]; 270 254 271 255 v3d_clean_caches(v3d); 256 + 257 + runtime = local_clock() - file->start_ns[V3D_CACHE_CLEAN]; 258 + 259 + file->enabled_ns[V3D_CACHE_CLEAN] += runtime; 260 + v3d->queue[V3D_CACHE_CLEAN].enabled_ns += runtime; 261 + 262 + file->jobs_sent[V3D_CACHE_CLEAN]++; 263 + v3d->queue[V3D_CACHE_CLEAN].jobs_sent++; 264 + 265 + file->start_ns[V3D_CACHE_CLEAN] = 0; 266 + v3d->queue[V3D_CACHE_CLEAN].start_ns = 0; 272 267 273 268 return NULL; 274 269 } ··· 373 336 { 374 337 struct v3d_csd_job *job = to_csd_job(sched_job); 375 338 struct v3d_dev *v3d = job->base.v3d; 376 - u32 batches = V3D_CORE_READ(0, V3D_CSD_CURRENT_CFG4); 339 + u32 batches = V3D_CORE_READ(0, V3D_CSD_CURRENT_CFG4(v3d->ver)); 377 340 378 341 /* If we've made progress, skip reset and let the timer get 379 342 * rearmed. 
··· 425 388 int ret; 426 389 427 390 ret = drm_sched_init(&v3d->queue[V3D_BIN].sched, 428 - &v3d_bin_sched_ops, 391 + &v3d_bin_sched_ops, NULL, 429 392 DRM_SCHED_PRIORITY_COUNT, 430 393 hw_jobs_limit, job_hang_limit, 431 394 msecs_to_jiffies(hang_limit_ms), NULL, ··· 434 397 return ret; 435 398 436 399 ret = drm_sched_init(&v3d->queue[V3D_RENDER].sched, 437 - &v3d_render_sched_ops, 400 + &v3d_render_sched_ops, NULL, 438 401 DRM_SCHED_PRIORITY_COUNT, 439 402 hw_jobs_limit, job_hang_limit, 440 403 msecs_to_jiffies(hang_limit_ms), NULL, ··· 443 406 goto fail; 444 407 445 408 ret = drm_sched_init(&v3d->queue[V3D_TFU].sched, 446 - &v3d_tfu_sched_ops, 409 + &v3d_tfu_sched_ops, NULL, 447 410 DRM_SCHED_PRIORITY_COUNT, 448 411 hw_jobs_limit, job_hang_limit, 449 412 msecs_to_jiffies(hang_limit_ms), NULL, ··· 453 416 454 417 if (v3d_has_csd(v3d)) { 455 418 ret = drm_sched_init(&v3d->queue[V3D_CSD].sched, 456 - &v3d_csd_sched_ops, 419 + &v3d_csd_sched_ops, NULL, 457 420 DRM_SCHED_PRIORITY_COUNT, 458 421 hw_jobs_limit, job_hang_limit, 459 422 msecs_to_jiffies(hang_limit_ms), NULL, ··· 462 425 goto fail; 463 426 464 427 ret = drm_sched_init(&v3d->queue[V3D_CACHE_CLEAN].sched, 465 - &v3d_cache_clean_sched_ops, 428 + &v3d_cache_clean_sched_ops, NULL, 466 429 DRM_SCHED_PRIORITY_COUNT, 467 430 hw_jobs_limit, job_hang_limit, 468 431 msecs_to_jiffies(hang_limit_ms), NULL,
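The v3d_sched.c hunks above add the same bookkeeping to every queue: a start timestamp when a job begins, and (for the synchronous cache-clean path) an accumulated runtime plus a job counter when it ends. A minimal user-space sketch of that pattern follows. The struct and function names (`queue_stats`, `stats_job_start`, `stats_job_end`, `stats_runtime`) are hypothetical, and `now` stands in for the value the driver reads from `local_clock()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-queue fields the patch adds. */
struct queue_stats {
	uint64_t start_ns;   /* non-zero while a job is running */
	uint64_t enabled_ns; /* accumulated runtime of finished jobs */
	uint64_t jobs_sent;  /* number of finished jobs */
};

/* Record the start timestamp; `now` stands in for local_clock(). */
static void stats_job_start(struct queue_stats *q, uint64_t now)
{
	q->start_ns = now;
}

/* Fold the elapsed time into the totals and mark the queue idle. */
static void stats_job_end(struct queue_stats *q, uint64_t now)
{
	assert(now >= q->start_ns);
	q->enabled_ns += now - q->start_ns;
	q->jobs_sent++;
	q->start_ns = 0;
}

/* Reported runtime: finished time plus any job still in flight. */
static uint64_t stats_runtime(const struct queue_stats *q, uint64_t now)
{
	uint64_t active = q->start_ns ? now - q->start_ns : 0;

	return q->enabled_ns + active;
}
```

Resetting `start_ns` to zero is what lets a reader distinguish an idle queue from a busy one, so a running job contributes to the reported runtime without being counted in `jobs_sent` yet.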

+69

drivers/gpu/drm/v3d/v3d_sysfs.c

···
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Igalia S.L.
+ */
+
+#include <linux/sched/clock.h>
+#include <linux/sysfs.h>
+
+#include "v3d_drv.h"
+
+static ssize_t
+gpu_stats_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct drm_device *drm = dev_get_drvdata(dev);
+	struct v3d_dev *v3d = to_v3d_dev(drm);
+	enum v3d_queue queue;
+	u64 timestamp = local_clock();
+	u64 active_runtime;
+	ssize_t len = 0;
+
+	len += sysfs_emit(buf, "queue\ttimestamp\tjobs\truntime\n");
+
+	for (queue = 0; queue < V3D_MAX_QUEUES; queue++) {
+		if (v3d->queue[queue].start_ns)
+			active_runtime = timestamp - v3d->queue[queue].start_ns;
+		else
+			active_runtime = 0;
+
+		/* Each line will display the queue name, timestamp, the number
+		 * of jobs sent to that queue and the runtime, as can be seem here:
+		 *
+		 * queue	timestamp	jobs	runtime
+		 * bin	239043069420	22620	17438164056
+		 * render	239043069420	22619	27284814161
+		 * tfu	239043069420	8763	394592566
+		 * csd	239043069420	3168	10787905530
+		 * cache_clean	239043069420	6127	237375940
+		 */
+		len += sysfs_emit_at(buf, len, "%s\t%llu\t%llu\t%llu\n",
+				     v3d_queue_to_string(queue),
+				     timestamp,
+				     v3d->queue[queue].jobs_sent,
+				     v3d->queue[queue].enabled_ns + active_runtime);
+	}
+
+	return len;
+}
+static DEVICE_ATTR_RO(gpu_stats);
+
+static struct attribute *v3d_sysfs_entries[] = {
+	&dev_attr_gpu_stats.attr,
+	NULL,
+};
+
+static struct attribute_group v3d_sysfs_attr_group = {
+	.attrs = v3d_sysfs_entries,
+};
+
+int
+v3d_sysfs_init(struct device *dev)
+{
+	return sysfs_create_group(&dev->kobj, &v3d_sysfs_attr_group);
+}
+
+void
+v3d_sysfs_destroy(struct device *dev)
+{
+	return sysfs_remove_group(&dev->kobj, &v3d_sysfs_attr_group);
+}
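Because `gpu_stats` prints both a timestamp and a cumulative runtime per queue, a consumer can presumably derive utilization by sampling the file twice and dividing the runtime delta by the timestamp delta. The sketch below illustrates that arithmetic in user space. The helper names (`gpu_stats_sample`, `parse_gpu_stats_line`, `queue_utilization`) are hypothetical and not part of the patch, and reading the sysfs file itself is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One parsed data row of gpu_stats (hypothetical helper type). */
struct gpu_stats_sample {
	char queue[32];
	uint64_t timestamp; /* ns, same clock for every row of one read */
	uint64_t jobs;
	uint64_t runtime;   /* ns the queue has been busy so far */
};

/* Parse "queue\ttimestamp\tjobs\truntime"; returns 0 on success, -1 otherwise. */
static int parse_gpu_stats_line(const char *line, struct gpu_stats_sample *s)
{
	unsigned long long ts, jobs, rt;

	/* The header row fails here because "timestamp" is not a number. */
	if (sscanf(line, "%31s %llu %llu %llu", s->queue, &ts, &jobs, &rt) != 4)
		return -1;

	s->timestamp = ts;
	s->jobs = jobs;
	s->runtime = rt;
	return 0;
}

/* Percentage of the sampling interval during which the queue was busy. */
static double queue_utilization(const struct gpu_stats_sample *a,
				const struct gpu_stats_sample *b)
{
	assert(a != NULL && b != NULL);

	if (b->timestamp <= a->timestamp)
		return 0.0;

	return 100.0 * (double)(b->runtime - a->runtime) /
	       (double)(b->timestamp - a->timestamp);
}
```

Both samples must come from the same queue row. The difference in the `jobs` column over the same interval gives the job throughput.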

+5

drivers/gpu/drm/virtio/virtgpu_drv.h

···
 #define MAX_CAPSET_ID 63
 #define MAX_RINGS 64

+/* See virtio_gpu_ctx_create. One additional character for NULL terminator. */
+#define DEBUG_NAME_MAX_LEN 65
+
 struct virtio_gpu_object_params {
 	unsigned long size;
 	bool dumb;
···
 	uint64_t base_fence_ctx;
 	uint64_t ring_idx_mask;
 	struct mutex context_lock;
+	char debug_name[DEBUG_NAME_MAX_LEN];
+	bool explicit_debug_name;
 };

 /* virtgpu_ioctl.c */

+33 -8

drivers/gpu/drm/virtio/virtgpu_ioctl.c

···
 static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev,
 					     struct virtio_gpu_fpriv *vfpriv)
 {
-	char dbgname[TASK_COMM_LEN];
+	if (vfpriv->explicit_debug_name) {
+		virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id,
+					      vfpriv->context_init,
+					      strlen(vfpriv->debug_name),
+					      vfpriv->debug_name);
+	} else {
+		char dbgname[TASK_COMM_LEN];

-	get_task_comm(dbgname, current);
-	virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id,
-				      vfpriv->context_init, strlen(dbgname),
-				      dbgname);
+		get_task_comm(dbgname, current);
+		virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id,
+					      vfpriv->context_init, strlen(dbgname),
+					      dbgname);
+	}

 	vfpriv->context_created = true;
 }
···
 		break;
 	case VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs:
 		value = vgdev->capset_id_mask;
+		break;
+	case VIRTGPU_PARAM_EXPLICIT_DEBUG_NAME:
+		value = vgdev->has_context_init ? 1 : 0;
 		break;
 	default:
 		return -EINVAL;
···
 					 void *data, struct drm_file *file)
 {
 	int ret = 0;
-	uint32_t num_params, i, param, value;
-	uint64_t valid_ring_mask;
+	uint32_t num_params, i;
+	uint64_t valid_ring_mask, param, value;
 	size_t len;
 	struct drm_virtgpu_context_set_param *ctx_set_params = NULL;
 	struct virtio_gpu_device *vgdev = dev->dev_private;
···
 		return -EINVAL;

 	/* Number of unique parameters supported at this time. */
-	if (num_params > 3)
+	if (num_params > 4)
 		return -EINVAL;

 	ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
···
 		}

 		vfpriv->ring_idx_mask = value;
+		break;
+	case VIRTGPU_CONTEXT_PARAM_DEBUG_NAME:
+		if (vfpriv->explicit_debug_name) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+
+		ret = strncpy_from_user(vfpriv->debug_name,
+					u64_to_user_ptr(value),
+					DEBUG_NAME_MAX_LEN - 1);
+		if (ret < 0)
+			goto out_unlock;
+
+		vfpriv->explicit_debug_name = true;
+		ret = 0;
 		break;
 	default:
 		ret = -EINVAL;
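The `VIRTGPU_CONTEXT_PARAM_DEBUG_NAME` case above has two properties worth spelling out: the name can be set only once per context, and the copy is limited to `DEBUG_NAME_MAX_LEN - 1` bytes so the final byte of the buffer is left for the terminator. The user-space model below sketches that behaviour. It is an illustration, not the driver code: `ctx_priv` and `ctx_set_debug_name` are hypothetical names, plain `strncpy()` replaces `strncpy_from_user()`, and the explicit terminator write is added here because the model does not assume a zero-initialised buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define DEBUG_NAME_MAX_LEN 65 /* 64 characters plus the terminator */

/* Hypothetical stand-in for the two fields added to struct virtio_gpu_fpriv. */
struct ctx_priv {
	char debug_name[DEBUG_NAME_MAX_LEN];
	bool explicit_debug_name;
};

/* Set the debug name at most once; a second attempt fails with -EINVAL. */
static int ctx_set_debug_name(struct ctx_priv *p, const char *name)
{
	assert(p != NULL && name != NULL);

	if (p->explicit_debug_name)
		return -EINVAL;

	/* Copy at most 64 bytes; longer names are truncated, not rejected. */
	strncpy(p->debug_name, name, DEBUG_NAME_MAX_LEN - 1);
	p->debug_name[DEBUG_NAME_MAX_LEN - 1] = '\0';

	p->explicit_debug_name = true;
	return 0;
}
```

Once the flag is set, context creation sends the stored name instead of the task's `comm`, as the first hunk shows.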

+123 -5

drivers/video/fbdev/simplefb.c

··· 21 21 #include <linux/platform_device.h> 22 22 #include <linux/clk.h> 23 23 #include <linux/of.h> 24 + #include <linux/of_address.h> 24 25 #include <linux/of_clk.h> 25 26 #include <linux/of_platform.h> 26 27 #include <linux/parser.h> 28 + #include <linux/pm_domain.h> 27 29 #include <linux/regulator/consumer.h> 28 30 29 31 static const struct fb_fix_screeninfo simplefb_fix = { ··· 79 77 unsigned int clk_count; 80 78 struct clk **clks; 81 79 #endif 80 + #if defined CONFIG_OF && defined CONFIG_PM_GENERIC_DOMAINS 81 + unsigned int num_genpds; 82 + struct device **genpds; 83 + struct device_link **genpd_links; 84 + #endif 82 85 #if defined CONFIG_OF && defined CONFIG_REGULATOR 83 86 bool regulators_enabled; 84 87 u32 regulator_count; ··· 128 121 u32 height; 129 122 u32 stride; 130 123 struct simplefb_format *format; 124 + struct resource memory; 131 125 }; 132 126 133 127 static int simplefb_parse_dt(struct platform_device *pdev, 134 128 struct simplefb_params *params) 135 129 { 136 - struct device_node *np = pdev->dev.of_node; 130 + struct device_node *np = pdev->dev.of_node, *mem; 137 131 int ret; 138 132 const char *format; 139 133 int i; ··· 174 166 return -EINVAL; 175 167 } 176 168 169 + mem = of_parse_phandle(np, "memory-region", 0); 170 + if (mem) { 171 + ret = of_address_to_resource(mem, 0, &params->memory); 172 + if (ret < 0) { 173 + dev_err(&pdev->dev, "failed to parse memory-region\n"); 174 + of_node_put(mem); 175 + return ret; 176 + } 177 + 178 + if (of_property_present(np, "reg")) 179 + dev_warn(&pdev->dev, "preferring \"memory-region\" over \"reg\" property\n"); 180 + 181 + of_node_put(mem); 182 + } else { 183 + memset(&params->memory, 0, sizeof(params->memory)); 184 + } 185 + 177 186 return 0; 178 187 } 179 188 ··· 217 192 dev_err(&pdev->dev, "Invalid format value\n"); 218 193 return -EINVAL; 219 194 } 195 + 196 + memset(&params->memory, 0, sizeof(params->memory)); 220 197 221 198 return 0; 222 199 } ··· 438 411 static void 
simplefb_regulators_destroy(struct simplefb_par *par) { } 439 412 #endif 440 413 414 + #if defined CONFIG_OF && defined CONFIG_PM_GENERIC_DOMAINS 415 + static void simplefb_detach_genpds(void *res) 416 + { 417 + struct simplefb_par *par = res; 418 + unsigned int i = par->num_genpds; 419 + 420 + if (par->num_genpds <= 1) 421 + return; 422 + 423 + while (i--) { 424 + if (par->genpd_links[i]) 425 + device_link_del(par->genpd_links[i]); 426 + 427 + if (!IS_ERR_OR_NULL(par->genpds[i])) 428 + dev_pm_domain_detach(par->genpds[i], true); 429 + } 430 + } 431 + 432 + static int simplefb_attach_genpds(struct simplefb_par *par, 433 + struct platform_device *pdev) 434 + { 435 + struct device *dev = &pdev->dev; 436 + unsigned int i; 437 + int err; 438 + 439 + err = of_count_phandle_with_args(dev->of_node, "power-domains", 440 + "#power-domain-cells"); 441 + if (err < 0) { 442 + dev_info(dev, "failed to parse power-domains: %d\n", err); 443 + return err; 444 + } 445 + 446 + par->num_genpds = err; 447 + 448 + /* 449 + * Single power-domain devices are handled by the driver core, so 450 + * nothing to do here. 
451 + */ 452 + if (par->num_genpds <= 1) 453 + return 0; 454 + 455 + par->genpds = devm_kcalloc(dev, par->num_genpds, sizeof(*par->genpds), 456 + GFP_KERNEL); 457 + if (!par->genpds) 458 + return -ENOMEM; 459 + 460 + par->genpd_links = devm_kcalloc(dev, par->num_genpds, 461 + sizeof(*par->genpd_links), 462 + GFP_KERNEL); 463 + if (!par->genpd_links) 464 + return -ENOMEM; 465 + 466 + for (i = 0; i < par->num_genpds; i++) { 467 + par->genpds[i] = dev_pm_domain_attach_by_id(dev, i); 468 + if (IS_ERR(par->genpds[i])) { 469 + err = PTR_ERR(par->genpds[i]); 470 + if (err == -EPROBE_DEFER) { 471 + simplefb_detach_genpds(par); 472 + return err; 473 + } 474 + 475 + dev_warn(dev, "failed to attach domain %u: %d\n", i, err); 476 + continue; 477 + } 478 + 479 + par->genpd_links[i] = device_link_add(dev, par->genpds[i], 480 + DL_FLAG_STATELESS | 481 + DL_FLAG_PM_RUNTIME | 482 + DL_FLAG_RPM_ACTIVE); 483 + if (!par->genpd_links[i]) 484 + dev_warn(dev, "failed to link power-domain %u\n", i); 485 + } 486 + 487 + return devm_add_action_or_reset(dev, simplefb_detach_genpds, par); 488 + } 489 + #else 490 + static int simplefb_attach_genpds(struct simplefb_par *par, 491 + struct platform_device *pdev) 492 + { 493 + return 0; 494 + } 495 + #endif 496 + 441 497 static int simplefb_probe(struct platform_device *pdev) 442 498 { 443 499 int ret; ··· 541 431 if (ret) 542 432 return ret; 543 433 544 - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); 545 - if (!res) { 546 - dev_err(&pdev->dev, "No memory resource\n"); 547 - return -EINVAL; 434 + if (params.memory.start == 0 && params.memory.end == 0) { 435 + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); 436 + if (!res) { 437 + dev_err(&pdev->dev, "No memory resource\n"); 438 + return -EINVAL; 439 + } 440 + } else { 441 + res = &params.memory; 548 442 } 549 443 550 444 mem = request_mem_region(res->start, resource_size(res), "simplefb"); ··· 606 492 ret = simplefb_regulators_get(par, pdev); 607 493 if (ret < 0) 608 494 goto 
error_clocks; 495 + 496 + ret = simplefb_attach_genpds(par, pdev); 497 + if (ret < 0) 498 + goto error_regulators; 609 499 610 500 simplefb_clocks_enable(par, pdev); 611 501 simplefb_regulators_enable(par, pdev);
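The simplefb probe change above boils down to one decision: if the device tree supplied a `memory-region`, use it; otherwise fall back to the `reg` resource as before. A zeroed range is what the parsing code uses to signal "no memory-region". The sketch below models only that choice, with hypothetical names (`mem_range`, `pick_fb_memory`) standing in for `struct resource` and the inline probe logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the start/end pair of struct resource. */
struct mem_range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Mirror of the probe-time choice: a zeroed "memory-region" range means the
 * property was absent, so fall back to the platform "reg" resource.
 * `reg` may be NULL, which the driver reports as "No memory resource".
 */
static const struct mem_range *pick_fb_memory(const struct mem_range *memory_region,
					      const struct mem_range *reg)
{
	assert(memory_region != NULL);

	if (memory_region->start == 0 && memory_region->end == 0)
		return reg;

	return memory_region;
}
```

When both properties are present, the driver additionally warns that `memory-region` is preferred over `reg`; the model leaves the warning out.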

-148

include/drm/drm_edid.h

··· 269 269 #define DRM_EDID_DSC_MAX_SLICES 0xf 270 270 #define DRM_EDID_DSC_TOTAL_CHUNK_KBYTES 0x3f 271 271 272 - /* ELD Header Block */ 273 - #define DRM_ELD_HEADER_BLOCK_SIZE 4 274 - 275 - #define DRM_ELD_VER 0 276 - # define DRM_ELD_VER_SHIFT 3 277 - # define DRM_ELD_VER_MASK (0x1f << 3) 278 - # define DRM_ELD_VER_CEA861D (2 << 3) /* supports 861D or below */ 279 - # define DRM_ELD_VER_CANNED (0x1f << 3) 280 - 281 - #define DRM_ELD_BASELINE_ELD_LEN 2 /* in dwords! */ 282 - 283 - /* ELD Baseline Block for ELD_Ver == 2 */ 284 - #define DRM_ELD_CEA_EDID_VER_MNL 4 285 - # define DRM_ELD_CEA_EDID_VER_SHIFT 5 286 - # define DRM_ELD_CEA_EDID_VER_MASK (7 << 5) 287 - # define DRM_ELD_CEA_EDID_VER_NONE (0 << 5) 288 - # define DRM_ELD_CEA_EDID_VER_CEA861 (1 << 5) 289 - # define DRM_ELD_CEA_EDID_VER_CEA861A (2 << 5) 290 - # define DRM_ELD_CEA_EDID_VER_CEA861BCD (3 << 5) 291 - # define DRM_ELD_MNL_SHIFT 0 292 - # define DRM_ELD_MNL_MASK (0x1f << 0) 293 - 294 - #define DRM_ELD_SAD_COUNT_CONN_TYPE 5 295 - # define DRM_ELD_SAD_COUNT_SHIFT 4 296 - # define DRM_ELD_SAD_COUNT_MASK (0xf << 4) 297 - # define DRM_ELD_CONN_TYPE_SHIFT 2 298 - # define DRM_ELD_CONN_TYPE_MASK (3 << 2) 299 - # define DRM_ELD_CONN_TYPE_HDMI (0 << 2) 300 - # define DRM_ELD_CONN_TYPE_DP (1 << 2) 301 - # define DRM_ELD_SUPPORTS_AI (1 << 1) 302 - # define DRM_ELD_SUPPORTS_HDCP (1 << 0) 303 - 304 - #define DRM_ELD_AUD_SYNCH_DELAY 6 /* in units of 2 ms */ 305 - # define DRM_ELD_AUD_SYNCH_DELAY_MAX 0xfa /* 500 ms */ 306 - 307 - #define DRM_ELD_SPEAKER 7 308 - # define DRM_ELD_SPEAKER_MASK 0x7f 309 - # define DRM_ELD_SPEAKER_RLRC (1 << 6) 310 - # define DRM_ELD_SPEAKER_FLRC (1 << 5) 311 - # define DRM_ELD_SPEAKER_RC (1 << 4) 312 - # define DRM_ELD_SPEAKER_RLR (1 << 3) 313 - # define DRM_ELD_SPEAKER_FC (1 << 2) 314 - # define DRM_ELD_SPEAKER_LFE (1 << 1) 315 - # define DRM_ELD_SPEAKER_FLR (1 << 0) 316 - 317 - #define DRM_ELD_PORT_ID 8 /* offsets 8..15 inclusive */ 318 - # define DRM_ELD_PORT_ID_LEN 8 319 - 320 - 
#define DRM_ELD_MANUFACTURER_NAME0 16 321 - #define DRM_ELD_MANUFACTURER_NAME1 17 322 - 323 - #define DRM_ELD_PRODUCT_CODE0 18 324 - #define DRM_ELD_PRODUCT_CODE1 19 325 - 326 - #define DRM_ELD_MONITOR_NAME_STRING 20 /* offsets 20..(20+mnl-1) inclusive */ 327 - 328 - #define DRM_ELD_CEA_SAD(mnl, sad) (20 + (mnl) + 3 * (sad)) 329 - 330 272 struct edid { 331 273 u8 header[8]; 332 274 /* Vendor & product info */ ··· 350 408 const struct drm_connector *connector, 351 409 const struct drm_display_mode *mode, 352 410 enum hdmi_quantization_range rgb_quant_range); 353 - 354 - /** 355 - * drm_eld_mnl - Get ELD monitor name length in bytes. 356 - * @eld: pointer to an eld memory structure with mnl set 357 - */ 358 - static inline int drm_eld_mnl(const uint8_t *eld) 359 - { 360 - return (eld[DRM_ELD_CEA_EDID_VER_MNL] & DRM_ELD_MNL_MASK) >> DRM_ELD_MNL_SHIFT; 361 - } 362 - 363 - /** 364 - * drm_eld_sad - Get ELD SAD structures. 365 - * @eld: pointer to an eld memory structure with sad_count set 366 - */ 367 - static inline const uint8_t *drm_eld_sad(const uint8_t *eld) 368 - { 369 - unsigned int ver, mnl; 370 - 371 - ver = (eld[DRM_ELD_VER] & DRM_ELD_VER_MASK) >> DRM_ELD_VER_SHIFT; 372 - if (ver != 2 && ver != 31) 373 - return NULL; 374 - 375 - mnl = drm_eld_mnl(eld); 376 - if (mnl > 16) 377 - return NULL; 378 - 379 - return eld + DRM_ELD_CEA_SAD(mnl, 0); 380 - } 381 - 382 - /** 383 - * drm_eld_sad_count - Get ELD SAD count. 384 - * @eld: pointer to an eld memory structure with sad_count set 385 - */ 386 - static inline int drm_eld_sad_count(const uint8_t *eld) 387 - { 388 - return (eld[DRM_ELD_SAD_COUNT_CONN_TYPE] & DRM_ELD_SAD_COUNT_MASK) >> 389 - DRM_ELD_SAD_COUNT_SHIFT; 390 - } 391 - 392 - /** 393 - * drm_eld_calc_baseline_block_size - Calculate baseline block size in bytes 394 - * @eld: pointer to an eld memory structure with mnl and sad_count set 395 - * 396 - * This is a helper for determining the payload size of the baseline block, in 397 - * bytes, for e.g. 
setting the Baseline_ELD_Len field in the ELD header block. 398 - */ 399 - static inline int drm_eld_calc_baseline_block_size(const uint8_t *eld) 400 - { 401 - return DRM_ELD_MONITOR_NAME_STRING - DRM_ELD_HEADER_BLOCK_SIZE + 402 - drm_eld_mnl(eld) + drm_eld_sad_count(eld) * 3; 403 - } 404 - 405 - /** 406 - * drm_eld_size - Get ELD size in bytes 407 - * @eld: pointer to a complete eld memory structure 408 - * 409 - * The returned value does not include the vendor block. It's vendor specific, 410 - * and comprises of the remaining bytes in the ELD memory buffer after 411 - * drm_eld_size() bytes of header and baseline block. 412 - * 413 - * The returned value is guaranteed to be a multiple of 4. 414 - */ 415 - static inline int drm_eld_size(const uint8_t *eld) 416 - { 417 - return DRM_ELD_HEADER_BLOCK_SIZE + eld[DRM_ELD_BASELINE_ELD_LEN] * 4; 418 - } 419 - 420 - /** 421 - * drm_eld_get_spk_alloc - Get speaker allocation 422 - * @eld: pointer to an ELD memory structure 423 - * 424 - * The returned value is the speakers mask. User has to use %DRM_ELD_SPEAKER 425 - * field definitions to identify speakers. 426 - */ 427 - static inline u8 drm_eld_get_spk_alloc(const uint8_t *eld) 428 - { 429 - return eld[DRM_ELD_SPEAKER] & DRM_ELD_SPEAKER_MASK; 430 - } 431 - 432 - /** 433 - * drm_eld_get_conn_type - Get device type hdmi/dp connected 434 - * @eld: pointer to an ELD memory structure 435 - * 436 - * The caller need to use %DRM_ELD_CONN_TYPE_HDMI or %DRM_ELD_CONN_TYPE_DP to 437 - * identify the display type connected. 438 - */ 439 - static inline u8 drm_eld_get_conn_type(const uint8_t *eld) 440 - { 441 - return eld[DRM_ELD_SAD_COUNT_CONN_TYPE] & DRM_ELD_CONN_TYPE_MASK; 442 - } 443 411 444 412 /** 445 413 * drm_edid_decode_mfg_id - Decode the manufacturer ID

+164

include/drm/drm_eld.h

··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2023 Intel Corporation 4 + */ 5 + 6 + #ifndef __DRM_ELD_H__ 7 + #define __DRM_ELD_H__ 8 + 9 + #include <linux/types.h> 10 + 11 + struct cea_sad; 12 + 13 + /* ELD Header Block */ 14 + #define DRM_ELD_HEADER_BLOCK_SIZE 4 15 + 16 + #define DRM_ELD_VER 0 17 + # define DRM_ELD_VER_SHIFT 3 18 + # define DRM_ELD_VER_MASK (0x1f << 3) 19 + # define DRM_ELD_VER_CEA861D (2 << 3) /* supports 861D or below */ 20 + # define DRM_ELD_VER_CANNED (0x1f << 3) 21 + 22 + #define DRM_ELD_BASELINE_ELD_LEN 2 /* in dwords! */ 23 + 24 + /* ELD Baseline Block for ELD_Ver == 2 */ 25 + #define DRM_ELD_CEA_EDID_VER_MNL 4 26 + # define DRM_ELD_CEA_EDID_VER_SHIFT 5 27 + # define DRM_ELD_CEA_EDID_VER_MASK (7 << 5) 28 + # define DRM_ELD_CEA_EDID_VER_NONE (0 << 5) 29 + # define DRM_ELD_CEA_EDID_VER_CEA861 (1 << 5) 30 + # define DRM_ELD_CEA_EDID_VER_CEA861A (2 << 5) 31 + # define DRM_ELD_CEA_EDID_VER_CEA861BCD (3 << 5) 32 + # define DRM_ELD_MNL_SHIFT 0 33 + # define DRM_ELD_MNL_MASK (0x1f << 0) 34 + 35 + #define DRM_ELD_SAD_COUNT_CONN_TYPE 5 36 + # define DRM_ELD_SAD_COUNT_SHIFT 4 37 + # define DRM_ELD_SAD_COUNT_MASK (0xf << 4) 38 + # define DRM_ELD_CONN_TYPE_SHIFT 2 39 + # define DRM_ELD_CONN_TYPE_MASK (3 << 2) 40 + # define DRM_ELD_CONN_TYPE_HDMI (0 << 2) 41 + # define DRM_ELD_CONN_TYPE_DP (1 << 2) 42 + # define DRM_ELD_SUPPORTS_AI (1 << 1) 43 + # define DRM_ELD_SUPPORTS_HDCP (1 << 0) 44 + 45 + #define DRM_ELD_AUD_SYNCH_DELAY 6 /* in units of 2 ms */ 46 + # define DRM_ELD_AUD_SYNCH_DELAY_MAX 0xfa /* 500 ms */ 47 + 48 + #define DRM_ELD_SPEAKER 7 49 + # define DRM_ELD_SPEAKER_MASK 0x7f 50 + # define DRM_ELD_SPEAKER_RLRC (1 << 6) 51 + # define DRM_ELD_SPEAKER_FLRC (1 << 5) 52 + # define DRM_ELD_SPEAKER_RC (1 << 4) 53 + # define DRM_ELD_SPEAKER_RLR (1 << 3) 54 + # define DRM_ELD_SPEAKER_FC (1 << 2) 55 + # define DRM_ELD_SPEAKER_LFE (1 << 1) 56 + # define DRM_ELD_SPEAKER_FLR (1 << 0) 57 + 58 + #define DRM_ELD_PORT_ID 8 /* offsets 8..15 
inclusive */ 59 + # define DRM_ELD_PORT_ID_LEN 8 60 + 61 + #define DRM_ELD_MANUFACTURER_NAME0 16 62 + #define DRM_ELD_MANUFACTURER_NAME1 17 63 + 64 + #define DRM_ELD_PRODUCT_CODE0 18 65 + #define DRM_ELD_PRODUCT_CODE1 19 66 + 67 + #define DRM_ELD_MONITOR_NAME_STRING 20 /* offsets 20..(20+mnl-1) inclusive */ 68 + 69 + #define DRM_ELD_CEA_SAD(mnl, sad) (20 + (mnl) + 3 * (sad)) 70 + 71 + /** 72 + * drm_eld_mnl - Get ELD monitor name length in bytes. 73 + * @eld: pointer to an eld memory structure with mnl set 74 + */ 75 + static inline int drm_eld_mnl(const u8 *eld) 76 + { 77 + return (eld[DRM_ELD_CEA_EDID_VER_MNL] & DRM_ELD_MNL_MASK) >> DRM_ELD_MNL_SHIFT; 78 + } 79 + 80 + int drm_eld_sad_get(const u8 *eld, int sad_index, struct cea_sad *cta_sad); 81 + int drm_eld_sad_set(u8 *eld, int sad_index, const struct cea_sad *cta_sad); 82 + 83 + /** 84 + * drm_eld_sad - Get ELD SAD structures. 85 + * @eld: pointer to an eld memory structure with sad_count set 86 + */ 87 + static inline const u8 *drm_eld_sad(const u8 *eld) 88 + { 89 + unsigned int ver, mnl; 90 + 91 + ver = (eld[DRM_ELD_VER] & DRM_ELD_VER_MASK) >> DRM_ELD_VER_SHIFT; 92 + if (ver != 2 && ver != 31) 93 + return NULL; 94 + 95 + mnl = drm_eld_mnl(eld); 96 + if (mnl > 16) 97 + return NULL; 98 + 99 + return eld + DRM_ELD_CEA_SAD(mnl, 0); 100 + } 101 + 102 + /** 103 + * drm_eld_sad_count - Get ELD SAD count. 104 + * @eld: pointer to an eld memory structure with sad_count set 105 + */ 106 + static inline int drm_eld_sad_count(const u8 *eld) 107 + { 108 + return (eld[DRM_ELD_SAD_COUNT_CONN_TYPE] & DRM_ELD_SAD_COUNT_MASK) >> 109 + DRM_ELD_SAD_COUNT_SHIFT; 110 + } 111 + 112 + /** 113 + * drm_eld_calc_baseline_block_size - Calculate baseline block size in bytes 114 + * @eld: pointer to an eld memory structure with mnl and sad_count set 115 + * 116 + * This is a helper for determining the payload size of the baseline block, in 117 + * bytes, for e.g. setting the Baseline_ELD_Len field in the ELD header block. 
118 + */ 119 + static inline int drm_eld_calc_baseline_block_size(const u8 *eld) 120 + { 121 + return DRM_ELD_MONITOR_NAME_STRING - DRM_ELD_HEADER_BLOCK_SIZE + 122 + drm_eld_mnl(eld) + drm_eld_sad_count(eld) * 3; 123 + } 124 + 125 + /** 126 + * drm_eld_size - Get ELD size in bytes 127 + * @eld: pointer to a complete eld memory structure 128 + * 129 + * The returned value does not include the vendor block. It's vendor specific, 130 + * and comprises of the remaining bytes in the ELD memory buffer after 131 + * drm_eld_size() bytes of header and baseline block. 132 + * 133 + * The returned value is guaranteed to be a multiple of 4. 134 + */ 135 + static inline int drm_eld_size(const u8 *eld) 136 + { 137 + return DRM_ELD_HEADER_BLOCK_SIZE + eld[DRM_ELD_BASELINE_ELD_LEN] * 4; 138 + } 139 + 140 + /** 141 + * drm_eld_get_spk_alloc - Get speaker allocation 142 + * @eld: pointer to an ELD memory structure 143 + * 144 + * The returned value is the speakers mask. User has to use %DRM_ELD_SPEAKER 145 + * field definitions to identify speakers. 146 + */ 147 + static inline u8 drm_eld_get_spk_alloc(const u8 *eld) 148 + { 149 + return eld[DRM_ELD_SPEAKER] & DRM_ELD_SPEAKER_MASK; 150 + } 151 + 152 + /** 153 + * drm_eld_get_conn_type - Get device type hdmi/dp connected 154 + * @eld: pointer to an ELD memory structure 155 + * 156 + * The caller need to use %DRM_ELD_CONN_TYPE_HDMI or %DRM_ELD_CONN_TYPE_DP to 157 + * identify the display type connected. 158 + */ 159 + static inline u8 drm_eld_get_conn_type(const u8 *eld) 160 + { 161 + return eld[DRM_ELD_SAD_COUNT_CONN_TYPE] & DRM_ELD_CONN_TYPE_MASK; 162 + } 163 + 164 + #endif /* __DRM_ELD_H__ */
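The size helpers in drm_eld.h are easier to follow with concrete numbers. The sketch below copies the relevant offsets and masks from the header and reimplements the arithmetic in user space, so the function names (`eld_mnl`, `eld_sad_count`, `eld_baseline_bytes`, `eld_size`) are local to this example. `eld_set_baseline_len` is a hypothetical helper; rounding the byte count up to whole dwords is an assumption based on the "in dwords" note next to `DRM_ELD_BASELINE_ELD_LEN`.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and masks copied from the header above. */
#define DRM_ELD_HEADER_BLOCK_SIZE	4
#define DRM_ELD_BASELINE_ELD_LEN	2	/* in dwords! */
#define DRM_ELD_CEA_EDID_VER_MNL	4
#define DRM_ELD_MNL_MASK		(0x1f << 0)
#define DRM_ELD_SAD_COUNT_CONN_TYPE	5
#define DRM_ELD_SAD_COUNT_SHIFT		4
#define DRM_ELD_SAD_COUNT_MASK		(0xf << 4)
#define DRM_ELD_MONITOR_NAME_STRING	20

/* Monitor name length in bytes (low 5 bits of byte 4). */
static int eld_mnl(const uint8_t *eld)
{
	return eld[DRM_ELD_CEA_EDID_VER_MNL] & DRM_ELD_MNL_MASK;
}

/* Number of 3-byte Short Audio Descriptors (high nibble of byte 5). */
static int eld_sad_count(const uint8_t *eld)
{
	return (eld[DRM_ELD_SAD_COUNT_CONN_TYPE] & DRM_ELD_SAD_COUNT_MASK) >>
		DRM_ELD_SAD_COUNT_SHIFT;
}

/* Same arithmetic as drm_eld_calc_baseline_block_size(). */
static int eld_baseline_bytes(const uint8_t *eld)
{
	return DRM_ELD_MONITOR_NAME_STRING - DRM_ELD_HEADER_BLOCK_SIZE +
		eld_mnl(eld) + eld_sad_count(eld) * 3;
}

/* Hypothetical helper: store the baseline length, rounded up to dwords. */
static void eld_set_baseline_len(uint8_t *eld)
{
	int bytes = eld_baseline_bytes(eld);

	assert(bytes >= 0);
	eld[DRM_ELD_BASELINE_ELD_LEN] = (uint8_t)((bytes + 3) / 4);
}

/* Same arithmetic as drm_eld_size(): header plus baseline block, in bytes. */
static int eld_size(const uint8_t *eld)
{
	return DRM_ELD_HEADER_BLOCK_SIZE + eld[DRM_ELD_BASELINE_ELD_LEN] * 4;
}
```

For a 5-byte monitor name and 2 SADs, the baseline block is 16 + 5 + 6 = 27 bytes, which rounds up to 7 dwords and gives an ELD size of 4 + 28 = 32 bytes. That is consistent with the header's statement that the size is always a multiple of 4.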

+3 -17

include/drm/drm_flip_work.h

···
 /**
  * DOC: flip utils
  *
- * Util to queue up work to run from work-queue context after flip/vblank.
+ * Utility to queue up work to run from work-queue context after flip/vblank.
  * Typically this can be used to defer unref of framebuffer's, cursor
- * bo's, etc until after vblank. The APIs are all thread-safe.
- * Moreover, drm_flip_work_queue_task and drm_flip_work_queue can be called
- * in atomic context.
+ * bo's, etc until after vblank. The APIs are all thread-safe. Moreover,
+ * drm_flip_work_commit() can be called in atomic context.
  */

 struct drm_flip_work;
···
  * drm_flip_work_commit() is called.
  */
 typedef void (*drm_flip_func_t)(struct drm_flip_work *work, void *val);
-
-/**
- * struct drm_flip_task - flip work task
- * @node: list entry element
- * @data: data to pass to &drm_flip_work.func
- */
-struct drm_flip_task {
-	struct list_head node;
-	void *data;
-};

 /**
  * struct drm_flip_work - flip work queue
···
 	spinlock_t lock;
 };

-struct drm_flip_task *drm_flip_work_allocate_task(void *data, gfp_t flags);
-void drm_flip_work_queue_task(struct drm_flip_work *work,
-			      struct drm_flip_task *task);
 void drm_flip_work_queue(struct drm_flip_work *work, void *val);
 void drm_flip_work_commit(struct drm_flip_work *work,
 			  struct workqueue_struct *wq);

+68 -13

include/drm/drm_format_helper.h

··· 15 15 16 16 struct iosys_map; 17 17 18 + /** 19 + * struct drm_format_conv_state - Stores format-conversion state 20 + * 21 + * DRM helpers for format conversion store temporary state in 22 + * struct drm_xfrm_buf. The buffer's resources can be reused 23 + * among multiple conversion operations. 24 + * 25 + * All fields are considered private. 26 + */ 27 + struct drm_format_conv_state { 28 + struct { 29 + void *mem; 30 + size_t size; 31 + bool preallocated; 32 + } tmp; 33 + }; 34 + 35 + #define __DRM_FORMAT_CONV_STATE_INIT(_mem, _size, _preallocated) { \ 36 + .tmp = { \ 37 + .mem = (_mem), \ 38 + .size = (_size), \ 39 + .preallocated = (_preallocated), \ 40 + } \ 41 + } 42 + 43 + /** 44 + * DRM_FORMAT_CONV_STATE_INIT - Initializer for struct drm_format_conv_state 45 + * 46 + * Initializes an instance of struct drm_format_conv_state to default values. 47 + */ 48 + #define DRM_FORMAT_CONV_STATE_INIT \ 49 + __DRM_FORMAT_CONV_STATE_INIT(NULL, 0, false) 50 + 51 + /** 52 + * DRM_FORMAT_CONV_STATE_INIT_PREALLOCATED - Initializer for struct drm_format_conv_state 53 + * @_mem: The preallocated memory area 54 + * @_size: The number of bytes in _mem 55 + * 56 + * Initializes an instance of struct drm_format_conv_state to preallocated 57 + * storage. The caller is responsible for releasing the provided memory range. 
58 + */ 59 + #define DRM_FORMAT_CONV_STATE_INIT_PREALLOCATED(_mem, _size) \ 60 + __DRM_FORMAT_CONV_STATE_INIT(_mem, _size, true) 61 + 62 + void drm_format_conv_state_init(struct drm_format_conv_state *state); 63 + void drm_format_conv_state_copy(struct drm_format_conv_state *state, 64 + const struct drm_format_conv_state *old_state); 65 + void *drm_format_conv_state_reserve(struct drm_format_conv_state *state, 66 + size_t new_size, gfp_t flags); 67 + void drm_format_conv_state_release(struct drm_format_conv_state *state); 68 + 18 69 unsigned int drm_fb_clip_offset(unsigned int pitch, const struct drm_format_info *format, 19 70 const struct drm_rect *clip); 20 71 ··· 74 23 const struct drm_rect *clip); 75 24 void drm_fb_swab(struct iosys_map *dst, const unsigned int *dst_pitch, 76 25 const struct iosys_map *src, const struct drm_framebuffer *fb, 77 - const struct drm_rect *clip, bool cached); 26 + const struct drm_rect *clip, bool cached, 27 + struct drm_format_conv_state *state); 78 28 void drm_fb_xrgb8888_to_rgb332(struct iosys_map *dst, const unsigned int *dst_pitch, 79 29 const struct iosys_map *src, const struct drm_framebuffer *fb, 80 - const struct drm_rect *clip); 30 + const struct drm_rect *clip, struct drm_format_conv_state *state); 81 31 void drm_fb_xrgb8888_to_rgb565(struct iosys_map *dst, const unsigned int *dst_pitch, 82 32 const struct iosys_map *src, const struct drm_framebuffer *fb, 83 - const struct drm_rect *clip, bool swab); 33 + const struct drm_rect *clip, struct drm_format_conv_state *state, 34 + bool swab); 84 35 void drm_fb_xrgb8888_to_xrgb1555(struct iosys_map *dst, const unsigned int *dst_pitch, 85 36 const struct iosys_map *src, const struct drm_framebuffer *fb, 86 - const struct drm_rect *clip); 37 + const struct drm_rect *clip, struct drm_format_conv_state *state); 87 38 void drm_fb_xrgb8888_to_argb1555(struct iosys_map *dst, const unsigned int *dst_pitch, 88 39 const struct iosys_map *src, const struct drm_framebuffer *fb, 89 - const 
struct drm_rect *clip); 40 + const struct drm_rect *clip, struct drm_format_conv_state *state); 90 41 void drm_fb_xrgb8888_to_rgba5551(struct iosys_map *dst, const unsigned int *dst_pitch, 91 42 const struct iosys_map *src, const struct drm_framebuffer *fb, 92 - const struct drm_rect *clip); 43 + const struct drm_rect *clip, struct drm_format_conv_state *state); 93 44 void drm_fb_xrgb8888_to_rgb888(struct iosys_map *dst, const unsigned int *dst_pitch, 94 45 const struct iosys_map *src, const struct drm_framebuffer *fb, 95 - const struct drm_rect *clip); 46 + const struct drm_rect *clip, struct drm_format_conv_state *state); 96 47 void drm_fb_xrgb8888_to_argb8888(struct iosys_map *dst, const unsigned int *dst_pitch, 97 48 const struct iosys_map *src, const struct drm_framebuffer *fb, 98 - const struct drm_rect *clip); 49 + const struct drm_rect *clip, struct drm_format_conv_state *state); 99 50 void drm_fb_xrgb8888_to_xrgb2101010(struct iosys_map *dst, const unsigned int *dst_pitch, 100 51 const struct iosys_map *src, const struct drm_framebuffer *fb, 101 - const struct drm_rect *clip); 52 + const struct drm_rect *clip, 53 + struct drm_format_conv_state *state); 102 54 void drm_fb_xrgb8888_to_argb2101010(struct iosys_map *dst, const unsigned int *dst_pitch, 103 55 const struct iosys_map *src, const struct drm_framebuffer *fb, 104 - const struct drm_rect *clip); 56 + const struct drm_rect *clip, 57 + struct drm_format_conv_state *state); 105 58 void drm_fb_xrgb8888_to_gray8(struct iosys_map *dst, const unsigned int *dst_pitch, 106 59 const struct iosys_map *src, const struct drm_framebuffer *fb, 107 - const struct drm_rect *clip); 60 + const struct drm_rect *clip, struct drm_format_conv_state *state); 108 61 109 62 int drm_fb_blit(struct iosys_map *dst, const unsigned int *dst_pitch, uint32_t dst_format, 110 63 const struct iosys_map *src, const struct drm_framebuffer *fb, 111 - const struct drm_rect *rect); 64 + const struct drm_rect *clip, struct 
drm_format_conv_state *state); 112 65 113 66 void drm_fb_xrgb8888_to_mono(struct iosys_map *dst, const unsigned int *dst_pitch, 114 67 const struct iosys_map *src, const struct drm_framebuffer *fb, 115 - const struct drm_rect *clip); 68 + const struct drm_rect *clip, struct drm_format_conv_state *state); 116 69 117 70 size_t drm_fb_build_fourcc_list(struct drm_device *dev, 118 71 const u32 *native_fourccs, size_t native_nfourccs,
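The header only declares `drm_format_conv_state_reserve()` and `drm_format_conv_state_release()`, so the following is a guess at the intended semantics rather than the kernel implementation: the temporary buffer grows on demand and is reused across conversions, while a preallocated buffer is never resized or freed by the helper. It is a user-space model with hypothetical names (`conv_state`, `conv_state_reserve`, `conv_state_release`) and `realloc()`/`free()` in place of the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mirror of the private `tmp` member of drm_format_conv_state. */
struct conv_state {
	void *mem;
	size_t size;
	bool preallocated;
};

/*
 * Return a buffer of at least new_size bytes, reusing the current one when it
 * is already large enough. Caller-provided storage cannot grow, so a request
 * beyond its size yields NULL (assumed behaviour).
 */
static void *conv_state_reserve(struct conv_state *s, size_t new_size)
{
	void *mem;

	assert(s != NULL);

	if (new_size <= s->size)
		return s->mem;
	if (s->preallocated)
		return NULL;

	mem = realloc(s->mem, new_size);
	if (!mem)
		return NULL;

	s->mem = mem;
	s->size = new_size;
	return mem;
}

/* Free helper-owned memory; preallocated storage stays with the caller. */
static void conv_state_release(struct conv_state *s)
{
	assert(s != NULL);

	if (s->preallocated)
		return;

	free(s->mem);
	s->mem = NULL;
	s->size = 0;
}
```

Under this reading, threading one state through every `drm_fb_*()` call lets repeated conversions on the same plane avoid a fresh allocation per blit, which matches the header's remark that the buffer's resources can be reused among multiple conversion operations.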

+16 -16

include/drm/drm_gem.h

···
  * drm_gem_gpuva_init() - initialize the gpuva list of a GEM object
  * @obj: the &drm_gem_object
  *
- * This initializes the &drm_gem_object's &drm_gpuva list.
+ * This initializes the &drm_gem_object's &drm_gpuvm_bo list.
  *
  * Calling this function is only necessary for drivers intending to support the
  * &drm_driver_feature DRIVER_GEM_GPUVA.
···
 }

 /**
- * drm_gem_for_each_gpuva() - iternator to walk over a list of gpuvas
- * @entry__: &drm_gpuva structure to assign to in each iteration step
- * @obj__: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ * drm_gem_for_each_gpuvm_bo() - iterator to walk over a list of &drm_gpuvm_bo
+ * @entry__: &drm_gpuvm_bo structure to assign to in each iteration step
+ * @obj__: the &drm_gem_object the &drm_gpuvm_bo to walk are associated with
  *
- * This iterator walks over all &drm_gpuva structures associated with the
- * &drm_gpuva_manager.
+ * This iterator walks over all &drm_gpuvm_bo structures associated with the
+ * &drm_gem_object.
  */
-#define drm_gem_for_each_gpuva(entry__, obj__) \
-	list_for_each_entry(entry__, &(obj__)->gpuva.list, gem.entry)
+#define drm_gem_for_each_gpuvm_bo(entry__, obj__) \
+	list_for_each_entry(entry__, &(obj__)->gpuva.list, list.entry.gem)

 /**
- * drm_gem_for_each_gpuva_safe() - iternator to safely walk over a list of
- * gpuvas
- * @entry__: &drm_gpuva structure to assign to in each iteration step
- * @next__: &next &drm_gpuva to store the next step
- * @obj__: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ * drm_gem_for_each_gpuvm_bo_safe() - iterator to safely walk over a list of
+ * &drm_gpuvm_bo
+ * @entry__: &drm_gpuvm_bostructure to assign to in each iteration step
+ * @next__: &next &drm_gpuvm_bo to store the next step
+ * @obj__: the &drm_gem_object the &drm_gpuvm_bo to walk are associated with
  *
- * This iterator walks over all &drm_gpuva structures associated with the
+ * This iterator walks over all &drm_gpuvm_bo structures associated with the
  * &drm_gem_object. It is implemented with list_for_each_entry_safe(), hence
  * it is save against removal of elements.
  */
-#define drm_gem_for_each_gpuva_safe(entry__, next__, obj__) \
-	list_for_each_entry_safe(entry__, next__, &(obj__)->gpuva.list, gem.entry)
+#define drm_gem_for_each_gpuvm_bo_safe(entry__, next__, obj__) \
+	list_for_each_entry_safe(entry__, next__, &(obj__)->gpuva.list, list.entry.gem)

 #endif /* __DRM_GEM_H__ */
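The renamed iterators walk the GEM object's list through a link that now sits two levels deep in the entry (`list.entry.gem`) instead of `gem.entry`. The following is a minimal userspace sketch of that pattern, not the kernel code: `struct vm_bo`, `struct gem_object`, the list helpers and both `gem_for_each_vm_bo*` macros are hypothetical stand-ins written for illustration. It shows that `offsetof()` accepts a nested member path and why the `_safe` variant is needed when entries are unlinked during the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	list_init(node);
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical analogue of struct drm_gpuvm_bo: the link is nested. */
struct vm_bo {
	int id;
	struct {
		struct {
			struct list_head gem;
		} entry;
	} list;
};

/* Hypothetical analogue of the GEM object's gpuva.list head. */
struct gem_object {
	struct list_head vm_bo_list;
};

#define vm_bo_from_link(ptr) container_of(ptr, struct vm_bo, list.entry.gem)

/* Plain walk: must not unlink the current entry inside the loop body. */
#define gem_for_each_vm_bo(pos, obj) \
	for ((pos) = vm_bo_from_link((obj)->vm_bo_list.next); \
	     &(pos)->list.entry.gem != &(obj)->vm_bo_list; \
	     (pos) = vm_bo_from_link((pos)->list.entry.gem.next))

/* Safe walk: the successor is cached before the loop body runs. */
#define gem_for_each_vm_bo_safe(pos, n, obj) \
	for ((pos) = vm_bo_from_link((obj)->vm_bo_list.next), \
	     (n) = vm_bo_from_link((pos)->list.entry.gem.next); \
	     &(pos)->list.entry.gem != &(obj)->vm_bo_list; \
	     (pos) = (n), (n) = vm_bo_from_link((n)->list.entry.gem.next))

/* Sum the ids of three entries, unlink them all, then count what is left. */
int demo_sum_then_clear(int *left_after_clear)
{
	struct gem_object obj;
	struct vm_bo bos[3] = { { .id = 1 }, { .id = 2 }, { .id = 3 } };
	struct vm_bo *pos, *next;
	int sum = 0, left = 0;

	list_init(&obj.vm_bo_list);
	for (int i = 0; i < 3; i++)
		list_add_tail(&bos[i].list.entry.gem, &obj.vm_bo_list);

	gem_for_each_vm_bo(pos, &obj)
		sum += pos->id;

	gem_for_each_vm_bo_safe(pos, next, &obj)
		list_del(&pos->list.entry.gem);

	gem_for_each_vm_bo(pos, &obj)
		left++;

	*left_after_clear = left;
	return sum;
}
```

In the kernel the caller is additionally expected to hold the GEM's gpuva lock while iterating; the sketch leaves locking out.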

+10

include/drm/drm_gem_atomic_helper.h

···

 #include <linux/iosys-map.h>

+#include <drm/drm_format_helper.h>
 #include <drm/drm_fourcc.h>
 #include <drm/drm_plane.h>

···
 struct drm_shadow_plane_state {
 	/** @base: plane state */
 	struct drm_plane_state base;
+
+	/**
+	 * @fmtcnv_state: Format-conversion state
+	 *
+	 * Per-plane state for format conversion.
+	 * Flags for copying shadow buffers into backend storage. Also holds
+	 * temporary storage for format conversion.
+	 */
+	struct drm_format_conv_state fmtcnv_state;

 	/* Transitional state - do not export or duplicate */

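The new `fmtcnv_state` field is described as holding temporary storage for format conversion, which lets repeated conversions reuse one scratch buffer instead of allocating on every update. A minimal userspace sketch of that idea follows. It is only a model under assumptions: `struct conv_state`, `conv_state_reserve()` and `conv_state_release()` are hypothetical names, and the layout of the real `struct drm_format_conv_state` is not shown in this diff.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-plane scratch state; not the kernel structure. */
struct conv_state {
	void *buf;	/* scratch memory, NULL until first use */
	size_t size;	/* current capacity of buf in bytes */
};

/*
 * Return scratch memory of at least 'need' bytes. The existing buffer is
 * reused when it is large enough; otherwise it is replaced by a larger one.
 * Old contents are not preserved, which is fine for scratch space.
 */
void *conv_state_reserve(struct conv_state *st, size_t need)
{
	void *mem;

	if (need <= st->size)
		return st->buf;

	mem = malloc(need);
	if (!mem)
		return NULL;	/* caller keeps the old, smaller buffer */

	free(st->buf);
	st->buf = mem;
	st->size = need;
	return mem;
}

/* Drop the scratch memory, e.g. when the plane state is destroyed. */
void conv_state_release(struct conv_state *st)
{
	free(st->buf);
	st->buf = NULL;
	st->size = 0;
}

/* Returns 1 if the buffer was reused for a smaller request and then grown. */
int demo_scratch_reuse(void)
{
	struct conv_state st = { 0 };
	void *a = conv_state_reserve(&st, 64);
	void *b = conv_state_reserve(&st, 32);	/* fits: same buffer */
	int reused = (a != NULL && a == b);
	void *c = conv_state_reserve(&st, 128);	/* too small: reallocated */
	int grown = (c != NULL && st.size == 128);

	conv_state_release(&st);
	return reused && grown && st.buf == NULL;
}
```

Keeping the buffer in per-plane state rather than in a global means concurrent updates of different planes do not contend for it.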

+516 -5

include/drm/drm_gpuvm.h

··· 25 25 * OTHER DEALINGS IN THE SOFTWARE. 26 26 */ 27 27 28 + #include <linux/dma-resv.h> 28 29 #include <linux/list.h> 29 30 #include <linux/rbtree.h> 30 31 #include <linux/types.h> 31 32 33 + #include <drm/drm_device.h> 32 34 #include <drm/drm_gem.h> 35 + #include <drm/drm_exec.h> 33 36 34 37 struct drm_gpuvm; 38 + struct drm_gpuvm_bo; 35 39 struct drm_gpuvm_ops; 36 40 37 41 /** ··· 77 73 struct drm_gpuvm *vm; 78 74 79 75 /** 76 + * @vm_bo: the &drm_gpuvm_bo abstraction for the mapped 77 + * &drm_gem_object 78 + */ 79 + struct drm_gpuvm_bo *vm_bo; 80 + 81 + /** 80 82 * @flags: the &drm_gpuva_flags for this mapping 81 83 */ 82 84 enum drm_gpuva_flags flags; ··· 117 107 struct drm_gem_object *obj; 118 108 119 109 /** 120 - * @entry: the &list_head to attach this object to a &drm_gem_object 110 + * @entry: the &list_head to attach this object to a &drm_gpuvm_bo 121 111 */ 122 112 struct list_head entry; 123 113 } gem; ··· 150 140 int drm_gpuva_insert(struct drm_gpuvm *gpuvm, struct drm_gpuva *va); 151 141 void drm_gpuva_remove(struct drm_gpuva *va); 152 142 153 - void drm_gpuva_link(struct drm_gpuva *va); 143 + void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuvm_bo *vm_bo); 154 144 void drm_gpuva_unlink(struct drm_gpuva *va); 155 145 156 146 struct drm_gpuva *drm_gpuva_find(struct drm_gpuvm *gpuvm, ··· 194 184 } 195 185 196 186 /** 187 + * enum drm_gpuvm_flags - flags for struct drm_gpuvm 188 + */ 189 + enum drm_gpuvm_flags { 190 + /** 191 + * @DRM_GPUVM_RESV_PROTECTED: GPUVM is protected externally by the 192 + * GPUVM's &dma_resv lock 193 + */ 194 + DRM_GPUVM_RESV_PROTECTED = BIT(0), 195 + 196 + /** 197 + * @DRM_GPUVM_USERBITS: user defined bits 198 + */ 199 + DRM_GPUVM_USERBITS = BIT(1), 200 + }; 201 + 202 + /** 197 203 * struct drm_gpuvm - DRM GPU VA Manager 198 204 * 199 205 * The DRM GPU VA Manager keeps track of a GPU's virtual address space by using ··· 226 200 * @name: the name of the DRM GPU VA space 227 201 */ 228 202 const char *name; 203 + 204 
+ /** 205 + * @flags: the &drm_gpuvm_flags of this GPUVM 206 + */ 207 + enum drm_gpuvm_flags flags; 208 + 209 + /** 210 + * @drm: the &drm_device this VM lives in 211 + */ 212 + struct drm_device *drm; 229 213 230 214 /** 231 215 * @mm_start: start of the VA space ··· 263 227 } rb; 264 228 265 229 /** 230 + * @kref: reference count of this object 231 + */ 232 + struct kref kref; 233 + 234 + /** 266 235 * @kernel_alloc_node: 267 236 * 268 237 * &drm_gpuva representing the address space cutout reserved for ··· 279 238 * @ops: &drm_gpuvm_ops providing the split/merge steps to drivers 280 239 */ 281 240 const struct drm_gpuvm_ops *ops; 241 + 242 + /** 243 + * @r_obj: Resv GEM object; representing the GPUVM's common &dma_resv. 244 + */ 245 + struct drm_gem_object *r_obj; 246 + 247 + /** 248 + * @extobj: structure holding the extobj list 249 + */ 250 + struct { 251 + /** 252 + * @list: &list_head storing &drm_gpuvm_bos serving as 253 + * external object 254 + */ 255 + struct list_head list; 256 + 257 + /** 258 + * @local_list: pointer to the local list temporarily storing 259 + * entries from the external object list 260 + */ 261 + struct list_head *local_list; 262 + 263 + /** 264 + * @lock: spinlock to protect the extobj list 265 + */ 266 + spinlock_t lock; 267 + } extobj; 268 + 269 + /** 270 + * @evict: structure holding the evict list and evict list lock 271 + */ 272 + struct { 273 + /** 274 + * @list: &list_head storing &drm_gpuvm_bos currently being 275 + * evicted 276 + */ 277 + struct list_head list; 278 + 279 + /** 280 + * @local_list: pointer to the local list temporarily storing 281 + * entries from the evicted object list 282 + */ 283 + struct list_head *local_list; 284 + 285 + /** 286 + * @lock: spinlock to protect the evict list 287 + */ 288 + spinlock_t lock; 289 + } evict; 282 290 }; 283 291 284 292 void drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, 293 + enum drm_gpuvm_flags flags, 294 + struct drm_device *drm, 295 + struct drm_gem_object 
*r_obj, 285 296 u64 start_offset, u64 range, 286 297 u64 reserve_offset, u64 reserve_range, 287 298 const struct drm_gpuvm_ops *ops); 288 - void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm); 289 299 300 + /** 301 + * drm_gpuvm_get() - acquire a struct drm_gpuvm reference 302 + * @gpuvm: the &drm_gpuvm to acquire the reference of 303 + * 304 + * This function acquires an additional reference to @gpuvm. It is illegal to 305 + * call this without already holding a reference. No locks required. 306 + */ 307 + static inline struct drm_gpuvm * 308 + drm_gpuvm_get(struct drm_gpuvm *gpuvm) 309 + { 310 + kref_get(&gpuvm->kref); 311 + 312 + return gpuvm; 313 + } 314 + 315 + void drm_gpuvm_put(struct drm_gpuvm *gpuvm); 316 + 317 + bool drm_gpuvm_range_valid(struct drm_gpuvm *gpuvm, u64 addr, u64 range); 290 318 bool drm_gpuvm_interval_empty(struct drm_gpuvm *gpuvm, u64 addr, u64 range); 319 + 320 + struct drm_gem_object * 321 + drm_gpuvm_resv_object_alloc(struct drm_device *drm); 322 + 323 + /** 324 + * drm_gpuvm_resv_protected() - indicates whether &DRM_GPUVM_RESV_PROTECTED is 325 + * set 326 + * @gpuvm: the &drm_gpuvm 327 + * 328 + * Returns: true if &DRM_GPUVM_RESV_PROTECTED is set, false otherwise. 
329 + */ 330 + static inline bool 331 + drm_gpuvm_resv_protected(struct drm_gpuvm *gpuvm) 332 + { 333 + return gpuvm->flags & DRM_GPUVM_RESV_PROTECTED; 334 + } 335 + 336 + /** 337 + * drm_gpuvm_resv() - returns the &drm_gpuvm's &dma_resv 338 + * @gpuvm__: the &drm_gpuvm 339 + * 340 + * Returns: a pointer to the &drm_gpuvm's shared &dma_resv 341 + */ 342 + #define drm_gpuvm_resv(gpuvm__) ((gpuvm__)->r_obj->resv) 343 + 344 + /** 345 + * drm_gpuvm_resv_obj() - returns the &drm_gem_object holding the &drm_gpuvm's 346 + * &dma_resv 347 + * @gpuvm__: the &drm_gpuvm 348 + * 349 + * Returns: a pointer to the &drm_gem_object holding the &drm_gpuvm's shared 350 + * &dma_resv 351 + */ 352 + #define drm_gpuvm_resv_obj(gpuvm__) ((gpuvm__)->r_obj) 353 + 354 + #define drm_gpuvm_resv_held(gpuvm__) \ 355 + dma_resv_held(drm_gpuvm_resv(gpuvm__)) 356 + 357 + #define drm_gpuvm_resv_assert_held(gpuvm__) \ 358 + dma_resv_assert_held(drm_gpuvm_resv(gpuvm__)) 359 + 360 + #define drm_gpuvm_resv_held(gpuvm__) \ 361 + dma_resv_held(drm_gpuvm_resv(gpuvm__)) 362 + 363 + #define drm_gpuvm_resv_assert_held(gpuvm__) \ 364 + dma_resv_assert_held(drm_gpuvm_resv(gpuvm__)) 365 + 366 + /** 367 + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an 368 + * external object 369 + * @gpuvm: the &drm_gpuvm to check 370 + * @obj: the &drm_gem_object to check 371 + * 372 + * Returns: true if the &drm_gem_object &dma_resv differs from the 373 + * &drm_gpuvms &dma_resv, false otherwise 374 + */ 375 + static inline bool 376 + drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm, 377 + struct drm_gem_object *obj) 378 + { 379 + return obj && obj->resv != drm_gpuvm_resv(gpuvm); 380 + } 291 381 292 382 static inline struct drm_gpuva * 293 383 __drm_gpuva_next(struct drm_gpuva *va) ··· 497 325 */ 498 326 #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \ 499 327 list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry) 328 + 329 + /** 330 + * struct drm_gpuvm_exec - &drm_gpuvm 
abstraction of &drm_exec 331 + * 332 + * This structure should be created on the stack as &drm_exec should be. 333 + * 334 + * Optionally, @extra can be set in order to lock additional &drm_gem_objects. 335 + */ 336 + struct drm_gpuvm_exec { 337 + /** 338 + * @exec: the &drm_exec structure 339 + */ 340 + struct drm_exec exec; 341 + 342 + /** 343 + * @flags: the flags for the struct drm_exec 344 + */ 345 + uint32_t flags; 346 + 347 + /** 348 + * @vm: the &drm_gpuvm to lock its DMA reservations 349 + */ 350 + struct drm_gpuvm *vm; 351 + 352 + /** 353 + * @num_fences: the number of fences to reserve for the &dma_resv of the 354 + * locked &drm_gem_objects 355 + */ 356 + unsigned int num_fences; 357 + 358 + /** 359 + * @extra: Callback and corresponding private data for the driver to 360 + * lock arbitrary additional &drm_gem_objects. 361 + */ 362 + struct { 363 + /** 364 + * @fn: The driver callback to lock additional &drm_gem_objects. 365 + */ 366 + int (*fn)(struct drm_gpuvm_exec *vm_exec); 367 + 368 + /** 369 + * @priv: driver private data for the @fn callback 370 + */ 371 + void *priv; 372 + } extra; 373 + }; 374 + 375 + /** 376 + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv 377 + * @gpuvm: the &drm_gpuvm 378 + * @exec: the &drm_exec context 379 + * @num_fences: the amount of &dma_fences to reserve 380 + * 381 + * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object. 382 + * 383 + * Using this function directly, it is the drivers responsibility to call 384 + * drm_exec_init() and drm_exec_fini() accordingly. 385 + * 386 + * Returns: 0 on success, negative error code on failure. 
387 + */ 388 + static inline int 389 + drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, 390 + struct drm_exec *exec, 391 + unsigned int num_fences) 392 + { 393 + return drm_exec_prepare_obj(exec, gpuvm->r_obj, num_fences); 394 + } 395 + 396 + int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, 397 + struct drm_exec *exec, 398 + unsigned int num_fences); 399 + 400 + int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, 401 + struct drm_exec *exec, 402 + u64 addr, u64 range, 403 + unsigned int num_fences); 404 + 405 + int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec); 406 + 407 + int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec, 408 + struct drm_gem_object **objs, 409 + unsigned int num_objs); 410 + 411 + int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec, 412 + u64 addr, u64 range); 413 + 414 + /** 415 + * drm_gpuvm_exec_unlock() - lock all dma-resv of all assoiciated BOs 416 + * @vm_exec: the &drm_gpuvm_exec wrapper 417 + * 418 + * Releases all dma-resv locks of all &drm_gem_objects previously acquired 419 + * through drm_gpuvm_exec_lock() or its variants. 420 + * 421 + * Returns: 0 on success, negative error code on failure. 422 + */ 423 + static inline void 424 + drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec) 425 + { 426 + drm_exec_fini(&vm_exec->exec); 427 + } 428 + 429 + int drm_gpuvm_validate(struct drm_gpuvm *gpuvm, struct drm_exec *exec); 430 + void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm, 431 + struct drm_exec *exec, 432 + struct dma_fence *fence, 433 + enum dma_resv_usage private_usage, 434 + enum dma_resv_usage extobj_usage); 435 + 436 + /** 437 + * drm_gpuvm_exec_resv_add_fence() 438 + * @vm_exec: the &drm_gpuvm_exec wrapper 439 + * @fence: fence to add 440 + * @private_usage: private dma-resv usage 441 + * @extobj_usage: extobj dma-resv usage 442 + * 443 + * See drm_gpuvm_resv_add_fence(). 
444 + */ 445 + static inline void 446 + drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec, 447 + struct dma_fence *fence, 448 + enum dma_resv_usage private_usage, 449 + enum dma_resv_usage extobj_usage) 450 + { 451 + drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence, 452 + private_usage, extobj_usage); 453 + } 454 + 455 + /** 456 + * drm_gpuvm_exec_validate() 457 + * @vm_exec: the &drm_gpuvm_exec wrapper 458 + * 459 + * See drm_gpuvm_validate(). 460 + */ 461 + static inline int 462 + drm_gpuvm_exec_validate(struct drm_gpuvm_exec *vm_exec) 463 + { 464 + return drm_gpuvm_validate(vm_exec->vm, &vm_exec->exec); 465 + } 466 + 467 + /** 468 + * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and 469 + * &drm_gem_object combination 470 + * 471 + * This structure is an abstraction representing a &drm_gpuvm and 472 + * &drm_gem_object combination. It serves as an indirection to accelerate 473 + * iterating all &drm_gpuvas within a &drm_gpuvm backed by the same 474 + * &drm_gem_object. 475 + * 476 + * Furthermore it is used cache evicted GEM objects for a certain GPU-VM to 477 + * accelerate validation. 478 + * 479 + * Typically, drivers want to create an instance of a struct drm_gpuvm_bo once 480 + * a GEM object is mapped first in a GPU-VM and release the instance once the 481 + * last mapping of the GEM object in this GPU-VM is unmapped. 482 + */ 483 + struct drm_gpuvm_bo { 484 + /** 485 + * @vm: The &drm_gpuvm the @obj is mapped in. This is a reference 486 + * counted pointer. 487 + */ 488 + struct drm_gpuvm *vm; 489 + 490 + /** 491 + * @obj: The &drm_gem_object being mapped in @vm. This is a reference 492 + * counted pointer. 493 + */ 494 + struct drm_gem_object *obj; 495 + 496 + /** 497 + * @evicted: Indicates whether the &drm_gem_object is evicted; field 498 + * protected by the &drm_gem_object's dma-resv lock. 499 + */ 500 + bool evicted; 501 + 502 + /** 503 + * @kref: The reference count for this &drm_gpuvm_bo. 
504 + */ 505 + struct kref kref; 506 + 507 + /** 508 + * @list: Structure containing all &list_heads. 509 + */ 510 + struct { 511 + /** 512 + * @gpuva: The list of linked &drm_gpuvas. 513 + * 514 + * It is safe to access entries from this list as long as the 515 + * GEM's gpuva lock is held. See also struct drm_gem_object. 516 + */ 517 + struct list_head gpuva; 518 + 519 + /** 520 + * @entry: Structure containing all &list_heads serving as 521 + * entry. 522 + */ 523 + struct { 524 + /** 525 + * @gem: List entry to attach to the &drm_gem_objects 526 + * gpuva list. 527 + */ 528 + struct list_head gem; 529 + 530 + /** 531 + * @evict: List entry to attach to the &drm_gpuvms 532 + * extobj list. 533 + */ 534 + struct list_head extobj; 535 + 536 + /** 537 + * @evict: List entry to attach to the &drm_gpuvms evict 538 + * list. 539 + */ 540 + struct list_head evict; 541 + } entry; 542 + } list; 543 + }; 544 + 545 + struct drm_gpuvm_bo * 546 + drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm, 547 + struct drm_gem_object *obj); 548 + 549 + struct drm_gpuvm_bo * 550 + drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm, 551 + struct drm_gem_object *obj); 552 + struct drm_gpuvm_bo * 553 + drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *vm_bo); 554 + 555 + /** 556 + * drm_gpuvm_bo_get() - acquire a struct drm_gpuvm_bo reference 557 + * @vm_bo: the &drm_gpuvm_bo to acquire the reference of 558 + * 559 + * This function acquires an additional reference to @vm_bo. It is illegal to 560 + * call this without already holding a reference. No locks required. 
561 + */ 562 + static inline struct drm_gpuvm_bo * 563 + drm_gpuvm_bo_get(struct drm_gpuvm_bo *vm_bo) 564 + { 565 + kref_get(&vm_bo->kref); 566 + return vm_bo; 567 + } 568 + 569 + void drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo); 570 + 571 + struct drm_gpuvm_bo * 572 + drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm, 573 + struct drm_gem_object *obj); 574 + 575 + void drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict); 576 + 577 + /** 578 + * drm_gpuvm_bo_gem_evict() 579 + * @obj: the &drm_gem_object 580 + * @evict: indicates whether @obj is evicted 581 + * 582 + * See drm_gpuvm_bo_evict(). 583 + */ 584 + static inline void 585 + drm_gpuvm_bo_gem_evict(struct drm_gem_object *obj, bool evict) 586 + { 587 + struct drm_gpuvm_bo *vm_bo; 588 + 589 + drm_gem_gpuva_assert_lock_held(obj); 590 + drm_gem_for_each_gpuvm_bo(vm_bo, obj) 591 + drm_gpuvm_bo_evict(vm_bo, evict); 592 + } 593 + 594 + void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo); 595 + 596 + /** 597 + * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva 598 + * @va__: &drm_gpuva structure to assign to in each iteration step 599 + * @vm_bo__: the &drm_gpuvm_bo the &drm_gpuva to walk are associated with 600 + * 601 + * This iterator walks over all &drm_gpuva structures associated with the 602 + * &drm_gpuvm_bo. 603 + * 604 + * The caller must hold the GEM's gpuva lock. 605 + */ 606 + #define drm_gpuvm_bo_for_each_va(va__, vm_bo__) \ 607 + list_for_each_entry(va__, &(vm_bo)->list.gpuva, gem.entry) 608 + 609 + /** 610 + * drm_gpuvm_bo_for_each_va_safe() - iterator to safely walk over a list of 611 + * &drm_gpuva 612 + * @va__: &drm_gpuva structure to assign to in each iteration step 613 + * @next__: &next &drm_gpuva to store the next step 614 + * @vm_bo__: the &drm_gpuvm_bo the &drm_gpuva to walk are associated with 615 + * 616 + * This iterator walks over all &drm_gpuva structures associated with the 617 + * &drm_gpuvm_bo. 
It is implemented with list_for_each_entry_safe(), hence 618 + * it is save against removal of elements. 619 + * 620 + * The caller must hold the GEM's gpuva lock. 621 + */ 622 + #define drm_gpuvm_bo_for_each_va_safe(va__, next__, vm_bo__) \ 623 + list_for_each_entry_safe(va__, next__, &(vm_bo)->list.gpuva, gem.entry) 500 624 501 625 /** 502 626 * enum drm_gpuva_op_type - GPU VA operation type ··· 1063 595 u64 addr, u64 range); 1064 596 1065 597 struct drm_gpuva_ops * 1066 - drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm, 1067 - struct drm_gem_object *obj); 598 + drm_gpuvm_bo_unmap_ops_create(struct drm_gpuvm_bo *vm_bo); 1068 599 1069 600 void drm_gpuva_ops_free(struct drm_gpuvm *gpuvm, 1070 601 struct drm_gpuva_ops *ops); ··· 1083 616 * operations to drivers. 1084 617 */ 1085 618 struct drm_gpuvm_ops { 619 + /** 620 + * @vm_free: called when the last reference of a struct drm_gpuvm is 621 + * dropped 622 + * 623 + * This callback is mandatory. 624 + */ 625 + void (*vm_free)(struct drm_gpuvm *gpuvm); 626 + 1086 627 /** 1087 628 * @op_alloc: called when the &drm_gpuvm allocates 1088 629 * a struct drm_gpuva_op ··· 1114 639 * This callback is optional. 1115 640 */ 1116 641 void (*op_free)(struct drm_gpuva_op *op); 642 + 643 + /** 644 + * @vm_bo_alloc: called when the &drm_gpuvm allocates 645 + * a struct drm_gpuvm_bo 646 + * 647 + * Some drivers may want to embed struct drm_gpuvm_bo into driver 648 + * specific structures. By implementing this callback drivers can 649 + * allocate memory accordingly. 650 + * 651 + * This callback is optional. 652 + */ 653 + struct drm_gpuvm_bo *(*vm_bo_alloc)(void); 654 + 655 + /** 656 + * @vm_bo_free: called when the &drm_gpuvm frees a 657 + * struct drm_gpuvm_bo 658 + * 659 + * Some drivers may want to embed struct drm_gpuvm_bo into driver 660 + * specific structures. By implementing this callback drivers can 661 + * free the previously allocated memory accordingly. 662 + * 663 + * This callback is optional. 
664 + */ 665 + void (*vm_bo_free)(struct drm_gpuvm_bo *vm_bo); 666 + 667 + /** 668 + * @vm_bo_validate: called from drm_gpuvm_validate() 669 + * 670 + * Drivers receive this callback for every evicted &drm_gem_object being 671 + * mapped in the corresponding &drm_gpuvm. 672 + * 673 + * Typically, drivers would call their driver specific variant of 674 + * ttm_bo_validate() from within this callback. 675 + */ 676 + int (*vm_bo_validate)(struct drm_gpuvm_bo *vm_bo, 677 + struct drm_exec *exec); 1117 678 1118 679 /** 1119 680 * @sm_step_map: called from &drm_gpuvm_sm_map to finally insert the
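This hunk replaces `drm_gpuvm_destroy()` with reference counting: `drm_gpuvm_get()` takes a reference, `drm_gpuvm_put()` drops one, and the now mandatory `vm_free` callback runs when the last reference goes away. The lifetime rule can be sketched in userspace as below. This is an illustration only: `struct vm`, `struct vm_ops`, `struct driver_vm` and the `vm_*` helpers are hypothetical, and a C11 atomic counter stands in for the kernel's `struct kref`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct vm;

struct vm_ops {
	/* Mandatory: called when the last reference is dropped. */
	void (*vm_free)(struct vm *vm);
};

struct vm {
	atomic_int refcount;
	const struct vm_ops *ops;
};

void vm_init(struct vm *vm, const struct vm_ops *ops)
{
	assert(ops && ops->vm_free);	/* without vm_free nothing can release the VM */
	atomic_init(&vm->refcount, 1);	/* the creator holds the first reference */
	vm->ops = ops;
}

/* Only legal while the caller already holds a reference. */
struct vm *vm_get(struct vm *vm)
{
	atomic_fetch_add(&vm->refcount, 1);
	return vm;
}

void vm_put(struct vm *vm)
{
	/* fetch_sub returns the previous value: 1 means this was the last ref */
	if (vm && atomic_fetch_sub(&vm->refcount, 1) == 1)
		vm->ops->vm_free(vm);
}

/* A driver embeds the base object and frees the outer structure. */
struct driver_vm {
	struct vm base;	/* first member, so a struct vm * converts back */
	int id;
};

static int freed_count;

static void driver_vm_free(struct vm *vm)
{
	free((struct driver_vm *)vm);
	freed_count++;
}

static const struct vm_ops driver_vm_ops = { .vm_free = driver_vm_free };

/* Returns 1 if the VM survived the first put and was freed by the second. */
int demo_vm_lifetime(void)
{
	struct driver_vm *dvm = malloc(sizeof(*dvm));
	struct vm *extra;
	int freed_early;

	if (!dvm)
		return -1;

	freed_count = 0;
	vm_init(&dvm->base, &driver_vm_ops);
	extra = vm_get(&dvm->base);	/* two references */
	vm_put(extra);			/* one reference left */
	freed_early = freed_count;
	vm_put(&dvm->base);		/* last reference: vm_free runs */

	return freed_early == 0 && freed_count == 1;
}
```

The same shape applies to `struct drm_gpuvm_bo`, whose `@vm` pointer is documented above as reference counted, so a VM cannot be freed while a `drm_gpuvm_bo` still points at it.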

+3 -1

include/drm/drm_mipi_dbi.h

···
 #include <drm/drm_device.h>
 #include <drm/drm_simple_kms_helper.h>

+struct drm_format_conv_state;
 struct drm_rect;
 struct gpio_desc;
 struct iosys_map;
···
 int mipi_dbi_command_stackbuf(struct mipi_dbi *dbi, u8 cmd, const u8 *data,
 			      size_t len);
 int mipi_dbi_buf_copy(void *dst, struct iosys_map *src, struct drm_framebuffer *fb,
-		      struct drm_rect *clip, bool swap);
+		      struct drm_rect *clip, bool swap,
+		      struct drm_format_conv_state *fmtcnv_state);

 /**
  * mipi_dbi_command - MIPI DCS command with optional parameter(s)

+38 -12

include/drm/gpu_scheduler.h

··· 320 320 * @sched: the scheduler instance on which this job is scheduled. 321 321 * @s_fence: contains the fences for the scheduling of job. 322 322 * @finish_cb: the callback for the finished fence. 323 + * @credits: the number of credits this job contributes to the scheduler 323 324 * @work: Helper to reschdeule job kill to different context. 324 325 * @id: a unique id assigned to each job scheduled on the scheduler. 325 326 * @karma: increment on every hang caused by this job. If this exceeds the hang ··· 339 338 struct list_head list; 340 339 struct drm_gpu_scheduler *sched; 341 340 struct drm_sched_fence *s_fence; 341 + 342 + u32 credits; 342 343 343 344 /* 344 345 * work is used only after finish_cb has been used and will not be ··· 465 462 * and it's time to clean it up. 466 463 */ 467 464 void (*free_job)(struct drm_sched_job *sched_job); 465 + 466 + /** 467 + * @update_job_credits: Called when the scheduler is considering this 468 + * job for execution. 469 + * 470 + * This callback returns the number of credits the job would take if 471 + * pushed to the hardware. Drivers may use this to dynamically update 472 + * the job's credit count. For instance, deduct the number of credits 473 + * for already signalled native fences. 474 + * 475 + * This callback is optional. 476 + */ 477 + u32 (*update_job_credits)(struct drm_sched_job *sched_job); 468 478 }; 469 479 470 480 /** 471 481 * struct drm_gpu_scheduler - scheduler instance-specific data 472 482 * 473 483 * @ops: backend operations provided by the driver. 474 - * @hw_submission_limit: the max size of the hardware queue. 484 + * @credit_limit: the credit limit of this scheduler 485 + * @credit_count: the current credit count of this scheduler 475 486 * @timeout: the time after which a job is removed from the scheduler. 476 487 * @name: name of the ring for which this scheduler is being used. 477 488 * @num_rqs: Number of run-queues. 
This is at most DRM_SCHED_PRIORITY_COUNT, 478 489 * as there's usually one run-queue per priority, but could be less. 479 490 * @sched_rq: An allocated array of run-queues of size @num_rqs; 480 - * @wake_up_worker: the wait queue on which the scheduler sleeps until a job 481 - * is ready to be scheduled. 482 491 * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler 483 492 * waits on this wait queue until all the scheduled jobs are 484 493 * finished. 485 - * @hw_rq_count: the number of jobs currently in the hardware queue. 486 494 * @job_id_count: used to assign unique id to the each job. 495 + * @submit_wq: workqueue used to queue @work_run_job and @work_free_job 487 496 * @timeout_wq: workqueue used to queue @work_tdr 497 + * @work_run_job: work which calls run_job op of each scheduler. 498 + * @work_free_job: work which calls free_job op of each scheduler. 488 499 * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the 489 500 * timeout interval is over. 490 - * @thread: the kthread on which the scheduler which run. 491 501 * @pending_list: the list of jobs which are currently in the job queue. 492 502 * @job_list_lock: lock to protect the pending_list. 493 503 * @hang_limit: once the hangs by a job crosses this limit then it is marked ··· 509 493 * @_score: score used when the driver doesn't provide one 510 494 * @ready: marks if the underlying HW is ready to work 511 495 * @free_guilty: A hit to time out handler to free the guilty job. 496 + * @pause_submit: pause queuing of @work_run_job on @submit_wq 497 + * @own_submit_wq: scheduler owns allocation of @submit_wq 512 498 * @dev: system &struct device 513 499 * 514 500 * One scheduler is implemented for each hardware ring. 
515 501 */ 516 502 struct drm_gpu_scheduler { 517 503 const struct drm_sched_backend_ops *ops; 518 - uint32_t hw_submission_limit; 504 + u32 credit_limit; 505 + atomic_t credit_count; 519 506 long timeout; 520 507 const char *name; 521 508 u32 num_rqs; 522 509 struct drm_sched_rq **sched_rq; 523 - wait_queue_head_t wake_up_worker; 524 510 wait_queue_head_t job_scheduled; 525 - atomic_t hw_rq_count; 526 511 atomic64_t job_id_count; 512 + struct workqueue_struct *submit_wq; 527 513 struct workqueue_struct *timeout_wq; 514 + struct work_struct work_run_job; 515 + struct work_struct work_free_job; 528 516 struct delayed_work work_tdr; 529 - struct task_struct *thread; 530 517 struct list_head pending_list; 531 518 spinlock_t job_list_lock; 532 519 int hang_limit; ··· 537 518 atomic_t _score; 538 519 bool ready; 539 520 bool free_guilty; 521 + bool pause_submit; 522 + bool own_submit_wq; 540 523 struct device *dev; 541 524 }; 542 525 543 526 int drm_sched_init(struct drm_gpu_scheduler *sched, 544 527 const struct drm_sched_backend_ops *ops, 545 - u32 num_rqs, uint32_t hw_submission, unsigned int hang_limit, 528 + struct workqueue_struct *submit_wq, 529 + u32 num_rqs, u32 credit_limit, unsigned int hang_limit, 546 530 long timeout, struct workqueue_struct *timeout_wq, 547 531 atomic_t *score, const char *name, struct device *dev); 548 532 549 533 void drm_sched_fini(struct drm_gpu_scheduler *sched); 550 534 int drm_sched_job_init(struct drm_sched_job *job, 551 535 struct drm_sched_entity *entity, 552 - void *owner); 536 + u32 credits, void *owner); 553 537 void drm_sched_job_arm(struct drm_sched_job *job); 554 538 int drm_sched_job_add_dependency(struct drm_sched_job *job, 555 539 struct dma_fence *fence); ··· 572 550 struct drm_gpu_scheduler **sched_list, 573 551 unsigned int num_sched_list); 574 552 553 + void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched); 575 554 void drm_sched_job_cleanup(struct drm_sched_job *job); 576 - void 
drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched); 555 + void drm_sched_wakeup(struct drm_gpu_scheduler *sched, struct drm_sched_entity *entity); 556 + bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched); 557 + void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched); 558 + void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched); 577 559 void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad); 578 560 void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery); 579 561 void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched);
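The scheduler changes above replace the fixed `hw_submission_limit` / `hw_rq_count` pair with `credit_limit` and `credit_count`, and each job now carries a `credits` value. The accounting this implies can be modelled as follows. The sketch is a simplified assumption about the behaviour, not the scheduler's code: `struct sched`, `struct job` and the `sched_*` functions are hypothetical, and it assumes a single submission worker so that the check and the increment need not be one atomic step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the credit fields added to the scheduler. */
struct sched {
	uint32_t credit_limit;		/* capacity of the hardware ring */
	atomic_uint credit_count;	/* credits of jobs currently in flight */
};

struct job {
	uint32_t credits;		/* ring capacity this job consumes */
};

void sched_init(struct sched *s, uint32_t limit)
{
	s->credit_limit = limit;
	atomic_init(&s->credit_count, 0);
}

uint32_t sched_available(struct sched *s)
{
	return s->credit_limit - atomic_load(&s->credit_count);
}

/* Push the job only if it fits into the remaining capacity. */
bool sched_try_run(struct sched *s, const struct job *j)
{
	if (j->credits > sched_available(s))
		return false;

	atomic_fetch_add(&s->credit_count, j->credits);
	return true;
}

/* Completion returns the job's credits to the pool. */
void sched_job_done(struct sched *s, const struct job *j)
{
	atomic_fetch_sub(&s->credit_count, j->credits);
}

/* Returns the credits left at the end, or a negative step number on error. */
int demo_credits(void)
{
	struct sched s;
	struct job big = { .credits = 3 }, small = { .credits = 2 };

	sched_init(&s, 4);
	if (!sched_try_run(&s, &big))
		return -1;		/* 3 of 4 credits should fit */
	if (sched_try_run(&s, &small))
		return -2;		/* only 1 credit left, 2 needed */
	sched_job_done(&s, &big);
	if (!sched_try_run(&s, &small))
		return -3;		/* all 4 credits free again */

	return (int)sched_available(&s);	/* 4 - 2 */
}
```

Compared with counting jobs, weighting them lets a driver whose jobs occupy different amounts of ring space fill the ring more precisely. The optional `update_job_credits` callback documented above would correspond to recomputing `j->credits` just before the check in `sched_try_run()`.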

+22 -22

include/linux/iosys-map.h

···
  * about the use of uninitialized variable.
  */
 #define IOSYS_MAP_INIT_OFFSET(map_, offset_) ({ \
-	struct iosys_map copy = *map_; \
-	iosys_map_incr(&copy, offset_); \
-	copy; \
+	struct iosys_map copy_ = *map_; \
+	iosys_map_incr(&copy_, offset_); \
+	copy_; \
 })

 /**
···
  * Returns:
  * The value read from the mapping.
  */
-#define iosys_map_rd(map__, offset__, type__) ({ \
-	type__ val; \
-	if ((map__)->is_iomem) { \
-		__iosys_map_rd_io(val, (map__)->vaddr_iomem + (offset__), type__);\
-	} else { \
-		__iosys_map_rd_sys(val, (map__)->vaddr + (offset__), type__); \
-	} \
-	val; \
+#define iosys_map_rd(map__, offset__, type__) ({ \
+	type__ val_; \
+	if ((map__)->is_iomem) { \
+		__iosys_map_rd_io(val_, (map__)->vaddr_iomem + (offset__), type__); \
+	} else { \
+		__iosys_map_rd_sys(val_, (map__)->vaddr + (offset__), type__); \
+	} \
+	val_; \
 })

 /**
···
  * or if pointer may be unaligned (and problematic for the architecture
  * supported), use iosys_map_memcpy_to()
  */
-#define iosys_map_wr(map__, offset__, type__, val__) ({ \
-	type__ val = (val__); \
-	if ((map__)->is_iomem) { \
-		__iosys_map_wr_io(val, (map__)->vaddr_iomem + (offset__), type__);\
-	} else { \
-		__iosys_map_wr_sys(val, (map__)->vaddr + (offset__), type__); \
-	} \
+#define iosys_map_wr(map__, offset__, type__, val__) ({ \
+	type__ val_ = (val__); \
+	if ((map__)->is_iomem) { \
+		__iosys_map_wr_io(val_, (map__)->vaddr_iomem + (offset__), type__); \
+	} else { \
+		__iosys_map_wr_sys(val_, (map__)->vaddr + (offset__), type__); \
+	} \
 })

 /**
···
  * The value read from the mapping.
  */
 #define iosys_map_rd_field(map__, struct_offset__, struct_type__, field__) ({ \
-	struct_type__ *s; \
+	struct_type__ *s_; \
 	iosys_map_rd(map__, struct_offset__ + offsetof(struct_type__, field__), \
-		     typeof(s->field__)); \
+		     typeof(s_->field__)); \
 })

 /**
···
  * usage and memory layout.
  */
 #define iosys_map_wr_field(map__, struct_offset__, struct_type__, field__, val__) ({ \
-	struct_type__ *s; \
+	struct_type__ *s_; \
 	iosys_map_wr(map__, struct_offset__ + offsetof(struct_type__, field__), \
-		     typeof(s->field__), val__); \
+		     typeof(s_->field__), val__); \
 })

 #endif /* __IOSYS_MAP_H__ */
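The renames in this file (`copy` to `copy_`, `val` to `val_`, `s` to `s_`) address a macro hygiene problem: a local declared inside a statement expression hides any caller variable of the same name that appears in the macro's arguments. The sketch below reproduces the effect with two hypothetical macros, `READ_PLUS_BAD` and `READ_PLUS`, which are not part of iosys-map. It relies on GNU statement expressions, so it assumes GCC or Clang.

```c
#include <assert.h>

/*
 * Both macros read *ptr and add delta. The first names its temporary 'val',
 * so a caller argument spelled 'val' is captured by the macro's own local.
 */
#define READ_PLUS_BAD(ptr, delta) ({ \
	int val = *(ptr); \
	val + (delta); \
})

/* Same macro with a trailing underscore on the local, as in the patch. */
#define READ_PLUS(ptr, delta) ({ \
	int val_ = *(ptr); \
	val_ + (delta); \
})

int demo_bad(void)
{
	int cell = 40;
	int val = 2;

	/* Expands to: int val = *(&cell); val + (val);  which is 40 + 40 */
	return READ_PLUS_BAD(&cell, val);
}

int demo_good(void)
{
	int cell = 40;
	int val = 2;

	/* Expands to: int val_ = *(&cell); val_ + (val);  which is 40 + 2 */
	return READ_PLUS(&cell, val);
}
```

With the old `iosys_map_wr()`, passing a variable named `val` expanded to `type__ val = (val);`, so the temporary was initialized from itself rather than from the caller's value. The suffix does not make collisions impossible, but it moves the temporaries out of the range of names callers commonly use.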

+20

include/uapi/drm/drm.h

···

 #define DRM_IOCTL_SYNCOBJ_EVENTFD DRM_IOWR(0xCF, struct drm_syncobj_eventfd)

+/**
+ * DRM_IOCTL_MODE_CLOSEFB - Close a framebuffer.
+ *
+ * This closes a framebuffer previously added via ADDFB/ADDFB2. The IOCTL
+ * argument is a framebuffer object ID.
+ *
+ * This IOCTL is similar to &DRM_IOCTL_MODE_RMFB, except it doesn't disable
+ * planes and CRTCs. As long as the framebuffer is used by a plane, it's kept
+ * alive. When the plane no longer uses the framebuffer (because the
+ * framebuffer is replaced with another one, or the plane is disabled), the
+ * framebuffer is cleaned up.
+ *
+ * This is useful to implement flicker-free transitions between two processes.
+ *
+ * Depending on the threat model, user-space may want to ensure that the
+ * framebuffer doesn't expose any sensitive user information: closed
+ * framebuffers attached to a plane can be read back by the next DRM master.
+ */
+#define DRM_IOCTL_MODE_CLOSEFB DRM_IOWR(0xD0, struct drm_mode_closefb)
+
 /*
  * Device specific ioctls should only be in their respective headers
  * The device specific ioctl range is from 0x40 to 0x9f.

+10

include/uapi/drm/drm_mode.h

···
 	__s32 y2;
 };

+/**
+ * struct drm_mode_closefb
+ * @fb_id: Framebuffer ID.
+ * @pad: Must be zero.
+ */
+struct drm_mode_closefb {
+	__u32 fb_id;
+	__u32 pad;
+};
+
 #if defined(__cplusplus)
 }
 #endif
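From user space, the new ioctl takes a `struct drm_mode_closefb` whose `pad` field must be zero. A minimal sketch of a wrapper follows, for Linux only. To stay self-contained on systems whose installed headers predate this change, it mirrors the structure and the request number locally as `struct closefb_arg` and `CLOSEFB_IOCTL`; both names, and the wrapper `drm_close_fb()`, are hypothetical. The request number assumes `DRM_IOWR(nr, type)` expands to `_IOWR('d', nr, type)`, as in the DRM uapi header. Real code would include the DRM uapi header and use `DRM_IOCTL_MODE_CLOSEFB` directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of struct drm_mode_closefb from the hunk above. */
struct closefb_arg {
	uint32_t fb_id;	/* framebuffer object ID from ADDFB/ADDFB2 */
	uint32_t pad;	/* must be zero */
};

/* Local mirror of DRM_IOCTL_MODE_CLOSEFB: DRM_IOWR(0xD0, ...) with base 'd'. */
#define CLOSEFB_IOCTL _IOWR('d', 0xD0, struct closefb_arg)

/*
 * Close a framebuffer without disabling the planes that still scan it out.
 * Returns 0 on success or a negative errno value on failure.
 */
int drm_close_fb(int drm_fd, uint32_t fb_id)
{
	struct closefb_arg arg;

	memset(&arg, 0, sizeof(arg));	/* guarantees pad == 0 */
	arg.fb_id = fb_id;

	if (ioctl(drm_fd, CLOSEFB_IOCTL, &arg) != 0)
		return -errno;

	return 0;
}
```

A compositor handing the display over to another process could call this instead of RMFB so that the last frame stays on screen until the next DRM master replaces it. As the documentation comment above warns, that frame can then be read back by the next master.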

+1 -1

include/uapi/drm/ivpu_accel.h

···
  *
  * %DRM_IVPU_BO_UNCACHED:
  *
- * Allocated BO will not be cached on host side nor snooped on the VPU side.
+ * Not supported. Use DRM_IVPU_BO_WC instead.
  *
  * %DRM_IVPU_BO_WC:
  *

+3 -2

include/uapi/drm/qaic_accel.h

···
  * struct qaic_partial_execute_entry - Defines a BO to resize and submit.
  * @handle: In. GEM handle of the BO to commit to the device.
  * @dir: In. Direction of data. 1 = to device, 2 = from device.
- * @resize: In. New size of the BO. Must be <= the original BO size. 0 is
- *	    short for no resize.
+ * @resize: In. New size of the BO. Must be <= the original BO size.
+ *	    @resize as 0 would be interpreted as no DMA transfer is
+ *	    involved.
  */
 struct qaic_partial_execute_entry {
 	__u32 handle;
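The documented constraints on `@dir` and `@resize` can be checked on the user-space side before submission. The sketch below is illustrative only: `struct demo_partial_execute_entry` is an assumed local mirror (only `handle` is visible in the hunk above; the remaining layout is an assumption), and `demo_fill_partial_entry` is a hypothetical helper, not part of the qaic UAPI.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mirror of struct qaic_partial_execute_entry. */
struct demo_partial_execute_entry {
	uint32_t handle; /* GEM handle of the BO */
	uint32_t dir;    /* 1 = to device, 2 = from device */
	uint64_t resize; /* new size, <= original size; 0 = no DMA transfer */
};

static_assert(sizeof(struct demo_partial_execute_entry) == 16, "unexpected padding");

enum { DEMO_DIR_TO_DEVICE = 1, DEMO_DIR_FROM_DEVICE = 2 };

/* Hypothetical check mirroring the documented constraints.
 * Returns 0 and fills *e, or -1 if the request violates them. */
static int demo_fill_partial_entry(struct demo_partial_execute_entry *e,
				   uint32_t handle, uint32_t dir,
				   uint64_t bo_size, uint64_t new_size)
{
	if (dir != DEMO_DIR_TO_DEVICE && dir != DEMO_DIR_FROM_DEVICE)
		return -1;
	if (new_size > bo_size)
		return -1;

	e->handle = handle;
	e->dir = dir;
	/* After this change, 0 means "no DMA transfer", not "no resize". */
	e->resize = new_size;
	return 0;
}
```

Callers that previously passed 0 to mean "submit the whole BO" would, under the new wording, need to pass the original BO size explicitly.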

+5

include/uapi/drm/v3d_drm.h

···
 
 	/* Pointer to an array of ioctl extensions*/
 	__u64 extensions;
+
+	struct {
+		__u32 ioc;
+		__u32 pad;
+	} v71;
 };
 
 /* Submits a compute shader for dispatch. This job will block on any

+2

include/uapi/drm/virtgpu_drm.h

···
 #define VIRTGPU_PARAM_CROSS_DEVICE 5 /* Cross virtio-device resource sharing */
 #define VIRTGPU_PARAM_CONTEXT_INIT 6 /* DRM_VIRTGPU_CONTEXT_INIT */
 #define VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs 7 /* Bitmask of supported capability set ids */
+#define VIRTGPU_PARAM_EXPLICIT_DEBUG_NAME 8 /* Ability to set debug name from userspace */
 
 struct drm_virtgpu_getparam {
 	__u64 param;
···
 #define VIRTGPU_CONTEXT_PARAM_CAPSET_ID 0x0001
 #define VIRTGPU_CONTEXT_PARAM_NUM_RINGS 0x0002
 #define VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK 0x0003
+#define VIRTGPU_CONTEXT_PARAM_DEBUG_NAME 0x0004
 struct drm_virtgpu_context_set_param {
 	__u64 param;
 	__u64 value;
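To show how the new context parameter might be populated, here is a small sketch that builds one `param`/`value` pair for a debug name. It rests on an assumption that is not stated in the diff: that the kernel interprets `value` as a user-space pointer to a NUL-terminated string. `struct demo_context_set_param` and `demo_debug_name_param` are hypothetical local names; the constant is copied from the hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the hunk above. */
#define DEMO_VIRTGPU_CONTEXT_PARAM_DEBUG_NAME 0x0004

/* Local mirror of struct drm_virtgpu_context_set_param. */
struct demo_context_set_param {
	uint64_t param;
	uint64_t value;
};

static_assert(sizeof(struct demo_context_set_param) == 16, "unexpected padding");

/* Hypothetical helper. Assumption: @value carries a user-space pointer to a
 * NUL-terminated name; confirm against the driver before relying on it. */
static struct demo_context_set_param demo_debug_name_param(const char *name)
{
	struct demo_context_set_param p = {
		.param = DEMO_VIRTGPU_CONTEXT_PARAM_DEBUG_NAME,
		.value = (uint64_t)(uintptr_t)name,
	};
	return p;
}
```

User-space would presumably query `VIRTGPU_PARAM_EXPLICIT_DEBUG_NAME` through the getparam ioctl first and only add this entry to its context-init parameter array when the kernel reports support.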

+1

sound/core/pcm_drm_eld.c

···
 #include <linux/export.h>
 #include <linux/hdmi.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_eld.h>
 #include <sound/pcm.h>
 #include <sound/pcm_drm_eld.h>
 

+1

sound/soc/codecs/hdac_hdmi.c

···
 #include <linux/pm_runtime.h>
 #include <linux/hdmi.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_eld.h>
 #include <sound/pcm_params.h>
 #include <sound/jack.h>
 #include <sound/soc.h>

+1

sound/soc/codecs/hdmi-codec.c

···
 #include <sound/pcm_iec958.h>
 
 #include <drm/drm_crtc.h> /* This is only to get MAX_ELD_BYTES */
+#include <drm/drm_eld.h>
 
 #define HDMI_CODEC_CHMAP_IDX_UNKNOWN -1
 

+1

sound/x86/intel_hdmi_audio.c

···
 #include <sound/control.h>
 #include <sound/jack.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_eld.h>
 #include <drm/intel_lpe_audio.h>
 #include "intel_hdmi_audio.h"
 
