drm/sched: Improve teardown documentation

Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git


If jobs are still enqueued in struct drm_gpu_scheduler.pending_list
when drm_sched_fini() gets called, those jobs will be leaked since that
function stops both job-submission and (automatic) job-cleanup. It is,
thus, up to the driver to take care of preventing leaks.

The related function drm_sched_wqueue_stop() also prevents automatic job
cleanup.

These pitfalls are currently not reflected in the documentation.

Explicitly inform about the leak problem in the docstring of
drm_sched_fini().

Additionally, detail the purpose of drm_sched_wqueue_{start,stop} and
hint at the consequences for automatic cleanup.

Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105143137.71893-2-pstanner@redhat.com

Philipp Stanner 2 years ago baf4afc5 21c23e4b

+21 -2

1 changed file

drivers/gpu/drm/scheduler/sched_main.c

···
  * @sched: scheduler instance
  *
  * Tears down and cleans up the scheduler.
+ *
+ * This stops submission of new jobs to the hardware through
+ * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
+ * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
+ * There is no solution for this currently. Thus, it is up to the driver to make
+ * sure that
+ *  a) drm_sched_fini() is only called after for all submitted jobs
+ *     drm_sched_backend_ops.free_job() has been called or that
+ *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
+ *     after drm_sched_fini() ran are freed manually.
+ *
+ * FIXME: Take care of the above problem and prevent this function from leaking
+ * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
···

 /**
  * drm_sched_wqueue_stop - stop scheduler submission
- *
  * @sched: scheduler instance
+ *
+ * Stops the scheduler from pulling new jobs from entities. It also stops
+ * freeing jobs automatically through drm_sched_backend_ops.free_job().
  */
 void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched)
 {
···

 /**
  * drm_sched_wqueue_start - start scheduler submission
- *
  * @sched: scheduler instance
+ *
+ * Restarts the scheduler after drm_sched_wqueue_stop() has stopped it.
+ *
+ * This function is not necessary for 'conventional' startup. The scheduler is
+ * fully operational after drm_sched_init() succeeded.
  */
 void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched)
 {
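
The two options named in the new drm_sched_fini() docstring can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code, and every mock_* type and function below is hypothetical and merely stands in for the scheduler, its pending_list and the free_job() callback. It shows that after teardown the automatic cleanup no longer runs, so whatever is still pending has to be freed by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; every mock_* name is hypothetical, not a DRM API. */
struct mock_job {
	struct mock_job *next;
	bool done;			/* "hardware" finished this job */
};

struct mock_sched {
	struct mock_job *pending;	/* stands in for pending_list */
	bool stopped;			/* set by mock_sched_fini() */
};

/* Queue one job; fails once the scheduler has been torn down. */
static int mock_submit(struct mock_sched *s)
{
	struct mock_job *job;

	if (s->stopped)
		return -1;
	job = calloc(1, sizeof(*job));
	if (!job)
		return -1;
	job->next = s->pending;
	s->pending = job;
	return 0;
}

/* Mark up to n unfinished jobs as completed by the "hardware". */
static void mock_hw_complete(struct mock_sched *s, size_t n)
{
	struct mock_job *job;

	for (job = s->pending; job && n; job = job->next) {
		if (!job->done) {
			job->done = true;
			n--;
		}
	}
}

/* Free finished jobs, or every job if @all is set. Returns the count. */
static int mock_reap(struct mock_sched *s, bool all)
{
	struct mock_job **p = &s->pending;
	int n = 0;

	while (*p) {
		struct mock_job *job = *p;

		if (all || job->done) {
			*p = job->next;
			free(job);
			n++;
		} else {
			p = &job->next;
		}
	}
	return n;
}

/* Automatic cleanup of finished jobs; a no-op after mock_sched_fini(). */
static int mock_cleanup_finished(struct mock_sched *s)
{
	return s->stopped ? 0 : mock_reap(s, false);
}

/* Stops submission and automatic cleanup; leaves pending jobs untouched. */
static void mock_sched_fini(struct mock_sched *s)
{
	s->stopped = true;
}

/*
 * Submit three jobs, let @completed of them finish, tear down, then free the
 * rest manually (option b). Returns how many jobs needed manual freeing;
 * 0 means teardown happened only after all jobs were freed (option a).
 */
int demo_teardown(size_t completed)
{
	struct mock_sched s = { 0 };

	assert(completed <= 3);
	if (mock_submit(&s) || mock_submit(&s) || mock_submit(&s)) {
		mock_reap(&s, true);
		return -1;
	}
	mock_hw_complete(&s, completed);
	mock_cleanup_finished(&s);	/* frees only the finished jobs */
	mock_sched_fini(&s);
	mock_cleanup_finished(&s);	/* no-op now: leftovers would leak */
	return mock_reap(&s, true);	/* manual free by the "driver" */
}
```

In this model, demo_teardown(3) corresponds to option a) because nothing is left to free after teardown, while demo_teardown(1) corresponds to option b) with two jobs that the caller must free itself. How a real driver waits for or frees its jobs depends on that driver and is not covered by this sketch.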
