From: Rodrigo Vivi
Date: Thu, 5 Sep 2024 14:02:15 +0000 (-0400)
Subject: drm/xe: Suppress missing outer rpm protection warning
X-Git-Url: https://git.dujemihanovic.xyz/?a=commitdiff_plain;h=ad92f52312614b0ef6eee07ee64f1e7661072a49;p=linux.git

drm/xe: Suppress missing outer rpm protection warning

Do not raise a WARN if we are likely within suspending or resuming path.

This is likely this false positive:

rpm_status: 0000:03:00.0 status=RPM_SUSPENDING
console: xe_bo_evict_all (called from suspend)
xe_sched_job_create: dev=0000:03:00.0, ...
xe_sched_job_exec: dev=0000:03:00.0, ...
xe_pm_runtime_put: dev=0000:03:00.0, ...
xe_sched_job_run: dev=0000:03:00.0, ...
rpm_usage: 0000:03:00.0 flags-0 cnt-2 ...
rpm_usage: 0000:03:00.0 flags-0 cnt-2 ...
rpm_usage: 0000:03:00.0 flags-0 cnt-2 ...
console: xe 0000:03:00.0: [drm] Missing outer runtime PM protection
console: xe_guc_ct_send+0x15/0x50 [xe]
console: guc_exec_queue_run_job+0x1509/0x3950 [xe]
[snip]
console: drm_sched_run_job_work+0x649/0xc20

At this point, BOs are getting evicted from VRAM with rpm
usage-counter = 2, but rpm status = SUSPENDING.

The xe->pm_callback_task won't be equal 'current' because this call is
coming from a work queue.

So, pm_runtime_get_if_active() will be called and return 0 because rpm
status != ACTIVE (but equal SUSPENDING or RESUMING).

v2: Still get the reference even on non suspending/resuming path
    (Jonathan, Brost).
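
The false positive described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_dev`, `model_get_if_in_use()`, `model_suspending_or_resuming()` and `model_get_noresume()` are hypothetical stand-ins, not kernel APIs, and the `xe->pm_callback_task` check is left out. It only mirrors the decision made in the patch: take the reference unconditionally when the conditional get fails, and warn only when the device is not in a suspend or resume transition.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these names are not kernel APIs. */
enum model_rpm_status {
	MODEL_RPM_ACTIVE,
	MODEL_RPM_RESUMING,
	MODEL_RPM_SUSPENDED,
	MODEL_RPM_SUSPENDING,
};

struct model_dev {
	enum model_rpm_status status;
	int usage_count;
	int warnings;	/* times the "missing outer protection" warning fired */
};

/* Conditional get: succeeds only when ACTIVE and already in use. */
static bool model_get_if_in_use(struct model_dev *dev)
{
	if (dev->status != MODEL_RPM_ACTIVE || dev->usage_count == 0)
		return false;
	dev->usage_count++;
	return true;
}

/* Mirrors the helper added by the patch: true during a transition. */
static bool model_suspending_or_resuming(const struct model_dev *dev)
{
	return dev->status == MODEL_RPM_SUSPENDING ||
	       dev->status == MODEL_RPM_RESUMING;
}

/* Patched flow: always take the reference, warn only outside transitions. */
static void model_get_noresume(struct model_dev *dev)
{
	if (model_get_if_in_use(dev))
		return;

	dev->usage_count++;
	if (!model_suspending_or_resuming(dev))
		dev->warnings++;
}
```

With `status = MODEL_RPM_SUSPENDING` and `usage_count = 2`, as in the trace, the model still takes the reference (the count becomes 3) and records no warning. With `status = MODEL_RPM_SUSPENDED` and a zero count, the reference is taken and the warning is recorded.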

Cc: Matthew Brost
Cc: Matthew Auld
Reviewed-by: Jonathan Cavitt
Link: https://patchwork.freedesktop.org/patch/msgid/20240905140215.56404-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi
(cherry picked from commit cb85e39dc5d1717fab82810984cce0e54712a3c2)
Signed-off-by: Lucas De Marchi
---

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index e518557e0eec..9c59a30d7646 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -595,6 +595,18 @@ bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
 	return pm_runtime_get_if_in_use(xe->drm.dev) > 0;
 }
 
+/*
+ * Very unreliable! Should only be used to suppress the false positive case
+ * in the missing outer rpm protection warning.
+ */
+static bool xe_pm_suspending_or_resuming(struct xe_device *xe)
+{
+	struct device *dev = xe->drm.dev;
+
+	return dev->power.runtime_status == RPM_SUSPENDING ||
+		dev->power.runtime_status == RPM_RESUMING;
+}
+
 /**
  * xe_pm_runtime_get_noresume - Bump runtime PM usage counter without resuming
  * @xe: xe device instance
@@ -611,8 +623,11 @@ void xe_pm_runtime_get_noresume(struct xe_device *xe)
 
 	ref = xe_pm_runtime_get_if_in_use(xe);
 
-	if (drm_WARN(&xe->drm, !ref, "Missing outer runtime PM protection\n"))
+	if (!ref) {
 		pm_runtime_get_noresume(xe->drm.dev);
+		drm_WARN(&xe->drm, !xe_pm_suspending_or_resuming(xe),
+			 "Missing outer runtime PM protection\n");
+	}
 }
 
 /**