git.dujemihanovic.xyz Git

drm/xe/guc: Add some failure checks

Return failures from pc_adjust_freq_bounds.

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321191219.243583-1-vinay.belgaumkar@intel.com

drm/xe: Nuke EXEC_QUEUE_FLAG_PERSISTENT

This is a left over of commit f1a9abc0cf31 ("drm/xe/uapi: Remove support for persistent exec_queues").

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-3-jose.souza@intel.com

drm/xe/devcoredump: Print errno if VM snapshot was not captured

My testing machine has only 8GB of RAM and while running piglit tests
I can reach the OOM cache in xe_vm_snapshot_capture() snap allocaiton
sometimes.

So to differentiate the OOM from race between capture and UMDs
unbinbind VMs here I'm adding a '[0].error: -12' to devcoredump.

v2:
- fix returned errno values

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-2-jose.souza@intel.com

drm/xe: Make devcoredump VM error state print consistent

This makes VM error consistent with [x].length and [x].data.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-1-jose.souza@intel.com

drm/xe: remove unused struct xe_device members

modeset_restore_state has been unused since commit 6af0ffc0db93
("drm/i915/display: move restore state and ctx under display
sub-struct").

member global_obj_list has been unused since commit e2925e19c006
("drm/i915/display: move global_obj_list under display sub-struct").

hti_state has been unused since commit 62749912540b ("drm/i915/display:
move hti under display sub-struct").

snps_phy_failed_calibration has been unused since commit 3a7e2d58f800
("drm/i915: move snps_phy_failed_calibration to display sub-struct under
snps").

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321161548.3509672-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/xe/query: fix gt_id bounds check

The user provided gt_id should always be less than the
XE_MAX_GT_PER_TILE.

Fixes: 7793d00d1bf5 ("drm/xe: Correlate engine and cpu timestamps with better accuracy")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321110629.334701-2-matthew.auld@intel.com

drm/xe: Add debug messages for MMU notifier and VMA invalidate

Extra debug is useful when working on VM issues.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320194232.1910688-1-matthew.brost@intel.com

drm/xe: Use USEC_PER_MSEC rather than the hard coding

Use USEC_PER_MSEC rather than the hard coded value of 1000.

Static analyzer Reported "casting either timeout_ms or
1000U to type u64" to avoid overflow-before-widen.
Using USEC_PER_MSEC seems better and will help with static analyzer
report cleanup.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320083325.3258720-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/bb: assert width in xe_bb_create_migration_job()

The q->width should always be exactly one here for migration queue/vm.
The width will anyway be overridden later since we need to emit two
jumps for special migration jobs. Enforce that here to ensure caller is
not doing something strange. While here also convert to the helper to
determine if the queue is migration based.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320112730.219854-4-matthew.auld@intel.com

drm/xe/bb: assert width in xe_bb_create_job()

The queue width will determine the number of batch buffer emitted into
the ring. In the case of xe_bb_create_job() we pass exactly one batch
address, therefore add an assert for the width to make sure we don't go
out of bounds. While here also convert to the helper to determine if the
queue is migration based.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240320112730.219854-3-matthew.auld@intel.com

drm/xe/uc: Use u64 for offsets for which we use upper_32_bits()

The GGTT is currently a 32 bit address space, but the HW and GuC
support 48b addresses in GGTT-related operations, both to keep the
interface/HW paths common between PPGTT and GGTT and to allow for
future increase of the GGTT size.
This leaves us having to program a 64b field with a 32b offset, which
currently we're in some cases doing this by using an upper_32_bits()
call on a 32b variable, which doesn't make any sense. To do this cleanly
we have 2 options:

1 - Set the upper 32 bits directly to zero.
2 - Use 64b variables for the offset and keep programming the whole thing,
so we're ready if we ever have bigger offsets.

This patch goes with option #2 and switches the related variables to u64.

v2: don't change the log ctl flag variable (John)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319195101.2784480-1-daniele.ceraolospurio@intel.com

drm/xe: Always check force_wake_get return code

A force_wake_get failure means that the HW might not be awake for the
access we're doing; this can lead to an immediate error or it can be a
more subtle problem (e.g. a register read might return an incorrect
value that is still valid, leading the driver to make a wrong choice
instead of flagging an error).
We avoid an error from the force_wake function because callers might
handle or tolerate the error, but this only works if all callers
are checking the error code. The majority already do, but a few are not.
These are mainly falling into 3 categories, which are each handled
differently:

1) error capture: in this case we want to continue the capture, but we
   log an info message in dmesg to notify the user that the capture
   might have incorrect data.

2) ioctl: in this case we return a -EIO error to userspace

3) unabortable actions: these are scenarios where we can't simply abort
   and retry and so it's better to just try it anyway because there is a
   chance the HW is awake even with the failure. In this case we throw a
   warning so we know there was a forcewake problem if something fails
   down the line.

v2: use gt_WARN_ON where appropriate

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318154924.3453513-1-daniele.ceraolospurio@intel.com

drm/xe/xelpg: Add Wa_14020495402

Disable clockgating for TDL SVHS fub.

v2: Extend the Wa to 1274(MattR)

Bspec: 46045
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318210120.564692-1-radhakrishna.sripada@intel.com

drm/xe/gt: Remove continue statement which has no effect

Remove continue statement which does not have real effect
as no actions are to be taken post continue.

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318114057.3831274-1-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/display: fix type of intel_uncore_read*() functions

Some of the backported intel_uncore_read*() functions used the wrong
types. Change the function declarations accordingly.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314065221.1181158-1-luciano.coelho@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/xe: Move xe_ggtt_invalidate out from ggtt->lock

Considering the caller of the GGTT functions should keep the
backing storage alive before the function completes, it's not
necessary to invalidate with the GGTT lock held. This just adds
latency for every user of the GGTT.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-5-matthew.brost@intel.com

drm/xe: Add XE_BO_GGTT_INVALIDATE flag

Add XE_BO_GGTT_INVALIDATE flag which indicates the GGTT should be
invalidated when a BO is added / removed from the GGTT. This is
typically set when a BO is used by the GuC as the GuC has GGTT TLBs.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
[mlankhorst: Small fix to only inherit GGTT_INVALIDATE from src bo]
[mlankhorst: Remove _BIT from name]
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-4-matthew.brost@intel.com

drm/xe: Drop ggtt invalidate from display code

Only buffers mapped in the GGTT used by the GuC require an invalidation.
Display buffers do not require an invalidation. Delete the invalidatio
from display code and make invalidation a static function in xe_ggtt.c.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-3-matthew.brost@intel.com

drm/xe: Add a NULL check in xe_ttm_stolen_mgr_init

Add an explicit check to ensure that the mgr is not NULL.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319130925.22399-1-nirmoy.das@intel.com

drm/xe: Use correct function pointer type

Use xe_exec_queue_user_extension_fn type for
exec_queue_user_extension_funcs.`

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319174919.1847-1-niranjana.vishwanathapura@intel.com

drm/xe: Streamline exec queue freeing path

Ensure exec queue freeing happens at one place, that is in
__xe_exec_queue_free(). It releases q->vm reference also. Set
q->vm before handling extensions as they can potentially reference it.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319175947.15890-1-niranjana.vishwanathapura@intel.com

drm/xe: Separate out sched/deregister_done handling

Abstract out the core part of sched_done and deregister_done handlers
to separate functions to decouple them from any protocol error handling
part and make them more readable.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240319184153.16667-1-niranjana.vishwanathapura@intel.com

drm/xe/guc: Don't support older GuC 70.x releases

Supporting older GuC versions comes with baggage, both on the coding
side (due to interfaces only being available from a certain version
onwards) and on the testing side (due to having to make sure the driver
works as expected with older GuCs).
Since all of our Xe platform are still under force probe, we haven't
committed to support any specific GuC version and we therefore don't
need to support the older once, which means that we can force a bottom
limit to what GuC we accept. This allows us to remove any conditional
statements based on older GuC versions and also to approach newer
additions knowing that we'll never attempt to load something older
than our minimum requirement.

As an initial value, the minimum expected version is set to 70.19,
which is the version currently in the firmware table, but the
expectation is that this will be bumbed every time we update the
table, until we remove the force probe.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304162616.824884-1-daniele.ceraolospurio@intel.com

drm/xe: Add dbg messages on the suspend resume functions.

In case of the suspend/resume flow getting locked up we
can get reports with some useful hints on where it might
get locked and if that has failed.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-2-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Convert gt suspend/resume messages to debug

Let's be quieter on production configuration and let's also
print the entry point of the gt suspend when debug messages
are enabled.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/vm : Remove duplicate assignment of XE_VM_FLAG_LR_MODE flag.

vm->flags are already assigned with passed flags. Remove the redundant
assignment.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307065213.1968688-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/display: Mark dpt and related vma as uncached

Mark dpt and related vma as uncached to avoid pipe faults on some devices.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318201850.127785-1-juhapekka.heikkila@gmail.com

drm/xe/display: mark DPT with XE_BO_PAGETABLE

Otherwise in the case where we use normal system memory, the CPU access
will always be cached, like when filling the DPT PTEs, which is likely
not what we want since HW access could be incoherent on platforms like
LNL. Marking as XE_BO_PAGETABLE will force wc/uc underneath on such
platforms.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314164905.239449-2-matthew.auld@intel.com

drm/xe: Remove usage of unsafe strcpy

Remove usage of unsafe strcpy with a helper function
to convert engine class to string.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318091055.638-1-nirmoy.das@intel.com

drm/xe: Drop bogus vma NULL check

The vma pointer can't be NULL here.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318093547.16326-1-nirmoy.das@intel.com

drm/xe/device: fix XE_MAX_TILES_PER_DEVICE check

Here XE_MAX_TILES_PER_DEVICE is the gt array size, therefore the gt
index should always be less than.

v2 (Lucas):
- Add fixes tag.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-6-matthew.auld@intel.com

drm/xe/device: fix XE_MAX_GT_PER_TILE check

Here XE_MAX_GT_PER_TILE is the total, therefore the gt index should
always be less than.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-5-matthew.auld@intel.com

drm/xe/queue: fix engine_class bounds check

The engine_class is the index into the user_to_xe_engine_class,
therefore it needs to be less than.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-4-matthew.auld@intel.com

drm/xe: Fix potential integer overflow in page size calculation

Explicitly cast tbo->page_alignment to u64 before bit-shifting to
prevent overflow when assigning to min_page_size.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318164342.3094-1-nirmoy.das@intel.com

drm/xe/vm: fix xe_assert()

The region can be used an index into the region_to_mem_type, so we
should be asserting that it is less than the ARRAY_SIZE here.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318103616.26240-2-matthew.auld@intel.com

drm/xe/client: drop bogus bo NULL check

If we fished it out the list then it can't be null; the list entry is
embedded in the bo.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318093431.21075-4-matthew.auld@intel.com

drm/xe/client: remove bogus rcu list usage

We use plain spinlock to protect readers and writers, so there is no
actual RCU here. Rather use the more appropriate non-rcu list based API.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318093431.21075-3-matthew.auld@intel.com

drm/xe/pf: Always select Multi-Level LMTT for platforms 12.60+

The Multi-Level LMTT variant is not specific only to the PVC.
Change logic to select it for all new platforms beyond 12.60.

Bspec: 52404, 67468
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313104132.1045-4-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe/pf: Request 64K aligned allocations for LMTT PD

The LMTT Page Directory, as well as the directory entries, must be
aligned on a 64KB boundary in VRAM. Use explicit alignment flag to
match hardware requirement.

Bspec: 52404, 67468
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313104132.1045-3-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Allow VRAM BO allocations aligned to 64K

While today we are getting VRAM allocations aligned to 64K as the
XE_VRAM_FLAGS_NEED64K flag could be set, we shouldn't only rely on
that flag and we should also allow caller to specify required 64K
alignment explicitly. Define new XE_BO_NEEDS_64K flag for that.

Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313104132.1045-2-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Make xe_mmio_read|write() functions non-inline

Shortly we will updating xe_mmio_read|write() functions with SR-IOV
specific features making those functions less suitable for inline.
Convert now those functions into regular ones, lowering driver
footprint, according to scripts/bloat-o-meter, by 6%

add/remove: 18/18 grow/shrink: 31/603 up/down: 2719/-79663 (-76944)
Function                                     old     new   delta
Total: Before=1276633, After=1199689, chg -6.03%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
Data                                         old     new   delta
Total: Before=48990, After=48990, chg +0.00%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
RO Data                                      old     new   delta
Total: Before=115680, After=115680, chg +0.00%

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-7-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Mark VF accessible interrupt registers

Interrupt registers 1900xx are VF accessible but only until version
12.50 as on newer platforms VFs are using memory-based interrupts.

To avoid complexity, we mark those registers with XE_REG_OPTION_VF
unconditionally, as IRQ handling on newer VFs is different anyway.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-6-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Mark VF accessible global registers

Only selected registers are available for Virtual Functions.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-5-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Mark VF accessible GuC registers

Only selected registers are available for Virtual Functions.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-4-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Define XE_REG_OPTION_VF

We will tag registers that SR-IOV Virtual Functions can access.
This will help us catch any invalid usage and/or provide custom
replacement if available.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-3-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Assert size of the struct xe_reg

We want to keep the struct xe_reg as small as possible.
Make sure we don't accidentally change its size.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314173130.1177-2-michal.wajdeczko@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Add helper macro to loop each DSS

Add helper macro to loop each DSS. This is a precursor patch to allow
for easier iteration through MCR registers and other per-DSS uses.

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314210735.258553-2-zhanjun.dong@intel.com

drm/xe/mocs: Clarify which GT is being operated on

Switch the MOCS-related debug messages to use a GT-specific logging
function and add ID/type output to the beginning of the MOCS kunit test
to assist with debug when problems arise.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314195825.3226856-4-matthew.d.roper@intel.com

drm/xe/mocs: Determine MCR separately for primary/media GT in kunit test

Although MOCS registers became multicast in graphics version 12.50 on
the primary GT, this transition did not happen until version 20 on the
media GT. Considering each GT independently is mostly important for
MTL/ARL where the Xe_LPM+ IP has non-MCR MOCS registers, even though
Xe_LPG IP has MCR registers.

Bspec: 67789, 71186
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314195825.3226856-3-matthew.d.roper@intel.com

drm/xe/gsc: Handle GSCCS ER interrupt

Starting on Xe2, the GSCCS engine reset is a 2-step process. When the
driver or the GuC hits the GDRST register, the CS is immediately reset
and a success is reported, but the GSC shim continues its reset in the
background. While the shim reset is ongoing, the CS is able to accept
new context submission, but any commands that require the shim will
be stalled until the reset is completed. This means that we can keep
submitting to the GSCCS as long as we make sure that the preemption
timeout is big enough to cover any delay introduced by the reset; since
the GSC preempt timeout is not tunable at runtime, we only need to check
that the value set in kconfig is big enough (and increase it if it
isn't).
When the shim reset completes, a specific CS interrupt is triggered,
in response to which we need to check the GSCI_TIMER_STATUS register
to see if the reset was successful or not.
Note that the GSCI_TIMER_STATUS register is not power save/restored,
so it gets reset on MC6 entry. However, a reset failure stops MC6,
so in that scenario we're always guaranteed to find the correct value.

Since we can't check the register within interrupt context, the
existing GSC worker has been updated to handle it.
The expected action to take on ER failure is to trigger a driver FLR,
but we still don't support that, so for now we just print an error. A
comment has been added to the code to keep track of the FLR requirement.

v2: Add a check for the initial timeout value (Alan)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304145634.820684-1-daniele.ceraolospurio@intel.com

drm/xe/guc_submit: use jiffies for job timeout

drm_sched_init() expects jiffies for the timeout, but here we are
passing the timeout in ms. Convert to jiffies instead.

Fixes: eef55700f302 ("drm/xe: Add sysfs for default engine scheduler properties")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314121554.223229-2-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Remove unused xe_bo->props struct

Property struct is not being used so remove it and related dead code.

Fixes: ddfa2d6a846a ("drm/xe/uapi: Kill VM_MADVISE IOCTL")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: intel-xe@lists.freedesktop.org
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240311151159.10036-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC

Doing a XE_EXEC with num_batch_buffer == 0 makes signals passed as
argument to be signaled when the last real XE_EXEC is completed.
But to do that it was first pinning all VMAs in drm_gpuvm_exec_lock(),
this patch remove this pinning as it is not required.

This change also help Mesa implementing memory over-commiting recovery
as it needs to unbind not needed VMAs when the whole VM can't fit
in GPU memory but it can only do the unbiding when the last XE_EXEC
is completed.
So with this change Mesa can get the signal it want without getting
out-of-memory errors.

Fixes: eb9702ad2986 ("drm/xe: Allow num_batch_buffer / num_binds == 0 in IOCTLs")
Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313171318.121066-1-jose.souza@intel.com

drm/xe: Use xe_assert in xe_device_assert_mem_access

The implementation of xe_device_assert_mem_access has a non-zero cost.
Use xe_assert rather than XE_WARN_ON so it will compile out in non-debug
kernel builds (Kconfig CONFIG_DRM_XE_DEBUG=n).

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313184430.999397-1-matthew.brost@intel.com

drm/xe/xe_exec : In xe_exec_ioctl remove deadcode

At label err_unlock_list the condition write_label will never be true.
Remove the deadcode line for write_label true.

Reported by static analyzer.

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313150545.2830408-3-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Return if kobj creation is failed

Return after warning regarding kobj creation failure.

Fixes: 4ae3aeab32d7 ("drm/xe: Add vram frequency sysfs attributes")
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313150545.2830408-2-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/xe_tracer: Align fence output format in ftrace log

The fence print in xe_gt_tlb_invalidation_fence and xe_hw_fence
is with "%p", change fence print in xe_sched_job to "%p" also.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240313025052.1410833-1-shuicheng.lin@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Group live kunit tests

As was done for the normal kunit tests, group the live tests into a
single module, xe_live_test.ko.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312145158.2295351-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Invalidate userptr VMA on page pin fault

Rather than return an error to the user or ban the VM when userptr VMA
page pin fails with -EFAULT, invalidate VMA mappings. This supports the
UMD use case of freeing userptr while still having bindings.

Now that non-faulting VMs can invalidate VMAs, drop the usm prefix for
the tile_invalidated member.

v2:
- Fix build error (CI)
v3:
- Don't invalidate VMA if in fault mode, rather kill VM (Thomas)
- Update commit message with tile_invalidated name chagne (Thomas)
- Wait VM bookkeep slots with VM resv lock (Thomas)
v4:
- Move list_del_init(&userptr.repin_link) after error check (Thomas)
- Assert not in fault mode (Matthew)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312183907.933835-1-matthew.brost@intel.com

drm/xe/uapi: Add IP version and stepping to GT list query

For modern platforms (MTL and later), both kernel and userspace drivers
are expected to apply GT programming and workarounds based on the IP
version and stepping self-reported by the GT hardware via the GMD_ID
registers.  Since userspace drivers can't access these registers
directly, pass along the version and stepping information via the GT
list query.  Note that the new query fields will remain 0's when running
on pre-GMD_ID platforms.  Userspace is expected to continue using PCI
devid / revid on those older platforms.

Although the hardware also has a GMD_ID register for display
version/stepping, that value is intentionally *not* included anywhere in
the Xe uapi.  Display userspace should be using platform-agnostic APIs
and auto-detecting platform capabilities rather than matching specific
IP versions.

v2:
- s/revid/rev/  (Lucas)
- Fix kerneldoc copy/paste mistakes

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312211229.2871288-4-matthew.d.roper@intel.com

drm/xe/hdcp: Fix condition for hdcp gsc cs requirement

Add condition for check of hdcp gsc cs requirement rather than
assuming gsc cs to always be required when xe is loaded. It is not
required for display version < 14

--v2
-Use display version in commit message [Lucas]

Fixes: 152f2df954d8 ("drm/xe/hdcp: Enable HDCP for XE")
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240308154939.1940960-2-suraj.kandpal@intel.com

drm/xe/pvc: Fix WA 18020744125

With the current state GUC_WA_RCS_REGS_IN_CCS_REGS_LIST could in theory
be removed since there is no render register being added to the list of
compute WAs. However the real issue is that 18020744125 is incomplete
and not setting the RING_HWSTAM on render as it should.

Writing this in RTP is a little more tricky as we want to write to
another's engine base when the match happens: first compute engine and
no render present. So use RING_HWSTAM(RENDER_RING_BASE) instead of the
usual XE_RTP_ACTION_FLAG(ENGINE_BASE).

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306192128.1895603-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Remove unused FF_SLICE_CS_CHICKEN2

Commit e89f4967d90c ("drm/xe: Drop WA 16015675438") removed the only
user of that register and should have removed it. Remove it now.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306192128.1895603-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/gsc: Fix kernel doc for xe_gsc_create_host_session_id

Fix documentation for xe_gsc_create_host_session_id which
was xe_gsc_get_host_session_id.

Fixes: 152f2df954d8 ("drm/xe/hdcp: Enable HDCP for XE")
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240307045533.1867892-2-suraj.kandpal@intel.com

drm/xe: Fix NULL check in xe_ggtt_init()

The null check for GT is after calling gt_to_xe, fix it.

Fixes: 3121fed0c51b ("drm/xe: Cleanup some layering in GGTT")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306052002.311196-2-matthew.brost@intel.com

drm/xe: Declare __xe_lrc_*_ggtt_addr with __maybe__unused

Kernel test robot reports building error:

drivers/gpu/drm/xe/xe_lrc.c:544:1: error: unused function
'__xe_lrc_regs_ggtt_addr' [-Werror,-Wunused-function]
544 | DECL_MAP_ADDR_HELPERS(regs)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/xe/xe_lrc.c:536:19: note: expanded from macro
'DECL_MAP_ADDR_HELPERS'
536 | static inline u32 __xe_lrc_##elem##_ggtt_addr(struct xe_lrc *lrc) \

Declare __xe_lrc_*_ggtt_addr with __maybe_unused to address it.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402010928.g3j2aSBL-lkp@intel.com/
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Link: https://patchwork.freedesktop.org/patch/msgid/20240204062324.3548268-1-dawei.li@shingroup.cn
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Return immediately on tile_init failure

There's no reason to proceed with applying workaround and initing
sysfs if we are going to abort the probe upon failure.

Fixes: e5a845fd8fa4 ("drm/xe: Add sysfs entry for tile")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306203110.146387-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Remove unused 'create' parameter from queue property logic

The 'create' parameter in exec_queue_user_extensions was always true.
This commit removes the dead parameter and all the relevant dead code.

v2: rebase.

Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240223143043.22779-1-nirmoy.das@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Drop WA 16015675438

With dynamic load-balancing disabled on the compute side, there's no
reason left to enable WA 16015675438. Drop it from both PVC and DG2.

Note that this can be done because now the driver always set a fixed
partition of EUs during initialization via the ccs_mode configuration.

Cc: Mateusz Jablonski <mateusz.jablonski@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304233103.1687412-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/hdcp: Enable HDCP for XE

Enable HDCP for Xe by defining functions which take care of
interaction of HDCP as a client with the GSC CS interface.
Add intel_hdcp_gsc_message to Makefile and add corresponding
changes to xe_hdcp_gsc.c to make it build.

--v2
-add kfree at appropriate place [Daniele]
-remove useless define [Daniele]
-move host session logic to xe_gsc_submit.c [Daniele]
-call xe_gsc_check_and_update_pending directly in an if condition
[Daniele]
-use xe_device instead of drm_i915_private [Daniele]

--v3
-use xe prefix for newly exposed function [Daniele]
-remove client specific defines from intel_gsc_mtl_header [Daniele]
-add missing kfree() [Daniele]
-have NULL check for hdcp_message in finish function [Daniele]
-dont have too many variable declarations in the same line [Daniele]

--v4
-don't point the hdcp_message structure in xe_device to anything
until it properly gets initialized [Daniele]

--v5
-Squash commits for buildability

--v6
-Order includes alphabetically [Lucas]

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306024247.1857881-6-suraj.kandpal@intel.com

drm/xe: Use gsc_proxy_init_done to check proxy status

Expose gsc_proxy_init_done so that we can check if gsc proxy has
been initialized or not.

--v2
-Check if GSC FW is enabled before taking forcewake ref [Daniele]

--v3
-Directly call proxy check function inside if condition

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306024247.1857881-5-suraj.kandpal@intel.com

drm/xe/hdcp: Use xe_device struct

Use xe_device struct instead of drm_i915_private so as to not
cause confusion and comply with Xe standards as drm_i915_private is
xe_device under the hood.

--v2
-Fix commit message [Daniele]

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306024247.1857881-4-suraj.kandpal@intel.com

drm/i915/hdcp: Move intel_hdcp_gsc_message def away from header file

Move intel_hdcp_gsc_message definition into intel_hdcp_gsc.c
so that intel_hdcp_gsc_message can be redefined for xe as needed.

--v2
-Correct commit message to reflect what patch is actually doing [Arun]

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306024741.1858039-2-suraj.kandpal@intel.com

drm/xe: Do not grab forcewakes when issuing GGTT TLB invalidation via GuC

Forcewakes are not required for communication with the GuC via CTB
as it is a memory based interfaced. Acquring forcewakes takes
considerable time. With that, do not grab a forcewake when issuing a
GGTT TLB invalidation via the GuC.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229194520.200642-1-matthew.brost@intel.com

drm/xe/arl: Add Arrow Lake H support

ARL-H uses the same media and display IP as MTL, and a version 12.74
graphics IP (referred to as Xe_LPG+). From a driver point of view, we
should be able to just treat the whole platform as MTL and rely on
GRAPHICS_VERx100 checks to handle any spots where ARL's Xe_LPG+ needs
different handling from MTL's Xe_LPG (i.e., workarounds).

v2: Resolve conflict and Reorder PCI ids in sorted order
v3: Append signed-off-by commiter to this commit

Bspec: 55420
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229070806.3402641-4-dnyaneshwar.bhadane@intel.com

drm/xe/xelpg: Extend some workarounds to graphics version 12.74

A handful of Xe_LPG workarounds are also relevant to graphics version
12.74 as well. Extend the graphics version range for these workarounds
accordingly.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229070806.3402641-3-dnyaneshwar.bhadane@intel.com

drm/xe/xelpg: Recognize graphics version 12.74 as Xe_LPG

Graphics version 12.74 (which is technically called "Xe_LPG+") should be
handled the same as versions Xe_LPG 12.70/12.71 by the KMD. Only the
workaround lists (handled in the next patch) will be a bit different.

Bspec: 55420
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229070806.3402641-2-dnyaneshwar.bhadane@intel.com

drm/xe: Pipeline evict / restore of pinned BOs during suspend / resume

Rather than waiting for each evict / restore of pinned BOs to complete
just wait on migrate exec queue to be idle once during suspend / resume.

Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305173503.285223-1-matthew.brost@intel.com

drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion

With mem_access going away and pm_runtime getting called instead,
we need to protect these against recursions.

The put is asynchronous so there's no need to block it. However, for a
proper balance, we need to ensure that the references are taken and
restored regardless of the flow. So, let's convert them all to void and
use some direct linux/pm_runtime functions.

v2: Rebased and update commit message (Matt).

Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240301180526.643505-3-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Create a xe_pm_runtime_resume_and_get variant for display

Introduce the resume and get to fulfill the display need for checking
if the device was actually resumed (or it is awake) and the reference
was taken.

Then we can convert the remaining cases to a void function and have
individual functions for individual cases.

Also, already start this new function protected from the runtime
recursion, since runtime_pm will need to call for display functions
for a proper D3Cold flow.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240301180526.643505-2-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Fix display runtime_pm handling

i915's intel_runtime_pm_get_if_in_use actually calls the
pm_runtime_get_if_active() with ign_usage_count = false, but Xe
was erroneously calling it with true because of the mem_access cases.

This can lead to unnecessary references getting hold here and device
never getting into the runtime suspended state.

Let's use directly the 'if_in_use' function provided by linux/pm_runtime.

Also, already start this new function protected from the runtime
recursion, since runtime_pm will need to call for display functions
for a proper D3Cold flow.

v2: Update commit message based on Matt's feedback.
Fix return condition of pm_runtime_get_if_in_use (Matt)

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240301180526.643505-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Implement capture of HWSP and HWCTX

Dump the HWCTX and HWSP as part of LRC capture.

Changes since v1:
- Use same layout for HWSP and HWCTX as VM bo's, to simplify dumping.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240227131248.92910-3-maarten.lankhorst@linux.intel.com

drm/xe: Add infrastructure for delayed LRC capture

Add a xe_guc_exec_queue_snapshot_capture_delayed and
xe_lrc_snapshot_capture_delayed function to capture
the contents of LRC in the next patch.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240227131248.92910-2-maarten.lankhorst@linux.intel.com

drm/xe: Move lrc snapshot capturing to xe_lrc.c

This allows the dumping of HWSP and HW Context without exporting more
functions.

Changes since v1:
- GFP_KERNEL -> GFP_NOWAIT. (Souza)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240227131248.92910-1-maarten.lankhorst@linux.intel.com

drm/xe: Replace 'grouped target' in Makefile with pattern rule

Since 'grouped target' is used only in 'make' 4.3, it should
be avoided. Replace it with 'multi-target pattern rule' which
has the same behavior.

Fixes: 9616e74b796c ("drm/xe: Add support for OOB workarounds")
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240302153927.2602241-1-dhirschfeld@habana.ai
[ reword commit message ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Fix ref counting leak on page fault

If a page fault occurs on VM not in fault a ref can be leaked. Fix this.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240301041036.238471-1-matthew.brost@intel.com

drm/xe/mocs: Fix DG2 kunit

LNCFCMOCS31[31:16] is read-only for DG2 and MTL, so it's not possible to
check set it. While trying to set doesn't cause any issue, later when
it's read back to check if the value got correctly recorded causes the
test to fail. Now that test is reliable for an odd number of entries,
reduce it so the last entry is ignored.

Bspec: 55267
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1253
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1233
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240228061048.3661978-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/mocs: Allow odd number of entries on test

Refactor the mocs/l3cc kunit test to support odd number of entries. This
switches out from the "check the register value" approach to check the
entry value if it makes sense from the register read. This provides an
easier output to reason about and cross check with bspec.

Some code reordering and variable re-use was also done so the 2
functions follow more or less the same logic.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240228061048.3661978-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/mocs: Move warn/assertion up

The warn-once in __init_mocs_table() to make sure there's an index set
for unused entries is more a sanity check that should be done as the
first thing in that function. The kunit test replicates the same check,
so also move it up and turn it into a failure condition for the test.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240228061048.3661978-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/mocs: Be explicit when logging number of entries

Make sure to log if number of entries are l3cc or mocs so it doesn't
depend on the context.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240228061048.3661978-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/mocs: Refactor mocs/l3cc loop

There's no reason to keep the assignment an condition in the same
statement, particularly making use of the comma operator. Improve
readability by doing each step on its own statement. This will make
supporting odd number of entries more easily.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240228061048.3661978-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Fix build error in xe_ggtt.c

Need to include io-64-nonatomic-lo-hi.h for writeq function.
Commit 3121fed0c51b ("drm/xe: Cleanup some layering in GGTT")
removed the xe_mmio.h include so lost the indirect include. Add it
where it's needed.

Fixes: 3121fed0c51b ("drm/xe: Cleanup some layering in GGTT")
Closes: https://lore.kernel.org/oe-kbuild-all/202402241903.R5J8hKVI-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240225001448.81513-1-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Add LRC parsing for more GPU instructions

The LRCs on some of our newer platforms appear to contain a few GPU
instructions that weren't handled in our LRC parser. Add the relevant
instruction names and opcodes so that our debugfs LRC dumps will
properly indicate what these are.

Bspec: 55866, 64848, 46931
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222184009.6857-2-matthew.d.roper@intel.com

drm/xe: Remove obsolete async_ops from struct xe_vm

When sync binds were reworked and worker removed, async_ops became
obsolete. Remove it.

Fixes: f3e9b1f43458 ("drm/xe: Remove async worker and rework sync binds")
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117110908.2362615-1-mika.kuoppala@linux.intel.com

drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace

Add move_lacks_source detail to xe_bo_move trace to make it readable
that is to check if it is migrate clear or migrate copy.

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: a0df2cc858c3 ("drm/xe/xe_bo_move: Enhance xe_bo_move trace")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221101950.1019312-1-priyanka.dandamudi@intel.com

drm/xe/guc: Fix missing topology init

init_steering_dss need topology dss mask to be init ahead.
Fixed by moving xe_gt_topology_init ahead of xe_gt_mcr_init

Fixes: bf8ec3c3e82c ("drm/xe: Initialize GuC earlier during probe")
Cc: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240227164922.281346-2-zhanjun.dong@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/xe2: fix 64-bit division in pte_update_size

This function does not build on 32-bit targets when the compiler
fails to reduce DIV_ROUND_UP() into a shift:

ld.lld: error: undefined symbol: __aeabi_uldivmod
>>> referenced by xe_migrate.c
>>> drivers/gpu/drm/xe/xe_migrate.o:(pte_update_size) in archive vmlinux.a

There are two instances in this function. Change the first to
use an open-coded shift with the same behavior, and the second
one to a 32-bit calculation, which is sufficient here as the size
is never more than 2^32 pages (16TB).

Fixes: 237412e45390 ("drm/xe: Enable 32bits build")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-3-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/mmio: fix build warning for BAR resize on 32-bit

clang complains about a nonsensical test on builds with a 32-bit phys_addr_t,
which means resizing will always fail:

drivers/gpu/drm/xe/xe_mmio.c:109:23: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
109 | root_res->start > 0x100000000ull)
| ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~

Previously, BAR resize was always disallowed on 32-bit kernels, but
this apparently changed recently. Since 32-bit machines can in theory
support PAE/LPAE for large address spaces, this may end up useful,
so change the driver to shut up the warning but still work when
phys_addr_t/resource_size_t is 64 bit wide.

Fixes: 9a6e6c14bfde ("drm/xe/mmio: Use non-atomic writeq/readq variant for 32b")
Fixes: 237412e45390 ("drm/xe: Enable 32bits build")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-2-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/kunit: fix link failure with built-in xe

When the driver is built-in but the tests are in loadable modules,
the helpers don't actually get put into the driver:

ERROR: modpost: "xe_kunit_helper_alloc_xe_device" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined!

Change the Makefile to ensure they are always part of the driver
even when the rest of the kunit tests are in loadable modules.

Fixes: 5095d13d758b ("drm/xe/kunit: Define helper functions to allocate fake xe device")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-1-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Deny unbinds if uapi ufence pending

If user fence was provided for MAP in vm_bind_ioctl
and it has still not been signalled, deny UNMAP of said
vma with EBUSY as long as unsignalled fence exists.

This guarantees that MAP vs UNMAP sequences won't
escape under the radar if we ever want to track the
client's state wrt to completed and accessible MAPs.
By means of intercepting the ufence release signalling.

v2: find ufence with num_fences > 1 (Matt)
v3: careful on clearing vma ufence (Matt)

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1159
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215181152.450082-3-mika.kuoppala@linux.intel.com