Michal Wajdeczko [Thu, 14 Mar 2024 17:31:26 +0000 (18:31 +0100)]
drm/xe: Define XE_REG_OPTION_VF
We will tag registers that SR-IOV Virtual Functions can access.
This will help us catch any invalid usage and/or provide custom
replacement if available.
Matt Roper [Thu, 14 Mar 2024 19:58:27 +0000 (12:58 -0700)]
drm/xe/mocs: Clarify which GT is being operated on
Switch the MOCS-related debug messages to use a GT-specific logging
function and add ID/type output to the beginning of the MOCS kunit test
to assist with debug when problems arise.
Matt Roper [Thu, 14 Mar 2024 19:58:26 +0000 (12:58 -0700)]
drm/xe/mocs: Determine MCR separately for primary/media GT in kunit test
Although MOCS registers became multicast in graphics version 12.50 on
the primary GT, this transition did not happen until version 20 on the
media GT. Considering each GT independently is mostly important for
MTL/ARL where the Xe_LPM+ IP has non-MCR MOCS registers, even though
Xe_LPG IP has MCR registers.
Starting on Xe2, the GSCCS engine reset is a 2-step process. When the
driver or the GuC hits the GDRST register, the CS is immediately reset
and a success is reported, but the GSC shim continues its reset in the
background. While the shim reset is ongoing, the CS is able to accept
new context submission, but any commands that require the shim will
be stalled until the reset is completed. This means that we can keep
submitting to the GSCCS as long as we make sure that the preemption
timeout is big enough to cover any delay introduced by the reset; since
the GSC preempt timeout is not tunable at runtime, we only need to check
that the value set in kconfig is big enough (and increase it if it
isn't).
When the shim reset completes, a specific CS interrupt is triggered,
in response to which we need to check the GSCI_TIMER_STATUS register
to see if the reset was successful or not.
Note that the GSCI_TIMER_STATUS register is not power save/restored,
so it gets reset on MC6 entry. However, a reset failure stops MC6,
so in that scenario we're always guaranteed to find the correct value.
Since we can't check the register within interrupt context, the
existing GSC worker has been updated to handle it.
The expected action to take on ER failure is to trigger a driver FLR,
but we still don't support that, so for now we just print an error. A
comment has been added to the code to keep track of the FLR requirement.
v2: Add a check for the initial timeout value (Alan)
Matthew Auld [Thu, 14 Mar 2024 12:15:55 +0000 (12:15 +0000)]
drm/xe/guc_submit: use jiffies for job timeout
drm_sched_init() expects jiffies for the timeout, but here we are
passing the timeout in ms. Convert to jiffies instead.
Fixes: eef55700f302 ("drm/xe: Add sysfs for default engine scheduler properties") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240314121554.223229-2-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC
Doing a XE_EXEC with num_batch_buffer == 0 makes signals passed as
argument to be signaled when the last real XE_EXEC is completed.
But to do that it was first pinning all VMAs in drm_gpuvm_exec_lock(),
this patch remove this pinning as it is not required.
This change also help Mesa implementing memory over-commiting recovery
as it needs to unbind not needed VMAs when the whole VM can't fit
in GPU memory but it can only do the unbiding when the last XE_EXEC
is completed.
So with this change Mesa can get the signal it want without getting
out-of-memory errors.
Fixes: eb9702ad2986 ("drm/xe: Allow num_batch_buffer / num_binds == 0 in IOCTLs") Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Co-developed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313171318.121066-1-jose.souza@intel.com
Matthew Brost [Wed, 13 Mar 2024 18:44:30 +0000 (11:44 -0700)]
drm/xe: Use xe_assert in xe_device_assert_mem_access
The implementation of xe_device_assert_mem_access has a non-zero cost.
Use xe_assert rather than XE_WARN_ON so it will compile out in non-debug
kernel builds (Kconfig CONFIG_DRM_XE_DEBUG=n).
Matthew Brost [Tue, 12 Mar 2024 18:39:07 +0000 (11:39 -0700)]
drm/xe: Invalidate userptr VMA on page pin fault
Rather than return an error to the user or ban the VM when userptr VMA
page pin fails with -EFAULT, invalidate VMA mappings. This supports the
UMD use case of freeing userptr while still having bindings.
Now that non-faulting VMs can invalidate VMAs, drop the usm prefix for
the tile_invalidated member.
v2:
- Fix build error (CI)
v3:
- Don't invalidate VMA if in fault mode, rather kill VM (Thomas)
- Update commit message with tile_invalidated name chagne (Thomas)
- Wait VM bookkeep slots with VM resv lock (Thomas)
v4:
- Move list_del_init(&userptr.repin_link) after error check (Thomas)
- Assert not in fault mode (Matthew)
Matt Roper [Tue, 12 Mar 2024 21:12:25 +0000 (14:12 -0700)]
drm/xe/uapi: Add IP version and stepping to GT list query
For modern platforms (MTL and later), both kernel and userspace drivers
are expected to apply GT programming and workarounds based on the IP
version and stepping self-reported by the GT hardware via the GMD_ID
registers. Since userspace drivers can't access these registers
directly, pass along the version and stepping information via the GT
list query. Note that the new query fields will remain 0's when running
on pre-GMD_ID platforms. Userspace is expected to continue using PCI
devid / revid on those older platforms.
Although the hardware also has a GMD_ID register for display
version/stepping, that value is intentionally *not* included anywhere in
the Xe uapi. Display userspace should be using platform-agnostic APIs
and auto-detecting platform capabilities rather than matching specific
IP versions.
Suraj Kandpal [Fri, 8 Mar 2024 15:49:40 +0000 (21:19 +0530)]
drm/xe/hdcp: Fix condition for hdcp gsc cs requirement
Add condition for check of hdcp gsc cs requirement rather than
assuming gsc cs to always be required when xe is loaded. It is not
required for display version < 14
--v2
-Use display version in commit message [Lucas]
With the current state GUC_WA_RCS_REGS_IN_CCS_REGS_LIST could in theory
be removed since there is no render register being added to the list of
compute WAs. However the real issue is that 18020744125 is incomplete
and not setting the RING_HWSTAM on render as it should.
Writing this in RTP is a little more tricky as we want to write to
another's engine base when the match happens: first compute engine and
no render present. So use RING_HWSTAM(RENDER_RING_BASE) instead of the
usual XE_RTP_ACTION_FLAG(ENGINE_BASE).
Suraj Kandpal [Wed, 6 Mar 2024 02:42:48 +0000 (08:12 +0530)]
drm/xe/hdcp: Enable HDCP for XE
Enable HDCP for Xe by defining functions which take care of
interaction of HDCP as a client with the GSC CS interface.
Add intel_hdcp_gsc_message to Makefile and add corresponding
changes to xe_hdcp_gsc.c to make it build.
--v2
-add kfree at appropriate place [Daniele]
-remove useless define [Daniele]
-move host session logic to xe_gsc_submit.c [Daniele]
-call xe_gsc_check_and_update_pending directly in an if condition
[Daniele]
-use xe_device instead of drm_i915_private [Daniele]
--v3
-use xe prefix for newly exposed function [Daniele]
-remove client specific defines from intel_gsc_mtl_header [Daniele]
-add missing kfree() [Daniele]
-have NULL check for hdcp_message in finish function [Daniele]
-dont have too many variable declarations in the same line [Daniele]
--v4
-don't point the hdcp_message structure in xe_device to anything
until it properly gets initialized [Daniele]
Suraj Kandpal [Wed, 6 Mar 2024 02:42:46 +0000 (08:12 +0530)]
drm/xe/hdcp: Use xe_device struct
Use xe_device struct instead of drm_i915_private so as to not
cause confusion and comply with Xe standards as drm_i915_private is
xe_device under the hood.
Matthew Brost [Thu, 29 Feb 2024 19:45:20 +0000 (11:45 -0800)]
drm/xe: Do not grab forcewakes when issuing GGTT TLB invalidation via GuC
Forcewakes are not required for communication with the GuC via CTB
as it is a memory based interfaced. Acquring forcewakes takes
considerable time. With that, do not grab a forcewake when issuing a
GGTT TLB invalidation via the GuC.
Matt Roper [Wed, 6 Mar 2024 00:40:49 +0000 (16:40 -0800)]
drm/xe/arl: Add Arrow Lake H support
ARL-H uses the same media and display IP as MTL, and a version 12.74
graphics IP (referred to as Xe_LPG+). From a driver point of view, we
should be able to just treat the whole platform as MTL and rely on
GRAPHICS_VERx100 checks to handle any spots where ARL's Xe_LPG+ needs
different handling from MTL's Xe_LPG (i.e., workarounds).
v2: Resolve conflict and Reorder PCI ids in sorted order
v3: Append signed-off-by commiter to this commit
Matt Roper [Thu, 29 Feb 2024 07:08:05 +0000 (12:38 +0530)]
drm/xe/xelpg: Extend some workarounds to graphics version 12.74
A handful of Xe_LPG workarounds are also relevant to graphics version
12.74 as well. Extend the graphics version range for these workarounds
accordingly.
Matt Roper [Thu, 29 Feb 2024 07:08:04 +0000 (12:38 +0530)]
drm/xe/xelpg: Recognize graphics version 12.74 as Xe_LPG
Graphics version 12.74 (which is technically called "Xe_LPG+") should be
handled the same as versions Xe_LPG 12.70/12.71 by the KMD. Only the
workaround lists (handled in the next patch) will be a bit different.
Rodrigo Vivi [Fri, 1 Mar 2024 18:05:25 +0000 (13:05 -0500)]
drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion
With mem_access going away and pm_runtime getting called instead,
we need to protect these against recursions.
The put is asynchronous so there's no need to block it. However, for a
proper balance, we need to ensure that the references are taken and
restored regardless of the flow. So, let's convert them all to void and
use some direct linux/pm_runtime functions.
Rodrigo Vivi [Fri, 1 Mar 2024 18:05:24 +0000 (13:05 -0500)]
drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
Introduce the resume and get to fulfill the display need for checking
if the device was actually resumed (or it is awake) and the reference
was taken.
Then we can convert the remaining cases to a void function and have
individual functions for individual cases.
Also, already start this new function protected from the runtime
recursion, since runtime_pm will need to call for display functions
for a proper D3Cold flow.
Rodrigo Vivi [Fri, 1 Mar 2024 18:05:23 +0000 (13:05 -0500)]
drm/xe: Fix display runtime_pm handling
i915's intel_runtime_pm_get_if_in_use actually calls the
pm_runtime_get_if_active() with ign_usage_count = false, but Xe
was erroneously calling it with true because of the mem_access cases.
This can lead to unnecessary references getting hold here and device
never getting into the runtime suspended state.
Let's use directly the 'if_in_use' function provided by linux/pm_runtime.
Also, already start this new function protected from the runtime
recursion, since runtime_pm will need to call for display functions
for a proper D3Cold flow.
v2: Update commit message based on Matt's feedback.
Fix return condition of pm_runtime_get_if_in_use (Matt)
Lucas De Marchi [Wed, 28 Feb 2024 06:10:48 +0000 (22:10 -0800)]
drm/xe/mocs: Fix DG2 kunit
LNCFCMOCS31[31:16] is read-only for DG2 and MTL, so it's not possible to
check set it. While trying to set doesn't cause any issue, later when
it's read back to check if the value got correctly recorded causes the
test to fail. Now that test is reliable for an odd number of entries,
reduce it so the last entry is ignored.
Bspec: 55267 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1253 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1233 Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228061048.3661978-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Lucas De Marchi [Wed, 28 Feb 2024 06:10:47 +0000 (22:10 -0800)]
drm/xe/mocs: Allow odd number of entries on test
Refactor the mocs/l3cc kunit test to support odd number of entries. This
switches out from the "check the register value" approach to check the
entry value if it makes sense from the register read. This provides an
easier output to reason about and cross check with bspec.
Some code reordering and variable re-use was also done so the 2
functions follow more or less the same logic.
Lucas De Marchi [Wed, 28 Feb 2024 06:10:46 +0000 (22:10 -0800)]
drm/xe/mocs: Move warn/assertion up
The warn-once in __init_mocs_table() to make sure there's an index set
for unused entries is more a sanity check that should be done as the
first thing in that function. The kunit test replicates the same check,
so also move it up and turn it into a failure condition for the test.
Lucas De Marchi [Wed, 28 Feb 2024 06:10:44 +0000 (22:10 -0800)]
drm/xe/mocs: Refactor mocs/l3cc loop
There's no reason to keep the assignment an condition in the same
statement, particularly making use of the comma operator. Improve
readability by doing each step on its own statement. This will make
supporting odd number of entries more easily.
Matthew Brost [Sun, 25 Feb 2024 00:14:48 +0000 (16:14 -0800)]
drm/xe: Fix build error in xe_ggtt.c
Need to include io-64-nonatomic-lo-hi.h for writeq function.
Commit 3121fed0c51b ("drm/xe: Cleanup some layering in GGTT")
removed the xe_mmio.h include so lost the indirect include. Add it
where it's needed.
Fixes: 3121fed0c51b ("drm/xe: Cleanup some layering in GGTT") Closes: https://lore.kernel.org/oe-kbuild-all/202402241903.R5J8hKVI-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240225001448.81513-1-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matt Roper [Thu, 22 Feb 2024 18:40:08 +0000 (10:40 -0800)]
drm/xe: Add LRC parsing for more GPU instructions
The LRCs on some of our newer platforms appear to contain a few GPU
instructions that weren't handled in our LRC parser. Add the relevant
instruction names and opcodes so that our debugfs LRC dumps will
properly indicate what these are.
Arnd Bergmann [Mon, 26 Feb 2024 12:46:38 +0000 (13:46 +0100)]
drm/xe/xe2: fix 64-bit division in pte_update_size
This function does not build on 32-bit targets when the compiler
fails to reduce DIV_ROUND_UP() into a shift:
ld.lld: error: undefined symbol: __aeabi_uldivmod
>>> referenced by xe_migrate.c
>>> drivers/gpu/drm/xe/xe_migrate.o:(pte_update_size) in archive vmlinux.a
There are two instances in this function. Change the first to
use an open-coded shift with the same behavior, and the second
one to a 32-bit calculation, which is sufficient here as the size
is never more than 2^32 pages (16TB).
Arnd Bergmann [Mon, 26 Feb 2024 12:46:37 +0000 (13:46 +0100)]
drm/xe/mmio: fix build warning for BAR resize on 32-bit
clang complains about a nonsensical test on builds with a 32-bit phys_addr_t,
which means resizing will always fail:
drivers/gpu/drm/xe/xe_mmio.c:109:23: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
109 | root_res->start > 0x100000000ull)
| ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
Previously, BAR resize was always disallowed on 32-bit kernels, but
this apparently changed recently. Since 32-bit machines can in theory
support PAE/LPAE for large address spaces, this may end up useful,
so change the driver to shut up the warning but still work when
phys_addr_t/resource_size_t is 64 bit wide.
Fixes: 9a6e6c14bfde ("drm/xe/mmio: Use non-atomic writeq/readq variant for 32b") Fixes: 237412e45390 ("drm/xe: Enable 32bits build") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-2-arnd@kernel.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Mika Kuoppala [Thu, 15 Feb 2024 18:11:52 +0000 (20:11 +0200)]
drm/xe: Deny unbinds if uapi ufence pending
If user fence was provided for MAP in vm_bind_ioctl
and it has still not been signalled, deny UNMAP of said
vma with EBUSY as long as unsignalled fence exists.
This guarantees that MAP vs UNMAP sequences won't
escape under the radar if we ever want to track the
client's state wrt to completed and accessible MAPs.
By means of intercepting the ufence release signalling.
v2: find ufence with num_fences > 1 (Matt)
v3: careful on clearing vma ufence (Matt)
Mika Kuoppala [Thu, 15 Feb 2024 18:11:51 +0000 (20:11 +0200)]
drm/xe: Expose user fence from xe_sync_entry
By allowing getting reference to user fence, we can
control the lifetime outside of sync entries.
This is needed to allow vma to track the associated
user fence that was provided with bind ioctl.
v2: xe_user_fence can be kept opaque (Jani, Matt)
v3: indent fix (Matt)
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240215181152.450082-2-mika.kuoppala@linux.intel.com
Matthew Brost [Fri, 23 Feb 2024 20:46:59 +0000 (12:46 -0800)]
drm/xe/guc: Handle timing out of signaled jobs gracefully
Timing out of signaled jobs can happen during regular operations (e.g.
an exec queue closed immediately after last fence signaled). The TDR can
pass the worker which free jobs. Rather than running through the TDR if
signaled job is found, simply free it without any debug messages.
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reported-by: José Roberto de Souza <jose.souza@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1271 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223204659.40750-1-matthew.brost@intel.com
Paulo Zanoni [Thu, 15 Feb 2024 00:53:53 +0000 (16:53 -0800)]
drm/xe: get rid of MAX_BINDS
Mesa has been issuing a single bind operation per ioctl since xe.ko
changed to GPUVA due xe.ko bug #746. If I change Mesa to try again to
issue every single bind operation it can in the same ioctl, it hits
the MAX_BINDS assertion when running Vulkan conformance tests.
Test dEQP-VK.sparse_resources.transfer_queue.3d.rgba32i.1024_128_8
issues 960 bind operations in a single ioctl, it's the most I could
find in the conformance suite.
I don't see a reason to keep the MAX_BINDS restriction: it doesn't
seem to be preventing any specific issue. If the number is too big for
the memory allocations, then those will fail. Nothing related to
num_binds seems to be using the stack. Let's just get rid of it.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Testcase: dEQP-VK.sparse_resources.transfer_queue.3d.rgba32i.1024_128_8
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/746 Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240215005353.1295420-1-paulo.r.zanoni@intel.com
Matthew Brost [Mon, 26 Feb 2024 15:55:54 +0000 (07:55 -0800)]
drm/xe: Use vmalloc for array of bind allocation in bind IOCTL
Use vmalloc in effort to allow a user pass in a large number of binds in
an IOCTL (mesa use case). Also use array allocations rather open coding
the size calculation.
Francois Dugast [Thu, 8 Feb 2024 18:35:39 +0000 (10:35 -0800)]
drm/xe: Extend uAPI to query HuC micro-controler firmware version
The infrastructure to query GuC firmware version is already in place. It
is extended with a new micro-controller type to query the HuC firmware
version. It can be used from user space to know if HuC is running.
Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240208183539.185095-2-jose.souza@intel.com
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:36 +0000 (11:39 -0500)]
drm/xe: Convert gt_reset from mem_access to xe_pm_runtime
We need to ensure that device is in D0 on any kind of GT reset.
We are likely already protected by outer bounds like exec,
but if exec/sched ref gets dropped on a hang, we might transition
to D3 before we are able to perform the gt_reset and recover.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:35 +0000 (11:39 -0500)]
drm/xe: Remove mem_access from suspend and resume functions
At these points, we are sure that device is awake in D0.
Likely in the middle of the transition, but awake. So,
these extra protections are useless. Let's remove it and
continue with the killing of xe_device_mem_access.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:31 +0000 (11:39 -0500)]
drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
Continue on the path to entirely remove mem_access helpers in
favour of the direct xe_pm_runtime calls. This item is one of
the direct outer bounds of the protection.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:28 +0000 (11:39 -0500)]
drm/xe: Runtime PM wake on every sysfs call
Let's ensure our PCI device is awaken on every sysfs call.
Let's increase the runtime_pm protection and start moving
that to the outer bounds.
For now, for the files with small number of attr functions,
let's only call the runtime pm functions directly.
For the hw_engines entries with many files, let's add
the sysfs_ops wrapper.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:27 +0000 (11:39 -0500)]
drm/xe: Convert kunit tests from mem_access to xe_pm_runtime
Let's convert the kunit tests that are currently relying on
xe_device_mem_access_{get,put} towards the direct xe_pm_runtime_{get,put}.
While doing this we need to move the get/put calls towards the outer
bounds of the tests to ensure consistency with the other usages of
pm_runtime on the regular paths.
v2: include xe_pm.h in tests/xe_mocs.c and sort the include block
while at it.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:26 +0000 (11:39 -0500)]
drm/xe: Runtime PM wake on every IOCTL
Let's ensure our PCI device is awaken on every IOCTL entry.
Let's increase the runtime_pm protection and start moving
that to the outer bounds.
v2: minor typo fix and renaming function to make it clear
that is intended to be used by ioctl only. (Matt)
v3: Make it NULL if CONFIG_COMPAT is not selected.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:25 +0000 (11:39 -0500)]
drm/xe: Convert mem_access assertion towards the runtime_pm state
The mem_access helpers are going away and getting replaced by
direct calls of the xe_pm_runtime_{get,put} functions. However, an
assertion with a warning splat is desired when we hit the worst
case of a memory access with the device really in the 'suspended'
state.
Also, this needs to be the first step. Otherwise, the upcoming
conversion would be really noise with warn splats of missing mem_access
gets.
Matthew Brost [Thu, 22 Feb 2024 23:20:21 +0000 (15:20 -0800)]
drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer
The xe_gt_tlb_invalidation layer implements TLB invalidations for a GuC
backend. Simply return if in execlists mode. A follow up may properly
implement the xe_gt_tlb_invalidation layer for both GuC and execlists.
Matthew Brost [Thu, 22 Feb 2024 23:20:19 +0000 (15:20 -0800)]
drm/xe: Fix execlist splat
Although execlist submission is not supported it should be kept in a
basic working state as it can be used for very early hardware bring up.
Fix the below splat.
Francois Dugast [Thu, 22 Feb 2024 23:23:56 +0000 (18:23 -0500)]
drm/xe/uapi: Remove unused flags
Those cases missed in previous uAPI cleanups were mostly accidentally
brought in from i915 or created to exercise the possibilities of gpuvm
but they are not used by userspace yet, so let's remove them. They can
still be brought back later if needed.
v2:
- Fix XE_VM_FLAG_FAULT_MODE support in xe_lrc.c (Brian Welty)
- Leave DRM_XE_VM_BIND_OP_UNMAP_ALL (José Roberto de Souza)
- Ensure invalid flag values are rejected (Rodrigo Vivi)
v3: Rebase after removal of persistent exec_queues (Francois Dugast)
v4: Rodrigo: Rebase after the new dumpable flag.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222232356.175431-1-rodrigo.vivi@intel.com
the preferred way in the kernel is to use the struct_size() helper to
do the arithmetic instead of the argument "size + size * count" in the
kzalloc() function.
This way, the code is more readable and more safer.
Lucas De Marchi [Thu, 22 Feb 2024 14:41:24 +0000 (06:41 -0800)]
drm/xe: Use pointers in trace events
Commit a0df2cc858c3 ("drm/xe/xe_bo_move: Enhance xe_bo_move trace")
inadvertently reverted commit 8d038f49c1f3 ("drm/xe: Fix cast on trace
variable"), breaking the build on 32bits.
As noted by Ville, there's no point in converting the pointers to u64
and add casts everywhere. In fact, it's better to just use %p and let
the address be hashed. Convert all the cases in xe_trace.h to use
pointers.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Priyanka Dandamudi <priyanka.dandamudi@intel.com> Cc: Oak Zeng <oak.zeng@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222144125.2862546-1-lucas.demarchi@intel.com
Dafna Hirschfeld [Wed, 21 Feb 2024 08:36:22 +0000 (10:36 +0200)]
drm/xe: Do not include current dir for generated/xe_wa_oob.h
The generated file 'generated/xe_wa_oob.h' is included using:
"generated/xe_wa_oob.h"
which first look inside the source code. But the file resides
in the build directory and should therefore be included using:
<generated/xe_wa_oob.h>
drm/xe: Implement VM snapshot support for BO's and userptr
Since we cannot immediately capture the BO's and userptr, perform it in
2 stages. The immediate stage takes a reference to each BO and userptr,
while a delayed worker captures the contents and then frees the
reference.
This is required because in signaling context, no locks can be taken, no
memory can be allocated, and no waits on userspace can be performed.
With the delayed worker, all of this can be performed very easily,
without having to resort to hacks.
Changes since v1:
- Fix crash on NULL captured vm.
- Use ascii85_encode to capture BO contents and save some space.
- Add length to coredump output for each captured area.
Changes since v2:
- Dump each mapping on their own line, to simplify tooling.
- Fix null pointer deref in xe_vm_snapshot_free.
Changes since v3:
- Don't add uninitialized value to snap->ofs. (Souza)
- Use kernel types for u32 and u64.
- Move snap_mutex destruction to final vm destruction. (Souza)
Changes since v4:
- Remove extra memset. (Souza)
Add the flag XE_VM_BIND_FLAG_DUMPABLE to notify devcoredump that this
mapping should be dumped.
This is not hooked up, but the uapi should be ready before merging.
It's likely easier to dump the contents of the bo's at devcoredump
readout time, so it's better if the bos will stay unmodified after
a hang. The NEEDS_CPU_MAPPING flag is removed as requirement.
Ashutosh Dixit [Tue, 6 Feb 2024 19:27:31 +0000 (11:27 -0800)]
drm/xe/xe_gt_idle: Drop redundant newline in name
Newline in name is redunant and produces an unnecessary empty line during
'cat name'. Newline is added during sysfs_emit. See '27a1a1e2e47d ("drm/xe:
stringify the argument to avoid potential vulnerability")'.
v2: Add Fixes tag (Riana)
Fixes: 7b076d14f21a ("drm/xe/mtl: Add support to get C6 residency/status of MTL") Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Michał Winiarski [Mon, 19 Feb 2024 13:05:30 +0000 (14:05 +0100)]
drm/xe: Initialize GuC earlier during probe
SR-IOV VF has limited access to MMIO registers. Fortunately, it is able
to access a curated subset that is needed to initialize the driver by
communicating with SR-IOV PF using GuC CT.
Initialize GuC earlier in order to keep the unified probe ordering
between VF and PF modes.
Michał Winiarski [Mon, 19 Feb 2024 13:05:29 +0000 (14:05 +0100)]
drm/xe/guc: Move GuC power control init to "post-hwconfig"
SLPC is not used at "hwconfig" stage. Move the initialization of data
structures used for SLPC to a later point in probe.
Also - move the xe_guc_pc_init_early to happen just prior to initial
"hwconfig" load.
Michał Winiarski [Mon, 19 Feb 2024 13:05:28 +0000 (14:05 +0100)]
drm/xe/huc: Realloc HuC FW in vram for post-hwconfig
Similar to GuC, we're using system memory for the initial stage, and
move the image to vram when it's available for subsequent loads (e.g.
after reset).
Michał Winiarski [Mon, 19 Feb 2024 13:05:27 +0000 (14:05 +0100)]
drm/xe/guc: Allocate GuC data structures in system memory for initial load
GuC load will need to happen at an earlier point in probe, where local
memory is not yet available. Use system memory for GuC data structures
used for initial "hwconfig" load, and realloc at a later,
"post-hwconfig" load if needed, when local memory is available.