git.dujemihanovic.xyz Git

drm/i915: 2 GiB of relocations ought to be enough for anybody*

Kernel test robot reports i915 can hit a warn in kvmalloc_node which has
a purpose of dissalowing crazy size kernel allocations. This was added in
7661809d493b ("mm: don't allow oversized kvmalloc() calls"):

       /* Don't even allow crazy sizes */
       if (WARN_ON_ONCE(size > INT_MAX))
               return NULL;

This would be kind of okay since i915 at one point dropped the need for
making a shadow copy of the relocation list, but then it got re-added in
fd1500fcd442 ("Revert "drm/i915/gem: Drop relocation slowpath".") a year
after Linus added the above warning.

It is plausible that the issue was not seen until now because to trigger
gem_exec_reloc test requires a combination of an relatively older
generation hardware but with at least 8GiB of RAM installed. Probably even
more depending on runtime checks.

Lets cap what we allow userspace to pass in using the matching limit.
There should be no issue for real userspace since we are talking about
"crazy" number of relocations which have no practical purpose.

*) Well IGT tests might get upset but they can be easily adjusted.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202405151008.6ddd1aaf-oliver.sang@intel.com
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521101201.18978-1-tursulin@igalia.com

drm/i915: Attempt to get pages without eviction first

In commit a78a8da51b36 ("drm/ttm: replace busy placement with flags v6"),
__i915_ttm_get_pages was updated to use flags instead of the separate
'busy' placement list. However, the behaviour was subtly changed.
Originally, the function would attempt to use the preferred placement
without eviction, and give an opportunity to restart the operation
before falling back to allowing eviction.

This was unintentionally changed, as the preferred placement was not
given the TTM_PL_FLAG_DESIRED flag, and so eviction could be triggered
in that first pass. This caused thrashing, and a significant performance
regression on DG2 systems with small BAR. For example, Minecraft and
Team Fortress 2 would drop to single-digit framerates.

Restore the original behaviour by marking the initial placement as
desired on that first attempt. Also, rework this to use a separate
struct ttm_palcement, as the individual placements are marked 'const',
so hot-patching the flags is even more dodgy than before.

Thanks to Justin Brewer for bisecting this.

Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11255
Signed-off-by: David Gow <david@davidgow.net>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240804091851.122186-3-david@davidgow.net

drm/i915: Allow evicting to use the requested placement

In commit a78a8da51b36 ("drm/ttm: replace busy placement with flags v6"),
the old system of having a separate placement list (for placements
which should be used without eviction) and a 'busy' placement list (for
placements which should be attempted if eviction is required) was
replaced with a new one where placements could be marked 'FALLBACK' (to
be attempted if eviction is required) or 'DESIRED' (to be attempted
first, but not if eviction is required).

i915 had always included the requested placement in the list of
'busy' placements: i.e., the placement could be used either if eviction
is required or not. But when the new system was put in place, the
requested (first) placement was marked 'DESIRED', so would never be used
if eviction became necessary. While a bug in the original commit
prevented this flag from working, when this was fixed in
4a0e7b3c ("drm/i915: fix applying placement flag"), it caused long hangs
on DG2 systems with small BAR.

Don't mark the requested placement DESIRED (or FALLBACK), allowing it to
be used in both situations. This matches the old behaviour, and resolves
the hangs.

Thanks to Justin Brewer for bisecting the issue.

Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Fixes: 4a0e7b3c3753 ("drm/i915: fix applying placement flag")
Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11255
Signed-off-by: David Gow <david@davidgow.net>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240804091851.122186-2-david@davidgow.net

drm/i915/gem: Fix Virtual Memory mapping boundaries calculation

Calculating the size of the mapped area as the lesser value
between the requested size and the actual size does not consider
the partial mapping offset. This can cause page fault access.

Fix the calculation of the starting and ending addresses, the
total size is now deduced from the difference between the end and
start addresses.

Additionally, the calculations have been rewritten in a clearer
and more understandable form.

Fixes: c58305af1835 ("drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass")
Reported-by: Jann Horn <jannh@google.com>
Co-developed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Jonathan Cavitt <Jonathan.cavitt@intel.com>
[Joonas: Add Requires: tag]
Requires: 60a2066c5005 ("drm/i915/gem: Adjust vma offset for framebuffer mmap offset")
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240802083850.103694-3-andi.shyti@linux.intel.com

drm/i915/gem: Adjust vma offset for framebuffer mmap offset

When mapping a framebuffer object, the virtual memory area (VMA)
offset ('vm_pgoff') should be adjusted by the start of the
'vma_node' associated with the object. This ensures that the VMA
offset is correctly aligned with the corresponding offset within
the GGTT aperture.

Increment vm_pgoff by the start of the vma_node with the offset=
provided by the user.

Suggested-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
[Joonas: Add Cc: stable]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240802083850.103694-2-andi.shyti@linux.intel.com

drm/i915/gt: Empty uabi engines list during intel_engines_release()

While the uabi_engines_llist is populated in intel_engines_init() during
driver load, the corresponding function intel_engines_release() does not
correctly get rid of it. This can lead to a UAF if, after failed
initialization (for example when gt is set wedged on init), we try to
access the engines.

Suggested-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801154047.115176-2-krzysztof.niemiec@intel.com

drm/i915/gt: Add Wa_14019789679

Wa_14019789679 implementation for MTL, ARL and DG2.

v2: Corrected condition

v3:
   - Fix indentation (Jani Nikula)
   - dword size should be 0x1 and
     initialize dword to 0 instead of MI_NOOP (Tejas)
   - Use IS_GFX_GT_IP_RANGE() (Tejas)

v4:
   - 3DSTATE_MESH_CONTROL instruction is 3 dwords long
     Align with dword size. (Roper, Matthew D)
   - Add RCS engine check. (Tejas)

Bspec: 47083

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240731155614.3460645-1-nitin.r.gote@intel.com

drm/i915/guc: Extend w/a 14019159160

There is a new part to an existing workaround, so enable that piece as
well.

v2: Extend even further.
v3: Drop DG2 as there are CI failures still to resolve. Also re-order
the parameters to a function to reduce excessive line wrapping.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240622004636.662081-3-John.C.Harrison@Intel.com

drm/i915/arl: Enable Wa_14019159160 for ARL

The context switch out workaround also applies to ARL.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240622004636.662081-2-John.C.Harrison@Intel.com

drm/i915: Allow NULL memory region

Prevent a NULL pointer access in intel_memory_regions_hw_probe.

Fixes: 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11704
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240712214156.3969584-1-jonathan.cavitt@intel.com

drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8

We're seeing a GPU hang issue on a CHV platform, which was caused by commit
bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for
Gen8").

The Gen8 platform only supports timeslicing and doesn't have a preemption
mechanism, as its engines do not have a preemption timer.

Commit 751f82b353a6 ("drm/i915/gt: Only disable preemption on Gen8 render
engines") addressed this issue only for render engines. This patch extends
that fix by ensuring that preemption is not considered for all engines on
Gen8 platforms.

v4:
- Use the correct Fixes tag (Rodrigo Vivi)
- Reworded commit log (Andi Shyti)

v3:
- Inside need_preempt(), condition of can_preempt() is not required
as simplified can_preempt() is enough. (Chris Wilson)

v2: Simplify can_preempt() function (Tvrtko Ursulin)

Fixes: 751f82b353a6 ("drm/i915/gt: Only disable preemption on gen8 render engines")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
Suggested-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
CC: <stable@vger.kernel.org> # v5.12+
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711163208.1355736-1-nitin.r.gote@intel.com

i915/perf: Remove code to update PWR_CLK_STATE for gen12

PWR_CLK_STATE only needs to be modified up until gen11. For gen12 this
code is not applicable. Remove code to update context image with
PWR_CLK_STATE for gen12.

Fixes: 00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240629005643.3050678-1-umesh.nerlige.ramappa@intel.com

drm/i915/gem: Suppress oom warning in favour of ENOMEM to userspace

We report object allocation failures to userspace with ENOMEM
so add __GFP_NOWARN to remove superfluous oom warnings.

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/4936
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240626143318.11600-1-nirmoy.das@intel.com

drm/i915/gem: Use the correct format specifier for resource_size_t

Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
warning") adds a debug message where the "lmem_size" and
"dsm_base" variables are printed using the %lli identifier.

However, these variables are defined as resource_size_t, which
are unsigned long for 32-bit machines and unsigned long long for
64-bit machines.

The documentation (core-api/printk-formats.rst) recommends using
the %pa specifier for printing addresses and sizes of resources.

Replace %lli with %pa.

This patch also mutes the following sparse warning when compiling
with:

   make W=1 ARCH=i386 drivers/gpu/drm/i915

>> drivers/gpu/drm/i915/gem/i915_gem_stolen.c:941:5: error: format '%lli'
   expects argument of type 'long long int', but argument 5 has type
   'resource_size_t' {aka 'unsigned int'} [-Werror=format=]

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240617184243.330231-3-andi.shyti@linux.intel.com

drm/i915/gem: Return NULL instead of '0'

Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's
supposed to return a pointer to the intel_memory_region
structure.

Sparse complains with the following message:

>> drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse:
Using plain integer as NULL pointer

Return NULL.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240617184243.330231-2-andi.shyti@linux.intel.com

drm/i915/gt: Fix potential UAF by revoke of fence registers

CI has been sporadically reporting the following issue triggered by
igt@i915_selftest@live@hangcheck on ADL-P and similar machines:

<6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence
...
<6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
<6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
<3> [414.070354] Unable to pin Y-tiled fence; err:-4
<3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active))
...
<4>[  609.603992] ------------[ cut here ]------------
<2>[  609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301!
<4>[  609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[  609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G     U  W          6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1
<4>[  609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4>[  609.604010] Workqueue: i915 __i915_gem_free_work [i915]
<4>[  609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915]
...
<4>[  609.604271] Call Trace:
<4>[  609.604273]  <TASK>
...
<4>[  609.604716]  __i915_vma_evict+0x2e9/0x550 [i915]
<4>[  609.604852]  __i915_vma_unbind+0x7c/0x160 [i915]
<4>[  609.604977]  force_unbind+0x24/0xa0 [i915]
<4>[  609.605098]  i915_vma_destroy+0x2f/0xa0 [i915]
<4>[  609.605210]  __i915_gem_object_pages_fini+0x51/0x2f0 [i915]
<4>[  609.605330]  __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915]
<4>[  609.605440]  process_scheduled_works+0x351/0x690
...

In the past, there were similar failures reported by CI from other IGT
tests, observed on other platforms.

Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity
before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for
idleness of vma->active via fence_update().   That commit introduced
vma->fence->active in order for the fence_update() to be able to wait
selectively on that one instead of vma->active since only idleness of
fence registers was needed.  But then, another commit 0d86ee35097a
("drm/i915/gt: Make fence revocation unequivocal") replaced the call to
fence_update() in i915_vma_revoke_fence() with only fence_write(), and
also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front.
No justification was provided on why we might then expect idleness of
vma->fence->active without first waiting on it.

The issue can be potentially caused by a race among revocation of fence
registers on one side and sequential execution of signal callbacks invoked
on completion of a request that was using them on the other, still
processed in parallel to revocation of those fence registers.  Fix it by
waiting for idleness of vma->fence->active in i915_vma_revoke_fence().

Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.8+
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com

drm/i915/gt/uc: Evaluate GuC priority within locks

The ce->guc_state.lock was made to protect guc_prio, which
indicates the GuC priority level.

But at the begnning of the function we perform some sanity check
of guc_prio outside its protected section. Move them within the
locked region.

Use this occasion to expand the if statement to make it clearer.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240613222402.551625-2-andi.shyti@linux.intel.com

drm/i915/gt/uc: Fix typo in comment

Replace "dynmically" with "dynamically".

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240613222837.552277-1-andi.shyti@linux.intel.com

drm/i915/gem: Downgrade stolen lmem setup warning

In the case where lmem_size < dsm_base, hardware is reporting that
stolen lmem is unusable. In this case, instead of throwing a warning,
we can continue execution as normal by disabling stolen LMEM support.
For example, this change will allow the following error report from
ATS-M to no longer apply:

<6> [144.859887] pcieport 0000:4b:00.0: bridge window [mem 0xb1000000-0xb11fffff]
<6> [144.859900] pcieport 0000:4b:00.0: bridge window [mem 0x3bbc00000000-0x3bbc17ffffff 64bit pref]
<6> [144.859917] pcieport 0000:4c:01.0: PCI bridge to [bus 4d-4e]
<6> [144.859932] pcieport 0000:4c:01.0: bridge window [mem 0xb1000000-0xb11fffff]
<6> [144.859945] pcieport 0000:4c:01.0: bridge window [mem 0x3bbc00000000-0x3bbc17ffffff 64bit pref]
<6> [144.859984] i915 0000:4d:00.0: [drm] BAR2 resized to 256M
<6> [144.860640] i915 0000:4d:00.0: [drm] Using a reduced BAR size of 256MiB. Consider enabling 'Resizable BAR' or similar, if available in the BIOS.
<4> [144.860719] -----------[ cut here ]-----------
<4> [144.860727] WARNING: CPU: 17 PID: 1815 at drivers/gpu/drm/i915/gem/i915_gem_stolen.c:939 i915_gem_stolen_lmem_setup+0x38c/0x430 [i915]
<4> [144.861430] Modules linked in: i915 snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem drm_shmem_helper prime_numbers i2c_algo_bit ttm video drm_display_helper drm_buddy fuse x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe mdio irqbypass ptp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pps_core i2c_i801 mei_me i2c_smbus mei wmi acpi_power_meter [last unloaded: i915]
<4> [144.861611] CPU: 17 PID: 1815 Comm: i915_module_loa Tainted: G U 6.8.0-rc5-drmtip_1515-g78f49af27723+ #1
<4> [144.861624] Hardware name: Intel Corporation WHITLEY/WHITLEY, BIOS SE5C6200.86B.0020.P41.2109300305 09/30/2021
<4> [144.861632] RIP: 0010:i915_gem_stolen_lmem_setup+0x38c/0x430 [i915]
<4> [144.862287] Code: ff 41 c1 e4 05 e9 ac fe ff ff 4d 63 e4 48 89 ef 48 85 ed 74 04 48 8b 7d 08 48 c7 c6 10 a3 7b a0 e8 e9 90 43 e1 e9 ee fd ff ff <0f> 0b 49 c7 c4 ed ff ff ff e9 e0 fd ff ff 0f b7 d2 48 c7 c6 00 d9
<4> [144.862299] RSP: 0018:ffffc90005607980 EFLAGS: 00010207
<4> [144.862315] RAX: fffffffffff00000 RBX: 0000000000000003 RCX: 0000000000000000

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10833
Suggested-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422135959.4127003-1-jonathan.cavitt@intel.com

drm/i915/gt: debugfs: Evaluate forcewake usage within locks

The forcewake count and domains listing is multi process critical
and the uncore provides a spinlock for such cases.

Lock the forcewake evaluation section in the fw_domains_show()
debugfs interface.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240607145131.217251-1-andi.shyti@linux.intel.com

drm/i915/mtl: Update workaround 14018575942

The WA should be extended to cover VDBOX engine. We found that
28-channels 1080p VP9 encoding may hit this issue.

v3: update the WA number and explain the reason why
this workaround is needed
v2: add WA number
v1: initial version

Signed-off-by: Angus Chen <angus.chen@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426073638.45775-1-angus.chen@intel.com

drm/i915/guc: Enable w/a 16021333562 for DG2, MTL and ARL

Enable another workaround that is implemented inside the GuC.

v2: Use the correct Gen12 w/a id rather than the Xe version (review
feedback from Matthew R) also extend to include ARL.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240528230515.479395-1-John.C.Harrison@Intel.com

drm/i915/gt: Delete the live_hearbeat_fast selftest

The test is trying to push the heartbeat frequency to the limit, which
might sometimes fail. Such a failure does not provide valuable
information, because it does not indicate that there is something
necessarily wrong with either the driver or the hardware.

Remove the test to prevent random, unnecessary failures from appearing
in CI.

Suggested-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Niemiec, Krzysztof <krzysztof.niemiec@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fe2vu5h7v7ooxbhwpbfsypxg5mjrnt56gc3cgrqpnhgrgce334@qfrv2skxrp47

drm/i915: Increase FLR timeout from 3s to 9s

Following the guidelines it takes 3 seconds to perform an FLR
reset. Let's give it a bit more slack because this time can
change depending on the platform and on the firmware

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240523235853.171796-1-andi.shyti@linux.intel.com

drm/i915/gt: Fix CCS id's calculation for CCS mode setting

The whole point of the previous fixes has been to change the CCS
hardware configuration to generate only one stream available to
the compute users. We did this by changing the info.engine_mask
that is set during device probe, reset during the detection of
the fused engines, and finally reset again when choosing the CCS
mode.

We can't use the engine_mask variable anymore, as with the
current configuration, it imposes only one CCS no matter what the
hardware configuration is.

Before changing the engine_mask for the third time, save it and
use it for calculating the CCS mode.

After the previous changes, the user reported a performance drop
to around 1/4. We have tested that the compute operations, with
the current patch, have improved by the same factor.

Fixes: 6db31251bb26 ("drm/i915/gt: Enable only one CCS for compute workload")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Gnattu OC <gnattuoc@me.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Jian Ye <jian.ye@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Tested-by: Gnattu OC <gnattuoc@me.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517090616.242529-1-andi.shyti@linux.intel.com

drm/i915/guc: avoid FIELD_PREP warning

With gcc-7 and earlier, there are lots of warnings like

In file included from <command-line>:0:0:
In function '__guc_context_policy_add_priority.isra.66',
    inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3,
    inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2:
include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^
...
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP'
   FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) | \
   ^~~~~~~~~~

Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning.

Fixes: 77b6f79df66e ("drm/i915/guc: Update to GuC version 69.0.3")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com

drm/i915: Support replaying GPU hangs with captured context image

When debugging GPU hangs Mesa developers are finding it useful to replay
the captured error state against the simulator. But due various simulator
limitations which prevent replicating all hangs, one step further is being
able to replay against a real GPU.

This is almost doable today with the missing part being able to upload the
captured context image into the driver state prior to executing the
uploaded hanging batch and all the buffers.

To enable this last part we add a new context parameter called
I915_CONTEXT_PARAM_CONTEXT_IMAGE. It follows the existing SSEU
configuration pattern of being able to select which context to apply
against, paired with the actual image and its size.

Since this is adding a new concept of debug only uapi, we hide it behind
a new kconfig option and also require activation with a module parameter.
Together with a warning banner printed at driver load, all those combined
should be sufficient to guard against inadvertently enabling the feature.

In terms of implementation we allow the legacy context set param to be
used since that removes the need to record the per context data in the
proto context, while still allowing flexibility of specifying context
images for any context.

Mesa MR using the uapi can be seen at:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594

v2:
* Fix whitespace alignment as per checkpatch.
* Added warning on userspace misuse.
* Rebase for extracting ce->default_state shadowing.

v3:
* Rebase for I915_CONTEXT_PARAM_LOW_LATENCY.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Carlos Santa <carlos.santa@intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145939.87427-2-tursulin@igalia.com

drm/i915: Shadow default engine context image in the context

To enable adding override of the default engine context image let us start
shadowing the per engine state in the context.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145939.87427-1-tursulin@igalia.com

Merge drm/drm-next into drm-intel-gt-next

Some display refactoring patches are needed in order to allow conflict-
less merging.

Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

drm/tests: Add a unit test for range bias allocation

Allocate cleared blocks in the bias range when the DRM
buddy's clear avail is zero. This will validate the bias
range allocation in scenarios like system boot when no
cleared blocks are available and exercise the fallback
path too. The resulting blocks should always be dirty.

v1:(Matthew)
- move the size to the variable declaration section.
- move the mm.clear_avail init to allocator init.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145636.16253-2-Arunpravin.PaneerSelvam@amd.com

drm/buddy: Fix the range bias clear memory allocation issue

Problem statement: During the system boot time, an application request
for the bulk volume of cleared range bias memory when the clear_avail
is zero, we dont fallback into normal allocation method as we had an
unnecessary clear_avail check which prevents the fallback method leads
to fb allocation failure following system goes into unresponsive state.

Solution: Remove the unnecessary clear_avail check in the range bias
allocation function.

v2: add a kunit for this corner case (Daniel Vetter)

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145636.16253-1-Arunpravin.PaneerSelvam@amd.com

drm/i915/gt: Disarm breadcrumbs if engines are already idle

The breadcrumbs use a GT wakeref for guarding the interrupt, but are
disarmed during release of the engine wakeref. This leaves a hole where
we may attach a breadcrumb just as the engine is parking (after it has
parked its breadcrumbs), execute the irq worker with some signalers still
attached, but never be woken again.

That issue manifests itself in CI with IGT runner timeouts while tests
are waiting indefinitely for release of all GT wakerefs.

<6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
<7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
<7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
<7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
<7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
<7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
<7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
<4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
...
<6> [299.356526] sysrq: Show State
...
<6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
<6> [299.373967] Call Trace:
<6> [299.373968]  <TASK>
<6> [299.373970]  __schedule+0x3bb/0xda0
<6> [299.373974]  schedule+0x41/0x110
<6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
<6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
<6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
<6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
<6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
<6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
<6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
<6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
<6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
<6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]

At the end of the interrupt worker, if there are no more engines awake,
disarm the breadcrumb and go to sleep.

Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423165505.465734-2-janusz.krzysztofik@linux.intel.com

drm/i915/gem/i915_gem_ttm_move: Fix typo

The mapings should be replaced by mappings.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Deming Wang <wangdeming@inspur.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513061451.1627-1-wangdeming@inspur.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge tag 'drm-xe-next-fixes-2024-05-09-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Driver Changes:
- Use ordered WQ for G2H handler. (Matthew Brost)
- Use flexible-array rather than zero-sized (Lucas De Marchi)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zjz7SzCvfA3vQRxu@fedora

Merge tag 'drm-misc-next-fixes-2024-05-08' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next-fixes for v6.10-rc1:
- panthor fixes.
- Reverting Kconfig changes, and moving drm options to submenu.
- Hide physical fb address in fb helper.
- zynqmp bridge fix.
- Revert broken ti-sn65dsi83 fix.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fe630414-d13e-4052-86f3-ce3155eb3e44@linux.intel.com

Merge tag 'drm-msm-next-2024-05-07' of https://gitlab.freedesktop.org/drm/msm into drm-next

Updates for v6.10

Core:
- Switched to generating register header files during build process
instead of shipping pre-generated headers
- Merged DPU and MDP4 format databases.

DP:
- Stop using compat string to distinguish DP and eDP cases
- Added support for X Elite platform (X1E80100)
- Reworked DP aux/audio support
- Added SM6350 DP to the bindings (no driver changes, using SM8350
as a fallback compat)

GPU:
- a7xx perfcntr reg fixes
- MAINTAINERS updates
- a750 devcoredump support

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtpw6dNR9JBikFTQ=TCpt-9FeFW+SGjXWv+Jv3emm0Pbg@mail.gmail.com

drm/xe/ads: Use flexible-array

Zero-length arrays are deprecated and flexible arrays should be
used instead: https://www.kernel.org/doc/html/v6.9-rc7/process/deprecated.html#zero-length-and-one-element-arrays

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Closes: https://lore.kernel.org/r/202405051824.AmjAI5Pg-lkp@intel.com/
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506141917.205714-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit ee7284230644e21fef0e38fc5bf8f907b6bb7f7c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe: Use ordered WQ for G2H handler

System work queues are shared, use a dedicated work queue for G2H
processing to avoid G2H processing getting block behind system tasks.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506034758.3697397-1-matthew.brost@intel.com
(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Remove counter productive REGION_* wrappers

This extra macro level between the region IDs and their bitmasks
just makes it harder to see what is used where. Get rid of the
wrappers.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-3-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Pass the region ID rather than a bitmask to HAS_REGION()

The name 'HAS_REGION()' suggests we are checking for a single
region, so seem more sensible to pass in the region ID rather
than a bitmask.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-2-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Fix HAS_REGION() usage in intel_gt_probe_lmem()

HAS_REGION() takes a bitmask, not the region ID. This causes the
GEM_BUG_ON() to assert that the SMEM region is available rather
than the intended LMEM region. No real harm since SMEM is always
available, but also not checking what was intended.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Revert "drm/i915: Remove extra multi-gt pm-references"

This reverts commit 1f33dc0c1189efb9ae19c6fc22b64dd3e26261fb.

There was a patch supposed to fix an issue of illegal attempts to free a
still active i915 VMA object when parking a GT believed to be idle,
reported by CI on 2-GT Meteor Lake.  As a solution, an extra wakeref for
a Primary GT was acquired from i915_gem_do_execbuffer() -- see commit
f56fe3e91787 ("drm/i915: Fix a VMA UAF for multi-gt platform").

However, that fix occurred insufficient -- the issue was still reported by
CI.  That wakeref was released on exit from i915_gem_do_execbuffer(), then
potentially before completion of the request and deactivation of its
associated VMAs.  Moreover, CI reports indicated that single-GT platforms
also suffered sporadically from the same race.

Since that issue was fixed by another commit f3c71b2ded5c ("drm/i915/vma:
Fix UAF on destroy against retire race"), the changes introduced by that
insufficient fix were dropped as no longer useful.  However, that series
resulted in another VMA UAF scenario now being triggered in CI.

<4> [260.290809] ------------[ cut here ]------------
<4> [260.290988] list_del corruption. prev->next should be ffff888118c5d990, but was ffff888118c5a510. (prev=ffff888118c5a510)
<4> [260.291004] WARNING: CPU: 2 PID: 1143 at lib/list_debug.c:62 __list_del_entry_valid_or_report+0xb7/0xe0
..
<4> [260.291055] CPU: 2 PID: 1143 Comm: kms_plane Not tainted 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.291058] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.291060] RIP: 0010:__list_del_entry_valid_or_report+0xb7/0xe0
...
<4> [260.291087] Call Trace:
<4> [260.291089]  <TASK>
<4> [260.291124]  i915_vma_reopen+0x43/0x80 [i915]
<4> [260.291298]  eb_lookup_vmas+0x9cb/0xcc0 [i915]
<4> [260.291579]  i915_gem_do_execbuffer+0xc9a/0x26d0 [i915]
<4> [260.291883]  i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [260.292301]  </TASK>
...
<4> [260.292506] ---[ end trace 0000000000000000 ]---
<4> [260.292782] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6ca3: 0000 [#1] PREEMPT SMP NOPTI
<4> [260.303575] CPU: 2 PID: 1143 Comm: kms_plane Tainted: G        W          6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.313851] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.326359] RIP: 0010:eb_validate_vmas+0x114/0xd80 [i915]
...
<4> [260.428756] Call Trace:
<4> [260.431192]  <TASK>
<4> [639.283393]  i915_gem_do_execbuffer+0xd05/0x26d0 [i915]
<4> [639.305245]  i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [639.411134]  </TASK>
...
<4> [639.449979] ---[ end trace 0000000000000000 ]---

We defer actually closing, unbinding and destroying a VMA until next idle
point, or until the object is freed in the meantime.  By postponing the
unbind, we allow for the VMA to be reopened by the client, avoiding the
work required to rebind the VMA.

Starting from commit b0647a5e79b1 ("drm/i915: Avoid live-lock with
i915_vma_parked()"), we assume that as long as a GT is held idle, no VMA
would be reopened while we destroy them.  That assumption is no longer
true in multi-GT configurations, where a VMA we reopen may be handled by a
GT different from the one that we already keep active via its engine while
we set up an execbuf request.

Restoring the extra GT0 PM wakeref removed from i915_gem_do_execbuffer()
processing path seems to fix this issue.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10608
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
Fixes: 1f33dc0c1189 ("drm/i915: Remove extra multi-gt pm-references")
Link: https://patchwork.freedesktop.org/patch/msgid/20240506180253.96858-2-janusz.krzysztofik@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/msm/gen_header: allow skipping the validation

We don't need to run the validation of the XML files if we are just
compiling the kernel. Skip the validation unless the user enables
corresponding Kconfig option. This removes a warning from gen_header.py
about lxml being not installed.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20240409120108.2303d0bd@canb.auug.org.au/
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/592558/
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm/a6xx: Cleanup indexed regs const'ness

These tables were made non-const in commit 3cba4a2cdff3 ("drm/msm/a6xx:
Update ROQ size in coredump") in order to avoid powering up the GPU when
reading back a devcoredump. Instead let's just stash the count that is
potentially read from hw in struct a6xx_gpu_state_obj, and make the
tables const again.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/592699/

drm/msm: Add devcoredump support for a750

Add an a750 case to the various places where we choose a list of
registers.
Patchwork: https://patchwork.freedesktop.org/patch/592519/

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592519
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm: Adjust a7xx GBIF debugbus dumping

Use the kgsl-style list of indices, because this is about to change for
a750 and we want to reuse the downstream header directly.
Patchwork: https://patchwork.freedesktop.org/patch/592520/

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592520
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm: Update a6xx registers XML

Update to Mesa commit e82d70d472cc ("freedreno/a7xx: Add
A7XX_HLSQ_DP_STR location from kgsl").

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592518/
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm: Fix imported a750 snapshot header for upstream

Add A7XX prefixes necessary because we use the same code for dumping
a6xx and a7xx, fix register name prefixes for upstream, and use the
upstream header.
Patchwork: https://patchwork.freedesktop.org/patch/592517/

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592517
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm: Import a750 snapshot registers from kgsl

Import from kgsl commit 809ee24fe560.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592516/
Signed-off-by: Rob Clark <robdclark@chromium.org>

MAINTAINERS: Add Konrad Dybcio as a reviewer for the Adreno driver

Add myself as a reviewer for Adreno driver changes.

Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/590705/
Signed-off-by: Rob Clark <robdclark@chromium.org>

MAINTAINERS: Add a separate entry for Qualcomm Adreno GPU drivers

The msm driver is.. gigantic and covers display hardware (incl. things
concerning (e)DP, DSI, HDMI), as well as the entire lineup of Adreno
GPUs (with hw bringup, memory mappings, userspace interaction etc.).

Because of that, people listed as M:/R: receive patches concerning
drivers for any part of the display block OR the GPU. Separate the
latter, as it's both a functionally separate block and is of
interest to different folks.

Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/590704/
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails

Calling a6xx_destroy() before adreno_gpu_init() leads to a null pointer
dereference on:

msm_gpu_cleanup() : platform_set_drvdata(gpu->pdev, NULL);

as gpu->pdev is only assigned in:

a6xx_gpu_init()
|_ adreno_gpu_init
|_ msm_gpu_init()

Instead of relying on handwavy null checks down the cleanup chain,
explicitly de-allocate the LLC data and free a6xx_gpu instead.

Fixes: 76efc2453d0e ("drm/msm/gpu: Fix crash during system suspend after unbind")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/588919/
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm/adreno: fix CP cycles stat retrieval on a7xx

a7xx_submit() should use the a7xx variant of the RBBM_PERFCTR_CP register
for retrieving the CP cycles value before and after the submitted command
stream execution.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: af66706accdf ("drm/msm/a6xx: Add skeleton A7xx support")
Patchwork: https://patchwork.freedesktop.org/patch/588445/
Signed-off-by: Rob Clark <robdclark@chromium.org>

drm/msm/a7xx: allow writing to CP_BV counter selection registers

In addition to the CP_PERFCTR_CP_SEL register range, allow writes to the
CP_BV_PERFCTR_CP_SEL registers in the 0x8e0-0x8e6 range for profiling
purposes of tools like fdperf and perfetto.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Patchwork: https://patchwork.freedesktop.org/patch/580548/
[fixup a730_protect size]
Signed-off-by: Rob Clark <robdclark@chromium.org>

Merge tag 'drm-xe-next-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Driver Changes:
- Fix for a backmerge going slightly wrong.
- An UAF fix
- Avoid a WA error on LNL.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZjOijQA43zhu3SZ4@fedora

drm: zynqmp_dpsub: Always register bridge

We must always register the DRM bridge, since zynqmp_dp_hpd_work_func
calls drm_bridge_hpd_notify, which in turn expects hpd_mutex to be
initialized. We do this before zynqmp_dpsub_drm_init since that calls
drm_bridge_attach. This fixes the following lockdep warning:

[   19.217084] ------------[ cut here ]------------
[   19.227530] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[   19.227768] WARNING: CPU: 0 PID: 140 at kernel/locking/mutex.c:582 __mutex_lock+0x4bc/0x550
[   19.241696] Modules linked in:
[   19.244937] CPU: 0 PID: 140 Comm: kworker/0:4 Not tainted 6.6.20+ #96
[   19.252046] Hardware name: xlnx,zynqmp (DT)
[   19.256421] Workqueue: events zynqmp_dp_hpd_work_func
[   19.261795] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   19.269104] pc : __mutex_lock+0x4bc/0x550
[   19.273364] lr : __mutex_lock+0x4bc/0x550
[   19.277592] sp : ffffffc085c5bbe0
[   19.281066] x29: ffffffc085c5bbe0 x28: 0000000000000000 x27: ffffff88009417f8
[   19.288624] x26: ffffff8800941788 x25: ffffff8800020008 x24: ffffffc082aa3000
[   19.296227] x23: ffffffc080d90e3c x22: 0000000000000002 x21: 0000000000000000
[   19.303744] x20: 0000000000000000 x19: ffffff88002f5210 x18: 0000000000000000
[   19.311295] x17: 6c707369642e3030 x16: 3030613464662072 x15: 0720072007200720
[   19.318922] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 0000000000000001
[   19.326442] x11: 0001ffc085c5b940 x10: 0001ff88003f388b x9 : 0001ff88003f3888
[   19.334003] x8 : 0001ff88003f3888 x7 : 0000000000000000 x6 : 0000000000000000
[   19.341537] x5 : 0000000000000000 x4 : 0000000000001668 x3 : 0000000000000000
[   19.349054] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff88003f3880
[   19.356581] Call trace:
[   19.359160]  __mutex_lock+0x4bc/0x550
[   19.363032]  mutex_lock_nested+0x24/0x30
[   19.367187]  drm_bridge_hpd_notify+0x2c/0x6c
[   19.371698]  zynqmp_dp_hpd_work_func+0x44/0x54
[   19.376364]  process_one_work+0x3ac/0x988
[   19.380660]  worker_thread+0x398/0x694
[   19.384736]  kthread+0x1bc/0x1c0
[   19.388241]  ret_from_fork+0x10/0x20
[   19.392031] irq event stamp: 183
[   19.395450] hardirqs last  enabled at (183): [<ffffffc0800b9278>] finish_task_switch.isra.0+0xa8/0x2d4
[   19.405140] hardirqs last disabled at (182): [<ffffffc081ad3754>] __schedule+0x714/0xd04
[   19.413612] softirqs last  enabled at (114): [<ffffffc080133de8>] srcu_invoke_callbacks+0x158/0x23c
[   19.423128] softirqs last disabled at (110): [<ffffffc080133de8>] srcu_invoke_callbacks+0x158/0x23c
[   19.432614] ---[ end trace 0000000000000000 ]---

Fixes: eb2d64bfcc17 ("drm: xlnx: zynqmp_dpsub: Report HPD through the bridge")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240308204741.3631919-1-sean.anderson@linux.dev
(cherry picked from commit 61ba791c4a7a09a370c45b70a81b8c7d4cf6b2ae)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Revert "drm/bridge: ti-sn65dsi83: Fix enable error path"

This reverts commit 8a91b29f1f50ce7742cdbe5cf11d17f128511f3f.

The regulator_disable() added by the original commit solves one kind of
regulator imbalance but adds another one as it allows the regulator to be
disabled one more time than it is enabled in the following scenario:

1. Start video pipeline -> sn65dsi83_atomic_pre_enable -> regulator_enable
2. PLL lock fails -> regulator_disable
3. Stop video pipeline -> sn65dsi83_atomic_disable -> regulator_disable

The reason is clear from the code flow, which looks like this (after
removing unrelated code):

  static void sn65dsi83_atomic_pre_enable()
  {
      regulator_enable(ctx->vcc);

      if (PLL failed locking) {
          regulator_disable(ctx->vcc);  <---- added by patch being reverted
          return;
      }
  }

  static void sn65dsi83_atomic_disable()
  {
      regulator_disable(ctx->vcc);
  }

The use case for introducing the additional regulator_disable() was
removing the module for debugging (see link below for the discussion). If
the module is removed after a .atomic_pre_enable, i.e. with an active
pipeline from the DRM point of view, .atomic_disable is not called and thus
the regulator would not be disabled.

According to the discussion however there is no actual use case for
removing the module with an active pipeline, except for
debugging/development.

On the other hand, the occurrence of a PLL lock failure is possible due to
any physical reason (e.g. a temporary hardware failure for electrical
reasons) so handling it gracefully should be supported. As there is no way
for .atomic[_pre]_enable to report an error to the core, the only clean way
to support it is calling regulator_disabled() only in .atomic_disable,
unconditionally, as it was before.

Link: https://lore.kernel.org/all/15244220.uLZWGnKmhe@steina-w/
Fixes: 8a91b29f1f50 ("drm/bridge: ti-sn65dsi83: Fix enable error path")
Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Robert Foss <rfoss@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426122259.46808-1-luca.ceresoli@bootlin.com
(cherry picked from commit 2940ee03b23281071620dda1d790cd644dabd394)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/fb_dma: Add checks in drm_fb_dma_get_scanout_buffer()

plane->state and plane->state->fb can be NULL, so add a check before
dereferencing them.
Found by testing with the imx driver.

Fixes: 879b3b6511fe ("drm/fb_dma: Add generic get_scanout_buffer() for drm_panic")
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426121121.241366-1-jfalempe@redhat.com
(cherry picked from commit 986c12d8c9a677c094c37bd6aa636b4d4c5ccd46)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/fbdev-generic: Do not set physical framebuffer address

Framebuffer memory is allocated via vzalloc() from non-contiguous
physical pages. The physical framebuffer start address is therefore
meaningless. Do not set it.

The value is not used within the kernel and only exported to userspace
on dedicated ARM configs. No functional change is expected.

v2:
- refer to vzalloc() in commit message (Javier)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: a5b44c4adb16 ("drm/fbdev-generic: Always use shadow buffering")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Zack Rusin <zackr@vmware.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: <stable@vger.kernel.org> # v6.4+
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Reviewed-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Tested-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419083331.7761-2-tzimmermann@suse.de
(cherry picked from commit 73ef0aecba78aa9ebd309b10b6cd17d94e632892)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/panthor: Fix the FW reset logic

In the post_reset function, if the fast reset didn't succeed, we
are not clearing the fast_reset flag, which prevents firmware
sections from being reloaded. While at it, use panthor_fw_stop()
instead of manually writing DISABLE to the MCU_CONTROL register.

Fixes: 2718d91816ee ("drm/panthor: Add the FW logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430113727.493155-1-boris.brezillon@collabora.com

drm/panthor: Make sure we handle 'unknown group state' case properly

When we check for state values returned by the FW, we only cover part of
the 0:7 range. Make sure we catch FW inconsistencies by adding a default
to the switch statement, and flagging the group state as unknown in that
case.

When an unknown state is detected, we trigger a reset, and consider the
group as unusable after that point, to prevent the potential corruption
from creeping in other places if we continue executing stuff on this
context.

v2:
- Add Steve's R-b
- Fix commit message

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/dri-devel/3b7fd2f2-679e-440c-81cd-42fc2573b515@moroto.mountain/T/#u
Suggested-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502155248.1430582-1-boris.brezillon@collabora.com

drm: move DRM-related CONFIG options into DRM submenu

When you create a submenu using the 'menu' syntax, there is no
ambiguity about its end because the code between 'menu' and 'endmenu'
becomes the submenu.

In contrast, 'menuconfig' does not have the corresponding end marker.
Instead, the end of the submenu is inferred from symbol dependencies.

This is detailed in Documentation/kbuild/kconfig-language.rst, starting
line 348. It outlines two methods to place the code under the submenu:

(1) Open an if-block immediately after 'menuconfig', enclosing the
submenu content within it

(2) Add 'depends on' to every symbol intended for the submenu

Many subsystems opt for (1) because it reliably maintains the submenu
structure.

The DRM subsystem adopts (2). The submenu ends when the sequence of
'depends on DRM' breaks. It can be confirmed by running a GUI frontend
such as 'make menuconfig' and visiting the DRM menu:

< > Direct Rendering Manager (XFree86 4.1.0 and higher DRI support) ----

If you toggle this, you will notice most of the DRM-related options
appear below it, not in the submenu.

I highly recommend the approach (1). Obviously, (2) is not reliable,
as the submenu breaks whenever someone forgets to add 'depends on DRM'.

This commit encloses the entire DRM configuration with 'if DRM' and
'endif', except for DRM_PANEL_ORIENTATION_QUIRKS.

Note:
Now, 'depends on DRM' properties inside the if-block are all redundant.
I leave it as follow-up cleanups.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426135602.2500125-1-masahiroy@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm/display: Make all helpers visible and switch to depends on"

This reverts commit d674858ff979550a0e97b4ac766f2640f0d9d7e7, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/3db958e3f4002e26cd963596d810c37feb315fb3.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: Make drivers depends on DRM_DW_HDMI"

This reverts commit c0e0f139354c01e0213204e4a96e7076e5a3e396, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/bd93d43b07f8ed6368119f4a5ddac2ee80debe53.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: Switch DRM_DISPLAY_HELPER to depends on"

This reverts commit e075e496f516bf92bc0cbaf94d64e8d4a6b58321, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1ba76cc4d96a8afefff5d1bc42fb1e1329c5da68.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: Switch DRM_DISPLAY_DP_AUX_BUS to depends on"

This reverts commit 4d15125d7fe637f401e64e33c99513adf6586fdd, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/79824fec01eb9ab0673b9409f9b39cc8b5cc338d.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: Switch DRM_DISPLAY_DP_HELPER to depends on"

This reverts commit 0323287de87d7e6e9c22c57d7440aa353a2298d0, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/89ac456805746b6d0c888f10c5120b11aacd3319.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: Switch DRM_DISPLAY_HDCP_HELPER to depends on"

This reverts commit 3166e7e6d935caaef07605a5c90773fbf9ffeaf4, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/a40e70a0abd3d841c23c107d452a43fdd70ef37a.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: Switch DRM_DISPLAY_HDMI_HELPER to depends on"

This reverts commit f6d2dc03fa8546b284dd8c1af027d9fac5725921, as helper
code should always be selected by the driver that needs it, for the
convenience of the final user configuring a kernel.

The user who configures a kernel should not need to know which helpers
are needed for the driver he is interested in. Making a driver depend
on helper code means that the user needs to know which helpers to enable
first, which is very user-unfriendly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/bd288a5943dab8609f2d1f2bf413595a61df727a.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: fix DRM_DISPLAY_DP_HELPER dependencies"

This reverts commit d1ef8fc18be6adbbffdee06fbb5b33699e2852be, as the
commit it fixes will be reverted, too.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/df2876c72e82cbecfe8406fa058e8d3e895e4b93.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm/bridge: dw-hdmi: Make DRM_DW_HDMI selectable"

This reverts commit 0209df3b4731516fe77638bfc52ba2e9629c67cd, as the
commit it fixes (which is BTW not the commit in the Fixes: tag!) will be
reverted, too.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/b2d1a7a765e4af249f4c450383de6e8422647e2e.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm/display: Select DRM_KMS_HELPER for DP helpers"

This reverts commit 7fa678cc0a5648b5ea28629a2d21b9d4b6ac8f56, as the
commit it fixes will be reverted, too.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/17b6a2c5f3250a7d02ee1b517182ca6fd9baa45a.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

Revert "drm: fix DRM_DISPLAY_DP_HELPER dependencies, part 2"

This reverts commit a57e191ebbaa0363dbf352cc37447c2230573e29, as the
commits it fixes will be reverted, too.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/37216404c77b4c677d3b3a80d12d6d4447a3f3a0.1713780345.git.geert+renesas@glider.be
Signed-off-by: Maxime Ripard <mripard@kernel.org>

drm/panthor: Kill the faulty_slots variable in panthor_sched_suspend()

We can use upd_ctx.timedout_mask directly, and the faulty_slots update
in the flush_caches_failed situation is never used.

Suggested-by: Suggested-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240425103920.826458-1-boris.brezillon@collabora.com

drm/xe: Merge 16021540221 and 18034896535 WAs

In order to detect duplicate implementations for the same workaround,
early in the implementation of RTP it was decided to error out even if
the values set are exactly the same. With the introduction of 18034896535
in commit 74671d23ca18 ("drm/xe/xe2: Add workaround 18034896535"), LNL
stepping with graphics stepping A1 now gives the following error on
module load:

xe 0000:00:02.0: [drm] *ERROR* GT0: [GT OTHER] \
discarding save-restore reg e48c (clear: 00000200, set: 00000200,\
masked: yes, mcr: yes): ret=-22

RTP may be improved in the future, but for now simply join the entries
like done with e.g. "1607297627, 1607030317, 1607186500".

Fixes: 74671d23ca18 ("drm/xe/xe2: Add workaround 18034896535")
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240427135339.3485559-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4caf410766add8cf376a3afc910b17dd0961dd75)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe/vm: prevent UAF in rebind_work_func()

We flush the rebind worker during the vm close phase, however in places
like preempt_fence_work_func() we seem to queue the rebind worker
without first checking if the vm has already been closed.  The concern
here is the vm being closed with the worker flushed, but then being
rearmed later, which looks like potential uaf, since there is no actual
refcounting to track the queued worker. We can't take the vm->lock here
in preempt_rebind_work_func() to first check if the vm is closed since
that will deadlock, so instead flush the worker again when the vm
refcount reaches zero.

v2:
- Grabbing vm->lock in the preempt worker creates a deadlock, so
   checking the closed state is tricky. Instead flush the worker when
   the refcount reaches zero. It should be impossible to queue the
   preempt worker without already holding vm ref.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1676
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1364
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423074721.119633-4-matthew.auld@intel.com
(cherry picked from commit 3d44d67c441a9fe6f81a1d705f7de009a32a5b35)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe: Fix unexpected backmerge results

The recent backmerge from drm-next to drm-xe-next brought with it
some silent unexpected results. One code snippet was added twice
and a partial revert had merge errors. Fix that up to
reinstate the affected code as it was before the backmerge.

v2:
- Commit log message rewording (Lucas DeMarchi)

Fixes: 79790b6818e9 ("Merge drm/drm-next into drm-xe-next")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423121114.39325-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 06e7139a034f26804904368fe4af2ceb70724756)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Merge tag 'drm-intel-next-2024-04-30' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next

Core DRM:
- Export drm_client_dev_unregister (Thomas Zimmermann)

Display i915:
- More initial work to make display code more independent from i915 (Jani)
- Convert i915/xe fbdev to DRM client (Thomas Zimmermann)
- VLV/CHV DPIO register cleanup (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZjFPcSCTd_5c0XU_@intel.com

drm/i915/dpio: Extract vlv_dpio_phy_regs.h

Pull the VLV/CHV DPIO PHY sideband registers to their own file.

v2: drop stray tabs (Jani)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-15-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Clean up the vlv/chv PHY register bits

Use REG_BIT() & co. for the vlv/chv DPIO PHY registers.

Note that DPIO_BIAS_CURRENT_CTL_SHIFT was incorrectly defined
to be 21 wheres 20 is the correct value. It is not used in the
code though so didn't bother splitting to a separate patch.

v2: drop stray tabs (Jani)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-14-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Clean up VLV/CHV DPIO PHY register defines

The DPIO PHY registers follow clear numbering rules. Express
those in a few macros to get rid of the hand calculated
final offsets.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-13-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Rename a few CHV DPIO PHY registers

Drop the leading underscore from the CHV PHY common lane
register definitions. We use these directly from actual
code so the underscore here is misleading as usually it indicates
an intermediate define that shouldn't be used directly.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-12-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Give VLV DPIO group register a clearer name

Include _GRP in VLV DPIO PHY group access register define
names. Makes it more obvious where the accesses will land.
Also matches the naming used by BXT already.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-11-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Derive the phy from the port rather than pipe in encoder hooks

In the encoder hooks we are dealing primarily with the encoder,
so derive the DPIO PHY from the encoder rather than the pipe.
Technically this doesn't matter as we can't cross connect
pipes<->port across PHY boundaries, but it does conveny the
intention more accurately.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-10-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: s/pipe/ch/

Stop using 'pipe' directly as the DPIO PHY channel. This
does happen to work on VLV since it just has the one PHY
with CH0==pipe A and CH1==pipe B. But explicitly converting
the thing to the right enum makes the whole thing less
confusing.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-9-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: s/port/ch/

Stop calling the DPIO PHY channel "port". Just say "ch", which
is already used in a bunch of places.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-8-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Rename some variables

Use a consistent 'tmp' as the variable name for the register
values during rmw when we don't deal with multiple registers
in parallel.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-7-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Remove pointless variables from vlv/chv DPLL code

Drop all the local variables for the DPLL dividers for vlv/chv
and just consult the state directly.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-6-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Fix VLV DPIO PLL register dword numbering

The spreadsheet defines the PLL register block as having
the dwords in the following order:

block   dwords    offsets
PLL1    0x0-0x7   0x00-0x1f
PLL2    0x0-0x7   0x20-0x3f
PLL1ext 0x10-0x1f 0x40-0x5f
PLL2ext 0x10-0x1f 0x60-0x7f

So dword indexes 0x8-0xf don't even exist. Renumber
our register defines to match.

Note that the spreadsheet used hex numbering whereas our
defiens are in decimal. Perhaps we should change that?

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: s/VLV_PLL_DW9_BCAST/VLV_PCS_DW17_BCAST/

VLV_PLL_DW9_BCAST is actually VLV_PCS_DW17_BCAST. The address
does kinda look like it goes to the PLL block on a first glance,
but broadcast is special and doesn't even exist for the PLL
(only PCS and TX have it).

The fact that we use a broadcast write here is a bit sketchy
IMO since we're now blasting the register to all PCS splines
across the whole PHY. So the PCS registers in the other channel
(ie. other pipe/port) will also be written. But I guess the
fact that we always write the same value should make this a nop
even if the other channel is already enabled (assuming the VBIOS/GOP
didn't screw up and use some other value...).

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: s/VLV_REF_DW13/VLV_REF_DW11/

Our VLV_REF_DW13 is actually VLV_REF_DW11. Rename it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/dpio: Remove pointless VLV_PCS01_DW8 read

We don't use the result of the VLV_PCS01_DW8 read at all,
so don't read.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/gt: Automate CCS Mode setting during engine resets

We missed setting the CCS mode during resume and engine resets.
Create a workaround to be added in the engine's workaround list.
This workaround sets the XEHP_CCS_MODE value at every reset.

The issue can be reproduced by running:

$ clpeak --kernel-latency

Without resetting the CCS mode, we encounter a fence timeout:

Fence expiration time out i915-0000:03:00.0:clpeak[2387]:2!

Fixes: 2bebae0112b1 ("drm/i915/gt: Enable only one CCS for compute workload")
Reported-by: Gnattu OC <gnattuoc@me.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Tested-by: Gnattu OC <gnattuoc@me.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426000723.229296-1-andi.shyti@linux.intel.com

drm/i915: pass dev_priv explicitly to PIPE_WGC_C22

Avoid the implicit dev_priv local variable use, and pass dev_priv
explicitly to the PIPE_WGC_C22 register macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0a07f615c574040094b37c861078e41daf53c706.1714399071.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: pass dev_priv explicitly to PIPE_WGC_C21_C20

Avoid the implicit dev_priv local variable use, and pass dev_priv
explicitly to the PIPE_WGC_C21_C20 register macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/af39047d304f8a5c3c7a643f702f66c06ea5d638.1714399071.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: pass dev_priv explicitly to PIPE_WGC_C12

Avoid the implicit dev_priv local variable use, and pass dev_priv
explicitly to the PIPE_WGC_C12 register macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/62a748b685f253151b17c101dec75351577f30c0.1714399071.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: pass dev_priv explicitly to PIPE_WGC_C11_C10

Avoid the implicit dev_priv local variable use, and pass dev_priv
explicitly to the PIPE_WGC_C11_C10 register macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/3f7aae89cf63760bca43b54102c76b3ed2cf8735.1714399071.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: pass dev_priv explicitly to PIPE_WGC_C02

Avoid the implicit dev_priv local variable use, and pass dev_priv
explicitly to the PIPE_WGC_C02 register macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/550d4e787445802236f0bf89e4d2f4f32cbd6d75.1714399071.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: pass dev_priv explicitly to PIPE_WGC_C01_C00

Avoid the implicit dev_priv local variable use, and pass dev_priv
explicitly to the PIPE_WGC_C01_C00 register macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/85b3db6e666a7a629b10b482b7e7043d52d30511.1714399071.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: pass dev_priv explicitly to PALETTE

Avoid the implicit dev_priv local variable use, and pass dev_priv
explicitly to the PALETTE register macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/bf07d29cefef23ebd5d54fbb0d3bf7e41d132d93.1714399071.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>