Sync Panthor with drm-tip #392

CodeChenL · 2025-02-10T11:11:54Z

armbian/linux-rockchip#301
armbian/linux-rockchip#322

CodeChenL · 2025-02-10T11:13:03Z

自编译mesa 25 加 vkmark 跑分：1563， glmark2与glmark2-es照旧1000+

The current implementation of drm_sched_start uses a hardcoded -ECANCELED to dispose of a job when the parent/hw fence is NULL. This results in drm_sched_job_done being called with -ECANCELED for each job with a NULL parent in the pending list, making it difficult to distinguish between recovery methods, whether a queue reset or a full GPU reset was used. To improve this, we first try a soft recovery for timeout jobs and use the error code -ENODATA. If soft recovery fails, we proceed with a queue reset, where the error code remains -ENODATA for the job. Finally, for a full GPU reset, we use error codes -ECANCELED or -ETIME. This patch adds an error code parameter to drm_sched_start, allowing us to differentiate between queue reset and GPU reset failures. This enables user mode and test applications to validate the expected correctness of the requested operation. After a successful queue reset, the only way to continue normal operation is to call drm_sched_job_done with the specific error code -ENODATA. v1: Initial implementation by Jesse utilized amdgpu_device_lock_reset_domain and amdgpu_device_unlock_reset_domain to allow user mode to track the queue reset status and distinguish between queue reset and GPU reset. v2: Christian suggested using the error codes -ENODATA for queue reset and -ECANCELED or -ETIME for GPU reset, returned to amdgpu_cs_wait_ioctl. v3: To meet the requirements, we introduce a new function drm_sched_start_ex with an additional parameter to set dma_fence_set_error, allowing us to handle the specific error codes appropriately and dispose of bad jobs with the selected error code depending on whether it was a queue reset or GPU reset. v4: Alex suggested using a new name, drm_sched_start_with_recovery_error, which more accurately describes the function's purpose. Additionally, it was recommended to add documentation details about the new method. v5: Fixed declaration of new function drm_sched_start_with_recovery_error.(Alex) v6 (chk): rebase on upstream changes, cleanup the commit message, drop the new function again and update all callers, apply the errno also to scheduler fences with hw fences v7 (chk): rebased Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240826122541.85663-1-christian.koenig@amd.com

Function drm_sched_entity_push_job() doesn't have a return value, remove the return value description for it. Correct several other typo errors. v2 (Philipp): - more correction with related comments. Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240917144732.2758572-1-shuicheng.lin@intel.com

In FIFO mode (which is the default), both drm_sched_entity_push_job() and drm_sched_rq_update_fifo(), where the latter calls the former, are currently taking and releasing the same entity->rq_lock. We can avoid that design inelegance, and also have a miniscule efficiency improvement on the submit from idle path, by introducing a new drm_sched_rq_update_fifo_locked() helper and pulling up the lock taking to its callers. v2: * Remove drm_sched_rq_update_fifo() altogether. (Christian) v3: * Improved commit message. (Philipp) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Philipp Stanner <pstanner@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016122013.7857-2-tursulin@igalia.com

It does not seem there is a need to set the current entity in FIFO mode since ot only serves as being a "cursor" in round-robin mode. Even if scheduling mode is changed at runtime the change in behaviour is simply to restart from the first entity, instead of continuing in RR mode from where FIFO left it, and that sounds completely fine. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Philipp Stanner <pstanner@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016122013.7857-3-tursulin@igalia.com

When writing to a drm_sched_entity's run-queue, writers are protected through the lock drm_sched_entity.rq_lock. This naming, however, frequently collides with the separate internal lock of struct drm_sched_rq, resulting in uses like this: spin_lock(&entity->rq_lock); spin_lock(&entity->rq->lock); Rename drm_sched_entity.rq_lock to improve readability. While at it, re-order that struct's members to make it more obvious what the lock protects. v2: * Rename some rq_lock straddlers in kerneldoc, improve commit text. (Philipp) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Suggested-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Philipp Stanner <pstanner@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> [pstanner: Fix typo in docstring] Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016122013.7857-5-tursulin@igalia.com

Having removed one re-lock cycle on the entity->lock in a patch titled "drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit larger refactoring we can do the same optimisation on the rq->lock. (Currently both drm_sched_rq_add_entity() and drm_sched_rq_update_fifo_locked() take and release the same lock.) To achieve this we make drm_sched_rq_update_fifo_locked() and drm_sched_rq_add_entity() expect the rq->lock to be held. We also align drm_sched_rq_update_fifo_locked(), drm_sched_rq_add_entity() and drm_sched_rq_remove_fifo_locked() function signatures, by adding rq as a parameter to the latter. v2: * Fix after rebase of the series. * Avoid naming inconsistency between drm_sched_rq_add/remove. (Christian) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Philipp Stanner <pstanner@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016122013.7857-6-tursulin@igalia.com

drm_sched_job_init() has no control over how users allocate struct drm_sched_job. Unfortunately, the function can also not set some struct members such as job->sched. This could theoretically lead to UB by users dereferencing the struct's pointer members too early. It is easier to debug such issues if these pointers are initialized to NULL, so dereferencing them causes a NULL pointer exception. Accordingly, drm_sched_entity_init() does precisely that and initializes its struct with memset(). Initialize parameter "job" to 0 in drm_sched_job_init(). Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021105028.19794-2-pstanner@redhat.com Reviewed-by: Christian König <christian.koenig@amd.com>

drm_sched_job_init()'s name suggests that after the function succeeded, parameter "job" will be fully initialized. This is not the case; some members are only later set, notably drm_sched_job.sched by drm_sched_job_arm(). Document that drm_sched_job_init() does not set all struct members. Document the lifetime of drm_sched_job.sched. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023141530.113370-2-pstanner@redhat.com

drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler work queue with WQ_MEM_RECLAIM to ensure forward progress during reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward progress during reclaim. v2: - Fixes tags (Philipp) - Reword commit message (Philipp) Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Philipp Stanner <pstanner@redhat.com> Cc: stable@vger.kernel.org Fixes: 34f50cc ("drm/sched: Use drm sched lockdep map for submit_wq") Fixes: a6149f0 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023235917.1836428-1-matthew.brost@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm_sched_start()'s and drm_sched_stop()'s names suggest that those functions might be intended for actively starting and stopping the scheduler on initialization and teardown. They are, however, only used on timeout handling (reset recovery). The docstrings should reflect that to prevent confusion. Document those functions' purpose. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029133819.78696-2-pstanner@redhat.com

If jobs are still enqueued in struct drm_gpu_scheduler.pending_list when drm_sched_fini() gets called, those jobs will be leaked since that function stops both job-submission and (automatic) job-cleanup. It is, thus, up to the driver to take care of preventing leaks. The related function drm_sched_wqueue_stop() also prevents automatic job cleanup. Those pitfals are not reflected in the documentation, currently. Explicitly inform about the leak problem in the docstring of drm_sched_fini(). Additionally, detail the purpose of drm_sched_wqueue_{start,stop} and hint at the consequences for automatic cleanup. Signed-off-by: Philipp Stanner <pstanner@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105143137.71893-2-pstanner@redhat.com

sizeof(unsigned long) * 8 is the number of bits in an unsigned long variable, replace it with BITS_PER_LONG macro to make them simpler. And fix the warning: WARNING: Comparisons should place the constant on the right side of the test radxa#23: FILE: drivers/gpu/drm/panthor/panthor_mmu.c:2696: + if (BITS_PER_LONG < va_bits) { Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240902094404.1943710-1-ruanjinjie@huawei.com

Expose timestamp information supported by the GPU with a new device query. Mali uses an external timer as GPU system time. On ARM, this is wired to the generic arch timer so we wire cntfrq_el0 as device frequency. This new uAPI will be used in Mesa to implement timestamp queries and VK_KHR_calibrated_timestamps. Since this extends the uAPI and because userland needs a way to advertise those features conditionally, this also bumps the driver minor version. v2: - Rewrote to use GPU timestamp register - Added timestamp_offset to drm_panthor_timestamp_info - Add missing include for arch_timer_get_cntfrq - Rework commit message v3: - Add panthor_gpu_read_64bit_counter - Change panthor_gpu_read_timestamp to use panthor_gpu_read_64bit_counter v4: - Fix multiple typos in uAPI documentation - Mention behavior when the timestamp frequency is unknown - Use u64 instead of unsigned long long for panthor_gpu_read_timestamp - Apply r-b from Mihail Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Mihail Atanassov <mihail.atanassov@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240830080349.24736-2-mary.guillemard@collabora.com

The version number output when loading the firmware is actually the interface version not the version of the firmware itself. Update the message to make this clearer. However, the firmware binary has a git SHA embedded into it which can be used to identify which firmware binary is being loaded. So output this as a drm_info() so that it's obvious from a dmesg log which firmware binary is being used. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240906094025.638173-1-steven.price@arm.com

This adds a new value to drm_panthor_group_priority exposing the realtime priority to userspace. This is required to implement NV_context_priority_realtime in Mesa. v2: - Add Steven Price r-b v3: - Add Boris Brezillon r-b Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240909064820.34982-3-mary.guillemard@collabora.com

Expose allowed group priorities with a new device query. This new uAPI will be used in Mesa to properly report what priorities a user can use for EGL_IMG_context_priority. Since this extends the uAPI and because userland needs a way to advertise priorities accordingly, this also bumps the driver minor version. v2: - Remove drm_panthor_group_allow_priority_flags definition - Document that allowed_mask is a bitmask of drm_panthor_group_priority v3: - Use BIT macro in panthor_query_group_priorities_info - Add r-b from Steven Price and Boris Brezillon Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240909064820.34982-4-mary.guillemard@collabora.com

Enable calculations of job submission times in clock cycles and wall time. This is done by expanding the boilerplate command stream when running a job to include instructions that compute said times right before and after a user CS. A separate kernel BO is created per queue to store those values. Jobs can access their sampled data through an index different from that of the queue's ringbuffer. The reason for this is saving memory on the profiling information kernel BO, since the amount of simultaneous profiled jobs we can write into the queue's ringbuffer might be much smaller than for regular jobs, as the former take more CSF instructions. This commit is done in preparation for enabling DRM fdinfo support in the Panthor driver, which depends on the numbers calculated herein. A profile mode mask has been added that will in a future commit allow UM to toggle performance metric sampling behaviour, which is disabled by default to save power. When a ringbuffer CS is constructed, timestamp and cycling sampling instructions are added depending on the enabled flags in the profiling mask. A helper was provided that calculates the number of instructions for a given set of enablement mask, and these are passed as the number of credits when initialising a DRM scheduler job. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240923230912.2207320-2-adrian.larumbe@collabora.com

In order to support UM in calculating rates of GPU utilisation, the current operating and maximum GPU clock frequencies must be recorded during device initialisation, and also during OPP state transitions. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240923230912.2207320-3-adrian.larumbe@collabora.com

This commit introduces a DRM device sysfs attribute that lets UM control the job accounting status in the device. The knob variable had been brought in as part of a previous commit, but now we're able to fix it manually. As sysfs files are part of a driver's uAPI, describe its legitimate input values and output format in a documentation file. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240923230912.2207320-6-adrian.larumbe@collabora.com

…> 4k The system and GPU MMU page size might differ, which becomes a problem for FW sections that need to be mapped at explicit addresses since our PAGE_SIZE alignment might cover a VA range that's expected to be used for another section. Make sure we never map more than we need. Changes in v3: - Add R-bs Changes in v2: - Plan for per-VM page sizes so the MCU VM and user VM can have different pages sizes Fixes: 2718d91 ("drm/panthor: Add the FW logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030150231.768949-1-boris.brezillon@collabora.com

Userspace can use GROUP_SUBMIT errors as a trigger to check the group state and recreate the group if it became unusable. Make sure we report an error when the group became unusable. Changes in v3: - None Changes in v2: - Add R-bs Fixes: de85488 ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-2-boris.brezillon@collabora.com

If we don't do that, the group is considered usable by userspace, but all further GROUP_SUBMIT will fail with -EINVAL. Changes in v3: - Add R-bs Changes in v2: - New patch Fixes: de85488 ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-3-boris.brezillon@collabora.com

Rearrange lookup of recommended OPP for the Mali GPU device and its refcnt decremental to make sure no OPP object leaks happen in the error path. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: fac9b22 ("drm/panthor: Add the devfreq logical block") Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105205458.1318989-2-adrian.larumbe@collabora.com

Similar to commit cac0757 ("drm/panthor: Fix race when converting group handle to group object") we need to use the XArray's internal locking when retrieving a vm pointer from there. v2: Removed part of the patch that was trying to protect fetching the heap pointer from XArray, as that operation is protected by the @pool->lock. Fixes: 647810e ("drm/panthor: Add the MMU/VM logical block") Reported-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106185806.389089-1-liviu.dudau@arm.com

The current panthor_device_mmap_io() implementation has two issues: 1. For mapping DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET, panthor_device_mmap_io() bails if VM_WRITE is set, but does not clear VM_MAYWRITE. That means userspace can use mprotect() to make the mapping writable later on. This is a classic Linux driver gotcha. I don't think this actually has any impact in practice: When the GPU is powered, writes to the FLUSH_ID seem to be ignored; and when the GPU is not powered, the dummy_latest_flush page provided by the driver is deliberately designed to not do any flushes, so the only thing writing to the dummy_latest_flush could achieve would be to make *more* flushes happen. 2. panthor_device_mmap_io() does not block MAP_PRIVATE mappings (which are mappings without the VM_SHARED flag). MAP_PRIVATE in combination with VM_MAYWRITE indicates that the VMA has copy-on-write semantics, which for VM_PFNMAP are semi-supported but fairly cursed. In particular, in such a mapping, the driver can only install PTEs during mmap() by calling remap_pfn_range() (because remap_pfn_range() wants to **store the physical address of the mapped physical memory into the vm_pgoff of the VMA**); installing PTEs later on with a fault handler (as panthor does) is not supported in private mappings, and so if you try to fault in such a mapping, vmf_insert_pfn_prot() splats when it hits a BUG() check. Fix it by clearing the VM_MAYWRITE flag (userspace writing to the FLUSH_ID doesn't make sense) and requiring VM_SHARED (copy-on-write semantics for the FLUSH_ID don't make sense). Reproducers for both scenarios are in the notes of my patch on the mailing list; I tested that these bugs exist on a Rock 5B machine. Note that I only compile-tested the patch, I haven't tested it; I don't have a working kernel build setup for the test machine yet. Please test it before applying it. Cc: stable@vger.kernel.org Fixes: 5fe909c ("drm/panthor: Add the device logical block") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105-panthor-flush-page-fixes-v1-1-829aaf37db93@google.com

This commit fixes the bug in the handling of partial mapping of the buffer objects to the GPU, which caused kernel warnings. Panthor didn't correctly handle the case where the partial mapping spanned multiple scatterlists and the mapping offset didn't point to the 1st page of starting scatterlist. The offset variable was not cleared after reaching the starting scatterlist. Following warning messages were seen. WARNING: CPU: 1 PID: 650 at drivers/iommu/io-pgtable-arm.c:659 __arm_lpae_unmap+0x254/0x5a0 <snip> pc : __arm_lpae_unmap+0x254/0x5a0 lr : __arm_lpae_unmap+0x2cc/0x5a0 <snip> Call trace: __arm_lpae_unmap+0x254/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 arm_lpae_unmap_pages+0x80/0xa0 panthor_vm_unmap_pages+0xac/0x1c8 [panthor] panthor_gpuva_sm_step_unmap+0x4c/0xc8 [panthor] op_unmap_cb.isra.23.constprop.30+0x54/0x80 __drm_gpuvm_sm_unmap+0x184/0x1c8 drm_gpuvm_sm_unmap+0x40/0x60 panthor_vm_exec_op+0xa8/0x120 [panthor] panthor_vm_bind_exec_sync_op+0xc4/0xe8 [panthor] panthor_ioctl_vm_bind+0x10c/0x170 [panthor] drm_ioctl_kernel+0xbc/0x138 drm_ioctl+0x210/0x4b0 __arm64_sys_ioctl+0xb0/0xf8 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.1+0x98/0xf8 do_el0_svc+0x24/0x38 el0_svc+0x34/0xc8 el0t_64_sync_handler+0xa0/0xc8 el0t_64_sync+0x174/0x178 <snip> panthor : [drm] drm_WARN_ON(unmapped_sz != pgsize * pgcount) WARNING: CPU: 1 PID: 650 at drivers/gpu/drm/panthor/panthor_mmu.c:922 panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> pc : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] lr : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> panthor : [drm] *ERROR* failed to unmap range ffffa388f000-ffffa3890000 (requested range ffffa388c000-ffffa3890000) Fixes: 647810e ("drm/panthor: Add the MMU/VM logical block") Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111134720.780403-1-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>

Mali GPU Arch spec forbids the GPU PTEs to indicate Inner or Outer shareability when no_coherency protocol is selected. Doing so results in unexpected or undesired snooping of the CPU caches on some platforms, such as Juno FPGA, causing functional issues. For example the boot of MCU firmware fails as GPU ends up reading stale data for the FW memory pages from the CPU's cache. The FW memory pages are initialized with uncached mapping when the device is not reported to be dma-coherent. The shareability bits are set to inner-shareable when IOMMU_CACHE flag is passed to map_pages() callback and IOMMU_CACHE flag is passed by Panthor driver when memory needs to be mapped as cached on the GPU side. IOMMU_CACHE seems to imply cache coherent and is probably not fit for purpose for the memory that is mapped as cached on GPU side but doesn't need to remain coherent with the CPU. This commit updates the programming of MEMATTR register to use MIDGARD_INNER instead of CPU_INNER when coherency is disabled. That way the inner-shareability specified in the GPU PTEs would map to Mali's internal-shareable mode, which is always supported by the GPU regardless of the coherency protocal and is required by the Userspace driver to ensure coherency between the shader cores. v2: - Added R-b tags Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20241030225407.4077513-2-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>

This commit fixes the potential misalignment between the value of device tree property "dma-coherent" and default value of COHERENCY_ENABLE register. Panthor driver didn't explicitly program the COHERENCY_ENABLE register with the desired coherency mode. The default value of COHERENCY_ENABLE register is implementation defined, so it may not be always aligned with the "dma-coherent" property value. The commit also checks the COHERENCY_FEATURES register to confirm that the coherency protocol is actually supported or not. v2: - Added R-b tags Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20241030225407.4077513-3-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>

Stop checking the FW halt_status as MCU_STATUS should be sufficient. This should make the check for successful FW halt and subsequently setting fast_reset to true more robust. We should also clear GLB_REQ.GLB_HALT bit only on post-reset prior to starting the FW and only if we're doing a fast reset, because the slow reset will re-initialize all FW sections, including the global interface. Signed-off-by: Karunika Choo <karunika.choo@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/r/20241119135030.3352939-1-karunika.choo@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>

Commit 498893b ("drm/panthor: Simplify FW fast reset path") forgot to copy the definition of glb_iface when it move one line of code. Fixes: 498893b ("drm/panthor: Simplify FW fast reset path") Link: https://lore.kernel.org/dri-devel/20241119164455.572771-1-liviu.dudau@arm.com/ Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Karunika Choo <karunika.choo@arm.com> Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>

Drop the _RD_ in the flag names. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241113160257.2002333-1-boris.brezillon@collabora.com

WARN() will return true if the condition is true, false otherwise. If we store the return of drm_WARN_ON() in ret, we lose the actual error code. v3: - Add R-b v2: - Add R-b Fixes: 5fe909c ("drm/panthor: Add the device logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-2-boris.brezillon@collabora.com

…end path The runtime PM resume operation is not guaranteed to succeed, but if it fails, the device should be in a suspended state. Make sure we're robust to resume failures in the unplug path. v3: - Fix typo - Add R-bs v2: - Move the bit that belonged in the next commit - Drop the panthor_device_unplug() changes Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-3-boris.brezillon@collabora.com

devfreq_{resume,suspend}_device() don't bother undoing the suspend_count modifications if something fails, so either it assumes failures are harmless, or it's super fragile/buggy. In either case it's not something we can address at the driver level, so let's just assume failures are harmless for now, like is done in panfrost. v3: - Add R-b v2: - Add R-b Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-4-boris.brezillon@collabora.com

When the runtime PM resume callback returns an error, it puts the device in a state where it can't be resumed anymore. Make sure we can recover from such transient failures by calling pm_runtime_set_suspended() explicitly after a pm_runtime_resume_and_get() failure. v3: - Add R-b/A-b v2: - Add a comment explaining potential races in panthor_device_resume_and_get() Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Acked-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-5-boris.brezillon@collabora.com

If we do a GPU soft-reset, that's no longer fast reset. This also means the slow reset fallback doesn't work because the MCU state is only reset after a GPU soft-reset. Let's move the retry logic to panthor_device_resume() to issue a soft-reset between the fast and slow attempts, and patch panthor_gpu_suspend() to only power-off the L2 when a fast reset is requested. v3: - No changes v2: - Add R-b Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-6-boris.brezillon@collabora.com

Groups can be killed during a reset even though they did nothing wrong. That usually happens when the FW is put in a bad state by other groups, resulting in group suspension failures when the reset happens. If we end up in that situation, flag the group innocent and report innocence through a new DRM_PANTHOR_GROUP_STATE flag. Bump the minor driver version to reflect the uAPI change. Changes in v4: - Add an entry to the driver version changelog - Add R-bs Changes in v3: - Actually report innocence to userspace Changes in v2: - New patch Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211080500.2349505-1-boris.brezillon@collabora.com

Remove unused function declaration in panthor_gem.h: - `panthor_gem_prime_import_sg_table()` Remove duplicate macro definitions: - `MAX_CSG_PRIO` - `MIN_CS_PER_CSG` - `MIN_CSGS` Signed-off-by: Florent Tomasin <florent.tomasin@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250107173310.88329-1-florent.tomasin@arm.com

Use the correct format for all kernel-doc comments. Use structname.membername for named structs. Don't precede function names in kernel-doc with '@' sign. Use the correct function parameter names in kernel-doc comments. This fixes around 80 kernel-doc warnings. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Liviu Dudau <liviu.dudau@arm.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250111062832.910495-1-rdunlap@infradead.org

If a reset is scheduled when the suspend happens, we drop the reset-pending info on the floor assuming the resume will fix things, but the resume logic might try a fast reset. If we're lucky, the fast reset fails and we fallback to a slow reset, but if the FW was corrupted in a way that makes it partially functional (it boots but doesn't quite do what it's expected to do), we won't notice immediately that things are not working correctly, leading to a new reset further down the road. Fixes: 5fe909c ("drm/panthor: Add the device logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241217092457.1582053-1-boris.brezillon@collabora.com

Commit baf4afc ("drm/sched: Improve teardown documentation") documents problems of drm_sched_fini() in form of a list. The checklist triggers htmldocs warning (but renders correctly in htmldocs output): Documentation/gpu/drm-mm:571: ./drivers/gpu/drm/scheduler/sched_main.c:1359: ERROR: Unexpected indentation. Separate the list from the preceding paragraph by a blank line to fix the warning. While at it, also end the aforementioned paragraph by a colon. Fixes: baf4afc ("drm/sched: Improve teardown documentation") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/r/20241108175655.6d3fcfb7@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> [phasta: Adjust commit message] Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241217034915.62594-1-bagasdotme@gmail.com

No driver is using the update_job_credits() schduler vfunc so lets remove it. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Philipp Stanner <pstanner@redhat.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Matt Coster <matt.coster@imgtec.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250110111301.76909-1-tvrtko.ursulin@igalia.com

There is no need to check the boolean in the work item's prologues since the boolean can be set at any later time anyway. The helper which pauses submission sets it and synchronously cancels the work and helpers which queue the work check for the flag so all should be good. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250114105942.64832-1-tvrtko.ursulin@igalia.com

Lets isolate scheduler internals from drivers such as pvr which currently walks the dependency array to look for fences. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Philipp Stanner <pstanner@redhat.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250113103341.43914-1-tvrtko.ursulin@igalia.com

RadxaYuntian · 2025-02-11T06:01:21Z

Armbian 392不是关了么？最终合入了吗？

CodeChenL · 2025-02-11T06:05:33Z

Armbian 392不是关了么？最终合入了吗？

一开始backports作者选错分支，armbian开发者手动合了

CodeChenL · 2025-02-11T06:07:38Z

都是直接从armbian rkr5 cherry-pick的

RadxaStephen · 2025-02-11T06:13:30Z

自编译mesa 25 加 vkmark 跑分：1563， glmark2与glmark2-es照旧1000+

在最新的 rock5 debian12 镜像上，只更新内核，桌面有没有异常？跑分有提升吗？需要更新 mesa 包吗？

CodeChenL · 2025-02-11T08:10:12Z

自编译mesa 25 加 vkmark 跑分：1563， glmark2与glmark2-es照旧1000+

在最新的 rock5 debian12 镜像上，只更新内核，桌面有没有异常？跑分有提升吗？需要更新 mesa 包吗？

mesa 25如果不用vulkan不是必须项，一切正常

6.1.84-2-rk2410 + oibaf mesa 24：1022
6.1.84-999-chen-rk2410 + oibaf mesa 24：1030
mesa 25被我删了就不测了，昨天测过也是一千出头

CodeChenL · 2025-02-11T08:10:29Z

这次更新主要是支持vulkan

CodeChenL requested review from RadxaYuntian and RadxaStephen February 10, 2025 11:17

Christian König and others added 27 commits February 11, 2025 00:08

akagoe01 and others added 17 commits February 11, 2025 00:08

CodeChenL force-pushed the linux-6.1-stan-rkr4.1 branch from fd8747c to 2b0c8de Compare February 10, 2025 16:09

RadxaStephen merged commit 2b0c8de into radxa:linux-6.1-stan-rkr4.1 Feb 15, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sync Panthor with drm-tip #392

Sync Panthor with drm-tip #392

CodeChenL commented Feb 10, 2025

CodeChenL commented Feb 10, 2025 •

edited

Loading

RadxaYuntian commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

RadxaStephen commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

Sync Panthor with drm-tip #392

Sync Panthor with drm-tip #392

Conversation

CodeChenL commented Feb 10, 2025

CodeChenL commented Feb 10, 2025 • edited Loading

RadxaYuntian commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

RadxaStephen commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

CodeChenL commented Feb 11, 2025

CodeChenL commented Feb 10, 2025 •

edited

Loading