Alice Ryhl [Mon, 17 Nov 2025 10:39:17 +0000 (10:39 +0000)]
gpu: nova-core: make formatting compatible with rust tree
Commit 38b7cc448a5b ("gpu: nova-core: implement Display for Spec") in
drm-rust-next introduced some usage of the Display trait, but the
Display trait is being modified in the rust tree this cycle. Thus, to
avoid conflicts with the Rust tree, tweak how the formatting machinery
is used in a way where it works both with and without the changes in the
Rust tree.
John Hubbard [Sat, 15 Nov 2025 01:09:22 +0000 (17:09 -0800)]
gpu: nova-core: provide a clear error report for unsupported GPUs
Pass in a PCI device to Spec::new(), and provide a Display
implementation for boot42, in order to provide a clear, concise report
of what happened: the driver read NV_PMC_BOOT42, and found that the GPU
is not supported.
For very old GPUs (older than Fermi), the driver still returns ENODEV,
but it does so without a driver-specific dmesg report. That is exactly
appropriate, because if such a GPU is installed, it can only be
supported by Nouveau. And if so, the user is not helped by additional
error messages from Nova.
Here's the full dmesg output for a Blackwell (not yet supported) GPU:
NovaCore 0000:01:00.0: Probe Nova Core GPU driver.
NovaCore 0000:01:00.0: Unsupported chipset: boot42 = 0x1b2a1000 (architecture 0x1b, implementation 0x2)
NovaCore 0000:01:00.0: probe with driver NovaCore failed with error -524
Cc: Alexandre Courbot <acourbot@nvidia.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Timur Tabi <ttabi@nvidia.com> Cc: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: fix commit log with ENODEV (not ENOTSUPP) error
code for unsupported GPUs.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251115010923.1192144-5-jhubbard@nvidia.com>
John Hubbard [Sat, 15 Nov 2025 01:09:21 +0000 (17:09 -0800)]
gpu: nova-core: add boot42 support for next-gen GPUs
NVIDIA GPUs are moving away from using NV_PMC_BOOT_0 to contain
architecture and revision details, and will instead use NV_PMC_BOOT_42
in the future. NV_PMC_BOOT_0 will contain a specific set of values
that will mean "go read NV_PMC_BOOT_42 instead".
Change the selection logic in Nova so that it will claim Turing and
later GPUs. This will work for the foreseeable future, without any
further code changes here, because all NVIDIA GPUs are considered, from
the oldest supported on Linux (NV04), through the future GPUs.
Add some comment documentation to explain, chronologically, how boot0
and boot42 change with the GPU eras, and how that affects the selection
logic.
Cc: Alexandre Courbot <acourbot@nvidia.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Timur Tabi <ttabi@nvidia.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: remove unneeded `From<BOOT_0> for Revision`
implementation.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251115010923.1192144-4-jhubbard@nvidia.com>
Alistair Popple [Fri, 14 Nov 2025 19:55:52 +0000 (14:55 -0500)]
gpu: nova-core: gsp: Retrieve GSP static info to gather GPU information
After GSP initialization is complete, retrieve the static configuration
information from GSP-RM. This information includes GPU name, capabilities,
memory configuration, and other properties. On some GPU variants, it is
also required to do this for initialization to complete.
Signed-off-by: Alistair Popple <apopple@nvidia.com> Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: properly abstract the command's bindings, add
relevant methods, make str_from_null_terminated return an Option, fix
size of GPU name array.] Co-developed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-14-joelagnelf@nvidia.com>
Alistair Popple [Fri, 14 Nov 2025 19:55:51 +0000 (14:55 -0500)]
gpu: nova-core: gsp: Wait for gsp initialization to complete
This adds the GSP init done command to wait for GSP initialization
to complete. Once this command has been received the GSP is fully
operational and will respond properly to normal RPC commands.
Signed-off-by: Alistair Popple <apopple@nvidia.com> Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: move new definitions to end of commands.rs, rename
to `wait_gsp_init_done` and remove timeout argument.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-13-joelagnelf@nvidia.com>
Joel Fernandes [Fri, 14 Nov 2025 19:55:46 +0000 (14:55 -0500)]
gpu: nova-core: Implement the GSP sequencer
Implement the GSP sequencer which culminates in INIT_DONE message being
received from the GSP indicating that the GSP has successfully booted.
This is just initial sequencer support, the actual commands will be
added in the next patches.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot@nvidia.com: move GspSequencerInfo definition before its impl
blocks and rename it to GspSequence, adapt imports in sequencer.rs to
new formatting rules, remove `timeout` argument to harmonize with other
commands.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-8-joelagnelf@nvidia.com>
Joel Fernandes [Mon, 10 Nov 2025 13:34:21 +0000 (22:34 +0900)]
gpu: nova-core: falcon: Add support to check if RISC-V is active
Add definition for RISCV_CPUCTL register and use it in a new falcon API
to check if the RISC-V core of a Falcon is active. It is required by
the sequencer to know if the GSP's RISCV processor is active.
Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-13-8ae4058e3c0e@nvidia.com>
Alistair Popple [Mon, 10 Nov 2025 13:34:20 +0000 (22:34 +0900)]
gpu: nova-core: gsp: Add SetRegistry command
Add support for sending the SetRegistry command, which is critical to
GSP initialization.
The RM registry is serialized into a packed format and sent via the
command queue. For now only three parameters which are required to boot
GSP are hardcoded. In the future a kernel module parameter will be added
to enable other parameters to be added.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
[acourbot@nvidia.com: split into its own patch.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-12-8ae4058e3c0e@nvidia.com>
Alistair Popple [Mon, 10 Nov 2025 13:34:18 +0000 (22:34 +0900)]
gpu: nova-core: gsp: Create rmargs
Initialise the GSP resource manager arguments (rmargs) which provides
initialisation parameters to the GSP firmware during boot. The rmargs
structure contains arguments to configure the GSP message/command queue
location.
These are mapped for coherent DMA and added to the libos data structure
for access when booting GSP.
Alistair Popple [Mon, 10 Nov 2025 13:34:17 +0000 (22:34 +0900)]
gpu: nova-core: gsp: Add GSP command queue bindings and handling
This commit introduces core infrastructure for handling GSP command and
message queues in the nova-core driver. The command queue system enables
bidirectional communication between the host driver and GSP firmware
through a remote message passing interface.
The interface is based on passing serialised data structures over a ring
buffer with separate transmit and receive queues. Commands are sent by
writing to the CPU transmit queue and waiting for completion via the
receive queue.
To ensure safety mutable or immutable (depending on whether it is a send
or receive operation) references are taken on the command queue when
allocating the message to write/read to. This ensures message memory
remains valid and the command queue can't be mutated whilst an operation
is in progress.
Currently this is only used by the probe() routine and therefore can
only used by a single thread of execution. Locking to enable safe access
from multiple threads will be introduced in a future series when that
becomes necessary.
rust: enable slice_flatten feature and provide it through an extension trait
In Rust 1.80, the previously unstable `slice::flatten` family of methods
have been stabilized and renamed to `slice::as_flattened`.
This creates an issue as we want to use `as_flattened`, but need to
support the MSRV (which at the moment is Rust 1.78) where it is named
`flatten`.
Solve this by enabling the `slice_flatten` feature, and providing an
`as_flattened` implementation through an extension trait for compiler
versions where it is not available.
The trait is then exported from the prelude, making the `as_flattened`
family of methods transparently available for all supported compiler
versions.
This extension trait can be removed once the MSRV passes 1.80.
Alistair Popple [Mon, 10 Nov 2025 13:34:15 +0000 (22:34 +0900)]
gpu: nova-core: Add zeroable trait to bindings
Derive the Zeroable trait for existing bindgen generated bindings. This
is safe because all bindgen generated types are simple integer types for
which any bit pattern, including all zeros, is valid.
Joel Fernandes [Mon, 10 Nov 2025 13:34:14 +0000 (22:34 +0900)]
gpu: nova-core: Add a slice-buffer (sbuffer) datastructure
A data structure that can be used to write across multiple slices which
may be out of order in memory. This lets SBuffer user correctly and
safely write out of memory order, without error-prone tracking of
pointers/offsets.
let mut buf1 = [0u8; 3];
let mut buf2 = [0u8; 5];
let mut sbuffer = SBuffer::new([&mut buf1[..], &mut buf2[..]]);
let data = b"hello";
let result = sbuffer.write(data);
Alistair Popple [Mon, 10 Nov 2025 13:34:13 +0000 (22:34 +0900)]
gpu: nova-core: gsp: Create wpr metadata
The GSP requires some pieces of metadata to boot. These are passed in a
struct which the GSP transfers via DMA. Create this struct and get a
handle to it for future use when booting the GSP.
Alistair Popple [Mon, 10 Nov 2025 13:34:12 +0000 (22:34 +0900)]
gpu: nova-core: Create initial Gsp
The GSP requires several areas of memory to operate. Each of these have
their own simple embedded page tables. Set these up and map them for DMA
to/from GSP using CoherentAllocation's. Return the DMA handle describing
where each of these regions are for future use when booting GSP.
gpu: nova-core: num: add functions to safely convert a const value to a smaller type
There are times where we need to store a constant value defined as a
larger type (e.g. through a binding) into a smaller type, knowing
that the value will fit. Rust, unfortunately, only provides us with the
`as` operator for that purpose, the use of which is discouraged as it
silently strips data.
Extend the `num` module with functions allowing to perform the
conversion infallibly, at compile time.
Example:
const FOO_VALUE: u32 = 1;
// `FOO_VALUE` fits into a `u8`, so the conversion is valid.
let foo = num::u32_to_u8::<{ FOO_VALUE }>();
We are going to use this feature extensively in Nova.
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
[acourbot@nvidia.com: fix mistake in example pointed out by Mikko.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-3-8ae4058e3c0e@nvidia.com>
gpu: nova-core: compute layout of more framebuffer regions required for GSP
Compute more of the required FB layout information to boot the GSP
firmware.
This information is dependent on the firmware itself, so first we need
to import and abstract the required firmware bindings in the `nvfw`
module.
Then, a new FB HAL method is introduced in `fb::hal` that uses these
bindings and hardware information to compute the correct layout
information.
This information is then used in `fb` and the result made visible in
`FbLayout`.
These 3 things are grouped into the same patch to avoid lots of unused
warnings that would be tedious to work around. As they happen in
different files, they should not be too difficult to track separately.
There are a few remaining cases where we *do* want to use `as`,
because we specifically want to strip the data that does not fit into
the destination type. Comment these uses to clear confusion about the
intent.
gpu: nova-core: replace use of `as` with functions from `num`
Use the newly-introduced `num` module to replace the use of `as`
wherever it is safe to do. This ensures that a given conversion cannot
lose data if its source or destination type ever changes.
gpu: nova-core: add functions and traits for lossless integer conversions
The core library's `From` implementations do not cover conversions
that are not portable or future-proof. For instance, even though it is
safe today, `From<usize>` is not implemented for `u64` because of the
possibility to support larger-than-64bit architectures in the future.
However, the kernel supports a narrower set of architectures, and in the
case of Nova we only support 64-bit. This makes it helpful and desirable
to provide more infallible conversions, lest we need to rely on the `as`
keyword and carry the risk of silently losing data.
Thus, introduce a new module `num` that provides safe const functions
performing more conversions allowed by the build target, as well as
`FromSafeCast` and `IntoSafeCast` traits that are just extensions of
`From` and `Into` to conversions that are known to be lossless.
This small patch updates the nova todo list to
remove some tasks that have been solved lately:
* COHA is solved in this patch series
* TRSM was solved recently [1]
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "Documentation: nova:".] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-4-delcastillodelarosadaniel@gmail.com>
gpu: nova-core: Simplify `DmaObject::from_data` in nova-core/dma.rs
This patch solves one of the existing mentions of COHA, a task
in the Nova task list about improving the `CoherentAllocation` API.
It uses the `write` method from `CoherentAllocation`.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-3-delcastillodelarosadaniel@gmail.com>
gpu: nova-core: Fix capitalization of some comments
Some comments that already existed didn't start with a capital
letter, this patch fixes that.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-2-delcastillodelarosadaniel@gmail.com>
gpu: nova-core: Simplify `transmute` and `transmute_mut` in fwsec.rs
This patch solves one of the existing mentions of COHA, a task
in the Nova task list about improving the `CoherentAllocation` API.
It uses the new `from_bytes` method from the `FromBytes` trait as
well as the `as_slice` and `as_slice_mut` methods from
`CoherentAllocation`.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".]
[acourbot@nvidia.com: fix merge conflict after imports refactor.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-1-delcastillodelarosadaniel@gmail.com>
John Hubbard [Fri, 7 Nov 2025 02:10:06 +0000 (18:10 -0800)]
gpu: nova-core: apply the one "use" item per line policy
As per [1], we need one "use" item per line, in order to reduce merge
conflicts. Furthermore, we need a trailing ", //" in order to tell
rustfmt(1) to leave it alone.
Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: remove imports already in prelude as pointed out
by Danilo.]
[acourbot@nvidia.com: remove a few unneeded trailing `//`.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251107021006.434109-1-jhubbard@nvidia.com>
gpu: nova-core: vbios: use FromBytes for NpdeStruct
Use `from_bytes_copy_prefix` to create `NpdeStruct` instead of building
it ourselves from the bytes stream. This lets us remove a few array
accesses and results in shorter code.
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-vbios-frombytes-v1-5-ac441ebc1de3@nvidia.com>
gpu: nova-core: vbios: use FromBytes for BitHeader
Use `from_bytes_copy_prefix` to create `BitHeader` instead of building
it ourselves from the bytes stream. This lets us remove a few array
accesses and results in shorter code.
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-vbios-frombytes-v1-4-ac441ebc1de3@nvidia.com>
gpu: nova-core: vbios: use FromBytes for PcirStruct
Use `from_bytes_copy_prefix` to create `PcirStruct` instead of building
it ourselves from the bytes stream. This lets us remove a few array
accesses and results in shorter code.
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-vbios-frombytes-v1-3-ac441ebc1de3@nvidia.com>
gpu: nova-core: vbios: use FromBytes for PmuLookupTable header
Use `from_bytes_copy_prefix` to create the `PmuLookupTable` header
instead of building it ourselves from the bytes stream. This lets us
remove a few `as` conversions and array accesses.
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-vbios-frombytes-v1-2-ac441ebc1de3@nvidia.com>
rust: transmute: add `from_bytes_prefix` family of methods
The `from_bytes*` family of functions expect a slice of the exact same
size as the requested type. This can be sometimes cumbersome for callers
that deal with dynamic stream of data that needs to be manually cut
before each invocation of `from_bytes`.
To simplify such callers, introduce a new `from_bytes*_prefix` family of
methods, which split the input slice at the index required for the
equivalent `from_bytes` method to succeed, and return its result
alongside with the remainder of the slice.
This design is inspired by zerocopy's `try_*_from_prefix` family of
methods.
gpu: nova-core: use `try_from` instead of `as` for u32 conversions
There are a few situations in the driver where we convert a `usize` into
a `u32` using `as`. Even though most of these are obviously correct, use
`try_from` and let the compiler optimize wherever it is safe to do so.
Danilo Krummrich [Tue, 28 Oct 2025 10:44:27 +0000 (11:44 +0100)]
MAINTAINERS: add Tyr to DRM DRIVERS AND COMMON INFRASTRUCTURE [RUST]
Commit cf4fd52e3236 ("rust: drm: Introduce the Tyr driver for Arm Mali
GPUs") introduced the Tyr driver for ARM Mali GPUs, which is maintained
through the drm-rust tree, hence add it to the corresponding entry in
MAINTAINERS.
Signed-off-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Daniel Almeida <daniel.almeida@collabora.com> Acked-by: Alice Ryhl <aliceryhl@google.com> Link: https://patch.msgid.link/20251028104433.334886-1-dakr@kernel.org Signed-off-by: Alice Ryhl <aliceryhl@google.com>
John Hubbard [Sat, 25 Oct 2025 01:40:50 +0000 (18:40 -0700)]
gpu: nova-core: remove unnecessary need_riscv, bar parameters
The need_riscv parameter and its associated RISCV validation logic are
are actually unnecessary for correct operation. Remove it, along with
the now-unused bar parameter as well.
Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251025014050.585153-3-jhubbard@nvidia.com>
John Hubbard [Sat, 25 Oct 2025 01:40:49 +0000 (18:40 -0700)]
gpu: nova-core: remove an unnecessary register read: HWCFG1
This register read is not required in order to bring up any of the GPUs,
and it is read too early on Hopper/Blackwell+ GPUs anyway. So just stop
doing this.
Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251025014050.585153-2-jhubbard@nvidia.com>
Lyude Paul [Thu, 16 Oct 2025 21:08:14 +0000 (17:08 -0400)]
Partially revert "rust: drm: gem: Implement AlwaysRefCounted for all gem objects automatically"
Currently in order to implement AlwaysRefCounted for gem objects, we use a
blanket implementation:
unsafe impl<T: IntoGEMObject> AlwaysRefCounted for T { … }
While this technically works, it comes with the rather unfortunate downside
that attempting to create a similar blanket implementation in any other
kernel crate will now fail in a rather confusing way.
Using an example from the (not yet upstream) rust DRM KMS bindings, if we
were to add:
unsafe impl<T: RcModeObject> AlwaysRefCounted for T { … }
Then the moment that both blanket implementations are present in the same
kernel tree, compilation fails with the following:
error[E0119]: conflicting implementations of trait `types::AlwaysRefCounted`
--> rust/kernel/drm/kms.rs:504:1
|
504 | unsafe impl<T: RcModeObject> AlwaysRefCounted for T {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation
|
::: rust/kernel/drm/gem/mod.rs:97:1
|
97 | unsafe impl<T: IntoGEMObject> AlwaysRefCounted for T {
| ---------------------------------------------------- first implementation here
So, revert these changes for now. The proper fix for this is to introduce a
macro for copy/pasting the same implementation of AlwaysRefCounted around.
Joel Fernandes [Thu, 16 Oct 2025 15:13:22 +0000 (11:13 -0400)]
gpu: nova-core: bitfield: Add support for different storage widths
Previously, bitfields were hardcoded to use u32 as the underlying
storage type. Add support for different storage types (u8, u16, u32,
u64) to the bitfield macro.
New syntax is: struct Name(<type ex., u32>) { ... }
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Elle Rhumsaa <elle@weathered-steel.dev> Reviewed-by: Edwin Peer <epeer@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot@nvidia.com: fix long lines warnings.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251016151323.1201196-4-joelagnelf@nvidia.com>
Joel Fernandes [Thu, 16 Oct 2025 15:13:21 +0000 (11:13 -0400)]
gpu: nova-core: bitfield: Move bitfield-specific code from register! into new macro
Move the bitfield-specific code from the register macro into a new macro
called bitfield. This will be used to define structs with bitfields,
similar to C language.
gpu: nova-core: register: use field type for Into implementation
The getter method of a field works with the field type, but its setter
expects the type of the register. This leads to an asymmetry in the
From/Into implementations required for a field with a dedicated type.
For instance, a field declared as
Change this so the `From<Mode>` now needs to be implemented for `u8`,
i.e. the primitive type of the field. This is more consistent, and will
become a requirement once we start using the TryFrom/Into derive macros
to implement these automatically.
Reported-by: Edwin Peer <epeer@nvidia.com> Closes: https://lore.kernel.org/rust-for-linux/F3853912-2C1C-4F9B-89B0-3168689F35B3@nvidia.com/ Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251016151323.1201196-2-joelagnelf@nvidia.com>
Alice Ryhl [Mon, 6 Oct 2025 12:05:56 +0000 (12:05 +0000)]
panthor: use drm_gpuva_unlink_defer()
Instead of manually deferring cleanup of vm_bos, use the new GPUVM
infrastructure for doing so.
To avoid manual management of vm_bo refcounts, the panthor_vma_link()
and panthor_vma_unlink() methods are changed to get and put a vm_bo
refcount on the vm_bo. This simplifies the code a lot. I preserved the
behavior where panthor_gpuva_sm_step_map() drops the refcount right away
rather than letting panthor_vm_cleanup_op_ctx() do it later.
Alice Ryhl [Mon, 6 Oct 2025 12:05:55 +0000 (12:05 +0000)]
drm/gpuvm: add deferred vm_bo cleanup
When using GPUVM in immediate mode, it is necessary to call
drm_gpuvm_unlink() from the fence signalling critical path. However,
unlink may call drm_gpuvm_bo_put(), which causes some challenges:
1. drm_gpuvm_bo_put() often requires you to take resv locks, which you
can't do from the fence signalling critical path.
2. drm_gpuvm_bo_put() calls drm_gem_object_put(), which is often going
to be unsafe to call from the fence signalling critical path.
To solve these issues, add a deferred version of drm_gpuvm_unlink() that
adds the vm_bo to a deferred cleanup list, and then clean it up later.
The new methods take the GEMs GPUVA lock internally rather than letting
the caller do it because it also needs to perform an operation after
releasing the mutex again. This is to prevent freeing the GEM while
holding the mutex (more info as comments in the patch). This means that
the new methods can only be used with DRM_GPUVM_IMMEDIATE_MODE.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20251006-vmbo-defer-v4-1-30cbd2c05adb@google.com
[aliceryhl: fix formatting of vm_bo = llist_entry(...) line] Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Linus Torvalds [Sun, 19 Oct 2025 14:59:43 +0000 (04:59 -1000)]
Merge tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Make sure the check for lost pelt idle time is done unconditionally
to have correct lost idle time accounting
- Stop the deadline server task before a CPU goes offline
* tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix pelt lost idle time detection
sched/deadline: Stop dl_server before CPU goes offline
Linus Torvalds [Sun, 19 Oct 2025 14:54:08 +0000 (04:54 -1000)]
Merge tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Make sure perf reporting works correctly in setups using
overlayfs or FUSE
- Move the uprobe optimization to a better location logically
* tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix MMAP2 event device with backing files
perf/core: Fix MMAP event path names with backing files
perf/core: Fix address filter match with backing files
uprobe: Move arch_uprobe_optimize right after handlers execution
Linus Torvalds [Sun, 19 Oct 2025 14:41:27 +0000 (04:41 -1000)]
Merge tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Reset the why-the-system-rebooted register on AMD to avoid stale bits
remaining from previous boots
- Add a missing barrier in the TLB flushing code to prevent erroneously
not flushing a TLB generation
- Make sure cpa_flush() does not overshoot when computing the end range
of a flush region
- Fix resctrl bandwidth counting on AMD systems when the amount of
monitoring groups created exceeds the number the hardware can track
* tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Prevent reset reasons from being retained across reboot
x86/mm: Fix SMP ordering in switch_mm_irqs_off()
x86/mm: Fix overflow in __cpa_addr()
x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID
Linus Torvalds [Sat, 18 Oct 2025 20:05:13 +0000 (10:05 -1000)]
Merge tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull rustfmt fixes from Miguel Ojeda:
"Rust 'rustfmt' cleanup
'rustfmt', by default, formats imports in a way that is prone to
conflicts while merging and rebasing, since in some cases it condenses
several items into the same line.
Document in our guidelines that we will handle this for the moment
with the trailing empty comment workaround and make the tree
'rustfmt'-clean again"
* tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
rust: bitmap: fix formatting
rust: cpufreq: fix formatting
rust: alloc: employ a trailing comment to keep vertical layout
docs: rust: add section on imports formatting
Linus Torvalds [Sat, 18 Oct 2025 18:38:28 +0000 (08:38 -1000)]
Merge tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fix from Jarkko Sakkinen:
"Correct the state transitions for ARM FF-A to match the spec and how
tpm_crb behaves on other platforms"
* tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm_crb: Add idle support for the Arm FF-A start method
Linus Torvalds [Sat, 18 Oct 2025 18:35:09 +0000 (08:35 -1000)]
Merge tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Search for MSI Capability with correct ID to fix an MSI regression on
platforms with Cadence IP (Hans Zhang)
- Revert early bridge resource set up to fix resource assignment
failures that broke at least alpha boot and Snapdragon ath12k WiFi
(Ilpo Järvinen)
- Implement VMD .irq_startup()/.irq_shutdown() to fix IRQ issues that
caused boot crashes and broken devices below VMD (Inochi Amaoto)
- Select CONFIG_SCREEN_INFO on X86 to fix black screen on boot when
SCREEN_INFO not selected (Mario Limonciello)
* tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI/VGA: Select SCREEN_INFO on X86
PCI: vmd: Override irq_startup()/irq_shutdown() in vmd_init_dev_msi_info()
PCI: Revert early bridge resource set up
PCI: cadence: Search for MSI Capability with correct ID
Linus Torvalds [Sat, 18 Oct 2025 18:22:07 +0000 (08:22 -1000)]
Merge tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link fixes from Dave Jiang:
"A small collection of CXL fixes. In addition to some misc fixes for
the CXL subsystem, a number of fixes for CXL extended linear cache
support are included to make it functional again.
- Avoid missing port component registers setup due to dport
enumeration failure
- Add check for no entries in cxl_feature_info to address accessing
invalid pointer.
- Use %pa printk format to emit resource_size_t in
validate_region_offset()
CXL extended linear cache support fixes:
- Fix setup of memory resource in cxl_acpi_set_cache_size()
- Set range param for region_res_match_cxl_range() as const
(addresses a compile warning for match_region_by_range() fix)
- Fix match_region_by_range() to use region_res_match_cxl_range()
- Subtract to find an hpa_alias0 in cxl_poison events to correct the
alias math calculation"
* tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/trace: Subtract to find an hpa_alias0 in cxl_poison events
cxl/region: Use %pa printk format to emit resource_size_t
cxl: Fix match_region_by_range() to use region_res_match_cxl_range()
cxl: Set range param for region_res_match_cxl_range() as const
cxl/acpi: Fix setup of memory resource in cxl_acpi_set_cache_size()
cxl/features: Add check for no entries in cxl_feature_info
cxl/port: Avoid missing port component registers setup
Linus Torvalds [Sat, 18 Oct 2025 18:00:43 +0000 (08:00 -1000)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)
- Make selftests/bpf arg_parsing.c more robust to errors (Andrii
Nakryiko)
- Fix redefinition of 'off' as different kind of symbol when I40E
driver is builtin (Brahmajit Das)
- Do not disable preemption in bpf_test_run (Sahil Chandna)
- Fix memory leak in __lookup_instance error path (Shardul Bankar)
- Ensure test data is flushed to disk before reading it (Xing Guo)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Fix redefinition of 'off' as different kind of symbol
bpf: Do not disable preemption in bpf_test_run().
bpf: Fix memory leak in __lookup_instance error path
selftests: arg_parsing: Ensure data is flushed to disk before reading.
bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
selftests/bpf: make arg_parsing.c more robust to crashes
bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path
Linus Torvalds [Sat, 18 Oct 2025 17:23:59 +0000 (07:23 -1000)]
Merge tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fixes from Namjae Jeon:
- Fix out-of-bounds in FS_IOC_SETFSLABEL
- Add validation for stream entry size to prevent infinite loop
* tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: fix out-of-bounds in exfat_nls_to_ucs2()
exfat: fix improper check of dentry.stream.valid_size
Linus Torvalds [Sat, 18 Oct 2025 17:18:48 +0000 (07:18 -1000)]
Merge tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
- Fix for FlexFiles mirror->dss allocation
- Apply delay_retrans to async operations
- Check if suid/sgid is cleared after a write when needed
- Fix setting the state renewal timer for early mounts after a reboot
* tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS4: Fix state renewals missing after boot
NFS: check if suid/sgid was cleared after a write as needed
NFS4: Apply delay_retrans to async operations
NFSv4/flexfiles: fix to allocate mirror->dss before use
Linus Torvalds [Sat, 18 Oct 2025 17:11:32 +0000 (07:11 -1000)]
Merge tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"smb client fixes, security and smbdirect improvements, and some minor cleanup:
- Important OOB DFS fix
- Fix various potential tcon refcount leaks
- smbdirect (RDMA) fixes (following up from test event a few weeks
ago):
- Fixes to improve and simplify handling of memory lifetime of
smbdirect_mr_io structures, when a connection gets disconnected
- Make sure we really wait to reach SMBDIRECT_SOCKET_DISCONNECTED
before destroying resources
- Make sure the send/recv submission/completion queues are large
enough to avoid ib_post_send() from failing under pressure
- convert cifs.ko to use the recommended crypto libraries (instead of
crypto_shash), this also can improve performance
- Three small cleanup patches"
* tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (24 commits)
smb: client: Consolidate cmac(aes) shash allocation
smb: client: Remove obsolete crypto_shash allocations
smb: client: Use HMAC-MD5 library for NTLMv2
smb: client: Use MD5 library for SMB1 signature calculation
smb: client: Use MD5 library for M-F symlink hashing
smb: client: Use HMAC-SHA256 library for SMB2 signature calculation
smb: client: Use HMAC-SHA256 library for key generation
smb: client: Use SHA-512 library for SMB3.1.1 preauth hash
cifs: parse_dfs_referrals: prevent oob on malformed input
smb: client: Fix refcount leak for cifs_sb_tlink
smb: client: let smbd_destroy() wait for SMBDIRECT_SOCKET_DISCONNECTED
smb: move some duplicate definitions to common/cifsglob.h
smb: client: let destroy_mr_list() keep smbdirect_mr_io memory if registered
smb: client: let destroy_mr_list() call ib_dereg_mr() before ib_dma_unmap_sg()
smb: client: call ib_dma_unmap_sg if mr->sgt.nents is not 0
smb: client: improve logic in smbd_deregister_mr()
smb: client: improve logic in smbd_register_mr()
smb: client: improve logic in allocate_mr_list()
smb: client: let destroy_mr_list() remove locked from the list
smb: client: let destroy_mr_list() call list_del(&mr->list)
...
Linus Torvalds [Sat, 18 Oct 2025 17:07:14 +0000 (07:07 -1000)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix the handling of ZCR_EL2 in NV VMs
- Pick the correct translation regime when doing a PTW on the back of
a SEA
- Prevent userspace from injecting an event into a vcpu that isn't
initialised yet
- Move timer save/restore to the sysreg handling code, fixing EL2
timer access in the process
- Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
- Fix trapping configuration when the host isn't GICv3
- Improve the detection of HCR_EL2.E2H being RES1
- Drop a spurious 'break' statement in the S1 PTW
- Don't try to access SPE when owned by EL3
Documentation updates:
- Document the failure modes of event injection
- Document that a GICv3 guest can be created on a GICv5 host with
FEAT_GCIE_LEGACY
Selftest improvements:
- Add a selftest for the effective value of HCR_EL2.AMO
- Address build warning in the timer selftest when building with
clang
- Teach irqfd selftests about non-x86 architectures
- Add missing sysregs to the set_id_regs selftest
- Fix vcpu allocation in the vgic_lpi_stress selftest
- Correctly enable interrupts in the vgic_lpi_stress selftest
x86:
- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
for the bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return
-EAGAIN if userspace deletes/moves memslot during prefault")
- Don't try to get PMU capabilities from perf when running a CPU with
hybrid CPUs/PMUs, as perf will rightly WARN.
guest_memfd:
- Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
more generic KVM_CAP_GUEST_MEMFD_FLAGS
- Add a guest_memfd INIT_SHARED flag and require userspace to
explicitly set said flag to initialize memory as SHARED,
irrespective of MMAP.
The behavior merged in 6.18 is that enabling mmap() implicitly
initializes memory as SHARED, which would result in an ABI
collision for x86 CoCo VMs as their memory is currently always
initialized PRIVATE.
- Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
private memory, to enable testing such setups, i.e. to hopefully
flush out any other lurking ABI issues before 6.18 is officially
released.
- Add testcases to the guest_memfd selftest to cover guest_memfd
without MMAP, and host userspace accesses to mmap()'d private
memory"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
arm64: Revamp HCR_EL2.E2H RES1 detection
KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
KVM: arm64: Kill leftovers of ad-hoc timer userspace access
KVM: arm64: Fix WFxT handling of nested virt
KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
KVM: arm64: Make timer_set_offset() generally accessible
KVM: arm64: Replace timer context vcpu pointer with timer_id
KVM: arm64: Introduce timer_context_to_vcpu() helper
KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
Documentation: KVM: Update GICv3 docs for GICv5 hosts
KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
KVM: arm64: selftests: Allocate vcpus with correct size
...
Linus Torvalds [Sat, 18 Oct 2025 17:02:28 +0000 (07:02 -1000)]
Merge tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:
- Fix to handle NULL pointer dereference at irq domain teardown
- Fix for handling extraction of struct xive_irq_data
- Fix to skip parameter area allocation when fadump disabled
Thanks to Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM),
Sourabh Jain, and Venkat Rao Bagalkote,
* tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/fadump: skip parameter area allocation when fadump is disabled
powerpc, ocxl: Fix extraction of struct xive_irq_data
powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown
Linus Torvalds [Sat, 18 Oct 2025 16:59:25 +0000 (06:59 -1000)]
Merge tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:
- Fixes for two bugs that can be triggered when debugging options are
enabled (Hao Ge, Vlastimil Babka)
* tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slab: reset slab->obj_ext when freeing and it is OBJEXTS_ALLOC_FAIL
slab: fix clearing freelist in free_deferred_objects()
Stuart Yoder [Sat, 18 Oct 2025 11:25:18 +0000 (14:25 +0300)]
tpm_crb: Add idle support for the Arm FF-A start method
According to the CRB over FF-A specification [1], a TPM that implements
the ABI must comply with the TCG PTP specification. This requires support
for the Idle and Ready states.
This patch implements CRB control area requests for goIdle and
cmdReady on FF-A based TPMs.
The FF-A message used to notify the TPM of CRB updates includes a
locality parameter, which provides a hint to the TPM about which
locality modified the CRB. This patch adds a locality parameter
to __crb_go_idle() and __crb_cmd_ready() to support this.
Paolo Bonzini [Sat, 18 Oct 2025 08:25:43 +0000 (10:25 +0200)]
Merge tag 'kvm-x86-fixes-6.18-rc2' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 6.18:
- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return -EAGAIN if userspace
deletes/moves memslot during prefault")
- Don't try to get PMU capabbilities from perf when running a CPU with hybrid
CPUs/PMUs, as perf will rightly WARN.
- Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
generic KVM_CAP_GUEST_MEMFD_FLAGS
- Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
said flag to initialize memory as SHARED, irrespective of MMAP. The
behavior merged in 6.18 is that enabling mmap() implicitly initializes
memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
as their memory is currently always initialized PRIVATE.
- Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
memory, to enable testing such setups, i.e. to hopefully flush out any
other lurking ABI issues before 6.18 is officially released.
- Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
and host userspace accesses to mmap()'d private memory.
Linus Torvalds [Fri, 17 Oct 2025 23:04:21 +0000 (13:04 -1000)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Explicitly encode the XZR register if the value passed to
write_sysreg_s() is 0.
The GIC CDEOI instruction is encoded as a system register write with
XZR as the source register. However, clang does not honour the "Z"
register constraint, leading to incorrect code generation
- Ensure the interrupts (DAIF.IF) are unmasked when completing
single-step of a suspended breakpoint before calling
exit_to_user_mode().
With pseudo-NMIs, interrupts are (additionally) masked at the PMR_EL1
register, handled by local_irq_*()
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: debug: always unmask interrupts in el0_softstp()
arm64/sysreg: Fix GIC CDEOI instruction encoding
- Avoid duplicate device probes by avoiding DT hardware probing when
ACPI is enabled in early boot
- Use the correct set of dependencies for
CONFIG_ARCH_HAS_ELF_CORE_EFLAGS, avoiding an allnoconfig warning
- Fix a few other minor issues
* tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: kprobes: convert one final __ASSEMBLY__ to __ASSEMBLER__
riscv: Respect dependencies of ARCH_HAS_ELF_CORE_EFLAGS
riscv: acpi: avoid errors caused by probing DT devices when ACPI is used
riscv: kprobes: Fix probe address validation
riscv: entry: fix typo in comment 'instruciton' -> 'instruction'
RISC-V: clear hot-unplugged cores from all task mm_cpumasks to avoid rfence errors
riscv: kgdb: Ensure that BUFMAX > NUMREGBYTES
rust: cfi: only 64-bit arm and x86 support CFI_CLANG
Brahmajit Das [Fri, 17 Oct 2025 17:15:51 +0000 (22:45 +0530)]
selftests/bpf: Fix redefinition of 'off' as different kind of symbol
This fixes the following build error
CLNG-BPF [test_progs] verifier_global_ptr_args.bpf.o
progs/verifier_global_ptr_args.c:228:5: error: redefinition of 'off' as
different kind of symbol
228 | u32 off;
| ^
The symbol 'off' was previously defined in
tools/testing/selftests/bpf/tools/include/vmlinux.h, which includes an
enum i40e_ptp_gpio_pin_state from
drivers/net/ethernet/intel/i40e/i40e_ptp.c:
This enum is included when CONFIG_I40E is enabled. As of commit 032676ff8217 ("LoongArch: Update Loongson-3 default config file"),
CONFIG_I40E is set in the defconfig, which leads to the conflict.
Renaming the local variable avoids the redefinition and allows the
build to succeed.
Suggested-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Brahmajit Das <listout@listout.xyz> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251017171551.53142-1-listout@listout.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sahil Chandna [Tue, 14 Oct 2025 18:56:35 +0000 (00:26 +0530)]
bpf: Do not disable preemption in bpf_test_run().
The timer mode is initialized to NO_PREEMPT mode by default,
this disables preemption and force execution in atomic context
causing issue on PREEMPT_RT configurations when invoking
spin_lock_bh(), leading to the following warning:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6107, name: syz.0.17
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
Preemption disabled at:
[<ffffffff891fce58>] bpf_test_timer_enter+0xf8/0x140 net/bpf/test_run.c:42
Fix this, by removing NO_PREEMPT/NO_MIGRATE mode check.
Also, the test timer context no longer needs explicit calls to
migrate_disable()/migrate_enable() with rcu_read_lock()/rcu_read_unlock().
Use helpers rcu_read_lock_dont_migrate() and rcu_read_unlock_migrate()
instead.
Reported-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1f1fbecb9413cdbfbef8 Suggested-by: Yonghong Song <yonghong.song@linux.dev> Suggested-by: Menglong Dong <menglong.dong@linux.dev> Acked-by: Yonghong Song <yonghong.song@linux.dev> Tested-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com Co-developed-by: Brahmajit Das <listout@listout.xyz> Signed-off-by: Brahmajit Das <listout@listout.xyz> Signed-off-by: Sahil Chandna <chandna.sahil@gmail.com> Link: https://lore.kernel.org/r/20251014185635.10300-1-chandna.sahil@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Ada Couprie Diaz [Tue, 14 Oct 2025 09:25:36 +0000 (10:25 +0100)]
arm64: debug: always unmask interrupts in el0_softstp()
We intend that EL0 exception handlers unmask all DAIF exceptions
before calling exit_to_user_mode().
When completing single-step of a suspended breakpoint, we do not call
local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(),
leaving all DAIF exceptions masked.
When pseudo-NMIs are not in use this is benign.
When pseudo-NMIs are in use, this is unsound. At this point interrupts
are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag
manipulation may not work correctly. For example, a subsequent
local_irq_enable() within exit_to_user_mode_loop() will only unmask
interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and
anything depending on interrupts being unmasked (e.g. delivery of
signals) will not work correctly.
This was detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING.
Move the call to `try_step_suspended_breakpoints()` outside of the check
so that interrupts can be unmasked even if we don't call the step handler.
Fixes: 0ac7584c08ce ("arm64: debug: split single stepping exception entry") Cc: <stable@vger.kernel.org> # 6.17 Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: added Mark's rewritten commit log and some whitespace] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The GIC CDEOI system instruction requires the Rt field to be set to 0b11111
otherwise the instruction behaviour becomes CONSTRAINED UNPREDICTABLE.
Currenly, its usage is encoded as a system register write, with a constant
0 value:
write_sysreg_s(0, GICV5_OP_GIC_CDEOI)
While compiling with GCC, the 0 constant value, through these asm
constraints and modifiers ('x' modifier and 'Z' constraint combo):
asm volatile(__msr_s(r, "%x0") : : "rZ" (__val));
forces the compiler to issue the XZR register for the MSR operation (ie
that corresponds to Rt == 0b11111) issuing the right instruction encoding.
Unfortunately LLVM does not yet understand that modifier/constraint
combo so it ends up issuing a different register from XZR for the MSR
source, which in turns means that it encodes the GIC CDEOI instruction
wrongly and the instruction behaviour becomes CONSTRAINED UNPREDICTABLE
that we must prevent.
Add a conditional to write_sysreg_s() macro that detects whether it
is passed a constant 0 value and issues an MSR write with XZR as source
register - explicitly doing what the asm modifier/constraint is meant to
achieve through constraints/modifiers, fixing the LLVM compilation issue.
Fixes: 7ec80fb3f025 ("irqchip/gic-v5: Add GICv5 PPI support") Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Cc: Sascha Bischoff <sascha.bischoff@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Fri, 17 Oct 2025 15:45:54 +0000 (08:45 -0700)]
Merge tag 'io_uring-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Revert of a change that went into an older kernel, and which has been
reported to cause a regression for some write workloads on LVM while
a snapshop is being created
- Fix a regression from this merge window, where some compilers (and/or
certain .config options) would cause an earlier evaluations of a
dereference which would then cause a NULL pointer dereference.
I was only able to reproduce this with OPTIMIZE_FOR_SIZE=y, but David
Howells hit it with just KASAN enabled. Depending on how things
inlined, this makes sense
- Fix for a missing lock around a mem region unregistration
- Fix for ring resizing with the same placement after resize
* tag 'io_uring-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/rw: check for NULL io_br_sel when putting a buffer
io_uring: fix unexpected placement on same size resizing
io_uring: protect mem region deregistration
Revert "io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()"
Linus Torvalds [Fri, 17 Oct 2025 15:31:26 +0000 (08:31 -0700)]
Merge tag 'block-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- iostats accounting fixed on multipath retries (Amit)
- secure concatenation response fixup (Martin)
- tls partial record fixup (Wilfred)
- Fix for a lockdep reported issue with the elevator lock and
blk group frozen operations
- Fix for a regression in this merge window, where updating
'nr_requests' would not do the right thing for queues with
shared tags
* tag 'block-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
nvme/tcp: handle tls partially sent records in write_space()
block: Remove elevator_lock usage from blkg_conf frozen operations
blk-mq: fix stale tag depth for shared sched tags in blk_mq_update_nr_requests()
nvme-auth: update sc_c in host response
nvme-multipath: Skip nr_active increments in RETRY disposition