Bjorn Helgaas [Wed, 3 Dec 2025 20:18:45 +0000 (14:18 -0600)]
Merge branch 'pci/pwrctrl-tc9563'
- Add a struct pci_ops.assert_perst() function pointer to assert/deassert
PCIe PERST# and implement it for the qcom driver (Krishna Chaitanya
Chundru)
- Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe switch,
which must be held in reset after poweron so the pwrctrl driver can
configure the switch via I2C before bringing up the links (Krishna
Chaitanya Chundru)
* pci/pwrctrl-tc9563:
PCI: pwrctrl: Add power control driver for TC9563
PCI: qcom: Implement .assert_perst()
PCI: dwc: Implement .assert_perst() for dwc glue drivers
PCI: Add .assert_perst() to control PCIe PERST#
dt-bindings: PCI: Add binding for Toshiba TC9563 PCIe switch
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:38 +0000 (14:18 -0600)]
Merge branch 'pci/controller/mediatek'
- Convert DT binding to YAML schema (Christian Marangi)
- Add Airoha AN7583 DT compatible and driver support (Christian Marangi)
* pci/controller/mediatek:
PCI: mediatek: Add support for Airoha AN7583 SoC
PCI: mediatek: Use generic MACRO for TPVPERL delay
PCI: mediatek: Convert bool to single quirks entry and bitmap
dt-bindings: PCI: mediatek: Add support for Airoha AN7583
dt-bindings: PCI: mediatek: Convert to YAML schema
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:38 +0000 (14:18 -0600)]
Merge branch 'pci/controller/keystone'
- Fail the probe instead of silently succeeding if ks_pcie_of_data
didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)
- Make keystone buildable as a loadable module, except on ARM32 where
hook_fault_code() is __init (Siddharth Vadapalli)
* pci/controller/keystone:
PCI: keystone: Add support to build as a loadable module
PCI: dwc: Export dw_pcie_allocate_domains() and dw_pcie_ep_raise_msix_irq()
PCI: Export pci_get_host_bridge_device() for use by pci-keystone
PCI: keystone: Exit ks_pcie_probe() for invalid mode
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:37 +0000 (14:18 -0600)]
Merge branch 'pci/controller/j721e'
- Use devm_clk_get_optional_enabled() instead of open-coding
devm_clk_get_optional() and clk_prepare_enable() (Anand Moon)
* pci/controller/j721e:
PCI: j721e: Use 'pcie->reset_gpio' directly and drop the local variable
PCI: j721e: Use devm_clk_get_optional_enabled() to get and enable the clock
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:35 +0000 (14:18 -0600)]
Merge branch 'pci/controller/dwc'
- Update PORT_LOGIC_LTSSM_STATE_MASK to be a 6-bit mask as per spec, not a
5-bit mask (Shawn Lin)
- Clear L1 PM Substate Capability 'Supported' bits unless glue driver says
it's supported, which prevents users from enabling non-working L1SS.
Currently only qcom and tegra194 support L1SS (Bjorn Helgaas)
- Remove now-superfluous L1SS disable code from tegra194 (Bjorn Helgaas)
- Configure L1SS support in dw-rockchip when DT says 'supports-clkreq'
(Shawn Lin)
* pci/controller/dwc:
PCI: dw-rockchip: Configure L1SS support
PCI: tegra194: Remove unnecessary L1SS disable code
PCI: dwc: Advertise L1 PM Substates only if driver requests it
PCI: dwc: Fix wrong PORT_LOGIC_LTSSM_STATE_MASK definition
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:35 +0000 (14:18 -0600)]
Merge branch 'pci/controller/brcmstb'
- Disable advertising ASPM L0s support correctly (Jim Quinlan)
- Add a panic/die handler to print diagnostic info in case PCIe caused an
unrecoverable abort (Jim Quinlan)
* pci/controller/brcmstb:
PCI: brcmstb: Add panic/die handler to driver
PCI: brcmstb: Add a way to indicate if PCIe bridge is active
PCI: brcmstb: Fix disabling L0s capability
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:32 +0000 (14:18 -0600)]
Merge branch 'pci/resource'
- Prevent resource tree corruption when BAR resize fails (Ilpo Järvinen)
- Restore BARs to the original size if a BAR resize fails (Ilpo Järvinen)
- Remove BAR release from BAR resize attempts by the xe, i915, and amdgpu
drivers so the PCI core can restore BARs if the resize fails (Ilpo
Järvinen)
- Move Resizable BAR code to rebar.c (Ilpo Järvinen)
- Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo Järvinen)
- Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo Järvinen)
* pci/resource:
PCI: Validate pci_rebar_size_supported() input
PCI: Convert BAR sizes bitmasks to u64
drm/amdgpu: Use pci_rebar_get_max_size()
drm/xe/vram: Use pci_rebar_get_max_size()
PCI: Add pci_rebar_get_max_size()
drm/xe/vram: Use PCI rebar helpers in resize_vram_bar()
drm/i915/gt: Use pci_rebar_size_supported()
PCI: Add pci_rebar_size_supported() helper
PCI: Improve Resizable BAR functions kernel doc
PCI: Move pci_rebar_size_to_bytes() and export it
PCI: Move pci_rebar_bytes_to_size() and clean it up
PCI: Move Resizable BAR code to rebar.c
PCI: Prevent restoring assigned resources
drm/amdgpu: Remove driver side BAR release before resize
drm/i915: Remove driver side BAR release before resize
drm/xe: Remove driver side BAR release before resize
PCI: Add kerneldoc for pci_resize_resource()
PCI: Fix restoring BARs on BAR resize rollback path
PCI: Free saved list without holding pci_bus_sem
PCI: Try BAR resize even when no window was released
PCI: Change pci_dev variable from 'bridge' to 'dev'
PCI/IOV: Adjust ->barsz[] when changing BAR size
PCI: Prevent resource tree corruption when BAR resize fails
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:31 +0000 (14:18 -0600)]
Merge branch 'pci/ptm'
- Enable PTM only if device advertises support for a relevant role, to
prevent invalid PTM Requests that cause ACS violations that are reported
as AER Uncorrectable Non-Fatal errors (Mika Westerberg)
* pci/ptm:
PCI/PTM: Enable only if device advertises relevant role
Bjorn Helgaas [Wed, 3 Dec 2025 20:18:31 +0000 (14:18 -0600)]
Merge branch 'pci/err'
- For drivers using PCI legacy suspend, save config state at suspend so
that state (not any earlier state from enumeration, probe, or error
recovery) will be restored when resuming (Lukas Wunner)
- For devices with no driver or a driver that lacks PM, save config state
at hibernate so that state (not any earlier state from enumeration,
probe, or error recovery) will be restored when resuming (Lukas Wunner)
- Save device config space on device addition, before driver binding, so
error recovery works more reliably (Lukas Wunner)
- Drop pci_save_state() from several drivers that no longer need it since
the PCI core always does it and pci_restore_state() no longer invalidates
the saved state (Lukas Wunner)
- Document use of pci_save_state() by drivers to capture the state they
want restored during error recovery (Lukas Wunner)
* pci/err:
Documentation: PCI: Amend error recovery doc with pci_save_state() rules
treewide: Drop pci_save_state() after pci_restore_state()
PCI/ERR: Ensure error recoverability at all times
PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
PCI: cadence: Add support for High Perf Architecture (HPA) controller
Add support for Cadence PCIe RP configuration for High Performance
Architecture (HPA) controllers. The Cadence High Performance controllers
are the latest PCIe controllers that have support for DMA, optional IDE
and updated register set. Add a common library for High Performance
Architecture (HPA) PCIe controllers.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: update to Ciprian Marian Costea per
https://lore.kernel.org/r/f38396c7-0605-4876-9ea6-0a179d6577c7@oss.nxp.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20251121164920.2008569-5-vincent.guittot@linaro.org
Claudiu Beznea [Wed, 19 Nov 2025 14:35:19 +0000 (16:35 +0200)]
PCI: Add Renesas RZ/G3S host controller driver
The Renesas RZ/G3S features a PCIe IP that complies with the PCI Express
Base Specification 4.0 and supports speeds of up to 5 GT/s. It functions
only as a root complex, with a single-lane (x1) configuration. The
controller includes Type 1 configuration registers, as well as IP
specific registers (called AXI registers) required for various adjustments.
Hardware manual can be downloaded from the address in the "Link" section.
The following steps should be followed to access the manual:
1/ Click the "User Manual" button
2/ Click "Confirm"; this will start downloading an archive
3/ Open the downloaded archive
4/ Navigate to r01uh1014ej*-rzg3s-users-manual-hardware -> Deliverables
5/ Open the file r01uh1014ej*-rzg3s.pdf
Marc Zyngier [Tue, 25 Nov 2025 10:27:26 +0000 (10:27 +0000)]
PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
Having the host bridge allocation inside pci_host_common_init() results
in a lot of complexity in the pcie-apple driver (the only direct user
of this function outside of core PCI code).
It forces the allocation of driver-specific tracking structures outside
of the bridge allocation, which in turn requires it to use inefficient
data structures to match the bridge and the private structure as needed.
Instead, let the bridge structure be passed to pci_host_common_init(),
allowing the driver to allocate it together with the private data,
as it is usually intended. The driver can then retrieve the bridge
via the owning device attached to the PCI config window structure.
This allows the pcie-apple driver to be significantly simplified.
Both core and driver code are changed in one go to avoid going via
a transitional interface.
Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Radu Rendec <rrendec@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Manivannan Sadhasivam <mani@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Krzysztof Wilczyński <kwilczynski@kernel.org> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/r/86jyzms036.wl-maz@kernel.org Link: https://patch.msgid.link/20251125102726.865617-1-maz@kernel.org
The PCIe IP available on the Renesas RZ/G3S complies with the PCI Express
Base Specification 4.0. It is designed for root complex applications and
features a single-lane (x1) implementation. Add binding documentation for
it.
The problem is this call tree, which uses the 'size' from the user to shift
in BIT() without validating it:
__resource_resize_store # takes 'buf' from user sysfs write
kstrtoul(buf, 0, &size) # converts to unsigned long
pci_resize_resource # truncates to int
pci_rebar_size_supported # BIT(size) without validation
There could be similar problems also with pci_resize_resource() parameter
values coming from drivers.
Add 'size' validation to pci_rebar_size_supported().
There seems to be no SZ_128T prior to this so add one to be able to specify
the largest size supported by the kernel (PCIe r7.0 spec already defines
sizes even beyond 128TB but kernel does not yet support them).
The issue looks older than the introduction of pci_rebar_size_supported()
by bb1fabd0d94e ("PCI: Add pci_rebar_size_supported() helper").
It would be also nice to convert 'size' unsigned too everywhere, maybe even
u8 but that is left as further work.
Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/aSA1WiRG3RuhqZMY@stanley.mountain/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: commit log, add report URL] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251124153740.2995-1-ilpo.jarvinen@linux.intel.com
Lukas Wunner [Fri, 21 Nov 2025 17:31:17 +0000 (18:31 +0100)]
Documentation: PCI: Amend error recovery doc with pci_save_state() rules
After recovering from a PCI error through reset, affected devices are in
D0_uninitialized state and need to be brought into D0_active state by
re-initializing their Config Space registers (PCIe r7.0 sec 5.3.1.1).
To facilitate that, the PCI core provides pci_restore_state() and
pci_save_state() helpers. Document rules governing their usage.
As Bjorn notes, so far no file in "Documentation/ includes anything about
the idea of a driver using pci_save_state() to capture the state it wants
to restore after an error", even though it is a common pattern in drivers.
So that's obviously a gap that should be closed.
Lukas Wunner [Sun, 12 Oct 2025 13:25:02 +0000 (15:25 +0200)]
treewide: Drop pci_save_state() after pci_restore_state()
In 2009, commit c82f63e411f1 ("PCI: check saved state before restore")
changed the behavior of pci_restore_state() such that it became necessary
to call pci_save_state() afterwards, lest recovery from subsequent PCI
errors fails.
The commit has just been reverted and so all the pci_save_state() after
pci_restore_state() calls that have accumulated in the tree are now
superfluous. Drop them.
Two drivers chose a different approach to achieve the same result:
drivers/scsi/ipr.c and drivers/net/ethernet/intel/e1000e/netdev.c set the
pci_dev's "state_saved" flag to true before calling pci_restore_state().
Drop this as well.
Lukas Wunner [Wed, 19 Nov 2025 08:50:03 +0000 (09:50 +0100)]
PCI/ERR: Ensure error recoverability at all times
When the PCI core gained power management support in 2002, it introduced
pci_save_state() and pci_restore_state() helpers to restore Config Space
after a D3hot or D3cold transition, which implies a Soft or Fundamental
Reset (PCIe r7.0 sec 5.8):
In 2006, EEH and AER were introduced to recover from errors by performing
a reset. Because errors can occur at any time, drivers began calling
pci_save_state() on probe to ensure recoverability.
In 2009, recoverability was foiled by commit c82f63e411f1 ("PCI: check
saved state before restore"): It amended pci_restore_state() to bail out
if the "state_saved" flag has been cleared. The flag is cleared by
pci_restore_state() itself, hence a saved state is now allowed to be
restored only once and is then invalidated. That doesn't seem to make
sense because the saved state should be good enough to be reused.
Soon after, drivers began to work around this behavior by calling
pci_save_state() immediately after pci_restore_state(), see e.g. commit b94f2d775a71 ("igb: call pci_save_state after pci_restore_state").
Hilariously, two drivers even set the "saved_state" flag to true before
invoking pci_restore_state(), see ipr_reset_restore_cfg_space() and
e1000_io_slot_reset().
Despite these workarounds, recoverability at all times is not guaranteed:
E.g. when a PCIe port goes through a runtime suspend and resume cycle,
the "saved_state" flag is cleared by:
... and hence on a subsequent AER event, the port's Config Space cannot be
restored. Riana reports a recovery failure of a GPU-integrated PCIe
switch and has root-caused it to the behavior of pci_restore_state().
Another workaround would be necessary, namely calling pci_save_state() in
pcie_port_device_runtime_resume().
The motivation of commit c82f63e411f1 was to prevent restoring state if
pci_save_state() hasn't been called before. But that can be achieved by
saving state already on device addition, after Config Space has been
initialized. A desirable side effect is that devices become recoverable
even if no driver gets bound. This renders the commit unnecessary, so
revert it.
Lukas Wunner [Wed, 19 Nov 2025 08:50:02 +0000 (09:50 +0100)]
PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
The state_saved flag tells the PCI core whether a driver assumes
responsibility to save Config Space and put the device into a low power
state on suspend.
The flag is currently initialized to false on enumeration, even though it
already is false (because struct pci_dev is zeroed by kzalloc()) and even
though it is set to false before commencing the suspend sequence (the only
code path where it's relevant).
The flag is also set to false in pci_pm_thaw(), i.e. on resume, when it's
no longer relevant.
Drop these two superfluous flag assignments for simplicity.
Lukas Wunner [Wed, 19 Nov 2025 08:50:01 +0000 (09:50 +0100)]
PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
When a PCI device is suspended, it is normally the PCI core's job to save
Config Space and put the device into a low power state. However drivers
are allowed to assume these responsibilities. When they do, the PCI core
can tell by looking at the state_saved flag in struct pci_dev: The flag
is cleared before commencing the suspend sequence and it is set when
pci_save_state() is called. If the PCI core finds the flag set late in
the suspend sequence, it refrains from calling pci_save_state() itself.
But there are two corner cases where the PCI core neglects to clear the
flag before commencing the suspend sequence:
* If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
to clear the flag. The (stale) flag is subsequently queried by
pci_legacy_suspend() itself and pci_legacy_suspend_late().
* If a device has no driver or its driver has no PCI PM callbacks,
pci_pm_freeze() neglects to clear the flag. The (stale) flag is
subsequently queried by pci_pm_freeze_noirq().
The flag may be set prior to suspend if the device went through error
recovery: Drivers commonly invoke pci_restore_state() + pci_save_state()
to restore Config Space after reset.
The flag may also be set if drivers call pci_save_state() on probe to
allow for recovery from subsequent errors.
The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq()
don't call pci_save_state() and so the state that will be restored on
resume is the one recorded on last error recovery or on probe, not the one
that the device had on suspend. If the two states happen to be identical,
there's no problem.
Reinstate clearing the flag in pci_legacy_suspend() and pci_pm_freeze().
The two functions used to do that until commit 4b77b0a2ba27 ("PCI: Clear
saved_state after the state has been restored") deemed it unnecessary
because it assumed that it's sufficient to clear the flag on resume in
pci_restore_state(). The commit seemingly did not take into account that
pci_save_state() and pci_restore_state() are not only used by power
management code, but also for error recovery.
Devices without driver or whose driver has no PCI PM callbacks may be in
runtime suspend when pci_pm_freeze() is called. Their state has already
been saved, so don't clear the flag to skip a pointless pci_save_state()
in pci_pm_freeze_noirq().
None of the drivers with legacy PCI PM callbacks seem to use runtime PM,
so clear the flag unconditionally in their case.
Shawn Lin [Tue, 18 Nov 2025 21:42:17 +0000 (15:42 -0600)]
PCI: dw-rockchip: Configure L1SS support
L1 PM Substates for RC mode require support in the dw-rockchip driver
including proper handling of the CLKREQ# sideband signal. It is mostly
handled by hardware, but software still needs to set the clkreq fields
in the PCIE_CLIENT_POWER_CON register to match the hardware implementation.
For more details, see section '18.6.6.4 L1 Substate' in the RK3568 TRM 1.1
Part 2, or section '11.6.6.4 L1 Substate' in the RK3588 TRM 1.0 Part2.
[bhelgaas: set pci->l1ss_support so DWC core preserves L1SS Capability bits;
drop corresponding code here, include updates from
https://lore.kernel.org/r/aRRG8wv13HxOCqgA@ryzen]
The DWC core clears the L1 Substates Supported bits unless the driver sets
the "dw_pcie.l1ss_support" flag.
The tegra194 init_host_aspm() sets "dw_pcie.l1ss_support" if the platform
has the "supports-clkreq" DT property. If "supports-clkreq" is absent,
"dw_pcie.l1ss_support" is not set, and the DWC core will clear the L1
Substates Supported bits.
The tegra194 code to clear the L1 Substates Supported bits is unnecessary,
so remove it.
Bjorn Helgaas [Tue, 18 Nov 2025 21:42:15 +0000 (15:42 -0600)]
PCI: dwc: Advertise L1 PM Substates only if driver requests it
L1 PM Substates require the CLKREQ# signal and may also require
device-specific support. If CLKREQ# is not supported or driver support is
lacking, enabling L1.1 or L1.2 may cause errors when accessing devices,
e.g.,
nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
If the kernel is built with CONFIG_PCIEASPM_POWER_SUPERSAVE=y or users
enable L1.x via sysfs, users may trip over these errors even if L1
Substates haven't been enabled by firmware or the driver.
To prevent such errors, disable advertising the L1 PM Substates unless the
driver sets "dw_pcie.l1ss_support" to indicate that it knows CLKREQ# is
present and any device-specific configuration has been done.
Set "dw_pcie.l1ss_support" in tegra194 (if DT includes the
"supports-clkreq' property) and qcom (for cfg_2_7_0, cfg_1_9_0, cfg_1_34_0,
and cfg_sc8280xp controllers) so they can continue to use L1 Substates.
Based on Niklas's patch:
https://patch.msgid.link/20251017163252.598812-2-cassel@kernel.org
TC9563 is a PCIe switch that has one upstream and three downstream ports.
One of the downstream ports is connected to an integrated ethernet MAC
endpoint. The other two downstream ports are available to connect to
external devices. One Host can connect to TC9563 by upstream port. The
TC9563 switch needs to be configured after powering on and before the PCIe
link is up.
The PCIe controller driver already enables link training at the host side
even before this driver probe happens. Due to this, when driver enables
power to the switch, it participates in link training and the PCIe link may
come up before configuring the switch through I2C. Once the link is up the
configuration done through I2C will not have any effect. To prevent the
host from participating in link training, disable link training on the host
side to ensure the link does not come up before the switch is configured
via I2C.
Based on DT property and type of the port, TC9563 is configured through
I2C.
Add support for assert_perst() for switches like TC9563, which require
configuration before the PCIe link is established. Such devices use this
function op to assert PERST# before configuring the device and once the
configuration is done they de-assert PERST#.
PCI: dwc: Implement .assert_perst() for dwc glue drivers
Add .assert_perst() hook for dwc glue drivers to register with
assert_perst() of pci ops, allowing for better control over the link
initialization and shutdown process.
Implement assert_perst() function op for dwc drivers.
Controller driver probes first, enables link training and scans the bus.
When the PCI bridge is found, its child DT nodes will be scanned and
pwrctrl devices will be created if needed. By the time pwrctrl driver probe
gets called, link training is already enabled by controller driver.
Certain devices like TC9563, which uses the PCI pwrctl framework, need to
configure the device before the PCIe link is up.
As the controller driver already enables link training as part of its
probe, the moment device is powered on, controller and device participate
in link training and link can come up immediately and may not have time to
configure the device.
So we need to stop the link training by using assert_perst() by asserting
PERST# and de-asserting PERST# after device is configured.
dt-bindings: PCI: Add binding for Toshiba TC9563 PCIe switch
Add a device tree binding for the Toshiba TC9563 PCIe switch, which
provides an Ethernet MAC integrated to the 3rd downstream port and
two downstream PCIe ports.
Christian Bruel [Fri, 14 Nov 2025 08:08:05 +0000 (09:08 +0100)]
PCI: stm32: Fix EP page_size alignment
pci_epc_mem_alloc_addr() allocates a CPU address from the ATU window phys
base and a page number. Set the ep->page_size so the resulting CPU address
is correctly aligned with the ATU required alignment.
Fixes: 151f3d29baf4 ("PCI: stm32-ep: Add PCIe Endpoint support for STM32MP25") Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
[mani: added fixes tag] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251114-atu_align_ep-v1-1-88da5366fa04@foss.st.com
Christian Bruel [Fri, 14 Nov 2025 07:45:52 +0000 (08:45 +0100)]
PCI: stm32: Fix LTSSM EP race with start link
If the host has deasserted PERST# and started link training before the link
is started on EP side, enabling LTSSM before the endpoint registers are
initialized in the perst_irq handler results in probing incorrect values.
Thus, wait for the PERST# level-triggered interrupt to start link training
at the end of initialization and cleanup the stm32_pcie_[start stop]_link
functions.
Alex Elder [Thu, 13 Nov 2025 21:45:37 +0000 (15:45 -0600)]
PCI: spacemit: Add SpacemiT PCIe host driver
Introduce a driver for the PCIe host controller found in the SpacemiT K1
SoC. The hardware is derived from the Synopsys DesignWare PCIe IP. The
driver supports up to three PCIe ports operating at PCIe link speed up to
5 GT/s. The first port uses a combo PHY, which may be configured for use
for USB3 instead.
Signed-off-by: Alex Elder <elder@riscstar.com>
[mani: added FIXME to the comment on disabling ASPM L1] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Jason Montleon <jmontleo@redhat.com> Tested-by: Johannes Erdfelt <johannes@erdfelt.com> Tested-by: Aurelien Jarno <aurelien@aurel32.net> Link: https://patch.msgid.link/20251113214540.2623070-6-elder@riscstar.com
Add the Devicetree binding for the PCIe Root Complex found on the SpacemiT
K1 SoC. This Root Complex is derived from the Synopsys Designware PCIe IP.
It supports up to three PCIe ports operating at PCIe link speed up to 5
GT/sec. One of the ports uses a combo PHY, which is typically used to
support a USB3 port.
Signed-off-by: Alex Elder <elder@riscstar.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Jason Montleon <jmontleo@redhat.com> Tested-by: Johannes Erdfelt <johannes@erdfelt.com> Tested-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20251113214540.2623070-4-elder@riscstar.com
dt-bindings: PCI: qcom,pcie-sm8550: Add missing required power-domains and resets
Commit b8d3404058a6 ("dt-bindings: PCI: qcom,pcie-sm8550: Move SM8550 to
dedicated schema") move the device schema to separate file, but it
missed a "if:not:...then:" clause in the original binding which was
requiring power-domains and resets for this particular chip.
dt-bindings: PCI: qcom,pcie-sm8450: Add missing required power-domains and resets
Commit 88c9b3af4e31 ("dt-bindings: PCI: qcom,pcie-sm8450: Move SM8450 to
dedicated schema") move the device schema to separate file, but it
missed a "if:not:...then:" clause in the original binding which was
requiring power-domains and resets for this particular chip.
dt-bindings: PCI: qcom,pcie-sm8350: Add missing required power-domains and resets
Commit 2278b8b54773 ("dt-bindings: PCI: qcom,pcie-sm8350: Move SM8350 to
dedicated schema") move the device schema to separate file, but it
missed a "if:not:...then:" clause in the original binding which was
requiring power-domains and resets for this particular chip.
dt-bindings: PCI: qcom,pcie-sm8250: Add missing required power-domains and resets
Commit 4891b66185c1 ("dt-bindings: PCI: qcom,pcie-sm8250: Move SM8250 to
dedicated schema") move the device schema to separate file, but it
missed a "if:not:...then:" clause in the original binding which was
requiring power-domains and resets for this particular chip.
dt-bindings: PCI: qcom,pcie-sm8150: Add missing required power-domains and resets
Commit 51bc04d5b49d ("dt-bindings: PCI: qcom,pcie-sm8150: Move SM8150 to
dedicated schema") move the device schema to separate file, but it
missed a "if:not:...then:" clause in the original binding which was
requiring power-domains and resets for this particular chip.
dt-bindings: PCI: qcom,pcie-sc8280xp: Add missing required power-domains and resets
Commit c007a5505504 ("dt-bindings: PCI: qcom,pcie-sc8280xp: Move
SC8280XP to dedicated schema") move the device schema to separate file,
but it missed a "if:not:...then:" clause in the original binding which
was requiring power-domains and resets for this particular chip.
dt-bindings: PCI: qcom,pcie-sc7280: Add missing required power-domains and resets
Commit 756485bfbb85 ("dt-bindings: PCI: qcom,pcie-sc7280: Move SC7280 to
dedicated schema") move the device schema to separate file, but it
missed a "if:not:...then:" clause in the original binding which was
requiring power-domains and resets for this particular chip.
dt-bindings: PCI: qcom,pcie-sa8775p: Add missing required power-domains and resets
Commit 544e8f96efc0 ("dt-bindings: PCI: qcom,pcie-sa8775p: Move SA8775p
to dedicated schema") move the device schema to separate file, but it
missed a "if:not:...then:" clause in the original binding which was
requiring power-domains and resets for this particular chip.
Ilpo Järvinen [Thu, 13 Nov 2025 18:00:53 +0000 (20:00 +0200)]
PCI: Convert BAR sizes bitmasks to u64
PCIe r7.0, sec 7.8.6, defines resizable BAR sizes beyond the currently
supported maximum of 128TB, which will require more than u32 to store the
entire bitmask.
Convert Resizable BAR related functions to use u64 bitmask for BAR sizes to
make the typing more future-proof.
The support for the larger BAR sizes themselves is not added at this point.
Ilpo Järvinen [Thu, 13 Nov 2025 18:00:48 +0000 (20:00 +0200)]
drm/i915/gt: Use pci_rebar_size_supported()
PCI core provides pci_rebar_size_supported() that helps in checking if an
encoded BAR Size is supported for the BAR or not. Use it in
i915_resize_lmem_bar() to simplify code.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/20251113180053.27944-7-ilpo.jarvinen@linux.intel.com
Ilpo Järvinen [Thu, 13 Nov 2025 18:00:47 +0000 (20:00 +0200)]
PCI: Add pci_rebar_size_supported() helper
Many callers of pci_rebar_get_possible_sizes() are interested in finding
out if a particular encoded BAR Size (PCIe r7.0, sec 7.8.6.3) is supported
by the particular BAR.
Add pci_rebar_size_supported() into PCI core to make it easy for the
drivers to determine if the BAR size is supported or not.
Use the new function in pci_resize_resource() and in
pci_iov_vf_bar_set_size().
Ilpo Järvinen [Thu, 13 Nov 2025 18:00:43 +0000 (20:00 +0200)]
PCI: Move Resizable BAR code to rebar.c
For lack of a better place to put it, Resizable BAR code has been placed
inside pci.c and setup-res.c that do not use it for anything. Upcoming
changes are going to add more Resizable BAR related functions, increasing
the code size.
As pci.c is huge as is, move the Resizable BAR related code and the BAR
resize code from setup-res.c to rebar.c.
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:28 +0000 (18:26 +0200)]
PCI: Prevent restoring assigned resources
restore_dev_resource() copies saved addresses and flags from the struct
pci_dev_resource back to the struct resource, typically, during rollback
from a failure or in preparation for a retry attempt.
If the resource is within resource tree, the resource must not be
modified as the resource tree could be corrupted. Thus, it's a bug to
call restore_dev_resource() for assigned resources (which did happen
due to logic flaws in the BAR resize rollback).
Add WARN_ON_ONCE() into restore_dev_resource() to detect such bugs easily
and return without altering the resource to prevent corruption.
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:27 +0000 (18:26 +0200)]
drm/amdgpu: Remove driver side BAR release before resize
PCI core handles releasing device's resources and their rollback in case of
failure of a BAR resizing operation. Releasing resource prior to calling
pci_resize_resource() prevents PCI core from restoring the BARs as they
were.
Remove driver-side release of BARs from the amdgpu driver.
Also remove the driver initiated assignment as pci_resize_resource() should
try to assign as much as possible. If the driver side call manages to get
more required resources assigned in some scenario, such a problem should be
fixed inside pci_resize_resource() instead.
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:26 +0000 (18:26 +0200)]
drm/i915: Remove driver side BAR release before resize
PCI core handles releasing device's resources and their rollback in case of
failure of a BAR resizing operation. Releasing resource prior to calling
pci_resize_resource() prevents PCI core from restoring the BARs as they
were.
Remove driver-side release of BARs from the i915 driver.
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:25 +0000 (18:26 +0200)]
drm/xe: Remove driver side BAR release before resize
PCI core handles releasing device's resources and their rollback in case of
failure of a BAR resizing operation. Releasing resource prior to calling
pci_resize_resource() prevents PCI core from restoring the BARs as they
were.
Remove driver-side release of BARs from the xe driver.
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:23 +0000 (18:26 +0200)]
PCI: Fix restoring BARs on BAR resize rollback path
BAR resize operation is implemented in the pci_resize_resource() and
pbus_reassign_bridge_resources() functions. pci_resize_resource() can be
called either from __resource_resize_store() from sysfs or directly by the
driver for the Endpoint Device.
The pci_resize_resource() requires that caller has released the device
resources that share the bridge window with the BAR to be resized as
otherwise the bridge window is pinned in place and cannot be changed.
pbus_reassign_bridge_resources() rolls back resources if the resize
operation fails, but rollback is performed only for the bridge windows.
Because releasing the device resources are done by the caller of the BAR
resize interface, these functions performing the BAR resize do not have
access to the device resources as they were before the resize.
pbus_reassign_bridge_resources() could try __pci_bridge_assign_resources()
after rolling back the bridge windows as they were, however, it will not
guarantee the resource are assigned due to differences in how FW and the
kernel assign the resources (alignment of the start address and tail).
To perform rollback robustly, the BAR resize interface has to be altered to
also release the device resources that share the bridge window with the BAR
to be resized.
Also, remove restoring from the entries failed list as saved list should
now contain both the bridge windows and device resources so the extra
restore is duplicated work.
Some drivers (currently only amdgpu) want to prevent releasing some
resources. Add exclude_bars param to pci_resize_resource() and make amdgpu
pass its register BAR (BAR 2 or 5), which should never be released during
resize operation. Normally 64-bit prefetchable resources do not share a
bridge window with the 32-bit only register BAR, but there are various
fallbacks in the resource assignment logic which may make the resources
share the bridge window in rare cases.
This change (together with the driver side changes) is to counter the
resource releases that had to be done to prevent resource tree corruption
in the ("PCI: Release assigned resource before restoring them") change. As
such, it likely restores functionality in cases where device resources were
released to avoid resource tree conflicts which appeared to be "working"
when such conflicts were not correctly detected by the kernel.
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:21 +0000 (18:26 +0200)]
PCI: Try BAR resize even when no window was released
Usually, resizing BARs requires releasing bridge windows in order to
resize it to fit a larger BAR into the window. This is not always the
case, however, FW could have made the window large enough to accommodate
larger BAR as is, or the user might prefer to shrink a BAR to make more
space for another Resizable BAR.
Thus, replace the check that requires that at least one bridge window
was released with a check that simply ensures bridge is not NULL.
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:20 +0000 (18:26 +0200)]
PCI: Change pci_dev variable from 'bridge' to 'dev'
Upcoming fix to BAR resize will store also device BAR resource in the
saved list. Change the pci_dev variable in the loop from 'bridge' to
'dev' as the former would be misleading with non-bridges in the list.
This is in a separate change to reduce churn in the upcoming BAR resize
fix.
While it appears that the logic in the loop doing pci_setup_bridge() is
altered as 'bridge' variable is no longer updated, a bridge should never
appear more than once in the saved list so the check can only match to the
first entry. As such, the code with two distinct pci_dev variables better
represents the intention of the check compared with the old code where
bridge variable was reused for a different purpose.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Link: https://patch.msgid.link/20251113162628.5946-4-ilpo.jarvinen@linux.intel.com
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:19 +0000 (18:26 +0200)]
PCI/IOV: Adjust ->barsz[] when changing BAR size
pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
struct pci_sriov keeps a cached copy of BAR size in ->barsz[] which is not
adjusted by pci_rebar_set_size() but by pci_iov_resource_set_size().
pci_iov_resource_set_size() is called also from
pci_resize_resource_set_size().
The current arrangement is problematic once BAR resize algorithm starts to
roll back changes properly in case of a failure. The normal resource
fitting algorithm rolls back resource size using the struct
pci_dev_resource easily but also calling pci_resize_resource_set_size() or
pci_iov_resource_set_size() to roll back BAR size would be an extra burden,
whereas combining ->barsz[] update with pci_rebar_set_size() naturally
rolls back it when restoring the old BAR size on a different layer of the
BAR resize operation.
Thus, rework pci_rebar_set_size() to also update ->barsz[].
Ilpo Järvinen [Thu, 13 Nov 2025 16:26:18 +0000 (18:26 +0200)]
PCI: Prevent resource tree corruption when BAR resize fails
pbus_reassign_bridge_resources() saves bridge windows into the saved
list before attempting to adjust resource assignments to perform a BAR
resize operation. If resource adjustments cannot be completed fully,
rollback is attempted by restoring the resource from the saved list.
The rollback, however, does not check whether the resources it restores were
assigned by the partial resize attempt. If restore changes addresses of the
resource, it can result in corrupting the resource tree.
An example of a corrupted resource tree with overlapping addresses:
A resource that are assigned into the resource tree must remain
unchanged. Thus, release such a resource before attempting to restore
and claim it back.
For simplicity, always do the release and claim back for the resource
even in the cases where it is restored to the same address range.
Note: this fix may "break" some cases where devices "worked" because
the resource tree corruption allowed address space double counting to
fit more resource than what can now be assigned without double
counting. The upcoming changes to BAR resizing should address those
scenarios (to the extent possible).
PCI: cadence: Move PCIe RP common functions to a separate file
Move the Cadence PCIe controller RP common functions into a separate file.
The common library functions are split from legacy PCIe RP controller
functions to a separate file.
Split the Cadence PCIe header file by moving the Legacy (LGA) controller
register definitions to a separate header file for support of next
generation PCIe controller architecture.
PCI: keystone: Add support to build as a loadable module
The 'pci-keystone.c' driver is the application/glue/wrapper driver for the
Designware PCIe Controllers on TI SoCs. Now that all of the helper APIs
that the 'pci-keystone.c' driver depends upon have been exported for use,
enable support to build the driver as a loadable module.
When building the driver as a module, the functions marked by the '__init'
keyword may be invoked after the init memory has been freed by the kernel.
This results will result in an exception of the form:
Unable to handle kernel paging request at virtual address ...
Mem abort info:
...
pc : ks_pcie_host_init+0x0/0x540
lr : dw_pcie_host_init+0x170/0x498
...
ks_pcie_host_init+0x0/0x540 (P)
ks_pcie_probe+0x728/0x84c
platform_probe+0x5c/0x98
really_probe+0xbc/0x29c
__driver_probe_device+0x78/0x12c
driver_probe_device+0xd8/0x15c
To address this, introduce a new function namely 'ks_pcie_init()' to
register the 'fault handler' while removing the '__init' keyword from
existing functions.
Note that hook_fault_code() is defined as '__init' function. Since the init
functions should never be called during runtime (after init memory freeing
stage), the driver is made as a built-in if CONFIG_ARM (where
hook_fault_code() is used) is selected.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
[mani: added a note about hook_fault_code()] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251029080547.1253757-5-s-vadapalli@ti.com
PCI: dwc: Export dw_pcie_allocate_domains() and dw_pcie_ep_raise_msix_irq()
The pci-keystone.c driver uses the functions 'dw_pcie_allocate_domains()'
and 'dw_pcie_ep_raise_msix_irq()'. Export them in preparation for enabling
the pci-keystone.c driver to be built as a loadable module.
PCI: Export pci_get_host_bridge_device() for use by pci-keystone
The pci-keystone.c driver uses the 'pci_get_host_bridge_device()' helper.
Export it in preparation for enabling the pci-keystone.c driver to be built
as a loadable module.
PCI: keystone: Exit ks_pcie_probe() for invalid mode
Commit under Fixes introduced support for PCIe EP mode on AM654x platforms.
When the mode happens to be either "DW_PCIE_RC_TYPE" or "DW_PCIE_EP_TYPE",
the PCIe Controller is configured accordingly. However, when the mode is
neither of them, an error message is displayed, but the driver probe
succeeds. Since this "invalid" mode is not associated with a functional
PCIe Controller, the probe should fail.
Fix the behavior by exiting "ks_pcie_probe()" with the return value of
"-EINVAL" in addition to displaying the existing error message when the
mode is invalid.
Fixes: 23284ad677a9 ("PCI: keystone: Add support for PCIe EP in AM654x Platforms") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251029080547.1253757-4-s-vadapalli@ti.com
Jim Quinlan [Wed, 29 Oct 2025 19:36:15 +0000 (15:36 -0400)]
PCI: brcmstb: Add panic/die handler to driver
Most PCIe HW returns 0xffffffff for failed reads on PCIe, but by default
Broadcom's STB PCIe controller effects an abort. Some SoCs -- 7216 and its
descendants -- have new HW that identifies error details.
Add a simple handler to print diagnostic info in case the PCIe controller
was the cause of the abort. Unfortunately, an abort still occurs.
Read the error registers only when the PCIe bridge is active and the PCIe
registers are accessible. Otherwise, a "die" event caused by something
other than PCIe could cause an abort if the PCIe "die" handler tried to
access registers when the bridge is off.
Example error output:
brcm-pcie 8b20000.pcie: Error: Mem Acc: 32bit, read, @0x38000000
brcm-pcie 8b20000.pcie: Type: TO=0 Abt=0 UnspReq=1 AccDsble=0 BadAddr=0
Jim Quinlan [Wed, 29 Oct 2025 19:36:14 +0000 (15:36 -0400)]
PCI: brcmstb: Add a way to indicate if PCIe bridge is active
In a future commit, a new handler will be introduced that in part does
reads and writes to some of the PCIe registers. When this handler is
invoked, it is paramount that it does not do these register accesses when
the PCIe bridge is inactive, as this will cause CPU abort errors.
To solve this we keep a spinlock that guards a variable which indicates
whether the bridge is on or off. When the bridge is on, access of the PCIe
HW registers may proceed.
Since there are multiple ways to reset the bridge, we introduce a general
function to obtain the spinlock, call the specific function that is used
for the specific SoC, sets the bridge active indicator variable, and
releases the spinlock.
Mika Westerberg [Wed, 12 Nov 2025 07:46:14 +0000 (08:46 +0100)]
PCI/PTM: Enable only if device advertises relevant role
We have a Switch Upstream Port (2b:00.0) that has a PTM Capability, but
doesn't advertise support for any PTM roles:
Capabilities: [220 v1] Precision Time Measurement
PTMCap: Requester- Responder- Root-
Linux enables PTM without looking into what roles it actually supports, and
apparently the Port immediately sends PTM Requests even though it doesn't
support the PTM Requester role. The messages include an invalid bus number,
so the Root Port detects an ACS Violation (see the PCIe r7.0, sec 6.12.1.1,
implementation note):
pci 0000:2b:00.0: [8086:5786] type 01 class 0x060400 PCIe Switch Upstream Port
pci 0000:2b:00.0: PTM enabled, 4ns granularity
pcieport 0000:00:07.1: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:07.1
pcieport 0000:00:07.1: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
pcieport 0000:00:07.1: device [8086:e44f] error status/mask=00200000/00000000
pcieport 0000:00:07.1: [21] ACSViol (First)
pcieport 0000:00:07.1: AER: TLP Header: 0x34000000 0x00000052 0x00000000 0x00000000
The TLP Header shows a 4 DW header, no data (001b) Msg with Local routing
(1 0100b) with Requester ID 0x0000 and PTM Request code (0x52).
Fix this by enabling PTM only if the following conditions are true (see sec
6.21.1 figure 6-21):
- Endpoint must advertise PTM Requester Capable
- Switch Upstream Port must advertise PTM Responder Capable
First of all, the driver was parsing the 'dbi' register region as 'elbi'.
This was due to DT mistakenly passing 'dbi' as 'elbi'. Since the DT is
now fixed to supply 'dbi' region, this driver can rely on the DWC core
driver to parse and map it.
However, to support the old DTs, if the 'elbi' region is found in DT, parse
and map the region as both 'dw_pcie::elbi_base' as 'dw_pcie::dbi_base'.
This will allow the driver to work with both broken and fixed DTs.
Also, skip parsing the 'elbi' region in DWC core if 'pci->elbi_base' was
already populated.
Fixes: 9c0ef6d34fdb ("PCI: amlogic: Add the Amlogic Meson PCIe controller driver") Fixes: c96992a24bec ("PCI: dwc: Add support for ELBI resource mapping") Reported-by: Linnaea Lavia <linnaea-von-lavia@live.com> Closes: https://lore.kernel.org/linux-pci/DM4PR05MB102707B8CDF84D776C39F22F2C7F0A@DM4PR05MB10270.namprd05.prod.outlook.com/ Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on Bananapi-M2S Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Cc: stable@vger.kernel.org # 6.2 Link: https://patch.msgid.link/20251101-pci-meson-fix-v1-3-c50dcc56ed6a@oss.qualcomm.com
dt-bindings: PCI: amlogic: Fix the register name of the DBI region
Binding incorrectly specifies the 'DBI' region as 'ELBI'. DBI is a must
have region for DWC controllers as it has the Root Port and controller
specific registers, while ELBI has optional registers.
Hence, fix the binding. Though this is an ABI break, this change is needed
to accurately describe the PCI memory map.