]> Gentwo Git Trees - linux/.git/log
linux/.git
10 months agoMerge branch into tip/master: 'x86/cpu'
Ingo Molnar [Mon, 24 Feb 2025 19:40:21 +0000 (20:40 +0100)]
Merge branch into tip/master: 'x86/cpu'

 # New commits in x86/cpu:
    43bb700cff6b ("x86/cpu: Update Intel Family comments")
    64aad4749d79 ("ACPI/processor_idle: Export acpi_processor_ffh_play_dead()")
    96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()")
    fc4ca9537bc4 ("intel_idle: Provide the default enter_dead() handler")
    541ddf31e300 ("ACPI/processor_idle: Add FFH state handling")
    a7dd183f0b38 ("x86/smp: Allow calling mwait_play_dead with an arbitrary hint")
    1e66d6cf888f ("x86/cpu: Fix #define name for Intel CPU model 0x5A")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'x86/core'
Ingo Molnar [Mon, 24 Feb 2025 19:40:21 +0000 (20:40 +0100)]
Merge branch into tip/master: 'x86/core'

 # New commits in x86/core:
    684b12916a10 ("x86/arch_prctl/64: Clean up ARCH_MAP_VDSO_32")
    2df1ad0d2580 ("x86/arch_prctl: Simplify sys_arch_prctl()")
    06dd759b68ee ("x86/module: Remove unnecessary check in module_finalize()")
    c305a4e98378 ("x86: Move sysctls into arch/x86")
    882b86fd4e0d ("x86/ibt: Handle FineIBT in handle_cfi_failure()")
    306859de59e5 ("x86/early_printk: Harden early_serial")
    c4239a72a29d ("x86/ibt: Clean up poison_endbr()")
    c20ad96c9a8f ("x86/traps: Cleanup and robustify decode_bug()")
    ab9fea59487d ("x86/alternative: Simplify callthunk patching")
    93f16a1ab78c ("x86/boot: Mark start_secondary() with __noendbr")
    582077c94052 ("x86/cfi: Clean up linkage")
    2981557cb040 ("x86,kcfi: Fix EXPORT_SYMBOL vs kCFI")
    72e213a7ccf9 ("x86/ibt: Clean up is_endbr()")
    675204778c69 ("module: don't annotate ROX memory as kmemleak_not_leak()")
    63887c9f0203 ("x86: Compare physical instead of virtual PGD addresses")
    3ef938c35035 ("x86/mm: Fix flush_tlb_range() when used for zapping normal PMDs")
    64f6a4e10c05 ("x86: re-enable EXECMEM_ROX support")
    602df3712979 ("module: drop unused module_writable_address()")
    1d7e707af446 ("Revert "x86/module: prepare module loading for ROX allocations of text"")
    c287c0723329 ("module: switch to execmem API for remapping as RW and restoring ROX")
    05e555b81726 ("execmem: add API for temporal remapping as RW and restoring ROX afterwards")
    925f42645118 ("execmem: don't remove ROX cache from the direct map")
    41d88484c71c ("x86/mm/pat: restore large ROX pages after fragmentation")
    4ee788eb0781 ("x86/mm/pat: drop duplicate variable in cpa_flush()")
    33ea120582a6 ("x86/mm/pat: cpa-test: fix length for CPA_ARRAY test")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'x86/cleanups'
Ingo Molnar [Mon, 24 Feb 2025 19:40:21 +0000 (20:40 +0100)]
Merge branch into tip/master: 'x86/cleanups'

 # New commits in x86/cleanups:
    51184c3c96a1 ("x86/usercopy: Fix kernel-doc func param name in clean_cache_range()'s description")
    0156338a18eb ("x86/apic: Use str_disabled_enabled() helper in print_ipi_mode()")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'x86/boot'
Ingo Molnar [Mon, 24 Feb 2025 19:40:20 +0000 (20:40 +0100)]
Merge branch into tip/master: 'x86/boot'

 # New commits in x86/boot:
    a2498e5c453b ("x86/kexec: Export e820_table_kexec[] to sysfs")
    5bebe2e4fe27 ("x86/boot: Change some static bootflag functions to bool")
    efe659ac0146 ("x86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code")
    d45dd0a9b27e ("x86/boot: Split parsing of boot_params into the parse_boot_params() helper function")
    297fb82ebaad ("x86/boot: Split kernel resources setup into the setup_kernel_resources() helper function")
    2d6bff31399b ("x86/boot: Move setting of memblock parameters to e820__memblock_setup()")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'x86/asm'
Ingo Molnar [Mon, 24 Feb 2025 19:40:20 +0000 (20:40 +0100)]
Merge branch into tip/master: 'x86/asm'

 # New commits in x86/asm:
    dc8bd769e70e ("x86/ioperm: Use atomic64_inc_return() in ksys_ioperm()")
    7861640aac52 ("x86/build: Raise the minimum LLVM version to 15.0.0")
    01157ddc58dc ("kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU")
    4b00c1160a13 ("percpu: Remove __per_cpu_load")
    e23cff686178 ("percpu: Remove PERCPU_VADDR()")
    95b091611810 ("percpu: Remove PER_CPU_FIRST_SECTION")
    38a4968b3190 ("x86/percpu/64: Remove INIT_PER_CPU macros")
    a8327be7b2aa ("x86/boot/64: Remove inverse relocations")
    b5c4f95351a0 ("x86/percpu/64: Remove fixed_percpu_data")
    9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
    80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
    78c4374ef8b8 ("x86/module: Deal with GOT based stack cookie load on Clang < 17")
    cb7927fda002 ("x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations")
    f58b63857ae3 ("x86/pvh: Use fixed_percpu_data for early boot GSBASE")
    a9a76b38aaf5 ("x86/boot: Disable stack protector for early boot code")
    0ee2689b9374 ("x86/stackprotector: Remove stack protector test scripts")
    a3e8fe814ad1 ("x86/build: Raise the minimum GCC version to 8.1")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'timers/vdso'
Ingo Molnar [Mon, 24 Feb 2025 19:40:19 +0000 (20:40 +0100)]
Merge branch into tip/master: 'timers/vdso'

 # New commits in timers/vdso:
    ac1a42f4e4e2 ("vdso: Remove remnants of architecture-specific time storage")
    998a8a260819 ("vdso: Remove remnants of architecture-specific random state storage")
    9729dceab17b ("x86/vdso/vdso2c: Remove page handling")
    dafde29605eb ("x86/vdso: Switch to generic storage implementation")
    223970df2bff ("powerpc/vdso: Switch to generic storage implementation")
    69896119dc9d ("MIPS: vdso: Switch to generic storage implementation")
    9bf39a65b20c ("s390/vdso: Switch to generic storage implementation")
    31e9fa2ba9ad ("arm: vdso: Switch to generic storage implementation")
    d2862bb9d9ca ("LoongArch: vDSO: Switch to generic storage implementation")
    46fe55b204bf ("riscv: vdso: Switch to generic storage implementation")
    0b3bc3354eb9 ("arm64: vdso: Switch to generic storage implementation")
    365841e1557a ("vdso: Add generic architecture-specific data storage")
    51d6ca373f45 ("vdso: Add generic random data storage")
    df7fcbefa710 ("vdso: Add generic time data storage")
    127b0e05c166 ("vdso: Rename included Makefile")
    5b47aba85810 ("vdso: Introduce vdso/align.h")
    30533a55ec8e ("parisc: Remove unused symbol vdso_data")
    3ef32d90cdaa ("x86/vdso: Fix latent bug in vclock_pages calculation")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'timers/core'
Ingo Molnar [Mon, 24 Feb 2025 19:40:19 +0000 (20:40 +0100)]
Merge branch into tip/master: 'timers/core'

 # New commits in timers/core:
    f99c5bb396b8 ("posix-timers: Invoke cond_resched() during exit_itimers()")
    4441b976dfef ("hrtimers: Replace hrtimer_clock_to_base_table with switch-case")
    2ea97b76d671 ("hrtimers: Make hrtimer_update_function() less expensive")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'timers/cleanups'
Ingo Molnar [Mon, 24 Feb 2025 19:40:19 +0000 (20:40 +0100)]
Merge branch into tip/master: 'timers/cleanups'

 # New commits in timers/cleanups:
    86a578e780a9 ("wifi: rt2x00: Switch to use hrtimer_update_function()")
    3f8d93d1371f ("io_uring: Use helper function hrtimer_update_function()")
    eee00df8e1f1 ("serial: xilinx_uartps: Use helper function hrtimer_update_function()")
    ce68de08a2cc ("ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()")
    bbdafde7c220 ("RDMA: Switch to use hrtimer_setup()")
    7b5edfd278b0 ("virtio: mem: Switch to use hrtimer_setup()")
    ff533f73d5c0 ("drm/vmwgfx: Switch to use hrtimer_setup()")
    397c07a3c90b ("drm/xe/oa: Switch to use hrtimer_setup()")
    c38e753abee2 ("drm/vkms: Switch to use hrtimer_setup()")
    58ac3c93306e ("drm/msm: Switch to use hrtimer_setup()")
    1a2ff5c3058d ("drm/i915/request: Switch to use hrtimer_setup()")
    f97e1d787f9f ("drm/i915/uncore: Switch to use hrtimer_setup()")
    82ad584eed8b ("drm/i915/pmu: Switch to use hrtimer_setup()")
    7358f053c4d6 ("drm/i915/perf: Switch to use hrtimer_setup()")
    9892287897ca ("drm/i915/gvt: Switch to use hrtimer_setup()")
    0592bb39e3a3 ("drm/i915/huc: Switch to use hrtimer_setup()")
    690d59fee83c ("drm/amdgpu: Switch to use hrtimer_setup()")
    c6be6eafd620 ("stm class: heartbeat: Switch to use hrtimer_setup()")
    f1061c1442c1 ("i2c: Switch to use hrtimer_setup()")
    c69da1735f19 ("iio: Switch to use hrtimer_setup()")
    a9d0ac739658 ("leds: trigger: pattern: Switch to use hrtimer_setup()")
    c158a29c5c5b ("mailbox: Switch to use hrtimer_setup()")
    0ebb5e74db09 ("media: Switch to use hrtimer_setup()")
    7f657ad09482 ("misc: vcpu_stall_detector: Switch to use hrtimer_setup()")
    3a1ed018e995 ("mmc: dw_mmc: Switch to use hrtimer_setup()")
    abeebe8889b7 ("ntb: ntb_pingpong: Switch to use hrtimer_setup()")
    5f8401cf7b3a ("drivers: perf: Switch to use hrtimer_setup()")
    563608c20403 ("power: reset: ltc2952-poweroff: Switch to use hrtimer_setup()")
    1b73fd14cfb4 ("power: supply: ab8500_chargalg: Switch to use hrtimer_setup()")
    d9a67240729d ("powercap: Switch to use hrtimer_setup()")
    5e55888e340a ("pps: generators: pps_gen_parport: Switch to use hrtimer_setup()")
    c92697913fdc ("rtc: class: Switch to use hrtimer_setup()")
    b7011929380d ("scsi: Switch to use hrtimer_setup()")
    0852ca41ce1c ("serial: xilinx_uartps: Switch to use hrtimer_setup()")
    4e1214969603 ("serial: sh-sci: Switch to use hrtimer_setup()")
    721c5bf65a1d ("serial: imx: Switch to use hrtimer_setup()")
    c5f0fa1622f6 ("serial: amba-pl011: Switch to use hrtimer_setup()")
    6bf9bb76b3af ("serial: 8250: Switch to use hrtimer_setup()")
    9fdf17c5aa2c ("usb: typec: tcpm: Switch to use hrtimer_setup()")
    8073d9dfe2ef ("usb: musb: cppi41: Switch to use hrtimer_setup()")
    da4f28741b90 ("usb: ehci: Switch to use hrtimer_setup()")
    060baec57cfe ("usb: gadget: Switch to use hrtimer_setup()")
    e0e59e95eb38 ("usb: fotg210-hcd: Switch to use hrtimer_setup()")
    4cf533bbdfab ("usb: dwc2: Switch to use hrtimer_setup()")
    a63cb05bd553 ("USB: chipidea: Switch to use hrtimer_setup()")
    1417c85d1625 ("xfrm: Switch to use hrtimer_setup()")
    7b449279f56a ("octeontx2-pf: Switch to use hrtimer_setup()")
    e26ad10db84b ("igc: Switch to use hrtimer_setup()")
    1528fd734e7b ("wifi: rt2x00: Switch to use hrtimer_setup()")
    cbe2691bee4e ("wifi: Switch to use hrtimer_setup()")
    d1ba57528f44 ("net/cdc_ncm: Switch to use hrtimer_setup()")
    d4bcc73352e4 ("net: wwan: iosm: Switch to use hrtimer_setup()")
    e193660f5e7f ("net: fec: Switch to use hrtimer_setup()")
    78afb7fa96ed ("net: stmmac: Switch to use hrtimer_setup()")
    3c85516612f8 ("net: qualcomm: rmnet: Switch to use hrtimer_setup()")
    4781599491bd ("net: mvpp2: Switch to use hrtimer_setup()")
    dbf13c4278a5 ("net: ieee802154: at86rf230: Switch to use hrtimer_setup()")
    7b63b1dc473e ("net: sparx5: Switch to use hrtimer_setup()")
    964177da435c ("net: ethernet: hisilicon: Switch to use hrtimer_setup()")
    66a3898a203d ("net: ethernet: ec_bhf: Switch to use hrtimer_setup()")
    f12185af60cb ("net: ethernet: cortina: Switch to use hrtimer_setup()")
    e9cc3a8936ee ("net: ethernet: ti: Switch to use hrtimer_setup()")
    806e32248e22 ("can: Switch to use hrtimer_setup()")
    881ec0c6db17 ("can: mcp251xfd: Switch to use hrtimer_setup()")
    e0eaefcd7e44 ("can: m_can: Switch to use hrtimer_setup()")
    553f9a8be728 ("tcp: Switch to use hrtimer_setup()")
    96b2fb3e6d14 ("mac802154: Switch to use hrtimer_setup()")
    efcb2d32a8f5 ("net/sched: Switch to use hrtimer_setup()")
    fe0b776543e9 ("netdev: Switch to use hrtimer_setup()")
    8030d4673e99 ("hwrng: timeriomem: Switch to use hrtimer_setup()")
    68d3de7fc49c ("null_blk: Switch to use hrtimer_setup()")
    4279d7054c87 ("PM / devfreq: rockchip-dfi: Switch to use hrtimer_setup()")
    efad91a9836e ("PM: runtime: Switch to use hrtimer_setup()")
    cab0e0a05627 ("blk_iocost: Switch to use hrtimer_setup()")
    32539b780c4f ("ata: pata_octeon_cf: Switch to use hrtimer_setup()")
    2414f15910c5 ("block, bfq: Switch to use hrtimer_setup()")
    19fec9c4434f ("tracing/osnoise: Switch to use hrtimer_setup()")
    d2254b064322 ("watchdog: Switch to use hrtimer_setup()")
    1654eba8f74d ("ubifs: Switch to use hrtimer_setup()")
    deacdc871b48 ("bpf: Switch to use hrtimer_setup()")
    f66b0acf394b ("time: Switch to hrtimer_setup()")
    9eeb54b47541 ("timerfd: Switch to use hrtimer_setup()")
    022a223546e4 ("perf: Switch to use hrtimer_setup()")
    91b7be704dd4 ("fork: Switch to use hrtimer_setup()")
    4248fd6f37c1 ("io_uring/timeout: Switch to use hrtimer_setup()")
    b09dffdeb369 ("lib: test_objpool: Switch to use hrtimer_setup()")
    53867760f50c ("mm/slab: Switch to use hrtimer_setup()")
    ee13da875b8a ("sched: Switch to use hrtimer_setup()")
    99fb79f6d6de ("s390/ap_bus: Switch to use hrtimer_setup()")
    c56c98e5af6d ("perf/x86: Switch to use hrtimer_setup()")
    d1f0d81b3604 ("powerpc/watchdog: Switch to use hrtimer_setup()")
    878a388866a6 ("ARM: 8611/1: l2x0: Switch to use hrtimer_setup()")
    2f33de836402 ("ARM: imx: Switch to use hrtimer_setup()")
    92051cb9d3e1 ("riscv: kvm: Switch to use hrtimer_setup()")
    7d6f12520bd4 ("LoongArch: KVM: Switch to use hrtimer_setup()")
    7e5fd922c146 ("KVM: arm64: Switch to use hrtimer_setup()")
    7764b9dd174c ("KVM: x86: Switch to use hrtimer_setup()")
    7ff22753d894 ("KVM: s390: Switch to use hrtimer_setup()")
    a0241210a3f3 ("KVM: PPC: Switch to use hrtimer_setup()")
    c97f85ddd60a ("KVM: MIPS: Switch to use hrtimer_setup()")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'sched/core'
Ingo Molnar [Mon, 24 Feb 2025 19:40:18 +0000 (20:40 +0100)]
Merge branch into tip/master: 'sched/core'

 # New commits in sched/core:
    3c27b40830ca ("selftests/rseq: Add rseq syscall errors test")
    1a5d3492f8e1 ("sched: Add unlikey branch hints to several system calls")
    b796ea848991 ("sched/core: Remove duplicate included header file stats.h")
    d90c9de9de2f ("x86/tsc: Always save/restore TSC sched_clock() on suspend/resume")
    d34e798094ca ("sched/fair: Refactor can_migrate_task() to elimate looping")
    563bc2161b94 ("sched/eevdf: Force propagating min_slice of cfs_rq when {en,de}queue tasks")
    b9f2b29b9494 ("sched: Don't define sched_clock_irqtime as static key")
    2ae891b82695 ("sched: Reduce the default slice to avoid tasks getting an extra tick")
    f553741ac8c0 ("sched: Cancel the slice protection of the idle entity")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'perf/core'
Ingo Molnar [Mon, 24 Feb 2025 19:40:18 +0000 (20:40 +0100)]
Merge branch into tip/master: 'perf/core'

 # New commits in perf/core:
    3acfcefa795c ("perf/x86/intel/bts: Allocate bts_ctx only if necessary")
    8aeacf257070 ("perf/core: Move perf_event sysctls into kernel/events")
    3201bfa368fe ("perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel")
    0b347a4218da ("perf/amd/ibs: Update DTLB/PageSize decode logic")
    d20610c19b4a ("perf/amd/ibs: Add support for OP Load Latency Filtering")
    1623ced247f7 ("x86/events/amd/iommu: Increase IOMMU_NAME_SIZE")
    e02e9b0374c3 ("perf/x86/intel: Support PEBS counters snapshotting")
    8ce939a0fa19 ("perf: Avoid the read if the count is already updated")
    f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
    314dfe105769 ("perf/x86/intel: Apply static call for drain_pebs")
    83179cd67846 ("uprobes: Remove the spinlock within handle_singlestep()")
    eae8a56ae0c7 ("uprobes: Remove redundant spinlock in uprobe_deny_signal()")
    fa5d0a824e3b ("perf/amd/ibs: Ceil sample_period to min_period")
    1afbdd970f50 ("perf/amd/ibs: Add ->check_period() callback")
    b2fc7b282bf7 ("perf/amd/ibs: Add PMU specific minimum period")
    e1e7844ced88 ("perf/amd/ibs: Don't allow freq mode event creation through ->config interface")
    46dcf8556617 ("perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt")
    598bdf4fefff ("perf/amd/ibs: Fix ->config to sample period calculation for OP PMU")
    88c7bcad71c8 ("perf/amd/ibs: Remove pointless sample period check")
    003c0414318a ("perf/amd/ibs: Remove IBS_{FETCH|OP}_CONFIG_MASK macros")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'locking/core'
Ingo Molnar [Mon, 24 Feb 2025 19:40:18 +0000 (20:40 +0100)]
Merge branch into tip/master: 'locking/core'

 # New commits in locking/core:
    337369f8ce9e ("locking/mutex: Add MUTEX_WARN_ON() into fast path")
    2d352ec9fcb5 ("x86/locking: Use asm_inline for {,try_}cmpxchg{64,128} emulations")
    4087e16b0331 ("x86/locking: Use ALT_OUTPUT_SP() for percpu_{,try_}cmpxchg{64,128}_op()")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'irq/drivers'
Ingo Molnar [Mon, 24 Feb 2025 19:40:17 +0000 (20:40 +0100)]
Merge branch into tip/master: 'irq/drivers'

 # New commits in irq/drivers:
    b956c9de9175 ("arm64: dts: rockchip: rk356x: Move PCIe MSI to use GIC ITS instead of MBI")
    f15be3d4a0a5 ("arm64: dts: rockchip: rk356x: Add MSI controller node")
    2d81e1bb6252 ("irqchip/gic-v3: Add Rockchip 3568002 erratum workaround")
    896f8e436f99 ("irqchip/riscv-imsic: Special handling for non-atomic device MSI update")
    0bd55080ba9e ("irqchip/riscv-imsic: Avoid interrupt translation in interrupt handler")
    51611130d57d ("irqchip/riscv-imsic: Implement irq_force_complete_move() for IMSIC")
    0f67911e821c ("irqchip/riscv-imsic: Separate next and previous pointers in IMSIC vector")
    58d868b67a9a ("RISC-V: Select CONFIG_GENERIC_PENDING_IRQ")
    e54b1b5e89ae ("genirq: Introduce irq_can_move_in_process_context()")
    751dc837dabd ("genirq: Introduce common irq_force_complete_move() implementation")
    fe35ecee8ec8 ("irqchip/riscv-imsic: Move to common MSI library")
    1c000dcaad2b ("irqchip/irq-msi-lib: Optionally set default irq_eoi()/irq_ack()")
    999f458c1771 ("irqchip/riscv-imsic: Set irq_set_affinity() for IMSIC base")
    0699e578e279 ("irqchip/renesas-rzg2l: Simplify checks in rzg2l_irqc_common_init()")
    4bd0317ce63c ("irqchip/renesas-rzg2l: Switch to using dev_err_probe()")
    bec8a3712943 ("irqchip/renesas-rzg2l: Remove pm_put label")
    7de11369ef30 ("irqchip/renesas-rzg2l: Use devm_pm_runtime_enable()")
    78f384dad082 ("irqchip/renesas-rzg2l: Use devm_reset_control_get_exclusive_deasserted()")
    dd4e17c30944 ("irqchip/renesas-rzg2l: Use local dev pointer in rzg2l_irqc_common_init()")
    b93afe8a3ac5 ("irqchip/riscv-aplic: Add support for hart indexes")
    c057b6e42135 ("dt-bindings: interrupt-controller: Add risc-v,aplic hart indexes")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'perf/urgent'
Ingo Molnar [Mon, 24 Feb 2025 19:40:17 +0000 (20:40 +0100)]
Merge branch into tip/master: 'perf/urgent'

 # New commits in perf/urgent:
    bddf10d26e6e ("uprobes: Reject the shared zeropage in uprobe_write_opcode()")
    2016066c6619 ("perf/core: Order the PMU list to fix warning about unordered pmu_ctx_list")
    0fe8813baf4b ("perf/core: Add RCU read lock protection to perf_iterate_ctx()")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agoMerge branch into tip/master: 'locking/urgent'
Ingo Molnar [Mon, 24 Feb 2025 19:40:17 +0000 (20:40 +0100)]
Merge branch into tip/master: 'locking/urgent'

 # New commits in locking/urgent:
    b9a49520679e ("rcuref: Plug slowpath race in rcuref_put()")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agouprobes: Reject the shared zeropage in uprobe_write_opcode()
Tong Tiangen [Mon, 24 Feb 2025 03:11:49 +0000 (11:11 +0800)]
uprobes: Reject the shared zeropage in uprobe_write_opcode()

We triggered the following crash in syzkaller tests:

  BUG: Bad page state in process syz.7.38  pfn:1eff3
  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1eff3
  flags: 0x3fffff00004004(referenced|reserved|node=0|zone=1|lastcpupid=0x1fffff)
  raw: 003fffff00004004 ffffe6c6c07bfcc8 ffffe6c6c07bfcc8 0000000000000000
  raw: 0000000000000000 0000000000000000 00000000fffffffe 0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x32/0x50
   bad_page+0x69/0xf0
   free_unref_page_prepare+0x401/0x500
   free_unref_page+0x6d/0x1b0
   uprobe_write_opcode+0x460/0x8e0
   install_breakpoint.part.0+0x51/0x80
   register_for_each_vma+0x1d9/0x2b0
   __uprobe_register+0x245/0x300
   bpf_uprobe_multi_link_attach+0x29b/0x4f0
   link_create+0x1e2/0x280
   __sys_bpf+0x75f/0xac0
   __x64_sys_bpf+0x1a/0x30
   do_syscall_64+0x56/0x100
   entry_SYSCALL_64_after_hwframe+0x78/0xe2

   BUG: Bad rss-counter state mm:00000000452453e0 type:MM_FILEPAGES val:-1

The following syzkaller test case can be used to reproduce:

  r2 = creat(&(0x7f0000000000)='./file0\x00', 0x8)
  write$nbd(r2, &(0x7f0000000580)=ANY=[], 0x10)
  r4 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x42, 0x0)
  mmap$IORING_OFF_SQ_RING(&(0x7f0000ffd000/0x3000)=nil, 0x3000, 0x0, 0x12, r4, 0x0)
  r5 = userfaultfd(0x80801)
  ioctl$UFFDIO_API(r5, 0xc018aa3f, &(0x7f0000000040)={0xaa, 0x20})
  r6 = userfaultfd(0x80801)
  ioctl$UFFDIO_API(r6, 0xc018aa3f, &(0x7f0000000140))
  ioctl$UFFDIO_REGISTER(r6, 0xc020aa00, &(0x7f0000000100)={{&(0x7f0000ffc000/0x4000)=nil, 0x4000}, 0x2})
  ioctl$UFFDIO_ZEROPAGE(r5, 0xc020aa04, &(0x7f0000000000)={{&(0x7f0000ffd000/0x1000)=nil, 0x1000}})
  r7 = bpf$PROG_LOAD(0x5, &(0x7f0000000140)={0x2, 0x3, &(0x7f0000000200)=ANY=[@ANYBLOB="1800000000120000000000000000000095"], &(0x7f0000000000)='GPL\x00', 0x7, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, @fallback=0x30, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10, 0x0, @void, @value}, 0x94)
  bpf$BPF_LINK_CREATE_XDP(0x1c, &(0x7f0000000040)={r7, 0x0, 0x30, 0x1e, @val=@uprobe_multi={&(0x7f0000000080)='./file0\x00', &(0x7f0000000100)=[0x2], 0x0, 0x0, 0x1}}, 0x40)

The cause is that zero pfn is set to the PTE without increasing the RSS
count in mfill_atomic_pte_zeropage() and the refcount of zero folio does
not increase accordingly. Then, the operation on the same pfn is performed
in uprobe_write_opcode()->__replace_page() to unconditional decrease the
RSS count and old_folio's refcount.

Therefore, two bugs are introduced:

 1. The RSS count is incorrect, when process exit, the check_mm() report
    error "Bad rss-count".

 2. The reserved folio (zero folio) is freed when folio->refcount is zero,
    then free_pages_prepare->free_page_is_bad() report error
    "Bad page state".

There is more, the following warning could also theoretically be triggered:

  __replace_page()
    -> ...
      -> folio_remove_rmap_pte()
        -> VM_WARN_ON_FOLIO(is_zero_folio(folio), folio)

Considering that uprobe hit on the zero folio is a very rare case, just
reject zero old folio immediately after get_user_page_vma_remote().

[ mingo: Cleaned up the changelog ]

Fixes: 7396fa818d62 ("uprobes/core: Make background page replacement logic account for rss_stat counters")
Fixes: 2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints")
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250224031149.1598949-1-tongtiangen@huawei.com
10 months agoperf/core: Order the PMU list to fix warning about unordered pmu_ctx_list
Luo Gengkun [Wed, 22 Jan 2025 07:33:56 +0000 (07:33 +0000)]
perf/core: Order the PMU list to fix warning about unordered pmu_ctx_list

Syskaller triggers a warning due to prev_epc->pmu != next_epc->pmu in
perf_event_swap_task_ctx_data(). vmcore shows that two lists have the same
perf_event_pmu_context, but not in the same order.

The problem is that the order of pmu_ctx_list for the parent is impacted by
the time when an event/PMU is added. While the order for a child is
impacted by the event order in the pinned_groups and flexible_groups. So
the order of pmu_ctx_list in the parent and child may be different.

To fix this problem, insert the perf_event_pmu_context to its proper place
after iteration of the pmu_ctx_list.

The follow testcase can trigger above warning:

 # perf record -e cycles --call-graph lbr -- taskset -c 3 ./a.out &
 # perf stat -e cpu-clock,cs -p xxx // xxx is the pid of a.out

 test.c

 void main() {
        int count = 0;
        pid_t pid;

        printf("%d running\n", getpid());
        sleep(30);
        printf("running\n");

        pid = fork();
        if (pid == -1) {
                printf("fork error\n");
                return;
        }
        if (pid == 0) {
                while (1) {
                        count++;
                }
        } else {
                while (1) {
                        count++;
                }
        }
 }

The testcase first opens an LBR event, so it will allocate task_ctx_data,
and then open tracepoint and software events, so the parent context will
have 3 different perf_event_pmu_contexts. On inheritance, child ctx will
insert the perf_event_pmu_context in another order and the warning will
trigger.

[ mingo: Tidied up the changelog. ]

Fixes: bd2756811766 ("perf: Rewrite core context handling")
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250122073356.1824736-1-luogengkun@huaweicloud.com
10 months agoperf/core: Add RCU read lock protection to perf_iterate_ctx()
Breno Leitao [Fri, 17 Jan 2025 14:41:07 +0000 (06:41 -0800)]
perf/core: Add RCU read lock protection to perf_iterate_ctx()

The perf_iterate_ctx() function performs RCU list traversal but
currently lacks RCU read lock protection. This causes lockdep warnings
when running perf probe with unshare(1) under CONFIG_PROVE_RCU_LIST=y:

WARNING: suspicious RCU usage
kernel/events/core.c:8168 RCU-list traversed in non-reader section!!

 Call Trace:
  lockdep_rcu_suspicious
  ? perf_event_addr_filters_apply
  perf_iterate_ctx
  perf_event_exec
  begin_new_exec
  ? load_elf_phdrs
  load_elf_binary
  ? lock_acquire
  ? find_held_lock
  ? bprm_execve
  bprm_execve
  do_execveat_common.isra.0
  __x64_sys_execve
  do_syscall_64
  entry_SYSCALL_64_after_hwframe

This protection was previously present but was removed in commit
bd2756811766 ("perf: Rewrite core context handling"). Add back the
necessary rcu_read_lock()/rcu_read_unlock() pair around
perf_iterate_ctx() call in perf_event_exec().

[ mingo: Use scoped_guard() as suggested by Peter ]

Fixes: bd2756811766 ("perf: Rewrite core context handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250117-fix_perf_rcu-v1-1-13cb9210fc6a@debian.org
10 months agoLinux 6.14-rc4 linux-next/stable v6.14-rc4
Linus Torvalds [Sun, 23 Feb 2025 20:32:57 +0000 (12:32 -0800)]
Linux 6.14-rc4

10 months agoMerge tag 'i2c-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 23 Feb 2025 18:37:18 +0000 (10:37 -0800)]
Merge tag 'i2c-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fix from Wolfram Sang:
 "Revert one cleanup which turned out to eat too much stack space"

* tag 'i2c-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: core: Allocate temporary client dynamically

10 months agox86/ioperm: Use atomic64_inc_return() in ksys_ioperm()
Uros Bizjak [Sun, 23 Feb 2025 16:13:38 +0000 (17:13 +0100)]
x86/ioperm: Use atomic64_inc_return() in ksys_ioperm()

Use atomic64_inc_return(&ref) instead of atomic64_add_return(1, &ref)
to use optimized implementation on targets that define
atomic_inc_return() and to remove now unneeded initialization of the
%eax/%edx register pair before the call to atomic64_inc_return().

On x86_32 the code improves from:

 1b0:    b9 00 00 00 00           mov    $0x0,%ecx
            1b1: R_386_32    .bss
 1b5:    89 43 0c                 mov    %eax,0xc(%ebx)
 1b8:    31 d2                    xor    %edx,%edx
 1ba:    b8 01 00 00 00           mov    $0x1,%eax
 1bf:    e8 fc ff ff ff           call   1c0 <ksys_ioperm+0xa8>
            1c0: R_386_PC32    atomic64_add_return_cx8
 1c4:    89 03                    mov    %eax,(%ebx)
 1c6:    89 53 04                 mov    %edx,0x4(%ebx)

to:

 1b0:    be 00 00 00 00           mov    $0x0,%esi
            1b1: R_386_32    .bss
 1b5:    89 43 0c                 mov    %eax,0xc(%ebx)
 1b8:    e8 fc ff ff ff           call   1b9 <ksys_ioperm+0xa1>
            1b9: R_386_PC32    atomic64_inc_return_cx8
 1bd:    89 03                    mov    %eax,(%ebx)
 1bf:    89 53 04                 mov    %edx,0x4(%ebx)

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250223161355.3607-1-ubizjak@gmail.com
10 months agoMerge tag 'edac_urgent_for_v6.14_rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 23 Feb 2025 17:50:57 +0000 (09:50 -0800)]
Merge tag 'edac_urgent_for_v6.14_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fix from Borislav Petkov:

 - Have qcom_edac use the correct interrupt enable register to configure
   the RAS interrupt lines

* tag 'edac_urgent_for_v6.14_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/qcom: Correct interrupt enable register configuration

10 months agox86/usercopy: Fix kernel-doc func param name in clean_cache_range()'s description
Randy Dunlap [Sat, 11 Jan 2025 06:33:33 +0000 (22:33 -0800)]
x86/usercopy: Fix kernel-doc func param name in clean_cache_range()'s description

Use @addr instead of @vaddr in the kernel-doc comment for
clean_cache_range() to eliminate warnings:

  arch/x86/lib/usercopy_64.c:29: warning: Function parameter or struct member 'addr' not described in 'clean_cache_range'
  arch/x86/lib/usercopy_64.c:29: warning: Excess function parameter 'vaddr' description in 'clean_cache_range'

Fixes: 0aed55af8834 ("x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250111063333.911084-1-rdunlap@infradead.org
10 months agoMerge tag 'v6.14-rc3-smb3-client-fix-part2' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 23 Feb 2025 01:32:00 +0000 (17:32 -0800)]
Merge tag 'v6.14-rc3-smb3-client-fix-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fix from Steve French:

 - Fix potential null pointer dereference

* tag 'v6.14-rc3-smb3-client-fix-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: Add check for next_buffer in receive_encrypted_standard()

10 months agoMerge tag 'x86-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 22 Feb 2025 18:45:02 +0000 (10:45 -0800)]
Merge tag 'x86-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - Fix AVX-VNNI CPU feature dependency bug triggered via the 'noxsave'
   boot option

 - Fix typos in the SVA documentation

 - Add Tony Luck as RDT co-maintainer and remove Fenghua Yu

* tag 'x86-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  docs: arch/x86/sva: Fix two grammar errors under Background and FAQ
  x86/cpufeatures: Make AVX-VNNI depend on AVX
  MAINTAINERS: Change maintainer for RDT

10 months agoMerge tag 'sched-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 22 Feb 2025 17:30:04 +0000 (09:30 -0800)]
Merge tag 'sched-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull rseq fixes from Ingo Molnar:

 - Fix overly spread-out RSEQ concurrency ID allocation pattern that
   regressed certain workloads

 - Fix RSEQ registration syscall behavior on -EFAULT errors when
   CONFIG_DEBUG_RSEQ=y (This debug option is disabled on most
   distributions)

* tag 'sched-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Fix rseq registration with CONFIG_DEBUG_RSEQ
  sched: Compact RSEQ concurrency IDs with reduced threads and affinity

10 months agoMerge tag 'perf-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 22 Feb 2025 17:26:12 +0000 (09:26 -0800)]
Merge tag 'perf-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fixes from Ingo Molnar:
 "Fix x86 Intel Lion Cove CPU event constraints, and fix uprobes
  debug/error printk output pointer-value verbosity"

* tag 'perf-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix event constraints for LNC
  uprobes: Don't use %pK through printk

10 months agoMerge tag 'irq-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 22 Feb 2025 17:20:43 +0000 (09:20 -0800)]
Merge tag 'irq-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
 "Fix miscellaneous irqchip bugs"

* tag 'irq-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/qcom-pdc: Workaround hardware register bug on X1E80100
  irqchip/jcore-aic, clocksource/drivers/jcore: Fix jcore-pit interrupt request
  irqchip/gic-v3: Fix rk3399 workaround when secure interrupts are enabled

10 months agoMerge tag 's390-6.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sat, 22 Feb 2025 17:09:33 +0000 (09:09 -0800)]
Merge tag 's390-6.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Fix inline asm constraint in cmma_test_essa() to avoid potential ESSA
   detection miscompilation

 - Fix build failure with CONFIG_GENDWARFKSYMS by disabling purgatory
   symbol exports with -D__DISABLE_EXPORTS

 - Update defconfigs

* tag 's390-6.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/boot: Fix ESSA detection
  s390/purgatory: Use -D__DISABLE_EXPORTS
  s390: Update defconfigs

10 months agoMerge tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sat, 22 Feb 2025 17:03:54 +0000 (09:03 -0800)]
Merge tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Function graph accounting fixes:

   - Fix the manage ops hashes

     The function graph registers a "manager ops" and "sub-ops" to
     ftrace. The manager ops does not have any callback but calls the
     sub-ops callbacks. The manage ops hashes (what is used to tell
     ftrace what functions to attach to) is built on the sub-ops it
     manages.

     There was an error in the way it built the hash. An empty hash
     means to attach to all functions. When the manager ops had one
     sub-ops it properly copied its hash. But when the manager ops had
     more than one sub-ops, it went into a loop to make a set of all
     functions it needed to add to the hash. If any of the subops hashes
     was empty, that would mean to attach to all functions. The error
     was that the first iteration of the loop passed in an empty hash to
     start with in order to add the other hashes. That starting hash was
     mistaken as to attach to all functions. This made the manage ops
     attach to all functions whenever it had two or more sub-ops, even
     if each sub-op was attached to only a single function.

   - Do not add duplicate entries to the manager ops hash

     If two or more subops hashes trace the same function, an entry for
     that function will be added to the manager ops for each subops.
     This causes waste and extra overhead.

  Fprobe accounting fixes:

   - Remove last function from fprobe hash

     Fprobes has a ftrace hash to manage which functions an fprobe is
     attached to. It also has a counter of how many fprobes are
     attached. When the last fprobe is removed, it unregisters the
     fprobe from ftrace but does not remove the functions the last
     fprobe was attached to from the hash. This leaves the old functions
     attached. When a new fprobe is added, the fprobe infrastructure
     attaches to not only the functions of the new fprobe, but also to
     the functions of the last fprobe.

   - Fix accounting of the fprobe counter

     When a fprobe is added, it updates a counter. If the counter goes
     from zero to one, it attaches its ops to ftrace. When an fprobe is
     removed, the counter is decremented. If the counter goes from 1 to
     zero, it removes the fprobes ops from ftrace.

     There was an issue where if two fprobes trace the same function,
     the addition of each fprobe would increment the counter. But when
     removing the first of the fprobes, it would notice that another
     fprobe is still attached to one of its functions no it does not
     remove the functions from the ftrace ops.

     But it also did not decrement the counter, so when the last fprobe
     is removed, the counter is still one. This leaves the fprobes
     callback still registered with ftrace and it being called by the
     functions defined by the fprobes ops hash. Worse yet, because all
     the functions from the fprobe ops hash have been removed, that
     tells ftrace that it wants to trace all functions.

     Thus, this puts the state of the system where every function is
     calling the fprobe callback handler (which does nothing as there
     are no registered fprobes), but this causes a good 13% slow down of
     the entire system.

  Other updates:

   - Add a selftest to test the above issues to prevent regressions.

   - Fix preempt count accounting in function tracing

     Better recursion protection was added to function tracing which
     added another layer of preempt disable. As the preempt_count gets
     traced in the event, it needs to subtract the amount of preempt
     disabling the tracer does to record what the preempt_count was when
     the trace was triggered.

   - Fix memory leak in output of set_event

     A variable is passed by the seq_file functions in the location that
     is set by the return of the next() function. The start() function
     allocates it and the stop() function frees it. But when the last
     item is found, the next() returns NULL which leaks the data that
     was allocated in start(). The m->private is used for something
     else, so have next() free the data when it returns NULL, as stop()
     will then just receive NULL in that case"

* tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix memory leak when reading set_event file
  ftrace: Correct preemption accounting for function tracing.
  selftests/ftrace: Update fprobe test to check enabled_functions file
  fprobe: Fix accounting of when to unregister from function graph
  fprobe: Always unregister fgraph function from ops
  ftrace: Do not add duplicate entries in subops manager ops
  ftrace: Fix accounting of adding subops to a manager ops

10 months agoselftests/rseq: Add rseq syscall errors test
Michael Jeanson [Tue, 21 Jan 2025 21:33:52 +0000 (16:33 -0500)]
selftests/rseq: Add rseq syscall errors test

This test adds coverage of expected errors during rseq registration and
unregistration, it disables glibc integration and will thus always
exercise the rseq syscall explictly.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250121213402.1754762-1-mjeanson@efficios.com
10 months agoperf/x86/intel/bts: Allocate bts_ctx only if necessary
Li RongQing [Wed, 22 Jan 2025 07:41:03 +0000 (15:41 +0800)]
perf/x86/intel/bts: Allocate bts_ctx only if necessary

Avoid unnecessary per-CPU memory allocation on unsupported CPUs,
this can save 12K memory for each CPU

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250122074103.3091-1-lirongqing@baidu.com
10 months agox86/cpu: Update Intel Family comments
Peter Zijlstra [Mon, 27 Jan 2025 16:22:52 +0000 (17:22 +0100)]
x86/cpu: Update Intel Family comments

Because who can ever remember all these names.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250127162252.GK16742@noisy.programming.kicks-ass.net
10 months agox86/kexec: Export e820_table_kexec[] to sysfs
Dave Young [Tue, 28 Jan 2025 13:32:31 +0000 (21:32 +0800)]
x86/kexec: Export e820_table_kexec[] to sysfs

Previously the e820_table_kexec[] was exported to sysfs since kexec-tools uses
the memmap entries to prepare the e820 table for the new kernel.

The following commit, ~8 years ago, introduced e820_table_firmware[] and changed
the behavior to export the firmware table instead:

  12df216c61c8 ("x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] table")

Originally the kexec_file_load and kexec_load syscalls both used e820_table_kexec[].
Since the sysfs exported entries are from e820_table_firmware[] people
now need to tune both tables for kexec.

Restore the old behavior so the kexec_load and kexec_file_load syscalls work with
only one table update.  The e820_table_firmware[] is used by hibernation kernel
code and it works without the sysfs exporting. Also remove the SEV
e820_table_firmware[] updating code.

Also update the code comments here and drop the comments about setup_data
reservation since it is not needed any more after this change was made
a year ago:

  fc7f27cda843 ("x86/kexec: Do not update E820 kexec table for setup_data")

[ mingo: Tidy up the changelog and comments. ]

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Ashish Kalra <ashish.kalra@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Link: https://lore.kernel.org/r/Z5jcb1GKhLvH8kDc@darkstar.users.ipa.redhat.com
10 months agox86/boot: Change some static bootflag functions to bool
Uros Bizjak [Wed, 29 Jan 2025 15:47:50 +0000 (16:47 +0100)]
x86/boot: Change some static bootflag functions to bool

The return values of some functions are of boolean type. Change the
type of these function to bool and adjust their return values.

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250129154920.6773-1-ubizjak@gmail.com
10 months agoi2c: core: Allocate temporary client dynamically
Geert Uytterhoeven [Thu, 20 Feb 2025 15:12:12 +0000 (16:12 +0100)]
i2c: core: Allocate temporary client dynamically

drivers/i2c/i2c-core-base.c: In function ‘i2c_detect.isra’:
drivers/i2c/i2c-core-base.c:2544:1: warning: the frame size of 1312 bytes is larger than 1024 bytes [-Wframe-larger-than=]
 2544 | }
      | ^

Fix this by allocating the temporary client structure dynamically, as it
is a rather large structure (1216 bytes, depending on kernel config).
This is basically a revert of the to-be-fixed commit with some
checkpatch improvements.

Fixes: 735668f8e5c9 ("i2c: core: Allocate temp client on the stack in i2c_detect")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
[wsa: updated commit message, merged tags from similar patch]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
10 months agox86/arch_prctl/64: Clean up ARCH_MAP_VDSO_32
Brian Gerst [Sun, 2 Feb 2025 20:23:23 +0000 (15:23 -0500)]
x86/arch_prctl/64: Clean up ARCH_MAP_VDSO_32

process_64.c is not built on native 32-bit, so CONFIG_X86_32 will never
be set.

No change in functionality intended.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20250202202323.422113-3-brgerst@gmail.com
10 months agox86/arch_prctl: Simplify sys_arch_prctl()
Brian Gerst [Sun, 2 Feb 2025 20:23:22 +0000 (15:23 -0500)]
x86/arch_prctl: Simplify sys_arch_prctl()

Use in_ia32_syscall() instead of a compat syscall entry.

No change in functionality intended.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20250202202323.422113-2-brgerst@gmail.com
10 months agoMerge tag 'soc-fixes-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Fri, 21 Feb 2025 21:16:01 +0000 (13:16 -0800)]
Merge tag 'soc-fixes-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "Two people stepped up as platform co-maintainers: Andrew Jeffery for
  ASpeed and Janne Grunau for Apple.

  The rockchip platform gets 9 small fixes for devicetree files,
  addressing both compile-time warnings and board specific bugs.

  One bugfix for the optee firmware driver addresses a reboot-time hang.

  Two drivers need improved Kconfig dependencies to allow wider compile-
  testing while hiding the drivers on platforms that can't use them.

  ARM SCMI and loongson-guts drivers get minor bugfixes"

* tag 'soc-fixes-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  soc: loongson: loongson2_guts: Add check for devm_kstrdup()
  tee: optee: Fix supplicant wait loop
  platform: cznic: CZNIC_PLATFORMS should depend on ARCH_MVEBU
  firmware: imx: IMX_SCMI_MISC_DRV should depend on ARCH_MXC
  MAINTAINERS: arm: apple: Add Janne as maintainer
  MAINTAINERS: Mark Andrew as M: for ASPEED MACHINE SUPPORT
  firmware: arm_scmi: imx: Correct tx size of scmi_imx_misc_ctrl_set
  arm64: dts: rockchip: adjust SMMU interrupt type on rk3588
  arm64: dts: rockchip: disable IOMMU when running rk3588 in PCIe endpoint mode
  dt-bindings: rockchip: pmu: Ensure all properties are defined
  arm64: defconfig: Enable TISCI Interrupt Router and Aggregator
  arm64: dts: rockchip: Fix lcdpwr_en pin for Cool Pi GenBook
  arm64: dts: rockchip: fix fixed-regulator renames on rk3399-gru devices
  arm64: dts: rockchip: Disable DMA for uart5 on px30-ringneck
  arm64: dts: rockchip: Move uart5 pin configuration to px30 ringneck SoM
  arm64: dts: rockchip: change eth phy mode to rgmii-id for orangepi r1 plus lts
  arm64: dts: rockchip: Fix broken tsadc pinctrl names for rk3588

10 months agoMerge tag 'drm-fixes-2025-02-22' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Fri, 21 Feb 2025 21:10:22 +0000 (13:10 -0800)]
Merge tag 'drm-fixes-2025-02-22' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Weekly drm fixes pull request, lots of small things all over, msm has
  a bunch of things but all very small, xe, i915, a fix for the cgroup
  dmem controller.

  core:
   - remove MAINTAINERS entry

  cgroup/dmem:
   - use correct function for pool descendants

  panel:
   - fix signal polarity issue jd9365da-h3

  nouveau:
   - folio handling fix
   - config fix

  amdxdna:
   - fix missing header

  xe:
   - Fix error handling in xe_irq_install
   - Fix devcoredump format

  i915:
   - Use spin_lock_irqsave() in interruptible context on guc submission
   - Fixes on DDI and TRANS programming
   - Make sure all planes in use by the joiner have their crtc included
   - Fix 128b/132b modeset issues

  msm:
   - More catalog fixes:
      - to skip watchdog programming through top block if its not
        present
      - fix the setting of WB mask to ensure the WB input control is
        programmed correctly through ping-pong
      - drop lm_pair for sm6150 as that chipset does not have any
        3dmerge block
      - Fix the mode validation logic for DP/eDP to account for widebus
        (2ppc) to allow high clock resolutions
      - Fix to disable dither during encoder disable as otherwise this
        was causing kms_writeback failure due to resource sharing
        between WB and DSI paths as DSI uses dither but WB does not
      - Fixes for virtual planes, namely to drop extraneous return and
        fix uninitialized variables
      - Fix to avoid spill-over of DSC encoder block bits when
        programming the bits-per-component
      - Fixes in the DSI PHY to protect against concurrent access of
        PHY_CMN_CLK_CFG regs between clock and display drivers
   - Core/GPU:
      - Fix non-blocking fence wait incorrectly rounding up to 1 jiffy
        timeout
      - Only print GMU fw version once, instead of each time the GPU
        resumes"

* tag 'drm-fixes-2025-02-22' of https://gitlab.freedesktop.org/drm/kernel: (28 commits)
  drm/i915/dp: Fix disabling the transcoder function in 128b/132b mode
  drm/i915/dp: Fix error handling during 128b/132b link training
  accel/amdxdna: Add missing include linux/slab.h
  MAINTAINERS: Remove myself
  drm/nouveau/pmu: Fix gp10b firmware guard
  cgroup/dmem: Don't open-code css_for_each_descendant_pre
  drm/xe/guc: Fix size_t print format
  drm/xe: Make GUC binaries dump consistent with other binaries in devcoredump
  drm/i915: Make sure all planes in use by the joiner have their crtc included
  drm/i915/ddi: Fix HDMI port width programming in DDI_BUF_CTL
  drm/i915/dsi: Use TRANS_DDI_FUNC_CTL's own port width macro
  drm/xe: Fix error handling in xe_irq_install()
  drm/i915/gt: Use spin_lock_irqsave() in interruptible context
  drm/msm/dsi/phy: Do not overwite PHY_CMN_CLK_CFG1 when choosing bitclk source
  drm/msm/dsi/phy: Protect PHY_CMN_CLK_CFG1 against clock driver
  drm/msm/dsi/phy: Protect PHY_CMN_CLK_CFG0 updated from driver side
  drm/msm/dpu: Drop extraneous return in dpu_crtc_reassign_planes()
  drm/msm/dpu: Don't leak bits_per_component into random DSC_ENC fields
  drm/msm/dpu: Disable dither in phys encoder cleanup
  drm/msm/dpu: Fix uninitialized variable
  ...

10 months agosched: Add unlikey branch hints to several system calls
Colin Ian King [Wed, 19 Feb 2025 14:24:23 +0000 (14:24 +0000)]
sched: Add unlikey branch hints to several system calls

Adding an unlikely() hint on early error return paths improves the
run-time performance of several sched related system calls.

Benchmarking on an i9-12900 shows the following per system call
performance improvements:

       before     after     improvement
sched_getattr          182.4ns    170.6ns      ~6.5%
sched_setattr          284.3ns    267.6ns      ~5.9%
sched_getparam         161.6ns    148.1ns      ~8.4%
sched_setparam        1265.4ns   1227.6ns      ~3.0%
sched_getscheduler     129.4ns    118.2ns      ~8.7%
sched_setscheduler    1237.3ns   1216.7ns      ~1.7%

Results are based on running 20 tests with turbo disabled (to reduce
clock freq turbo changes), with 10 second run per test based on the
number of system calls per second. The % standard deviation of the
measurements for the 20 tests was 0.05% to 0.40%, so the results are
reliable.

Tested on kernel build with gcc 14.2.1

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250219142423.45516-1-colin.i.king@gmail.com
10 months agosched/core: Remove duplicate included header file stats.h
Thorsten Blum [Wed, 19 Feb 2025 11:17:57 +0000 (12:17 +0100)]
sched/core: Remove duplicate included header file stats.h

The header file stats.h is included twice. Remove the redundant include
and the following make includecheck warning:

  stats.h is included more than once

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250219111756.3070-2-thorsten.blum@linux.dev
10 months agolocking/mutex: Add MUTEX_WARN_ON() into fast path
Yunhui Cui [Sun, 26 Jan 2025 03:32:43 +0000 (11:32 +0800)]
locking/mutex: Add MUTEX_WARN_ON() into fast path

Scenario: In platform_device_register, the driver misuses struct
device as platform_data, making kmemdup duplicate a device. Accessing
the duplicate may cause list corruption due to its mutex magic or list
holding old content.
It recurs randomly as the first mutex - getting process skips the slow
path and mutex check. Adding MUTEX_WARN_ON(lock->magic!= lock) in
__mutex_trylock_fast() makes it always happen.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250126033243.53069-1-cuiyunhui@bytedance.com
10 months agoMerge tag 'block-6.14-20250221' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 21 Feb 2025 17:36:28 +0000 (09:36 -0800)]
Merge tag 'block-6.14-20250221' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
      - FC controller state check fixes (Daniel)
      - PCI Endpoint fixes (Damien)
      - TCP connection failure fixe (Caleb)
      - TCP handling C2HTermReq PDU (Maurizio)
      - RDMA queue state check (Ruozhu)
      - Apple controller fixes (Hector)
      - Target crash on disbaled namespace (Hannes)

 - MD pull request via Yu:
      - Fix queue limits error handling for raid0, raid1 and raid10

 - Fix for a NULL pointer deref in request data mapping

 - Code cleanup for request merging

* tag 'block-6.14-20250221' of git://git.kernel.dk/linux:
  nvme: only allow entering LIVE from CONNECTING state
  nvme-fc: rely on state transitions to handle connectivity loss
  apple-nvme: Support coprocessors left idle
  apple-nvme: Release power domains when probe fails
  nvmet: Use enum definitions instead of hardcoded values
  nvme: Cleanup the definition of the controller config register fields
  nvme/ioctl: add missing space in err message
  nvme-tcp: fix connect failure on receiving partial ICResp PDU
  nvme: tcp: Fix compilation warning with W=1
  nvmet: pci-epf: Avoid RCU stalls under heavy workload
  nvmet: pci-epf: Do not uselessly write the CSTS register
  nvmet: pci-epf: Correctly initialize CSTS when enabling the controller
  nvmet-rdma: recheck queue state is LIVE in state lock in recv done
  nvmet: Fix crash when a namespace is disabled
  nvme-tcp: add basic support for the C2HTermReq PDU
  nvme-pci: quirk Acer FA100 for non-uniqueue identifiers
  block: fix NULL pointer dereferenced within __blk_rq_map_sg
  block/merge: remove unnecessary min() with UINT_MAX
  md/raid*: Fix the set_queue_limits implementations

10 months agoMerge tag 'io_uring-6.14-20250221' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 21 Feb 2025 17:17:56 +0000 (09:17 -0800)]
Merge tag 'io_uring-6.14-20250221' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

 - Series fixing an issue with multishot read on pollable files that may
   return -EIOCBQUEUED from ->read_iter(). Four small patches for that,
   the first one deliberately done in such a way that it'd be easy to
   backport

 - Remove some dead constant definitions

 - Use array_index_nospec() for opcode indexing

 - Work-around for worker creation retries in the presence of signals

* tag 'io_uring-6.14-20250221' of git://git.kernel.dk/linux:
  io_uring/rw: clean up mshot forced sync mode
  io_uring/rw: move ki_complete init into prep
  io_uring/rw: don't directly use ki_complete
  io_uring/rw: forbid multishot async reads
  io_uring/rsrc: remove unused constants
  io_uring: fix spelling error in uapi io_uring.h
  io_uring: prevent opcode speculation
  io-wq: backoff when retrying worker creation

10 months agoMerge tag 'acpi-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 21 Feb 2025 17:11:25 +0000 (09:11 -0800)]
Merge tag 'acpi-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Fix a memory leak in the ACPI platform_profile driver (Kurt Borja)"

* tag 'acpi-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: platform_profile: Fix memory leak in profile_class_is_visible()

10 months agoMerge tag 'mtd/fixes-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 21 Feb 2025 17:07:04 +0000 (09:07 -0800)]
Merge tag 'mtd/fixes-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull mtd fixes from Miquel Raynal:
 "The two most important fixes in this list are probably the SST write
  failure and the Qcom raw NAND controller probe failure which are due
  to some refactoring, otherwise there has been a series of misc fixes
  on the Cadence raw NAND controller driver and especially on the DMA
  side"

* tag 'mtd/fixes-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: cadence: fix unchecked dereference
  mtd: spi-nor: sst: Fix SST write failure
  dt-bindings: mtd: cadence: document required clock-names
  mtd: rawnand: qcom: fix broken config in qcom_param_page_type_exec
  mtd: rawnand: cadence: fix incorrect device in dma_unmap_single
  mtd: rawnand: cadence: use dma_map_resource for sdma address
  mtd: rawnand: cadence: fix error code in cadence_nand_init()

10 months agoMerge tag 'gpio-fixes-for-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 21 Feb 2025 16:59:27 +0000 (08:59 -0800)]
Merge tag 'gpio-fixes-for-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:
 "There are two fixes for GPIO core: one adds missing retval checks to
  older code, while the second adds SRCU synchronization to legs in code
  that were missed during the big rework a few cycles back. There's also
  one small driver fix:

   - check the return value of the get_direction() callback in struct
   gpio_chip

   - protect the multi-line get/set legs in GPIO core with SRCU

   - fix a race condition in gpio-vf610"

* tag 'gpio-fixes-for-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: don't bail out if get_direction() fails in gpiochip_add_data()
  gpiolib: protect gpio_chip with SRCU in array_info paths in multi get/set
  gpio: vf610: add locking to gpio direction functions
  gpiolib: check the return value of gpio_chip::get_direction()

10 months agox86/apic: Use str_disabled_enabled() helper in print_ipi_mode()
Thorsten Blum [Sun, 9 Feb 2025 21:03:32 +0000 (22:03 +0100)]
x86/apic: Use str_disabled_enabled() helper in print_ipi_mode()

Remove hard-coded strings by using the str_disabled_enabled() helper.

No change in functionality intended.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250209210333.5666-2-thorsten.blum@linux.dev
10 months agox86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code
Mike Rapoport (Microsoft) [Fri, 14 Feb 2025 09:06:51 +0000 (11:06 +0200)]
x86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code

E820_TYPE_RESERVED_KERN is a relict from the ancient history that was used
to early reserve setup_data, see:

  28bb22379513 ("x86: move reserve_setup_data to setup.c")

Nowadays setup_data is anyway reserved in memblock and there is no point in
carrying E820_TYPE_RESERVED_KERN that behaves exactly like E820_TYPE_RAM
but only complicates the code.

A bonus for removing E820_TYPE_RESERVED_KERN is a small but measurable
speedup of 20 microseconds in init_mem_mappings() on a VM with 32GB or RAM.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20250214090651.3331663-5-rppt@kernel.org
10 months agox86/boot: Split parsing of boot_params into the parse_boot_params() helper function
Mike Rapoport (Microsoft) [Fri, 14 Feb 2025 09:06:50 +0000 (11:06 +0200)]
x86/boot: Split parsing of boot_params into the parse_boot_params() helper function

Makes setup_arch() a bit easier to comprehend.

No functional changes.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250214090651.3331663-4-rppt@kernel.org
10 months agox86/boot: Split kernel resources setup into the setup_kernel_resources() helper function
Mike Rapoport (Microsoft) [Fri, 14 Feb 2025 09:06:49 +0000 (11:06 +0200)]
x86/boot: Split kernel resources setup into the setup_kernel_resources() helper function

Makes setup_arch() a bit easier to comprehend.

No functional changes.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250214090651.3331663-3-rppt@kernel.org
10 months agox86/boot: Move setting of memblock parameters to e820__memblock_setup()
Mike Rapoport (Microsoft) [Fri, 14 Feb 2025 09:06:48 +0000 (11:06 +0200)]
x86/boot: Move setting of memblock parameters to e820__memblock_setup()

Changing memblock parameters, namely bottom_up and allocation upper
limit does not have any effect before memblock initialization in
e820__memblock_setup().

Move the calls to memblock_set_bottom_up() and memblock_set_current_limit()
to e820__memblock_setup() to group all the memblock initial setup and make
setup_arch() more readable.

No functional changes.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250214090651.3331663-2-rppt@kernel.org
10 months agox86/locking: Use asm_inline for {,try_}cmpxchg{64,128} emulations
Uros Bizjak [Fri, 14 Feb 2025 15:07:47 +0000 (16:07 +0100)]
x86/locking: Use asm_inline for {,try_}cmpxchg{64,128} emulations

According to:

  https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html

the usage of asm pseudo directives in the asm template can confuse
the compiler to wrongly estimate the size of the generated
code.

The ALTERNATIVE macro expands to several asm pseudo directives,
so its usage in {,try_}cmpxchg{64,128} causes instruction length estimate
to fail by an order of magnitude (the specially instrumented compiler
reports the estimated length of these asm templates to be more than 20
instructions long).

This incorrect estimate further causes unoptimal inlining
decisions, unoptimal instruction scheduling and unoptimal code block
alignments for functions that use these locking primitives.

Use asm_inline instead:

  https://gcc.gnu.org/pipermail/gcc-patches/2018-December/512349.html

which is a feature that makes GCC pretend some inline assembler code
is tiny (while it would think it is huge), instead of just asm.

For code size estimation, the size of the asm is then taken as
the minimum size of one instruction, ignoring how many instructions
compiler thinks it is.

The effect of this patch on x86_64 target is minor, since 128-bit
functions are rarely used on this target. The code size of the resulting
defconfig object file stays the same:

      text       data     bss      dec         hex filename
  27456612    4638523  814148 32909283     1f627e3 vmlinux-old.o
  27456612    4638523  814148 32909283     1f627e3 vmlinux-new.o

but the patch has minor effect on code layout due to the different
scheduling decisions in functions containing changed macros.

There is no effect on the x64_32 target, the code size of the resulting
defconfig object file and the code layout stays the same:

      text       data     bss      dec         hex filename
  18883870    2679275 1707916 23271061     1631695 vmlinux-old.o
  18883870    2679275 1707916 23271061     1631695 vmlinux-new.o

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250214150929.5780-2-ubizjak@gmail.com
10 months agox86/locking: Use ALT_OUTPUT_SP() for percpu_{,try_}cmpxchg{64,128}_op()
Uros Bizjak [Fri, 14 Feb 2025 15:07:46 +0000 (16:07 +0100)]
x86/locking: Use ALT_OUTPUT_SP() for percpu_{,try_}cmpxchg{64,128}_op()

percpu_{,try_}cmpxchg{64,128}() macros use CALL instruction inside
asm statement in one of their alternatives. Use ALT_OUTPUT_SP()
macro to add required dependence on %esp register.

ALT_OUTPUT_SP() implements the above dependence by adding
ASM_CALL_CONSTRAINT to its arguments. This constraint should be used
for any inline asm which has a CALL instruction, otherwise the
compiler may schedule the asm before the frame pointer gets set up
by the containing function, causing objtool to print a "call without
frame pointer save/setup" warning.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250214150929.5780-1-ubizjak@gmail.com
10 months agotracing: Fix memory leak when reading set_event file
Adrian Huang [Thu, 20 Feb 2025 03:15:28 +0000 (11:15 +0800)]
tracing: Fix memory leak when reading set_event file

kmemleak reports the following memory leak after reading set_event file:

  # cat /sys/kernel/tracing/set_event

  # cat /sys/kernel/debug/kmemleak
  unreferenced object 0xff110001234449e0 (size 16):
  comm "cat", pid 13645, jiffies 4294981880
  hex dump (first 16 bytes):
    01 00 00 00 00 00 00 00 a8 71 e7 84 ff ff ff ff  .........q......
  backtrace (crc c43abbc):
    __kmalloc_cache_noprof+0x3ca/0x4b0
    s_start+0x72/0x2d0
    seq_read_iter+0x265/0x1080
    seq_read+0x2c9/0x420
    vfs_read+0x166/0xc30
    ksys_read+0xf4/0x1d0
    do_syscall_64+0x79/0x150
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

The issue can be reproduced regardless of whether set_event is empty or
not. Here is an example about the valid content of set_event.

  # cat /sys/kernel/tracing/set_event
  sched:sched_process_fork
  sched:sched_switch
  sched:sched_wakeup
  *:*:mod:trace_events_sample

The root cause is that s_next() returns NULL when nothing is found.
This results in s_stop() attempting to free a NULL pointer because its
parameter is NULL.

Fix the issue by freeing the memory appropriately when s_next() fails
to find anything.

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250220031528.7373-1-ahuang12@lenovo.com
Fixes: b355247df104 ("tracing: Cache ":mod:" events for modules not loaded yet")
Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agoftrace: Correct preemption accounting for function tracing.
Sebastian Andrzej Siewior [Thu, 20 Feb 2025 14:07:49 +0000 (15:07 +0100)]
ftrace: Correct preemption accounting for function tracing.

The function tracer should record the preemption level at the point when
the function is invoked. If the tracing subsystem decrement the
preemption counter it needs to correct this before feeding the data into
the trace buffer. This was broken in the commit cited below while
shifting the preempt-disabled section.

Use tracing_gen_ctx_dec() which properly subtracts one from the
preemption counter on a preemptible kernel.

Cc: stable@vger.kernel.org
Cc: Wander Lairson Costa <wander@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/20250220140749.pfw8qoNZ@linutronix.de
Fixes: ce5e48036c9e7 ("ftrace: disable preemption when recursion locked")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agoselftests/ftrace: Update fprobe test to check enabled_functions file
Steven Rostedt [Thu, 20 Feb 2025 20:20:14 +0000 (15:20 -0500)]
selftests/ftrace: Update fprobe test to check enabled_functions file

A few bugs were found in the fprobe accounting logic along with it using
the function graph infrastructure. Update the fprobe selftest to catch
those bugs in case they or something similar shows up in the future.

The test now checks the enabled_functions file which shows all the
functions attached to ftrace or fgraph. When enabling a fprobe, make sure
that its corresponding function is also added to that file. Also add two
more fprobes to enable to make sure that the fprobe logic works properly
with multiple probes.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/20250220202055.733001756@goodmis.org
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agofprobe: Fix accounting of when to unregister from function graph
Steven Rostedt [Thu, 20 Feb 2025 20:20:13 +0000 (15:20 -0500)]
fprobe: Fix accounting of when to unregister from function graph

When adding a new fprobe, it will update the function hash to the
functions the fprobe is attached to and register with function graph to
have it call the registered functions. The fprobe_graph_active variable
keeps track of the number of fprobes that are using function graph.

If two fprobes attach to the same function, it increments the
fprobe_graph_active for each of them. But when they are removed, the first
fprobe to be removed will see that the function it is attached to is also
used by another fprobe and it will not remove that function from
function_graph. The logic will skip decrementing the fprobe_graph_active
variable.

This causes the fprobe_graph_active variable to not go to zero when all
fprobes are removed, and in doing so it does not unregister from
function graph. As the fgraph ops hash will now be empty, and an empty
filter hash means all functions are enabled, this triggers function graph
to add a callback to the fprobe infrastructure for every function!

 # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
 # echo "f:myevent2 kernel_clone%return" >> /sys/kernel/tracing/dynamic_events
 # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)            tramp: 0xffffffffc0024000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

 # > /sys/kernel/tracing/dynamic_events
 # cat /sys/kernel/tracing/enabled_functions
trace_initcall_start_cb (1)             tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
run_init_process (1)            tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
try_to_run_init_process (1)             tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
x86_pmu_show_pmu_cap (1)                tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
cleanup_rapl_pmus (1)                   tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
uncore_free_pcibus_map (1)              tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
uncore_types_exit (1)                   tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
uncore_pci_exit.part.0 (1)              tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
kvm_shutdown (1)                tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
vmx_dump_msrs (1)               tramp: 0xffffffffc0026000 (function_trace_call+0x0/0x170) ->function_trace_call+0x0/0x170
[..]

 # cat /sys/kernel/tracing/enabled_functions | wc -l
54702

If a fprobe is being removed and all its functions are also traced by
other fprobes, still decrement the fprobe_graph_active counter.

Cc: stable@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/20250220202055.565129766@goodmis.org
Fixes: 4346ba1604093 ("fprobe: Rewrite fprobe on function-graph tracer")
Closes: https://lore.kernel.org/all/20250217114918.10397-A-hca@linux.ibm.com/
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agofprobe: Always unregister fgraph function from ops
Steven Rostedt [Thu, 20 Feb 2025 20:20:12 +0000 (15:20 -0500)]
fprobe: Always unregister fgraph function from ops

When the last fprobe is removed, it calls unregister_ftrace_graph() to
remove the graph_ops from function graph. The issue is when it does so, it
calls return before removing the function from its graph ops via
ftrace_set_filter_ips(). This leaves the last function lingering in the
fprobe's fgraph ops and if a probe is added it also enables that last
function (even though the callback will just drop it, it does add unneeded
overhead to make that call).

  # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)            tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

  # echo "f:myevent2 schedule_timeout" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)            tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
schedule_timeout (1)            tramp: 0xffffffffc02f3000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

  # > /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions

  # echo "f:myevent3 kmem_cache_free" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kmem_cache_free (1)            tramp: 0xffffffffc0219000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
schedule_timeout (1)            tramp: 0xffffffffc0219000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

The above enabled a fprobe on kernel_clone, and then on schedule_timeout.
The content of the enabled_functions shows the functions that have a
callback attached to them. The fprobe attached to those functions
properly. Then the fprobes were cleared, and enabled_functions was empty
after that. But after adding a fprobe on kmem_cache_free, the
enabled_functions shows that the schedule_timeout was attached again. This
is because it was still left in the fprobe ops that is used to tell
function graph what functions it wants callbacks from.

Cc: stable@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/20250220202055.393254452@goodmis.org
Fixes: 4346ba1604093 ("fprobe: Rewrite fprobe on function-graph tracer")
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agoftrace: Do not add duplicate entries in subops manager ops
Steven Rostedt [Thu, 20 Feb 2025 20:20:11 +0000 (15:20 -0500)]
ftrace: Do not add duplicate entries in subops manager ops

Check if a function is already in the manager ops of a subops. A manager
ops contains multiple subops, and if two or more subops are tracing the
same function, the manager ops only needs a single entry in its hash.

Cc: stable@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/20250220202055.226762894@goodmis.org
Fixes: 4f554e955614f ("ftrace: Add ftrace_set_filter_ips function")
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agoftrace: Fix accounting of adding subops to a manager ops
Steven Rostedt [Thu, 20 Feb 2025 20:20:10 +0000 (15:20 -0500)]
ftrace: Fix accounting of adding subops to a manager ops

Function graph uses a subops and manager ops mechanism to attach to
ftrace.  The manager ops connects to ftrace and the functions it connects
to is defined by a list of subops that it manages.

The function hash that defines what the above ops attaches to limits the
functions to attach if the hash has any content. If the hash is empty, it
means to trace all functions.

The creation of the manager ops hash is done by iterating over all the
subops hashes. If any of the subops hashes is empty, it means that the
manager ops hash must trace all functions as well.

The issue is in the creation of the manager ops. When a second subops is
attached, a new hash is created by starting it as NULL and adding the
subops one at a time. But the NULL ops is mistaken as an empty hash, and
once an empty hash is found, it stops the loop of subops and just enables
all functions.

  # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
kernel_clone (1)            tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60

  # echo "f:myevent2 schedule_timeout" >> /sys/kernel/tracing/dynamic_events
  # cat /sys/kernel/tracing/enabled_functions
trace_initcall_start_cb (1)             tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
run_init_process (1)            tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
try_to_run_init_process (1)             tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
x86_pmu_show_pmu_cap (1)                tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
cleanup_rapl_pmus (1)                   tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
uncore_free_pcibus_map (1)              tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
uncore_types_exit (1)                   tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
uncore_pci_exit.part.0 (1)              tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
kvm_shutdown (1)                tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
vmx_dump_msrs (1)               tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
vmx_cleanup_l1d_flush (1)               tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60
[..]

Fix this by initializing the new hash to NULL and if the hash is NULL do
not treat it as an empty hash but instead allocate by copying the content
of the first sub ops. Then on subsequent iterations, the new hash will not
be NULL, but the content of the previous subops. If that first subops
attached to all functions, then new hash may assume that the manager ops
also needs to attach to all functions.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/20250220202055.060300046@goodmis.org
Fixes: 5fccc7552ccbc ("ftrace: Add subops logic to allow one ops to manage many")
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agox86/tsc: Always save/restore TSC sched_clock() on suspend/resume
Guilherme G. Piccoli [Sat, 15 Feb 2025 20:58:16 +0000 (17:58 -0300)]
x86/tsc: Always save/restore TSC sched_clock() on suspend/resume

TSC could be reset in deep ACPI sleep states, even with invariant TSC.

That's the reason we have sched_clock() save/restore functions, to deal
with this situation. But what happens is that such functions are guarded
with a check for the stability of sched_clock - if not considered stable,
the save/restore routines aren't executed.

On top of that, we have a clear comment in native_sched_clock() saying
that *even* with TSC unstable, we continue using TSC for sched_clock due
to its speed.

In other words, if we have a situation of TSC getting detected as unstable,
it marks the sched_clock as unstable as well, so subsequent S3 sleep cycles
could bring bogus sched_clock values due to the lack of the save/restore
mechanism, causing warnings like this:

  [22.954918] ------------[ cut here ]------------
  [22.954923] Delta way too big! 18446743750843854390 ts=18446744072977390405 before=322133536015 after=322133536015 write stamp=18446744072977390405
  [22.954923] If you just came from a suspend/resume,
  [22.954923] please switch to the trace global clock:
  [22.954923]   echo global > /sys/kernel/tracing/trace_clock
  [22.954923] or add trace_clock=global to the kernel command line
  [22.954937] WARNING: CPU: 2 PID: 5728 at kernel/trace/ring_buffer.c:2890 rb_add_timestamp+0x193/0x1c0

Notice that the above was reproduced even with "trace_clock=global".

The fix for that is to _always_ save/restore the sched_clock on suspend
cycle _if TSC is used_ as sched_clock - only if we fallback to jiffies
the sched_clock_stable() check becomes relevant to save/restore the
sched_clock.

Debugged-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250215210314.351480-1-gpiccoli@igalia.com
10 months agoACPI/processor_idle: Export acpi_processor_ffh_play_dead()
Artem Bityutskiy [Sun, 16 Feb 2025 12:26:14 +0000 (14:26 +0200)]
ACPI/processor_idle: Export acpi_processor_ffh_play_dead()

The kernel test robot reported the following build error:

  >> ERROR: modpost: "acpi_processor_ffh_play_dead" [drivers/acpi/processor.ko] undefined!

Caused by this recently merged commit:

  541ddf31e300 ("ACPI/processor_idle: Add FFH state handling")

The build failure is due to an oversight in the 'CONFIG_ACPI_PROCESSOR=m' case,
the function export is missing. Add it.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502151207.FA9UO1iX-lkp@intel.com/
Fixes: 541ddf31e300 ("ACPI/processor_idle: Add FFH state handling")
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/de5bf4f116779efde315782a15146fdc77a4a044.camel@linux.intel.com
10 months agoperf/core: Move perf_event sysctls into kernel/events
Joel Granados [Tue, 18 Feb 2025 09:56:21 +0000 (10:56 +0100)]
perf/core: Move perf_event sysctls into kernel/events

Move ctl tables to two files:

 - perf_event_{paranoid,mlock_kb,max_sample_rate} and
   perf_cpu_time_max_percent into kernel/events/core.c

 - perf_event_max_{stack,context_per_stack} into
   kernel/events/callchain.c

Make static variables and functions that are fully contained in core.c
and callchain.cand remove them from include/linux/perf_event.h.
Additionally six_hundred_forty_kb is moved to callchain.c.

Two new sysctl tables are added ({callchain,events_core}_sysctl_table)
with their respective sysctl registration functions.

This is part of a greater effort to move ctl tables into their
respective subsystems which will reduce the merge conflicts in
kerenel/sysctl.c.

Signed-off-by: Joel Granados <joel.granados@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250218-jag-mv_ctltables-v1-5-cd3698ab8d29@kernel.org
10 months agoMerge branch 'perf/urgent' into perf/core, to pick up fixes before merging new patches
Ingo Molnar [Fri, 21 Feb 2025 13:52:19 +0000 (14:52 +0100)]
Merge branch 'perf/urgent' into perf/core, to pick up fixes before merging new patches

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 months agox86/module: Remove unnecessary check in module_finalize()
Dan Carpenter [Wed, 19 Feb 2025 13:48:49 +0000 (16:48 +0300)]
x86/module: Remove unnecessary check in module_finalize()

The "calls" pointer can no longer be NULL after the following
commit:

   ab9fea59487d ("x86/alternative: Simplify callthunk patching")

Delete this unnecessary check.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/fcbb2f57-0714-4139-b441-8817365c16a1@stanley.mountain
10 months agodocs: arch/x86/sva: Fix two grammar errors under Background and FAQ
Brian Ochoa [Wed, 19 Feb 2025 15:09:20 +0000 (10:09 -0500)]
docs: arch/x86/sva: Fix two grammar errors under Background and FAQ

- Correct "in order" to "in order to"
- Append missing quantifier

Signed-off-by: Brian Ochoa <brianeochoa@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250219150920.445802-1-brianeochoa@gmail.com
10 months agorseq: Fix rseq registration with CONFIG_DEBUG_RSEQ
Michael Jeanson [Wed, 19 Feb 2025 20:53:26 +0000 (15:53 -0500)]
rseq: Fix rseq registration with CONFIG_DEBUG_RSEQ

With CONFIG_DEBUG_RSEQ=y, at rseq registration the read-only fields are
copied from user-space, if this copy fails the syscall returns -EFAULT
and the registration should not be activated - but it erroneously is.

Move the activation of the registration after the copy of the fields to
fix this bug.

Fixes: 7d5265ffcd8b ("rseq: Validate read-only fields under DEBUG_RSEQ config")
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20250219205330.324770-1-mjeanson@efficios.com
10 months agox86/cpufeatures: Make AVX-VNNI depend on AVX
Eric Biggers [Thu, 20 Feb 2025 06:01:24 +0000 (22:01 -0800)]
x86/cpufeatures: Make AVX-VNNI depend on AVX

The 'noxsave' boot option disables support for AVX, but support for the
AVX-VNNI feature was still declared on CPUs that support it.  Fix this.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250220060124.89622-1-ebiggers@kernel.org
10 months agox86/build: Raise the minimum LLVM version to 15.0.0
Nathan Chancellor [Thu, 20 Feb 2025 20:08:12 +0000 (13:08 -0700)]
x86/build: Raise the minimum LLVM version to 15.0.0

In a similar vein as to this pending commit in the x86/asm tree:

  a3e8fe814ad1 ("x86/build: Raise the minimum GCC version to 8.1")

... bump the minimum supported version of LLVM for building x86 kernels
to 15.0.0, as that is the first version that has support for
'-mstack-protector-guard-symbol', which is used unconditionally after:

  80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable"):

Older Clang versions will fail the build with:

  clang-14: error: unknown argument: '-mstack-protector-guard-symbol=__ref_stack_chk_guard'

Fixes: 80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Brian Gerst <brgerst@gmail.com>
Link: https://lore.kernel.org/r/20250220-x86-bump-min-llvm-for-stackp-v1-1-ecb3c906e790@kernel.org
10 months agoarm64: dts: rockchip: rk356x: Move PCIe MSI to use GIC ITS instead of MBI
Dmitry Osipenko [Sun, 16 Feb 2025 22:16:34 +0000 (01:16 +0300)]
arm64: dts: rockchip: rk356x: Move PCIe MSI to use GIC ITS instead of MBI

Rockchip 356x device-tree now supports GIC ITS. Move PCIe controller's
MSI to use ITS instead of MBI. This removes extra CPU overhead of handling
PCIe MBIs by letting GIC's ITS to serve the PCIe MSIs.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250216221634.364158-4-dmitry.osipenko@collabora.com
10 months agoarm64: dts: rockchip: rk356x: Add MSI controller node
Dmitry Osipenko [Sun, 16 Feb 2025 22:16:33 +0000 (01:16 +0300)]
arm64: dts: rockchip: rk356x: Add MSI controller node

Rockchip 356x SoC's GIC has two hardware integration issues that
affect MSI functionality of the GIC. Previously, both these GIC
issues were worked around by using MBI for MSI instead of ITS
because kernel GIC driver didn't have necessary quirks.

First issue is about RK356x GIC not supporting programmable
shareability, while reporting it as supported in a GIC's feature
register. Rockchip assigned Erratum ID #3568001 for this issue. This
patch adds dma-noncoherent property to the GIC node, denoting that a SW
workaround is required for mitigating the issue.

Second issue is about GIC AXI master interface addressing limited to
the first 4GB of physical address space. Rockchip assigned Erratum
ID #3568002 for this issue.

Now that kernel supports quirks for both of the erratums, add
MSI controller node to RK356x device-tree.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250216221634.364158-3-dmitry.osipenko@collabora.com
10 months agoirqchip/gic-v3: Add Rockchip 3568002 erratum workaround
Dmitry Osipenko [Sun, 16 Feb 2025 22:16:32 +0000 (01:16 +0300)]
irqchip/gic-v3: Add Rockchip 3568002 erratum workaround

Rockchip RK3566/RK3568 GIC600 integration has DDR addressing
limited to the first 32bit of physical address space. Rockchip
assigned Erratum ID #3568002 for this issue. Add driver quirk for
this Rockchip GIC Erratum.

Note, that the 0x0201743b GIC600 ID is not Rockchip-specific and is
common for many ARM GICv3 implementations. Hence, there is an extra
of_machine_is_compatible() check.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250216221634.364158-2-dmitry.osipenko@collabora.com
10 months agovdso: Remove remnants of architecture-specific time storage
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:50 +0000 (13:05 +0100)]
vdso: Remove remnants of architecture-specific time storage

All users of the time releated parts of the vDSO are now using the generic
storage implementation. Remove the therefore unnecessary compatibility
accessor functions and symbols.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-18-13a4669dfc8c@linutronix.de
10 months agovdso: Remove remnants of architecture-specific random state storage
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:49 +0000 (13:05 +0100)]
vdso: Remove remnants of architecture-specific random state storage

All users of the random vDSO are using the generic storage
implementation. Remove the now unnecessary compatibility accessor
functions and symbols.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-17-13a4669dfc8c@linutronix.de
10 months agox86/vdso/vdso2c: Remove page handling
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:48 +0000 (13:05 +0100)]
x86/vdso/vdso2c: Remove page handling

The values are not used anymore.
Also the sanity checks performed by vdso2c can never trigger as they
only validate invariants already enforced by the linker script.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-16-13a4669dfc8c@linutronix.de
10 months agox86/vdso: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:47 +0000 (13:05 +0100)]
x86/vdso: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

This switch also moves the random state data out of the time data page.
The currently used hardcoded __VDSO_RND_DATA_OFFSET does not take into
account changes to the time data page layout.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-15-13a4669dfc8c@linutronix.de
10 months agopowerpc/vdso: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:46 +0000 (13:05 +0100)]
powerpc/vdso: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-14-13a4669dfc8c@linutronix.de
10 months agoMIPS: vdso: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:45 +0000 (13:05 +0100)]
MIPS: vdso: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-13-13a4669dfc8c@linutronix.de
10 months agos390/vdso: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:44 +0000 (13:05 +0100)]
s390/vdso: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-12-13a4669dfc8c@linutronix.de
10 months agoarm: vdso: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:43 +0000 (13:05 +0100)]
arm: vdso: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-11-13a4669dfc8c@linutronix.de
10 months agoLoongArch: vDSO: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:42 +0000 (13:05 +0100)]
LoongArch: vDSO: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-10-13a4669dfc8c@linutronix.de
10 months agoriscv: vdso: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:41 +0000 (13:05 +0100)]
riscv: vdso: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-9-13a4669dfc8c@linutronix.de
10 months agoarm64: vdso: Switch to generic storage implementation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:40 +0000 (13:05 +0100)]
arm64: vdso: Switch to generic storage implementation

The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

This switch also moves the random state data out of the time data page.
The currently used hardcoded __VDSO_RND_DATA_OFFSET does not take into
account changes to the time data page layout.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-8-13a4669dfc8c@linutronix.de
10 months agovdso: Add generic architecture-specific data storage
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:39 +0000 (13:05 +0100)]
vdso: Add generic architecture-specific data storage

Some architectures need to expose architecture-specific data to the vDSO.

Enable the generic vDSO storage mechanism to both store and map this
data. Some architectures require more than a single page, like LoongArch,
so prepare for that usecase, too.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-7-13a4669dfc8c@linutronix.de
10 months agovdso: Add generic random data storage
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:38 +0000 (13:05 +0100)]
vdso: Add generic random data storage

Extend the generic vDSO data storage with a page for the random state data.
The random state data is stored in a dedicated page, as the existing
storage page is only meant for time-related, time-namespace-aware data.
This simplifies to access logic to not need to handle time namespaces
anymore and also frees up more space in the time-related page.

In case further generic vDSO data store is required it can be added to
the random state page.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-6-13a4669dfc8c@linutronix.de
10 months agovdso: Add generic time data storage
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:37 +0000 (13:05 +0100)]
vdso: Add generic time data storage

Historically each architecture defined their own way to store the vDSO
data page. Add a generic mechanism to provide storage for that page.

Furthermore this generic storage will be extended to also provide
uniform storage for *non*-time-related data, like the random state or
architecture-specific data. These will have their own pages and data
structures, so rename 'vdso_data' into 'vdso_time_data' to make that
split clear from the name.

Also introduce a new consistent naming scheme for the symbols related to
the vDSO, which makes it clear if the symbol is accessible from
userspace or kernel space and the type of data behind the symbol.

The generic fault handler contains an optimization to prefault the vvar
page when the timens page is accessed. This was lifted from s390 and x86.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-5-13a4669dfc8c@linutronix.de
10 months agovdso: Rename included Makefile
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:36 +0000 (13:05 +0100)]
vdso: Rename included Makefile

As the Makefile is included into other Makefiles it can not be used to
define objects to be built from the current source directory.
However the generic datastore will introduce such a local source file.
Rename the included Makefile so it is clear how it is to be used and to
make room for a regular Makefile in lib/vdso/.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-4-13a4669dfc8c@linutronix.de
10 months agovdso: Introduce vdso/align.h
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:35 +0000 (13:05 +0100)]
vdso: Introduce vdso/align.h

The vDSO implementation can only include headers from the vdso/
namespace. To enable the usage of the ALIGN() macro from the vDSO, move
linux/align.h to vdso/align.h wholly.
As the only dependency linux/const.h is only a wrapper around
vdso/const.h anyways adapt that dependency.

Also provide a compatibility wrapper linux/align.h.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-3-13a4669dfc8c@linutronix.de
10 months agoparisc: Remove unused symbol vdso_data
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:34 +0000 (13:05 +0100)]
parisc: Remove unused symbol vdso_data

The symbol vdso_data is and has been unused.
Remove it.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-2-13a4669dfc8c@linutronix.de
10 months agox86/vdso: Fix latent bug in vclock_pages calculation
Thomas Weißschuh [Tue, 4 Feb 2025 12:05:33 +0000 (13:05 +0100)]
x86/vdso: Fix latent bug in vclock_pages calculation

The vclock pages are *after* the non-vclock pages. Currently there are both
two vclock and two non-vclock pages so the existing logic works by
accident.  As soon as the number of pages changes it will break however.
This will be the case with the introduction of the generic vDSO data
storage.

Use a macro to keep the calculation understandable and in sync between
the linker script and mapping code.

Fixes: e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-1-13a4669dfc8c@linutronix.de
10 months agoirqchip/qcom-pdc: Workaround hardware register bug on X1E80100
Stephan Gerhold [Tue, 18 Feb 2025 15:59:18 +0000 (16:59 +0100)]
irqchip/qcom-pdc: Workaround hardware register bug on X1E80100

On X1E80100, there is a hardware bug in the register logic of the
IRQ_ENABLE_BANK register: While read accesses work on the normal address,
all write accesses must be made to a shifted address. Without a workaround
for this, the wrong interrupt gets enabled in the PDC and it is impossible
to wakeup from deep suspend (CX collapse). This has not caused problems so
far, because the deep suspend state was not enabled. A workaround is
required now since work is ongoing to fix this.

The PDC has multiple "DRV" regions, each one has a size of 0x10000 and
provides the same set of registers for a particular client in the system.
Linux is one the clients and uses DRV region 2 on X1E. Each "bank" inside
the DRV region consists of 32 interrupt pins that can be enabled using the
IRQ_ENABLE_BANK register:

  IRQ_ENABLE_BANK[bank] = base + IRQ_ENABLE_BANK + bank * sizeof(u32)

On X1E, this works as intended for read access. However, write access to
most banks is shifted by 2:

  IRQ_ENABLE_BANK_X1E[0] = IRQ_ENABLE_BANK[-2]
  IRQ_ENABLE_BANK_X1E[1] = IRQ_ENABLE_BANK[-1]
  IRQ_ENABLE_BANK_X1E[2] = IRQ_ENABLE_BANK[0] = IRQ_ENABLE_BANK[2 - 2]
  IRQ_ENABLE_BANK_X1E[3] = IRQ_ENABLE_BANK[1] = IRQ_ENABLE_BANK[3 - 2]
  IRQ_ENABLE_BANK_X1E[4] = IRQ_ENABLE_BANK[2] = IRQ_ENABLE_BANK[4 - 2]
  IRQ_ENABLE_BANK_X1E[5] = IRQ_ENABLE_BANK[5] (this one works as intended)

The negative indexes underflow to banks of the previous DRV/client region:

  IRQ_ENABLE_BANK_X1E[drv 2][bank 0] = IRQ_ENABLE_BANK[drv 2][bank -2]
                                     = IRQ_ENABLE_BANK[drv 1][bank 5-2]
                                     = IRQ_ENABLE_BANK[drv 1][bank 3]
                                     = IRQ_ENABLE_BANK[drv 1][bank 0 + 3]
  IRQ_ENABLE_BANK_X1E[drv 2][bank 1] = IRQ_ENABLE_BANK[drv 2][bank -1]
                                     = IRQ_ENABLE_BANK[drv 1][bank 5-1]
                                     = IRQ_ENABLE_BANK[drv 1][bank 4]
                                     = IRQ_ENABLE_BANK[drv 1][bank 1 + 3]

Introduce a workaround for the bug by matching the qcom,x1e80100-pdc
compatible and apply the offsets as shown above:

 - Bank 0...1: previous DRV region, bank += 3
 - Bank 1...4: our DRV region, bank -= 2
 - Bank 5: our DRV region, no fixup required

The PDC node in the device tree only describes the DRV region for the Linux
client, but the workaround also requires to map parts of the previous DRV
region to issue writes there. To maintain compatibility with old device
trees, obtain the base address of the preceeding region by applying the
-0x10000 offset. Note that this is also more correct from a conceptual
point of view:

It does not really make use of the other region; it just issues shifted
writes that end up in the registers of the Linux associated DRV region 2.

Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/all/20250218-x1e80100-pdc-hw-wa-v2-1-29be4c98e355@linaro.org
10 months agoMerge tag 'for-v6.14-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux...
Linus Torvalds [Fri, 21 Feb 2025 02:07:32 +0000 (18:07 -0800)]
Merge tag 'for-v6.14-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:

 - core: Fix extension related lockdep warning for LED triggers

 - axp20x-battery: Fix fault handling for AXP717

 - da9150-fg: fix potential overflow

* tag 'for-v6.14-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: axp20x_battery: Fix fault handling for AXP717
  power: supply: core: Fix extension related lockdep warning
  power: supply: da9150-fg: fix potential overflow

10 months agoMerge tag 'ata-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata...
Linus Torvalds [Fri, 21 Feb 2025 02:05:24 +0000 (18:05 -0800)]
Merge tag 'ata-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata fix from Niklas Cassel:

 - Fix an unintentional masking of AHCI ports when the device tree does
   not define port child nodes (Damien)

* tag 'ata-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libahci_platform: Do not set mask_port_map when not needed

10 months agoMerge tag 'drm-msm-fixes-2025-02-20' of https://gitlab.freedesktop.org/drm/msm into...
Dave Airlie [Fri, 21 Feb 2025 00:50:28 +0000 (10:50 +1000)]
Merge tag 'drm-msm-fixes-2025-02-20' of https://gitlab.freedesktop.org/drm/msm into drm-fixes

Fixes for v6.14-rc4

Display:
* More catalog fixes:
 - to skip watchdog programming through top block if its not present
 - fix the setting of WB mask to ensure the WB input control is programmed
   correctly through ping-pong
 - drop lm_pair for sm6150 as that chipset does not have any 3dmerge block
* Fix the mode validation logic for DP/eDP to account for widebus (2ppc)
  to allow high clock resolutions
* Fix to disable dither during encoder disable as otherwise this was
  causing kms_writeback failure due to resource sharing between
* WB and DSI paths as DSI uses dither but WB does not
* Fixes for virtual planes, namely to drop extraneous return and fix
  uninitialized variables
* Fix to avoid spill-over of DSC encoder block bits when programming
  the bits-per-component
* Fixes in the DSI PHY to protect against concurrent access of
  PHY_CMN_CLK_CFG regs between clock and display drivers

Core/GPU:
* Fix non-blocking fence wait incorrectly rounding up to 1 jiffy timeout
* Only print GMU fw version once, instead of each time the GPU resumes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtt2AODBXdod8ULXcAygf_qYvwRDVeUVtODx=2jErp6cA@mail.gmail.com
10 months agoMerge tag 'drm-intel-fixes-2025-02-20' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Fri, 21 Feb 2025 00:44:53 +0000 (10:44 +1000)]
Merge tag 'drm-intel-fixes-2025-02-20' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Use spin_lock_irqsave() in interruptible context on guc submission (Krzysztof)
- Fixes on DDI and TRANS programming (Imre)
- Make sure all planes in use by the joiner have their crtc included (Ville)
- Fix 128b/132b modeset issues (Imre)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z7dgcUG_hvityvHn@intel.com
10 months agoMerge tag 'nvme-6.14-2025-02-20' of git://git.infradead.org/nvme into block-6.14
Jens Axboe [Fri, 21 Feb 2025 00:43:59 +0000 (17:43 -0700)]
Merge tag 'nvme-6.14-2025-02-20' of git://git.infradead.org/nvme into block-6.14

Pull NVMe fixes from Keith:

"nvme fixes for Linux 6.14

 - FC controller state check fixes (Daniel)
 - PCI Endpoint fixes (Damien)
 - TCP connection failure fixe (Caleb)
 - TCP handling C2HTermReq PDU (Maurizio)
 - RDMA queue state check (Ruozhu)
 - Apple controller fixes (Hector)
 - Target crash on disbaled namespace (Hannes)"

* tag 'nvme-6.14-2025-02-20' of git://git.infradead.org/nvme:
  nvme: only allow entering LIVE from CONNECTING state
  nvme-fc: rely on state transitions to handle connectivity loss
  apple-nvme: Support coprocessors left idle
  apple-nvme: Release power domains when probe fails
  nvmet: Use enum definitions instead of hardcoded values
  nvme: Cleanup the definition of the controller config register fields
  nvme/ioctl: add missing space in err message
  nvme-tcp: fix connect failure on receiving partial ICResp PDU
  nvme: tcp: Fix compilation warning with W=1
  nvmet: pci-epf: Avoid RCU stalls under heavy workload
  nvmet: pci-epf: Do not uselessly write the CSTS register
  nvmet: pci-epf: Correctly initialize CSTS when enabling the controller
  nvmet-rdma: recheck queue state is LIVE in state lock in recv done
  nvmet: Fix crash when a namespace is disabled
  nvme-tcp: add basic support for the C2HTermReq PDU
  nvme-pci: quirk Acer FA100 for non-uniqueue identifiers

10 months agoMerge tag 'drm-xe-fixes-2025-02-20' of https://gitlab.freedesktop.org/drm/xe/kernel...
Dave Airlie [Fri, 21 Feb 2025 00:42:31 +0000 (10:42 +1000)]
Merge tag 'drm-xe-fixes-2025-02-20' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

- Fix error handling in xe_irq_install (Lucas)
- Fix devcoredump format (Jose, Lucas)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z7dePS3a9POnjrVL@intel.com
10 months agoMerge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Linus Torvalds [Thu, 20 Feb 2025 23:37:17 +0000 (15:37 -0800)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull BPF fixes from Daniel Borkmann:

 - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels
   (Alan Maguire)

 - Fix a missing allocation failure check in BPF verifier's
   acquire_lock_state (Kumar Kartikeya Dwivedi)

 - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb
   to the raw_tp_null_args set (Kuniyuki Iwashima)

 - Fix a deadlock when freeing BPF cgroup storage (Abel Wu)

 - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex
   (Andrii Nakryiko)

 - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is
   accessing skb data not containing an Ethernet header (Shigeru
   Yoshida)

 - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai)

 - Several BPF sockmap fixes to address incorrect TCP copied_seq
   calculations, which prevented correct data reads from recv(2) in user
   space (Jiayuan Chen)

 - Two fixes for BPF map lookup nullness elision (Daniel Xu)

 - Fix a NULL-pointer dereference from vmlinux BTF lookup in
   bpf_sk_storage_tracing_allowed (Jared Kangas)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests: bpf: test batch lookup on array of maps with holes
  bpf: skip non exist keys in generic_map_lookup_batch
  bpf: Handle allocation failure in acquire_lock_state
  bpf: verifier: Disambiguate get_constant_map_key() errors
  bpf: selftests: Test constant key extraction on irrelevant maps
  bpf: verifier: Do not extract constant map keys for irrelevant maps
  bpf: Fix softlockup in arena_map_free on 64k page kernel
  net: Add rx_skb of kfree_skb to raw_tp_null_args[].
  bpf: Fix deadlock when freeing cgroup storage
  selftests/bpf: Add strparser test for bpf
  selftests/bpf: Fix invalid flag of recv()
  bpf: Disable non stream socket for strparser
  bpf: Fix wrong copied_seq calculation
  strparser: Add read_sock callback
  bpf: avoid holding freeze_mutex during mmap operation
  bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic
  selftests/bpf: Adjust data size to have ETH_HLEN
  bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type()
  bpf: Remove unnecessary BTF lookups in bpf_sk_storage_tracing_allowed

10 months agoMAINTAINERS: Change maintainer for RDT
Fenghua Yu [Fri, 31 Jan 2025 19:07:31 +0000 (11:07 -0800)]
MAINTAINERS: Change maintainer for RDT

Due to job transition, I am stepping down as RDT maintainer.
Add Tony as a co-maintainer.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20250131190731.3981085-1-fenghua.yu%40intel.com