Re: Linux 5.3.11

From: Greg KH
Date: Tue Nov 12 2019 - 13:35:15 EST


diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 5f7d7b14fa44..0c28d07654bd 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -486,6 +486,8 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/l1tf
/sys/devices/system/cpu/vulnerabilities/mds
+ /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+ /sys/devices/system/cpu/vulnerabilities/itlb_multihit
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@xxxxxxxxxxxxxxx>
Description: Information about CPU vulnerabilities
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 49311f3da6f2..0795e3c2643f 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -12,3 +12,5 @@ are configurable at compile, boot or run time.
spectre
l1tf
mds
+ tsx_async_abort
+ multihit.rst
diff --git a/Documentation/admin-guide/hw-vuln/multihit.rst b/Documentation/admin-guide/hw-vuln/multihit.rst
new file mode 100644
index 000000000000..ba9988d8bce5
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/multihit.rst
@@ -0,0 +1,163 @@
+iTLB multihit
+=============
+
+iTLB multihit is an erratum where some processors may incur a machine check
+error, possibly resulting in an unrecoverable CPU lockup, when an
+instruction fetch hits multiple entries in the instruction TLB. This can
+occur when the page size is changed along with either the physical address
+or cache type. A malicious guest running on a virtualized system can
+exploit this erratum to perform a denial of service attack.
+
+
+Affected processors
+-------------------
+
+Variations of this erratum are present on most Intel Core and Xeon processor
+models. The erratum is not present on:
+
+ - non-Intel processors
+
+ - Some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont)
+
+ - Intel processors that have the PSCHANGE_MC_NO bit set in the
+ IA32_ARCH_CAPABILITIES MSR.
+
+
+Related CVEs
+------------
+
+The following CVE entry is related to this issue:
+
+ ============== =================================================
+ CVE-2018-12207 Machine Check Error Avoidance on Page Size Change
+ ============== =================================================
+
+
+Problem
+-------
+
+Privileged software, including OS and virtual machine managers (VMM), are in
+charge of memory management. A key component in memory management is the control
+of the page tables. Modern processors use virtual memory, a technique that creates
+the illusion of a very large memory for processors. This virtual space is split
+into pages of a given size. Page tables translate virtual addresses to physical
+addresses.
+
+To reduce latency when performing a virtual to physical address translation,
+processors include a structure, called TLB, that caches recent translations.
+There are separate TLBs for instruction (iTLB) and data (dTLB).
+
+Under this errata, instructions are fetched from a linear address translated
+using a 4 KB translation cached in the iTLB. Privileged software modifies the
+paging structure so that the same linear address using large page size (2 MB, 4
+MB, 1 GB) with a different physical address or memory type. After the page
+structure modification but before the software invalidates any iTLB entries for
+the linear address, a code fetch that happens on the same linear address may
+cause a machine-check error which can result in a system hang or shutdown.
+
+
+Attack scenarios
+----------------
+
+Attacks against the iTLB multihit erratum can be mounted from malicious
+guests in a virtualized system.
+
+
+iTLB multihit system information
+--------------------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current iTLB
+multihit status of the system:whether the system is vulnerable and which
+mitigations are active. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/itlb_multihit
+
+The possible values in this file are:
+
+.. list-table::
+
+ * - Not affected
+ - The processor is not vulnerable.
+ * - KVM: Mitigation: Split huge pages
+ - Software changes mitigate this issue.
+ * - KVM: Vulnerable
+ - The processor is vulnerable, but no mitigation enabled
+
+
+Enumeration of the erratum
+--------------------------------
+
+A new bit has been allocated in the IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) msr
+and will be set on CPU's which are mitigated against this issue.
+
+ ======================================= =========== ===============================
+ IA32_ARCH_CAPABILITIES MSR Not present Possibly vulnerable,check model
+ IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '0' Likely vulnerable,check model
+ IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '1' Not vulnerable
+ ======================================= =========== ===============================
+
+
+Mitigation mechanism
+-------------------------
+
+This erratum can be mitigated by restricting the use of large page sizes to
+non-executable pages. This forces all iTLB entries to be 4K, and removes
+the possibility of multiple hits.
+
+In order to mitigate the vulnerability, KVM initially marks all huge pages
+as non-executable. If the guest attempts to execute in one of those pages,
+the page is broken down into 4K pages, which are then marked executable.
+
+If EPT is disabled or not available on the host, KVM is in control of TLB
+flushes and the problematic situation cannot happen. However, the shadow
+EPT paging mechanism used by nested virtualization is vulnerable, because
+the nested guest can trigger multiple iTLB hits by modifying its own
+(non-nested) page tables. For simplicity, KVM will make large pages
+non-executable in all shadow paging modes.
+
+Mitigation control on the kernel command line and KVM - module parameter
+------------------------------------------------------------------------
+
+The KVM hypervisor mitigation mechanism for marking huge pages as
+non-executable can be controlled with a module parameter "nx_huge_pages=".
+The kernel command line allows to control the iTLB multihit mitigations at
+boot time with the option "kvm.nx_huge_pages=".
+
+The valid arguments for these options are:
+
+ ========== ================================================================
+ force Mitigation is enabled. In this case, the mitigation implements
+ non-executable huge pages in Linux kernel KVM module. All huge
+ pages in the EPT are marked as non-executable.
+ If a guest attempts to execute in one of those pages, the page is
+ broken down into 4K pages, which are then marked executable.
+
+ off Mitigation is disabled.
+
+ auto Enable mitigation only if the platform is affected and the kernel
+ was not booted with the "mitigations=off" command line parameter.
+ This is the default option.
+ ========== ================================================================
+
+
+Mitigation selection guide
+--------------------------
+
+1. No virtualization in use
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The system is protected by the kernel unconditionally and no further
+ action is required.
+
+2. Virtualization with trusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ If the guest comes from a trusted source, you may assume that the guest will
+ not attempt to maliciously exploit these errata and no further action is
+ required.
+
+3. Virtualization with untrusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ If the guest comes from an untrusted source, the guest host kernel will need
+ to apply iTLB multihit mitigation via the kernel command line or kvm
+ module parameter.
diff --git a/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
new file mode 100644
index 000000000000..fddbd7579c53
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
@@ -0,0 +1,276 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+TAA - TSX Asynchronous Abort
+======================================
+
+TAA is a hardware vulnerability that allows unprivileged speculative access to
+data which is available in various CPU internal buffers by using asynchronous
+aborts within an Intel TSX transactional region.
+
+Affected processors
+-------------------
+
+This vulnerability only affects Intel processors that support Intel
+Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
+is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit
+(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
+also mitigate against TAA.
+
+Whether a processor is affected or not can be read out from the TAA
+vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.
+
+Related CVEs
+------------
+
+The following CVE entry is related to this TAA issue:
+
+ ============== ===== ===================================================
+ CVE-2019-11135 TAA TSX Asynchronous Abort (TAA) condition on some
+ microprocessors utilizing speculative execution may
+ allow an authenticated user to potentially enable
+ information disclosure via a side channel with
+ local access.
+ ============== ===== ===================================================
+
+Problem
+-------
+
+When performing store, load or L1 refill operations, processors write
+data into temporary microarchitectural structures (buffers). The data in
+those buffers can be forwarded to load operations as an optimization.
+
+Intel TSX is an extension to the x86 instruction set architecture that adds
+hardware transactional memory support to improve performance of multi-threaded
+software. TSX lets the processor expose and exploit concurrency hidden in an
+application due to dynamically avoiding unnecessary synchronization.
+
+TSX supports atomic memory transactions that are either committed (success) or
+aborted. During an abort, operations that happened within the transactional region
+are rolled back. An asynchronous abort takes place, among other options, when a
+different thread accesses a cache line that is also used within the transactional
+region when that access might lead to a data race.
+
+Immediately after an uncompleted asynchronous abort, certain speculatively
+executed loads may read data from those internal buffers and pass it to dependent
+operations. This can be then used to infer the value via a cache side channel
+attack.
+
+Because the buffers are potentially shared between Hyper-Threads cross
+Hyper-Thread attacks are possible.
+
+The victim of a malicious actor does not need to make use of TSX. Only the
+attacker needs to begin a TSX transaction and raise an asynchronous abort
+which in turn potenitally leaks data stored in the buffers.
+
+More detailed technical information is available in the TAA specific x86
+architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the TAA vulnerability can be implemented from unprivileged
+applications running on hosts or guests.
+
+As for MDS, the attacker has no control over the memory addresses that can
+be leaked. Only the victim is responsible for bringing data to the CPU. As
+a result, the malicious actor has to sample as much data as possible and
+then postprocess it to try to infer any useful information from it.
+
+A potential attacker only has read access to the data. Also, there is no direct
+privilege escalation by using this technique.
+
+
+.. _tsx_async_abort_sys_info:
+
+TAA system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current TAA status
+of mitigated systems. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+
+The possible values in this file are:
+
+.. list-table::
+
+ * - 'Vulnerable'
+ - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
+ * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+ - The system tries to clear the buffers but the microcode might not support the operation.
+ * - 'Mitigation: Clear CPU buffers'
+ - The microcode has been updated to clear the buffers. TSX is still enabled.
+ * - 'Mitigation: TSX disabled'
+ - TSX is disabled.
+ * - 'Not affected'
+ - The CPU is not affected by this issue.
+
+.. _ucode_needed:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the processor is vulnerable, but the availability of the microcode-based
+mitigation mechanism is not advertised via CPUID the kernel selects a best
+effort mitigation mode. This mode invokes the mitigation instructions
+without a guarantee that they clear the CPU buffers.
+
+This is done to address virtualization scenarios where the host has the
+microcode update applied, but the hypervisor is not yet updated to expose the
+CPUID to the guest. If the host has updated microcode the protection takes
+effect; otherwise a few CPU cycles are wasted pointlessly.
+
+The state in the tsx_async_abort sysfs file reflects this situation
+accordingly.
+
+
+Mitigation mechanism
+--------------------
+
+The kernel detects the affected CPUs and the presence of the microcode which is
+required. If a CPU is affected and the microcode is available, then the kernel
+enables the mitigation by default.
+
+
+The mitigation can be controlled at boot time via a kernel command line option.
+See :ref:`taa_mitigation_control_command_line`.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Affected systems where the host has TAA microcode and TAA is mitigated by
+having disabled TSX previously, are not vulnerable regardless of the status
+of the VMs.
+
+In all other cases, if the host either does not have the TAA microcode or
+the kernel is not mitigated, the system might be vulnerable.
+
+
+.. _taa_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows to control the TAA mitigations at boot time with
+the option "tsx_async_abort=". The valid arguments for this option are:
+
+ ============ =============================================================
+ off This option disables the TAA mitigation on affected platforms.
+ If the system has TSX enabled (see next parameter) and the CPU
+ is affected, the system is vulnerable.
+
+ full TAA mitigation is enabled. If TSX is enabled, on an affected
+ system it will clear CPU buffers on ring transitions. On
+ systems which are MDS-affected and deploy MDS mitigation,
+ TAA is also mitigated. Specifying this option on those
+ systems will have no effect.
+
+ full,nosmt The same as tsx_async_abort=full, with SMT disabled on
+ vulnerable CPUs that have TSX enabled. This is the complete
+ mitigation. When TSX is disabled, SMT is not disabled because
+ CPU is not vulnerable to cross-thread TAA attacks.
+ ============ =============================================================
+
+Not specifying this option is equivalent to "tsx_async_abort=full".
+
+The kernel command line also allows to control the TSX feature using the
+parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
+to control the TSX feature and the enumeration of the TSX feature bits (RTM
+and HLE) in CPUID.
+
+The valid options are:
+
+ ============ =============================================================
+ off Disables TSX on the system.
+
+ Note that this option takes effect only on newer CPUs which are
+ not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1
+ and which get the new IA32_TSX_CTRL MSR through a microcode
+ update. This new MSR allows for the reliable deactivation of
+ the TSX functionality.
+
+ on Enables TSX.
+
+ Although there are mitigations for all known security
+ vulnerabilities, TSX has been known to be an accelerator for
+ several previous speculation-related CVEs, and so there may be
+ unknown security risks associated with leaving it enabled.
+
+ auto Disables TSX if X86_BUG_TAA is present, otherwise enables TSX
+ on the system.
+ ============ =============================================================
+
+Not specifying this option is equivalent to "tsx=off".
+
+The following combinations of the "tsx_async_abort" and "tsx" are possible. For
+affected platforms tsx=auto is equivalent to tsx=off and the result will be:
+
+ ========= ========================== =========================================
+ tsx=on tsx_async_abort=full The system will use VERW to clear CPU
+ buffers. Cross-thread attacks are still
+ possible on SMT machines.
+ tsx=on tsx_async_abort=full,nosmt As above, cross-thread attacks on SMT
+ mitigated.
+ tsx=on tsx_async_abort=off The system is vulnerable.
+ tsx=off tsx_async_abort=full TSX might be disabled if microcode
+ provides a TSX control MSR. If so,
+ system is not vulnerable.
+ tsx=off tsx_async_abort=full,nosmt Ditto
+ tsx=off tsx_async_abort=off ditto
+ ========= ========================== =========================================
+
+
+For unaffected platforms "tsx=on" and "tsx_async_abort=full" does not clear CPU
+buffers. For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0)
+"tsx" command line argument has no effect.
+
+For the affected platforms below table indicates the mitigation status for the
+combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits MDS_NO
+and TSX_CTRL_MSR.
+
+ ======= ========= ============= ========================================
+ MDS_NO MD_CLEAR TSX_CTRL_MSR Status
+ ======= ========= ============= ========================================
+ 0 0 0 Vulnerable (needs microcode)
+ 0 1 0 MDS and TAA mitigated via VERW
+ 1 1 0 MDS fixed, TAA vulnerable if TSX enabled
+ because MD_CLEAR has no meaning and
+ VERW is not guaranteed to clear buffers
+ 1 X 1 MDS fixed, TAA can be mitigated by
+ VERW or TSX_CTRL_MSR
+ ======= ========= ============= ========================================
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If all user space applications are from a trusted source and do not execute
+untrusted code which is supplied externally, then the mitigation can be
+disabled. The same applies to virtualized environments with trusted guests.
+
+
+2. Untrusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If there are untrusted applications or guests on the system, enabling TSX
+might allow a malicious actor to leak data from the host or from other
+processes running on the same physical core.
+
+If the microcode is available and the TSX is disabled on the host, attacks
+are prevented in a virtualized environment as well, even if the VMs do not
+explicitly enable the mitigation.
+
+
+.. _taa_default_mitigations:
+
+Default mitigations
+-------------------
+
+The kernel's default action for vulnerable processors is:
+
+ - Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5ea005c9e2d6..49d1719177ea 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2040,6 +2040,25 @@
KVM MMU at runtime.
Default is 0 (off)

+ kvm.nx_huge_pages=
+ [KVM] Controls the software workaround for the
+ X86_BUG_ITLB_MULTIHIT bug.
+ force : Always deploy workaround.
+ off : Never deploy workaround.
+ auto : Deploy workaround based on the presence of
+ X86_BUG_ITLB_MULTIHIT.
+
+ Default is 'auto'.
+
+ If the software workaround is enabled for the host,
+ guests do need not to enable it for nested guests.
+
+ kvm.nx_huge_pages_recovery_ratio=
+ [KVM] Controls how many 4KiB pages are periodically zapped
+ back to huge pages. 0 disables the recovery, otherwise if
+ the value is N KVM will zap 1/Nth of the 4KiB pages every
+ minute. The default is 60.
+
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
Default is 1 (enabled)

@@ -2612,6 +2631,13 @@
ssbd=force-off [ARM64]
l1tf=off [X86]
mds=off [X86]
+ tsx_async_abort=off [X86]
+ kvm.nx_huge_pages=off [X86]
+
+ Exceptions:
+ This does not have any effect on
+ kvm.nx_huge_pages when
+ kvm.nx_huge_pages=force.

auto (default)
Mitigate all CPU vulnerabilities, but leave SMT
@@ -2627,6 +2653,7 @@
be fully mitigated, even if it means losing SMT.
Equivalent to: l1tf=flush,nosmt [X86]
mds=full,nosmt [X86]
+ tsx_async_abort=full,nosmt [X86]

mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -4813,6 +4840,71 @@
interruptions from clocksource watchdog are not
acceptable).

+ tsx= [X86] Control Transactional Synchronization
+ Extensions (TSX) feature in Intel processors that
+ support TSX control.
+
+ This parameter controls the TSX feature. The options are:
+
+ on - Enable TSX on the system. Although there are
+ mitigations for all known security vulnerabilities,
+ TSX has been known to be an accelerator for
+ several previous speculation-related CVEs, and
+ so there may be unknown security risks associated
+ with leaving it enabled.
+
+ off - Disable TSX on the system. (Note that this
+ option takes effect only on newer CPUs which are
+ not vulnerable to MDS, i.e., have
+ MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get
+ the new IA32_TSX_CTRL MSR through a microcode
+ update. This new MSR allows for the reliable
+ deactivation of the TSX functionality.)
+
+ auto - Disable TSX if X86_BUG_TAA is present,
+ otherwise enable TSX on the system.
+
+ Not specifying this option is equivalent to tsx=off.
+
+ See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+ for more details.
+
+ tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async
+ Abort (TAA) vulnerability.
+
+ Similar to Micro-architectural Data Sampling (MDS)
+ certain CPUs that support Transactional
+ Synchronization Extensions (TSX) are vulnerable to an
+ exploit against CPU internal buffers which can forward
+ information to a disclosure gadget under certain
+ conditions.
+
+ In vulnerable processors, the speculatively forwarded
+ data can be used in a cache side channel attack, to
+ access data to which the attacker does not have direct
+ access.
+
+ This parameter controls the TAA mitigation. The
+ options are:
+
+ full - Enable TAA mitigation on vulnerable CPUs
+ if TSX is enabled.
+
+ full,nosmt - Enable TAA mitigation and disable SMT on
+ vulnerable CPUs. If TSX is disabled, SMT
+ is not disabled because CPU is not
+ vulnerable to cross-thread TAA attacks.
+ off - Unconditionally disable TAA mitigation
+
+ Not specifying this option is equivalent to
+ tsx_async_abort=full. On CPUs which are MDS affected
+ and deploy MDS mitigation, TAA mitigation is not
+ required and doesn't provide any additional
+ mitigation.
+
+ For details see:
+ Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+
turbografx.map[2|3]= [HW,JOY]
TurboGraFX parallel port interface
Format:
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 6e52d334bc55..d5f72a5b214f 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -91,6 +91,11 @@ stable kernels.
| ARM | MMU-500 | #841119,826419 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
+| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
++----------------+-----------------+-----------------+-----------------------------+
+| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_843419 |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX ITS | #22375,24313 | CAVIUM_ERRATUM_22375 |
+----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX ITS | #23144 | CAVIUM_ERRATUM_23144 |
@@ -124,7 +129,7 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 |
+----------------+-----------------+-----------------+-----------------------------+
-| Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009 |
+| Qualcomm Tech. | Kryo/Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009 |
+----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 |
+----------------+-----------------+-----------------+-----------------------------+
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index af64c4bb4447..a8de2fbc1caa 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -27,6 +27,7 @@ x86-specific Documentation
mds
microcode
resctrl_ui
+ tsx_async_abort
usb-legacy-support
i386/index
x86_64/index
diff --git a/Documentation/x86/tsx_async_abort.rst b/Documentation/x86/tsx_async_abort.rst
new file mode 100644
index 000000000000..583ddc185ba2
--- /dev/null
+++ b/Documentation/x86/tsx_async_abort.rst
@@ -0,0 +1,117 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+TSX Async Abort (TAA) mitigation
+================================
+
+.. _tsx_async_abort:
+
+Overview
+--------
+
+TSX Async Abort (TAA) is a side channel attack on internal buffers in some
+Intel processors similar to Microachitectural Data Sampling (MDS). In this
+case certain loads may speculatively pass invalid data to dependent operations
+when an asynchronous abort condition is pending in a Transactional
+Synchronization Extensions (TSX) transaction. This includes loads with no
+fault or assist condition. Such loads may speculatively expose stale data from
+the same uarch data structures as in MDS, with same scope of exposure i.e.
+same-thread and cross-thread. This issue affects all current processors that
+support TSX.
+
+Mitigation strategy
+-------------------
+
+a) TSX disable - one of the mitigations is to disable TSX. A new MSR
+IA32_TSX_CTRL will be available in future and current processors after
+microcode update which can be used to disable TSX. In addition, it
+controls the enumeration of the TSX feature bits (RTM and HLE) in CPUID.
+
+b) Clear CPU buffers - similar to MDS, clearing the CPU buffers mitigates this
+vulnerability. More details on this approach can be found in
+:ref:`Documentation/admin-guide/hw-vuln/mds.rst <mds>`.
+
+Kernel internal mitigation modes
+--------------------------------
+
+ ============= ============================================================
+ off Mitigation is disabled. Either the CPU is not affected or
+ tsx_async_abort=off is supplied on the kernel command line.
+
+ tsx disabled Mitigation is enabled. TSX feature is disabled by default at
+ bootup on processors that support TSX control.
+
+ verw Mitigation is enabled. CPU is affected and MD_CLEAR is
+ advertised in CPUID.
+
+ ucode needed Mitigation is enabled. CPU is affected and MD_CLEAR is not
+ advertised in CPUID. That is mainly for virtualization
+ scenarios where the host has the updated microcode but the
+ hypervisor does not expose MD_CLEAR in CPUID. It's a best
+ effort approach without guarantee.
+ ============= ============================================================
+
+If the CPU is affected and the "tsx_async_abort" kernel command line parameter is
+not provided then the kernel selects an appropriate mitigation depending on the
+status of RTM and MD_CLEAR CPUID bits.
+
+Below tables indicate the impact of tsx=on|off|auto cmdline options on state of
+TAA mitigation, VERW behavior and TSX feature for various combinations of
+MSR_IA32_ARCH_CAPABILITIES bits.
+
+1. "tsx=off"
+
+========= ========= ============ ============ ============== =================== ======================
+MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=off
+---------------------------------- -------------------------------------------------------------------------
+TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation
+ after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full
+========= ========= ============ ============ ============== =================== ======================
+ 0 0 0 HW default Yes Same as MDS Same as MDS
+ 0 0 1 Invalid case Invalid case Invalid case Invalid case
+ 0 1 0 HW default No Need ucode update Need ucode update
+ 0 1 1 Disabled Yes TSX disabled TSX disabled
+ 1 X 1 Disabled X None needed None needed
+========= ========= ============ ============ ============== =================== ======================
+
+2. "tsx=on"
+
+========= ========= ============ ============ ============== =================== ======================
+MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=on
+---------------------------------- -------------------------------------------------------------------------
+TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation
+ after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full
+========= ========= ============ ============ ============== =================== ======================
+ 0 0 0 HW default Yes Same as MDS Same as MDS
+ 0 0 1 Invalid case Invalid case Invalid case Invalid case
+ 0 1 0 HW default No Need ucode update Need ucode update
+ 0 1 1 Enabled Yes None Same as MDS
+ 1 X 1 Enabled X None needed None needed
+========= ========= ============ ============ ============== =================== ======================
+
+3. "tsx=auto"
+
+========= ========= ============ ============ ============== =================== ======================
+MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=auto
+---------------------------------- -------------------------------------------------------------------------
+TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation
+ after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full
+========= ========= ============ ============ ============== =================== ======================
+ 0 0 0 HW default Yes Same as MDS Same as MDS
+ 0 0 1 Invalid case Invalid case Invalid case Invalid case
+ 0 1 0 HW default No Need ucode update Need ucode update
+ 0 1 1 Disabled Yes TSX disabled TSX disabled
+ 1 X 1 Enabled X None needed None needed
+========= ========= ============ ============ ============== =================== ======================
+
+In the tables, TSX_CTRL_MSR is a new bit in MSR_IA32_ARCH_CAPABILITIES that
+indicates whether MSR_IA32_TSX_CTRL is supported.
+
+There are two control bits in IA32_TSX_CTRL MSR:
+
+ Bit 0: When set it disables the Restricted Transactional Memory (RTM)
+ sub-feature of TSX (will force all transactions to abort on the
+ XBEGIN instruction).
+
+ Bit 1: When set it disables the enumeration of the RTM and HLE feature
+ (i.e. it will make CPUID(EAX=7).EBX{bit4} and
+ CPUID(EAX=7).EBX{bit11} read as 0).
diff --git a/Makefile b/Makefile
index e2a8b4534da5..40148c01ffe2 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
VERSION = 5
PATCHLEVEL = 3
-SUBLEVEL = 10
+SUBLEVEL = 11
EXTRAVERSION =
NAME = Bobtail Squid

diff --git a/arch/arc/boot/dts/hsdk.dts b/arch/arc/boot/dts/hsdk.dts
index bfc7f5f5d6f2..9bea5daadd23 100644
--- a/arch/arc/boot/dts/hsdk.dts
+++ b/arch/arc/boot/dts/hsdk.dts
@@ -264,6 +264,14 @@
clocks = <&input_clk>;
cs-gpios = <&creg_gpio 0 GPIO_ACTIVE_LOW>,
<&creg_gpio 1 GPIO_ACTIVE_LOW>;
+
+ spi-flash@0 {
+ compatible = "sst26wf016b", "jedec,spi-nor";
+ reg = <0>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ spi-max-frequency = <4000000>;
+ };
};

creg_gpio: gpio@14b0 {
diff --git a/arch/arc/configs/hsdk_defconfig b/arch/arc/configs/hsdk_defconfig
index 403125d9c9a3..fe9de80e41ee 100644
--- a/arch/arc/configs/hsdk_defconfig
+++ b/arch/arc/configs/hsdk_defconfig
@@ -31,6 +31,8 @@ CONFIG_INET=y
CONFIG_DEVTMPFS=y
# CONFIG_STANDALONE is not set
# CONFIG_PREVENT_FIRMWARE_BUILD is not set
+CONFIG_MTD=y
+CONFIG_MTD_SPI_NOR=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_NETDEVICES=y
diff --git a/arch/arm/boot/dts/imx6-logicpd-baseboard.dtsi b/arch/arm/boot/dts/imx6-logicpd-baseboard.dtsi
index 2a6ce87071f9..9e027b9a5f91 100644
--- a/arch/arm/boot/dts/imx6-logicpd-baseboard.dtsi
+++ b/arch/arm/boot/dts/imx6-logicpd-baseboard.dtsi
@@ -328,6 +328,10 @@
pinctrl-0 = <&pinctrl_pwm3>;
};

+&snvs_pwrkey {
+ status = "okay";
+};
+
&ssi2 {
status = "okay";
};
diff --git a/arch/arm/boot/dts/stm32mp157c-ev1.dts b/arch/arm/boot/dts/stm32mp157c-ev1.dts
index feb8f7727270..541bad97248a 100644
--- a/arch/arm/boot/dts/stm32mp157c-ev1.dts
+++ b/arch/arm/boot/dts/stm32mp157c-ev1.dts
@@ -206,7 +206,6 @@

joystick_pins: joystick {
pins = "gpio0", "gpio1", "gpio2", "gpio3", "gpio4";
- drive-push-pull;
bias-pull-down;
};

diff --git a/arch/arm/mach-sunxi/mc_smp.c b/arch/arm/mach-sunxi/mc_smp.c
index 239084cf8192..26cbce135338 100644
--- a/arch/arm/mach-sunxi/mc_smp.c
+++ b/arch/arm/mach-sunxi/mc_smp.c
@@ -481,14 +481,18 @@ static void sunxi_mc_smp_cpu_die(unsigned int l_cpu)
static int sunxi_cpu_powerdown(unsigned int cpu, unsigned int cluster)
{
u32 reg;
+ int gating_bit = cpu;

pr_debug("%s: cluster %u cpu %u\n", __func__, cluster, cpu);
if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
return -EINVAL;

+ if (is_a83t && cpu == 0)
+ gating_bit = 4;
+
/* gate processor power */
reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
- reg |= PRCM_PWROFF_GATING_REG_CORE(cpu);
+ reg |= PRCM_PWROFF_GATING_REG_CORE(gating_bit);
writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
udelay(20);

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index b1454d117cd2..aca07c2f6e6e 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -79,6 +79,7 @@
#define CAVIUM_CPU_PART_THUNDERX_83XX 0x0A3
#define CAVIUM_CPU_PART_THUNDERX2 0x0AF

+#define BRCM_CPU_PART_BRAHMA_B53 0x100
#define BRCM_CPU_PART_VULCAN 0x516

#define QCOM_CPU_PART_FALKOR_V1 0x800
@@ -105,6 +106,7 @@
#define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
#define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
#define MIDR_CAVIUM_THUNDERX2 MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX2)
+#define MIDR_BRAHMA_B53 MIDR_CPU_MODEL(ARM_CPU_IMP_BRCM, BRCM_CPU_PART_BRAHMA_B53)
#define MIDR_BRCM_VULCAN MIDR_CPU_MODEL(ARM_CPU_IMP_BRCM, BRCM_CPU_PART_VULCAN)
#define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 8eb5c0fbdee6..b15f90511d4f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -283,23 +283,6 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
set_pte(ptep, pte);
}

-#define __HAVE_ARCH_PTE_SAME
-static inline int pte_same(pte_t pte_a, pte_t pte_b)
-{
- pteval_t lhs, rhs;
-
- lhs = pte_val(pte_a);
- rhs = pte_val(pte_b);
-
- if (pte_present(pte_a))
- lhs &= ~PTE_RDONLY;
-
- if (pte_present(pte_b))
- rhs &= ~PTE_RDONLY;
-
- return (lhs == rhs);
-}
-
/*
* Huge pte definitions.
*/
diff --git a/arch/arm64/include/asm/vdso/vsyscall.h b/arch/arm64/include/asm/vdso/vsyscall.h
index 0c731bfc7c8c..0c20a7c1bee5 100644
--- a/arch/arm64/include/asm/vdso/vsyscall.h
+++ b/arch/arm64/include/asm/vdso/vsyscall.h
@@ -30,13 +30,6 @@ int __arm64_get_clock_mode(struct timekeeper *tk)
}
#define __arch_get_clock_mode __arm64_get_clock_mode

-static __always_inline
-int __arm64_use_vsyscall(struct vdso_data *vdata)
-{
- return !vdata[CS_HRES_COARSE].clock_mode;
-}
-#define __arch_use_vsyscall __arm64_use_vsyscall
-
static __always_inline
void __arm64_update_vsyscall(struct vdso_data *vdata, struct timekeeper *tk)
{
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 1e0b9ae9bf7e..169549f939e2 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -129,8 +129,8 @@ static void install_bp_hardening_cb(bp_hardening_cb_t fn,
int cpu, slot = -1;

/*
- * enable_smccc_arch_workaround_1() passes NULL for the hyp_vecs
- * start/end if we're a guest. Skip the hyp-vectors work.
+ * detect_harden_bp_fw() passes NULL for the hyp_vecs start/end if
+ * we're a guest. Skip the hyp-vectors work.
*/
if (!hyp_vecs_start) {
__this_cpu_write(bp_hardening_data.fn, fn);
@@ -489,6 +489,7 @@ static const struct midr_range arm64_ssb_cpus[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
+ MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
{},
};

@@ -573,6 +574,7 @@ static const struct midr_range spectre_v2_safe_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
+ MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
{ /* sentinel */ }
};

@@ -659,17 +661,23 @@ static const struct midr_range arm64_harden_el2_vectors[] = {
#endif

#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
-
-static const struct midr_range arm64_repeat_tlbi_cpus[] = {
+static const struct arm64_cpu_capabilities arm64_repeat_tlbi_list[] = {
#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1009
- MIDR_RANGE(MIDR_QCOM_FALKOR_V1, 0, 0, 0, 0),
+ {
+ ERRATA_MIDR_REV(MIDR_QCOM_FALKOR_V1, 0, 0)
+ },
+ {
+ .midr_range.model = MIDR_QCOM_KRYO,
+ .matches = is_kryo_midr,
+ },
#endif
#ifdef CONFIG_ARM64_ERRATUM_1286807
- MIDR_RANGE(MIDR_CORTEX_A76, 0, 0, 3, 0),
+ {
+ ERRATA_MIDR_RANGE(MIDR_CORTEX_A76, 0, 0, 3, 0),
+ },
#endif
{},
};
-
#endif

#ifdef CONFIG_CAVIUM_ERRATUM_27456
@@ -737,6 +745,33 @@ static const struct midr_range erratum_1418040_list[] = {
};
#endif

+#ifdef CONFIG_ARM64_ERRATUM_845719
+static const struct midr_range erratum_845719_list[] = {
+ /* Cortex-A53 r0p[01234] */
+ MIDR_REV_RANGE(MIDR_CORTEX_A53, 0, 0, 4),
+ /* Brahma-B53 r0p[0] */
+ MIDR_REV(MIDR_BRAHMA_B53, 0, 0),
+ {},
+};
+#endif
+
+#ifdef CONFIG_ARM64_ERRATUM_843419
+static const struct arm64_cpu_capabilities erratum_843419_list[] = {
+ {
+ /* Cortex-A53 r0p[01234] */
+ .matches = is_affected_midr_range,
+ ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A53, 0, 0, 4),
+ MIDR_FIXED(0x4, BIT(8)),
+ },
+ {
+ /* Brahma-B53 r0p[0] */
+ .matches = is_affected_midr_range,
+ ERRATA_MIDR_REV(MIDR_BRAHMA_B53, 0, 0),
+ },
+ {},
+};
+#endif
+
const struct arm64_cpu_capabilities arm64_errata[] = {
#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
{
@@ -768,19 +803,18 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
#endif
#ifdef CONFIG_ARM64_ERRATUM_843419
{
- /* Cortex-A53 r0p[01234] */
.desc = "ARM erratum 843419",
.capability = ARM64_WORKAROUND_843419,
- ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A53, 0, 0, 4),
- MIDR_FIXED(0x4, BIT(8)),
+ .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
+ .matches = cpucap_multi_entry_cap_matches,
+ .match_list = erratum_843419_list,
},
#endif
#ifdef CONFIG_ARM64_ERRATUM_845719
{
- /* Cortex-A53 r0p[01234] */
.desc = "ARM erratum 845719",
.capability = ARM64_WORKAROUND_845719,
- ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A53, 0, 0, 4),
+ ERRATA_MIDR_RANGE_LIST(erratum_845719_list),
},
#endif
#ifdef CONFIG_CAVIUM_ERRATUM_23154
@@ -825,7 +859,9 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
{
.desc = "Qualcomm erratum 1009, ARM erratum 1286807",
.capability = ARM64_WORKAROUND_REPEAT_TLBI,
- ERRATA_MIDR_RANGE_LIST(arm64_repeat_tlbi_cpus),
+ .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
+ .matches = cpucap_multi_entry_cap_matches,
+ .match_list = arm64_repeat_tlbi_list,
},
#endif
#ifdef CONFIG_ARM64_ERRATUM_858921
diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 677e9babef80..f9dc597b0b86 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -91,6 +91,7 @@

static inline void kuap_update_sr(u32 sr, u32 addr, u32 end)
{
+ addr &= 0xf0000000; /* align addr to start of segment */
barrier(); /* make sure thread.kuap is updated before playing with SRs */
while (addr < end) {
mtsrin(sr, addr);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index d7fcdfa7fee4..ec2547cc5ecb 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -36,8 +36,8 @@
#include "book3s.h"
#include "trace.h"

-#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
-#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+#define VM_STAT(x, ...) offsetof(struct kvm, stat.x), KVM_STAT_VM, ## __VA_ARGS__
+#define VCPU_STAT(x, ...) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU, ## __VA_ARGS__

/* #define EXIT_DEBUG */

@@ -69,8 +69,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "pthru_all", VCPU_STAT(pthru_all) },
{ "pthru_host", VCPU_STAT(pthru_host) },
{ "pthru_bad_aff", VCPU_STAT(pthru_bad_aff) },
- { "largepages_2M", VM_STAT(num_2M_pages) },
- { "largepages_1G", VM_STAT(num_1G_pages) },
+ { "largepages_2M", VM_STAT(num_2M_pages, .mode = 0444) },
+ { "largepages_1G", VM_STAT(num_1G_pages, .mode = 0444) },
{ NULL }
};

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 222855cc0158..8a717c681d3c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1934,6 +1934,51 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS

If unsure, say y.

+choice
+ prompt "TSX enable mode"
+ depends on CPU_SUP_INTEL
+ default X86_INTEL_TSX_MODE_OFF
+ help
+ Intel's TSX (Transactional Synchronization Extensions) feature
+ allows to optimize locking protocols through lock elision which
+ can lead to a noticeable performance boost.
+
+ On the other hand it has been shown that TSX can be exploited
+ to form side channel attacks (e.g. TAA) and chances are there
+ will be more of those attacks discovered in the future.
+
+ Therefore TSX is not enabled by default (aka tsx=off). An admin
+ might override this decision by tsx=on the command line parameter.
+ Even with TSX enabled, the kernel will attempt to enable the best
+ possible TAA mitigation setting depending on the microcode available
+ for the particular machine.
+
+ This option allows to set the default tsx mode between tsx=on, =off
+ and =auto. See Documentation/admin-guide/kernel-parameters.txt for more
+ details.
+
+ Say off if not sure, auto if TSX is in use but it should be used on safe
+ platforms or on if TSX is in use and the security aspect of tsx is not
+ relevant.
+
+config X86_INTEL_TSX_MODE_OFF
+ bool "off"
+ help
+ TSX is disabled if possible - equals to tsx=off command line parameter.
+
+config X86_INTEL_TSX_MODE_ON
+ bool "on"
+ help
+ TSX is always enabled on TSX capable HW - equals the tsx=on command
+ line parameter.
+
+config X86_INTEL_TSX_MODE_AUTO
+ bool "auto"
+ help
+ TSX is enabled on TSX capable HW that is believed to be safe against
+ side channel attacks- equals the tsx=auto command line parameter.
+endchoice
+
config EFI
bool "EFI runtime service support"
depends on ACPI
diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index d6662fdef300..82bc60c8acb2 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -13,6 +13,7 @@
#include <asm/e820/types.h>
#include <asm/setup.h>
#include <asm/desc.h>
+#include <asm/boot.h>

#include "../string.h"
#include "eboot.h"
@@ -813,7 +814,8 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
status = efi_relocate_kernel(sys_table, &bzimage_addr,
hdr->init_size, hdr->init_size,
hdr->pref_address,
- hdr->kernel_alignment);
+ hdr->kernel_alignment,
+ LOAD_PHYSICAL_ADDR);
if (status != EFI_SUCCESS) {
efi_printk(sys_table, "efi_relocate_kernel() failed!\n");
goto fail;
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 5b35b7ea5d72..26c36357c4c9 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -377,7 +377,8 @@ static inline void perf_ibs_disable_event(struct perf_ibs *perf_ibs,
struct hw_perf_event *hwc, u64 config)
{
config &= ~perf_ibs->cnt_mask;
- wrmsrl(hwc->config_base, config);
+ if (boot_cpu_data.x86 == 0x10)
+ wrmsrl(hwc->config_base, config);
config &= ~perf_ibs->enable_mask;
wrmsrl(hwc->config_base, config);
}
@@ -553,7 +554,8 @@ static struct perf_ibs perf_ibs_op = {
},
.msr = MSR_AMD64_IBSOPCTL,
.config_mask = IBS_OP_CONFIG_MASK,
- .cnt_mask = IBS_OP_MAX_CNT,
+ .cnt_mask = IBS_OP_MAX_CNT | IBS_OP_CUR_CNT |
+ IBS_OP_CUR_CNT_RAND,
.enable_mask = IBS_OP_ENABLE,
.valid_mask = IBS_OP_VAL,
.max_period = IBS_OP_MAX_CNT << 4,
@@ -614,7 +616,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
if (event->attr.sample_type & PERF_SAMPLE_RAW)
offset_max = perf_ibs->offset_max;
else if (check_rip)
- offset_max = 2;
+ offset_max = 3;
else
offset_max = 1;
do {
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 3694a5d0703d..f7b191d3c9b0 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -502,10 +502,8 @@ void uncore_pmu_event_start(struct perf_event *event, int flags)
local64_set(&event->hw.prev_count, uncore_read_counter(box, event));
uncore_enable_event(box, event);

- if (box->n_active == 1) {
- uncore_enable_box(box);
+ if (box->n_active == 1)
uncore_pmu_start_hrtimer(box);
- }
}

void uncore_pmu_event_stop(struct perf_event *event, int flags)
@@ -529,10 +527,8 @@ void uncore_pmu_event_stop(struct perf_event *event, int flags)
WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
hwc->state |= PERF_HES_STOPPED;

- if (box->n_active == 0) {
- uncore_disable_box(box);
+ if (box->n_active == 0)
uncore_pmu_cancel_hrtimer(box);
- }
}

if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
@@ -778,6 +774,40 @@ static int uncore_pmu_event_init(struct perf_event *event)
return ret;
}

+static void uncore_pmu_enable(struct pmu *pmu)
+{
+ struct intel_uncore_pmu *uncore_pmu;
+ struct intel_uncore_box *box;
+
+ uncore_pmu = container_of(pmu, struct intel_uncore_pmu, pmu);
+ if (!uncore_pmu)
+ return;
+
+ box = uncore_pmu_to_box(uncore_pmu, smp_processor_id());
+ if (!box)
+ return;
+
+ if (uncore_pmu->type->ops->enable_box)
+ uncore_pmu->type->ops->enable_box(box);
+}
+
+static void uncore_pmu_disable(struct pmu *pmu)
+{
+ struct intel_uncore_pmu *uncore_pmu;
+ struct intel_uncore_box *box;
+
+ uncore_pmu = container_of(pmu, struct intel_uncore_pmu, pmu);
+ if (!uncore_pmu)
+ return;
+
+ box = uncore_pmu_to_box(uncore_pmu, smp_processor_id());
+ if (!box)
+ return;
+
+ if (uncore_pmu->type->ops->disable_box)
+ uncore_pmu->type->ops->disable_box(box);
+}
+
static ssize_t uncore_get_attr_cpumask(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -803,6 +833,8 @@ static int uncore_pmu_register(struct intel_uncore_pmu *pmu)
pmu->pmu = (struct pmu) {
.attr_groups = pmu->type->attr_groups,
.task_ctx_nr = perf_invalid_context,
+ .pmu_enable = uncore_pmu_enable,
+ .pmu_disable = uncore_pmu_disable,
.event_init = uncore_pmu_event_init,
.add = uncore_pmu_event_add,
.del = uncore_pmu_event_del,
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index f36f7bebbc1b..bbfdaa720b45 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -441,18 +441,6 @@ static inline int uncore_freerunning_hw_config(struct intel_uncore_box *box,
return -EINVAL;
}

-static inline void uncore_disable_box(struct intel_uncore_box *box)
-{
- if (box->pmu->type->ops->disable_box)
- box->pmu->type->ops->disable_box(box);
-}
-
-static inline void uncore_enable_box(struct intel_uncore_box *box)
-{
- if (box->pmu->type->ops->enable_box)
- box->pmu->type->ops->enable_box(box);
-}
-
static inline void uncore_disable_event(struct intel_uncore_box *box,
struct perf_event *event)
{
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index e880f2408e29..5f9ae6b5be35 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -397,5 +397,7 @@
#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */
#define X86_BUG_SWAPGS X86_BUG(21) /* CPU is affected by speculation through SWAPGS */
+#define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
+#define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */

#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dd0ca154a958..f68e174f452f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -319,8 +319,11 @@ struct kvm_rmap_head {
struct kvm_mmu_page {
struct list_head link;
struct hlist_node hash_link;
+ struct list_head lpage_disallowed_link;
+
bool unsync;
bool mmio_cached;
+ bool lpage_disallowed; /* Can't be replaced by an equiv large page */

/*
* The following two entries are used to key the shadow page in the
@@ -863,6 +866,7 @@ struct kvm_arch {
* Hash table of struct kvm_mmu_page.
*/
struct list_head active_mmu_pages;
+ struct list_head lpage_disallowed_mmu_pages;
struct kvm_page_track_notifier_node mmu_sp_tracker;
struct kvm_page_track_notifier_head track_notifier_head;

@@ -937,6 +941,7 @@ struct kvm_arch {
bool exception_payload_enabled;

struct kvm_pmu_event_filter *pmu_event_filter;
+ struct task_struct *nx_lpage_recovery_thread;
};

struct kvm_vm_stat {
@@ -950,6 +955,7 @@ struct kvm_vm_stat {
ulong mmu_unsync;
ulong remote_tlb_flush;
ulong lpages;
+ ulong nx_lpage_splits;
ulong max_mmu_page_hash_collisions;
};

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 271d837d69a8..f1fbb29539c4 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -93,6 +93,18 @@
* Microarchitectural Data
* Sampling (MDS) vulnerabilities.
*/
+#define ARCH_CAP_PSCHANGE_MC_NO BIT(6) /*
+ * The processor is not susceptible to a
+ * machine check error due to modifying the
+ * code page size along with either the
+ * physical address or cache type
+ * without TLB invalidation.
+ */
+#define ARCH_CAP_TSX_CTRL_MSR BIT(7) /* MSR for TSX control is available. */
+#define ARCH_CAP_TAA_NO BIT(8) /*
+ * Not susceptible to
+ * TSX Async Abort (TAA) vulnerabilities.
+ */

#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
@@ -103,6 +115,10 @@
#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e

+#define MSR_IA32_TSX_CTRL 0x00000122
+#define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */
+#define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */
+
#define MSR_IA32_SYSENTER_CS 0x00000174
#define MSR_IA32_SYSENTER_ESP 0x00000175
#define MSR_IA32_SYSENTER_EIP 0x00000176
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 80bc209c0708..5c24a7b35166 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -314,7 +314,7 @@ DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
#include <asm/segment.h>

/**
- * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
*
* This uses the otherwise unused and obsolete VERW instruction in
* combination with microcode which triggers a CPU buffer flush when the
@@ -337,7 +337,7 @@ static inline void mds_clear_cpu_buffers(void)
}

/**
- * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
*
* Clear CPU buffers if the corresponding static key is enabled
*/
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6e0a3b43d027..54f5d54280f6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -988,4 +988,11 @@ enum mds_mitigations {
MDS_MITIGATION_VMWERV,
};

+enum taa_mitigations {
+ TAA_MITIGATION_OFF,
+ TAA_MITIGATION_UCODE_NEEDED,
+ TAA_MITIGATION_VERW,
+ TAA_MITIGATION_TSX_DISABLED,
+};
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ad0d5ced82b3..2c6bab985a6a 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1573,9 +1573,6 @@ static void setup_local_APIC(void)
{
int cpu = smp_processor_id();
unsigned int value;
-#ifdef CONFIG_X86_32
- int logical_apicid, ldr_apicid;
-#endif


if (disable_apic) {
@@ -1616,16 +1613,21 @@ static void setup_local_APIC(void)
apic->init_apic_ldr();

#ifdef CONFIG_X86_32
- /*
- * APIC LDR is initialized. If logical_apicid mapping was
- * initialized during get_smp_config(), make sure it matches the
- * actual value.
- */
- logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
- ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
- WARN_ON(logical_apicid != BAD_APICID && logical_apicid != ldr_apicid);
- /* always use the value from LDR */
- early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid;
+ if (apic->dest_logical) {
+ int logical_apicid, ldr_apicid;
+
+ /*
+ * APIC LDR is initialized. If logical_apicid mapping was
+ * initialized during get_smp_config(), make sure it matches
+ * the actual value.
+ */
+ logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+ ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
+ if (logical_apicid != BAD_APICID)
+ WARN_ON(logical_apicid != ldr_apicid);
+ /* Always use the value from LDR. */
+ early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid;
+ }
#endif

/*
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index d7a1e5a9331c..890f60083eca 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -30,7 +30,7 @@ obj-$(CONFIG_PROC_FS) += proc.o
obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o

ifdef CONFIG_CPU_SUP_INTEL
-obj-y += intel.o intel_pconfig.o
+obj-y += intel.o intel_pconfig.o tsx.o
obj-$(CONFIG_PM) += intel_epb.o
endif
obj-$(CONFIG_CPU_SUP_AMD) += amd.o
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index c6fa3ef10b4e..9b7586204cd2 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -39,6 +39,7 @@ static void __init spectre_v2_select_mitigation(void);
static void __init ssb_select_mitigation(void);
static void __init l1tf_select_mitigation(void);
static void __init mds_select_mitigation(void);
+static void __init taa_select_mitigation(void);

/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
u64 x86_spec_ctrl_base;
@@ -105,6 +106,7 @@ void __init check_bugs(void)
ssb_select_mitigation();
l1tf_select_mitigation();
mds_select_mitigation();
+ taa_select_mitigation();

arch_smt_update();

@@ -268,6 +270,100 @@ static int __init mds_cmdline(char *str)
}
early_param("mds", mds_cmdline);

+#undef pr_fmt
+#define pr_fmt(fmt) "TAA: " fmt
+
+/* Default mitigation for TAA-affected CPUs */
+static enum taa_mitigations taa_mitigation __ro_after_init = TAA_MITIGATION_VERW;
+static bool taa_nosmt __ro_after_init;
+
+static const char * const taa_strings[] = {
+ [TAA_MITIGATION_OFF] = "Vulnerable",
+ [TAA_MITIGATION_UCODE_NEEDED] = "Vulnerable: Clear CPU buffers attempted, no microcode",
+ [TAA_MITIGATION_VERW] = "Mitigation: Clear CPU buffers",
+ [TAA_MITIGATION_TSX_DISABLED] = "Mitigation: TSX disabled",
+};
+
+static void __init taa_select_mitigation(void)
+{
+ u64 ia32_cap;
+
+ if (!boot_cpu_has_bug(X86_BUG_TAA)) {
+ taa_mitigation = TAA_MITIGATION_OFF;
+ return;
+ }
+
+ /* TSX previously disabled by tsx=off */
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ taa_mitigation = TAA_MITIGATION_TSX_DISABLED;
+ goto out;
+ }
+
+ if (cpu_mitigations_off()) {
+ taa_mitigation = TAA_MITIGATION_OFF;
+ return;
+ }
+
+ /* TAA mitigation is turned off on the cmdline (tsx_async_abort=off) */
+ if (taa_mitigation == TAA_MITIGATION_OFF)
+ goto out;
+
+ if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ taa_mitigation = TAA_MITIGATION_VERW;
+ else
+ taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
+
+ /*
+ * VERW doesn't clear the CPU buffers when MD_CLEAR=1 and MDS_NO=1.
+ * A microcode update fixes this behavior to clear CPU buffers. It also
+ * adds support for MSR_IA32_TSX_CTRL which is enumerated by the
+ * ARCH_CAP_TSX_CTRL_MSR bit.
+ *
+ * On MDS_NO=1 CPUs if ARCH_CAP_TSX_CTRL_MSR is not set, microcode
+ * update is required.
+ */
+ ia32_cap = x86_read_arch_cap_msr();
+ if ( (ia32_cap & ARCH_CAP_MDS_NO) &&
+ !(ia32_cap & ARCH_CAP_TSX_CTRL_MSR))
+ taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
+
+ /*
+ * TSX is enabled, select alternate mitigation for TAA which is
+ * the same as MDS. Enable MDS static branch to clear CPU buffers.
+ *
+ * For guests that can't determine whether the correct microcode is
+ * present on host, enable the mitigation for UCODE_NEEDED as well.
+ */
+ static_branch_enable(&mds_user_clear);
+
+ if (taa_nosmt || cpu_mitigations_auto_nosmt())
+ cpu_smt_disable(false);
+
+out:
+ pr_info("%s\n", taa_strings[taa_mitigation]);
+}
+
+static int __init tsx_async_abort_parse_cmdline(char *str)
+{
+ if (!boot_cpu_has_bug(X86_BUG_TAA))
+ return 0;
+
+ if (!str)
+ return -EINVAL;
+
+ if (!strcmp(str, "off")) {
+ taa_mitigation = TAA_MITIGATION_OFF;
+ } else if (!strcmp(str, "full")) {
+ taa_mitigation = TAA_MITIGATION_VERW;
+ } else if (!strcmp(str, "full,nosmt")) {
+ taa_mitigation = TAA_MITIGATION_VERW;
+ taa_nosmt = true;
+ }
+
+ return 0;
+}
+early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
+
#undef pr_fmt
#define pr_fmt(fmt) "Spectre V1 : " fmt

@@ -786,13 +882,10 @@ static void update_mds_branch_idle(void)
}

#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"

void arch_smt_update(void)
{
- /* Enhanced IBRS implies STIBP. No update required. */
- if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
- return;
-
mutex_lock(&spec_ctrl_mutex);

switch (spectre_v2_user) {
@@ -819,6 +912,17 @@ void arch_smt_update(void)
break;
}

+ switch (taa_mitigation) {
+ case TAA_MITIGATION_VERW:
+ case TAA_MITIGATION_UCODE_NEEDED:
+ if (sched_smt_active())
+ pr_warn_once(TAA_MSG_SMT);
+ break;
+ case TAA_MITIGATION_TSX_DISABLED:
+ case TAA_MITIGATION_OFF:
+ break;
+ }
+
mutex_unlock(&spec_ctrl_mutex);
}

@@ -1149,6 +1253,9 @@ void x86_spec_ctrl_setup_ap(void)
x86_amd_ssb_disable();
}

+bool itlb_multihit_kvm_mitigation;
+EXPORT_SYMBOL_GPL(itlb_multihit_kvm_mitigation);
+
#undef pr_fmt
#define pr_fmt(fmt) "L1TF: " fmt

@@ -1304,11 +1411,24 @@ static ssize_t l1tf_show_state(char *buf)
l1tf_vmx_states[l1tf_vmx_mitigation],
sched_smt_active() ? "vulnerable" : "disabled");
}
+
+static ssize_t itlb_multihit_show_state(char *buf)
+{
+ if (itlb_multihit_kvm_mitigation)
+ return sprintf(buf, "KVM: Mitigation: Split huge pages\n");
+ else
+ return sprintf(buf, "KVM: Vulnerable\n");
+}
#else
static ssize_t l1tf_show_state(char *buf)
{
return sprintf(buf, "%s\n", L1TF_DEFAULT_MSG);
}
+
+static ssize_t itlb_multihit_show_state(char *buf)
+{
+ return sprintf(buf, "Processor vulnerable\n");
+}
#endif

static ssize_t mds_show_state(char *buf)
@@ -1328,6 +1448,21 @@ static ssize_t mds_show_state(char *buf)
sched_smt_active() ? "vulnerable" : "disabled");
}

+static ssize_t tsx_async_abort_show_state(char *buf)
+{
+ if ((taa_mitigation == TAA_MITIGATION_TSX_DISABLED) ||
+ (taa_mitigation == TAA_MITIGATION_OFF))
+ return sprintf(buf, "%s\n", taa_strings[taa_mitigation]);
+
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+ return sprintf(buf, "%s; SMT Host state unknown\n",
+ taa_strings[taa_mitigation]);
+ }
+
+ return sprintf(buf, "%s; SMT %s\n", taa_strings[taa_mitigation],
+ sched_smt_active() ? "vulnerable" : "disabled");
+}
+
static char *stibp_state(void)
{
if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
@@ -1398,6 +1533,12 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
case X86_BUG_MDS:
return mds_show_state(buf);

+ case X86_BUG_TAA:
+ return tsx_async_abort_show_state(buf);
+
+ case X86_BUG_ITLB_MULTIHIT:
+ return itlb_multihit_show_state(buf);
+
default:
break;
}
@@ -1434,4 +1575,14 @@ ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *bu
{
return cpu_show_common(dev, attr, buf, X86_BUG_MDS);
}
+
+ssize_t cpu_show_tsx_async_abort(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_TAA);
+}
+
+ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_ITLB_MULTIHIT);
+}
#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f125bf7ecb6f..663b27bdea88 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1016,13 +1016,14 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#endif
}

-#define NO_SPECULATION BIT(0)
-#define NO_MELTDOWN BIT(1)
-#define NO_SSB BIT(2)
-#define NO_L1TF BIT(3)
-#define NO_MDS BIT(4)
-#define MSBDS_ONLY BIT(5)
-#define NO_SWAPGS BIT(6)
+#define NO_SPECULATION BIT(0)
+#define NO_MELTDOWN BIT(1)
+#define NO_SSB BIT(2)
+#define NO_L1TF BIT(3)
+#define NO_MDS BIT(4)
+#define MSBDS_ONLY BIT(5)
+#define NO_SWAPGS BIT(6)
+#define NO_ITLB_MULTIHIT BIT(7)

#define VULNWL(_vendor, _family, _model, _whitelist) \
{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
@@ -1043,26 +1044,26 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),

/* Intel Family 6 */
- VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
- VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
- VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
- VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
- VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
-
- VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT),
+
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),

VULNWL_INTEL(CORE_YONAH, NO_SSB),

- VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),

- VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS),
- VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF | NO_SWAPGS),
- VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),

/*
* Technically, swapgs isn't serializing on AMD (despite it previously
@@ -1072,15 +1073,17 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
* good enough for our purposes.
*/

+ VULNWL_INTEL(ATOM_TREMONT_X, NO_ITLB_MULTIHIT),
+
/* AMD Family 0xf - 0x12 */
- VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),

/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
- VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS),
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
{}
};

@@ -1091,19 +1094,30 @@ static bool __init cpu_matches(unsigned long which)
return m && !!(m->driver_data & which);
}

-static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+u64 x86_read_arch_cap_msr(void)
{
u64 ia32_cap = 0;

+ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+ return ia32_cap;
+}
+
+static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+{
+ u64 ia32_cap = x86_read_arch_cap_msr();
+
+ /* Set ITLB_MULTIHIT bug if cpu is not in the whitelist and not mitigated */
+ if (!cpu_matches(NO_ITLB_MULTIHIT) && !(ia32_cap & ARCH_CAP_PSCHANGE_MC_NO))
+ setup_force_cpu_bug(X86_BUG_ITLB_MULTIHIT);
+
if (cpu_matches(NO_SPECULATION))
return;

setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
setup_force_cpu_bug(X86_BUG_SPECTRE_V2);

- if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
- rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
-
if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
!cpu_has(c, X86_FEATURE_AMD_SSB_NO))
setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
@@ -1120,6 +1134,21 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (!cpu_matches(NO_SWAPGS))
setup_force_cpu_bug(X86_BUG_SWAPGS);

+ /*
+ * When the CPU is not mitigated for TAA (TAA_NO=0) set TAA bug when:
+ * - TSX is supported or
+ * - TSX_CTRL is present
+ *
+ * TSX_CTRL check is needed for cases when TSX could be disabled before
+ * the kernel boot e.g. kexec.
+ * TSX_CTRL check alone is not sufficient for cases when the microcode
+ * update is not present or running as guest that don't get TSX_CTRL.
+ */
+ if (!(ia32_cap & ARCH_CAP_TAA_NO) &&
+ (cpu_has(c, X86_FEATURE_RTM) ||
+ (ia32_cap & ARCH_CAP_TSX_CTRL_MSR)))
+ setup_force_cpu_bug(X86_BUG_TAA);
+
if (cpu_matches(NO_MELTDOWN))
return;

@@ -1553,6 +1582,8 @@ void __init identify_boot_cpu(void)
#endif
cpu_detect_tlb(&boot_cpu_data);
setup_cr_pinning();
+
+ tsx_init();
}

void identify_secondary_cpu(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index c0e2407abdd6..38ab6e115eac 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -44,6 +44,22 @@ struct _tlb_table {
extern const struct cpu_dev *const __x86_cpu_dev_start[],
*const __x86_cpu_dev_end[];

+#ifdef CONFIG_CPU_SUP_INTEL
+enum tsx_ctrl_states {
+ TSX_CTRL_ENABLE,
+ TSX_CTRL_DISABLE,
+ TSX_CTRL_NOT_SUPPORTED,
+};
+
+extern __ro_after_init enum tsx_ctrl_states tsx_ctrl_state;
+
+extern void __init tsx_init(void);
+extern void tsx_enable(void);
+extern void tsx_disable(void);
+#else
+static inline void tsx_init(void) { }
+#endif /* CONFIG_CPU_SUP_INTEL */
+
extern void get_cpu_cap(struct cpuinfo_x86 *c);
extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
@@ -62,4 +78,6 @@ unsigned int aperfmperf_get_khz(int cpu);

extern void x86_spec_ctrl_setup_ap(void);

+extern u64 x86_read_arch_cap_msr(void);
+
#endif /* ARCH_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 8d6d92ebeb54..cc9f24818e49 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -761,6 +761,11 @@ static void init_intel(struct cpuinfo_x86 *c)
detect_tme(c);

init_intel_misc_features(c);
+
+ if (tsx_ctrl_state == TSX_CTRL_ENABLE)
+ tsx_enable();
+ if (tsx_ctrl_state == TSX_CTRL_DISABLE)
+ tsx_disable();
}

#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
new file mode 100644
index 000000000000..3e20d322bc98
--- /dev/null
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Transactional Synchronization Extensions (TSX) control.
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author:
+ * Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx>
+ */
+
+#include <linux/cpufeature.h>
+
+#include <asm/cmdline.h>
+
+#include "cpu.h"
+
+enum tsx_ctrl_states tsx_ctrl_state __ro_after_init = TSX_CTRL_NOT_SUPPORTED;
+
+void tsx_disable(void)
+{
+ u64 tsx;
+
+ rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+ /* Force all transactions to immediately abort */
+ tsx |= TSX_CTRL_RTM_DISABLE;
+
+ /*
+ * Ensure TSX support is not enumerated in CPUID.
+ * This is visible to userspace and will ensure they
+ * do not waste resources trying TSX transactions that
+ * will always abort.
+ */
+ tsx |= TSX_CTRL_CPUID_CLEAR;
+
+ wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
+void tsx_enable(void)
+{
+ u64 tsx;
+
+ rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+ /* Enable the RTM feature in the cpu */
+ tsx &= ~TSX_CTRL_RTM_DISABLE;
+
+ /*
+ * Ensure TSX support is enumerated in CPUID.
+ * This is visible to userspace and will ensure they
+ * can enumerate and use the TSX feature.
+ */
+ tsx &= ~TSX_CTRL_CPUID_CLEAR;
+
+ wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
+static bool __init tsx_ctrl_is_supported(void)
+{
+ u64 ia32_cap = x86_read_arch_cap_msr();
+
+ /*
+ * TSX is controlled via MSR_IA32_TSX_CTRL. However, support for this
+ * MSR is enumerated by ARCH_CAP_TSX_MSR bit in MSR_IA32_ARCH_CAPABILITIES.
+ *
+ * TSX control (aka MSR_IA32_TSX_CTRL) is only available after a
+ * microcode update on CPUs that have their MSR_IA32_ARCH_CAPABILITIES
+ * bit MDS_NO=1. CPUs with MDS_NO=0 are not planned to get
+ * MSR_IA32_TSX_CTRL support even after a microcode update. Thus,
+ * tsx= cmdline requests will do nothing on CPUs without
+ * MSR_IA32_TSX_CTRL support.
+ */
+ return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
+}
+
+static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
+{
+ if (boot_cpu_has_bug(X86_BUG_TAA))
+ return TSX_CTRL_DISABLE;
+
+ return TSX_CTRL_ENABLE;
+}
+
+void __init tsx_init(void)
+{
+ char arg[5] = {};
+ int ret;
+
+ if (!tsx_ctrl_is_supported())
+ return;
+
+ ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg));
+ if (ret >= 0) {
+ if (!strcmp(arg, "on")) {
+ tsx_ctrl_state = TSX_CTRL_ENABLE;
+ } else if (!strcmp(arg, "off")) {
+ tsx_ctrl_state = TSX_CTRL_DISABLE;
+ } else if (!strcmp(arg, "auto")) {
+ tsx_ctrl_state = x86_get_tsx_auto_mode();
+ } else {
+ tsx_ctrl_state = TSX_CTRL_DISABLE;
+ pr_err("tsx: invalid option, defaulting to off\n");
+ }
+ } else {
+ /* tsx= not provided */
+ if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_AUTO))
+ tsx_ctrl_state = x86_get_tsx_auto_mode();
+ else if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_OFF))
+ tsx_ctrl_state = TSX_CTRL_DISABLE;
+ else
+ tsx_ctrl_state = TSX_CTRL_ENABLE;
+ }
+
+ if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
+ tsx_disable();
+
+ /*
+ * tsx_disable() will change the state of the
+ * RTM CPUID bit. Clear it here since it is now
+ * expected to be not set.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_RTM);
+ } else if (tsx_ctrl_state == TSX_CTRL_ENABLE) {
+
+ /*
+ * HW defaults TSX to be enabled at bootup.
+ * We may still need the TSX enable support
+ * during init for special cases like
+ * kexec after TSX is disabled.
+ */
+ tsx_enable();
+
+ /*
+ * tsx_enable() will change the state of the
+ * RTM CPUID bit. Force it here since it is now
+ * expected to be set.
+ */
+ setup_force_cpu_cap(X86_FEATURE_RTM);
+ }
+}
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 753b8cfe8b8a..87b97897a881 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -94,6 +94,13 @@ static bool in_exception_stack(unsigned long *stack, struct stack_info *info)
BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);

begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
+ /*
+ * Handle the case where stack trace is collected _before_
+ * cea_exception_stacks had been initialized.
+ */
+ if (!begin)
+ return false;
+
end = begin + sizeof(struct cea_exception_stacks);
/* Bail if @stack is outside the exception stack area. */
if (stk < begin || stk >= end)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 94aa6102010d..32b1c6136c6a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -37,6 +37,7 @@
#include <linux/uaccess.h>
#include <linux/hash.h>
#include <linux/kern_levels.h>
+#include <linux/kthread.h>

#include <asm/page.h>
#include <asm/pat.h>
@@ -47,6 +48,30 @@
#include <asm/kvm_page_track.h>
#include "trace.h"

+extern bool itlb_multihit_kvm_mitigation;
+
+static int __read_mostly nx_huge_pages = -1;
+static uint __read_mostly nx_huge_pages_recovery_ratio = 60;
+
+static int set_nx_huge_pages(const char *val, const struct kernel_param *kp);
+static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp);
+
+static struct kernel_param_ops nx_huge_pages_ops = {
+ .set = set_nx_huge_pages,
+ .get = param_get_bool,
+};
+
+static struct kernel_param_ops nx_huge_pages_recovery_ratio_ops = {
+ .set = set_nx_huge_pages_recovery_ratio,
+ .get = param_get_uint,
+};
+
+module_param_cb(nx_huge_pages, &nx_huge_pages_ops, &nx_huge_pages, 0644);
+__MODULE_PARM_TYPE(nx_huge_pages, "bool");
+module_param_cb(nx_huge_pages_recovery_ratio, &nx_huge_pages_recovery_ratio_ops,
+ &nx_huge_pages_recovery_ratio, 0644);
+__MODULE_PARM_TYPE(nx_huge_pages_recovery_ratio, "uint");
+
/*
* When setting this variable to true it enables Two-Dimensional-Paging
* where the hardware walks 2 page tables:
@@ -318,6 +343,11 @@ static inline bool spte_ad_enabled(u64 spte)
return !(spte & shadow_acc_track_value);
}

+static bool is_nx_huge_page_enabled(void)
+{
+ return READ_ONCE(nx_huge_pages);
+}
+
static inline u64 spte_shadow_accessed_mask(u64 spte)
{
MMU_WARN_ON((spte & shadow_mmio_mask) == shadow_mmio_value);
@@ -1162,6 +1192,17 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
kvm_mmu_gfn_disallow_lpage(slot, gfn);
}

+static void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+ if (sp->lpage_disallowed)
+ return;
+
+ ++kvm->stat.nx_lpage_splits;
+ list_add_tail(&sp->lpage_disallowed_link,
+ &kvm->arch.lpage_disallowed_mmu_pages);
+ sp->lpage_disallowed = true;
+}
+
static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
{
struct kvm_memslots *slots;
@@ -1179,6 +1220,13 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
kvm_mmu_gfn_allow_lpage(slot, gfn);
}

+static void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+ --kvm->stat.nx_lpage_splits;
+ sp->lpage_disallowed = false;
+ list_del(&sp->lpage_disallowed_link);
+}
+
static bool __mmu_gfn_lpage_is_disallowed(gfn_t gfn, int level,
struct kvm_memory_slot *slot)
{
@@ -2753,6 +2801,9 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm,
kvm_reload_remote_mmus(kvm);
}

+ if (sp->lpage_disallowed)
+ unaccount_huge_nx_page(kvm, sp);
+
sp->role.invalid = 1;
return list_unstable;
}
@@ -2972,6 +3023,11 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
if (!speculative)
spte |= spte_shadow_accessed_mask(spte);

+ if (level > PT_PAGE_TABLE_LEVEL && (pte_access & ACC_EXEC_MASK) &&
+ is_nx_huge_page_enabled()) {
+ pte_access &= ~ACC_EXEC_MASK;
+ }
+
if (pte_access & ACC_EXEC_MASK)
spte |= shadow_x_mask;
else
@@ -3192,9 +3248,32 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
__direct_pte_prefetch(vcpu, sp, sptep);
}

+static void disallowed_hugepage_adjust(struct kvm_shadow_walk_iterator it,
+ gfn_t gfn, kvm_pfn_t *pfnp, int *levelp)
+{
+ int level = *levelp;
+ u64 spte = *it.sptep;
+
+ if (it.level == level && level > PT_PAGE_TABLE_LEVEL &&
+ is_nx_huge_page_enabled() &&
+ is_shadow_present_pte(spte) &&
+ !is_large_pte(spte)) {
+ /*
+ * A small SPTE exists for this pfn, but FNAME(fetch)
+ * and __direct_map would like to create a large PTE
+ * instead: just force them to go down another level,
+ * patching back for them into pfn the next 9 bits of
+ * the address.
+ */
+ u64 page_mask = KVM_PAGES_PER_HPAGE(level) - KVM_PAGES_PER_HPAGE(level - 1);
+ *pfnp |= gfn & page_mask;
+ (*levelp)--;
+ }
+}
+
static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write,
int map_writable, int level, kvm_pfn_t pfn,
- bool prefault)
+ bool prefault, bool lpage_disallowed)
{
struct kvm_shadow_walk_iterator it;
struct kvm_mmu_page *sp;
@@ -3207,6 +3286,12 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write,

trace_kvm_mmu_spte_requested(gpa, level, pfn);
for_each_shadow_entry(vcpu, gpa, it) {
+ /*
+ * We cannot overwrite existing page tables with an NX
+ * large page, as the leaf could be executable.
+ */
+ disallowed_hugepage_adjust(it, gfn, &pfn, &level);
+
base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
if (it.level == level)
break;
@@ -3217,6 +3302,8 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write,
it.level - 1, true, ACC_ALL);

link_shadow_page(vcpu, it.sptep, sp);
+ if (lpage_disallowed)
+ account_huge_nx_page(vcpu->kvm, sp);
}
}

@@ -3508,11 +3595,14 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
{
int r;
int level;
- bool force_pt_level = false;
+ bool force_pt_level;
kvm_pfn_t pfn;
unsigned long mmu_seq;
bool map_writable, write = error_code & PFERR_WRITE_MASK;
+ bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) &&
+ is_nx_huge_page_enabled();

+ force_pt_level = lpage_disallowed;
level = mapping_level(vcpu, gfn, &force_pt_level);
if (likely(!force_pt_level)) {
/*
@@ -3546,7 +3636,8 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
goto out_unlock;
if (likely(!force_pt_level))
transparent_hugepage_adjust(vcpu, gfn, &pfn, &level);
- r = __direct_map(vcpu, v, write, map_writable, level, pfn, prefault);
+ r = __direct_map(vcpu, v, write, map_writable, level, pfn,
+ prefault, false);
out_unlock:
spin_unlock(&vcpu->kvm->mmu_lock);
kvm_release_pfn_clean(pfn);
@@ -4132,6 +4223,8 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
unsigned long mmu_seq;
int write = error_code & PFERR_WRITE_MASK;
bool map_writable;
+ bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) &&
+ is_nx_huge_page_enabled();

MMU_WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa));

@@ -4142,8 +4235,9 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
if (r)
return r;

- force_pt_level = !check_hugepage_cache_consistency(vcpu, gfn,
- PT_DIRECTORY_LEVEL);
+ force_pt_level =
+ lpage_disallowed ||
+ !check_hugepage_cache_consistency(vcpu, gfn, PT_DIRECTORY_LEVEL);
level = mapping_level(vcpu, gfn, &force_pt_level);
if (likely(!force_pt_level)) {
if (level > PT_DIRECTORY_LEVEL &&
@@ -4172,7 +4266,8 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
goto out_unlock;
if (likely(!force_pt_level))
transparent_hugepage_adjust(vcpu, gfn, &pfn, &level);
- r = __direct_map(vcpu, gpa, write, map_writable, level, pfn, prefault);
+ r = __direct_map(vcpu, gpa, write, map_writable, level, pfn,
+ prefault, lpage_disallowed);
out_unlock:
spin_unlock(&vcpu->kvm->mmu_lock);
kvm_release_pfn_clean(pfn);
@@ -6099,10 +6194,60 @@ static void kvm_set_mmio_spte_mask(void)
kvm_mmu_set_mmio_spte_mask(mask, mask);
}

+static bool get_nx_auto_mode(void)
+{
+ /* Return true when CPU has the bug, and mitigations are ON */
+ return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !cpu_mitigations_off();
+}
+
+static void __set_nx_huge_pages(bool val)
+{
+ nx_huge_pages = itlb_multihit_kvm_mitigation = val;
+}
+
+static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
+{
+ bool old_val = nx_huge_pages;
+ bool new_val;
+
+ /* In "auto" mode deploy workaround only if CPU has the bug. */
+ if (sysfs_streq(val, "off"))
+ new_val = 0;
+ else if (sysfs_streq(val, "force"))
+ new_val = 1;
+ else if (sysfs_streq(val, "auto"))
+ new_val = get_nx_auto_mode();
+ else if (strtobool(val, &new_val) < 0)
+ return -EINVAL;
+
+ __set_nx_huge_pages(new_val);
+
+ if (new_val != old_val) {
+ struct kvm *kvm;
+ int idx;
+
+ mutex_lock(&kvm_lock);
+
+ list_for_each_entry(kvm, &vm_list, vm_list) {
+ idx = srcu_read_lock(&kvm->srcu);
+ kvm_mmu_zap_all_fast(kvm);
+ srcu_read_unlock(&kvm->srcu, idx);
+
+ wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+ }
+ mutex_unlock(&kvm_lock);
+ }
+
+ return 0;
+}
+
int kvm_mmu_module_init(void)
{
int ret = -ENOMEM;

+ if (nx_huge_pages == -1)
+ __set_nx_huge_pages(get_nx_auto_mode());
+
/*
* MMU roles use union aliasing which is, generally speaking, an
* undefined behavior. However, we supposedly know how compilers behave
@@ -6182,3 +6327,116 @@ void kvm_mmu_module_exit(void)
unregister_shrinker(&mmu_shrinker);
mmu_audit_disable();
}
+
+static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp)
+{
+ unsigned int old_val;
+ int err;
+
+ old_val = nx_huge_pages_recovery_ratio;
+ err = param_set_uint(val, kp);
+ if (err)
+ return err;
+
+ if (READ_ONCE(nx_huge_pages) &&
+ !old_val && nx_huge_pages_recovery_ratio) {
+ struct kvm *kvm;
+
+ mutex_lock(&kvm_lock);
+
+ list_for_each_entry(kvm, &vm_list, vm_list)
+ wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+
+ mutex_unlock(&kvm_lock);
+ }
+
+ return err;
+}
+
+static void kvm_recover_nx_lpages(struct kvm *kvm)
+{
+ int rcu_idx;
+ struct kvm_mmu_page *sp;
+ unsigned int ratio;
+ LIST_HEAD(invalid_list);
+ ulong to_zap;
+
+ rcu_idx = srcu_read_lock(&kvm->srcu);
+ spin_lock(&kvm->mmu_lock);
+
+ ratio = READ_ONCE(nx_huge_pages_recovery_ratio);
+ to_zap = ratio ? DIV_ROUND_UP(kvm->stat.nx_lpage_splits, ratio) : 0;
+ while (to_zap && !list_empty(&kvm->arch.lpage_disallowed_mmu_pages)) {
+ /*
+ * We use a separate list instead of just using active_mmu_pages
+ * because the number of lpage_disallowed pages is expected to
+ * be relatively small compared to the total.
+ */
+ sp = list_first_entry(&kvm->arch.lpage_disallowed_mmu_pages,
+ struct kvm_mmu_page,
+ lpage_disallowed_link);
+ WARN_ON_ONCE(!sp->lpage_disallowed);
+ kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
+ WARN_ON_ONCE(sp->lpage_disallowed);
+
+ if (!--to_zap || need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+ kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ if (to_zap)
+ cond_resched_lock(&kvm->mmu_lock);
+ }
+ }
+
+ spin_unlock(&kvm->mmu_lock);
+ srcu_read_unlock(&kvm->srcu, rcu_idx);
+}
+
+static long get_nx_lpage_recovery_timeout(u64 start_time)
+{
+ return READ_ONCE(nx_huge_pages) && READ_ONCE(nx_huge_pages_recovery_ratio)
+ ? start_time + 60 * HZ - get_jiffies_64()
+ : MAX_SCHEDULE_TIMEOUT;
+}
+
+static int kvm_nx_lpage_recovery_worker(struct kvm *kvm, uintptr_t data)
+{
+ u64 start_time;
+ long remaining_time;
+
+ while (true) {
+ start_time = get_jiffies_64();
+ remaining_time = get_nx_lpage_recovery_timeout(start_time);
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ while (!kthread_should_stop() && remaining_time > 0) {
+ schedule_timeout(remaining_time);
+ remaining_time = get_nx_lpage_recovery_timeout(start_time);
+ set_current_state(TASK_INTERRUPTIBLE);
+ }
+
+ set_current_state(TASK_RUNNING);
+
+ if (kthread_should_stop())
+ return 0;
+
+ kvm_recover_nx_lpages(kvm);
+ }
+}
+
+int kvm_mmu_post_init_vm(struct kvm *kvm)
+{
+ int err;
+
+ err = kvm_vm_create_worker_thread(kvm, kvm_nx_lpage_recovery_worker, 0,
+ "kvm-nx-lpage-recovery",
+ &kvm->arch.nx_lpage_recovery_thread);
+ if (!err)
+ kthread_unpark(kvm->arch.nx_lpage_recovery_thread);
+
+ return err;
+}
+
+void kvm_mmu_pre_destroy_vm(struct kvm *kvm)
+{
+ if (kvm->arch.nx_lpage_recovery_thread)
+ kthread_stop(kvm->arch.nx_lpage_recovery_thread);
+}
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 54c2a377795b..4610230ddaea 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -210,4 +210,8 @@ void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn);
bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
struct kvm_memory_slot *slot, u64 gfn);
int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
+
+int kvm_mmu_post_init_vm(struct kvm *kvm);
+void kvm_mmu_pre_destroy_vm(struct kvm *kvm);
+
#endif
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 7d5cdb3af594..97b21e7fd013 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -614,13 +614,14 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
struct guest_walker *gw,
int write_fault, int hlevel,
- kvm_pfn_t pfn, bool map_writable, bool prefault)
+ kvm_pfn_t pfn, bool map_writable, bool prefault,
+ bool lpage_disallowed)
{
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
unsigned direct_access, access = gw->pt_access;
int top_level, ret;
- gfn_t base_gfn;
+ gfn_t gfn, base_gfn;

direct_access = gw->pte_access;

@@ -665,13 +666,25 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
link_shadow_page(vcpu, it.sptep, sp);
}

- base_gfn = gw->gfn;
+ /*
+ * FNAME(page_fault) might have clobbered the bottom bits of
+ * gw->gfn, restore them from the virtual address.
+ */
+ gfn = gw->gfn | ((addr & PT_LVL_OFFSET_MASK(gw->level)) >> PAGE_SHIFT);
+ base_gfn = gfn;

trace_kvm_mmu_spte_requested(addr, gw->level, pfn);

for (; shadow_walk_okay(&it); shadow_walk_next(&it)) {
clear_sp_write_flooding_count(it.sptep);
- base_gfn = gw->gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
+
+ /*
+ * We cannot overwrite existing page tables with an NX
+ * large page, as the leaf could be executable.
+ */
+ disallowed_hugepage_adjust(it, gfn, &pfn, &hlevel);
+
+ base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
if (it.level == hlevel)
break;

@@ -683,6 +696,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
sp = kvm_mmu_get_page(vcpu, base_gfn, addr,
it.level - 1, true, direct_access);
link_shadow_page(vcpu, it.sptep, sp);
+ if (lpage_disallowed)
+ account_huge_nx_page(vcpu->kvm, sp);
}
}

@@ -759,9 +774,11 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
int r;
kvm_pfn_t pfn;
int level = PT_PAGE_TABLE_LEVEL;
- bool force_pt_level = false;
unsigned long mmu_seq;
bool map_writable, is_self_change_mapping;
+ bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) &&
+ is_nx_huge_page_enabled();
+ bool force_pt_level = lpage_disallowed;

pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);

@@ -851,7 +868,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
if (!force_pt_level)
transparent_hugepage_adjust(vcpu, walker.gfn, &pfn, &level);
r = FNAME(fetch)(vcpu, addr, &walker, write_fault,
- level, pfn, map_writable, prefault);
+ level, pfn, map_writable, prefault, lpage_disallowed);
kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT);

out_unlock:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e5ccfb33dbea..f82f766c81c8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -92,8 +92,8 @@ u64 __read_mostly efer_reserved_bits = ~((u64)(EFER_SCE | EFER_LME | EFER_LMA));
static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
#endif

-#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
-#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+#define VM_STAT(x, ...) offsetof(struct kvm, stat.x), KVM_STAT_VM, ## __VA_ARGS__
+#define VCPU_STAT(x, ...) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU, ## __VA_ARGS__

#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \
KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
@@ -212,7 +212,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "mmu_cache_miss", VM_STAT(mmu_cache_miss) },
{ "mmu_unsync", VM_STAT(mmu_unsync) },
{ "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
- { "largepages", VM_STAT(lpages) },
+ { "largepages", VM_STAT(lpages, .mode = 0444) },
+ { "nx_largepages_splitted", VM_STAT(nx_lpage_splits, .mode = 0444) },
{ "max_mmu_page_hash_collisions",
VM_STAT(max_mmu_page_hash_collisions) },
{ NULL }
@@ -1255,6 +1256,14 @@ static u64 kvm_get_arch_capabilities(void)
if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, data);

+ /*
+ * If nx_huge_pages is enabled, KVM's shadow paging will ensure that
+ * the nested hypervisor runs with NX huge pages. If it is not,
+ * L1 is anyway vulnerable to ITLB_MULTIHIT explots from other
+ * L1 guests, so it need not worry about its own (L2) guests.
+ */
+ data |= ARCH_CAP_PSCHANGE_MC_NO;
+
/*
* If we're doing cache flushes (either "always" or "cond")
* we will do one whenever the guest does a vmlaunch/vmresume.
@@ -1267,6 +1276,25 @@ static u64 kvm_get_arch_capabilities(void)
if (l1tf_vmx_mitigation != VMENTER_L1D_FLUSH_NEVER)
data |= ARCH_CAP_SKIP_VMENTRY_L1DFLUSH;

+ /*
+ * On TAA affected systems, export MDS_NO=0 when:
+ * - TSX is enabled on the host, i.e. X86_FEATURE_RTM=1.
+ * - Updated microcode is present. This is detected by
+ * the presence of ARCH_CAP_TSX_CTRL_MSR and ensures
+ * that VERW clears CPU buffers.
+ *
+ * When MDS_NO=0 is exported, guests deploy clear CPU buffer
+ * mitigation and don't complain:
+ *
+ * "Vulnerable: Clear CPU buffers attempted, no microcode"
+ *
+ * If TSX is disabled on the system, guests are also mitigated against
+ * TAA and clear CPU buffer mitigation is not required for guests.
+ */
+ if (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM) &&
+ (data & ARCH_CAP_TSX_CTRL_MSR))
+ data &= ~ARCH_CAP_MDS_NO;
+
return data;
}

@@ -9314,6 +9342,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)

INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list);
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
+ INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
atomic_set(&kvm->arch.noncoherent_dma_count, 0);

@@ -9345,6 +9374,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
return 0;
}

+int kvm_arch_post_init_vm(struct kvm *kvm)
+{
+ return kvm_mmu_post_init_vm(kvm);
+}
+
static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
{
vcpu_load(vcpu);
@@ -9446,6 +9480,11 @@ int x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
}
EXPORT_SYMBOL_GPL(x86_set_memory_region);

+void kvm_arch_pre_destroy_vm(struct kvm *kvm)
+{
+ kvm_mmu_pre_destroy_vm(kvm);
+}
+
void kvm_arch_destroy_vm(struct kvm *kvm)
{
if (current->mm == kvm->mm) {
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 55a7dc227dfb..4c909ae07093 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -908,9 +908,14 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
int i;
bool has_stats = false;

+ spin_lock_irq(&blkg->q->queue_lock);
+
+ if (!blkg->online)
+ goto skip;
+
dname = blkg_dev_name(blkg);
if (!dname)
- continue;
+ goto skip;

/*
* Hooray string manipulation, count is the size written NOT
@@ -920,8 +925,6 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
*/
off += scnprintf(buf+off, size-off, "%s ", dname);

- spin_lock_irq(&blkg->q->queue_lock);
-
blkg_rwstat_recursive_sum(blkg, NULL,
offsetof(struct blkcg_gq, stat_bytes), &rwstat);
rbytes = rwstat.cnt[BLKG_RWSTAT_READ];
@@ -934,8 +937,6 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
wios = rwstat.cnt[BLKG_RWSTAT_WRITE];
dios = rwstat.cnt[BLKG_RWSTAT_DISCARD];

- spin_unlock_irq(&blkg->q->queue_lock);
-
if (rbytes || wbytes || rios || wios) {
has_stats = true;
off += scnprintf(buf+off, size-off,
@@ -973,6 +974,8 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
seq_commit(sf, -1);
}
}
+ skip:
+ spin_unlock_irq(&blkg->q->queue_lock);
}

rcu_read_unlock();
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index cc37511de866..6265871a4af2 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -554,12 +554,27 @@ ssize_t __weak cpu_show_mds(struct device *dev,
return sprintf(buf, "Not affected\n");
}

+ssize_t __weak cpu_show_tsx_async_abort(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return sprintf(buf, "Not affected\n");
+}
+
+ssize_t __weak cpu_show_itlb_multihit(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sprintf(buf, "Not affected\n");
+}
+
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
+static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);
+static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);

static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@@ -568,6 +583,8 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_spec_store_bypass.attr,
&dev_attr_l1tf.attr,
&dev_attr_mds.attr,
+ &dev_attr_tsx_async_abort.attr,
+ &dev_attr_itlb_multihit.attr,
NULL
};

diff --git a/drivers/clk/imx/clk-imx8mm.c b/drivers/clk/imx/clk-imx8mm.c
index 6f46bcb1d643..59ce93691e97 100644
--- a/drivers/clk/imx/clk-imx8mm.c
+++ b/drivers/clk/imx/clk-imx8mm.c
@@ -666,7 +666,7 @@ static int __init imx8mm_clocks_init(struct device_node *ccm_node)
clks[IMX8MM_CLK_A53_DIV],
clks[IMX8MM_CLK_A53_SRC],
clks[IMX8MM_ARM_PLL_OUT],
- clks[IMX8MM_CLK_24M]);
+ clks[IMX8MM_SYS_PLL1_800M]);

imx_check_clocks(clks, ARRAY_SIZE(clks));

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index cc27d4c59dca..6a89d3383a21 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -846,11 +846,9 @@ static void intel_pstate_hwp_force_min_perf(int cpu)
value |= HWP_MAX_PERF(min_perf);
value |= HWP_MIN_PERF(min_perf);

- /* Set EPP/EPB to min */
+ /* Set EPP to min */
if (boot_cpu_has(X86_FEATURE_HWP_EPP))
value |= HWP_ENERGY_PERF_PREFERENCE(HWP_EPP_POWERSAVE);
- else
- intel_pstate_set_epb(cpu, HWP_EPP_BALANCE_POWERSAVE);

wrmsrl_on_cpu(cpu, MSR_HWP_REQUEST, value);
}
diff --git a/drivers/dma/sprd-dma.c b/drivers/dma/sprd-dma.c
index 525dc7338fe3..8546ad034720 100644
--- a/drivers/dma/sprd-dma.c
+++ b/drivers/dma/sprd-dma.c
@@ -134,6 +134,10 @@
#define SPRD_DMA_SRC_TRSF_STEP_OFFSET 0
#define SPRD_DMA_TRSF_STEP_MASK GENMASK(15, 0)

+/* SPRD DMA_SRC_BLK_STEP register definition */
+#define SPRD_DMA_LLIST_HIGH_MASK GENMASK(31, 28)
+#define SPRD_DMA_LLIST_HIGH_SHIFT 28
+
/* define DMA channel mode & trigger mode mask */
#define SPRD_DMA_CHN_MODE_MASK GENMASK(7, 0)
#define SPRD_DMA_TRG_MODE_MASK GENMASK(7, 0)
@@ -208,6 +212,7 @@ struct sprd_dma_dev {
struct sprd_dma_chn channels[0];
};

+static void sprd_dma_free_desc(struct virt_dma_desc *vd);
static bool sprd_dma_filter_fn(struct dma_chan *chan, void *param);
static struct of_dma_filter_info sprd_dma_info = {
.filter_fn = sprd_dma_filter_fn,
@@ -609,12 +614,19 @@ static int sprd_dma_alloc_chan_resources(struct dma_chan *chan)
static void sprd_dma_free_chan_resources(struct dma_chan *chan)
{
struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
+ struct virt_dma_desc *cur_vd = NULL;
unsigned long flags;

spin_lock_irqsave(&schan->vc.lock, flags);
+ if (schan->cur_desc)
+ cur_vd = &schan->cur_desc->vd;
+
sprd_dma_stop(schan);
spin_unlock_irqrestore(&schan->vc.lock, flags);

+ if (cur_vd)
+ sprd_dma_free_desc(cur_vd);
+
vchan_free_chan_resources(&schan->vc);
pm_runtime_put(chan->device->dev);
}
@@ -717,6 +729,7 @@ static int sprd_dma_fill_desc(struct dma_chan *chan,
u32 int_mode = flags & SPRD_DMA_INT_MASK;
int src_datawidth, dst_datawidth, src_step, dst_step;
u32 temp, fix_mode = 0, fix_en = 0;
+ phys_addr_t llist_ptr;

if (dir == DMA_MEM_TO_DEV) {
src_step = sprd_dma_get_step(slave_cfg->src_addr_width);
@@ -814,13 +827,16 @@ static int sprd_dma_fill_desc(struct dma_chan *chan,
* Set the link-list pointer point to next link-list
* configuration's physical address.
*/
- hw->llist_ptr = schan->linklist.phy_addr + temp;
+ llist_ptr = schan->linklist.phy_addr + temp;
+ hw->llist_ptr = lower_32_bits(llist_ptr);
+ hw->src_blk_step = (upper_32_bits(llist_ptr) << SPRD_DMA_LLIST_HIGH_SHIFT) &
+ SPRD_DMA_LLIST_HIGH_MASK;
} else {
hw->llist_ptr = 0;
+ hw->src_blk_step = 0;
}

hw->frg_step = 0;
- hw->src_blk_step = 0;
hw->des_blk_step = 0;
return 0;
}
@@ -1023,15 +1039,22 @@ static int sprd_dma_resume(struct dma_chan *chan)
static int sprd_dma_terminate_all(struct dma_chan *chan)
{
struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
+ struct virt_dma_desc *cur_vd = NULL;
unsigned long flags;
LIST_HEAD(head);

spin_lock_irqsave(&schan->vc.lock, flags);
+ if (schan->cur_desc)
+ cur_vd = &schan->cur_desc->vd;
+
sprd_dma_stop(schan);

vchan_get_all_descriptors(&schan->vc, &head);
spin_unlock_irqrestore(&schan->vc.lock, flags);

+ if (cur_vd)
+ sprd_dma_free_desc(cur_vd);
+
vchan_dma_desc_free_list(&schan->vc, &head);
return 0;
}
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index e7dc3c4dc8e0..5d56f1e4d332 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -68,6 +68,9 @@
#define XILINX_DMA_DMACR_CIRC_EN BIT(1)
#define XILINX_DMA_DMACR_RUNSTOP BIT(0)
#define XILINX_DMA_DMACR_FSYNCSRC_MASK GENMASK(6, 5)
+#define XILINX_DMA_DMACR_DELAY_MASK GENMASK(31, 24)
+#define XILINX_DMA_DMACR_FRAME_COUNT_MASK GENMASK(23, 16)
+#define XILINX_DMA_DMACR_MASTER_MASK GENMASK(11, 8)

#define XILINX_DMA_REG_DMASR 0x0004
#define XILINX_DMA_DMASR_EOL_LATE_ERR BIT(15)
@@ -1354,7 +1357,8 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
node);
hw = &segment->hw;

- xilinx_write(chan, XILINX_DMA_REG_SRCDSTADDR, hw->buf_addr);
+ xilinx_write(chan, XILINX_DMA_REG_SRCDSTADDR,
+ xilinx_prep_dma_addr_t(hw->buf_addr));

/* Start the transfer */
dma_ctrl_write(chan, XILINX_DMA_REG_BTT,
@@ -2117,8 +2121,10 @@ int xilinx_vdma_channel_set_config(struct dma_chan *dchan,
chan->config.gen_lock = cfg->gen_lock;
chan->config.master = cfg->master;

+ dmacr &= ~XILINX_DMA_DMACR_GENLOCK_EN;
if (cfg->gen_lock && chan->genlock) {
dmacr |= XILINX_DMA_DMACR_GENLOCK_EN;
+ dmacr &= ~XILINX_DMA_DMACR_MASTER_MASK;
dmacr |= cfg->master << XILINX_DMA_DMACR_MASTER_SHIFT;
}

@@ -2134,11 +2140,13 @@ int xilinx_vdma_channel_set_config(struct dma_chan *dchan,
chan->config.delay = cfg->delay;

if (cfg->coalesc <= XILINX_DMA_DMACR_FRAME_COUNT_MAX) {
+ dmacr &= ~XILINX_DMA_DMACR_FRAME_COUNT_MASK;
dmacr |= cfg->coalesc << XILINX_DMA_DMACR_FRAME_COUNT_SHIFT;
chan->config.coalesc = cfg->coalesc;
}

if (cfg->delay <= XILINX_DMA_DMACR_DELAY_MAX) {
+ dmacr &= ~XILINX_DMA_DMACR_DELAY_MASK;
dmacr |= cfg->delay << XILINX_DMA_DMACR_DELAY_SHIFT;
chan->config.delay = cfg->delay;
}
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 0460c7581220..ee0661ddb25b 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o string.o random.o \

lib-$(CONFIG_ARM) += arm32-stub.o
lib-$(CONFIG_ARM64) += arm64-stub.o
+CFLAGS_arm32-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)

#
diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
index e8f7aefb6813..41213bf5fcf5 100644
--- a/drivers/firmware/efi/libstub/arm32-stub.c
+++ b/drivers/firmware/efi/libstub/arm32-stub.c
@@ -195,6 +195,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
unsigned long dram_base,
efi_loaded_image_t *image)
{
+ unsigned long kernel_base;
efi_status_t status;

/*
@@ -204,9 +205,18 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
* loaded. These assumptions are made by the decompressor,
* before any memory map is available.
*/
- dram_base = round_up(dram_base, SZ_128M);
+ kernel_base = round_up(dram_base, SZ_128M);

- status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
+ /*
+ * Note that some platforms (notably, the Raspberry Pi 2) put
+ * spin-tables and other pieces of firmware at the base of RAM,
+ * abusing the fact that the window of TEXT_OFFSET bytes at the
+ * base of the kernel image is only partially used at the moment.
+ * (Up to 5 pages are used for the swapper page tables)
+ */
+ kernel_base += TEXT_OFFSET - 5 * PAGE_SIZE;
+
+ status = reserve_kernel_base(sys_table, kernel_base, reserve_addr,
reserve_size);
if (status != EFI_SUCCESS) {
pr_efi_err(sys_table, "Unable to allocate memory for uncompressed kernel.\n");
@@ -220,7 +230,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
*image_size = image->image_size;
status = efi_relocate_kernel(sys_table, image_addr, *image_size,
*image_size,
- dram_base + MAX_UNCOMP_KERNEL_SIZE, 0);
+ kernel_base + MAX_UNCOMP_KERNEL_SIZE, 0, 0);
if (status != EFI_SUCCESS) {
pr_efi_err(sys_table, "Failed to relocate kernel.\n");
efi_free(sys_table, *reserve_size, *reserve_addr);
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index 3caae7f2cf56..35dbc2791c97 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -260,11 +260,11 @@ efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
}

/*
- * Allocate at the lowest possible address.
+ * Allocate at the lowest possible address that is not below 'min'.
*/
-efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
- unsigned long size, unsigned long align,
- unsigned long *addr)
+efi_status_t efi_low_alloc_above(efi_system_table_t *sys_table_arg,
+ unsigned long size, unsigned long align,
+ unsigned long *addr, unsigned long min)
{
unsigned long map_size, desc_size, buff_size;
efi_memory_desc_t *map;
@@ -311,13 +311,8 @@ efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
start = desc->phys_addr;
end = start + desc->num_pages * EFI_PAGE_SIZE;

- /*
- * Don't allocate at 0x0. It will confuse code that
- * checks pointers against NULL. Skip the first 8
- * bytes so we start at a nice even number.
- */
- if (start == 0x0)
- start += 8;
+ if (start < min)
+ start = min;

start = round_up(start, align);
if ((start + size) > end)
@@ -698,7 +693,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
unsigned long image_size,
unsigned long alloc_size,
unsigned long preferred_addr,
- unsigned long alignment)
+ unsigned long alignment,
+ unsigned long min_addr)
{
unsigned long cur_image_addr;
unsigned long new_addr = 0;
@@ -731,8 +727,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
* possible.
*/
if (status != EFI_SUCCESS) {
- status = efi_low_alloc(sys_table_arg, alloc_size, alignment,
- &new_addr);
+ status = efi_low_alloc_above(sys_table_arg, alloc_size,
+ alignment, &new_addr, min_addr);
}
if (status != EFI_SUCCESS) {
pr_efi_err(sys_table_arg, "Failed to allocate usable memory for kernel.\n");
diff --git a/drivers/firmware/efi/tpm.c b/drivers/firmware/efi/tpm.c
index ebd7977653a8..31f9f0e369b9 100644
--- a/drivers/firmware/efi/tpm.c
+++ b/drivers/firmware/efi/tpm.c
@@ -88,6 +88,7 @@ int __init efi_tpm_eventlog_init(void)

if (tbl_size < 0) {
pr_err(FW_BUG "Failed to parse event in TPM Final Events Log\n");
+ ret = -EINVAL;
goto out_calc;
}

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 9d76e0923a5a..96b2a31ccfed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -218,7 +218,7 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
struct amdgpu_ring *ring = to_amdgpu_ring(sched_job->sched);
struct dma_fence *fence = NULL, *finished;
struct amdgpu_job *job;
- int r;
+ int r = 0;

job = to_amdgpu_job(sched_job);
finished = &job->base.s_fence->finished;
@@ -243,6 +243,8 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
job->fence = dma_fence_get(fence);

amdgpu_job_free_resources(job);
+
+ fence = r ? ERR_PTR(r) : fence;
return fence;
}

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 5eeb72fcc123..6a51e6a4a035 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -264,6 +264,7 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev,

job->vm_pd_addr = amdgpu_gmc_pd_addr(adev->gart.bo);
job->vm_needs_flush = true;
+ job->ibs->ptr[job->ibs->length_dw++] = ring->funcs->nop;
amdgpu_ring_pad_ib(ring, &job->ibs[0]);
r = amdgpu_job_submit(job, &adev->mman.entity,
AMDGPU_FENCE_OWNER_UNDEFINED, &fence);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 730f97ba8dbb..dd4731ab935c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -566,6 +566,10 @@ static bool construct(struct dc *dc,
#ifdef CONFIG_DRM_AMD_DC_DCN2_0
// Allocate memory for the vm_helper
dc->vm_helper = kzalloc(sizeof(struct vm_helper), GFP_KERNEL);
+ if (!dc->vm_helper) {
+ dm_error("%s: failed to create dc->vm_helper\n", __func__);
+ goto fail;
+ }

#endif
memcpy(&dc->bb_overrides, &init_params->bb_overrides, sizeof(dc->bb_overrides));
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index e6da8506128b..623c1ab4d3db 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -374,6 +374,7 @@ void dal_ddc_service_i2c_query_dp_dual_mode_adaptor(
enum display_dongle_type *dongle = &sink_cap->dongle_type;
uint8_t type2_dongle_buf[DP_ADAPTOR_TYPE2_SIZE];
bool is_type2_dongle = false;
+ int retry_count = 2;
struct dp_hdmi_dongle_signature_data *dongle_signature;

/* Assume we have no valid DP passive dongle connected */
@@ -386,13 +387,24 @@ void dal_ddc_service_i2c_query_dp_dual_mode_adaptor(
DP_HDMI_DONGLE_ADDRESS,
type2_dongle_buf,
sizeof(type2_dongle_buf))) {
- *dongle = DISPLAY_DONGLE_DP_DVI_DONGLE;
- sink_cap->max_hdmi_pixel_clock = DP_ADAPTOR_DVI_MAX_TMDS_CLK;
+ /* Passive HDMI dongles can sometimes fail here without retrying*/
+ while (retry_count > 0) {
+ if (i2c_read(ddc,
+ DP_HDMI_DONGLE_ADDRESS,
+ type2_dongle_buf,
+ sizeof(type2_dongle_buf)))
+ break;
+ retry_count--;
+ }
+ if (retry_count == 0) {
+ *dongle = DISPLAY_DONGLE_DP_DVI_DONGLE;
+ sink_cap->max_hdmi_pixel_clock = DP_ADAPTOR_DVI_MAX_TMDS_CLK;

- CONN_DATA_DETECT(ddc->link, type2_dongle_buf, sizeof(type2_dongle_buf),
- "DP-DVI passive dongle %dMhz: ",
- DP_ADAPTOR_DVI_MAX_TMDS_CLK / 1000);
- return;
+ CONN_DATA_DETECT(ddc->link, type2_dongle_buf, sizeof(type2_dongle_buf),
+ "DP-DVI passive dongle %dMhz: ",
+ DP_ADAPTOR_DVI_MAX_TMDS_CLK / 1000);
+ return;
+ }
}

/* Check if Type 2 dongle.*/
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 68db60e4caf3..d1a33e04570f 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -394,6 +394,9 @@ bool resource_are_streams_timing_synchronizable(
if (stream1->view_format != stream2->view_format)
return false;

+ if (stream1->ignore_msa_timing_param || stream2->ignore_msa_timing_param)
+ return false;
+
return true;
}
static bool is_dp_and_hdmi_sharable(
@@ -1566,6 +1569,9 @@ bool dc_is_stream_unchanged(
if (!are_stream_backends_same(old_stream, stream))
return false;

+ if (old_stream->ignore_msa_timing_param != stream->ignore_msa_timing_param)
+ return false;
+
return true;
}

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
index 649883777f62..6c6c486b774a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
@@ -2577,7 +2577,8 @@ static void dml20_DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPer
mode_lib->vba.MinActiveDRAMClockChangeMargin
+ mode_lib->vba.DRAMClockChangeLatency;

- if (mode_lib->vba.MinActiveDRAMClockChangeMargin > 0) {
+ if (mode_lib->vba.MinActiveDRAMClockChangeMargin > 50) {
+ mode_lib->vba.DRAMClockChangeWatermark += 25;
mode_lib->vba.DRAMClockChangeSupport[0][0] = dm_dram_clock_change_vactive;
} else {
if (mode_lib->vba.SynchronizedVBlank || mode_lib->vba.NumberOfActivePlanes == 1) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 0f2c22a3bcb6..f822b7730308 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -315,6 +315,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
free_engines(rcu_access_pointer(ctx->engines));
mutex_destroy(&ctx->engines_mutex);

+ kfree(ctx->jump_whitelist);
+
if (ctx->timeline)
i915_timeline_put(ctx->timeline);

@@ -465,6 +467,9 @@ __create_context(struct drm_i915_private *i915)
for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;

+ ctx->jump_whitelist = NULL;
+ ctx->jump_whitelist_cmds = 0;
+
return ctx;

err_free:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index cc513410eeef..d284b30d591a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -197,6 +197,11 @@ struct i915_gem_context {
* per vm, which may be one per context or shared with the global GTT)
*/
struct radix_tree_root handles_vma;
+
+ /** jump_whitelist: Bit array for tracking cmds during cmdparsing */
+ unsigned long *jump_whitelist;
+ /** jump_whitelist_cmds: No of cmd slots available */
+ u32 jump_whitelist_cmds;
};

#endif /* __I915_GEM_CONTEXT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 41dab9ea33cd..4a1debd021ed 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -295,7 +295,9 @@ static inline u64 gen8_noncanonical_addr(u64 address)

static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
{
- return intel_engine_needs_cmd_parser(eb->engine) && eb->batch_len;
+ return intel_engine_requires_cmd_parser(eb->engine) ||
+ (intel_engine_using_cmd_parser(eb->engine) &&
+ eb->args->batch_len);
}

static int eb_create(struct i915_execbuffer *eb)
@@ -2009,10 +2011,39 @@ static int i915_reset_gen7_sol_offsets(struct i915_request *rq)
return 0;
}

-static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master)
+static struct i915_vma *
+shadow_batch_pin(struct i915_execbuffer *eb, struct drm_i915_gem_object *obj)
+{
+ struct drm_i915_private *dev_priv = eb->i915;
+ struct i915_vma * const vma = *eb->vma;
+ struct i915_address_space *vm;
+ u64 flags;
+
+ /*
+ * PPGTT backed shadow buffers must be mapped RO, to prevent
+ * post-scan tampering
+ */
+ if (CMDPARSER_USES_GGTT(dev_priv)) {
+ flags = PIN_GLOBAL;
+ vm = &dev_priv->ggtt.vm;
+ } else if (vma->vm->has_read_only) {
+ flags = PIN_USER;
+ vm = vma->vm;
+ i915_gem_object_set_readonly(obj);
+ } else {
+ DRM_DEBUG("Cannot prevent post-scan tampering without RO capable vm\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ return i915_gem_object_pin(obj, vm, NULL, 0, 0, flags);
+}
+
+static struct i915_vma *eb_parse(struct i915_execbuffer *eb)
{
struct drm_i915_gem_object *shadow_batch_obj;
struct i915_vma *vma;
+ u64 batch_start;
+ u64 shadow_batch_start;
int err;

shadow_batch_obj = i915_gem_batch_pool_get(&eb->engine->batch_pool,
@@ -2020,30 +2051,53 @@ static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master)
if (IS_ERR(shadow_batch_obj))
return ERR_CAST(shadow_batch_obj);

- err = intel_engine_cmd_parser(eb->engine,
+ vma = shadow_batch_pin(eb, shadow_batch_obj);
+ if (IS_ERR(vma))
+ goto out;
+
+ batch_start = gen8_canonical_addr(eb->batch->node.start) +
+ eb->batch_start_offset;
+
+ shadow_batch_start = gen8_canonical_addr(vma->node.start);
+
+ err = intel_engine_cmd_parser(eb->gem_context,
+ eb->engine,
eb->batch->obj,
- shadow_batch_obj,
+ batch_start,
eb->batch_start_offset,
eb->batch_len,
- is_master);
+ shadow_batch_obj,
+ shadow_batch_start);
+
if (err) {
- if (err == -EACCES) /* unhandled chained batch */
+ i915_vma_unpin(vma);
+
+ /*
+ * Unsafe GGTT-backed buffers can still be submitted safely
+ * as non-secure.
+ * For PPGTT backing however, we have no choice but to forcibly
+ * reject unsafe buffers
+ */
+ if (CMDPARSER_USES_GGTT(eb->i915) && (err == -EACCES))
+ /* Execute original buffer non-secure */
vma = NULL;
else
vma = ERR_PTR(err);
goto out;
}

- vma = i915_gem_object_ggtt_pin(shadow_batch_obj, NULL, 0, 0, 0);
- if (IS_ERR(vma))
- goto out;
-
eb->vma[eb->buffer_count] = i915_vma_get(vma);
eb->flags[eb->buffer_count] =
__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_REF;
vma->exec_flags = &eb->flags[eb->buffer_count];
eb->buffer_count++;
+ eb->batch_start_offset = 0;
+ eb->batch = vma;
+
+ if (CMDPARSER_USES_GGTT(eb->i915))
+ eb->batch_flags |= I915_DISPATCH_SECURE;

+ /* eb->batch_len unchanged */
out:
i915_gem_object_unpin_pages(shadow_batch_obj);
return vma;
@@ -2351,6 +2405,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
struct drm_i915_gem_exec_object2 *exec,
struct drm_syncobj **fences)
{
+ struct drm_i915_private *i915 = to_i915(dev);
struct i915_execbuffer eb;
struct dma_fence *in_fence = NULL;
struct dma_fence *exec_fence = NULL;
@@ -2362,7 +2417,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
BUILD_BUG_ON(__EXEC_OBJECT_INTERNAL_FLAGS &
~__EXEC_OBJECT_UNKNOWN_FLAGS);

- eb.i915 = to_i915(dev);
+ eb.i915 = i915;
eb.file = file;
eb.args = args;
if (DBG_FORCE_RELOC || !(args->flags & I915_EXEC_NO_RELOC))
@@ -2382,8 +2437,15 @@ i915_gem_do_execbuffer(struct drm_device *dev,

eb.batch_flags = 0;
if (args->flags & I915_EXEC_SECURE) {
+ if (INTEL_GEN(i915) >= 11)
+ return -ENODEV;
+
+ /* Return -EPERM to trigger fallback code on old binaries. */
+ if (!HAS_SECURE_BATCHES(i915))
+ return -EPERM;
+
if (!drm_is_current_master(file) || !capable(CAP_SYS_ADMIN))
- return -EPERM;
+ return -EPERM;

eb.batch_flags |= I915_DISPATCH_SECURE;
}
@@ -2473,34 +2535,19 @@ i915_gem_do_execbuffer(struct drm_device *dev,
goto err_vma;
}

+ if (eb.batch_len == 0)
+ eb.batch_len = eb.batch->size - eb.batch_start_offset;
+
if (eb_use_cmdparser(&eb)) {
struct i915_vma *vma;

- vma = eb_parse(&eb, drm_is_current_master(file));
+ vma = eb_parse(&eb);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
goto err_vma;
}
-
- if (vma) {
- /*
- * Batch parsed and accepted:
- *
- * Set the DISPATCH_SECURE bit to remove the NON_SECURE
- * bit from MI_BATCH_BUFFER_START commands issued in
- * the dispatch_execbuffer implementations. We
- * specifically don't want that set on batches the
- * command parser has accepted.
- */
- eb.batch_flags |= I915_DISPATCH_SECURE;
- eb.batch_start_offset = 0;
- eb.batch = vma;
- }
}

- if (eb.batch_len == 0)
- eb.batch_len = eb.batch->size - eb.batch_start_offset;
-
/*
* snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
* batch" bit. Hence we need to pin secure batches into the global gtt.
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 43e975a26016..3f6c58b68bb1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -460,12 +460,13 @@ struct intel_engine_cs {

struct intel_engine_hangcheck hangcheck;

-#define I915_ENGINE_NEEDS_CMD_PARSER BIT(0)
+#define I915_ENGINE_USING_CMD_PARSER BIT(0)
#define I915_ENGINE_SUPPORTS_STATS BIT(1)
#define I915_ENGINE_HAS_PREEMPTION BIT(2)
#define I915_ENGINE_HAS_SEMAPHORES BIT(3)
#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
#define I915_ENGINE_IS_VIRTUAL BIT(5)
+#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
unsigned int flags;

/*
@@ -526,9 +527,15 @@ struct intel_engine_cs {
};

static inline bool
-intel_engine_needs_cmd_parser(const struct intel_engine_cs *engine)
+intel_engine_using_cmd_parser(const struct intel_engine_cs *engine)
{
- return engine->flags & I915_ENGINE_NEEDS_CMD_PARSER;
+ return engine->flags & I915_ENGINE_USING_CMD_PARSER;
+}
+
+static inline bool
+intel_engine_requires_cmd_parser(const struct intel_engine_cs *engine)
+{
+ return engine->flags & I915_ENGINE_REQUIRES_CMD_PARSER;
}

static inline bool
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index 9f8f7f54191f..3f6f42592708 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -36,6 +36,9 @@ static int intel_gt_unpark(struct intel_wakeref *wf)
i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
GEM_BUG_ON(!i915->gt.awake);

+ if (NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+ intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
+
intel_enable_gt_powersave(i915);

i915_update_gfx_val(i915);
@@ -70,6 +73,11 @@ static int intel_gt_park(struct intel_wakeref *wf)
if (INTEL_GEN(i915) >= 6)
gen6_rps_idle(i915);

+ if (NEEDS_RC6_CTX_CORRUPTION_WA(i915)) {
+ intel_rc6_ctx_wa_check(i915);
+ intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
+ }
+
GEM_BUG_ON(!wakeref);
intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index a28bcd2d7c09..a412e346b29c 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -52,13 +52,11 @@
* granting userspace undue privileges. There are three categories of privilege.
*
* First, commands which are explicitly defined as privileged or which should
- * only be used by the kernel driver. The parser generally rejects such
- * commands, though it may allow some from the drm master process.
+ * only be used by the kernel driver. The parser rejects such commands
*
* Second, commands which access registers. To support correct/enhanced
* userspace functionality, particularly certain OpenGL extensions, the parser
- * provides a whitelist of registers which userspace may safely access (for both
- * normal and drm master processes).
+ * provides a whitelist of registers which userspace may safely access
*
* Third, commands which access privileged memory (i.e. GGTT, HWS page, etc).
* The parser always rejects such commands.
@@ -83,9 +81,9 @@
* in the per-engine command tables.
*
* Other command table entries map fairly directly to high level categories
- * mentioned above: rejected, master-only, register whitelist. The parser
- * implements a number of checks, including the privileged memory checks, via a
- * general bitmasking mechanism.
+ * mentioned above: rejected, register whitelist. The parser implements a number
+ * of checks, including the privileged memory checks, via a general bitmasking
+ * mechanism.
*/

/*
@@ -103,8 +101,6 @@ struct drm_i915_cmd_descriptor {
* CMD_DESC_REJECT: The command is never allowed
* CMD_DESC_REGISTER: The command should be checked against the
* register whitelist for the appropriate ring
- * CMD_DESC_MASTER: The command is allowed if the submitting process
- * is the DRM master
*/
u32 flags;
#define CMD_DESC_FIXED (1<<0)
@@ -112,7 +108,6 @@ struct drm_i915_cmd_descriptor {
#define CMD_DESC_REJECT (1<<2)
#define CMD_DESC_REGISTER (1<<3)
#define CMD_DESC_BITMASK (1<<4)
-#define CMD_DESC_MASTER (1<<5)

/*
* The command's unique identification bits and the bitmask to get them.
@@ -193,7 +188,7 @@ struct drm_i915_cmd_table {
#define CMD(op, opm, f, lm, fl, ...) \
{ \
.flags = (fl) | ((f) ? CMD_DESC_FIXED : 0), \
- .cmd = { (op), ~0u << (opm) }, \
+ .cmd = { (op & ~0u << (opm)), ~0u << (opm) }, \
.length = { (lm) }, \
__VA_ARGS__ \
}
@@ -208,14 +203,13 @@ struct drm_i915_cmd_table {
#define R CMD_DESC_REJECT
#define W CMD_DESC_REGISTER
#define B CMD_DESC_BITMASK
-#define M CMD_DESC_MASTER

/* Command Mask Fixed Len Action
---------------------------------------------------------- */
-static const struct drm_i915_cmd_descriptor common_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_common_cmds[] = {
CMD( MI_NOOP, SMI, F, 1, S ),
CMD( MI_USER_INTERRUPT, SMI, F, 1, R ),
- CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, M ),
+ CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, R ),
CMD( MI_ARB_CHECK, SMI, F, 1, S ),
CMD( MI_REPORT_HEAD, SMI, F, 1, S ),
CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ),
@@ -245,7 +239,7 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = {
CMD( MI_BATCH_BUFFER_START, SMI, !F, 0xFF, S ),
};

-static const struct drm_i915_cmd_descriptor render_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_render_cmds[] = {
CMD( MI_FLUSH, SMI, F, 1, S ),
CMD( MI_ARB_ON_OFF, SMI, F, 1, R ),
CMD( MI_PREDICATE, SMI, F, 1, S ),
@@ -312,7 +306,7 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = {
CMD( MI_URB_ATOMIC_ALLOC, SMI, F, 1, S ),
CMD( MI_SET_APPID, SMI, F, 1, S ),
CMD( MI_RS_CONTEXT, SMI, F, 1, S ),
- CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ),
+ CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ),
CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ),
CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W,
.reg = { .offset = 1, .mask = 0x007FFFFC, .step = 1 } ),
@@ -329,7 +323,7 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = {
CMD( GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS, S3D, !F, 0x1FF, S ),
};

-static const struct drm_i915_cmd_descriptor video_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_video_cmds[] = {
CMD( MI_ARB_ON_OFF, SMI, F, 1, R ),
CMD( MI_SET_APPID, SMI, F, 1, S ),
CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B,
@@ -373,7 +367,7 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = {
CMD( MFX_WAIT, SMFX, F, 1, S ),
};

-static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_vecs_cmds[] = {
CMD( MI_ARB_ON_OFF, SMI, F, 1, R ),
CMD( MI_SET_APPID, SMI, F, 1, S ),
CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B,
@@ -411,7 +405,7 @@ static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
}}, ),
};

-static const struct drm_i915_cmd_descriptor blt_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_blt_cmds[] = {
CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ),
CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B,
.bits = {{
@@ -445,10 +439,64 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = {
};

static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = {
- CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ),
+ CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ),
CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ),
};

+/*
+ * For Gen9 we can still rely on the h/w to enforce cmd security, and only
+ * need to re-enforce the register access checks. We therefore only need to
+ * teach the cmdparser how to find the end of each command, and identify
+ * register accesses. The table doesn't need to reject any commands, and so
+ * the only commands listed here are:
+ * 1) Those that touch registers
+ * 2) Those that do not have the default 8-bit length
+ *
+ * Note that the default MI length mask chosen for this table is 0xFF, not
+ * the 0x3F used on older devices. This is because the vast majority of MI
+ * cmds on Gen9 use a standard 8-bit Length field.
+ * All the Gen9 blitter instructions are standard 0xFF length mask, and
+ * none allow access to non-general registers, so in fact no BLT cmds are
+ * included in the table at all.
+ *
+ */
+static const struct drm_i915_cmd_descriptor gen9_blt_cmds[] = {
+ CMD( MI_NOOP, SMI, F, 1, S ),
+ CMD( MI_USER_INTERRUPT, SMI, F, 1, S ),
+ CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, S ),
+ CMD( MI_FLUSH, SMI, F, 1, S ),
+ CMD( MI_ARB_CHECK, SMI, F, 1, S ),
+ CMD( MI_REPORT_HEAD, SMI, F, 1, S ),
+ CMD( MI_ARB_ON_OFF, SMI, F, 1, S ),
+ CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ),
+ CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, S ),
+ CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, S ),
+ CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ),
+ CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 2 } ),
+ CMD( MI_UPDATE_GTT, SMI, !F, 0x3FF, S ),
+ CMD( MI_STORE_REGISTER_MEM_GEN8, SMI, F, 4, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC } ),
+ CMD( MI_FLUSH_DW, SMI, !F, 0x3F, S ),
+ CMD( MI_LOAD_REGISTER_MEM_GEN8, SMI, F, 4, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC } ),
+ CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 1 } ),
+
+ /*
+ * We allow BB_START but apply further checks. We just sanitize the
+ * basic fields here.
+ */
+#define MI_BB_START_OPERAND_MASK GENMASK(SMI-1, 0)
+#define MI_BB_START_OPERAND_EXPECT (MI_BATCH_PPGTT_HSW | 1)
+ CMD( MI_BATCH_BUFFER_START_GEN8, SMI, !F, 0xFF, B,
+ .bits = {{
+ .offset = 0,
+ .mask = MI_BB_START_OPERAND_MASK,
+ .expected = MI_BB_START_OPERAND_EXPECT,
+ }}, ),
+};
+
static const struct drm_i915_cmd_descriptor noop_desc =
CMD(MI_NOOP, SMI, F, 1, S);

@@ -462,40 +510,44 @@ static const struct drm_i915_cmd_descriptor noop_desc =
#undef R
#undef W
#undef B
-#undef M

-static const struct drm_i915_cmd_table gen7_render_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { render_cmds, ARRAY_SIZE(render_cmds) },
+static const struct drm_i915_cmd_table gen7_render_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) },
};

-static const struct drm_i915_cmd_table hsw_render_ring_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { render_cmds, ARRAY_SIZE(render_cmds) },
+static const struct drm_i915_cmd_table hsw_render_ring_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) },
{ hsw_render_cmds, ARRAY_SIZE(hsw_render_cmds) },
};

-static const struct drm_i915_cmd_table gen7_video_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { video_cmds, ARRAY_SIZE(video_cmds) },
+static const struct drm_i915_cmd_table gen7_video_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_video_cmds, ARRAY_SIZE(gen7_video_cmds) },
};

-static const struct drm_i915_cmd_table hsw_vebox_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { vecs_cmds, ARRAY_SIZE(vecs_cmds) },
+static const struct drm_i915_cmd_table hsw_vebox_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_vecs_cmds, ARRAY_SIZE(gen7_vecs_cmds) },
};

-static const struct drm_i915_cmd_table gen7_blt_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { blt_cmds, ARRAY_SIZE(blt_cmds) },
+static const struct drm_i915_cmd_table gen7_blt_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) },
};

-static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { blt_cmds, ARRAY_SIZE(blt_cmds) },
+static const struct drm_i915_cmd_table hsw_blt_ring_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) },
{ hsw_blt_cmds, ARRAY_SIZE(hsw_blt_cmds) },
};

+static const struct drm_i915_cmd_table gen9_blt_cmd_table[] = {
+ { gen9_blt_cmds, ARRAY_SIZE(gen9_blt_cmds) },
+};
+
+
/*
* Register whitelists, sorted by increasing register offset.
*/
@@ -611,17 +663,27 @@ static const struct drm_i915_reg_descriptor gen7_blt_regs[] = {
REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
};

-static const struct drm_i915_reg_descriptor ivb_master_regs[] = {
- REG32(FORCEWAKE_MT),
- REG32(DERRMR),
- REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_A)),
- REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_B)),
- REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_C)),
-};
-
-static const struct drm_i915_reg_descriptor hsw_master_regs[] = {
- REG32(FORCEWAKE_MT),
- REG32(DERRMR),
+static const struct drm_i915_reg_descriptor gen9_blt_regs[] = {
+ REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE),
+ REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE),
+ REG32(BCS_SWCTRL),
+ REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
+ REG64_IDX(BCS_GPR, 0),
+ REG64_IDX(BCS_GPR, 1),
+ REG64_IDX(BCS_GPR, 2),
+ REG64_IDX(BCS_GPR, 3),
+ REG64_IDX(BCS_GPR, 4),
+ REG64_IDX(BCS_GPR, 5),
+ REG64_IDX(BCS_GPR, 6),
+ REG64_IDX(BCS_GPR, 7),
+ REG64_IDX(BCS_GPR, 8),
+ REG64_IDX(BCS_GPR, 9),
+ REG64_IDX(BCS_GPR, 10),
+ REG64_IDX(BCS_GPR, 11),
+ REG64_IDX(BCS_GPR, 12),
+ REG64_IDX(BCS_GPR, 13),
+ REG64_IDX(BCS_GPR, 14),
+ REG64_IDX(BCS_GPR, 15),
};

#undef REG64
@@ -630,28 +692,27 @@ static const struct drm_i915_reg_descriptor hsw_master_regs[] = {
struct drm_i915_reg_table {
const struct drm_i915_reg_descriptor *regs;
int num_regs;
- bool master;
};

static const struct drm_i915_reg_table ivb_render_reg_tables[] = {
- { gen7_render_regs, ARRAY_SIZE(gen7_render_regs), false },
- { ivb_master_regs, ARRAY_SIZE(ivb_master_regs), true },
+ { gen7_render_regs, ARRAY_SIZE(gen7_render_regs) },
};

static const struct drm_i915_reg_table ivb_blt_reg_tables[] = {
- { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs), false },
- { ivb_master_regs, ARRAY_SIZE(ivb_master_regs), true },
+ { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs) },
};

static const struct drm_i915_reg_table hsw_render_reg_tables[] = {
- { gen7_render_regs, ARRAY_SIZE(gen7_render_regs), false },
- { hsw_render_regs, ARRAY_SIZE(hsw_render_regs), false },
- { hsw_master_regs, ARRAY_SIZE(hsw_master_regs), true },
+ { gen7_render_regs, ARRAY_SIZE(gen7_render_regs) },
+ { hsw_render_regs, ARRAY_SIZE(hsw_render_regs) },
};

static const struct drm_i915_reg_table hsw_blt_reg_tables[] = {
- { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs), false },
- { hsw_master_regs, ARRAY_SIZE(hsw_master_regs), true },
+ { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs) },
+};
+
+static const struct drm_i915_reg_table gen9_blt_reg_tables[] = {
+ { gen9_blt_regs, ARRAY_SIZE(gen9_blt_regs) },
};

static u32 gen7_render_get_cmd_length_mask(u32 cmd_header)
@@ -709,6 +770,17 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
return 0;
}

+static u32 gen9_blt_get_cmd_length_mask(u32 cmd_header)
+{
+ u32 client = cmd_header >> INSTR_CLIENT_SHIFT;
+
+ if (client == INSTR_MI_CLIENT || client == INSTR_BC_CLIENT)
+ return 0xFF;
+
+ DRM_DEBUG_DRIVER("CMD: Abnormal blt cmd length! 0x%08X\n", cmd_header);
+ return 0;
+}
+
static bool validate_cmds_sorted(const struct intel_engine_cs *engine,
const struct drm_i915_cmd_table *cmd_tables,
int cmd_table_count)
@@ -866,18 +938,19 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
int cmd_table_count;
int ret;

- if (!IS_GEN(engine->i915, 7))
+ if (!IS_GEN(engine->i915, 7) && !(IS_GEN(engine->i915, 9) &&
+ engine->class == COPY_ENGINE_CLASS))
return;

switch (engine->class) {
case RENDER_CLASS:
if (IS_HASWELL(engine->i915)) {
- cmd_tables = hsw_render_ring_cmds;
+ cmd_tables = hsw_render_ring_cmd_table;
cmd_table_count =
- ARRAY_SIZE(hsw_render_ring_cmds);
+ ARRAY_SIZE(hsw_render_ring_cmd_table);
} else {
- cmd_tables = gen7_render_cmds;
- cmd_table_count = ARRAY_SIZE(gen7_render_cmds);
+ cmd_tables = gen7_render_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen7_render_cmd_table);
}

if (IS_HASWELL(engine->i915)) {
@@ -887,36 +960,46 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
engine->reg_tables = ivb_render_reg_tables;
engine->reg_table_count = ARRAY_SIZE(ivb_render_reg_tables);
}
-
engine->get_cmd_length_mask = gen7_render_get_cmd_length_mask;
break;
case VIDEO_DECODE_CLASS:
- cmd_tables = gen7_video_cmds;
- cmd_table_count = ARRAY_SIZE(gen7_video_cmds);
+ cmd_tables = gen7_video_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen7_video_cmd_table);
engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
break;
case COPY_ENGINE_CLASS:
- if (IS_HASWELL(engine->i915)) {
- cmd_tables = hsw_blt_ring_cmds;
- cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmds);
+ engine->get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
+ if (IS_GEN(engine->i915, 9)) {
+ cmd_tables = gen9_blt_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen9_blt_cmd_table);
+ engine->get_cmd_length_mask =
+ gen9_blt_get_cmd_length_mask;
+
+ /* BCS Engine unsafe without parser */
+ engine->flags |= I915_ENGINE_REQUIRES_CMD_PARSER;
+ } else if (IS_HASWELL(engine->i915)) {
+ cmd_tables = hsw_blt_ring_cmd_table;
+ cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmd_table);
} else {
- cmd_tables = gen7_blt_cmds;
- cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
+ cmd_tables = gen7_blt_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen7_blt_cmd_table);
}

- if (IS_HASWELL(engine->i915)) {
+ if (IS_GEN(engine->i915, 9)) {
+ engine->reg_tables = gen9_blt_reg_tables;
+ engine->reg_table_count =
+ ARRAY_SIZE(gen9_blt_reg_tables);
+ } else if (IS_HASWELL(engine->i915)) {
engine->reg_tables = hsw_blt_reg_tables;
engine->reg_table_count = ARRAY_SIZE(hsw_blt_reg_tables);
} else {
engine->reg_tables = ivb_blt_reg_tables;
engine->reg_table_count = ARRAY_SIZE(ivb_blt_reg_tables);
}
-
- engine->get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
break;
case VIDEO_ENHANCEMENT_CLASS:
- cmd_tables = hsw_vebox_cmds;
- cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds);
+ cmd_tables = hsw_vebox_cmd_table;
+ cmd_table_count = ARRAY_SIZE(hsw_vebox_cmd_table);
/* VECS can use the same length_mask function as VCS */
engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
break;
@@ -942,7 +1025,7 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
return;
}

- engine->flags |= I915_ENGINE_NEEDS_CMD_PARSER;
+ engine->flags |= I915_ENGINE_USING_CMD_PARSER;
}

/**
@@ -954,7 +1037,7 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
*/
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine)
{
- if (!intel_engine_needs_cmd_parser(engine))
+ if (!intel_engine_using_cmd_parser(engine))
return;

fini_hash_table(engine);
@@ -1028,22 +1111,16 @@ __find_reg(const struct drm_i915_reg_descriptor *table, int count, u32 addr)
}

static const struct drm_i915_reg_descriptor *
-find_reg(const struct intel_engine_cs *engine, bool is_master, u32 addr)
+find_reg(const struct intel_engine_cs *engine, u32 addr)
{
const struct drm_i915_reg_table *table = engine->reg_tables;
+ const struct drm_i915_reg_descriptor *reg = NULL;
int count = engine->reg_table_count;

- for (; count > 0; ++table, --count) {
- if (!table->master || is_master) {
- const struct drm_i915_reg_descriptor *reg;
+ for (; !reg && (count > 0); ++table, --count)
+ reg = __find_reg(table->regs, table->num_regs, addr);

- reg = __find_reg(table->regs, table->num_regs, addr);
- if (reg != NULL)
- return reg;
- }
- }
-
- return NULL;
+ return reg;
}

/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
@@ -1127,8 +1204,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,

static bool check_cmd(const struct intel_engine_cs *engine,
const struct drm_i915_cmd_descriptor *desc,
- const u32 *cmd, u32 length,
- const bool is_master)
+ const u32 *cmd, u32 length)
{
if (desc->flags & CMD_DESC_SKIP)
return true;
@@ -1138,12 +1214,6 @@ static bool check_cmd(const struct intel_engine_cs *engine,
return false;
}

- if ((desc->flags & CMD_DESC_MASTER) && !is_master) {
- DRM_DEBUG_DRIVER("CMD: Rejected master-only command: 0x%08X\n",
- *cmd);
- return false;
- }
-
if (desc->flags & CMD_DESC_REGISTER) {
/*
* Get the distance between individual register offset
@@ -1157,7 +1227,7 @@ static bool check_cmd(const struct intel_engine_cs *engine,
offset += step) {
const u32 reg_addr = cmd[offset] & desc->reg.mask;
const struct drm_i915_reg_descriptor *reg =
- find_reg(engine, is_master, reg_addr);
+ find_reg(engine, reg_addr);

if (!reg) {
DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (%s)\n",
@@ -1235,16 +1305,112 @@ static bool check_cmd(const struct intel_engine_cs *engine,
return true;
}

+static int check_bbstart(const struct i915_gem_context *ctx,
+ u32 *cmd, u32 offset, u32 length,
+ u32 batch_len,
+ u64 batch_start,
+ u64 shadow_batch_start)
+{
+ u64 jump_offset, jump_target;
+ u32 target_cmd_offset, target_cmd_index;
+
+ /* For igt compatibility on older platforms */
+ if (CMDPARSER_USES_GGTT(ctx->i915)) {
+ DRM_DEBUG("CMD: Rejecting BB_START for ggtt based submission\n");
+ return -EACCES;
+ }
+
+ if (length != 3) {
+ DRM_DEBUG("CMD: Recursive BB_START with bad length(%u)\n",
+ length);
+ return -EINVAL;
+ }
+
+ jump_target = *(u64*)(cmd+1);
+ jump_offset = jump_target - batch_start;
+
+ /*
+ * Any underflow of jump_target is guaranteed to be outside the range
+ * of a u32, so >= test catches both too large and too small
+ */
+ if (jump_offset >= batch_len) {
+ DRM_DEBUG("CMD: BB_START to 0x%llx jumps out of BB\n",
+ jump_target);
+ return -EINVAL;
+ }
+
+ /*
+ * This cannot overflow a u32 because we already checked jump_offset
+ * is within the BB, and the batch_len is a u32
+ */
+ target_cmd_offset = lower_32_bits(jump_offset);
+ target_cmd_index = target_cmd_offset / sizeof(u32);
+
+ *(u64*)(cmd + 1) = shadow_batch_start + target_cmd_offset;
+
+ if (target_cmd_index == offset)
+ return 0;
+
+ if (ctx->jump_whitelist_cmds <= target_cmd_index) {
+ DRM_DEBUG("CMD: Rejecting BB_START - truncated whitelist array\n");
+ return -EINVAL;
+ } else if (!test_bit(target_cmd_index, ctx->jump_whitelist)) {
+ DRM_DEBUG("CMD: BB_START to 0x%llx not a previously executed cmd\n",
+ jump_target);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static void init_whitelist(struct i915_gem_context *ctx, u32 batch_len)
+{
+ const u32 batch_cmds = DIV_ROUND_UP(batch_len, sizeof(u32));
+ const u32 exact_size = BITS_TO_LONGS(batch_cmds);
+ u32 next_size = BITS_TO_LONGS(roundup_pow_of_two(batch_cmds));
+ unsigned long *next_whitelist;
+
+ if (CMDPARSER_USES_GGTT(ctx->i915))
+ return;
+
+ if (batch_cmds <= ctx->jump_whitelist_cmds) {
+ bitmap_zero(ctx->jump_whitelist, batch_cmds);
+ return;
+ }
+
+again:
+ next_whitelist = kcalloc(next_size, sizeof(long), GFP_KERNEL);
+ if (next_whitelist) {
+ kfree(ctx->jump_whitelist);
+ ctx->jump_whitelist = next_whitelist;
+ ctx->jump_whitelist_cmds =
+ next_size * BITS_PER_BYTE * sizeof(long);
+ return;
+ }
+
+ if (next_size > exact_size) {
+ next_size = exact_size;
+ goto again;
+ }
+
+ DRM_DEBUG("CMD: Failed to extend whitelist. BB_START may be disallowed\n");
+ bitmap_zero(ctx->jump_whitelist, ctx->jump_whitelist_cmds);
+
+ return;
+}
+
#define LENGTH_BIAS 2

/**
* i915_parse_cmds() - parse a submitted batch buffer for privilege violations
+ * @ctx: the context in which the batch is to execute
* @engine: the engine on which the batch is to execute
* @batch_obj: the batch buffer in question
- * @shadow_batch_obj: copy of the batch buffer in question
+ * @batch_start: Canonical base address of batch
* @batch_start_offset: byte offset in the batch at which execution starts
* @batch_len: length of the commands in batch_obj
- * @is_master: is the submitting process the drm master?
+ * @shadow_batch_obj: copy of the batch buffer in question
+ * @shadow_batch_start: Canonical base address of shadow_batch_obj
*
* Parses the specified batch buffer looking for privilege violations as
* described in the overview.
@@ -1252,14 +1418,17 @@ static bool check_cmd(const struct intel_engine_cs *engine,
* Return: non-zero if the parser finds violations or otherwise fails; -EACCES
* if the batch appears legal but should use hardware parsing
*/
-int intel_engine_cmd_parser(struct intel_engine_cs *engine,
+
+int intel_engine_cmd_parser(struct i915_gem_context *ctx,
+ struct intel_engine_cs *engine,
struct drm_i915_gem_object *batch_obj,
- struct drm_i915_gem_object *shadow_batch_obj,
+ u64 batch_start,
u32 batch_start_offset,
u32 batch_len,
- bool is_master)
+ struct drm_i915_gem_object *shadow_batch_obj,
+ u64 shadow_batch_start)
{
- u32 *cmd, *batch_end;
+ u32 *cmd, *batch_end, offset = 0;
struct drm_i915_cmd_descriptor default_desc = noop_desc;
const struct drm_i915_cmd_descriptor *desc = &default_desc;
bool needs_clflush_after = false;
@@ -1273,6 +1442,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
return PTR_ERR(cmd);
}

+ init_whitelist(ctx, batch_len);
+
/*
* We use the batch length as size because the shadow object is as
* large or larger and copy_batch() will write MI_NOPs to the extra
@@ -1282,31 +1453,15 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
do {
u32 length;

- if (*cmd == MI_BATCH_BUFFER_END) {
- if (needs_clflush_after) {
- void *ptr = page_mask_bits(shadow_batch_obj->mm.mapping);
- drm_clflush_virt_range(ptr,
- (void *)(cmd + 1) - ptr);
- }
+ if (*cmd == MI_BATCH_BUFFER_END)
break;
- }

desc = find_cmd(engine, *cmd, desc, &default_desc);
if (!desc) {
DRM_DEBUG_DRIVER("CMD: Unrecognized command: 0x%08X\n",
*cmd);
ret = -EINVAL;
- break;
- }
-
- /*
- * If the batch buffer contains a chained batch, return an
- * error that tells the caller to abort and dispatch the
- * workload as a non-secure batch.
- */
- if (desc->cmd.value == MI_BATCH_BUFFER_START) {
- ret = -EACCES;
- break;
+ goto err;
}

if (desc->flags & CMD_DESC_FIXED)
@@ -1320,22 +1475,43 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
length,
batch_end - cmd);
ret = -EINVAL;
- break;
+ goto err;
}

- if (!check_cmd(engine, desc, cmd, length, is_master)) {
+ if (!check_cmd(engine, desc, cmd, length)) {
ret = -EACCES;
+ goto err;
+ }
+
+ if (desc->cmd.value == MI_BATCH_BUFFER_START) {
+ ret = check_bbstart(ctx, cmd, offset, length,
+ batch_len, batch_start,
+ shadow_batch_start);
+
+ if (ret)
+ goto err;
break;
}

+ if (ctx->jump_whitelist_cmds > offset)
+ set_bit(offset, ctx->jump_whitelist);
+
cmd += length;
+ offset += length;
if (cmd >= batch_end) {
DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
ret = -EINVAL;
- break;
+ goto err;
}
} while (1);

+ if (needs_clflush_after) {
+ void *ptr = page_mask_bits(shadow_batch_obj->mm.mapping);
+
+ drm_clflush_virt_range(ptr, (void *)(cmd + 1) - ptr);
+ }
+
+err:
i915_gem_object_unpin_map(shadow_batch_obj);
return ret;
}
@@ -1357,7 +1533,7 @@ int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv)

/* If the command parser is not enabled, report 0 - unsupported */
for_each_engine(engine, dev_priv, id) {
- if (intel_engine_needs_cmd_parser(engine)) {
+ if (intel_engine_using_cmd_parser(engine)) {
active = true;
break;
}
@@ -1382,6 +1558,7 @@ int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv)
* the parser enabled.
* 9. Don't whitelist or handle oacontrol specially, as ownership
* for oacontrol state is moving to i915-perf.
+ * 10. Support for Gen9 BCS Parsing
*/
- return 9;
+ return 10;
}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5b895df09ebf..b6d51514cf9c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -387,7 +387,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
value = !!(dev_priv->caps.scheduler & I915_SCHEDULER_CAP_SEMAPHORES);
break;
case I915_PARAM_HAS_SECURE_BATCHES:
- value = capable(CAP_SYS_ADMIN);
+ value = HAS_SECURE_BATCHES(dev_priv) && capable(CAP_SYS_ADMIN);
break;
case I915_PARAM_CMD_PARSER_VERSION:
value = i915_cmd_parser_get_version(dev_priv);
@@ -2156,6 +2156,8 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation)

i915_gem_suspend_late(dev_priv);

+ intel_rc6_ctx_wa_suspend(dev_priv);
+
intel_uncore_suspend(&dev_priv->uncore);

intel_power_domains_suspend(dev_priv,
@@ -2372,6 +2374,8 @@ static int i915_drm_resume_early(struct drm_device *dev)

intel_power_domains_resume(dev_priv);

+ intel_rc6_ctx_wa_resume(dev_priv);
+
intel_gt_sanitize(dev_priv, true);

enable_rpm_wakeref_asserts(&dev_priv->runtime_pm);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index edb88406cb75..a992f0749859 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -696,6 +696,8 @@ struct intel_rps {

struct intel_rc6 {
bool enabled;
+ bool ctx_corrupted;
+ intel_wakeref_t ctx_corrupted_wakeref;
u64 prev_hw_residency[4];
u64 cur_residency[4];
};
@@ -2246,9 +2248,16 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
#define VEBOX_MASK(dev_priv) \
ENGINE_INSTANCES_MASK(dev_priv, VECS0, I915_MAX_VECS)

+/*
+ * The Gen7 cmdparser copies the scanned buffer to the ggtt for execution
+ * All later gens can run the final buffer from the ppgtt
+ */
+#define CMDPARSER_USES_GGTT(dev_priv) IS_GEN(dev_priv, 7)
+
#define HAS_LLC(dev_priv) (INTEL_INFO(dev_priv)->has_llc)
#define HAS_SNOOP(dev_priv) (INTEL_INFO(dev_priv)->has_snoop)
#define HAS_EDRAM(dev_priv) ((dev_priv)->edram_size_mb)
+#define HAS_SECURE_BATCHES(dev_priv) (INTEL_GEN(dev_priv) < 6)
#define HAS_WT(dev_priv) ((IS_HASWELL(dev_priv) || \
IS_BROADWELL(dev_priv)) && HAS_EDRAM(dev_priv))

@@ -2281,10 +2290,12 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
/* Early gen2 have a totally busted CS tlb and require pinned batches. */
#define HAS_BROKEN_CS_TLB(dev_priv) (IS_I830(dev_priv) || IS_I845G(dev_priv))

+#define NEEDS_RC6_CTX_CORRUPTION_WA(dev_priv) \
+ (IS_BROADWELL(dev_priv) || IS_GEN(dev_priv, 9))
+
/* WaRsDisableCoarsePowerGating:skl,cnl */
#define NEEDS_WaRsDisableCoarsePowerGating(dev_priv) \
- (IS_CANNONLAKE(dev_priv) || \
- IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv))
+ (IS_CANNONLAKE(dev_priv) || IS_GEN(dev_priv, 9))

#define HAS_GMBUS_IRQ(dev_priv) (INTEL_GEN(dev_priv) >= 4)
#define HAS_GMBUS_BURST_READ(dev_priv) (INTEL_GEN(dev_priv) >= 10 || \
@@ -2528,6 +2539,14 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,

int i915_gem_object_unbind(struct drm_i915_gem_object *obj);

+struct i915_vma * __must_check
+i915_gem_object_pin(struct drm_i915_gem_object *obj,
+ struct i915_address_space *vm,
+ const struct i915_ggtt_view *view,
+ u64 size,
+ u64 alignment,
+ u64 flags);
+
void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv);

static inline int __must_check
@@ -2712,12 +2731,14 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv);
void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
-int intel_engine_cmd_parser(struct intel_engine_cs *engine,
+int intel_engine_cmd_parser(struct i915_gem_context *cxt,
+ struct intel_engine_cs *engine,
struct drm_i915_gem_object *batch_obj,
- struct drm_i915_gem_object *shadow_batch_obj,
+ u64 user_batch_start,
u32 batch_start_offset,
u32 batch_len,
- bool is_master);
+ struct drm_i915_gem_object *shadow_batch_obj,
+ u64 shadow_batch_start);

/* i915_perf.c */
extern void i915_perf_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7f6af4ca0968..a88e34b9350f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1025,6 +1025,20 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
{
struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
struct i915_address_space *vm = &dev_priv->ggtt.vm;
+
+ return i915_gem_object_pin(obj, vm, view, size, alignment,
+ flags | PIN_GLOBAL);
+}
+
+struct i915_vma *
+i915_gem_object_pin(struct drm_i915_gem_object *obj,
+ struct i915_address_space *vm,
+ const struct i915_ggtt_view *view,
+ u64 size,
+ u64 alignment,
+ u64 flags)
+{
+ struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
struct i915_vma *vma;
int ret;

@@ -1091,7 +1105,7 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
return ERR_PTR(ret);
}

- ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL);
+ ret = i915_vma_pin(vma, size, alignment, flags);
if (ret)
return ERR_PTR(ret);

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d6483b5dc8e5..177a26275811 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -493,6 +493,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define ECOCHK_PPGTT_WT_HSW (0x2 << 3)
#define ECOCHK_PPGTT_WB_HSW (0x3 << 3)

+#define GEN8_RC6_CTX_INFO _MMIO(0x8504)
+
#define GAC_ECO_BITS _MMIO(0x14090)
#define ECOBITS_SNB_BIT (1 << 13)
#define ECOBITS_PPGTT_CACHE64B (3 << 8)
@@ -577,6 +579,10 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
*/
#define BCS_SWCTRL _MMIO(0x22200)

+/* There are 16 GPR registers */
+#define BCS_GPR(n) _MMIO(0x22600 + (n) * 8)
+#define BCS_GPR_UDW(n) _MMIO(0x22600 + (n) * 8 + 4)
+
#define GPGPU_THREADS_DISPATCHED _MMIO(0x2290)
#define GPGPU_THREADS_DISPATCHED_UDW _MMIO(0x2290 + 4)
#define HS_INVOCATION_COUNT _MMIO(0x2300)
@@ -7229,6 +7235,10 @@ enum {
#define SKL_CSR_DC5_DC6_COUNT _MMIO(0x8002C)
#define BXT_CSR_DC3_DC5_COUNT _MMIO(0x80038)

+/* Display Internal Timeout Register */
+#define RM_TIMEOUT _MMIO(0x42060)
+#define MMIO_TIMEOUT_US(us) ((us) << 0)
+
/* interrupts */
#define DE_MASTER_IRQ_CONTROL (1 << 31)
#define DE_SPRITEB_FLIP_DONE (1 << 29)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d9a7a13ce32a..9546900fd72b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -125,6 +125,14 @@ static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
*/
I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
PWM1_GATING_DIS | PWM2_GATING_DIS);
+
+ /*
+ * Lower the display internal timeout.
+ * This is needed to avoid any hard hangs when DSI port PLL
+ * is off and a MMIO access is attempted by any privilege
+ * application, using batch buffers or any other means.
+ */
+ I915_WRITE(RM_TIMEOUT, MMIO_TIMEOUT_US(950));
}

static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -8556,6 +8564,95 @@ static void intel_init_emon(struct drm_i915_private *dev_priv)
dev_priv->ips.corr = (lcfuse & LCFUSE_HIV_MASK);
}

+static bool intel_rc6_ctx_corrupted(struct drm_i915_private *dev_priv)
+{
+ return !I915_READ(GEN8_RC6_CTX_INFO);
+}
+
+static void intel_rc6_ctx_wa_init(struct drm_i915_private *i915)
+{
+ if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+ return;
+
+ if (intel_rc6_ctx_corrupted(i915)) {
+ DRM_INFO("RC6 context corrupted, disabling runtime power management\n");
+ i915->gt_pm.rc6.ctx_corrupted = true;
+ i915->gt_pm.rc6.ctx_corrupted_wakeref =
+ intel_runtime_pm_get(&i915->runtime_pm);
+ }
+}
+
+static void intel_rc6_ctx_wa_cleanup(struct drm_i915_private *i915)
+{
+ if (i915->gt_pm.rc6.ctx_corrupted) {
+ intel_runtime_pm_put(&i915->runtime_pm,
+ i915->gt_pm.rc6.ctx_corrupted_wakeref);
+ i915->gt_pm.rc6.ctx_corrupted = false;
+ }
+}
+
+/**
+ * intel_rc6_ctx_wa_suspend - system suspend sequence for the RC6 CTX WA
+ * @i915: i915 device
+ *
+ * Perform any steps needed to clean up the RC6 CTX WA before system suspend.
+ */
+void intel_rc6_ctx_wa_suspend(struct drm_i915_private *i915)
+{
+ if (i915->gt_pm.rc6.ctx_corrupted)
+ intel_runtime_pm_put(&i915->runtime_pm,
+ i915->gt_pm.rc6.ctx_corrupted_wakeref);
+}
+
+/**
+ * intel_rc6_ctx_wa_resume - system resume sequence for the RC6 CTX WA
+ * @i915: i915 device
+ *
+ * Perform any steps needed to re-init the RC6 CTX WA after system resume.
+ */
+void intel_rc6_ctx_wa_resume(struct drm_i915_private *i915)
+{
+ if (!i915->gt_pm.rc6.ctx_corrupted)
+ return;
+
+ if (intel_rc6_ctx_corrupted(i915)) {
+ i915->gt_pm.rc6.ctx_corrupted_wakeref =
+ intel_runtime_pm_get(&i915->runtime_pm);
+ return;
+ }
+
+ DRM_INFO("RC6 context restored, re-enabling runtime power management\n");
+ i915->gt_pm.rc6.ctx_corrupted = false;
+}
+
+static void intel_disable_rc6(struct drm_i915_private *dev_priv);
+
+/**
+ * intel_rc6_ctx_wa_check - check for a new RC6 CTX corruption
+ * @i915: i915 device
+ *
+ * Check if an RC6 CTX corruption has happened since the last check and if so
+ * disable RC6 and runtime power management.
+*/
+void intel_rc6_ctx_wa_check(struct drm_i915_private *i915)
+{
+ if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+ return;
+
+ if (i915->gt_pm.rc6.ctx_corrupted)
+ return;
+
+ if (!intel_rc6_ctx_corrupted(i915))
+ return;
+
+ DRM_NOTE("RC6 context corruption, disabling runtime power management\n");
+
+ intel_disable_rc6(i915);
+ i915->gt_pm.rc6.ctx_corrupted = true;
+ i915->gt_pm.rc6.ctx_corrupted_wakeref =
+ intel_runtime_pm_get_noresume(&i915->runtime_pm);
+}
+
void intel_init_gt_powersave(struct drm_i915_private *dev_priv)
{
struct intel_rps *rps = &dev_priv->gt_pm.rps;
@@ -8569,6 +8666,8 @@ void intel_init_gt_powersave(struct drm_i915_private *dev_priv)
pm_runtime_get(&dev_priv->drm.pdev->dev);
}

+ intel_rc6_ctx_wa_init(dev_priv);
+
/* Initialize RPS limits (for userspace) */
if (IS_CHERRYVIEW(dev_priv))
cherryview_init_gt_powersave(dev_priv);
@@ -8607,6 +8706,8 @@ void intel_cleanup_gt_powersave(struct drm_i915_private *dev_priv)
if (IS_VALLEYVIEW(dev_priv))
valleyview_cleanup_gt_powersave(dev_priv);

+ intel_rc6_ctx_wa_cleanup(dev_priv);
+
if (!HAS_RC6(dev_priv))
pm_runtime_put(&dev_priv->drm.pdev->dev);
}
@@ -8635,7 +8736,7 @@ static inline void intel_disable_llc_pstate(struct drm_i915_private *i915)
i915->gt_pm.llc_pstate.enabled = false;
}

-static void intel_disable_rc6(struct drm_i915_private *dev_priv)
+static void __intel_disable_rc6(struct drm_i915_private *dev_priv)
{
lockdep_assert_held(&dev_priv->gt_pm.rps.lock);

@@ -8654,6 +8755,13 @@ static void intel_disable_rc6(struct drm_i915_private *dev_priv)
dev_priv->gt_pm.rc6.enabled = false;
}

+static void intel_disable_rc6(struct drm_i915_private *dev_priv)
+{
+ mutex_lock(&dev_priv->gt_pm.rps.lock);
+ __intel_disable_rc6(dev_priv);
+ mutex_unlock(&dev_priv->gt_pm.rps.lock);
+}
+
static void intel_disable_rps(struct drm_i915_private *dev_priv)
{
lockdep_assert_held(&dev_priv->gt_pm.rps.lock);
@@ -8679,7 +8787,7 @@ void intel_disable_gt_powersave(struct drm_i915_private *dev_priv)
{
mutex_lock(&dev_priv->gt_pm.rps.lock);

- intel_disable_rc6(dev_priv);
+ __intel_disable_rc6(dev_priv);
intel_disable_rps(dev_priv);
if (HAS_LLC(dev_priv))
intel_disable_llc_pstate(dev_priv);
@@ -8706,6 +8814,9 @@ static void intel_enable_rc6(struct drm_i915_private *dev_priv)
if (dev_priv->gt_pm.rc6.enabled)
return;

+ if (dev_priv->gt_pm.rc6.ctx_corrupted)
+ return;
+
if (IS_CHERRYVIEW(dev_priv))
cherryview_enable_rc6(dev_priv);
else if (IS_VALLEYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_pm.h b/drivers/gpu/drm/i915/intel_pm.h
index 1b489fa399e1..4ccb5a53b61c 100644
--- a/drivers/gpu/drm/i915/intel_pm.h
+++ b/drivers/gpu/drm/i915/intel_pm.h
@@ -36,6 +36,9 @@ void intel_cleanup_gt_powersave(struct drm_i915_private *dev_priv);
void intel_sanitize_gt_powersave(struct drm_i915_private *dev_priv);
void intel_enable_gt_powersave(struct drm_i915_private *dev_priv);
void intel_disable_gt_powersave(struct drm_i915_private *dev_priv);
+void intel_rc6_ctx_wa_check(struct drm_i915_private *i915);
+void intel_rc6_ctx_wa_suspend(struct drm_i915_private *i915);
+void intel_rc6_ctx_wa_resume(struct drm_i915_private *i915);
void gen6_rps_busy(struct drm_i915_private *dev_priv);
void gen6_rps_idle(struct drm_i915_private *dev_priv);
void gen6_rps_boost(struct i915_request *rq);
diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index 460fd98e40a7..a0b382a637a6 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -1958,6 +1958,7 @@ static void si_initialize_powertune_defaults(struct radeon_device *rdev)
case 0x682C:
si_pi->cac_weights = cac_weights_cape_verde_pro;
si_pi->dte_data = dte_data_sun_xt;
+ update_dte_from_pl2 = true;
break;
case 0x6825:
case 0x6827:
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c1058eece16b..27e6449da24a 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -478,6 +478,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
struct drm_sched_job *s_job, *tmp;
uint64_t guilty_context;
bool found_guilty = false;
+ struct dma_fence *fence;

list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
struct drm_sched_fence *s_fence = s_job->s_fence;
@@ -491,7 +492,16 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
dma_fence_set_error(&s_fence->finished, -ECANCELED);

dma_fence_put(s_job->s_fence->parent);
- s_job->s_fence->parent = sched->ops->run_job(s_job);
+ fence = sched->ops->run_job(s_job);
+
+ if (IS_ERR_OR_NULL(fence)) {
+ s_job->s_fence->parent = NULL;
+ dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
+ } else {
+ s_job->s_fence->parent = fence;
+ }
+
+
}
}
EXPORT_SYMBOL(drm_sched_resubmit_jobs);
@@ -719,7 +729,7 @@ static int drm_sched_main(void *param)
fence = sched->ops->run_job(sched_job);
drm_sched_fence_scheduled(s_fence);

- if (fence) {
+ if (!IS_ERR_OR_NULL(fence)) {
s_fence->parent = dma_fence_get(fence);
r = dma_fence_add_callback(fence, &sched_job->cb,
drm_sched_process_job);
@@ -729,8 +739,11 @@ static int drm_sched_main(void *param)
DRM_ERROR("fence add callback failed (%d)\n",
r);
dma_fence_put(fence);
- } else
+ } else {
+
+ dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
drm_sched_process_job(NULL, &sched_job->cb);
+ }

wake_up(&sched->job_scheduled);
}
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 27e0f87075d9..4dc7e38c99c7 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -555,13 +555,16 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,

if (args->bcl_start != args->bcl_end) {
bin = kcalloc(1, sizeof(*bin), GFP_KERNEL);
- if (!bin)
+ if (!bin) {
+ v3d_job_put(&render->base);
return -ENOMEM;
+ }

ret = v3d_job_init(v3d, file_priv, &bin->base,
v3d_job_free, args->in_sync_bcl);
if (ret) {
v3d_job_put(&render->base);
+ kfree(bin);
return ret;
}

diff --git a/drivers/hid/hid-google-hammer.c b/drivers/hid/hid-google-hammer.c
index ee5e0bdcf078..154f1ce771d5 100644
--- a/drivers/hid/hid-google-hammer.c
+++ b/drivers/hid/hid-google-hammer.c
@@ -469,6 +469,10 @@ static int hammer_probe(struct hid_device *hdev,
static const struct hid_device_id hammer_devices[] = {
{ HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_HAMMER) },
+ { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
+ USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_MAGNEMITE) },
+ { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
+ USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_MASTERBALL) },
{ HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_STAFF) },
{ HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index e4d51ce20a6a..9cf5a95c1bd3 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -475,6 +475,8 @@
#define USB_DEVICE_ID_GOOGLE_STAFF 0x502b
#define USB_DEVICE_ID_GOOGLE_WAND 0x502d
#define USB_DEVICE_ID_GOOGLE_WHISKERS 0x5030
+#define USB_DEVICE_ID_GOOGLE_MASTERBALL 0x503c
+#define USB_DEVICE_ID_GOOGLE_MAGNEMITE 0x503d

#define USB_VENDOR_ID_GOTOP 0x08f2
#define USB_DEVICE_ID_SUPER_Q2 0x007f
diff --git a/drivers/hid/intel-ish-hid/ishtp/client-buffers.c b/drivers/hid/intel-ish-hid/ishtp/client-buffers.c
index 1b0a0cc605e7..513d7a4a1b8a 100644
--- a/drivers/hid/intel-ish-hid/ishtp/client-buffers.c
+++ b/drivers/hid/intel-ish-hid/ishtp/client-buffers.c
@@ -84,7 +84,7 @@ int ishtp_cl_alloc_tx_ring(struct ishtp_cl *cl)
return 0;
out:
dev_err(&cl->device->dev, "error in allocating Tx pool\n");
- ishtp_cl_free_rx_ring(cl);
+ ishtp_cl_free_tx_ring(cl);
return -ENOMEM;
}

diff --git a/drivers/hid/wacom.h b/drivers/hid/wacom.h
index 4a7f8d363220..203d27d198b8 100644
--- a/drivers/hid/wacom.h
+++ b/drivers/hid/wacom.h
@@ -202,6 +202,21 @@ static inline void wacom_schedule_work(struct wacom_wac *wacom_wac,
}
}

+/*
+ * Convert a signed 32-bit integer to an unsigned n-bit integer. Undoes
+ * the normally-helpful work of 'hid_snto32' for fields that use signed
+ * ranges for questionable reasons.
+ */
+static inline __u32 wacom_s32tou(s32 value, __u8 n)
+{
+ switch (n) {
+ case 8: return ((__u8)value);
+ case 16: return ((__u16)value);
+ case 32: return ((__u32)value);
+ }
+ return value & (1 << (n - 1)) ? value & (~(~0U << n)) : value;
+}
+
extern const struct hid_device_id wacom_ids[];

void wacom_wac_irq(struct wacom_wac *wacom_wac, size_t len);
diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
index 2b4640397375..4f2b08aa7508 100644
--- a/drivers/hid/wacom_wac.c
+++ b/drivers/hid/wacom_wac.c
@@ -2258,7 +2258,7 @@ static void wacom_wac_pen_event(struct hid_device *hdev, struct hid_field *field
case HID_DG_TOOLSERIALNUMBER:
if (value) {
wacom_wac->serial[0] = (wacom_wac->serial[0] & ~0xFFFFFFFFULL);
- wacom_wac->serial[0] |= (__u32)value;
+ wacom_wac->serial[0] |= wacom_s32tou(value, field->report_size);
}
return;
case HID_DG_TWIST:
@@ -2274,15 +2274,17 @@ static void wacom_wac_pen_event(struct hid_device *hdev, struct hid_field *field
return;
case WACOM_HID_WD_SERIALHI:
if (value) {
+ __u32 raw_value = wacom_s32tou(value, field->report_size);
+
wacom_wac->serial[0] = (wacom_wac->serial[0] & 0xFFFFFFFF);
- wacom_wac->serial[0] |= ((__u64)value) << 32;
+ wacom_wac->serial[0] |= ((__u64)raw_value) << 32;
/*
* Non-USI EMR devices may contain additional tool type
* information here. See WACOM_HID_WD_TOOLTYPE case for
* more details.
*/
if (value >> 20 == 1) {
- wacom_wac->id[0] |= value & 0xFFFFF;
+ wacom_wac->id[0] |= raw_value & 0xFFFFF;
}
}
return;
@@ -2294,7 +2296,7 @@ static void wacom_wac_pen_event(struct hid_device *hdev, struct hid_field *field
* bitwise OR so the complete value can be built
* up over time :(
*/
- wacom_wac->id[0] |= value;
+ wacom_wac->id[0] |= wacom_s32tou(value, field->report_size);
return;
case WACOM_HID_WD_OFFSETLEFT:
if (features->offset_left && value != features->offset_left)
diff --git a/drivers/hwmon/ina3221.c b/drivers/hwmon/ina3221.c
index 0037e2bdacd6..8a51dcf055ea 100644
--- a/drivers/hwmon/ina3221.c
+++ b/drivers/hwmon/ina3221.c
@@ -170,7 +170,7 @@ static inline int ina3221_wait_for_data(struct ina3221_data *ina)

/* Polling the CVRF bit to make sure read data is ready */
return regmap_field_read_poll_timeout(ina->fields[F_CVRF],
- cvrf, cvrf, wait, 100000);
+ cvrf, cvrf, wait, wait * 2);
}

static int ina3221_read_value(struct ina3221_data *ina, unsigned int reg,
diff --git a/drivers/hwtracing/intel_th/gth.c b/drivers/hwtracing/intel_th/gth.c
index fa9d34af87ac..f72803a02391 100644
--- a/drivers/hwtracing/intel_th/gth.c
+++ b/drivers/hwtracing/intel_th/gth.c
@@ -626,6 +626,9 @@ static void intel_th_gth_switch(struct intel_th_device *thdev,
if (!count)
dev_dbg(&thdev->dev, "timeout waiting for CTS Trigger\n");

+ /* De-assert the trigger */
+ iowrite32(0, gth->base + REG_CTS_CTL);
+
intel_th_gth_stop(gth, output, false);
intel_th_gth_start(gth, output);
}
diff --git a/drivers/hwtracing/intel_th/pci.c b/drivers/hwtracing/intel_th/pci.c
index 91dfeba62485..03ca5b1bef9f 100644
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -199,6 +199,11 @@ static const struct pci_device_id intel_th_pci_id_table[] = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x02a6),
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
+ {
+ /* Comet Lake PCH */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x06a6),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
{
/* Ice Lake NNPI */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x45c5),
@@ -209,6 +214,11 @@ static const struct pci_device_id intel_th_pci_id_table[] = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xa0a6),
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
+ {
+ /* Jasper Lake PCH */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4da6),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
{ 0 },
};

diff --git a/drivers/iio/adc/stm32-adc.c b/drivers/iio/adc/stm32-adc.c
index b22be473cb03..755627e4ab9e 100644
--- a/drivers/iio/adc/stm32-adc.c
+++ b/drivers/iio/adc/stm32-adc.c
@@ -1399,7 +1399,7 @@ static int stm32_adc_dma_start(struct iio_dev *indio_dev)
cookie = dmaengine_submit(desc);
ret = dma_submit_error(cookie);
if (ret) {
- dmaengine_terminate_all(adc->dma_chan);
+ dmaengine_terminate_sync(adc->dma_chan);
return ret;
}

@@ -1477,7 +1477,7 @@ static void __stm32_adc_buffer_predisable(struct iio_dev *indio_dev)
stm32_adc_conv_irq_disable(adc);

if (adc->dma_chan)
- dmaengine_terminate_all(adc->dma_chan);
+ dmaengine_terminate_sync(adc->dma_chan);

if (stm32_adc_set_trig(indio_dev, NULL))
dev_err(&indio_dev->dev, "Can't clear trigger\n");
diff --git a/drivers/iio/imu/adis16480.c b/drivers/iio/imu/adis16480.c
index b99d73887c9f..8743b2f376e2 100644
--- a/drivers/iio/imu/adis16480.c
+++ b/drivers/iio/imu/adis16480.c
@@ -317,8 +317,11 @@ static int adis16480_set_freq(struct iio_dev *indio_dev, int val, int val2)
struct adis16480 *st = iio_priv(indio_dev);
unsigned int t, reg;

+ if (val < 0 || val2 < 0)
+ return -EINVAL;
+
t = val * 1000 + val2 / 1000;
- if (t <= 0)
+ if (t == 0)
return -EINVAL;

/*
diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c b/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c
index 8a704cd5bddb..3cb41ac357fa 100644
--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c
+++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c
@@ -114,54 +114,63 @@ static const struct inv_mpu6050_hw hw_info[] = {
.name = "MPU6050",
.reg = &reg_set_6050,
.config = &chip_config_6050,
+ .fifo_size = 1024,
},
{
.whoami = INV_MPU6500_WHOAMI_VALUE,
.name = "MPU6500",
.reg = &reg_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_MPU6515_WHOAMI_VALUE,
.name = "MPU6515",
.reg = &reg_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_MPU6000_WHOAMI_VALUE,
.name = "MPU6000",
.reg = &reg_set_6050,
.config = &chip_config_6050,
+ .fifo_size = 1024,
},
{
.whoami = INV_MPU9150_WHOAMI_VALUE,
.name = "MPU9150",
.reg = &reg_set_6050,
.config = &chip_config_6050,
+ .fifo_size = 1024,
},
{
.whoami = INV_MPU9250_WHOAMI_VALUE,
.name = "MPU9250",
.reg = &reg_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_MPU9255_WHOAMI_VALUE,
.name = "MPU9255",
.reg = &reg_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_ICM20608_WHOAMI_VALUE,
.name = "ICM20608",
.reg = &reg_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_ICM20602_WHOAMI_VALUE,
.name = "ICM20602",
.reg = &reg_set_icm20602,
.config = &chip_config_6050,
+ .fifo_size = 1008,
},
};

diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
index db1c6904388b..51235677c534 100644
--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
+++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
@@ -100,12 +100,14 @@ struct inv_mpu6050_chip_config {
* @name: name of the chip.
* @reg: register map of the chip.
* @config: configuration of the chip.
+ * @fifo_size: size of the FIFO in bytes.
*/
struct inv_mpu6050_hw {
u8 whoami;
u8 *name;
const struct inv_mpu6050_reg_map *reg;
const struct inv_mpu6050_chip_config *config;
+ size_t fifo_size;
};

/*
diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c b/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
index 5f9a5de0bab4..72d8c5790076 100644
--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
+++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
@@ -180,9 +180,6 @@ irqreturn_t inv_mpu6050_read_fifo(int irq, void *p)
"failed to ack interrupt\n");
goto flush_fifo;
}
- /* handle fifo overflow by reseting fifo */
- if (int_status & INV_MPU6050_BIT_FIFO_OVERFLOW_INT)
- goto flush_fifo;
if (!(int_status & INV_MPU6050_BIT_RAW_DATA_RDY_INT)) {
dev_warn(regmap_get_device(st->map),
"spurious interrupt with status 0x%x\n", int_status);
@@ -211,6 +208,18 @@ irqreturn_t inv_mpu6050_read_fifo(int irq, void *p)
if (result)
goto end_session;
fifo_count = get_unaligned_be16(&data[0]);
+
+ /*
+ * Handle fifo overflow by resetting fifo.
+ * Reset if there is only 3 data set free remaining to mitigate
+ * possible delay between reading fifo count and fifo data.
+ */
+ nb = 3 * bytes_per_datum;
+ if (fifo_count >= st->hw->fifo_size - nb) {
+ dev_warn(regmap_get_device(st->map), "fifo overflow reset\n");
+ goto flush_fifo;
+ }
+
/* compute and process all complete datum */
nb = fifo_count / bytes_per_datum;
inv_mpu6050_update_period(st, pf->timestamp, nb);
diff --git a/drivers/iio/proximity/srf04.c b/drivers/iio/proximity/srf04.c
index 8b50d56b0a03..01eb8cc63076 100644
--- a/drivers/iio/proximity/srf04.c
+++ b/drivers/iio/proximity/srf04.c
@@ -110,7 +110,7 @@ static int srf04_read(struct srf04_data *data)
udelay(data->cfg->trigger_pulse_us);
gpiod_set_value(data->gpiod_trig, 0);

- /* it cannot take more than 20 ms */
+ /* it should not take more than 20 ms until echo is rising */
ret = wait_for_completion_killable_timeout(&data->rising, HZ/50);
if (ret < 0) {
mutex_unlock(&data->lock);
@@ -120,7 +120,8 @@ static int srf04_read(struct srf04_data *data)
return -ETIMEDOUT;
}

- ret = wait_for_completion_killable_timeout(&data->falling, HZ/50);
+ /* it cannot take more than 50 ms until echo is falling */
+ ret = wait_for_completion_killable_timeout(&data->falling, HZ/20);
if (ret < 0) {
mutex_unlock(&data->lock);
return ret;
@@ -135,19 +136,19 @@ static int srf04_read(struct srf04_data *data)

dt_ns = ktime_to_ns(ktime_dt);
/*
- * measuring more than 3 meters is beyond the capabilities of
- * the sensor
+ * measuring more than 6,45 meters is beyond the capabilities of
+ * the supported sensors
* ==> filter out invalid results for not measuring echos of
* another us sensor
*
* formula:
- * distance 3 m
- * time = ---------- = --------- = 9404389 ns
- * speed 319 m/s
+ * distance 6,45 * 2 m
+ * time = ---------- = ------------ = 40438871 ns
+ * speed 319 m/s
*
* using a minimum speed at -20 °C of 319 m/s
*/
- if (dt_ns > 9404389)
+ if (dt_ns > 40438871)
return -EIO;

time_ns = dt_ns;
@@ -159,20 +160,20 @@ static int srf04_read(struct srf04_data *data)
* with Temp in °C
* and speed in m/s
*
- * use 343 m/s as ultrasonic speed at 20 °C here in absence of the
+ * use 343,5 m/s as ultrasonic speed at 20 °C here in absence of the
* temperature
*
* therefore:
- * time 343
- * distance = ------ * -----
- * 10^6 2
+ * time 343,5 time * 106
+ * distance = ------ * ------- = ------------
+ * 10^6 2 617176
* with time in ns
* and distance in mm (one way)
*
- * because we limit to 3 meters the multiplication with 343 just
+ * because we limit to 6,45 meters the multiplication with 106 just
* fits into 32 bit
*/
- distance_mm = time_ns * 343 / 2000000;
+ distance_mm = time_ns * 106 / 617176;

return distance_mm;
}
diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index f42e856f3072..4300e2186584 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -778,7 +778,7 @@ static int fill_res_counter_entry(struct sk_buff *msg, bool has_cap_net_admin,
container_of(res, struct rdma_counter, res);

if (port && port != counter->port)
- return 0;
+ return -EAGAIN;

/* Dump it even query failed */
rdma_counter_query_stats(counter);
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 1e5aeb39f774..63f7f7db5902 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -98,7 +98,7 @@ ib_uverbs_init_udata_buf_or_null(struct ib_udata *udata,

struct ib_uverbs_device {
atomic_t refcount;
- int num_comp_vectors;
+ u32 num_comp_vectors;
struct completion comp;
struct device dev;
/* First group for device attributes, NULL terminated array */
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 92349bf37589..5b1dc11a7283 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -662,16 +662,17 @@ static bool find_gid_index(const union ib_gid *gid,
void *context)
{
struct find_gid_index_context *ctx = context;
+ u16 vlan_id = 0xffff;
+ int ret;

if (ctx->gid_type != gid_attr->gid_type)
return false;

- if ((!!(ctx->vlan_id != 0xffff) == !is_vlan_dev(gid_attr->ndev)) ||
- (is_vlan_dev(gid_attr->ndev) &&
- vlan_dev_vlan_id(gid_attr->ndev) != ctx->vlan_id))
+ ret = rdma_read_gid_l2_fields(gid_attr, &vlan_id, NULL);
+ if (ret)
return false;

- return true;
+ return ctx->vlan_id == vlan_id;
}

static const struct ib_gid_attr *
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index e87fc0408470..347dc242fb88 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -495,7 +495,6 @@ static int _put_ep_safe(struct c4iw_dev *dev, struct sk_buff *skb)

ep = *((struct c4iw_ep **)(skb->cb + 2 * sizeof(void *)));
release_ep_resources(ep);
- kfree_skb(skb);
return 0;
}

@@ -506,7 +505,6 @@ static int _put_pass_ep_safe(struct c4iw_dev *dev, struct sk_buff *skb)
ep = *((struct c4iw_ep **)(skb->cb + 2 * sizeof(void *)));
c4iw_put_ep(&ep->parent_ep->com);
release_ep_resources(ep);
- kfree_skb(skb);
return 0;
}

@@ -2424,20 +2422,6 @@ static int accept_cr(struct c4iw_ep *ep, struct sk_buff *skb,
enum chip_type adapter_type = ep->com.dev->rdev.lldi.adapter_type;

pr_debug("ep %p tid %u\n", ep, ep->hwtid);
-
- skb_get(skb);
- rpl = cplhdr(skb);
- if (!is_t4(adapter_type)) {
- skb_trim(skb, roundup(sizeof(*rpl5), 16));
- rpl5 = (void *)rpl;
- INIT_TP_WR(rpl5, ep->hwtid);
- } else {
- skb_trim(skb, sizeof(*rpl));
- INIT_TP_WR(rpl, ep->hwtid);
- }
- OPCODE_TID(rpl) = cpu_to_be32(MK_OPCODE_TID(CPL_PASS_ACCEPT_RPL,
- ep->hwtid));
-
cxgb_best_mtu(ep->com.dev->rdev.lldi.mtus, ep->mtu, &mtu_idx,
enable_tcp_timestamps && req->tcpopt.tstamp,
(ep->com.remote_addr.ss_family == AF_INET) ? 0 : 1);
@@ -2483,6 +2467,20 @@ static int accept_cr(struct c4iw_ep *ep, struct sk_buff *skb,
if (tcph->ece && tcph->cwr)
opt2 |= CCTRL_ECN_V(1);
}
+
+ skb_get(skb);
+ rpl = cplhdr(skb);
+ if (!is_t4(adapter_type)) {
+ skb_trim(skb, roundup(sizeof(*rpl5), 16));
+ rpl5 = (void *)rpl;
+ INIT_TP_WR(rpl5, ep->hwtid);
+ } else {
+ skb_trim(skb, sizeof(*rpl));
+ INIT_TP_WR(rpl, ep->hwtid);
+ }
+ OPCODE_TID(rpl) = cpu_to_be32(MK_OPCODE_TID(CPL_PASS_ACCEPT_RPL,
+ ep->hwtid));
+
if (CHELSIO_CHIP_VERSION(adapter_type) > CHELSIO_T4) {
u32 isn = (prandom_u32() & ~7UL) - 1;
opt2 |= T5_OPT_2_VALID_F;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index b76e3beeafb8..854898433916 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -5268,9 +5268,9 @@ static void hns_roce_v2_free_eq(struct hns_roce_dev *hr_dev,
return;
}

- if (eq->buf_list)
- dma_free_coherent(hr_dev->dev, buf_chk_sz,
- eq->buf_list->buf, eq->buf_list->map);
+ dma_free_coherent(hr_dev->dev, buf_chk_sz, eq->buf_list->buf,
+ eq->buf_list->map);
+ kfree(eq->buf_list);
}

static void hns_roce_config_eqc(struct hns_roce_dev *hr_dev,
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 72869ff4a334..3903141a387e 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -3249,10 +3249,12 @@ static int modify_raw_packet_qp_sq(
}

/* Only remove the old rate after new rate was set */
- if ((old_rl.rate &&
- !mlx5_rl_are_equal(&old_rl, &new_rl)) ||
- (new_state != MLX5_SQC_STATE_RDY))
+ if ((old_rl.rate && !mlx5_rl_are_equal(&old_rl, &new_rl)) ||
+ (new_state != MLX5_SQC_STATE_RDY)) {
mlx5_rl_remove_rate(dev, &old_rl);
+ if (new_state != MLX5_SQC_STATE_RDY)
+ memset(&new_rl, 0, sizeof(new_rl));
+ }

ibqp->rl = new_rl;
sq->state = new_state;
diff --git a/drivers/infiniband/hw/qedr/main.c b/drivers/infiniband/hw/qedr/main.c
index f97b3d65b30c..2fef7a48f77b 100644
--- a/drivers/infiniband/hw/qedr/main.c
+++ b/drivers/infiniband/hw/qedr/main.c
@@ -76,7 +76,7 @@ static void qedr_get_dev_fw_str(struct ib_device *ibdev, char *str)
struct qedr_dev *qedr = get_qedr_dev(ibdev);
u32 fw_ver = (u32)qedr->attr.fw_ver;

- snprintf(str, IB_FW_VERSION_NAME_MAX, "%d. %d. %d. %d",
+ snprintf(str, IB_FW_VERSION_NAME_MAX, "%d.%d.%d.%d",
(fw_ver >> 24) & 0xFF, (fw_ver >> 16) & 0xFF,
(fw_ver >> 8) & 0xFF, fw_ver & 0xFF);
}
diff --git a/drivers/infiniband/sw/siw/siw_qp.c b/drivers/infiniband/sw/siw/siw_qp.c
index 52d402f39df9..b4317480cee7 100644
--- a/drivers/infiniband/sw/siw/siw_qp.c
+++ b/drivers/infiniband/sw/siw/siw_qp.c
@@ -1312,6 +1312,7 @@ int siw_qp_add(struct siw_device *sdev, struct siw_qp *qp)
void siw_free_qp(struct kref *ref)
{
struct siw_qp *found, *qp = container_of(ref, struct siw_qp, ref);
+ struct siw_base_qp *siw_base_qp = to_siw_base_qp(qp->ib_qp);
struct siw_device *sdev = qp->sdev;
unsigned long flags;

@@ -1334,4 +1335,5 @@ void siw_free_qp(struct kref *ref)
atomic_dec(&sdev->num_qp);
siw_dbg_qp(qp, "free QP\n");
kfree_rcu(qp, rcu);
+ kfree(siw_base_qp);
}
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index da52c90e06d4..ac08d84d84cb 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -603,7 +603,6 @@ int siw_verbs_modify_qp(struct ib_qp *base_qp, struct ib_qp_attr *attr,
int siw_destroy_qp(struct ib_qp *base_qp, struct ib_udata *udata)
{
struct siw_qp *qp = to_siw_qp(base_qp);
- struct siw_base_qp *siw_base_qp = to_siw_base_qp(base_qp);
struct siw_ucontext *uctx =
rdma_udata_to_drv_context(udata, struct siw_ucontext,
base_ucontext);
@@ -640,7 +639,6 @@ int siw_destroy_qp(struct ib_qp *base_qp, struct ib_udata *udata)
qp->scq = qp->rcq = NULL;

siw_qp_put(qp);
- kfree(siw_base_qp);

return 0;
}
diff --git a/drivers/iommu/amd_iommu_quirks.c b/drivers/iommu/amd_iommu_quirks.c
index c235f79b7a20..5120ce4fdce3 100644
--- a/drivers/iommu/amd_iommu_quirks.c
+++ b/drivers/iommu/amd_iommu_quirks.c
@@ -73,6 +73,19 @@ static const struct dmi_system_id ivrs_quirks[] __initconst = {
},
.driver_data = (void *)&ivrs_ioapic_quirks[DELL_LATITUDE_5495],
},
+ {
+ /*
+ * Acer Aspire A315-41 requires the very same workaround as
+ * Dell Latitude 5495
+ */
+ .callback = ivrs_ioapic_quirk_cb,
+ .ident = "Acer Aspire A315-41",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Aspire A315-41"),
+ },
+ .driver_data = (void *)&ivrs_ioapic_quirks[DELL_LATITUDE_5495],
+ },
{
.callback = ivrs_ioapic_quirk_cb,
.ident = "Lenovo ideapad 330S-15ARR",
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 21d8fcc83c9c..8550822095be 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1816,7 +1816,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
slave_disable_netpoll(new_slave);

err_close:
- slave_dev->priv_flags &= ~IFF_BONDING;
+ if (!netif_is_bond_master(slave_dev))
+ slave_dev->priv_flags &= ~IFF_BONDING;
dev_close(slave_dev);

err_restore_mac:
@@ -2017,7 +2018,8 @@ static int __bond_release_one(struct net_device *bond_dev,
else
dev_set_mtu(slave_dev, slave->original_mtu);

- slave_dev->priv_flags &= ~IFF_BONDING;
+ if (!netif_is_bond_master(slave_dev))
+ slave_dev->priv_flags &= ~IFF_BONDING;

bond_free_slave(slave);

@@ -2086,8 +2088,7 @@ static int bond_miimon_inspect(struct bonding *bond)
ignore_updelay = !rcu_dereference(bond->curr_active_slave);

bond_for_each_slave_rcu(bond, slave, iter) {
- slave->new_link = BOND_LINK_NOCHANGE;
- slave->link_new_state = slave->link;
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);

link_state = bond_check_dev_link(bond, slave->dev, 0);

@@ -2121,7 +2122,7 @@ static int bond_miimon_inspect(struct bonding *bond)
}

if (slave->delay <= 0) {
- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
commit++;
continue;
}
@@ -2158,7 +2159,7 @@ static int bond_miimon_inspect(struct bonding *bond)
slave->delay = 0;

if (slave->delay <= 0) {
- slave->new_link = BOND_LINK_UP;
+ bond_propose_link_state(slave, BOND_LINK_UP);
commit++;
ignore_updelay = false;
continue;
@@ -2196,7 +2197,7 @@ static void bond_miimon_commit(struct bonding *bond)
struct slave *slave, *primary;

bond_for_each_slave(bond, slave, iter) {
- switch (slave->new_link) {
+ switch (slave->link_new_state) {
case BOND_LINK_NOCHANGE:
/* For 802.3ad mode, check current slave speed and
* duplex again in case its port was disabled after
@@ -2268,8 +2269,8 @@ static void bond_miimon_commit(struct bonding *bond)

default:
slave_err(bond->dev, slave->dev, "invalid new link %d on slave\n",
- slave->new_link);
- slave->new_link = BOND_LINK_NOCHANGE;
+ slave->link_new_state);
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);

continue;
}
@@ -2677,13 +2678,13 @@ static void bond_loadbalance_arp_mon(struct bonding *bond)
bond_for_each_slave_rcu(bond, slave, iter) {
unsigned long trans_start = dev_trans_start(slave->dev);

- slave->new_link = BOND_LINK_NOCHANGE;
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);

if (slave->link != BOND_LINK_UP) {
if (bond_time_in_interval(bond, trans_start, 1) &&
bond_time_in_interval(bond, slave->last_rx, 1)) {

- slave->new_link = BOND_LINK_UP;
+ bond_propose_link_state(slave, BOND_LINK_UP);
slave_state_changed = 1;

/* primary_slave has no meaning in round-robin
@@ -2708,7 +2709,7 @@ static void bond_loadbalance_arp_mon(struct bonding *bond)
if (!bond_time_in_interval(bond, trans_start, 2) ||
!bond_time_in_interval(bond, slave->last_rx, 2)) {

- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
slave_state_changed = 1;

if (slave->link_failure_count < UINT_MAX)
@@ -2739,8 +2740,8 @@ static void bond_loadbalance_arp_mon(struct bonding *bond)
goto re_arm;

bond_for_each_slave(bond, slave, iter) {
- if (slave->new_link != BOND_LINK_NOCHANGE)
- slave->link = slave->new_link;
+ if (slave->link_new_state != BOND_LINK_NOCHANGE)
+ slave->link = slave->link_new_state;
}

if (slave_state_changed) {
@@ -2763,9 +2764,9 @@ static void bond_loadbalance_arp_mon(struct bonding *bond)
}

/* Called to inspect slaves for active-backup mode ARP monitor link state
- * changes. Sets new_link in slaves to specify what action should take
- * place for the slave. Returns 0 if no changes are found, >0 if changes
- * to link states must be committed.
+ * changes. Sets proposed link state in slaves to specify what action
+ * should take place for the slave. Returns 0 if no changes are found, >0
+ * if changes to link states must be committed.
*
* Called with rcu_read_lock held.
*/
@@ -2777,12 +2778,12 @@ static int bond_ab_arp_inspect(struct bonding *bond)
int commit = 0;

bond_for_each_slave_rcu(bond, slave, iter) {
- slave->new_link = BOND_LINK_NOCHANGE;
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);
last_rx = slave_last_rx(bond, slave);

if (slave->link != BOND_LINK_UP) {
if (bond_time_in_interval(bond, last_rx, 1)) {
- slave->new_link = BOND_LINK_UP;
+ bond_propose_link_state(slave, BOND_LINK_UP);
commit++;
}
continue;
@@ -2810,7 +2811,7 @@ static int bond_ab_arp_inspect(struct bonding *bond)
if (!bond_is_active_slave(slave) &&
!rcu_access_pointer(bond->current_arp_slave) &&
!bond_time_in_interval(bond, last_rx, 3)) {
- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
commit++;
}

@@ -2823,7 +2824,7 @@ static int bond_ab_arp_inspect(struct bonding *bond)
if (bond_is_active_slave(slave) &&
(!bond_time_in_interval(bond, trans_start, 2) ||
!bond_time_in_interval(bond, last_rx, 2))) {
- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
commit++;
}
}
@@ -2843,7 +2844,7 @@ static void bond_ab_arp_commit(struct bonding *bond)
struct slave *slave;

bond_for_each_slave(bond, slave, iter) {
- switch (slave->new_link) {
+ switch (slave->link_new_state) {
case BOND_LINK_NOCHANGE:
continue;

@@ -2893,8 +2894,9 @@ static void bond_ab_arp_commit(struct bonding *bond)
continue;

default:
- slave_err(bond->dev, slave->dev, "impossible: new_link %d on slave\n",
- slave->new_link);
+ slave_err(bond->dev, slave->dev,
+ "impossible: link_new_state %d on slave\n",
+ slave->link_new_state);
continue;
}

@@ -3457,7 +3459,7 @@ static void bond_get_stats(struct net_device *bond_dev,
struct list_head *iter;
struct slave *slave;

- spin_lock_nested(&bond->stats_lock, bond_get_nest_level(bond_dev));
+ spin_lock(&bond->stats_lock);
memcpy(stats, &bond->bond_stats, sizeof(*stats));

rcu_read_lock();
@@ -4296,7 +4298,6 @@ void bond_setup(struct net_device *bond_dev)
struct bonding *bond = netdev_priv(bond_dev);

spin_lock_init(&bond->mode_lock);
- spin_lock_init(&bond->stats_lock);
bond->params = bonding_defaults;

/* Initialize pointers */
@@ -4365,6 +4366,7 @@ static void bond_uninit(struct net_device *bond_dev)

list_del(&bond->bond_list);

+ lockdep_unregister_key(&bond->stats_lock_key);
bond_debug_unregister(bond);
}

@@ -4771,6 +4773,10 @@ static int bond_init(struct net_device *bond_dev)
bond->nest_level = SINGLE_DEPTH_NESTING;
netdev_lockdep_set_classes(bond_dev);

+ spin_lock_init(&bond->stats_lock);
+ lockdep_register_key(&bond->stats_lock_key);
+ lockdep_set_class(&bond->stats_lock, &bond->stats_lock_key);
+
list_add_tail(&bond->bond_list, &bn->dev_list);

bond_prepare_sysfs_group(bond);
diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index 606b7d8ffe13..9b61bfbea6cd 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -97,6 +97,9 @@
#define BTR_TSEG2_SHIFT 12
#define BTR_TSEG2_MASK (0x7 << BTR_TSEG2_SHIFT)

+/* interrupt register */
+#define INT_STS_PENDING 0x8000
+
/* brp extension register */
#define BRP_EXT_BRPE_MASK 0x0f
#define BRP_EXT_BRPE_SHIFT 0
@@ -1029,10 +1032,16 @@ static int c_can_poll(struct napi_struct *napi, int quota)
u16 curr, last = priv->last_status;
int work_done = 0;

- priv->last_status = curr = priv->read_reg(priv, C_CAN_STS_REG);
- /* Ack status on C_CAN. D_CAN is self clearing */
- if (priv->type != BOSCH_D_CAN)
- priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED);
+ /* Only read the status register if a status interrupt was pending */
+ if (atomic_xchg(&priv->sie_pending, 0)) {
+ priv->last_status = curr = priv->read_reg(priv, C_CAN_STS_REG);
+ /* Ack status on C_CAN. D_CAN is self clearing */
+ if (priv->type != BOSCH_D_CAN)
+ priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED);
+ } else {
+ /* no change detected ... */
+ curr = last;
+ }

/* handle state changes */
if ((curr & STATUS_EWARN) && (!(last & STATUS_EWARN))) {
@@ -1083,10 +1092,16 @@ static irqreturn_t c_can_isr(int irq, void *dev_id)
{
struct net_device *dev = (struct net_device *)dev_id;
struct c_can_priv *priv = netdev_priv(dev);
+ int reg_int;

- if (!priv->read_reg(priv, C_CAN_INT_REG))
+ reg_int = priv->read_reg(priv, C_CAN_INT_REG);
+ if (!reg_int)
return IRQ_NONE;

+ /* save for later use */
+ if (reg_int & INT_STS_PENDING)
+ atomic_set(&priv->sie_pending, 1);
+
/* disable all interrupts and schedule the NAPI */
c_can_irq_control(priv, false);
napi_schedule(&priv->napi);
diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 8acdc7fa4792..d5567a7c1c6d 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -198,6 +198,7 @@ struct c_can_priv {
struct net_device *dev;
struct device *device;
atomic_t tx_active;
+ atomic_t sie_pending;
unsigned long tx_dir;
int last_status;
u16 (*read_reg) (const struct c_can_priv *priv, enum reg index);
diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 483d270664cc..99fa712b48b3 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -842,6 +842,7 @@ void of_can_transceiver(struct net_device *dev)
return;

ret = of_property_read_u32(dn, "max-bitrate", &priv->bitrate_max);
+ of_node_put(dn);
if ((ret && ret != -EINVAL) || (!ret && !priv->bitrate_max))
netdev_warn(dev, "Invalid value for transceiver max bitrate. Ignoring bitrate limit.\n");
}
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index fcec8bcb53d6..56fa98d7aa90 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -1169,6 +1169,7 @@ static int flexcan_chip_start(struct net_device *dev)
reg_mecr = priv->read(&regs->mecr);
reg_mecr &= ~FLEXCAN_MECR_ECRWRDIS;
priv->write(reg_mecr, &regs->mecr);
+ reg_mecr |= FLEXCAN_MECR_ECCDIS;
reg_mecr &= ~(FLEXCAN_MECR_NCEFAFRZ | FLEXCAN_MECR_HANCEI_MSK |
FLEXCAN_MECR_FANCEI_MSK);
priv->write(reg_mecr, &regs->mecr);
diff --git a/drivers/net/can/rx-offload.c b/drivers/net/can/rx-offload.c
index e6a668ee7730..663697439d1c 100644
--- a/drivers/net/can/rx-offload.c
+++ b/drivers/net/can/rx-offload.c
@@ -207,8 +207,10 @@ int can_rx_offload_queue_sorted(struct can_rx_offload *offload,
unsigned long flags;

if (skb_queue_len(&offload->skb_queue) >
- offload->skb_queue_len_max)
- return -ENOMEM;
+ offload->skb_queue_len_max) {
+ kfree_skb(skb);
+ return -ENOBUFS;
+ }

cb = can_rx_offload_get_cb(skb);
cb->timestamp = timestamp;
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index bd6eb9967630..2f74f6704c12 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -623,6 +623,7 @@ static int gs_can_open(struct net_device *netdev)
rc);

usb_unanchor_urb(urb);
+ usb_free_urb(urb);
break;
}

diff --git a/drivers/net/can/usb/mcba_usb.c b/drivers/net/can/usb/mcba_usb.c
index 19a702ac49e4..21faa2ec4632 100644
--- a/drivers/net/can/usb/mcba_usb.c
+++ b/drivers/net/can/usb/mcba_usb.c
@@ -876,9 +876,8 @@ static void mcba_usb_disconnect(struct usb_interface *intf)
netdev_info(priv->netdev, "device disconnected\n");

unregister_candev(priv->netdev);
- free_candev(priv->netdev);
-
mcba_urb_unlink(priv);
+ free_candev(priv->netdev);
}

static struct usb_driver mcba_usb_driver = {
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb.c b/drivers/net/can/usb/peak_usb/pcan_usb.c
index 617da295b6c1..5a66c9f53aae 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb.c
@@ -100,7 +100,7 @@ struct pcan_usb_msg_context {
u8 *end;
u8 rec_cnt;
u8 rec_idx;
- u8 rec_data_idx;
+ u8 rec_ts_idx;
struct net_device *netdev;
struct pcan_usb *pdev;
};
@@ -547,10 +547,15 @@ static int pcan_usb_decode_status(struct pcan_usb_msg_context *mc,
mc->ptr += PCAN_USB_CMD_ARGS;

if (status_len & PCAN_USB_STATUSLEN_TIMESTAMP) {
- int err = pcan_usb_decode_ts(mc, !mc->rec_idx);
+ int err = pcan_usb_decode_ts(mc, !mc->rec_ts_idx);

if (err)
return err;
+
+ /* Next packet in the buffer will have a timestamp on a single
+ * byte
+ */
+ mc->rec_ts_idx++;
}

switch (f) {
@@ -632,10 +637,13 @@ static int pcan_usb_decode_data(struct pcan_usb_msg_context *mc, u8 status_len)

cf->can_dlc = get_can_dlc(rec_len);

- /* first data packet timestamp is a word */
- if (pcan_usb_decode_ts(mc, !mc->rec_data_idx))
+ /* Only first packet timestamp is a word */
+ if (pcan_usb_decode_ts(mc, !mc->rec_ts_idx))
goto decode_failed;

+ /* Next packet in the buffer will have a timestamp on a single byte */
+ mc->rec_ts_idx++;
+
/* read data */
memset(cf->data, 0x0, sizeof(cf->data));
if (status_len & PCAN_USB_STATUSLEN_RTR) {
@@ -688,7 +696,6 @@ static int pcan_usb_decode_msg(struct peak_usb_device *dev, u8 *ibuf, u32 lbuf)
/* handle normal can frames here */
} else {
err = pcan_usb_decode_data(&mc, sl);
- mc.rec_data_idx++;
}
}

diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.c b/drivers/net/can/usb/peak_usb/pcan_usb_core.c
index 65dce642b86b..0b7766b715fd 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c
@@ -750,7 +750,7 @@ static int peak_usb_create_dev(const struct peak_usb_adapter *peak_usb_adapter,
dev = netdev_priv(netdev);

/* allocate a buffer large enough to send commands */
- dev->cmd_buf = kmalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL);
+ dev->cmd_buf = kzalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL);
if (!dev->cmd_buf) {
err = -ENOMEM;
goto lbl_free_candev;
diff --git a/drivers/net/can/usb/usb_8dev.c b/drivers/net/can/usb/usb_8dev.c
index d596a2ad7f78..8fa224b28218 100644
--- a/drivers/net/can/usb/usb_8dev.c
+++ b/drivers/net/can/usb/usb_8dev.c
@@ -996,9 +996,8 @@ static void usb_8dev_disconnect(struct usb_interface *intf)
netdev_info(priv->netdev, "device disconnected\n");

unregister_netdev(priv->netdev);
- free_candev(priv->netdev);
-
unlink_all_urbs(priv);
+ free_candev(priv->netdev);
}

}
diff --git a/drivers/net/ethernet/arc/emac_rockchip.c b/drivers/net/ethernet/arc/emac_rockchip.c
index 42d2e1b02c44..664d664e0925 100644
--- a/drivers/net/ethernet/arc/emac_rockchip.c
+++ b/drivers/net/ethernet/arc/emac_rockchip.c
@@ -256,6 +256,9 @@ static int emac_rockchip_remove(struct platform_device *pdev)
if (priv->regulator)
regulator_disable(priv->regulator);

+ if (priv->soc_data->need_div_macclk)
+ clk_disable_unprepare(priv->macclk);
+
free_netdev(ndev);
return err;
}
diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 0e5de88fd6e8..cdd7e5da4a74 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1499,7 +1499,7 @@ static int octeon_mgmt_probe(struct platform_device *pdev)
netdev->ethtool_ops = &octeon_mgmt_ethtool_ops;

netdev->min_mtu = 64 - OCTEON_MGMT_RX_HEADROOM;
- netdev->max_mtu = 16383 - OCTEON_MGMT_RX_HEADROOM;
+ netdev->max_mtu = 16383 - OCTEON_MGMT_RX_HEADROOM - VLAN_HLEN;

mac = of_get_mac_address(pdev->dev.of_node);

diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
index 59564ac99d2a..edec61dfc868 100644
--- a/drivers/net/ethernet/google/gve/gve_rx.c
+++ b/drivers/net/ethernet/google/gve/gve_rx.c
@@ -289,6 +289,8 @@ static bool gve_rx(struct gve_rx_ring *rx, struct gve_rx_desc *rx_desc,

len = be16_to_cpu(rx_desc->len) - GVE_RX_PAD;
page_info = &rx->data.page_info[idx];
+ dma_sync_single_for_cpu(&priv->pdev->dev, rx->data.qpl->page_buses[idx],
+ PAGE_SIZE, DMA_FROM_DEVICE);

/* gvnic can only receive into registered segments. If the buffer
* can't be recycled, our only choice is to copy the data out of
diff --git a/drivers/net/ethernet/google/gve/gve_tx.c b/drivers/net/ethernet/google/gve/gve_tx.c
index 778b87b5a06c..0a9a7ee2a866 100644
--- a/drivers/net/ethernet/google/gve/gve_tx.c
+++ b/drivers/net/ethernet/google/gve/gve_tx.c
@@ -390,7 +390,21 @@ static void gve_tx_fill_seg_desc(union gve_tx_desc *seg_desc,
seg_desc->seg.seg_addr = cpu_to_be64(addr);
}

-static int gve_tx_add_skb(struct gve_tx_ring *tx, struct sk_buff *skb)
+static void gve_dma_sync_for_device(struct device *dev, dma_addr_t *page_buses,
+ u64 iov_offset, u64 iov_len)
+{
+ dma_addr_t dma;
+ u64 addr;
+
+ for (addr = iov_offset; addr < iov_offset + iov_len;
+ addr += PAGE_SIZE) {
+ dma = page_buses[addr / PAGE_SIZE];
+ dma_sync_single_for_device(dev, dma, PAGE_SIZE, DMA_TO_DEVICE);
+ }
+}
+
+static int gve_tx_add_skb(struct gve_tx_ring *tx, struct sk_buff *skb,
+ struct device *dev)
{
int pad_bytes, hlen, hdr_nfrags, payload_nfrags, l4_hdr_offset;
union gve_tx_desc *pkt_desc, *seg_desc;
@@ -432,6 +446,9 @@ static int gve_tx_add_skb(struct gve_tx_ring *tx, struct sk_buff *skb)
skb_copy_bits(skb, 0,
tx->tx_fifo.base + info->iov[hdr_nfrags - 1].iov_offset,
hlen);
+ gve_dma_sync_for_device(dev, tx->tx_fifo.qpl->page_buses,
+ info->iov[hdr_nfrags - 1].iov_offset,
+ info->iov[hdr_nfrags - 1].iov_len);
copy_offset = hlen;

for (i = payload_iov; i < payload_nfrags + payload_iov; i++) {
@@ -445,6 +462,9 @@ static int gve_tx_add_skb(struct gve_tx_ring *tx, struct sk_buff *skb)
skb_copy_bits(skb, copy_offset,
tx->tx_fifo.base + info->iov[i].iov_offset,
info->iov[i].iov_len);
+ gve_dma_sync_for_device(dev, tx->tx_fifo.qpl->page_buses,
+ info->iov[i].iov_offset,
+ info->iov[i].iov_len);
copy_offset += info->iov[i].iov_len;
}

@@ -473,7 +493,7 @@ netdev_tx_t gve_tx(struct sk_buff *skb, struct net_device *dev)
gve_tx_put_doorbell(priv, tx->q_resources, tx->req);
return NETDEV_TX_BUSY;
}
- nsegs = gve_tx_add_skb(tx, skb);
+ nsegs = gve_tx_add_skb(tx, skb, &priv->pdev->dev);

netdev_tx_sent_queue(tx->netdev_txq, skb->len);
skb_tx_timestamp(skb);
diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
index f51bc0255556..4606a7e4a6d1 100644
--- a/drivers/net/ethernet/hisilicon/hip04_eth.c
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -1041,7 +1041,6 @@ static int hip04_remove(struct platform_device *pdev)

hip04_free_ring(ndev, d);
unregister_netdev(ndev);
- free_irq(ndev->irq, ndev);
of_node_put(priv->phy_node);
cancel_work_sync(&priv->tx_timeout_task);
free_netdev(ndev);
diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c b/drivers/net/ethernet/hisilicon/hns/hnae.c
index 6d0457eb4faa..08339278c722 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.c
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
@@ -199,7 +199,6 @@ hnae_init_ring(struct hnae_queue *q, struct hnae_ring *ring, int flags)

ring->q = q;
ring->flags = flags;
- spin_lock_init(&ring->lock);
ring->coal_param = q->handle->coal_param;
assert(!ring->desc && !ring->desc_cb && !ring->desc_dma_addr);

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h b/drivers/net/ethernet/hisilicon/hns/hnae.h
index e9c67c06bfd2..6ab9458302e1 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.h
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
@@ -274,9 +274,6 @@ struct hnae_ring {
/* statistic */
struct ring_stats stats;

- /* ring lock for poll one */
- spinlock_t lock;
-
dma_addr_t desc_dma_addr;
u32 buf_size; /* size for hnae_desc->addr, preset by AE */
u16 desc_num; /* total number of desc */
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 2235dd55fab2..56e8d4dee0e0 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -943,15 +943,6 @@ static int is_valid_clean_head(struct hnae_ring *ring, int h)
return u > c ? (h > c && h <= u) : (h > c || h <= u);
}

-/* netif_tx_lock will turn down the performance, set only when necessary */
-#ifdef CONFIG_NET_POLL_CONTROLLER
-#define NETIF_TX_LOCK(ring) spin_lock(&(ring)->lock)
-#define NETIF_TX_UNLOCK(ring) spin_unlock(&(ring)->lock)
-#else
-#define NETIF_TX_LOCK(ring)
-#define NETIF_TX_UNLOCK(ring)
-#endif
-
/* reclaim all desc in one budget
* return error or number of desc left
*/
@@ -965,21 +956,16 @@ static int hns_nic_tx_poll_one(struct hns_nic_ring_data *ring_data,
int head;
int bytes, pkts;

- NETIF_TX_LOCK(ring);
-
head = readl_relaxed(ring->io_base + RCB_REG_HEAD);
rmb(); /* make sure head is ready before touch any data */

- if (is_ring_empty(ring) || head == ring->next_to_clean) {
- NETIF_TX_UNLOCK(ring);
+ if (is_ring_empty(ring) || head == ring->next_to_clean)
return 0; /* no data to poll */
- }

if (!is_valid_clean_head(ring, head)) {
netdev_err(ndev, "wrong head (%d, %d-%d)\n", head,
ring->next_to_use, ring->next_to_clean);
ring->stats.io_err_cnt++;
- NETIF_TX_UNLOCK(ring);
return -EIO;
}

@@ -994,8 +980,6 @@ static int hns_nic_tx_poll_one(struct hns_nic_ring_data *ring_data,
ring->stats.tx_pkts += pkts;
ring->stats.tx_bytes += bytes;

- NETIF_TX_UNLOCK(ring);
-
dev_queue = netdev_get_tx_queue(ndev, ring_data->queue_index);
netdev_tx_completed_queue(dev_queue, pkts, bytes);

@@ -1055,16 +1039,12 @@ static void hns_nic_tx_clr_all_bufs(struct hns_nic_ring_data *ring_data)
int head;
int bytes, pkts;

- NETIF_TX_LOCK(ring);
-
head = ring->next_to_use; /* ntu :soft setted ring position*/
bytes = 0;
pkts = 0;
while (head != ring->next_to_clean)
hns_nic_reclaim_one_desc(ring, &bytes, &pkts);

- NETIF_TX_UNLOCK(ring);
-
dev_queue = netdev_get_tx_queue(ndev, ring_data->queue_index);
netdev_tx_reset_queue(dev_queue);
}
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 964e7d62f4b1..5ef7704cd98d 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1723,6 +1723,86 @@ static int ibmvnic_set_mac(struct net_device *netdev, void *p)
return rc;
}

+/**
+ * do_change_param_reset returns zero if we are able to keep processing reset
+ * events, or non-zero if we hit a fatal error and must halt.
+ */
+static int do_change_param_reset(struct ibmvnic_adapter *adapter,
+ struct ibmvnic_rwi *rwi,
+ u32 reset_state)
+{
+ struct net_device *netdev = adapter->netdev;
+ int i, rc;
+
+ netdev_dbg(adapter->netdev, "Change param resetting driver (%d)\n",
+ rwi->reset_reason);
+
+ netif_carrier_off(netdev);
+ adapter->reset_reason = rwi->reset_reason;
+
+ ibmvnic_cleanup(netdev);
+
+ if (reset_state == VNIC_OPEN) {
+ rc = __ibmvnic_close(netdev);
+ if (rc)
+ return rc;
+ }
+
+ release_resources(adapter);
+ release_sub_crqs(adapter, 1);
+ release_crq_queue(adapter);
+
+ adapter->state = VNIC_PROBED;
+
+ rc = init_crq_queue(adapter);
+
+ if (rc) {
+ netdev_err(adapter->netdev,
+ "Couldn't initialize crq. rc=%d\n", rc);
+ return rc;
+ }
+
+ rc = ibmvnic_reset_init(adapter);
+ if (rc)
+ return IBMVNIC_INIT_FAILED;
+
+ /* If the adapter was in PROBE state prior to the reset,
+ * exit here.
+ */
+ if (reset_state == VNIC_PROBED)
+ return 0;
+
+ rc = ibmvnic_login(netdev);
+ if (rc) {
+ adapter->state = reset_state;
+ return rc;
+ }
+
+ rc = init_resources(adapter);
+ if (rc)
+ return rc;
+
+ ibmvnic_disable_irqs(adapter);
+
+ adapter->state = VNIC_CLOSED;
+
+ if (reset_state == VNIC_CLOSED)
+ return 0;
+
+ rc = __ibmvnic_open(netdev);
+ if (rc)
+ return IBMVNIC_OPEN_FAILED;
+
+ /* refresh device's multicast list */
+ ibmvnic_set_multi(netdev);
+
+ /* kick napi */
+ for (i = 0; i < adapter->req_rx_queues; i++)
+ napi_schedule(&adapter->napi[i]);
+
+ return 0;
+}
+
/**
* do_reset returns zero if we are able to keep processing reset events, or
* non-zero if we hit a fatal error and must halt.
@@ -1738,6 +1818,8 @@ static int do_reset(struct ibmvnic_adapter *adapter,
netdev_dbg(adapter->netdev, "Re-setting driver (%d)\n",
rwi->reset_reason);

+ rtnl_lock();
+
netif_carrier_off(netdev);
adapter->reset_reason = rwi->reset_reason;

@@ -1751,16 +1833,25 @@ static int do_reset(struct ibmvnic_adapter *adapter,
if (reset_state == VNIC_OPEN &&
adapter->reset_reason != VNIC_RESET_MOBILITY &&
adapter->reset_reason != VNIC_RESET_FAILOVER) {
- rc = __ibmvnic_close(netdev);
+ adapter->state = VNIC_CLOSING;
+
+ /* Release the RTNL lock before link state change and
+ * re-acquire after the link state change to allow
+ * linkwatch_event to grab the RTNL lock and run during
+ * a reset.
+ */
+ rtnl_unlock();
+ rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
+ rtnl_lock();
if (rc)
- return rc;
- }
+ goto out;

- if (adapter->reset_reason == VNIC_RESET_CHANGE_PARAM ||
- adapter->wait_for_reset) {
- release_resources(adapter);
- release_sub_crqs(adapter, 1);
- release_crq_queue(adapter);
+ if (adapter->state != VNIC_CLOSING) {
+ rc = -1;
+ goto out;
+ }
+
+ adapter->state = VNIC_CLOSED;
}

if (adapter->reset_reason != VNIC_RESET_NON_FATAL) {
@@ -1769,9 +1860,7 @@ static int do_reset(struct ibmvnic_adapter *adapter,
*/
adapter->state = VNIC_PROBED;

- if (adapter->wait_for_reset) {
- rc = init_crq_queue(adapter);
- } else if (adapter->reset_reason == VNIC_RESET_MOBILITY) {
+ if (adapter->reset_reason == VNIC_RESET_MOBILITY) {
rc = ibmvnic_reenable_crq_queue(adapter);
release_sub_crqs(adapter, 1);
} else {
@@ -1783,36 +1872,35 @@ static int do_reset(struct ibmvnic_adapter *adapter,
if (rc) {
netdev_err(adapter->netdev,
"Couldn't initialize crq. rc=%d\n", rc);
- return rc;
+ goto out;
}

rc = ibmvnic_reset_init(adapter);
- if (rc)
- return IBMVNIC_INIT_FAILED;
+ if (rc) {
+ rc = IBMVNIC_INIT_FAILED;
+ goto out;
+ }

/* If the adapter was in PROBE state prior to the reset,
* exit here.
*/
- if (reset_state == VNIC_PROBED)
- return 0;
+ if (reset_state == VNIC_PROBED) {
+ rc = 0;
+ goto out;
+ }

rc = ibmvnic_login(netdev);
if (rc) {
adapter->state = reset_state;
- return rc;
+ goto out;
}

- if (adapter->reset_reason == VNIC_RESET_CHANGE_PARAM ||
- adapter->wait_for_reset) {
- rc = init_resources(adapter);
- if (rc)
- return rc;
- } else if (adapter->req_rx_queues != old_num_rx_queues ||
- adapter->req_tx_queues != old_num_tx_queues ||
- adapter->req_rx_add_entries_per_subcrq !=
- old_num_rx_slots ||
- adapter->req_tx_entries_per_subcrq !=
- old_num_tx_slots) {
+ if (adapter->req_rx_queues != old_num_rx_queues ||
+ adapter->req_tx_queues != old_num_tx_queues ||
+ adapter->req_rx_add_entries_per_subcrq !=
+ old_num_rx_slots ||
+ adapter->req_tx_entries_per_subcrq !=
+ old_num_tx_slots) {
release_rx_pools(adapter);
release_tx_pools(adapter);
release_napi(adapter);
@@ -1820,32 +1908,30 @@ static int do_reset(struct ibmvnic_adapter *adapter,

rc = init_resources(adapter);
if (rc)
- return rc;
+ goto out;

} else {
rc = reset_tx_pools(adapter);
if (rc)
- return rc;
+ goto out;

rc = reset_rx_pools(adapter);
if (rc)
- return rc;
+ goto out;
}
ibmvnic_disable_irqs(adapter);
}
adapter->state = VNIC_CLOSED;

- if (reset_state == VNIC_CLOSED)
- return 0;
+ if (reset_state == VNIC_CLOSED) {
+ rc = 0;
+ goto out;
+ }

rc = __ibmvnic_open(netdev);
if (rc) {
- if (list_empty(&adapter->rwi_list))
- adapter->state = VNIC_CLOSED;
- else
- adapter->state = reset_state;
-
- return 0;
+ rc = IBMVNIC_OPEN_FAILED;
+ goto out;
}

/* refresh device's multicast list */
@@ -1855,11 +1941,15 @@ static int do_reset(struct ibmvnic_adapter *adapter,
for (i = 0; i < adapter->req_rx_queues; i++)
napi_schedule(&adapter->napi[i]);

- if (adapter->reset_reason != VNIC_RESET_FAILOVER &&
- adapter->reset_reason != VNIC_RESET_CHANGE_PARAM)
+ if (adapter->reset_reason != VNIC_RESET_FAILOVER)
call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, netdev);

- return 0;
+ rc = 0;
+
+out:
+ rtnl_unlock();
+
+ return rc;
}

static int do_hard_reset(struct ibmvnic_adapter *adapter,
@@ -1919,14 +2009,8 @@ static int do_hard_reset(struct ibmvnic_adapter *adapter,
return 0;

rc = __ibmvnic_open(netdev);
- if (rc) {
- if (list_empty(&adapter->rwi_list))
- adapter->state = VNIC_CLOSED;
- else
- adapter->state = reset_state;
-
- return 0;
- }
+ if (rc)
+ return IBMVNIC_OPEN_FAILED;

return 0;
}
@@ -1965,20 +2049,11 @@ static void __ibmvnic_reset(struct work_struct *work)
{
struct ibmvnic_rwi *rwi;
struct ibmvnic_adapter *adapter;
- bool we_lock_rtnl = false;
u32 reset_state;
int rc = 0;

adapter = container_of(work, struct ibmvnic_adapter, ibmvnic_reset);

- /* netif_set_real_num_xx_queues needs to take rtnl lock here
- * unless wait_for_reset is set, in which case the rtnl lock
- * has already been taken before initializing the reset
- */
- if (!adapter->wait_for_reset) {
- rtnl_lock();
- we_lock_rtnl = true;
- }
reset_state = adapter->state;

rwi = get_next_rwi(adapter);
@@ -1990,14 +2065,32 @@ static void __ibmvnic_reset(struct work_struct *work)
break;
}

- if (adapter->force_reset_recovery) {
- adapter->force_reset_recovery = false;
- rc = do_hard_reset(adapter, rwi, reset_state);
+ if (rwi->reset_reason == VNIC_RESET_CHANGE_PARAM) {
+ /* CHANGE_PARAM requestor holds rtnl_lock */
+ rc = do_change_param_reset(adapter, rwi, reset_state);
+ } else if (adapter->force_reset_recovery) {
+ /* Transport event occurred during previous reset */
+ if (adapter->wait_for_reset) {
+ /* Previous was CHANGE_PARAM; caller locked */
+ adapter->force_reset_recovery = false;
+ rc = do_hard_reset(adapter, rwi, reset_state);
+ } else {
+ rtnl_lock();
+ adapter->force_reset_recovery = false;
+ rc = do_hard_reset(adapter, rwi, reset_state);
+ rtnl_unlock();
+ }
} else {
rc = do_reset(adapter, rwi, reset_state);
}
kfree(rwi);
- if (rc && rc != IBMVNIC_INIT_FAILED &&
+ if (rc == IBMVNIC_OPEN_FAILED) {
+ if (list_empty(&adapter->rwi_list))
+ adapter->state = VNIC_CLOSED;
+ else
+ adapter->state = reset_state;
+ rc = 0;
+ } else if (rc && rc != IBMVNIC_INIT_FAILED &&
!adapter->force_reset_recovery)
break;

@@ -2005,7 +2098,6 @@ static void __ibmvnic_reset(struct work_struct *work)
}

if (adapter->wait_for_reset) {
- adapter->wait_for_reset = false;
adapter->reset_done_rc = rc;
complete(&adapter->reset_done);
}
@@ -2016,8 +2108,6 @@ static void __ibmvnic_reset(struct work_struct *work)
}

adapter->resetting = false;
- if (we_lock_rtnl)
- rtnl_unlock();
}

static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
@@ -2078,8 +2168,6 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,

return 0;
err:
- if (adapter->wait_for_reset)
- adapter->wait_for_reset = false;
return -ret;
}

diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 70bd286f8932..9d3d35cc91d6 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -20,6 +20,7 @@
#define IBMVNIC_INVALID_MAP -1
#define IBMVNIC_STATS_TIMEOUT 1
#define IBMVNIC_INIT_FAILED 2
+#define IBMVNIC_OPEN_FAILED 3

/* basic structures plus 100 2k buffers */
#define IBMVNIC_IO_ENTITLEMENT_DEFAULT 610305
diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
index a41008523c98..2e07ffa87e34 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
@@ -607,6 +607,7 @@ static int e1000_set_ringparam(struct net_device *netdev,
for (i = 0; i < adapter->num_rx_queues; i++)
rxdr[i].count = rxdr->count;

+ err = 0;
if (netif_running(adapter->netdev)) {
/* Try to get new resources before deleting old */
err = e1000_setup_all_rx_resources(adapter);
@@ -627,14 +628,13 @@ static int e1000_set_ringparam(struct net_device *netdev,
adapter->rx_ring = rxdr;
adapter->tx_ring = txdr;
err = e1000_up(adapter);
- if (err)
- goto err_setup;
}
kfree(tx_old);
kfree(rx_old);

clear_bit(__E1000_RESETTING, &adapter->flags);
- return 0;
+ return err;
+
err_setup_tx:
e1000_free_all_rx_resources(adapter);
err_setup_rx:
@@ -646,7 +646,6 @@ static int e1000_set_ringparam(struct net_device *netdev,
err_alloc_tx:
if (netif_running(adapter->netdev))
e1000_up(adapter);
-err_setup:
clear_bit(__E1000_RESETTING, &adapter->flags);
return err;
}
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index b4df3e319467..93a1352f5be9 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2064,7 +2064,8 @@ static void igb_check_swap_media(struct igb_adapter *adapter)
if ((hw->phy.media_type == e1000_media_type_copper) &&
(!(connsw & E1000_CONNSW_AUTOSENSE_EN))) {
swap_now = true;
- } else if (!(connsw & E1000_CONNSW_SERDESD)) {
+ } else if ((hw->phy.media_type != e1000_media_type_copper) &&
+ !(connsw & E1000_CONNSW_SERDESD)) {
/* copper signal takes time to appear */
if (adapter->copper_tries < 4) {
adapter->copper_tries++;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h
index b7298f9ee3d3..c4c128908b6e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h
@@ -86,7 +86,7 @@ struct sk_buff *mlx5e_ktls_handle_tx_skb(struct net_device *netdev,
struct mlx5e_tx_wqe **wqe, u16 *pi);
void mlx5e_ktls_tx_handle_resync_dump_comp(struct mlx5e_txqsq *sq,
struct mlx5e_tx_wqe_info *wi,
- struct mlx5e_sq_dma *dma);
+ u32 *dma_fifo_cc);

#else

@@ -94,6 +94,11 @@ static inline void mlx5e_ktls_build_netdev(struct mlx5e_priv *priv)
{
}

+static inline void
+mlx5e_ktls_tx_handle_resync_dump_comp(struct mlx5e_txqsq *sq,
+ struct mlx5e_tx_wqe_info *wi,
+ u32 *dma_fifo_cc) {}
+
#endif

#endif /* __MLX5E_TLS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index 7833ddef0427..002245bb6b28 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -304,9 +304,16 @@ tx_post_resync_dump(struct mlx5e_txqsq *sq, struct sk_buff *skb,

void mlx5e_ktls_tx_handle_resync_dump_comp(struct mlx5e_txqsq *sq,
struct mlx5e_tx_wqe_info *wi,
- struct mlx5e_sq_dma *dma)
+ u32 *dma_fifo_cc)
{
- struct mlx5e_sq_stats *stats = sq->stats;
+ struct mlx5e_sq_stats *stats;
+ struct mlx5e_sq_dma *dma;
+
+ if (!wi->resync_dump_frag)
+ return;
+
+ dma = mlx5e_dma_get(sq, (*dma_fifo_cc)++);
+ stats = sq->stats;

mlx5e_tx_dma_unmap(sq->pdev, dma);
__skb_frag_unref(wi->resync_dump_frag);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 9d5f6e56188f..f3a2970c3fcf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1347,9 +1347,13 @@ static void mlx5e_deactivate_txqsq(struct mlx5e_txqsq *sq)
/* last doorbell out, godspeed .. */
if (mlx5e_wqc_has_room_for(wq, sq->cc, sq->pc, 1)) {
u16 pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
+ struct mlx5e_tx_wqe_info *wi;
struct mlx5e_tx_wqe *nop;

- sq->db.wqe_info[pi].skb = NULL;
+ wi = &sq->db.wqe_info[pi];
+
+ memset(wi, 0, sizeof(*wi));
+ wi->num_wqebbs = 1;
nop = mlx5e_post_nop(wq, sq->sqn, &sq->pc);
mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &nop->ctrl);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 600e92cb629a..d5d2b1af3dbc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -404,7 +404,10 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
static void mlx5e_dump_error_cqe(struct mlx5e_txqsq *sq,
struct mlx5_err_cqe *err_cqe)
{
- u32 ci = mlx5_cqwq_get_ci(&sq->cq.wq);
+ struct mlx5_cqwq *wq = &sq->cq.wq;
+ u32 ci;
+
+ ci = mlx5_cqwq_ctr2ix(wq, wq->cc - 1);

netdev_err(sq->channel->netdev,
"Error cqe on cqn 0x%x, ci 0x%x, sqn 0x%x, opcode 0x%x, syndrome 0x%x, vendor syndrome 0x%x\n",
@@ -480,14 +483,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
skb = wi->skb;

if (unlikely(!skb)) {
-#ifdef CONFIG_MLX5_EN_TLS
- if (wi->resync_dump_frag) {
- struct mlx5e_sq_dma *dma =
- mlx5e_dma_get(sq, dma_fifo_cc++);
-
- mlx5e_ktls_tx_handle_resync_dump_comp(sq, wi, dma);
- }
-#endif
+ mlx5e_ktls_tx_handle_resync_dump_comp(sq, wi, &dma_fifo_cc);
sqcc += wi->num_wqebbs;
continue;
}
@@ -543,29 +539,38 @@ void mlx5e_free_txqsq_descs(struct mlx5e_txqsq *sq)
{
struct mlx5e_tx_wqe_info *wi;
struct sk_buff *skb;
+ u32 dma_fifo_cc;
+ u16 sqcc;
u16 ci;
int i;

- while (sq->cc != sq->pc) {
- ci = mlx5_wq_cyc_ctr2ix(&sq->wq, sq->cc);
+ sqcc = sq->cc;
+ dma_fifo_cc = sq->dma_fifo_cc;
+
+ while (sqcc != sq->pc) {
+ ci = mlx5_wq_cyc_ctr2ix(&sq->wq, sqcc);
wi = &sq->db.wqe_info[ci];
skb = wi->skb;

- if (!skb) { /* nop */
- sq->cc++;
+ if (!skb) {
+ mlx5e_ktls_tx_handle_resync_dump_comp(sq, wi, &dma_fifo_cc);
+ sqcc += wi->num_wqebbs;
continue;
}

for (i = 0; i < wi->num_dma; i++) {
struct mlx5e_sq_dma *dma =
- mlx5e_dma_get(sq, sq->dma_fifo_cc++);
+ mlx5e_dma_get(sq, dma_fifo_cc++);

mlx5e_tx_dma_unmap(sq->pdev, dma);
}

dev_kfree_skb_any(skb);
- sq->cc += wi->num_wqebbs;
+ sqcc += wi->num_wqebbs;
}
+
+ sq->dma_fifo_cc = dma_fifo_cc;
+ sq->cc = sqcc;
}

#ifdef CONFIG_MLX5_CORE_IPOIB
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
index 4c50efe4e7f1..61021133029e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
@@ -464,8 +464,10 @@ static int mlx5_fpga_conn_create_cq(struct mlx5_fpga_conn *conn, int cq_size)
}

err = mlx5_vector2eqn(mdev, smp_processor_id(), &eqn, &irqn);
- if (err)
+ if (err) {
+ kvfree(in);
goto err_cqwq;
+ }

cqc = MLX5_ADDR_OF(create_cq_in, in, cq_context);
MLX5_SET(cqc, cqc, log_cq_size, ilog2(cq_size));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index d685122d9ff7..c07f3154437c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -572,7 +572,7 @@ mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
return -ENOMEM;
err = mlx5_crdump_collect(dev, cr_data);
if (err)
- return err;
+ goto free_data;

if (priv_ctx) {
struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 6932e615d4b0..7ffe5959a7e7 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -260,8 +260,15 @@ static int ocelot_vlan_vid_add(struct net_device *dev, u16 vid, bool pvid,
port->pvid = vid;

/* Untagged egress vlan clasification */
- if (untagged)
+ if (untagged && port->vid != vid) {
+ if (port->vid) {
+ dev_err(ocelot->dev,
+ "Port already has a native VLAN: %d\n",
+ port->vid);
+ return -EBUSY;
+ }
port->vid = vid;
+ }

ocelot_vlan_port_apply(ocelot, port);

@@ -877,7 +884,7 @@ static int ocelot_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
static int ocelot_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
u16 vid)
{
- return ocelot_vlan_vid_add(dev, vid, false, true);
+ return ocelot_vlan_vid_add(dev, vid, false, false);
}

static int ocelot_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
@@ -1499,9 +1506,6 @@ static int ocelot_netdevice_port_event(struct net_device *dev,
struct ocelot_port *ocelot_port = netdev_priv(dev);
int err = 0;

- if (!ocelot_netdevice_dev_check(dev))
- return 0;
-
switch (event) {
case NETDEV_CHANGEUPPER:
if (netif_is_bridge_master(info->upper_dev)) {
@@ -1538,12 +1542,16 @@ static int ocelot_netdevice_event(struct notifier_block *unused,
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
int ret = 0;

+ if (!ocelot_netdevice_dev_check(dev))
+ return 0;
+
if (event == NETDEV_PRECHANGEUPPER &&
netif_is_lag_master(info->upper_dev)) {
struct netdev_lag_upper_info *lag_upper_info = info->upper_info;
struct netlink_ext_ack *extack;

- if (lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH) {
+ if (lag_upper_info &&
+ lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH) {
extack = netdev_notifier_info_to_extack(&info->info);
NL_SET_ERR_MSG_MOD(extack, "LAG device using unsupported Tx type");

diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 8d1c208f778f..a220cc7c947a 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1208,8 +1208,16 @@ enum qede_remove_mode {
static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode)
{
struct net_device *ndev = pci_get_drvdata(pdev);
- struct qede_dev *edev = netdev_priv(ndev);
- struct qed_dev *cdev = edev->cdev;
+ struct qede_dev *edev;
+ struct qed_dev *cdev;
+
+ if (!ndev) {
+ dev_info(&pdev->dev, "Device has already been removed\n");
+ return;
+ }
+
+ edev = netdev_priv(ndev);
+ cdev = edev->cdev;

DP_INFO(edev, "Starting qede_remove\n");

diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
index 9c54b715228e..06de59521fc4 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
@@ -57,10 +57,10 @@ static int rmnet_unregister_real_device(struct net_device *real_dev,
if (port->nr_rmnet_devs)
return -EINVAL;

- kfree(port);
-
netdev_rx_handler_unregister(real_dev);

+ kfree(port);
+
/* release reference on real_dev */
dev_put(real_dev);

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 00c86c7dd42d..efb5b000489f 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -863,6 +863,9 @@ static void r8168g_mdio_write(struct rtl8169_private *tp, int reg, int value)

static int r8168g_mdio_read(struct rtl8169_private *tp, int reg)
{
+ if (reg == 0x1f)
+ return tp->ocp_base == OCP_STD_PHY_BASE ? 0 : tp->ocp_base >> 4;
+
if (tp->ocp_base != OCP_STD_PHY_BASE)
reg -= 0x10;

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index fe2d3029de5e..ed0e694a0855 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2906,6 +2906,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
} else {
stmmac_set_desc_addr(priv, first, des);
tmp_pay_len = pay_len;
+ des += proto_hdr_len;
}

stmmac_tso_allocator(priv, des, tmp_pay_len, (nfrags == 0), queue);
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index bbbc1dcb6ab5..b517c1af9de0 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -1237,8 +1237,17 @@ static int fjes_probe(struct platform_device *plat_dev)
adapter->open_guard = false;

adapter->txrx_wq = alloc_workqueue(DRV_NAME "/txrx", WQ_MEM_RECLAIM, 0);
+ if (unlikely(!adapter->txrx_wq)) {
+ err = -ENOMEM;
+ goto err_free_netdev;
+ }
+
adapter->control_wq = alloc_workqueue(DRV_NAME "/control",
WQ_MEM_RECLAIM, 0);
+ if (unlikely(!adapter->control_wq)) {
+ err = -ENOMEM;
+ goto err_free_txrx_wq;
+ }

INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
INIT_WORK(&adapter->raise_intr_rxdata_task,
@@ -1255,7 +1264,7 @@ static int fjes_probe(struct platform_device *plat_dev)
hw->hw_res.irq = platform_get_irq(plat_dev, 0);
err = fjes_hw_init(&adapter->hw);
if (err)
- goto err_free_netdev;
+ goto err_free_control_wq;

/* setup MAC address (02:00:00:00:00:[epid])*/
netdev->dev_addr[0] = 2;
@@ -1277,6 +1286,10 @@ static int fjes_probe(struct platform_device *plat_dev)

err_hw_exit:
fjes_hw_exit(&adapter->hw);
+err_free_control_wq:
+ destroy_workqueue(adapter->control_wq);
+err_free_txrx_wq:
+ destroy_workqueue(adapter->txrx_wq);
err_free_netdev:
free_netdev(netdev);
err_out:
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index e8fce6d715ef..8ed79b418d88 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -982,7 +982,7 @@ static int netvsc_attach(struct net_device *ndev,
if (netif_running(ndev)) {
ret = rndis_filter_open(nvdev);
if (ret)
- return ret;
+ goto err;

rdev = nvdev->extension;
if (!rdev->link_state)
@@ -990,6 +990,13 @@ static int netvsc_attach(struct net_device *ndev,
}

return 0;
+
+err:
+ netif_device_detach(ndev);
+
+ rndis_filter_device_remove(hdev, nvdev);
+
+ return ret;
}

static int netvsc_set_channels(struct net_device *net,
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index cb7637364b40..1bd113b142ea 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -3001,12 +3001,10 @@ static const struct nla_policy macsec_rtnl_policy[IFLA_MACSEC_MAX + 1] = {
static void macsec_free_netdev(struct net_device *dev)
{
struct macsec_dev *macsec = macsec_priv(dev);
- struct net_device *real_dev = macsec->real_dev;

free_percpu(macsec->stats);
free_percpu(macsec->secy.tx_sc.stats);

- dev_put(real_dev);
}

static void macsec_setup(struct net_device *dev)
@@ -3261,8 +3259,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev,
if (err < 0)
return err;

- dev_hold(real_dev);
-
macsec->nest_level = dev_get_nest_level(real_dev) + 1;
netdev_lockdep_set_classes(dev);
lockdep_set_class_and_subclass(&dev->addr_list_lock,
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index dc3d92d340c4..b73298250793 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -327,6 +327,7 @@ static struct phy_driver smsc_phy_driver[] = {
.name = "SMSC LAN8740",

/* PHY_BASIC_FEATURES */
+ .flags = PHY_RST_AFTER_CLK_EN,

.probe = smsc_phy_probe,

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 00cab3f43a4c..a245597a3902 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -578,8 +578,8 @@ static void cdc_ncm_set_dgram_size(struct usbnet *dev, int new_size)
/* read current mtu value from device */
err = usbnet_read_cmd(dev, USB_CDC_GET_MAX_DATAGRAM_SIZE,
USB_TYPE_CLASS | USB_DIR_IN | USB_RECIP_INTERFACE,
- 0, iface_no, &max_datagram_size, 2);
- if (err < 0) {
+ 0, iface_no, &max_datagram_size, sizeof(max_datagram_size));
+ if (err < sizeof(max_datagram_size)) {
dev_dbg(&dev->intf->dev, "GET_MAX_DATAGRAM_SIZE failed\n");
goto out;
}
@@ -590,7 +590,7 @@ static void cdc_ncm_set_dgram_size(struct usbnet *dev, int new_size)
max_datagram_size = cpu_to_le16(ctx->max_datagram_size);
err = usbnet_write_cmd(dev, USB_CDC_SET_MAX_DATAGRAM_SIZE,
USB_TYPE_CLASS | USB_DIR_OUT | USB_RECIP_INTERFACE,
- 0, iface_no, &max_datagram_size, 2);
+ 0, iface_no, &max_datagram_size, sizeof(max_datagram_size));
if (err < 0)
dev_dbg(&dev->intf->dev, "SET_MAX_DATAGRAM_SIZE failed\n");

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 3d77cd402ba9..ba682bba7851 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1361,6 +1361,7 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x413c, 0x81b6, 8)}, /* Dell Wireless 5811e */
{QMI_FIXED_INTF(0x413c, 0x81b6, 10)}, /* Dell Wireless 5811e */
{QMI_FIXED_INTF(0x413c, 0x81d7, 0)}, /* Dell Wireless 5821e */
+ {QMI_FIXED_INTF(0x413c, 0x81e0, 0)}, /* Dell Wireless 5821e with eSIM support*/
{QMI_FIXED_INTF(0x03f0, 0x4e1d, 8)}, /* HP lt4111 LTE/EV-DO/HSPA+ Gobi 4G Module */
{QMI_FIXED_INTF(0x03f0, 0x9d1d, 1)}, /* HP lt4120 Snapdragon X5 LTE */
{QMI_FIXED_INTF(0x22de, 0x9061, 3)}, /* WeTelecom WPD-600N */
diff --git a/drivers/net/wimax/i2400m/op-rfkill.c b/drivers/net/wimax/i2400m/op-rfkill.c
index 8efb493ceec2..5c79f052cad2 100644
--- a/drivers/net/wimax/i2400m/op-rfkill.c
+++ b/drivers/net/wimax/i2400m/op-rfkill.c
@@ -127,12 +127,12 @@ int i2400m_op_rfkill_sw_toggle(struct wimax_dev *wimax_dev,
"%d\n", result);
result = 0;
error_cmd:
- kfree(cmd);
kfree_skb(ack_skb);
error_msg_to_dev:
error_alloc:
d_fnend(4, dev, "(wimax_dev %p state %d) = %d\n",
wimax_dev, state, result);
+ kfree(cmd);
return result;
}

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c
index acbadfdbdd3f..2ee5c5dc78cb 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c
@@ -573,20 +573,20 @@ static const struct pci_device_id iwl_hw_card_ids[] = {
{IWL_PCI_DEVICE(0x2526, 0x0034, iwl9560_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x0038, iwl9560_2ac_160_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x003C, iwl9560_2ac_160_cfg)},
- {IWL_PCI_DEVICE(0x2526, 0x0060, iwl9460_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2526, 0x0064, iwl9460_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2526, 0x00A0, iwl9460_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2526, 0x00A4, iwl9460_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x2526, 0x0060, iwl9461_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x2526, 0x0064, iwl9461_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x2526, 0x00A0, iwl9462_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x2526, 0x00A4, iwl9462_2ac_cfg_soc)},
{IWL_PCI_DEVICE(0x2526, 0x0210, iwl9260_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x0214, iwl9260_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x0230, iwl9560_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x0234, iwl9560_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x0238, iwl9560_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x023C, iwl9560_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2526, 0x0260, iwl9460_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x2526, 0x0260, iwl9461_2ac_cfg_soc)},
{IWL_PCI_DEVICE(0x2526, 0x0264, iwl9461_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2526, 0x02A0, iwl9460_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2526, 0x02A4, iwl9460_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x2526, 0x02A0, iwl9462_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x2526, 0x02A4, iwl9462_2ac_cfg_soc)},
{IWL_PCI_DEVICE(0x2526, 0x1010, iwl9260_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x1030, iwl9560_2ac_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x1210, iwl9260_2ac_cfg)},
@@ -603,7 +603,7 @@ static const struct pci_device_id iwl_hw_card_ids[] = {
{IWL_PCI_DEVICE(0x2526, 0x401C, iwl9260_2ac_160_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x4030, iwl9560_2ac_160_cfg)},
{IWL_PCI_DEVICE(0x2526, 0x4034, iwl9560_2ac_160_cfg_soc)},
- {IWL_PCI_DEVICE(0x2526, 0x40A4, iwl9460_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x2526, 0x40A4, iwl9462_2ac_cfg_soc)},
{IWL_PCI_DEVICE(0x2526, 0x4234, iwl9560_2ac_cfg_soc)},
{IWL_PCI_DEVICE(0x2526, 0x42A4, iwl9462_2ac_cfg_soc)},
{IWL_PCI_DEVICE(0x2526, 0x6010, iwl9260_2ac_160_cfg)},
@@ -618,60 +618,61 @@ static const struct pci_device_id iwl_hw_card_ids[] = {
{IWL_PCI_DEVICE(0x271B, 0x0210, iwl9160_2ac_cfg)},
{IWL_PCI_DEVICE(0x271B, 0x0214, iwl9260_2ac_cfg)},
{IWL_PCI_DEVICE(0x271C, 0x0214, iwl9260_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x0034, iwl9560_2ac_160_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x0038, iwl9560_2ac_160_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x003C, iwl9560_2ac_160_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x0060, iwl9461_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x0064, iwl9461_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x00A0, iwl9462_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x00A4, iwl9462_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x0230, iwl9560_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x0234, iwl9560_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x0238, iwl9560_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x023C, iwl9560_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x0260, iwl9461_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x0264, iwl9461_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x02A0, iwl9462_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x02A4, iwl9462_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x1010, iwl9260_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x1030, iwl9560_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x1210, iwl9260_2ac_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x1551, iwl9560_killer_s_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x1552, iwl9560_killer_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x2030, iwl9560_2ac_160_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x2034, iwl9560_2ac_160_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x4030, iwl9560_2ac_160_cfg)},
- {IWL_PCI_DEVICE(0x2720, 0x4034, iwl9560_2ac_160_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x40A4, iwl9462_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x4234, iwl9560_2ac_cfg_soc)},
- {IWL_PCI_DEVICE(0x2720, 0x42A4, iwl9462_2ac_cfg_soc)},
-
- {IWL_PCI_DEVICE(0x30DC, 0x0030, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0034, iwl9560_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0038, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x003C, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0060, iwl9461_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0064, iwl9461_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x00A0, iwl9462_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x00A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0230, iwl9560_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0234, iwl9560_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0238, iwl9560_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x023C, iwl9560_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0260, iwl9461_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x0264, iwl9461_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x02A0, iwl9462_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x02A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x1030, iwl9560_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x1551, killer1550s_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x1552, killer1550i_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x2030, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x2034, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x4030, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x4034, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x40A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x4234, iwl9560_2ac_cfg_qu_b0_jf_b0)},
- {IWL_PCI_DEVICE(0x30DC, 0x42A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
+
+ {IWL_PCI_DEVICE(0x2720, 0x0034, iwl9560_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0038, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x003C, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0060, iwl9461_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0064, iwl9461_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x00A0, iwl9462_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x00A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0230, iwl9560_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0234, iwl9560_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0238, iwl9560_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x023C, iwl9560_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0260, iwl9461_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x0264, iwl9461_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x02A0, iwl9462_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x02A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x1030, iwl9560_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x1551, killer1550s_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x1552, killer1550i_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x2030, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x2034, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x4030, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x4034, iwl9560_2ac_160_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x40A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x4234, iwl9560_2ac_cfg_qu_b0_jf_b0)},
+ {IWL_PCI_DEVICE(0x2720, 0x42A4, iwl9462_2ac_cfg_qu_b0_jf_b0)},
+
+ {IWL_PCI_DEVICE(0x30DC, 0x0030, iwl9560_2ac_160_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0034, iwl9560_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0038, iwl9560_2ac_160_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x003C, iwl9560_2ac_160_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0060, iwl9460_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0064, iwl9461_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x00A0, iwl9462_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x00A4, iwl9462_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0230, iwl9560_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0234, iwl9560_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0238, iwl9560_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x023C, iwl9560_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0260, iwl9461_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x0264, iwl9461_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x02A0, iwl9462_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x02A4, iwl9462_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x1010, iwl9260_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x30DC, 0x1030, iwl9560_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x1210, iwl9260_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x30DC, 0x1551, iwl9560_killer_s_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x1552, iwl9560_killer_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x2030, iwl9560_2ac_160_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x2034, iwl9560_2ac_160_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x4030, iwl9560_2ac_160_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x4034, iwl9560_2ac_160_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x40A4, iwl9462_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x4234, iwl9560_2ac_cfg_soc)},
+ {IWL_PCI_DEVICE(0x30DC, 0x42A4, iwl9462_2ac_cfg_soc)},

{IWL_PCI_DEVICE(0x31DC, 0x0030, iwl9560_2ac_160_cfg_shared_clk)},
{IWL_PCI_DEVICE(0x31DC, 0x0034, iwl9560_2ac_cfg_shared_clk)},
diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c
index d8f61e540bfd..ed744cd19819 100644
--- a/drivers/net/wireless/mediatek/mt76/dma.c
+++ b/drivers/net/wireless/mediatek/mt76/dma.c
@@ -64,8 +64,10 @@ mt76_dma_add_buf(struct mt76_dev *dev, struct mt76_queue *q,
u32 ctrl;
int i, idx = -1;

- if (txwi)
+ if (txwi) {
q->entry[q->head].txwi = DMA_DUMMY_DATA;
+ q->entry[q->head].skip_buf0 = true;
+ }

for (i = 0; i < nbufs; i += 2, buf += 2) {
u32 buf0 = buf[0].addr, buf1 = 0;
@@ -108,7 +110,7 @@ mt76_dma_tx_cleanup_idx(struct mt76_dev *dev, struct mt76_queue *q, int idx,
__le32 __ctrl = READ_ONCE(q->desc[idx].ctrl);
u32 ctrl = le32_to_cpu(__ctrl);

- if (!e->txwi || !e->skb) {
+ if (!e->skip_buf0) {
__le32 addr = READ_ONCE(q->desc[idx].buf0);
u32 len = FIELD_GET(MT_DMA_CTL_SD_LEN0, ctrl);

diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h
index 989386ecb5e4..e98859ab480b 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76.h
+++ b/drivers/net/wireless/mediatek/mt76/mt76.h
@@ -102,8 +102,9 @@ struct mt76_queue_entry {
struct urb *urb;
};
enum mt76_txq_id qid;
- bool schedule;
- bool done;
+ bool skip_buf0:1;
+ bool schedule:1;
+ bool done:1;
};

struct mt76_queue_regs {
diff --git a/drivers/net/wireless/virt_wifi.c b/drivers/net/wireless/virt_wifi.c
index be92e1220284..7997cc6de334 100644
--- a/drivers/net/wireless/virt_wifi.c
+++ b/drivers/net/wireless/virt_wifi.c
@@ -548,6 +548,7 @@ static int virt_wifi_newlink(struct net *src_net, struct net_device *dev,
priv->is_connected = false;
priv->is_up = false;
INIT_DELAYED_WORK(&priv->connect, virt_wifi_connect_complete);
+ __module_get(THIS_MODULE);

return 0;
unregister_netdev:
@@ -578,6 +579,7 @@ static void virt_wifi_dellink(struct net_device *dev,
netdev_upper_dev_unlink(priv->lowerdev, dev);

unregister_netdevice_queue(dev, head);
+ module_put(THIS_MODULE);

/* Deleting the wiphy is handled in the module destructor. */
}
@@ -590,6 +592,42 @@ static struct rtnl_link_ops virt_wifi_link_ops = {
.priv_size = sizeof(struct virt_wifi_netdev_priv),
};

+static bool netif_is_virt_wifi_dev(const struct net_device *dev)
+{
+ return rcu_access_pointer(dev->rx_handler) == virt_wifi_rx_handler;
+}
+
+static int virt_wifi_event(struct notifier_block *this, unsigned long event,
+ void *ptr)
+{
+ struct net_device *lower_dev = netdev_notifier_info_to_dev(ptr);
+ struct virt_wifi_netdev_priv *priv;
+ struct net_device *upper_dev;
+ LIST_HEAD(list_kill);
+
+ if (!netif_is_virt_wifi_dev(lower_dev))
+ return NOTIFY_DONE;
+
+ switch (event) {
+ case NETDEV_UNREGISTER:
+ priv = rtnl_dereference(lower_dev->rx_handler_data);
+ if (!priv)
+ return NOTIFY_DONE;
+
+ upper_dev = priv->upperdev;
+
+ upper_dev->rtnl_link_ops->dellink(upper_dev, &list_kill);
+ unregister_netdevice_many(&list_kill);
+ break;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block virt_wifi_notifier = {
+ .notifier_call = virt_wifi_event,
+};
+
/* Acquires and releases the rtnl lock. */
static int __init virt_wifi_init_module(void)
{
@@ -598,14 +636,25 @@ static int __init virt_wifi_init_module(void)
/* Guaranteed to be locallly-administered and not multicast. */
eth_random_addr(fake_router_bssid);

+ err = register_netdevice_notifier(&virt_wifi_notifier);
+ if (err)
+ return err;
+
+ err = -ENOMEM;
common_wiphy = virt_wifi_make_wiphy();
if (!common_wiphy)
- return -ENOMEM;
+ goto notifier;

err = rtnl_link_register(&virt_wifi_link_ops);
if (err)
- virt_wifi_destroy_wiphy(common_wiphy);
+ goto destroy_wiphy;

+ return 0;
+
+destroy_wiphy:
+ virt_wifi_destroy_wiphy(common_wiphy);
+notifier:
+ unregister_netdevice_notifier(&virt_wifi_notifier);
return err;
}

@@ -615,6 +664,7 @@ static void __exit virt_wifi_cleanup_module(void)
/* Will delete any devices that depend on the wiphy. */
rtnl_link_unregister(&virt_wifi_link_ops);
virt_wifi_destroy_wiphy(common_wiphy);
+ unregister_netdevice_notifier(&virt_wifi_notifier);
}

module_init(virt_wifi_init_module);
diff --git a/drivers/nfc/fdp/i2c.c b/drivers/nfc/fdp/i2c.c
index 1cd113c8d7cb..ad0abb1f0bae 100644
--- a/drivers/nfc/fdp/i2c.c
+++ b/drivers/nfc/fdp/i2c.c
@@ -259,7 +259,7 @@ static void fdp_nci_i2c_read_device_properties(struct device *dev,
*fw_vsc_cfg, len);

if (r) {
- devm_kfree(dev, fw_vsc_cfg);
+ devm_kfree(dev, *fw_vsc_cfg);
goto vsc_read_err;
}
} else {
diff --git a/drivers/nfc/st21nfca/core.c b/drivers/nfc/st21nfca/core.c
index f9ac176cf257..2ce17932a073 100644
--- a/drivers/nfc/st21nfca/core.c
+++ b/drivers/nfc/st21nfca/core.c
@@ -708,6 +708,7 @@ static int st21nfca_hci_complete_target_discovered(struct nfc_hci_dev *hdev,
NFC_PROTO_FELICA_MASK;
} else {
kfree_skb(nfcid_skb);
+ nfcid_skb = NULL;
/* P2P in type A */
r = nfc_hci_get_param(hdev, ST21NFCA_RF_READER_F_GATE,
ST21NFCA_RF_READER_F_NFCID1,
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 30de7efef003..d320684d25b2 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -715,7 +715,7 @@ int nvme_mpath_init(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
goto out;
}

- error = nvme_read_ana_log(ctrl, true);
+ error = nvme_read_ana_log(ctrl, false);
if (error)
goto out_free_ana_log_buf;
return 0;
diff --git a/drivers/pinctrl/intel/pinctrl-cherryview.c b/drivers/pinctrl/intel/pinctrl-cherryview.c
index bf049d1bbb87..17a248b723b9 100644
--- a/drivers/pinctrl/intel/pinctrl-cherryview.c
+++ b/drivers/pinctrl/intel/pinctrl-cherryview.c
@@ -1584,7 +1584,7 @@ static int chv_gpio_probe(struct chv_pinctrl *pctrl, int irq)
intsel >>= CHV_PADCTRL0_INTSEL_SHIFT;

if (need_valid_mask && intsel >= community->nirqs)
- clear_bit(i, chip->irq.valid_mask);
+ clear_bit(desc->number, chip->irq.valid_mask);
}

/*
diff --git a/drivers/pinctrl/intel/pinctrl-intel.c b/drivers/pinctrl/intel/pinctrl-intel.c
index 4323796cbe11..8fb6c9668c37 100644
--- a/drivers/pinctrl/intel/pinctrl-intel.c
+++ b/drivers/pinctrl/intel/pinctrl-intel.c
@@ -52,6 +52,7 @@
#define PADCFG0_GPIROUTNMI BIT(17)
#define PADCFG0_PMODE_SHIFT 10
#define PADCFG0_PMODE_MASK GENMASK(13, 10)
+#define PADCFG0_PMODE_GPIO 0
#define PADCFG0_GPIORXDIS BIT(9)
#define PADCFG0_GPIOTXDIS BIT(8)
#define PADCFG0_GPIORXSTATE BIT(1)
@@ -307,7 +308,7 @@ static void intel_pin_dbg_show(struct pinctrl_dev *pctldev, struct seq_file *s,
cfg1 = readl(intel_get_padcfg(pctrl, pin, PADCFG1));

mode = (cfg0 & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT;
- if (!mode)
+ if (mode == PADCFG0_PMODE_GPIO)
seq_puts(s, "GPIO ");
else
seq_printf(s, "mode %d ", mode);
@@ -428,6 +429,11 @@ static void __intel_gpio_set_direction(void __iomem *padcfg0, bool input)
writel(value, padcfg0);
}

+static int intel_gpio_get_gpio_mode(void __iomem *padcfg0)
+{
+ return (readl(padcfg0) & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT;
+}
+
static void intel_gpio_set_gpio_mode(void __iomem *padcfg0)
{
u32 value;
@@ -456,7 +462,20 @@ static int intel_gpio_request_enable(struct pinctrl_dev *pctldev,
}

padcfg0 = intel_get_padcfg(pctrl, pin, PADCFG0);
+
+ /*
+ * If pin is already configured in GPIO mode, we assume that
+ * firmware provides correct settings. In such case we avoid
+ * potential glitches on the pin. Otherwise, for the pin in
+ * alternative mode, consumer has to supply respective flags.
+ */
+ if (intel_gpio_get_gpio_mode(padcfg0) == PADCFG0_PMODE_GPIO) {
+ raw_spin_unlock_irqrestore(&pctrl->lock, flags);
+ return 0;
+ }
+
intel_gpio_set_gpio_mode(padcfg0);
+
/* Disable TX buffer and enable RX (this will be input) */
__intel_gpio_set_direction(padcfg0, true);

diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c
index 59252bfca14e..41309ac65693 100644
--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
+++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
@@ -845,9 +845,9 @@ lpfc_disc_set_adisc(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)

if (!(vport->fc_flag & FC_PT2PT)) {
/* Check config parameter use-adisc or FCP-2 */
- if ((vport->cfg_use_adisc && (vport->fc_flag & FC_RSCN_MODE)) ||
+ if (vport->cfg_use_adisc && ((vport->fc_flag & FC_RSCN_MODE) ||
((ndlp->nlp_fcp_info & NLP_FCP_2_DEVICE) &&
- (ndlp->nlp_type & NLP_FCP_TARGET))) {
+ (ndlp->nlp_type & NLP_FCP_TARGET)))) {
spin_lock_irq(shost->host_lock);
ndlp->nlp_flag |= NLP_NPR_ADISC;
spin_unlock_irq(shost->host_lock);
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index f9e6a135d656..c7027ecd4d19 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -7898,7 +7898,7 @@ lpfc_sli4_process_missed_mbox_completions(struct lpfc_hba *phba)
if (sli4_hba->hdwq) {
for (eqidx = 0; eqidx < phba->cfg_irq_chann; eqidx++) {
eq = phba->sli4_hba.hba_eq_hdl[eqidx].eq;
- if (eq->queue_id == sli4_hba->mbx_cq->assoc_qid) {
+ if (eq && eq->queue_id == sli4_hba->mbx_cq->assoc_qid) {
fpeq = eq;
break;
}
diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c
index 5441557b424b..3084c2cff7bd 100644
--- a/drivers/scsi/qla2xxx/qla_bsg.c
+++ b/drivers/scsi/qla2xxx/qla_bsg.c
@@ -257,7 +257,7 @@ qla2x00_process_els(struct bsg_job *bsg_job)
srb_t *sp;
const char *type;
int req_sg_cnt, rsp_sg_cnt;
- int rval = (DRIVER_ERROR << 16);
+ int rval = (DID_ERROR << 16);
uint16_t nextlid = 0;

if (bsg_request->msgcode == FC_BSG_RPT_ELS) {
@@ -432,7 +432,7 @@ qla2x00_process_ct(struct bsg_job *bsg_job)
struct Scsi_Host *host = fc_bsg_to_shost(bsg_job);
scsi_qla_host_t *vha = shost_priv(host);
struct qla_hw_data *ha = vha->hw;
- int rval = (DRIVER_ERROR << 16);
+ int rval = (DID_ERROR << 16);
int req_sg_cnt, rsp_sg_cnt;
uint16_t loop_id;
struct fc_port *fcport;
@@ -1951,7 +1951,7 @@ qlafx00_mgmt_cmd(struct bsg_job *bsg_job)
struct Scsi_Host *host = fc_bsg_to_shost(bsg_job);
scsi_qla_host_t *vha = shost_priv(host);
struct qla_hw_data *ha = vha->hw;
- int rval = (DRIVER_ERROR << 16);
+ int rval = (DID_ERROR << 16);
struct qla_mt_iocb_rqst_fx00 *piocb_rqst;
srb_t *sp;
int req_sg_cnt = 0, rsp_sg_cnt = 0;
diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
index abfb9c800ce2..ac4640f45678 100644
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -710,6 +710,7 @@ qla2x00_execute_fw(scsi_qla_host_t *vha, uint32_t risc_addr)
mcp->mb[2] = LSW(risc_addr);
mcp->mb[3] = 0;
mcp->mb[4] = 0;
+ mcp->mb[11] = 0;
ha->flags.using_lr_setting = 0;
if (IS_QLA25XX(ha) || IS_QLA81XX(ha) || IS_QLA83XX(ha) ||
IS_QLA27XX(ha) || IS_QLA28XX(ha)) {
@@ -754,7 +755,7 @@ qla2x00_execute_fw(scsi_qla_host_t *vha, uint32_t risc_addr)
if (ha->flags.exchoffld_enabled)
mcp->mb[4] |= ENABLE_EXCHANGE_OFFLD;

- mcp->out_mb |= MBX_4|MBX_3|MBX_2|MBX_1;
+ mcp->out_mb |= MBX_4 | MBX_3 | MBX_2 | MBX_1 | MBX_11;
mcp->in_mb |= MBX_3 | MBX_2 | MBX_1;
} else {
mcp->mb[1] = LSW(risc_addr);
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 04cf6986eb8e..ac96771bb06d 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3543,6 +3543,10 @@ qla2x00_shutdown(struct pci_dev *pdev)
qla2x00_try_to_stop_firmware(vha);
}

+ /* Disable timer */
+ if (vha->timer_active)
+ qla2x00_stop_timer(vha);
+
/* Turn adapter off line */
vha->flags.online = 0;

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2d77f32e13d5..9dc367e2e742 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1166,11 +1166,12 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
sector_t lba = sectors_to_logical(sdp, blk_rq_pos(rq));
sector_t threshold;
unsigned int nr_blocks = sectors_to_logical(sdp, blk_rq_sectors(rq));
- bool dif, dix;
unsigned int mask = logical_to_sectors(sdp, 1) - 1;
bool write = rq_data_dir(rq) == WRITE;
unsigned char protect, fua;
blk_status_t ret;
+ unsigned int dif;
+ bool dix;

ret = scsi_init_io(cmd);
if (ret != BLK_STS_OK)
diff --git a/drivers/scsi/ufs/ufs_bsg.c b/drivers/scsi/ufs/ufs_bsg.c
index a9344eb4e047..dc2f6d2b46ed 100644
--- a/drivers/scsi/ufs/ufs_bsg.c
+++ b/drivers/scsi/ufs/ufs_bsg.c
@@ -98,6 +98,8 @@ static int ufs_bsg_request(struct bsg_job *job)

bsg_reply->reply_payload_rcv_len = 0;

+ pm_runtime_get_sync(hba->dev);
+
msgcode = bsg_request->msgcode;
switch (msgcode) {
case UPIU_TRANSACTION_QUERY_REQ:
@@ -135,6 +137,8 @@ static int ufs_bsg_request(struct bsg_job *job)
break;
}

+ pm_runtime_put_sync(hba->dev);
+
if (!desc_buff)
goto out;

diff --git a/drivers/soundwire/Kconfig b/drivers/soundwire/Kconfig
index f518273cfbe3..c8c80df090d1 100644
--- a/drivers/soundwire/Kconfig
+++ b/drivers/soundwire/Kconfig
@@ -5,6 +5,7 @@

menuconfig SOUNDWIRE
tristate "SoundWire support"
+ depends on ACPI || OF
help
SoundWire is a 2-Pin interface with data and clock line ratified
by the MIPI Alliance. SoundWire is used for transporting data
diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index fe745830a261..90b2127cc203 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -803,7 +803,7 @@ static int sdw_handle_port_interrupt(struct sdw_slave *slave,
static int sdw_handle_slave_alerts(struct sdw_slave *slave)
{
struct sdw_slave_intr_status slave_intr;
- u8 clear = 0, bit, port_status[15];
+ u8 clear = 0, bit, port_status[15] = {0};
int port_num, stat, ret, count = 0;
unsigned long port;
bool slave_notify = false;
diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c
index 151a74a54386..1ac1095bfeac 100644
--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -348,6 +348,11 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno, int inum,

/* Validate the wMaxPacketSize field */
maxp = usb_endpoint_maxp(&endpoint->desc);
+ if (maxp == 0) {
+ dev_warn(ddev, "config %d interface %d altsetting %d endpoint 0x%X has wMaxPacketSize 0, skipping\n",
+ cfgno, inum, asnum, d->bEndpointAddress);
+ goto skip_to_next_endpoint_or_interface_descriptor;
+ }

/* Find the highest legal maxpacket size for this endpoint */
i = 0; /* additional transactions per microframe */
diff --git a/drivers/usb/dwc3/Kconfig b/drivers/usb/dwc3/Kconfig
index 89abc6078703..556a876c7896 100644
--- a/drivers/usb/dwc3/Kconfig
+++ b/drivers/usb/dwc3/Kconfig
@@ -102,6 +102,7 @@ config USB_DWC3_MESON_G12A
depends on ARCH_MESON || COMPILE_TEST
default USB_DWC3
select USB_ROLE_SWITCH
+ select REGMAP_MMIO
help
Support USB2/3 functionality in Amlogic G12A platforms.
Say 'Y' or 'M' if you have one such device.
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index c9bb93a2c81e..06d7e8612dfe 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -300,8 +300,7 @@ static void dwc3_frame_length_adjustment(struct dwc3 *dwc)

reg = dwc3_readl(dwc->regs, DWC3_GFLADJ);
dft = reg & DWC3_GFLADJ_30MHZ_MASK;
- if (!dev_WARN_ONCE(dwc->dev, dft == dwc->fladj,
- "request value same as default, ignoring\n")) {
+ if (dft != dwc->fladj) {
reg &= ~DWC3_GFLADJ_30MHZ_MASK;
reg |= DWC3_GFLADJ_30MHZ_SDBND_SEL | dwc->fladj;
dwc3_writel(dwc->regs, DWC3_GFLADJ, reg);
diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c
index 5e8e18222f92..023f0357efd7 100644
--- a/drivers/usb/dwc3/dwc3-pci.c
+++ b/drivers/usb/dwc3/dwc3-pci.c
@@ -258,7 +258,7 @@ static int dwc3_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)

ret = platform_device_add_properties(dwc->dwc3, p);
if (ret < 0)
- return ret;
+ goto err;

ret = dwc3_pci_quirks(dwc);
if (ret)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 173f5329d3d9..56bd6ae0c18f 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -707,6 +707,12 @@ static void dwc3_remove_requests(struct dwc3 *dwc, struct dwc3_ep *dep)

dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
}
+
+ while (!list_empty(&dep->cancelled_list)) {
+ req = next_request(&dep->cancelled_list);
+
+ dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
+ }
}

/**
diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index 76883ff4f5bb..c8ae07cd6fbf 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -2156,14 +2156,18 @@ void composite_dev_cleanup(struct usb_composite_dev *cdev)
usb_ep_dequeue(cdev->gadget->ep0, cdev->os_desc_req);

kfree(cdev->os_desc_req->buf);
+ cdev->os_desc_req->buf = NULL;
usb_ep_free_request(cdev->gadget->ep0, cdev->os_desc_req);
+ cdev->os_desc_req = NULL;
}
if (cdev->req) {
if (cdev->setup_pending)
usb_ep_dequeue(cdev->gadget->ep0, cdev->req);

kfree(cdev->req->buf);
+ cdev->req->buf = NULL;
usb_ep_free_request(cdev->gadget->ep0, cdev->req);
+ cdev->req = NULL;
}
cdev->next_string_id = 0;
device_remove_file(&cdev->gadget->dev, &dev_attr_suspended);
diff --git a/drivers/usb/gadget/configfs.c b/drivers/usb/gadget/configfs.c
index 025129942894..33852c2b29d1 100644
--- a/drivers/usb/gadget/configfs.c
+++ b/drivers/usb/gadget/configfs.c
@@ -61,6 +61,8 @@ struct gadget_info {
bool use_os_desc;
char b_vendor_code;
char qw_sign[OS_STRING_QW_SIGN_LEN];
+ spinlock_t spinlock;
+ bool unbind;
};

static inline struct gadget_info *to_gadget_info(struct config_item *item)
@@ -1244,6 +1246,7 @@ static int configfs_composite_bind(struct usb_gadget *gadget,
int ret;

/* the gi->lock is hold by the caller */
+ gi->unbind = 0;
cdev->gadget = gadget;
set_gadget_data(gadget, cdev);
ret = composite_dev_prepare(composite, cdev);
@@ -1376,31 +1379,128 @@ static void configfs_composite_unbind(struct usb_gadget *gadget)
{
struct usb_composite_dev *cdev;
struct gadget_info *gi;
+ unsigned long flags;

/* the gi->lock is hold by the caller */

cdev = get_gadget_data(gadget);
gi = container_of(cdev, struct gadget_info, cdev);
+ spin_lock_irqsave(&gi->spinlock, flags);
+ gi->unbind = 1;
+ spin_unlock_irqrestore(&gi->spinlock, flags);

kfree(otg_desc[0]);
otg_desc[0] = NULL;
purge_configs_funcs(gi);
composite_dev_cleanup(cdev);
usb_ep_autoconfig_reset(cdev->gadget);
+ spin_lock_irqsave(&gi->spinlock, flags);
cdev->gadget = NULL;
set_gadget_data(gadget, NULL);
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+}
+
+static int configfs_composite_setup(struct usb_gadget *gadget,
+ const struct usb_ctrlrequest *ctrl)
+{
+ struct usb_composite_dev *cdev;
+ struct gadget_info *gi;
+ unsigned long flags;
+ int ret;
+
+ cdev = get_gadget_data(gadget);
+ if (!cdev)
+ return 0;
+
+ gi = container_of(cdev, struct gadget_info, cdev);
+ spin_lock_irqsave(&gi->spinlock, flags);
+ cdev = get_gadget_data(gadget);
+ if (!cdev || gi->unbind) {
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+ return 0;
+ }
+
+ ret = composite_setup(gadget, ctrl);
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+ return ret;
+}
+
+static void configfs_composite_disconnect(struct usb_gadget *gadget)
+{
+ struct usb_composite_dev *cdev;
+ struct gadget_info *gi;
+ unsigned long flags;
+
+ cdev = get_gadget_data(gadget);
+ if (!cdev)
+ return;
+
+ gi = container_of(cdev, struct gadget_info, cdev);
+ spin_lock_irqsave(&gi->spinlock, flags);
+ cdev = get_gadget_data(gadget);
+ if (!cdev || gi->unbind) {
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+ return;
+ }
+
+ composite_disconnect(gadget);
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+}
+
+static void configfs_composite_suspend(struct usb_gadget *gadget)
+{
+ struct usb_composite_dev *cdev;
+ struct gadget_info *gi;
+ unsigned long flags;
+
+ cdev = get_gadget_data(gadget);
+ if (!cdev)
+ return;
+
+ gi = container_of(cdev, struct gadget_info, cdev);
+ spin_lock_irqsave(&gi->spinlock, flags);
+ cdev = get_gadget_data(gadget);
+ if (!cdev || gi->unbind) {
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+ return;
+ }
+
+ composite_suspend(gadget);
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+}
+
+static void configfs_composite_resume(struct usb_gadget *gadget)
+{
+ struct usb_composite_dev *cdev;
+ struct gadget_info *gi;
+ unsigned long flags;
+
+ cdev = get_gadget_data(gadget);
+ if (!cdev)
+ return;
+
+ gi = container_of(cdev, struct gadget_info, cdev);
+ spin_lock_irqsave(&gi->spinlock, flags);
+ cdev = get_gadget_data(gadget);
+ if (!cdev || gi->unbind) {
+ spin_unlock_irqrestore(&gi->spinlock, flags);
+ return;
+ }
+
+ composite_resume(gadget);
+ spin_unlock_irqrestore(&gi->spinlock, flags);
}

static const struct usb_gadget_driver configfs_driver_template = {
.bind = configfs_composite_bind,
.unbind = configfs_composite_unbind,

- .setup = composite_setup,
- .reset = composite_disconnect,
- .disconnect = composite_disconnect,
+ .setup = configfs_composite_setup,
+ .reset = configfs_composite_disconnect,
+ .disconnect = configfs_composite_disconnect,

- .suspend = composite_suspend,
- .resume = composite_resume,
+ .suspend = configfs_composite_suspend,
+ .resume = configfs_composite_resume,

.max_speed = USB_SPEED_SUPER,
.driver = {
diff --git a/drivers/usb/gadget/udc/atmel_usba_udc.c b/drivers/usb/gadget/udc/atmel_usba_udc.c
index 503d275bc4c4..761e8a808857 100644
--- a/drivers/usb/gadget/udc/atmel_usba_udc.c
+++ b/drivers/usb/gadget/udc/atmel_usba_udc.c
@@ -448,9 +448,11 @@ static void submit_request(struct usba_ep *ep, struct usba_request *req)
next_fifo_transaction(ep, req);
if (req->last_transaction) {
usba_ep_writel(ep, CTL_DIS, USBA_TX_PK_RDY);
- usba_ep_writel(ep, CTL_ENB, USBA_TX_COMPLETE);
+ if (ep_is_control(ep))
+ usba_ep_writel(ep, CTL_ENB, USBA_TX_COMPLETE);
} else {
- usba_ep_writel(ep, CTL_DIS, USBA_TX_COMPLETE);
+ if (ep_is_control(ep))
+ usba_ep_writel(ep, CTL_DIS, USBA_TX_COMPLETE);
usba_ep_writel(ep, CTL_ENB, USBA_TX_PK_RDY);
}
}
diff --git a/drivers/usb/gadget/udc/fsl_udc_core.c b/drivers/usb/gadget/udc/fsl_udc_core.c
index 20141c3096f6..9a05863b2876 100644
--- a/drivers/usb/gadget/udc/fsl_udc_core.c
+++ b/drivers/usb/gadget/udc/fsl_udc_core.c
@@ -2576,7 +2576,7 @@ static int fsl_udc_remove(struct platform_device *pdev)
dma_pool_destroy(udc_controller->td_pool);
free_irq(udc_controller->irq, udc_controller);
iounmap(dr_regs);
- if (pdata->operating_mode == FSL_USB2_DR_DEVICE)
+ if (res && (pdata->operating_mode == FSL_USB2_DR_DEVICE))
release_mem_region(res->start, resource_size(res));

/* free udc --wait for the release() finished */
diff --git a/drivers/usb/misc/ldusb.c b/drivers/usb/misc/ldusb.c
index f5e34c503454..8f86b4ebca89 100644
--- a/drivers/usb/misc/ldusb.c
+++ b/drivers/usb/misc/ldusb.c
@@ -487,7 +487,7 @@ static ssize_t ld_usb_read(struct file *file, char __user *buffer, size_t count,
}
bytes_to_read = min(count, *actual_buffer);
if (bytes_to_read < *actual_buffer)
- dev_warn(&dev->intf->dev, "Read buffer overflow, %zd bytes dropped\n",
+ dev_warn(&dev->intf->dev, "Read buffer overflow, %zu bytes dropped\n",
*actual_buffer-bytes_to_read);

/* copy one interrupt_in_buffer from ring_buffer into userspace */
@@ -562,8 +562,9 @@ static ssize_t ld_usb_write(struct file *file, const char __user *buffer,
/* write the data into interrupt_out_buffer from userspace */
bytes_to_write = min(count, write_buffer_size*dev->interrupt_out_endpoint_size);
if (bytes_to_write < count)
- dev_warn(&dev->intf->dev, "Write buffer overflow, %zd bytes dropped\n", count-bytes_to_write);
- dev_dbg(&dev->intf->dev, "%s: count = %zd, bytes_to_write = %zd\n",
+ dev_warn(&dev->intf->dev, "Write buffer overflow, %zu bytes dropped\n",
+ count - bytes_to_write);
+ dev_dbg(&dev->intf->dev, "%s: count = %zu, bytes_to_write = %zu\n",
__func__, count, bytes_to_write);

if (copy_from_user(dev->interrupt_out_buffer, buffer, bytes_to_write)) {
diff --git a/drivers/usb/usbip/stub.h b/drivers/usb/usbip/stub.h
index 35618ceb2791..d11270560c24 100644
--- a/drivers/usb/usbip/stub.h
+++ b/drivers/usb/usbip/stub.h
@@ -52,7 +52,11 @@ struct stub_priv {
unsigned long seqnum;
struct list_head list;
struct stub_device *sdev;
- struct urb *urb;
+ struct urb **urbs;
+ struct scatterlist *sgl;
+ int num_urbs;
+ int completed_urbs;
+ int urb_status;

int unlinking;
};
@@ -86,6 +90,7 @@ extern struct usb_device_driver stub_driver;
struct bus_id_priv *get_busid_priv(const char *busid);
void put_busid_priv(struct bus_id_priv *bid);
int del_match_busid(char *busid);
+void stub_free_priv_and_urb(struct stub_priv *priv);
void stub_device_cleanup_urbs(struct stub_device *sdev);

/* stub_rx.c */
diff --git a/drivers/usb/usbip/stub_main.c b/drivers/usb/usbip/stub_main.c
index 2e4bfccd4bfc..c1c0bbc9f8b1 100644
--- a/drivers/usb/usbip/stub_main.c
+++ b/drivers/usb/usbip/stub_main.c
@@ -6,6 +6,7 @@
#include <linux/string.h>
#include <linux/module.h>
#include <linux/device.h>
+#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "stub.h"
@@ -281,13 +282,49 @@ static struct stub_priv *stub_priv_pop_from_listhead(struct list_head *listhead)
struct stub_priv *priv, *tmp;

list_for_each_entry_safe(priv, tmp, listhead, list) {
- list_del(&priv->list);
+ list_del_init(&priv->list);
return priv;
}

return NULL;
}

+void stub_free_priv_and_urb(struct stub_priv *priv)
+{
+ struct urb *urb;
+ int i;
+
+ for (i = 0; i < priv->num_urbs; i++) {
+ urb = priv->urbs[i];
+
+ if (!urb)
+ return;
+
+ kfree(urb->setup_packet);
+ urb->setup_packet = NULL;
+
+
+ if (urb->transfer_buffer && !priv->sgl) {
+ kfree(urb->transfer_buffer);
+ urb->transfer_buffer = NULL;
+ }
+
+ if (urb->num_sgs) {
+ sgl_free(urb->sg);
+ urb->sg = NULL;
+ urb->num_sgs = 0;
+ }
+
+ usb_free_urb(urb);
+ }
+ if (!list_empty(&priv->list))
+ list_del(&priv->list);
+ if (priv->sgl)
+ sgl_free(priv->sgl);
+ kfree(priv->urbs);
+ kmem_cache_free(stub_priv_cache, priv);
+}
+
static struct stub_priv *stub_priv_pop(struct stub_device *sdev)
{
unsigned long flags;
@@ -314,25 +351,15 @@ static struct stub_priv *stub_priv_pop(struct stub_device *sdev)
void stub_device_cleanup_urbs(struct stub_device *sdev)
{
struct stub_priv *priv;
- struct urb *urb;
+ int i;

dev_dbg(&sdev->udev->dev, "Stub device cleaning up urbs\n");

while ((priv = stub_priv_pop(sdev))) {
- urb = priv->urb;
- dev_dbg(&sdev->udev->dev, "free urb seqnum %lu\n",
- priv->seqnum);
- usb_kill_urb(urb);
-
- kmem_cache_free(stub_priv_cache, priv);
+ for (i = 0; i < priv->num_urbs; i++)
+ usb_kill_urb(priv->urbs[i]);

- kfree(urb->transfer_buffer);
- urb->transfer_buffer = NULL;
-
- kfree(urb->setup_packet);
- urb->setup_packet = NULL;
-
- usb_free_urb(urb);
+ stub_free_priv_and_urb(priv);
}
}

diff --git a/drivers/usb/usbip/stub_rx.c b/drivers/usb/usbip/stub_rx.c
index b0a855acafa3..66edfeea68fe 100644
--- a/drivers/usb/usbip/stub_rx.c
+++ b/drivers/usb/usbip/stub_rx.c
@@ -7,6 +7,7 @@
#include <linux/kthread.h>
#include <linux/usb.h>
#include <linux/usb/hcd.h>
+#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "stub.h"
@@ -201,7 +202,7 @@ static void tweak_special_requests(struct urb *urb)
static int stub_recv_cmd_unlink(struct stub_device *sdev,
struct usbip_header *pdu)
{
- int ret;
+ int ret, i;
unsigned long flags;
struct stub_priv *priv;

@@ -246,12 +247,14 @@ static int stub_recv_cmd_unlink(struct stub_device *sdev,
* so a driver in a client host will know the failure
* of the unlink request ?
*/
- ret = usb_unlink_urb(priv->urb);
- if (ret != -EINPROGRESS)
- dev_err(&priv->urb->dev->dev,
- "failed to unlink a urb # %lu, ret %d\n",
- priv->seqnum, ret);
-
+ for (i = priv->completed_urbs; i < priv->num_urbs; i++) {
+ ret = usb_unlink_urb(priv->urbs[i]);
+ if (ret != -EINPROGRESS)
+ dev_err(&priv->urbs[i]->dev->dev,
+ "failed to unlink %d/%d urb of seqnum %lu, ret %d\n",
+ i + 1, priv->num_urbs,
+ priv->seqnum, ret);
+ }
return 0;
}

@@ -433,14 +436,36 @@ static void masking_bogus_flags(struct urb *urb)
urb->transfer_flags &= allowed;
}

+static int stub_recv_xbuff(struct usbip_device *ud, struct stub_priv *priv)
+{
+ int ret;
+ int i;
+
+ for (i = 0; i < priv->num_urbs; i++) {
+ ret = usbip_recv_xbuff(ud, priv->urbs[i]);
+ if (ret < 0)
+ break;
+ }
+
+ return ret;
+}
+
static void stub_recv_cmd_submit(struct stub_device *sdev,
struct usbip_header *pdu)
{
- int ret;
struct stub_priv *priv;
struct usbip_device *ud = &sdev->ud;
struct usb_device *udev = sdev->udev;
+ struct scatterlist *sgl = NULL, *sg;
+ void *buffer = NULL;
+ unsigned long long buf_len;
+ int nents;
+ int num_urbs = 1;
int pipe = get_pipe(sdev, pdu);
+ int use_sg = pdu->u.cmd_submit.transfer_flags & URB_DMA_MAP_SG;
+ int support_sg = 1;
+ int np = 0;
+ int ret, i;

if (pipe == -1)
return;
@@ -449,76 +474,139 @@ static void stub_recv_cmd_submit(struct stub_device *sdev,
if (!priv)
return;

- /* setup a urb */
- if (usb_pipeisoc(pipe))
- priv->urb = usb_alloc_urb(pdu->u.cmd_submit.number_of_packets,
- GFP_KERNEL);
- else
- priv->urb = usb_alloc_urb(0, GFP_KERNEL);
+ buf_len = (unsigned long long)pdu->u.cmd_submit.transfer_buffer_length;

- if (!priv->urb) {
- usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
- return;
+ /* allocate urb transfer buffer, if needed */
+ if (buf_len) {
+ if (use_sg) {
+ sgl = sgl_alloc(buf_len, GFP_KERNEL, &nents);
+ if (!sgl)
+ goto err_malloc;
+ } else {
+ buffer = kzalloc(buf_len, GFP_KERNEL);
+ if (!buffer)
+ goto err_malloc;
+ }
}

- /* allocate urb transfer buffer, if needed */
- if (pdu->u.cmd_submit.transfer_buffer_length > 0) {
- priv->urb->transfer_buffer =
- kzalloc(pdu->u.cmd_submit.transfer_buffer_length,
- GFP_KERNEL);
- if (!priv->urb->transfer_buffer) {
+ /* Check if the server's HCD supports SG */
+ if (use_sg && !udev->bus->sg_tablesize) {
+ /*
+ * If the server's HCD doesn't support SG, break a single SG
+ * request into several URBs and map each SG list entry to
+ * corresponding URB buffer. The previously allocated SG
+ * list is stored in priv->sgl (If the server's HCD support SG,
+ * SG list is stored only in urb->sg) and it is used as an
+ * indicator that the server split single SG request into
+ * several URBs. Later, priv->sgl is used by stub_complete() and
+ * stub_send_ret_submit() to reassemble the divied URBs.
+ */
+ support_sg = 0;
+ num_urbs = nents;
+ priv->completed_urbs = 0;
+ pdu->u.cmd_submit.transfer_flags &= ~URB_DMA_MAP_SG;
+ }
+
+ /* allocate urb array */
+ priv->num_urbs = num_urbs;
+ priv->urbs = kmalloc_array(num_urbs, sizeof(*priv->urbs), GFP_KERNEL);
+ if (!priv->urbs)
+ goto err_urbs;
+
+ /* setup a urb */
+ if (support_sg) {
+ if (usb_pipeisoc(pipe))
+ np = pdu->u.cmd_submit.number_of_packets;
+
+ priv->urbs[0] = usb_alloc_urb(np, GFP_KERNEL);
+ if (!priv->urbs[0])
+ goto err_urb;
+
+ if (buf_len) {
+ if (use_sg) {
+ priv->urbs[0]->sg = sgl;
+ priv->urbs[0]->num_sgs = nents;
+ priv->urbs[0]->transfer_buffer = NULL;
+ } else {
+ priv->urbs[0]->transfer_buffer = buffer;
+ }
+ }
+
+ /* copy urb setup packet */
+ priv->urbs[0]->setup_packet = kmemdup(&pdu->u.cmd_submit.setup,
+ 8, GFP_KERNEL);
+ if (!priv->urbs[0]->setup_packet) {
usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
return;
}
- }

- /* copy urb setup packet */
- priv->urb->setup_packet = kmemdup(&pdu->u.cmd_submit.setup, 8,
- GFP_KERNEL);
- if (!priv->urb->setup_packet) {
- dev_err(&udev->dev, "allocate setup_packet\n");
- usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
- return;
+ usbip_pack_pdu(pdu, priv->urbs[0], USBIP_CMD_SUBMIT, 0);
+ } else {
+ for_each_sg(sgl, sg, nents, i) {
+ priv->urbs[i] = usb_alloc_urb(0, GFP_KERNEL);
+ /* The URBs which is previously allocated will be freed
+ * in stub_device_cleanup_urbs() if error occurs.
+ */
+ if (!priv->urbs[i])
+ goto err_urb;
+
+ usbip_pack_pdu(pdu, priv->urbs[i], USBIP_CMD_SUBMIT, 0);
+ priv->urbs[i]->transfer_buffer = sg_virt(sg);
+ priv->urbs[i]->transfer_buffer_length = sg->length;
+ }
+ priv->sgl = sgl;
}

- /* set other members from the base header of pdu */
- priv->urb->context = (void *) priv;
- priv->urb->dev = udev;
- priv->urb->pipe = pipe;
- priv->urb->complete = stub_complete;
+ for (i = 0; i < num_urbs; i++) {
+ /* set other members from the base header of pdu */
+ priv->urbs[i]->context = (void *) priv;
+ priv->urbs[i]->dev = udev;
+ priv->urbs[i]->pipe = pipe;
+ priv->urbs[i]->complete = stub_complete;

- usbip_pack_pdu(pdu, priv->urb, USBIP_CMD_SUBMIT, 0);
+ /* no need to submit an intercepted request, but harmless? */
+ tweak_special_requests(priv->urbs[i]);

+ masking_bogus_flags(priv->urbs[i]);
+ }

- if (usbip_recv_xbuff(ud, priv->urb) < 0)
+ if (stub_recv_xbuff(ud, priv) < 0)
return;

- if (usbip_recv_iso(ud, priv->urb) < 0)
+ if (usbip_recv_iso(ud, priv->urbs[0]) < 0)
return;

- /* no need to submit an intercepted request, but harmless? */
- tweak_special_requests(priv->urb);
-
- masking_bogus_flags(priv->urb);
/* urb is now ready to submit */
- ret = usb_submit_urb(priv->urb, GFP_KERNEL);
-
- if (ret == 0)
- usbip_dbg_stub_rx("submit urb ok, seqnum %u\n",
- pdu->base.seqnum);
- else {
- dev_err(&udev->dev, "submit_urb error, %d\n", ret);
- usbip_dump_header(pdu);
- usbip_dump_urb(priv->urb);
-
- /*
- * Pessimistic.
- * This connection will be discarded.
- */
- usbip_event_add(ud, SDEV_EVENT_ERROR_SUBMIT);
+ for (i = 0; i < priv->num_urbs; i++) {
+ ret = usb_submit_urb(priv->urbs[i], GFP_KERNEL);
+
+ if (ret == 0)
+ usbip_dbg_stub_rx("submit urb ok, seqnum %u\n",
+ pdu->base.seqnum);
+ else {
+ dev_err(&udev->dev, "submit_urb error, %d\n", ret);
+ usbip_dump_header(pdu);
+ usbip_dump_urb(priv->urbs[i]);
+
+ /*
+ * Pessimistic.
+ * This connection will be discarded.
+ */
+ usbip_event_add(ud, SDEV_EVENT_ERROR_SUBMIT);
+ break;
+ }
}

usbip_dbg_stub_rx("Leave\n");
+ return;
+
+err_urb:
+ kfree(priv->urbs);
+err_urbs:
+ kfree(buffer);
+ sgl_free(sgl);
+err_malloc:
+ usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
}

/* recv a pdu */
diff --git a/drivers/usb/usbip/stub_tx.c b/drivers/usb/usbip/stub_tx.c
index f0ec41a50cbc..36010a82b359 100644
--- a/drivers/usb/usbip/stub_tx.c
+++ b/drivers/usb/usbip/stub_tx.c
@@ -5,25 +5,11 @@

#include <linux/kthread.h>
#include <linux/socket.h>
+#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "stub.h"

-static void stub_free_priv_and_urb(struct stub_priv *priv)
-{
- struct urb *urb = priv->urb;
-
- kfree(urb->setup_packet);
- urb->setup_packet = NULL;
-
- kfree(urb->transfer_buffer);
- urb->transfer_buffer = NULL;
-
- list_del(&priv->list);
- kmem_cache_free(stub_priv_cache, priv);
- usb_free_urb(urb);
-}
-
/* be in spin_lock_irqsave(&sdev->priv_lock, flags) */
void stub_enqueue_ret_unlink(struct stub_device *sdev, __u32 seqnum,
__u32 status)
@@ -85,6 +71,22 @@ void stub_complete(struct urb *urb)
break;
}

+ /*
+ * If the server breaks single SG request into the several URBs, the
+ * URBs must be reassembled before sending completed URB to the vhci.
+ * Don't wake up the tx thread until all the URBs are completed.
+ */
+ if (priv->sgl) {
+ priv->completed_urbs++;
+
+ /* Only save the first error status */
+ if (urb->status && !priv->urb_status)
+ priv->urb_status = urb->status;
+
+ if (priv->completed_urbs < priv->num_urbs)
+ return;
+ }
+
/* link a urb to the queue of tx. */
spin_lock_irqsave(&sdev->priv_lock, flags);
if (sdev->ud.tcp_socket == NULL) {
@@ -156,18 +158,22 @@ static int stub_send_ret_submit(struct stub_device *sdev)
size_t total_size = 0;

while ((priv = dequeue_from_priv_tx(sdev)) != NULL) {
- int ret;
- struct urb *urb = priv->urb;
+ struct urb *urb = priv->urbs[0];
struct usbip_header pdu_header;
struct usbip_iso_packet_descriptor *iso_buffer = NULL;
struct kvec *iov = NULL;
+ struct scatterlist *sg;
+ u32 actual_length = 0;
int iovnum = 0;
+ int ret;
+ int i;

txsize = 0;
memset(&pdu_header, 0, sizeof(pdu_header));
memset(&msg, 0, sizeof(msg));

- if (urb->actual_length > 0 && !urb->transfer_buffer) {
+ if (urb->actual_length > 0 && !urb->transfer_buffer &&
+ !urb->num_sgs) {
dev_err(&sdev->udev->dev,
"urb: actual_length %d transfer_buffer null\n",
urb->actual_length);
@@ -176,6 +182,11 @@ static int stub_send_ret_submit(struct stub_device *sdev)

if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS)
iovnum = 2 + urb->number_of_packets;
+ else if (usb_pipein(urb->pipe) && urb->actual_length > 0 &&
+ urb->num_sgs)
+ iovnum = 1 + urb->num_sgs;
+ else if (usb_pipein(urb->pipe) && priv->sgl)
+ iovnum = 1 + priv->num_urbs;
else
iovnum = 2;

@@ -192,6 +203,15 @@ static int stub_send_ret_submit(struct stub_device *sdev)
setup_ret_submit_pdu(&pdu_header, urb);
usbip_dbg_stub_tx("setup txdata seqnum: %d\n",
pdu_header.base.seqnum);
+
+ if (priv->sgl) {
+ for (i = 0; i < priv->num_urbs; i++)
+ actual_length += priv->urbs[i]->actual_length;
+
+ pdu_header.u.ret_submit.status = priv->urb_status;
+ pdu_header.u.ret_submit.actual_length = actual_length;
+ }
+
usbip_header_correct_endian(&pdu_header, 1);

iov[iovnum].iov_base = &pdu_header;
@@ -200,12 +220,47 @@ static int stub_send_ret_submit(struct stub_device *sdev)
txsize += sizeof(pdu_header);

/* 2. setup transfer buffer */
- if (usb_pipein(urb->pipe) &&
+ if (usb_pipein(urb->pipe) && priv->sgl) {
+ /* If the server split a single SG request into several
+ * URBs because the server's HCD doesn't support SG,
+ * reassemble the split URB buffers into a single
+ * return command.
+ */
+ for (i = 0; i < priv->num_urbs; i++) {
+ iov[iovnum].iov_base =
+ priv->urbs[i]->transfer_buffer;
+ iov[iovnum].iov_len =
+ priv->urbs[i]->actual_length;
+ iovnum++;
+ }
+ txsize += actual_length;
+ } else if (usb_pipein(urb->pipe) &&
usb_pipetype(urb->pipe) != PIPE_ISOCHRONOUS &&
urb->actual_length > 0) {
- iov[iovnum].iov_base = urb->transfer_buffer;
- iov[iovnum].iov_len = urb->actual_length;
- iovnum++;
+ if (urb->num_sgs) {
+ unsigned int copy = urb->actual_length;
+ int size;
+
+ for_each_sg(urb->sg, sg, urb->num_sgs, i) {
+ if (copy == 0)
+ break;
+
+ if (copy < sg->length)
+ size = copy;
+ else
+ size = sg->length;
+
+ iov[iovnum].iov_base = sg_virt(sg);
+ iov[iovnum].iov_len = size;
+
+ iovnum++;
+ copy -= size;
+ }
+ } else {
+ iov[iovnum].iov_base = urb->transfer_buffer;
+ iov[iovnum].iov_len = urb->actual_length;
+ iovnum++;
+ }
txsize += urb->actual_length;
} else if (usb_pipein(urb->pipe) &&
usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) {
diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c
index 45da3e01c7b0..6532d68e8808 100644
--- a/drivers/usb/usbip/usbip_common.c
+++ b/drivers/usb/usbip/usbip_common.c
@@ -680,8 +680,12 @@ EXPORT_SYMBOL_GPL(usbip_pad_iso);
/* some members of urb must be substituted before. */
int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb)
{
- int ret;
+ struct scatterlist *sg;
+ int ret = 0;
+ int recv;
int size;
+ int copy;
+ int i;

if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
/* the direction of urb must be OUT. */
@@ -701,29 +705,48 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb)
if (!(size > 0))
return 0;

- if (size > urb->transfer_buffer_length) {
+ if (size > urb->transfer_buffer_length)
/* should not happen, probably malicious packet */
- if (ud->side == USBIP_STUB) {
- usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
- return 0;
- } else {
- usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
- return -EPIPE;
- }
- }
+ goto error;

- ret = usbip_recv(ud->tcp_socket, urb->transfer_buffer, size);
- if (ret != size) {
- dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
- if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
- usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
- } else {
- usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
- return -EPIPE;
+ if (urb->num_sgs) {
+ copy = size;
+ for_each_sg(urb->sg, sg, urb->num_sgs, i) {
+ int recv_size;
+
+ if (copy < sg->length)
+ recv_size = copy;
+ else
+ recv_size = sg->length;
+
+ recv = usbip_recv(ud->tcp_socket, sg_virt(sg),
+ recv_size);
+
+ if (recv != recv_size)
+ goto error;
+
+ copy -= recv;
+ ret += recv;
}
+
+ if (ret != size)
+ goto error;
+ } else {
+ ret = usbip_recv(ud->tcp_socket, urb->transfer_buffer, size);
+ if (ret != size)
+ goto error;
}

return ret;
+
+error:
+ dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
+ if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC)
+ usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
+ else
+ usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
+
+ return -EPIPE;
}
EXPORT_SYMBOL_GPL(usbip_recv_xbuff);

diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
index 000ab7225717..585a84d319bd 100644
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -697,7 +697,8 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
}
vdev = &vhci_hcd->vdev[portnum-1];

- if (!urb->transfer_buffer && urb->transfer_buffer_length) {
+ if (!urb->transfer_buffer && !urb->num_sgs &&
+ urb->transfer_buffer_length) {
dev_dbg(dev, "Null URB transfer buffer\n");
return -EINVAL;
}
@@ -1143,6 +1144,15 @@ static int vhci_setup(struct usb_hcd *hcd)
hcd->speed = HCD_USB3;
hcd->self.root_hub->speed = USB_SPEED_SUPER;
}
+
+ /*
+ * Support SG.
+ * sg_tablesize is an arbitrary value to alleviate memory pressure
+ * on the host.
+ */
+ hcd->self.sg_tablesize = 32;
+ hcd->self.no_sg_constraint = 1;
+
return 0;
}

diff --git a/drivers/usb/usbip/vhci_rx.c b/drivers/usb/usbip/vhci_rx.c
index 44cd64518925..33f8972ba842 100644
--- a/drivers/usb/usbip/vhci_rx.c
+++ b/drivers/usb/usbip/vhci_rx.c
@@ -90,6 +90,9 @@ static void vhci_recv_ret_submit(struct vhci_device *vdev,
if (usbip_dbg_flag_vhci_rx)
usbip_dump_urb(urb);

+ if (urb->num_sgs)
+ urb->transfer_flags &= ~URB_DMA_MAP_SG;
+
usbip_dbg_vhci_rx("now giveback urb %u\n", pdu->base.seqnum);

spin_lock_irqsave(&vhci->lock, flags);
diff --git a/drivers/usb/usbip/vhci_tx.c b/drivers/usb/usbip/vhci_tx.c
index 2fa26d0578d7..0ae40a13a9fe 100644
--- a/drivers/usb/usbip/vhci_tx.c
+++ b/drivers/usb/usbip/vhci_tx.c
@@ -5,6 +5,7 @@

#include <linux/kthread.h>
#include <linux/slab.h>
+#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "vhci.h"
@@ -50,19 +51,23 @@ static struct vhci_priv *dequeue_from_priv_tx(struct vhci_device *vdev)

static int vhci_send_cmd_submit(struct vhci_device *vdev)
{
+ struct usbip_iso_packet_descriptor *iso_buffer = NULL;
struct vhci_priv *priv = NULL;
+ struct scatterlist *sg;

struct msghdr msg;
- struct kvec iov[3];
+ struct kvec *iov;
size_t txsize;

size_t total_size = 0;
+ int iovnum;
+ int err = -ENOMEM;
+ int i;

while ((priv = dequeue_from_priv_tx(vdev)) != NULL) {
int ret;
struct urb *urb = priv->urb;
struct usbip_header pdu_header;
- struct usbip_iso_packet_descriptor *iso_buffer = NULL;

txsize = 0;
memset(&pdu_header, 0, sizeof(pdu_header));
@@ -72,18 +77,45 @@ static int vhci_send_cmd_submit(struct vhci_device *vdev)
usbip_dbg_vhci_tx("setup txdata urb seqnum %lu\n",
priv->seqnum);

+ if (urb->num_sgs && usb_pipeout(urb->pipe))
+ iovnum = 2 + urb->num_sgs;
+ else
+ iovnum = 3;
+
+ iov = kcalloc(iovnum, sizeof(*iov), GFP_KERNEL);
+ if (!iov) {
+ usbip_event_add(&vdev->ud, SDEV_EVENT_ERROR_MALLOC);
+ return -ENOMEM;
+ }
+
+ if (urb->num_sgs)
+ urb->transfer_flags |= URB_DMA_MAP_SG;
+
/* 1. setup usbip_header */
setup_cmd_submit_pdu(&pdu_header, urb);
usbip_header_correct_endian(&pdu_header, 1);
+ iovnum = 0;

- iov[0].iov_base = &pdu_header;
- iov[0].iov_len = sizeof(pdu_header);
+ iov[iovnum].iov_base = &pdu_header;
+ iov[iovnum].iov_len = sizeof(pdu_header);
txsize += sizeof(pdu_header);
+ iovnum++;

/* 2. setup transfer buffer */
if (!usb_pipein(urb->pipe) && urb->transfer_buffer_length > 0) {
- iov[1].iov_base = urb->transfer_buffer;
- iov[1].iov_len = urb->transfer_buffer_length;
+ if (urb->num_sgs &&
+ !usb_endpoint_xfer_isoc(&urb->ep->desc)) {
+ for_each_sg(urb->sg, sg, urb->num_sgs, i) {
+ iov[iovnum].iov_base = sg_virt(sg);
+ iov[iovnum].iov_len = sg->length;
+ iovnum++;
+ }
+ } else {
+ iov[iovnum].iov_base = urb->transfer_buffer;
+ iov[iovnum].iov_len =
+ urb->transfer_buffer_length;
+ iovnum++;
+ }
txsize += urb->transfer_buffer_length;
}

@@ -95,30 +127,43 @@ static int vhci_send_cmd_submit(struct vhci_device *vdev)
if (!iso_buffer) {
usbip_event_add(&vdev->ud,
SDEV_EVENT_ERROR_MALLOC);
- return -1;
+ goto err_iso_buffer;
}

- iov[2].iov_base = iso_buffer;
- iov[2].iov_len = len;
+ iov[iovnum].iov_base = iso_buffer;
+ iov[iovnum].iov_len = len;
+ iovnum++;
txsize += len;
}

- ret = kernel_sendmsg(vdev->ud.tcp_socket, &msg, iov, 3, txsize);
+ ret = kernel_sendmsg(vdev->ud.tcp_socket, &msg, iov, iovnum,
+ txsize);
if (ret != txsize) {
pr_err("sendmsg failed!, ret=%d for %zd\n", ret,
txsize);
- kfree(iso_buffer);
usbip_event_add(&vdev->ud, VDEV_EVENT_ERROR_TCP);
- return -1;
+ err = -EPIPE;
+ goto err_tx;
}

+ kfree(iov);
+ /* This is only for isochronous case */
kfree(iso_buffer);
+ iso_buffer = NULL;
+
usbip_dbg_vhci_tx("send txdata\n");

total_size += txsize;
}

return total_size;
+
+err_tx:
+ kfree(iso_buffer);
+err_iso_buffer:
+ kfree(iov);
+
+ return err;
}

static struct vhci_unlink *dequeue_from_unlink_tx(struct vhci_device *vdev)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1b85278471f6..a0318bc57fa6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -472,6 +472,7 @@ static noinline void compress_file_range(struct async_chunk *async_chunk,
u64 start = async_chunk->start;
u64 end = async_chunk->end;
u64 actual_end;
+ u64 i_size;
int ret = 0;
struct page **pages = NULL;
unsigned long nr_pages;
@@ -485,7 +486,19 @@ static noinline void compress_file_range(struct async_chunk *async_chunk,
inode_should_defrag(BTRFS_I(inode), start, end, end - start + 1,
SZ_16K);

- actual_end = min_t(u64, i_size_read(inode), end + 1);
+ /*
+ * We need to save i_size before now because it could change in between
+ * us evaluating the size and assigning it. This is because we lock and
+ * unlock the page in truncate and fallocate, and then modify the i_size
+ * later on.
+ *
+ * The barriers are to emulate READ_ONCE, remove that once i_size_read
+ * does that for us.
+ */
+ barrier();
+ i_size = i_size_read(inode);
+ barrier();
+ actual_end = min_t(u64, i_size, end + 1);
again:
will_compress = 0;
nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 9634cae1e1b1..24f36e2dac06 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -686,9 +686,7 @@ static void dev_item_err(const struct extent_buffer *eb, int slot,
static int check_dev_item(struct extent_buffer *leaf,
struct btrfs_key *key, int slot)
{
- struct btrfs_fs_info *fs_info = leaf->fs_info;
struct btrfs_dev_item *ditem;
- u64 max_devid = max(BTRFS_MAX_DEVS(fs_info), BTRFS_MAX_DEVS_SYS_CHUNK);

if (key->objectid != BTRFS_DEV_ITEMS_OBJECTID) {
dev_item_err(leaf, slot,
@@ -696,12 +694,6 @@ static int check_dev_item(struct extent_buffer *leaf,
key->objectid, BTRFS_DEV_ITEMS_OBJECTID);
return -EUCLEAN;
}
- if (key->offset > max_devid) {
- dev_item_err(leaf, slot,
- "invalid devid: has=%llu expect=[0, %llu]",
- key->offset, max_devid);
- return -EUCLEAN;
- }
ditem = btrfs_item_ptr(leaf, slot, struct btrfs_dev_item);
if (btrfs_device_id(leaf, ditem) != key->offset) {
dev_item_err(leaf, slot,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9c057609eaec..0084fb9fa91e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4976,6 +4976,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
max_stripe_size = SZ_32M;
max_chunk_size = 2 * max_stripe_size;
+ devs_max = min_t(int, devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
} else {
btrfs_err(info, "invalid chunk type 0x%llx requested",
type);
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 8fd530112810..966cec9d62e4 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1087,6 +1087,11 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)

dout("__ceph_remove_cap %p from %p\n", cap, &ci->vfs_inode);

+ /* remove from inode's cap rbtree, and clear auth cap */
+ rb_erase(&cap->ci_node, &ci->i_caps);
+ if (ci->i_auth_cap == cap)
+ ci->i_auth_cap = NULL;
+
/* remove from session list */
spin_lock(&session->s_cap_lock);
if (session->s_cap_iterator == cap) {
@@ -1120,11 +1125,6 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)

spin_unlock(&session->s_cap_lock);

- /* remove from inode list */
- rb_erase(&cap->ci_node, &ci->i_caps);
- if (ci->i_auth_cap == cap)
- ci->i_auth_cap = NULL;
-
if (removed)
ceph_put_cap(mdsc, cap);

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 4ca0b8ff9a72..d17a789fd856 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1553,36 +1553,37 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
{
int valid = 0;
struct dentry *parent;
- struct inode *dir;
+ struct inode *dir, *inode;

if (flags & LOOKUP_RCU) {
parent = READ_ONCE(dentry->d_parent);
dir = d_inode_rcu(parent);
if (!dir)
return -ECHILD;
+ inode = d_inode_rcu(dentry);
} else {
parent = dget_parent(dentry);
dir = d_inode(parent);
+ inode = d_inode(dentry);
}

dout("d_revalidate %p '%pd' inode %p offset %lld\n", dentry,
- dentry, d_inode(dentry), ceph_dentry(dentry)->offset);
+ dentry, inode, ceph_dentry(dentry)->offset);

/* always trust cached snapped dentries, snapdir dentry */
if (ceph_snap(dir) != CEPH_NOSNAP) {
dout("d_revalidate %p '%pd' inode %p is SNAPPED\n", dentry,
- dentry, d_inode(dentry));
+ dentry, inode);
valid = 1;
- } else if (d_really_is_positive(dentry) &&
- ceph_snap(d_inode(dentry)) == CEPH_SNAPDIR) {
+ } else if (inode && ceph_snap(inode) == CEPH_SNAPDIR) {
valid = 1;
} else {
valid = dentry_lease_is_valid(dentry, flags);
if (valid == -ECHILD)
return valid;
if (valid || dir_lease_is_valid(dir, dentry)) {
- if (d_really_is_positive(dentry))
- valid = ceph_is_any_caps(d_inode(dentry));
+ if (inode)
+ valid = ceph_is_any_caps(inode);
else
valid = 1;
}
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 685a03cc4b77..8273d86bf499 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -458,6 +458,9 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
err = ceph_security_init_secctx(dentry, mode, &as_ctx);
if (err < 0)
goto out_ctx;
+ } else if (!d_in_lookup(dentry)) {
+ /* If it's not being looked up, it's negative */
+ return -ENOENT;
}

/* do the open */
@@ -1931,10 +1934,18 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,
if (ceph_test_mount_opt(ceph_inode_to_client(src_inode), NOCOPYFROM))
return -EOPNOTSUPP;

+ /*
+ * Striped file layouts require that we copy partial objects, but the
+ * OSD copy-from operation only supports full-object copies. Limit
+ * this to non-striped file layouts for now.
+ */
if ((src_ci->i_layout.stripe_unit != dst_ci->i_layout.stripe_unit) ||
- (src_ci->i_layout.stripe_count != dst_ci->i_layout.stripe_count) ||
- (src_ci->i_layout.object_size != dst_ci->i_layout.object_size))
+ (src_ci->i_layout.stripe_count != 1) ||
+ (dst_ci->i_layout.stripe_count != 1) ||
+ (src_ci->i_layout.object_size != dst_ci->i_layout.object_size)) {
+ dout("Invalid src/dst files layout\n");
return -EOPNOTSUPP;
+ }

if (len < src_ci->i_layout.object_size)
return -EOPNOTSUPP; /* no remote copy will be done */
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 3b537e7038c7..1676a46822ad 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1432,6 +1432,7 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req)
dout(" final dn %p\n", dn);
} else if ((req->r_op == CEPH_MDS_OP_LOOKUPSNAP ||
req->r_op == CEPH_MDS_OP_MKSNAP) &&
+ test_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags) &&
!test_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags)) {
struct inode *dir = req->r_parent;

diff --git a/fs/cifs/smb2pdu.h b/fs/cifs/smb2pdu.h
index 747de9317659..a7f3eb12472f 100644
--- a/fs/cifs/smb2pdu.h
+++ b/fs/cifs/smb2pdu.h
@@ -836,6 +836,7 @@ struct create_durable_handle_reconnect_v2 {
struct create_context ccontext;
__u8 Name[8];
struct durable_reconnect_context_v2 dcontext;
+ __u8 Pad[4];
} __packed;

/* See MS-SMB2 2.2.13.2.5 */
diff --git a/fs/configfs/symlink.c b/fs/configfs/symlink.c
index 91eac6c55e07..f3881e4caedd 100644
--- a/fs/configfs/symlink.c
+++ b/fs/configfs/symlink.c
@@ -143,11 +143,42 @@ int configfs_symlink(struct inode *dir, struct dentry *dentry, const char *symna
!type->ct_item_ops->allow_link)
goto out_put;

+ /*
+ * This is really sick. What they wanted was a hybrid of
+ * link(2) and symlink(2) - they wanted the target resolved
+ * at syscall time (as link(2) would've done), be a directory
+ * (which link(2) would've refused to do) *AND* be a deep
+ * fucking magic, making the target busy from rmdir POV.
+ * symlink(2) is nothing of that sort, and the locking it
+ * gets matches the normal symlink(2) semantics. Without
+ * attempts to resolve the target (which might very well
+ * not even exist yet) done prior to locking the parent
+ * directory. This perversion, OTOH, needs to resolve
+ * the target, which would lead to obvious deadlocks if
+ * attempted with any directories locked.
+ *
+ * Unfortunately, that garbage is userland ABI and we should've
+ * said "no" back in 2005. Too late now, so we get to
+ * play very ugly games with locking.
+ *
+ * Try *ANYTHING* of that sort in new code, and you will
+ * really regret it. Just ask yourself - what could a BOFH
+ * do to me and do I want to find it out first-hand?
+ *
+ * AV, a thoroughly annoyed bastard.
+ */
+ inode_unlock(dir);
ret = get_target(symname, &path, &target_item, dentry->d_sb);
+ inode_lock(dir);
if (ret)
goto out_put;

- ret = type->ct_item_ops->allow_link(parent_item, target_item);
+ if (dentry->d_inode || d_unhashed(dentry))
+ ret = -EEXIST;
+ else
+ ret = inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ if (!ret)
+ ret = type->ct_item_ops->allow_link(parent_item, target_item);
if (!ret) {
mutex_lock(&configfs_symlink_mutex);
ret = create_link(parent_item, target_item, dentry);
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 542b02d170f8..57b23703c679 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -577,10 +577,13 @@ void wbc_attach_and_unlock_inode(struct writeback_control *wbc,
spin_unlock(&inode->i_lock);

/*
- * A dying wb indicates that the memcg-blkcg mapping has changed
- * and a new wb is already serving the memcg. Switch immediately.
+ * A dying wb indicates that either the blkcg associated with the
+ * memcg changed or the associated memcg is dying. In the first
+ * case, a replacement wb should already be available and we should
+ * refresh the wb immediately. In the second case, trying to
+ * refresh will keep failing.
*/
- if (unlikely(wb_dying(wbc->wb)))
+ if (unlikely(wb_dying(wbc->wb) && !css_is_dying(wbc->wb->memcg_css)))
inode_switch_wbs(inode, wbc->wb_id);
}
EXPORT_SYMBOL_GPL(wbc_attach_and_unlock_inode);
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index ad7a77101471..af549d70ec50 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -53,6 +53,16 @@ nfs4_is_valid_delegation(const struct nfs_delegation *delegation,
return false;
}

+struct nfs_delegation *nfs4_get_valid_delegation(const struct inode *inode)
+{
+ struct nfs_delegation *delegation;
+
+ delegation = rcu_dereference(NFS_I(inode)->delegation);
+ if (nfs4_is_valid_delegation(delegation, 0))
+ return delegation;
+ return NULL;
+}
+
static int
nfs4_do_check_delegation(struct inode *inode, fmode_t flags, bool mark)
{
diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h
index 9eb87ae4c982..8b14d441e699 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -68,6 +68,7 @@ int nfs4_lock_delegation_recall(struct file_lock *fl, struct nfs4_state *state,
bool nfs4_copy_delegation_stateid(struct inode *inode, fmode_t flags, nfs4_stateid *dst, const struct cred **cred);
bool nfs4_refresh_delegation_stateid(nfs4_stateid *dst, struct inode *inode);

+struct nfs_delegation *nfs4_get_valid_delegation(const struct inode *inode);
void nfs_mark_delegation_referenced(struct nfs_delegation *delegation);
int nfs4_have_delegation(struct inode *inode, fmode_t flags);
int nfs4_check_delegation(struct inode *inode, fmode_t flags);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index e1e7d2724b97..e600f28b1ddb 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1435,8 +1435,6 @@ static int can_open_delegated(struct nfs_delegation *delegation, fmode_t fmode,
return 0;
if ((delegation->type & fmode) != fmode)
return 0;
- if (test_bit(NFS_DELEGATION_RETURNING, &delegation->flags))
- return 0;
switch (claim) {
case NFS4_OPEN_CLAIM_NULL:
case NFS4_OPEN_CLAIM_FH:
@@ -1805,7 +1803,6 @@ static void nfs4_return_incompatible_delegation(struct inode *inode, fmode_t fmo
static struct nfs4_state *nfs4_try_open_cached(struct nfs4_opendata *opendata)
{
struct nfs4_state *state = opendata->state;
- struct nfs_inode *nfsi = NFS_I(state->inode);
struct nfs_delegation *delegation;
int open_mode = opendata->o_arg.open_flags;
fmode_t fmode = opendata->o_arg.fmode;
@@ -1822,7 +1819,7 @@ static struct nfs4_state *nfs4_try_open_cached(struct nfs4_opendata *opendata)
}
spin_unlock(&state->owner->so_lock);
rcu_read_lock();
- delegation = rcu_dereference(nfsi->delegation);
+ delegation = nfs4_get_valid_delegation(state->inode);
if (!can_open_delegated(delegation, fmode, claim)) {
rcu_read_unlock();
break;
@@ -2366,7 +2363,7 @@ static void nfs4_open_prepare(struct rpc_task *task, void *calldata)
data->o_arg.open_flags, claim))
goto out_no_action;
rcu_read_lock();
- delegation = rcu_dereference(NFS_I(data->state->inode)->delegation);
+ delegation = nfs4_get_valid_delegation(data->state->inode);
if (can_open_delegated(delegation, data->o_arg.fmode, claim))
goto unlock_no_action;
rcu_read_unlock();
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 4435df3e5adb..f6d790b7f2e2 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2092,54 +2092,90 @@ static int ocfs2_is_io_unaligned(struct inode *inode, size_t count, loff_t pos)
return 0;
}

-static int ocfs2_prepare_inode_for_refcount(struct inode *inode,
- struct file *file,
- loff_t pos, size_t count,
- int *meta_level)
+static int ocfs2_inode_lock_for_extent_tree(struct inode *inode,
+ struct buffer_head **di_bh,
+ int meta_level,
+ int overwrite_io,
+ int write_sem,
+ int wait)
{
- int ret;
- struct buffer_head *di_bh = NULL;
- u32 cpos = pos >> OCFS2_SB(inode->i_sb)->s_clustersize_bits;
- u32 clusters =
- ocfs2_clusters_for_bytes(inode->i_sb, pos + count) - cpos;
+ int ret = 0;

- ret = ocfs2_inode_lock(inode, &di_bh, 1);
- if (ret) {
- mlog_errno(ret);
+ if (wait)
+ ret = ocfs2_inode_lock(inode, NULL, meta_level);
+ else
+ ret = ocfs2_try_inode_lock(inode,
+ overwrite_io ? NULL : di_bh, meta_level);
+ if (ret < 0)
goto out;
+
+ if (wait) {
+ if (write_sem)
+ down_write(&OCFS2_I(inode)->ip_alloc_sem);
+ else
+ down_read(&OCFS2_I(inode)->ip_alloc_sem);
+ } else {
+ if (write_sem)
+ ret = down_write_trylock(&OCFS2_I(inode)->ip_alloc_sem);
+ else
+ ret = down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem);
+
+ if (!ret) {
+ ret = -EAGAIN;
+ goto out_unlock;
+ }
}

- *meta_level = 1;
+ return ret;

- ret = ocfs2_refcount_cow(inode, di_bh, cpos, clusters, UINT_MAX);
- if (ret)
- mlog_errno(ret);
+out_unlock:
+ brelse(*di_bh);
+ ocfs2_inode_unlock(inode, meta_level);
out:
- brelse(di_bh);
return ret;
}

+static void ocfs2_inode_unlock_for_extent_tree(struct inode *inode,
+ struct buffer_head **di_bh,
+ int meta_level,
+ int write_sem)
+{
+ if (write_sem)
+ up_write(&OCFS2_I(inode)->ip_alloc_sem);
+ else
+ up_read(&OCFS2_I(inode)->ip_alloc_sem);
+
+ brelse(*di_bh);
+ *di_bh = NULL;
+
+ if (meta_level >= 0)
+ ocfs2_inode_unlock(inode, meta_level);
+}
+
static int ocfs2_prepare_inode_for_write(struct file *file,
loff_t pos, size_t count, int wait)
{
int ret = 0, meta_level = 0, overwrite_io = 0;
+ int write_sem = 0;
struct dentry *dentry = file->f_path.dentry;
struct inode *inode = d_inode(dentry);
struct buffer_head *di_bh = NULL;
loff_t end;
+ u32 cpos;
+ u32 clusters;

/*
* We start with a read level meta lock and only jump to an ex
* if we need to make modifications here.
*/
for(;;) {
- if (wait)
- ret = ocfs2_inode_lock(inode, NULL, meta_level);
- else
- ret = ocfs2_try_inode_lock(inode,
- overwrite_io ? NULL : &di_bh, meta_level);
+ ret = ocfs2_inode_lock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ overwrite_io,
+ write_sem,
+ wait);
if (ret < 0) {
- meta_level = -1;
if (ret != -EAGAIN)
mlog_errno(ret);
goto out;
@@ -2151,15 +2187,8 @@ static int ocfs2_prepare_inode_for_write(struct file *file,
*/
if (!wait && !overwrite_io) {
overwrite_io = 1;
- if (!down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem)) {
- ret = -EAGAIN;
- goto out_unlock;
- }

ret = ocfs2_overwrite_io(inode, di_bh, pos, count);
- brelse(di_bh);
- di_bh = NULL;
- up_read(&OCFS2_I(inode)->ip_alloc_sem);
if (ret < 0) {
if (ret != -EAGAIN)
mlog_errno(ret);
@@ -2178,7 +2207,10 @@ static int ocfs2_prepare_inode_for_write(struct file *file,
* set inode->i_size at the end of a write. */
if (should_remove_suid(dentry)) {
if (meta_level == 0) {
- ocfs2_inode_unlock(inode, meta_level);
+ ocfs2_inode_unlock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ write_sem);
meta_level = 1;
continue;
}
@@ -2194,18 +2226,32 @@ static int ocfs2_prepare_inode_for_write(struct file *file,

ret = ocfs2_check_range_for_refcount(inode, pos, count);
if (ret == 1) {
- ocfs2_inode_unlock(inode, meta_level);
- meta_level = -1;
-
- ret = ocfs2_prepare_inode_for_refcount(inode,
- file,
- pos,
- count,
- &meta_level);
+ ocfs2_inode_unlock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ write_sem);
+ ret = ocfs2_inode_lock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ overwrite_io,
+ 1,
+ wait);
+ write_sem = 1;
+ if (ret < 0) {
+ if (ret != -EAGAIN)
+ mlog_errno(ret);
+ goto out;
+ }
+
+ cpos = pos >> OCFS2_SB(inode->i_sb)->s_clustersize_bits;
+ clusters =
+ ocfs2_clusters_for_bytes(inode->i_sb, pos + count) - cpos;
+ ret = ocfs2_refcount_cow(inode, di_bh, cpos, clusters, UINT_MAX);
}

if (ret < 0) {
- mlog_errno(ret);
+ if (ret != -EAGAIN)
+ mlog_errno(ret);
goto out_unlock;
}

@@ -2216,10 +2262,10 @@ static int ocfs2_prepare_inode_for_write(struct file *file,
trace_ocfs2_prepare_inode_for_write(OCFS2_I(inode)->ip_blkno,
pos, count, wait);

- brelse(di_bh);
-
- if (meta_level >= 0)
- ocfs2_inode_unlock(inode, meta_level);
+ ocfs2_inode_unlock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ write_sem);

out:
return ret;
diff --git a/include/asm-generic/vdso/vsyscall.h b/include/asm-generic/vdso/vsyscall.h
index e94b19782c92..ce4103208619 100644
--- a/include/asm-generic/vdso/vsyscall.h
+++ b/include/asm-generic/vdso/vsyscall.h
@@ -25,13 +25,6 @@ static __always_inline int __arch_get_clock_mode(struct timekeeper *tk)
}
#endif /* __arch_get_clock_mode */

-#ifndef __arch_use_vsyscall
-static __always_inline int __arch_use_vsyscall(struct vdso_data *vdata)
-{
- return 1;
-}
-#endif /* __arch_use_vsyscall */
-
#ifndef __arch_update_vsyscall
static __always_inline void __arch_update_vsyscall(struct vdso_data *vdata,
struct timekeeper *tk)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index fcb1386bb0d4..4643fcf55474 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -59,6 +59,11 @@ extern ssize_t cpu_show_l1tf(struct device *dev,
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_mds(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_tsx_async_abort(struct device *dev,
+ struct device_attribute *attr,
+ char *buf);
+extern ssize_t cpu_show_itlb_multihit(struct device *dev,
+ struct device_attribute *attr, char *buf);

extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
@@ -211,28 +216,7 @@ static inline int cpuhp_smt_enable(void) { return 0; }
static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; }
#endif

-/*
- * These are used for a global "mitigations=" cmdline option for toggling
- * optional CPU mitigations.
- */
-enum cpu_mitigations {
- CPU_MITIGATIONS_OFF,
- CPU_MITIGATIONS_AUTO,
- CPU_MITIGATIONS_AUTO_NOSMT,
-};
-
-extern enum cpu_mitigations cpu_mitigations;
-
-/* mitigations=off */
-static inline bool cpu_mitigations_off(void)
-{
- return cpu_mitigations == CPU_MITIGATIONS_OFF;
-}
-
-/* mitigations=auto,nosmt */
-static inline bool cpu_mitigations_auto_nosmt(void)
-{
- return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
-}
+extern bool cpu_mitigations_off(void);
+extern bool cpu_mitigations_auto_nosmt(void);

#endif /* _LINUX_CPU_H_ */
diff --git a/include/linux/efi.h b/include/linux/efi.h
index f87fabea4a85..b3a93f8e6e59 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1585,9 +1585,22 @@ char *efi_convert_cmdline(efi_system_table_t *sys_table_arg,
efi_status_t efi_get_memory_map(efi_system_table_t *sys_table_arg,
struct efi_boot_memmap *map);

+efi_status_t efi_low_alloc_above(efi_system_table_t *sys_table_arg,
+ unsigned long size, unsigned long align,
+ unsigned long *addr, unsigned long min);
+
+static inline
efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
unsigned long size, unsigned long align,
- unsigned long *addr);
+ unsigned long *addr)
+{
+ /*
+ * Don't allocate at 0x0. It will confuse code that
+ * checks pointers against NULL. Skip the first 8
+ * bytes so we start at a nice even number.
+ */
+ return efi_low_alloc_above(sys_table_arg, size, align, addr, 0x8);
+}

efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
unsigned long size, unsigned long align,
@@ -1598,7 +1611,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
unsigned long image_size,
unsigned long alloc_size,
unsigned long preferred_addr,
- unsigned long alignment);
+ unsigned long alignment,
+ unsigned long min_addr);

efi_status_t handle_cmdline_files(efi_system_table_t *sys_table_arg,
efi_loaded_image_t *image,
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 92c6e31fb008..38716f93825f 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1099,7 +1099,6 @@ static inline void bpf_get_prog_name(const struct bpf_prog *prog, char *sym)

#endif /* CONFIG_BPF_JIT */

-void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp);
void bpf_prog_kallsyms_del_all(struct bpf_prog *fp);

#define BPF_ANC BIT(15)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index fcb46b3374c6..52ed5f66e8f9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1090,6 +1090,7 @@ enum kvm_stat_kind {

struct kvm_stat_data {
int offset;
+ int mode;
struct kvm *kvm;
};

@@ -1097,6 +1098,7 @@ struct kvm_stats_debugfs_item {
const char *name;
int offset;
enum kvm_stat_kind kind;
+ int mode;
};
extern struct kvm_stats_debugfs_item debugfs_entries[];
extern struct dentry *kvm_debugfs_dir;
@@ -1380,4 +1382,10 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
}
#endif /* CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE */

+typedef int (*kvm_vm_thread_fn_t)(struct kvm *kvm, uintptr_t data);
+
+int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn,
+ uintptr_t data, const char *name,
+ struct task_struct **thread_ptr);
+
#endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index fe4552e1c40b..f17931c40dfb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -695,11 +695,6 @@ static inline void *kvcalloc(size_t n, size_t size, gfp_t flags)

extern void kvfree(const void *addr);

-static inline atomic_t *compound_mapcount_ptr(struct page *page)
-{
- return &page[1].compound_mapcount;
-}
-
static inline int compound_mapcount(struct page *page)
{
VM_BUG_ON_PAGE(!PageCompound(page), page);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6a7a1083b6fb..a1b50f12e648 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -221,6 +221,11 @@ struct page {
#endif
} _struct_page_alignment;

+static inline atomic_t *compound_mapcount_ptr(struct page *page)
+{
+ return &page[1].compound_mapcount;
+}
+
/*
* Used for sizing the vmemmap region on some architectures
*/
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index f91cb8898ff0..1bf83c8fcaa7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -622,12 +622,28 @@ static inline int PageTransCompound(struct page *page)
*
* Unlike PageTransCompound, this is safe to be called only while
* split_huge_pmd() cannot run from under us, like if protected by the
- * MMU notifier, otherwise it may result in page->_mapcount < 0 false
+ * MMU notifier, otherwise it may result in page->_mapcount check false
* positives.
+ *
+ * We have to treat page cache THP differently since every subpage of it
+ * would get _mapcount inc'ed once it is PMD mapped. But, it may be PTE
+ * mapped in the current process so comparing subpage's _mapcount to
+ * compound_mapcount to filter out PTE mapped case.
*/
static inline int PageTransCompoundMap(struct page *page)
{
- return PageTransCompound(page) && atomic_read(&page->_mapcount) < 0;
+ struct page *head;
+
+ if (!PageTransCompound(page))
+ return 0;
+
+ if (PageAnon(page))
+ return atomic_read(&page->_mapcount) < 0;
+
+ head = compound_head(page);
+ /* File THP is PMD mapped and not PTE mapped */
+ return atomic_read(&page->_mapcount) ==
+ atomic_read(compound_mapcount_ptr(head));
}

/*
diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index e4b3fb4bb77c..ce7055259877 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -139,6 +139,11 @@ static inline void sk_msg_apply_bytes(struct sk_psock *psock, u32 bytes)
}
}

+static inline u32 sk_msg_iter_dist(u32 start, u32 end)
+{
+ return end >= start ? end - start : end + (MAX_MSG_FRAGS - start);
+}
+
#define sk_msg_iter_var_prev(var) \
do { \
if (var == 0) \
@@ -198,9 +203,7 @@ static inline u32 sk_msg_elem_used(const struct sk_msg *msg)
if (sk_msg_full(msg))
return MAX_MSG_FRAGS;

- return msg->sg.end >= msg->sg.start ?
- msg->sg.end - msg->sg.start :
- msg->sg.end + (MAX_MSG_FRAGS - msg->sg.start);
+ return sk_msg_iter_dist(msg->sg.start, msg->sg.end);
}

static inline struct scatterlist *sk_msg_elem(struct sk_msg *msg, int which)
diff --git a/include/linux/sunrpc/bc_xprt.h b/include/linux/sunrpc/bc_xprt.h
index 87d27e13d885..d796058cdff2 100644
--- a/include/linux/sunrpc/bc_xprt.h
+++ b/include/linux/sunrpc/bc_xprt.h
@@ -64,6 +64,11 @@ static inline int xprt_setup_backchannel(struct rpc_xprt *xprt,
return 0;
}

+static inline void xprt_destroy_backchannel(struct rpc_xprt *xprt,
+ unsigned int max_reqs)
+{
+}
+
static inline bool svc_is_backchannel(const struct svc_rqst *rqstp)
{
return false;
diff --git a/include/net/bonding.h b/include/net/bonding.h
index f7fe45689142..be404b272d6b 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -159,7 +159,6 @@ struct slave {
unsigned long target_last_arp_rx[BOND_MAX_ARP_TARGETS];
s8 link; /* one of BOND_LINK_XXXX */
s8 link_new_state; /* one of BOND_LINK_XXXX */
- s8 new_link;
u8 backup:1, /* indicates backup slave. Value corresponds with
BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
inactive:1, /* indicates inactive slave */
@@ -239,6 +238,7 @@ struct bonding {
struct dentry *debug_dir;
#endif /* CONFIG_DEBUG_FS */
struct rtnl_link_stats64 bond_stats;
+ struct lock_class_key stats_lock_key;
};

#define bond_slave_get_rcu(dev) \
@@ -549,7 +549,7 @@ static inline void bond_propose_link_state(struct slave *slave, int state)

static inline void bond_commit_link_state(struct slave *slave, bool notify)
{
- if (slave->link == slave->link_new_state)
+ if (slave->link_new_state == BOND_LINK_NOCHANGE)
return;

slave->link = slave->link_new_state;
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 3759167f91f5..078887c8c586 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -889,6 +889,7 @@ struct netns_ipvs {
struct delayed_work defense_work; /* Work handler */
int drop_rate;
int drop_counter;
+ int old_secure_tcp;
atomic_t dropentry;
/* locks in ctl.c */
spinlock_t dropentry_lock; /* drop entry handling */
diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 50a67bd6a434..b8452cc0e059 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -439,8 +439,8 @@ static inline int neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
{
unsigned long now = jiffies;

- if (neigh->used != now)
- neigh->used = now;
+ if (READ_ONCE(neigh->used) != now)
+ WRITE_ONCE(neigh->used, now);
if (!(neigh->nud_state&(NUD_CONNECTED|NUD_DELAY|NUD_PROBE)))
return __neigh_event_send(neigh, skb);
return 0;
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 7f7a4d9137e5..f9a8bcb37da0 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -801,7 +801,8 @@ struct nft_expr_ops {
*/
struct nft_expr {
const struct nft_expr_ops *ops;
- unsigned char data[];
+ unsigned char data[]
+ __attribute__((aligned(__alignof__(u64))));
};

static inline void *nft_expr_priv(const struct nft_expr *expr)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 58b1fbc884a7..50ea27d0f7c5 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -13,6 +13,7 @@
#include <linux/refcount.h>
#include <linux/workqueue.h>
#include <linux/mutex.h>
+#include <linux/hashtable.h>
#include <net/gen_stats.h>
#include <net/rtnetlink.h>
#include <net/flow_offload.h>
@@ -359,6 +360,7 @@ struct tcf_proto {
bool deleting;
refcount_t refcnt;
struct rcu_head rcu;
+ struct hlist_node destroy_ht_node;
};

struct qdisc_skb_cb {
@@ -409,6 +411,8 @@ struct tcf_block {
struct list_head filter_chain_list;
} chain0;
struct rcu_head rcu;
+ DECLARE_HASHTABLE(proto_destroy_ht, 7);
+ struct mutex proto_destroy_lock; /* Lock for proto_destroy hashtable. */
};

#ifdef CONFIG_PROVE_LOCKING
diff --git a/include/net/sock.h b/include/net/sock.h
index b03f96370f8e..59220da079e8 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2331,7 +2331,7 @@ static inline ktime_t sock_read_timestamp(struct sock *sk)

return kt;
#else
- return sk->sk_stamp;
+ return READ_ONCE(sk->sk_stamp);
#endif
}

@@ -2342,7 +2342,7 @@ static inline void sock_write_timestamp(struct sock *sk, ktime_t kt)
sk->sk_stamp = kt;
write_sequnlock(&sk->sk_stamp_seq);
#else
- sk->sk_stamp = kt;
+ WRITE_ONCE(sk->sk_stamp, kt);
#endif
}

diff --git a/include/net/tls.h b/include/net/tls.h
index 41b2d41bb1b8..bd1ef1a915e9 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -40,6 +40,7 @@
#include <linux/socket.h>
#include <linux/tcp.h>
#include <linux/skmsg.h>
+#include <linux/mutex.h>
#include <linux/netdevice.h>

#include <net/tcp.h>
@@ -268,6 +269,10 @@ struct tls_context {

bool in_tcp_sendpages;
bool pending_open_record_frags;
+
+ struct mutex tx_lock; /* protects partially_sent_* fields and
+ * per-type TX fields
+ */
unsigned long flags;

/* cache cold stuff */
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 4f225175cb91..77d8df451805 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -327,7 +327,7 @@ struct ib_tm_caps {

struct ib_cq_init_attr {
unsigned int cqe;
- int comp_vector;
+ u32 comp_vector;
u32 flags;
};

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 66088a9e9b9e..ef0e1e3e66f4 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -502,7 +502,7 @@ int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
return WARN_ON_ONCE(bpf_adj_branches(prog, off, off + cnt, off, false));
}

-void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp)
+static void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp)
{
int i;

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 272071e9112f..aac966b32c42 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1316,24 +1316,32 @@ static void __bpf_prog_put_rcu(struct rcu_head *rcu)
{
struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);

+ kvfree(aux->func_info);
free_used_maps(aux);
bpf_prog_uncharge_memlock(aux->prog);
security_bpf_prog_free(aux);
bpf_prog_free(aux->prog);
}

+static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred)
+{
+ bpf_prog_kallsyms_del_all(prog);
+ btf_put(prog->aux->btf);
+ bpf_prog_free_linfo(prog);
+
+ if (deferred)
+ call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
+ else
+ __bpf_prog_put_rcu(&prog->aux->rcu);
+}
+
static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
{
if (atomic_dec_and_test(&prog->aux->refcnt)) {
perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0);
/* bpf_prog_free_id() must be called first */
bpf_prog_free_id(prog, do_idr_lock);
- bpf_prog_kallsyms_del_all(prog);
- btf_put(prog->aux->btf);
- kvfree(prog->aux->func_info);
- bpf_prog_free_linfo(prog);
-
- call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
+ __bpf_prog_put_noref(prog, true);
}
}

@@ -1730,11 +1738,12 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
return err;

free_used_maps:
- bpf_prog_free_linfo(prog);
- kvfree(prog->aux->func_info);
- btf_put(prog->aux->btf);
- bpf_prog_kallsyms_del_subprogs(prog);
- free_used_maps(prog->aux);
+ /* In case we have subprogs, we need to wait for a grace
+ * period before we can tear down JIT memory since symbols
+ * are already exposed under kallsyms.
+ */
+ __bpf_prog_put_noref(prog, prog->aux->func_cnt);
+ return err;
free_prog:
bpf_prog_uncharge_memlock(prog);
free_prog_sec:
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 5aa37531ce76..a8122c405603 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -786,7 +786,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
continue;

- if (is_sched_load_balance(cp))
+ if (is_sched_load_balance(cp) &&
+ !cpumask_empty(cp->effective_cpus))
csa[csn++] = cp;

/* skip @cp's subtree if not a partition root */
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e84c0873559e..df186823dda6 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2339,7 +2339,18 @@ void __init boot_cpu_hotplug_init(void)
this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
}

-enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;
+/*
+ * These are used for a global "mitigations=" cmdline option for toggling
+ * optional CPU mitigations.
+ */
+enum cpu_mitigations {
+ CPU_MITIGATIONS_OFF,
+ CPU_MITIGATIONS_AUTO,
+ CPU_MITIGATIONS_AUTO_NOSMT,
+};
+
+static enum cpu_mitigations cpu_mitigations __ro_after_init =
+ CPU_MITIGATIONS_AUTO;

static int __init mitigations_parse_cmdline(char *arg)
{
@@ -2356,3 +2367,17 @@ static int __init mitigations_parse_cmdline(char *arg)
return 0;
}
early_param("mitigations", mitigations_parse_cmdline);
+
+/* mitigations=off */
+bool cpu_mitigations_off(void)
+{
+ return cpu_mitigations == CPU_MITIGATIONS_OFF;
+}
+EXPORT_SYMBOL_GPL(cpu_mitigations_off);
+
+/* mitigations=auto,nosmt */
+bool cpu_mitigations_auto_nosmt(void)
+{
+ return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
+}
+EXPORT_SYMBOL_GPL(cpu_mitigations_auto_nosmt);
diff --git a/kernel/fork.c b/kernel/fork.c
index 3647097e6783..8bbd39585301 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2586,7 +2586,35 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
return 0;
}

-static bool clone3_args_valid(const struct kernel_clone_args *kargs)
+/**
+ * clone3_stack_valid - check and prepare stack
+ * @kargs: kernel clone args
+ *
+ * Verify that the stack arguments userspace gave us are sane.
+ * In addition, set the stack direction for userspace since it's easy for us to
+ * determine.
+ */
+static inline bool clone3_stack_valid(struct kernel_clone_args *kargs)
+{
+ if (kargs->stack == 0) {
+ if (kargs->stack_size > 0)
+ return false;
+ } else {
+ if (kargs->stack_size == 0)
+ return false;
+
+ if (!access_ok((void __user *)kargs->stack, kargs->stack_size))
+ return false;
+
+#if !defined(CONFIG_STACK_GROWSUP) && !defined(CONFIG_IA64)
+ kargs->stack += kargs->stack_size;
+#endif
+ }
+
+ return true;
+}
+
+static bool clone3_args_valid(struct kernel_clone_args *kargs)
{
/*
* All lower bits of the flag word are taken.
@@ -2606,6 +2634,9 @@ static bool clone3_args_valid(const struct kernel_clone_args *kargs)
kargs->exit_signal)
return false;

+ if (!clone3_stack_valid(kargs))
+ return false;
+
return true;
}

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index f751ce0b783e..93a8749763ea 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1927,7 +1927,7 @@ static struct sched_domain_topology_level
static int
build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *attr)
{
- enum s_alloc alloc_state;
+ enum s_alloc alloc_state = sa_none;
struct sched_domain *sd;
struct s_data d;
struct rq *rq = NULL;
@@ -1935,6 +1935,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
struct sched_domain_topology_level *tl_asym;
bool has_asym = false;

+ if (WARN_ON(cpumask_empty(cpu_map)))
+ goto error;
+
alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
if (alloc_state != sa_rootdomain)
goto error;
@@ -2005,7 +2008,7 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
rcu_read_unlock();

if (has_asym)
- static_branch_enable_cpuslocked(&sched_asym_cpucapacity);
+ static_branch_inc_cpuslocked(&sched_asym_cpucapacity);

if (rq && sched_debug_enabled) {
pr_info("root domain span: %*pbl (max cpu_capacity = %lu)\n",
@@ -2100,8 +2103,12 @@ int sched_init_domains(const struct cpumask *cpu_map)
*/
static void detach_destroy_domains(const struct cpumask *cpu_map)
{
+ unsigned int cpu = cpumask_any(cpu_map);
int i;

+ if (rcu_access_pointer(per_cpu(sd_asym_cpucapacity, cpu)))
+ static_branch_dec_cpuslocked(&sched_asym_cpucapacity);
+
rcu_read_lock();
for_each_cpu(i, cpu_map)
cpu_attach_domain(NULL, &def_root_domain, i);
diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c
index 4bc37ac3bb05..5ee0f7709410 100644
--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -110,8 +110,7 @@ void update_vsyscall(struct timekeeper *tk)
nsec = nsec + tk->wall_to_monotonic.tv_nsec;
vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec);

- if (__arch_use_vsyscall(vdata))
- update_vdso_data(vdata, tk);
+ update_vdso_data(vdata, tk);

__arch_update_vsyscall(vdata, tk);

@@ -124,10 +123,8 @@ void update_vsyscall_tz(void)
{
struct vdso_data *vdata = __arch_get_k_vdso_data();

- if (__arch_use_vsyscall(vdata)) {
- vdata[CS_HRES_COARSE].tz_minuteswest = sys_tz.tz_minuteswest;
- vdata[CS_HRES_COARSE].tz_dsttime = sys_tz.tz_dsttime;
- }
+ vdata[CS_HRES_COARSE].tz_minuteswest = sys_tz.tz_minuteswest;
+ vdata[CS_HRES_COARSE].tz_dsttime = sys_tz.tz_dsttime;

__arch_sync_vdso_data(vdata);
}
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 5cff72f18c4a..33ffbf308853 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -106,7 +106,12 @@ asmlinkage __visible void dump_stack(void)
was_locked = 1;
} else {
local_irq_restore(flags);
- cpu_relax();
+ /*
+ * Wait for the lock to release before jumping to
+ * atomic_cmpxchg() in order to mitigate the thundering herd
+ * problem.
+ */
+ do { cpu_relax(); } while (atomic_read(&dump_lock) != -1);
goto retry;
}

diff --git a/mm/filemap.c b/mm/filemap.c
index d0cf700bf201..d9572593e5c7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
.range_end = end,
};

- if (!mapping_cap_writeback_dirty(mapping))
+ if (!mapping_cap_writeback_dirty(mapping) ||
+ !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
return 0;

wbc_attach_fdatawrite_inode(&wbc, mapping->host);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index eaaa21b23215..5ce6d8728e2b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1016,12 +1016,13 @@ static void collapse_huge_page(struct mm_struct *mm,

anon_vma_lock_write(vma->anon_vma);

- pte = pte_offset_map(pmd, address);
- pte_ptl = pte_lockptr(mm, pmd);
-
mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
address, address + HPAGE_PMD_SIZE);
mmu_notifier_invalidate_range_start(&range);
+
+ pte = pte_offset_map(pmd, address);
+ pte_ptl = pte_lockptr(mm, pmd);
+
pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
/*
* After this gup_fast can't run anymore. This also removes
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e18108b2b786..89fd0829ebd0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -486,7 +486,7 @@ ino_t page_cgroup_ino(struct page *page)
unsigned long ino = 0;

rcu_read_lock();
- if (PageHead(page) && PageSlab(page))
+ if (PageSlab(page) && !PageTail(page))
memcg = memcg_from_slab_page(page);
else
memcg = READ_ONCE(page->mem_cgroup);
@@ -2407,6 +2407,15 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
goto retry;
}

+ /*
+ * Memcg doesn't have a dedicated reserve for atomic
+ * allocations. But like the global atomic pool, we need to
+ * put the burden of reclaim on regular allocation requests
+ * and let these go through as privileged allocations.
+ */
+ if (gfp_mask & __GFP_ATOMIC)
+ goto force;
+
/*
* Unlike in global OOM situations, memcg is not in a physical
* memory shortage. Allow dying and OOM-killed tasks to
@@ -4763,12 +4772,6 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
{
int node;

- /*
- * Flush percpu vmstats and vmevents to guarantee the value correctness
- * on parent's and all ancestor levels.
- */
- memcg_flush_percpu_vmstats(memcg, false);
- memcg_flush_percpu_vmevents(memcg);
for_each_node(node)
free_mem_cgroup_per_node_info(memcg, node);
free_percpu(memcg->vmstats_percpu);
@@ -4779,6 +4782,12 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
static void mem_cgroup_free(struct mem_cgroup *memcg)
{
memcg_wb_domain_exit(memcg);
+ /*
+ * Flush percpu vmstats and vmevents to guarantee the value correctness
+ * on parent's and all ancestor levels.
+ */
+ memcg_flush_percpu_vmstats(memcg, false);
+ memcg_flush_percpu_vmevents(memcg);
__mem_cgroup_free(memcg);
}

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bfa5815e59f8..702a1d02fc62 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1946,6 +1946,14 @@ void __init page_alloc_init_late(void)
/* Block until all are initialised */
wait_for_completion(&pgdat_init_all_done_comp);

+ /*
+ * The number of managed pages has changed due to the initialisation
+ * so the pcpu batch and high limits needs to be updated or the limits
+ * will be artificially small.
+ */
+ for_each_populated_zone(zone)
+ zone_pcp_update(zone);
+
/*
* We initialized the rest of the deferred pages. Permanently disable
* on-demand struct page initialization.
@@ -8479,7 +8487,6 @@ void free_contig_range(unsigned long pfn, unsigned int nr_pages)
WARN(count != 0, "%d pages are still in use!\n", count);
}

-#ifdef CONFIG_MEMORY_HOTPLUG
/*
* The zone indicated has a new number of managed_pages; batch sizes and percpu
* page high values need to be recalulated.
@@ -8493,7 +8500,6 @@ void __meminit zone_pcp_update(struct zone *zone)
per_cpu_ptr(zone->pageset, cpu));
mutex_unlock(&pcp_batch_high_lock);
}
-#endif

void zone_pcp_reset(struct zone *zone)
{
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..8d830e722398 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -259,8 +259,8 @@ static inline struct kmem_cache *memcg_root_cache(struct kmem_cache *s)
* Expects a pointer to a slab page. Please note, that PageSlab() check
* isn't sufficient, as it returns true also for tail compound slab pages,
* which do not have slab_cache pointer set.
- * So this function assumes that the page can pass PageHead() and PageSlab()
- * checks.
+ * So this function assumes that the page can pass PageSlab() && !PageTail()
+ * check.
*
* The kmem_cache can be reparented asynchronously. The caller must ensure
* the memcg lifetime, e.g. by taking rcu_read_lock() or cgroup_mutex.
diff --git a/mm/vmstat.c b/mm/vmstat.c
index fd7e16ca6996..3c85f760bdd0 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1970,7 +1970,7 @@ void __init init_mm_internals(void)
#endif
#ifdef CONFIG_PROC_FS
proc_create_seq("buddyinfo", 0444, NULL, &fragmentation_op);
- proc_create_seq("pagetypeinfo", 0444, NULL, &pagetypeinfo_op);
+ proc_create_seq("pagetypeinfo", 0400, NULL, &pagetypeinfo_op);
proc_create_seq("vmstat", 0444, NULL, &vmstat_op);
proc_create_seq("zoneinfo", 0444, NULL, &zoneinfo_op);
#endif
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index f93785e5833c..74cfb8b5ab33 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -88,11 +88,16 @@ static int bpf_lwt_input_reroute(struct sk_buff *skb)
int err = -EINVAL;

if (skb->protocol == htons(ETH_P_IP)) {
+ struct net_device *dev = skb_dst(skb)->dev;
struct iphdr *iph = ip_hdr(skb);

+ dev_hold(dev);
+ skb_dst_drop(skb);
err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
- iph->tos, skb_dst(skb)->dev);
+ iph->tos, dev);
+ dev_put(dev);
} else if (skb->protocol == htons(ETH_P_IPV6)) {
+ skb_dst_drop(skb);
err = ipv6_stub->ipv6_route_input(skb);
} else {
err = -EAFNOSUPPORT;
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 6832eeb4b785..c10e3e56006e 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -271,18 +271,28 @@ void sk_msg_trim(struct sock *sk, struct sk_msg *msg, int len)

msg->sg.data[i].length -= trim;
sk_mem_uncharge(sk, trim);
+ /* Adjust copybreak if it falls into the trimmed part of last buf */
+ if (msg->sg.curr == i && msg->sg.copybreak > msg->sg.data[i].length)
+ msg->sg.copybreak = msg->sg.data[i].length;
out:
- /* If we trim data before curr pointer update copybreak and current
- * so that any future copy operations start at new copy location.
+ sk_msg_iter_var_next(i);
+ msg->sg.end = i;
+
+ /* If we trim data a full sg elem before curr pointer update
+ * copybreak and current so that any future copy operations
+ * start at new copy location.
* However trimed data that has not yet been used in a copy op
* does not require an update.
*/
- if (msg->sg.curr >= i) {
+ if (!msg->sg.size) {
+ msg->sg.curr = msg->sg.start;
+ msg->sg.copybreak = 0;
+ } else if (sk_msg_iter_dist(msg->sg.start, msg->sg.curr) >=
+ sk_msg_iter_dist(msg->sg.start, msg->sg.end)) {
+ sk_msg_iter_var_prev(i);
msg->sg.curr = i;
msg->sg.copybreak = msg->sg.data[i].length;
}
- sk_msg_iter_var_next(i);
- msg->sg.end = i;
}
EXPORT_SYMBOL_GPL(sk_msg_trim);

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 0913a090b2bf..f1888c683426 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1814,8 +1814,8 @@ int fib_sync_down_addr(struct net_device *dev, __be32 local)
int ret = 0;
unsigned int hash = fib_laddr_hashfn(local);
struct hlist_head *head = &fib_info_laddrhash[hash];
+ int tb_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN;
struct net *net = dev_net(dev);
- int tb_id = l3mdev_fib_table(dev);
struct fib_info *fi;

if (!fib_info_laddrhash || local == 0)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 546088e50815..2b25a0de0364 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -621,6 +621,7 @@ static void rt6_probe(struct fib6_nh *fib6_nh)
{
struct __rt6_probe_work *work = NULL;
const struct in6_addr *nh_gw;
+ unsigned long last_probe;
struct neighbour *neigh;
struct net_device *dev;
struct inet6_dev *idev;
@@ -639,6 +640,7 @@ static void rt6_probe(struct fib6_nh *fib6_nh)
nh_gw = &fib6_nh->fib_nh_gw6;
dev = fib6_nh->fib_nh_dev;
rcu_read_lock_bh();
+ last_probe = READ_ONCE(fib6_nh->last_probe);
idev = __in6_dev_get(dev);
neigh = __ipv6_neigh_lookup_noref(dev, nh_gw);
if (neigh) {
@@ -654,13 +656,15 @@ static void rt6_probe(struct fib6_nh *fib6_nh)
__neigh_set_probe_once(neigh);
}
write_unlock(&neigh->lock);
- } else if (time_after(jiffies, fib6_nh->last_probe +
+ } else if (time_after(jiffies, last_probe +
idev->cnf.rtr_probe_interval)) {
work = kmalloc(sizeof(*work), GFP_ATOMIC);
}

- if (work) {
- fib6_nh->last_probe = jiffies;
+ if (!work || cmpxchg(&fib6_nh->last_probe,
+ last_probe, jiffies) != last_probe) {
+ kfree(work);
+ } else {
INIT_WORK(&work->work, rt6_probe_deferred);
work->target = *nh_gw;
dev_hold(dev);
@@ -3385,6 +3389,9 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
int err;

fib6_nh->fib_nh_family = AF_INET6;
+#ifdef CONFIG_IPV6_ROUTER_PREF
+ fib6_nh->last_probe = jiffies;
+#endif

err = -ENODEV;
if (cfg->fc_ifindex) {
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index e64d5f9a89dd..e7288eab7512 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -2069,8 +2069,9 @@ ip_set_sockfn_get(struct sock *sk, int optval, void __user *user, int *len)
}

req_version->version = IPSET_PROTOCOL;
- ret = copy_to_user(user, req_version,
- sizeof(struct ip_set_req_version));
+ if (copy_to_user(user, req_version,
+ sizeof(struct ip_set_req_version)))
+ ret = -EFAULT;
goto done;
}
case IP_SET_OP_GET_BYNAME: {
@@ -2129,7 +2130,8 @@ ip_set_sockfn_get(struct sock *sk, int optval, void __user *user, int *len)
} /* end of switch(op) */

copy:
- ret = copy_to_user(user, data, copylen);
+ if (copy_to_user(user, data, copylen))
+ ret = -EFAULT;

done:
vfree(data);
diff --git a/net/netfilter/ipset/ip_set_hash_ipmac.c b/net/netfilter/ipset/ip_set_hash_ipmac.c
index 24d8f4df4230..4ce563eb927d 100644
--- a/net/netfilter/ipset/ip_set_hash_ipmac.c
+++ b/net/netfilter/ipset/ip_set_hash_ipmac.c
@@ -209,7 +209,7 @@ hash_ipmac6_kadt(struct ip_set *set, const struct sk_buff *skb,
(skb_mac_header(skb) + ETH_HLEN) > skb->data)
return -EINVAL;

- if (opt->flags & IPSET_DIM_ONE_SRC)
+ if (opt->flags & IPSET_DIM_TWO_SRC)
ether_addr_copy(e.ether, eth_hdr(skb)->h_source);
else
ether_addr_copy(e.ether, eth_hdr(skb)->h_dest);
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 4515056ef1c2..f9b16f2b2219 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -193,21 +193,29 @@ struct ip_vs_app *register_ip_vs_app(struct netns_ipvs *ipvs, struct ip_vs_app *

mutex_lock(&__ip_vs_app_mutex);

+ /* increase the module use count */
+ if (!ip_vs_use_count_inc()) {
+ err = -ENOENT;
+ goto out_unlock;
+ }
+
list_for_each_entry(a, &ipvs->app_list, a_list) {
if (!strcmp(app->name, a->name)) {
err = -EEXIST;
+ /* decrease the module use count */
+ ip_vs_use_count_dec();
goto out_unlock;
}
}
a = kmemdup(app, sizeof(*app), GFP_KERNEL);
if (!a) {
err = -ENOMEM;
+ /* decrease the module use count */
+ ip_vs_use_count_dec();
goto out_unlock;
}
INIT_LIST_HEAD(&a->incs_list);
list_add(&a->a_list, &ipvs->app_list);
- /* increase the module use count */
- ip_vs_use_count_inc();

out_unlock:
mutex_unlock(&__ip_vs_app_mutex);
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 060565e7d227..e29b00f514a0 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -93,7 +93,6 @@ static bool __ip_vs_addr_is_local_v6(struct net *net,
static void update_defense_level(struct netns_ipvs *ipvs)
{
struct sysinfo i;
- static int old_secure_tcp = 0;
int availmem;
int nomem;
int to_change = -1;
@@ -174,35 +173,35 @@ static void update_defense_level(struct netns_ipvs *ipvs)
spin_lock(&ipvs->securetcp_lock);
switch (ipvs->sysctl_secure_tcp) {
case 0:
- if (old_secure_tcp >= 2)
+ if (ipvs->old_secure_tcp >= 2)
to_change = 0;
break;
case 1:
if (nomem) {
- if (old_secure_tcp < 2)
+ if (ipvs->old_secure_tcp < 2)
to_change = 1;
ipvs->sysctl_secure_tcp = 2;
} else {
- if (old_secure_tcp >= 2)
+ if (ipvs->old_secure_tcp >= 2)
to_change = 0;
}
break;
case 2:
if (nomem) {
- if (old_secure_tcp < 2)
+ if (ipvs->old_secure_tcp < 2)
to_change = 1;
} else {
- if (old_secure_tcp >= 2)
+ if (ipvs->old_secure_tcp >= 2)
to_change = 0;
ipvs->sysctl_secure_tcp = 1;
}
break;
case 3:
- if (old_secure_tcp < 2)
+ if (ipvs->old_secure_tcp < 2)
to_change = 1;
break;
}
- old_secure_tcp = ipvs->sysctl_secure_tcp;
+ ipvs->old_secure_tcp = ipvs->sysctl_secure_tcp;
if (to_change >= 0)
ip_vs_protocol_timeout_change(ipvs,
ipvs->sysctl_secure_tcp > 1);
@@ -1275,7 +1274,8 @@ ip_vs_add_service(struct netns_ipvs *ipvs, struct ip_vs_service_user_kern *u,
struct ip_vs_service *svc = NULL;

/* increase the module use count */
- ip_vs_use_count_inc();
+ if (!ip_vs_use_count_inc())
+ return -ENOPROTOOPT;

/* Lookup the scheduler by 'u->sched_name' */
if (strcmp(u->sched_name, "none")) {
@@ -2434,9 +2434,6 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
if (copy_from_user(arg, user, len) != 0)
return -EFAULT;

- /* increase the module use count */
- ip_vs_use_count_inc();
-
/* Handle daemons since they have another lock */
if (cmd == IP_VS_SO_SET_STARTDAEMON ||
cmd == IP_VS_SO_SET_STOPDAEMON) {
@@ -2449,13 +2446,13 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
ret = -EINVAL;
if (strscpy(cfg.mcast_ifn, dm->mcast_ifn,
sizeof(cfg.mcast_ifn)) <= 0)
- goto out_dec;
+ return ret;
cfg.syncid = dm->syncid;
ret = start_sync_thread(ipvs, &cfg, dm->state);
} else {
ret = stop_sync_thread(ipvs, dm->state);
}
- goto out_dec;
+ return ret;
}

mutex_lock(&__ip_vs_mutex);
@@ -2550,10 +2547,6 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)

out_unlock:
mutex_unlock(&__ip_vs_mutex);
- out_dec:
- /* decrease the module use count */
- ip_vs_use_count_dec();
-
return ret;
}

diff --git a/net/netfilter/ipvs/ip_vs_pe.c b/net/netfilter/ipvs/ip_vs_pe.c
index 8e104dff7abc..166c669f0763 100644
--- a/net/netfilter/ipvs/ip_vs_pe.c
+++ b/net/netfilter/ipvs/ip_vs_pe.c
@@ -68,7 +68,8 @@ int register_ip_vs_pe(struct ip_vs_pe *pe)
struct ip_vs_pe *tmp;

/* increase the module use count */
- ip_vs_use_count_inc();
+ if (!ip_vs_use_count_inc())
+ return -ENOENT;

mutex_lock(&ip_vs_pe_mutex);
/* Make sure that the pe with this name doesn't exist
diff --git a/net/netfilter/ipvs/ip_vs_sched.c b/net/netfilter/ipvs/ip_vs_sched.c
index 2f9d5cd5daee..d4903723be7e 100644
--- a/net/netfilter/ipvs/ip_vs_sched.c
+++ b/net/netfilter/ipvs/ip_vs_sched.c
@@ -179,7 +179,8 @@ int register_ip_vs_scheduler(struct ip_vs_scheduler *scheduler)
}

/* increase the module use count */
- ip_vs_use_count_inc();
+ if (!ip_vs_use_count_inc())
+ return -ENOENT;

mutex_lock(&ip_vs_sched_mutex);

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index a4a78c4b06de..8dc892a9dc91 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1762,6 +1762,10 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c,
IP_VS_DBG(7, "Each ip_vs_sync_conn entry needs %zd bytes\n",
sizeof(struct ip_vs_sync_conn_v0));

+ /* increase the module use count */
+ if (!ip_vs_use_count_inc())
+ return -ENOPROTOOPT;
+
/* Do not hold one mutex and then to block on another */
for (;;) {
rtnl_lock();
@@ -1892,9 +1896,6 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c,
mutex_unlock(&ipvs->sync_mutex);
rtnl_unlock();

- /* increase the module use count */
- ip_vs_use_count_inc();
-
return 0;

out:
@@ -1924,11 +1925,17 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c,
}
kfree(ti);
}
+
+ /* decrease the module use count */
+ ip_vs_use_count_dec();
return result;

out_early:
mutex_unlock(&ipvs->sync_mutex);
rtnl_unlock();
+
+ /* decrease the module use count */
+ ip_vs_use_count_dec();
return result;
}

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index a0b4bf654de2..4c2f8959de58 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -201,6 +201,8 @@ int flow_offload_add(struct nf_flowtable *flow_table, struct flow_offload *flow)
{
int err;

+ flow->timeout = (u32)jiffies + NF_FLOW_TIMEOUT;
+
err = rhashtable_insert_fast(&flow_table->rhashtable,
&flow->tuplehash[0].node,
nf_flow_offload_rhash_params);
@@ -217,7 +219,6 @@ int flow_offload_add(struct nf_flowtable *flow_table, struct flow_offload *flow)
return err;
}

- flow->timeout = (u32)jiffies + NF_FLOW_TIMEOUT;
return 0;
}
EXPORT_SYMBOL_GPL(flow_offload_add);
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 22a80eb60222..5cb2d8908d2a 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -161,13 +161,21 @@ static int nft_payload_offload_ll(struct nft_offload_ctx *ctx,

switch (priv->offset) {
case offsetof(struct ethhdr, h_source):
+ if (priv->len != ETH_ALEN)
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_ETH_ADDRS, eth_addrs,
src, ETH_ALEN, reg);
break;
case offsetof(struct ethhdr, h_dest):
+ if (priv->len != ETH_ALEN)
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_ETH_ADDRS, eth_addrs,
dst, ETH_ALEN, reg);
break;
+ default:
+ return -EOPNOTSUPP;
}

return 0;
@@ -181,14 +189,23 @@ static int nft_payload_offload_ip(struct nft_offload_ctx *ctx,

switch (priv->offset) {
case offsetof(struct iphdr, saddr):
+ if (priv->len != sizeof(struct in_addr))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_IPV4_ADDRS, ipv4, src,
sizeof(struct in_addr), reg);
break;
case offsetof(struct iphdr, daddr):
+ if (priv->len != sizeof(struct in_addr))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_IPV4_ADDRS, ipv4, dst,
sizeof(struct in_addr), reg);
break;
case offsetof(struct iphdr, protocol):
+ if (priv->len != sizeof(__u8))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_BASIC, basic, ip_proto,
sizeof(__u8), reg);
nft_offload_set_dependency(ctx, NFT_OFFLOAD_DEP_TRANSPORT);
@@ -208,14 +225,23 @@ static int nft_payload_offload_ip6(struct nft_offload_ctx *ctx,

switch (priv->offset) {
case offsetof(struct ipv6hdr, saddr):
+ if (priv->len != sizeof(struct in6_addr))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_IPV6_ADDRS, ipv6, src,
sizeof(struct in6_addr), reg);
break;
case offsetof(struct ipv6hdr, daddr):
+ if (priv->len != sizeof(struct in6_addr))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_IPV6_ADDRS, ipv6, dst,
sizeof(struct in6_addr), reg);
break;
case offsetof(struct ipv6hdr, nexthdr):
+ if (priv->len != sizeof(__u8))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_BASIC, basic, ip_proto,
sizeof(__u8), reg);
nft_offload_set_dependency(ctx, NFT_OFFLOAD_DEP_TRANSPORT);
@@ -255,10 +281,16 @@ static int nft_payload_offload_tcp(struct nft_offload_ctx *ctx,

switch (priv->offset) {
case offsetof(struct tcphdr, source):
+ if (priv->len != sizeof(__be16))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_PORTS, tp, src,
sizeof(__be16), reg);
break;
case offsetof(struct tcphdr, dest):
+ if (priv->len != sizeof(__be16))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_PORTS, tp, dst,
sizeof(__be16), reg);
break;
@@ -277,10 +309,16 @@ static int nft_payload_offload_udp(struct nft_offload_ctx *ctx,

switch (priv->offset) {
case offsetof(struct udphdr, source):
+ if (priv->len != sizeof(__be16))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_PORTS, tp, src,
sizeof(__be16), reg);
break;
case offsetof(struct udphdr, dest):
+ if (priv->len != sizeof(__be16))
+ return -EOPNOTSUPP;
+
NFT_OFFLOAD_MATCH(FLOW_DISSECTOR_KEY_PORTS, tp, dst,
sizeof(__be16), reg);
break;
diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c
index 17e6ca62f1be..afde0d763039 100644
--- a/net/nfc/netlink.c
+++ b/net/nfc/netlink.c
@@ -1099,7 +1099,6 @@ static int nfc_genl_llc_set_params(struct sk_buff *skb, struct genl_info *info)

local = nfc_llcp_find_local(dev);
if (!local) {
- nfc_put_device(dev);
rc = -ENODEV;
goto exit;
}
@@ -1159,7 +1158,6 @@ static int nfc_genl_llc_sdreq(struct sk_buff *skb, struct genl_info *info)

local = nfc_llcp_find_local(dev);
if (!local) {
- nfc_put_device(dev);
rc = -ENODEV;
goto exit;
}
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index d2437b5b2f6a..baa33103108a 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -137,7 +137,7 @@ static void do_setup(struct net_device *netdev)
netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_OPENVSWITCH |
IFF_NO_QUEUE;
netdev->needs_free_netdev = true;
- netdev->priv_destructor = internal_dev_destructor;
+ netdev->priv_destructor = NULL;
netdev->ethtool_ops = &internal_dev_ethtool_ops;
netdev->rtnl_link_ops = &internal_dev_link_ops;

@@ -159,7 +159,6 @@ static struct vport *internal_dev_create(const struct vport_parms *parms)
struct internal_dev *internal_dev;
struct net_device *dev;
int err;
- bool free_vport = true;

vport = ovs_vport_alloc(0, &ovs_internal_vport_ops, parms);
if (IS_ERR(vport)) {
@@ -190,10 +189,9 @@ static struct vport *internal_dev_create(const struct vport_parms *parms)

rtnl_lock();
err = register_netdevice(vport->dev);
- if (err) {
- free_vport = false;
+ if (err)
goto error_unlock;
- }
+ vport->dev->priv_destructor = internal_dev_destructor;

dev_set_promiscuity(vport->dev, 1);
rtnl_unlock();
@@ -207,8 +205,7 @@ static struct vport *internal_dev_create(const struct vport_parms *parms)
error_free_netdev:
free_netdev(dev);
error_free_vport:
- if (free_vport)
- ovs_vport_free(vport);
+ ovs_vport_free(vport);
error:
return ERR_PTR(err);
}
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 6b12883e04b8..5c1769999a92 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -21,6 +21,7 @@
#include <linux/slab.h>
#include <linux/idr.h>
#include <linux/rhashtable.h>
+#include <linux/jhash.h>
#include <net/net_namespace.h>
#include <net/sock.h>
#include <net/netlink.h>
@@ -45,6 +46,62 @@ static LIST_HEAD(tcf_proto_base);
/* Protects list of registered TC modules. It is pure SMP lock. */
static DEFINE_RWLOCK(cls_mod_lock);

+static u32 destroy_obj_hashfn(const struct tcf_proto *tp)
+{
+ return jhash_3words(tp->chain->index, tp->prio,
+ (__force __u32)tp->protocol, 0);
+}
+
+static void tcf_proto_signal_destroying(struct tcf_chain *chain,
+ struct tcf_proto *tp)
+{
+ struct tcf_block *block = chain->block;
+
+ mutex_lock(&block->proto_destroy_lock);
+ hash_add_rcu(block->proto_destroy_ht, &tp->destroy_ht_node,
+ destroy_obj_hashfn(tp));
+ mutex_unlock(&block->proto_destroy_lock);
+}
+
+static bool tcf_proto_cmp(const struct tcf_proto *tp1,
+ const struct tcf_proto *tp2)
+{
+ return tp1->chain->index == tp2->chain->index &&
+ tp1->prio == tp2->prio &&
+ tp1->protocol == tp2->protocol;
+}
+
+static bool tcf_proto_exists_destroying(struct tcf_chain *chain,
+ struct tcf_proto *tp)
+{
+ u32 hash = destroy_obj_hashfn(tp);
+ struct tcf_proto *iter;
+ bool found = false;
+
+ rcu_read_lock();
+ hash_for_each_possible_rcu(chain->block->proto_destroy_ht, iter,
+ destroy_ht_node, hash) {
+ if (tcf_proto_cmp(tp, iter)) {
+ found = true;
+ break;
+ }
+ }
+ rcu_read_unlock();
+
+ return found;
+}
+
+static void
+tcf_proto_signal_destroyed(struct tcf_chain *chain, struct tcf_proto *tp)
+{
+ struct tcf_block *block = chain->block;
+
+ mutex_lock(&block->proto_destroy_lock);
+ if (hash_hashed(&tp->destroy_ht_node))
+ hash_del_rcu(&tp->destroy_ht_node);
+ mutex_unlock(&block->proto_destroy_lock);
+}
+
/* Find classifier type by string name */

static const struct tcf_proto_ops *__tcf_proto_lookup_ops(const char *kind)
@@ -232,9 +289,11 @@ static void tcf_proto_get(struct tcf_proto *tp)
static void tcf_chain_put(struct tcf_chain *chain);

static void tcf_proto_destroy(struct tcf_proto *tp, bool rtnl_held,
- struct netlink_ext_ack *extack)
+ bool sig_destroy, struct netlink_ext_ack *extack)
{
tp->ops->destroy(tp, rtnl_held, extack);
+ if (sig_destroy)
+ tcf_proto_signal_destroyed(tp->chain, tp);
tcf_chain_put(tp->chain);
module_put(tp->ops->owner);
kfree_rcu(tp, rcu);
@@ -244,7 +303,7 @@ static void tcf_proto_put(struct tcf_proto *tp, bool rtnl_held,
struct netlink_ext_ack *extack)
{
if (refcount_dec_and_test(&tp->refcnt))
- tcf_proto_destroy(tp, rtnl_held, extack);
+ tcf_proto_destroy(tp, rtnl_held, true, extack);
}

static int walker_check_empty(struct tcf_proto *tp, void *fh,
@@ -368,6 +427,7 @@ static bool tcf_chain_detach(struct tcf_chain *chain)
static void tcf_block_destroy(struct tcf_block *block)
{
mutex_destroy(&block->lock);
+ mutex_destroy(&block->proto_destroy_lock);
kfree_rcu(block, rcu);
}

@@ -543,6 +603,12 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)

mutex_lock(&chain->filter_chain_lock);
tp = tcf_chain_dereference(chain->filter_chain, chain);
+ while (tp) {
+ tp_next = rcu_dereference_protected(tp->next, 1);
+ tcf_proto_signal_destroying(chain, tp);
+ tp = tp_next;
+ }
+ tp = tcf_chain_dereference(chain->filter_chain, chain);
RCU_INIT_POINTER(chain->filter_chain, NULL);
tcf_chain0_head_change(chain, NULL);
chain->flushing = true;
@@ -1002,6 +1068,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
return ERR_PTR(-ENOMEM);
}
mutex_init(&block->lock);
+ mutex_init(&block->proto_destroy_lock);
flow_block_init(&block->flow_block);
INIT_LIST_HEAD(&block->chain_list);
INIT_LIST_HEAD(&block->owner_list);
@@ -1754,6 +1821,12 @@ static struct tcf_proto *tcf_chain_tp_insert_unique(struct tcf_chain *chain,

mutex_lock(&chain->filter_chain_lock);

+ if (tcf_proto_exists_destroying(chain, tp_new)) {
+ mutex_unlock(&chain->filter_chain_lock);
+ tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
+ return ERR_PTR(-EAGAIN);
+ }
+
tp = tcf_chain_tp_find(chain, &chain_info,
protocol, prio, false);
if (!tp)
@@ -1761,10 +1834,10 @@ static struct tcf_proto *tcf_chain_tp_insert_unique(struct tcf_chain *chain,
mutex_unlock(&chain->filter_chain_lock);

if (tp) {
- tcf_proto_destroy(tp_new, rtnl_held, NULL);
+ tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
tp_new = tp;
} else if (err) {
- tcf_proto_destroy(tp_new, rtnl_held, NULL);
+ tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
tp_new = ERR_PTR(err);
}

@@ -1802,6 +1875,7 @@ static void tcf_chain_tp_delete_empty(struct tcf_chain *chain,
return;
}

+ tcf_proto_signal_destroying(chain, tp);
next = tcf_chain_dereference(chain_info.next, chain);
if (tp == chain->filter_chain)
tcf_chain0_head_change(chain, next);
@@ -2321,6 +2395,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
err = -EINVAL;
goto errout_locked;
} else if (t->tcm_handle == 0) {
+ tcf_proto_signal_destroying(chain, tp);
tcf_chain_tp_remove(chain, &chain_info, tp);
mutex_unlock(&chain->filter_chain_lock);

diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index bab2da8cf17a..a20594056fef 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -376,8 +376,6 @@ static int smc_pnet_fill_entry(struct net *net,
return 0;

error:
- if (pnetelem->ndev)
- dev_put(pnetelem->ndev);
return rc;
}

diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
index 339e8c077c2d..195b40c5dae4 100644
--- a/net/sunrpc/backchannel_rqst.c
+++ b/net/sunrpc/backchannel_rqst.c
@@ -220,7 +220,7 @@ void xprt_destroy_bc(struct rpc_xprt *xprt, unsigned int max_reqs)
goto out;

spin_lock_bh(&xprt->bc_pa_lock);
- xprt->bc_alloc_max -= max_reqs;
+ xprt->bc_alloc_max -= min(max_reqs, xprt->bc_alloc_max);
list_for_each_entry_safe(req, tmp, &xprt->bc_pa_list, rq_bc_pa_list) {
dprintk("RPC: req=%p\n", req);
list_del(&req->rq_bc_pa_list);
@@ -307,8 +307,8 @@ void xprt_free_bc_rqst(struct rpc_rqst *req)
*/
dprintk("RPC: Last session removed req=%p\n", req);
xprt_free_allocation(req);
- return;
}
+ xprt_put(xprt);
}

/*
@@ -339,7 +339,7 @@ struct rpc_rqst *xprt_lookup_bc_request(struct rpc_xprt *xprt, __be32 xid)
spin_unlock(&xprt->bc_pa_lock);
if (new) {
if (req != new)
- xprt_free_bc_rqst(new);
+ xprt_free_allocation(new);
break;
} else if (req)
break;
@@ -368,6 +368,7 @@ void xprt_complete_bc_request(struct rpc_rqst *req, uint32_t copied)
set_bit(RPC_BC_PA_IN_USE, &req->rq_bc_pa_state);

dprintk("RPC: add callback request to list\n");
+ xprt_get(xprt);
spin_lock(&bc_serv->sv_cb_lock);
list_add(&req->rq_bc_list, &bc_serv->sv_cb_list);
wake_up(&bc_serv->sv_cb_waitq);
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 20631d64312c..ac796f3d4240 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1935,6 +1935,11 @@ static void xprt_destroy_cb(struct work_struct *work)
rpc_destroy_wait_queue(&xprt->sending);
rpc_destroy_wait_queue(&xprt->backlog);
kfree(xprt->servername);
+ /*
+ * Destroy any existing back channel
+ */
+ xprt_destroy_backchannel(xprt, UINT_MAX);
+
/*
* Tear down transport state and free the rpc_xprt
*/
diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index 59e624b1d7a0..7cccaab9a17a 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -165,6 +165,7 @@ void xprt_rdma_bc_free_rqst(struct rpc_rqst *rqst)
spin_lock(&xprt->bc_pa_lock);
list_add_tail(&rqst->rq_bc_pa_list, &xprt->bc_pa_list);
spin_unlock(&xprt->bc_pa_lock);
+ xprt_put(xprt);
}

static struct rpc_rqst *rpcrdma_bc_rqst_get(struct rpcrdma_xprt *r_xprt)
@@ -261,6 +262,7 @@ void rpcrdma_bc_receive_call(struct rpcrdma_xprt *r_xprt,

/* Queue rqst for ULP's callback service */
bc_serv = xprt->bc_serv;
+ xprt_get(xprt);
spin_lock(&bc_serv->sv_cb_lock);
list_add(&rqst->rq_bc_list, &bc_serv->sv_cb_list);
spin_unlock(&bc_serv->sv_cb_lock);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 43922d86e510..6b0c9b798d9c 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -482,8 +482,10 @@ static int tls_push_data(struct sock *sk,
int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
{
unsigned char record_type = TLS_RECORD_TYPE_DATA;
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
int rc;

+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);

if (unlikely(msg->msg_controllen)) {
@@ -497,12 +499,14 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)

out:
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return rc;
}

int tls_device_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
struct iov_iter msg_iter;
char *kaddr = kmap(page);
struct kvec iov;
@@ -511,6 +515,7 @@ int tls_device_sendpage(struct sock *sk, struct page *page,
if (flags & MSG_SENDPAGE_NOTLAST)
flags |= MSG_MORE;

+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);

if (flags & MSG_OOB) {
@@ -527,6 +532,7 @@ int tls_device_sendpage(struct sock *sk, struct page *page,

out:
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return rc;
}

@@ -575,9 +581,11 @@ static int tls_device_push_pending_record(struct sock *sk, int flags)

void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
{
- if (!sk->sk_write_pending && tls_is_partially_sent_record(ctx)) {
+ if (tls_is_partially_sent_record(ctx)) {
gfp_t sk_allocation = sk->sk_allocation;

+ WARN_ON_ONCE(sk->sk_write_pending);
+
sk->sk_allocation = GFP_ATOMIC;
tls_push_partial_record(sk, ctx,
MSG_DONTWAIT | MSG_NOSIGNAL |
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 43252a801c3f..9313dd51023a 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -258,6 +258,7 @@ void tls_ctx_free(struct tls_context *ctx)

memzero_explicit(&ctx->crypto_send, sizeof(ctx->crypto_send));
memzero_explicit(&ctx->crypto_recv, sizeof(ctx->crypto_recv));
+ mutex_destroy(&ctx->tx_lock);
kfree(ctx);
}

@@ -615,6 +616,7 @@ static struct tls_context *create_ctx(struct sock *sk)
ctx->getsockopt = sk->sk_prot->getsockopt;
ctx->sk_proto_close = sk->sk_prot->close;
ctx->unhash = sk->sk_prot->unhash;
+ mutex_init(&ctx->tx_lock);
return ctx;
}

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 91d21b048a9b..881f06f465f8 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -897,15 +897,9 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL))
return -ENOTSUPP;

+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);

- /* Wait till there is any pending write on socket */
- if (unlikely(sk->sk_write_pending)) {
- ret = wait_on_pending_writer(sk, &timeo);
- if (unlikely(ret))
- goto send_end;
- }
-
if (unlikely(msg->msg_controllen)) {
ret = tls_proccess_cmsg(sk, msg, &record_type);
if (ret) {
@@ -1091,6 +1085,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
ret = sk_stream_error(sk, msg->msg_flags, ret);

release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return copied ? copied : ret;
}

@@ -1114,13 +1109,6 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
eor = !(flags & (MSG_MORE | MSG_SENDPAGE_NOTLAST));
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);

- /* Wait till there is any pending write on socket */
- if (unlikely(sk->sk_write_pending)) {
- ret = wait_on_pending_writer(sk, &timeo);
- if (unlikely(ret))
- goto sendpage_end;
- }
-
/* Call the sk_stream functions to manage the sndbuf mem. */
while (size > 0) {
size_t copy, required_size;
@@ -1219,15 +1207,18 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
int tls_sw_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
int ret;

if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
return -ENOTSUPP;

+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
ret = tls_sw_do_sendpage(sk, page, offset, size, flags);
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return ret;
}

@@ -2172,9 +2163,11 @@ static void tx_work_handler(struct work_struct *work)

if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
return;
+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
tls_tx_records(sk, -1);
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
}

void tls_sw_write_space(struct sock *sk, struct tls_context *ctx)
@@ -2182,12 +2175,9 @@ void tls_sw_write_space(struct sock *sk, struct tls_context *ctx)
struct tls_sw_context_tx *tx_ctx = tls_sw_ctx_tx(ctx);

/* Schedule the transmission if tx list is ready */
- if (is_tx_ready(tx_ctx) && !sk->sk_write_pending) {
- /* Schedule the transmission */
- if (!test_and_set_bit(BIT_TX_SCHEDULED,
- &tx_ctx->tx_bitmask))
- schedule_delayed_work(&tx_ctx->tx_work.work, 0);
- }
+ if (is_tx_ready(tx_ctx) &&
+ !test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
+ schedule_delayed_work(&tx_ctx->tx_work.work, 0);
}

void tls_sw_strparser_arm(struct sock *sk, struct tls_context *tls_ctx)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index a7adffd062c7..058d59fceddd 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -870,9 +870,11 @@ virtio_transport_recv_connected(struct sock *sk,
if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
vsk->peer_shutdown |= SEND_SHUTDOWN;
if (vsk->peer_shutdown == SHUTDOWN_MASK &&
- vsock_stream_has_data(vsk) <= 0) {
- sock_set_flag(sk, SOCK_DONE);
- sk->sk_state = TCP_CLOSING;
+ vsock_stream_has_data(vsk) <= 0 &&
+ !sock_flag(sk, SOCK_DONE)) {
+ (void)virtio_transport_reset(vsk, NULL);
+
+ virtio_transport_do_close(vsk, true);
}
if (le32_to_cpu(pkt->hdr.flags))
sk->sk_state_change(sk);
diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
index 688aac7a6943..182f9eb48dde 100644
--- a/net/xdp/xdp_umem.c
+++ b/net/xdp/xdp_umem.c
@@ -26,6 +26,9 @@ void xdp_add_sk_umem(struct xdp_umem *umem, struct xdp_sock *xs)
{
unsigned long flags;

+ if (!xs->tx)
+ return;
+
spin_lock_irqsave(&umem->xsk_list_lock, flags);
list_add_rcu(&xs->list, &umem->xsk_list);
spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
@@ -35,6 +38,9 @@ void xdp_del_sk_umem(struct xdp_umem *umem, struct xdp_sock *xs)
{
unsigned long flags;

+ if (!xs->tx)
+ return;
+
spin_lock_irqsave(&umem->xsk_list_lock, flags);
list_del_rcu(&xs->list);
spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
diff --git a/sound/core/timer.c b/sound/core/timer.c
index 6b724d2ee2de..59ae21b0bb93 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -284,11 +284,11 @@ int snd_timer_open(struct snd_timer_instance **ti,
goto unlock;
}
if (!list_empty(&timer->open_list_head)) {
- timeri = list_entry(timer->open_list_head.next,
+ struct snd_timer_instance *t =
+ list_entry(timer->open_list_head.next,
struct snd_timer_instance, open_list);
- if (timeri->flags & SNDRV_TIMER_IFLG_EXCLUSIVE) {
+ if (t->flags & SNDRV_TIMER_IFLG_EXCLUSIVE) {
err = -EBUSY;
- timeri = NULL;
goto unlock;
}
}
diff --git a/sound/firewire/bebob/bebob_focusrite.c b/sound/firewire/bebob/bebob_focusrite.c
index 32b864bee25f..06d6a37cd853 100644
--- a/sound/firewire/bebob/bebob_focusrite.c
+++ b/sound/firewire/bebob/bebob_focusrite.c
@@ -27,6 +27,8 @@
#define SAFFIRE_CLOCK_SOURCE_SPDIF 1

/* clock sources as returned from register of Saffire Pro 10 and 26 */
+#define SAFFIREPRO_CLOCK_SOURCE_SELECT_MASK 0x000000ff
+#define SAFFIREPRO_CLOCK_SOURCE_DETECT_MASK 0x0000ff00
#define SAFFIREPRO_CLOCK_SOURCE_INTERNAL 0
#define SAFFIREPRO_CLOCK_SOURCE_SKIP 1 /* never used on hardware */
#define SAFFIREPRO_CLOCK_SOURCE_SPDIF 2
@@ -189,6 +191,7 @@ saffirepro_both_clk_src_get(struct snd_bebob *bebob, unsigned int *id)
map = saffirepro_clk_maps[1];

/* In a case that this driver cannot handle the value of register. */
+ value &= SAFFIREPRO_CLOCK_SOURCE_SELECT_MASK;
if (value >= SAFFIREPRO_CLOCK_SOURCE_COUNT || map[value] < 0) {
err = -EIO;
goto end;
diff --git a/sound/pci/hda/patch_ca0132.c b/sound/pci/hda/patch_ca0132.c
index 6d1fb7c11f17..b7a1abb3e231 100644
--- a/sound/pci/hda/patch_ca0132.c
+++ b/sound/pci/hda/patch_ca0132.c
@@ -7604,7 +7604,7 @@ static void hp_callback(struct hda_codec *codec, struct hda_jack_callback *cb)
/* Delay enabling the HP amp, to let the mic-detection
* state machine run.
*/
- cancel_delayed_work_sync(&spec->unsol_hp_work);
+ cancel_delayed_work(&spec->unsol_hp_work);
schedule_delayed_work(&spec->unsol_hp_work, msecs_to_jiffies(500));
tbl = snd_hda_jack_tbl_get(codec, cb->nid);
if (tbl)
diff --git a/sound/soc/sh/rcar/dma.c b/sound/soc/sh/rcar/dma.c
index 0324a5c39619..28f65eba2bb4 100644
--- a/sound/soc/sh/rcar/dma.c
+++ b/sound/soc/sh/rcar/dma.c
@@ -508,10 +508,10 @@ static struct rsnd_mod_ops rsnd_dmapp_ops = {
#define RDMA_SSI_I_N(addr, i) (addr ##_reg - 0x00300000 + (0x40 * i) + 0x8)
#define RDMA_SSI_O_N(addr, i) (addr ##_reg - 0x00300000 + (0x40 * i) + 0xc)

-#define RDMA_SSIU_I_N(addr, i, j) (addr ##_reg - 0x00441000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400))
+#define RDMA_SSIU_I_N(addr, i, j) (addr ##_reg - 0x00441000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400) - (0x4000 * ((i) / 9) * ((j) / 4)))
#define RDMA_SSIU_O_N(addr, i, j) RDMA_SSIU_I_N(addr, i, j)

-#define RDMA_SSIU_I_P(addr, i, j) (addr ##_reg - 0x00141000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400))
+#define RDMA_SSIU_I_P(addr, i, j) (addr ##_reg - 0x00141000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400) - (0x4000 * ((i) / 9) * ((j) / 4)))
#define RDMA_SSIU_O_P(addr, i, j) RDMA_SSIU_I_P(addr, i, j)

#define RDMA_SRC_I_N(addr, i) (addr ##_reg - 0x00500000 + (0x400 * i))
diff --git a/sound/soc/sof/intel/hda-stream.c b/sound/soc/sof/intel/hda-stream.c
index 2c7447188402..0c11fceb28a7 100644
--- a/sound/soc/sof/intel/hda-stream.c
+++ b/sound/soc/sof/intel/hda-stream.c
@@ -190,7 +190,7 @@ hda_dsp_stream_get(struct snd_sof_dev *sdev, int direction)
* Workaround to address a known issue with host DMA that results
* in xruns during pause/release in capture scenarios.
*/
- if (!IS_ENABLED(SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
+ if (!IS_ENABLED(CONFIG_SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
if (stream && direction == SNDRV_PCM_STREAM_CAPTURE)
snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
HDA_VS_INTEL_EM2,
@@ -228,7 +228,7 @@ int hda_dsp_stream_put(struct snd_sof_dev *sdev, int direction, int stream_tag)
spin_unlock_irq(&bus->reg_lock);

/* Enable DMI L1 entry if there are no capture streams open */
- if (!IS_ENABLED(SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
+ if (!IS_ENABLED(CONFIG_SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
if (!active_capture_stream)
snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
HDA_VS_INTEL_EM2,
diff --git a/sound/usb/Makefile b/sound/usb/Makefile
index e1ce257ab705..d27a21b0ff9c 100644
--- a/sound/usb/Makefile
+++ b/sound/usb/Makefile
@@ -16,7 +16,8 @@ snd-usb-audio-objs := card.o \
power.o \
proc.o \
quirks.o \
- stream.o
+ stream.o \
+ validate.o

snd-usb-audio-$(CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER) += media.o

diff --git a/sound/usb/clock.c b/sound/usb/clock.c
index 72e9bdf76115..6b8c14f9b5d4 100644
--- a/sound/usb/clock.c
+++ b/sound/usb/clock.c
@@ -38,39 +38,37 @@ static void *find_uac_clock_desc(struct usb_host_interface *iface, int id,
static bool validate_clock_source_v2(void *p, int id)
{
struct uac_clock_source_descriptor *cs = p;
- return cs->bLength == sizeof(*cs) && cs->bClockID == id;
+ return cs->bClockID == id;
}

static bool validate_clock_source_v3(void *p, int id)
{
struct uac3_clock_source_descriptor *cs = p;
- return cs->bLength == sizeof(*cs) && cs->bClockID == id;
+ return cs->bClockID == id;
}

static bool validate_clock_selector_v2(void *p, int id)
{
struct uac_clock_selector_descriptor *cs = p;
- return cs->bLength >= sizeof(*cs) && cs->bClockID == id &&
- cs->bLength == 7 + cs->bNrInPins;
+ return cs->bClockID == id;
}

static bool validate_clock_selector_v3(void *p, int id)
{
struct uac3_clock_selector_descriptor *cs = p;
- return cs->bLength >= sizeof(*cs) && cs->bClockID == id &&
- cs->bLength == 11 + cs->bNrInPins;
+ return cs->bClockID == id;
}

static bool validate_clock_multiplier_v2(void *p, int id)
{
struct uac_clock_multiplier_descriptor *cs = p;
- return cs->bLength == sizeof(*cs) && cs->bClockID == id;
+ return cs->bClockID == id;
}

static bool validate_clock_multiplier_v3(void *p, int id)
{
struct uac3_clock_multiplier_descriptor *cs = p;
- return cs->bLength == sizeof(*cs) && cs->bClockID == id;
+ return cs->bClockID == id;
}

#define DEFINE_FIND_HELPER(name, obj, validator, type) \
diff --git a/sound/usb/helper.h b/sound/usb/helper.h
index 6afb70156ec4..5e8a18b4e7b9 100644
--- a/sound/usb/helper.h
+++ b/sound/usb/helper.h
@@ -31,4 +31,8 @@ static inline int snd_usb_ctrl_intf(struct snd_usb_audio *chip)
return get_iface_desc(chip->ctrl_intf)->bInterfaceNumber;
}

+/* in validate.c */
+bool snd_usb_validate_audio_desc(void *p, int protocol);
+bool snd_usb_validate_midi_desc(void *p);
+
#endif /* __USBAUDIO_HELPER_H */
diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c
index eceab19766db..673652ad7018 100644
--- a/sound/usb/mixer.c
+++ b/sound/usb/mixer.c
@@ -740,13 +740,6 @@ static int uac_mixer_unit_get_channels(struct mixer_build *state,
{
int mu_channels;

- if (desc->bLength < sizeof(*desc))
- return -EINVAL;
- if (!desc->bNrInPins)
- return -EINVAL;
- if (desc->bLength < sizeof(*desc) + desc->bNrInPins)
- return -EINVAL;
-
switch (state->mixer->protocol) {
case UAC_VERSION_1:
case UAC_VERSION_2:
@@ -765,222 +758,242 @@ static int uac_mixer_unit_get_channels(struct mixer_build *state,
}

/*
- * parse the source unit recursively until it reaches to a terminal
- * or a branched unit.
+ * Parse Input Terminal Unit
*/
static int __check_input_term(struct mixer_build *state, int id,
- struct usb_audio_term *term)
+ struct usb_audio_term *term);
+
+static int parse_term_uac1_iterm_unit(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id)
{
- int protocol = state->mixer->protocol;
+ struct uac_input_terminal_descriptor *d = p1;
+
+ term->type = le16_to_cpu(d->wTerminalType);
+ term->channels = d->bNrChannels;
+ term->chconfig = le16_to_cpu(d->wChannelConfig);
+ term->name = d->iTerminal;
+ return 0;
+}
+
+static int parse_term_uac2_iterm_unit(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id)
+{
+ struct uac2_input_terminal_descriptor *d = p1;
int err;
- void *p1;
- unsigned char *hdr;

- memset(term, 0, sizeof(*term));
- for (;;) {
- /* a loop in the terminal chain? */
- if (test_and_set_bit(id, state->termbitmap))
- return -EINVAL;
+ /* call recursively to verify the referenced clock entity */
+ err = __check_input_term(state, d->bCSourceID, term);
+ if (err < 0)
+ return err;

- p1 = find_audio_control_unit(state, id);
- if (!p1)
- break;
+ /* save input term properties after recursion,
+ * to ensure they are not overriden by the recursion calls
+ */
+ term->id = id;
+ term->type = le16_to_cpu(d->wTerminalType);
+ term->channels = d->bNrChannels;
+ term->chconfig = le32_to_cpu(d->bmChannelConfig);
+ term->name = d->iTerminal;
+ return 0;
+}

- hdr = p1;
- term->id = id;
+static int parse_term_uac3_iterm_unit(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id)
+{
+ struct uac3_input_terminal_descriptor *d = p1;
+ int err;

- if (protocol == UAC_VERSION_1 || protocol == UAC_VERSION_2) {
- switch (hdr[2]) {
- case UAC_INPUT_TERMINAL:
- if (protocol == UAC_VERSION_1) {
- struct uac_input_terminal_descriptor *d = p1;
-
- term->type = le16_to_cpu(d->wTerminalType);
- term->channels = d->bNrChannels;
- term->chconfig = le16_to_cpu(d->wChannelConfig);
- term->name = d->iTerminal;
- } else { /* UAC_VERSION_2 */
- struct uac2_input_terminal_descriptor *d = p1;
-
- /* call recursively to verify that the
- * referenced clock entity is valid */
- err = __check_input_term(state, d->bCSourceID, term);
- if (err < 0)
- return err;
+ /* call recursively to verify the referenced clock entity */
+ err = __check_input_term(state, d->bCSourceID, term);
+ if (err < 0)
+ return err;

- /* save input term properties after recursion,
- * to ensure they are not overriden by the
- * recursion calls */
- term->id = id;
- term->type = le16_to_cpu(d->wTerminalType);
- term->channels = d->bNrChannels;
- term->chconfig = le32_to_cpu(d->bmChannelConfig);
- term->name = d->iTerminal;
- }
- return 0;
- case UAC_FEATURE_UNIT: {
- /* the header is the same for v1 and v2 */
- struct uac_feature_unit_descriptor *d = p1;
+ /* save input term properties after recursion,
+ * to ensure they are not overriden by the recursion calls
+ */
+ term->id = id;
+ term->type = le16_to_cpu(d->wTerminalType);

- id = d->bSourceID;
- break; /* continue to parse */
- }
- case UAC_MIXER_UNIT: {
- struct uac_mixer_unit_descriptor *d = p1;
-
- term->type = UAC3_MIXER_UNIT << 16; /* virtual type */
- term->channels = uac_mixer_unit_bNrChannels(d);
- term->chconfig = uac_mixer_unit_wChannelConfig(d, protocol);
- term->name = uac_mixer_unit_iMixer(d);
- return 0;
- }
- case UAC_SELECTOR_UNIT:
- case UAC2_CLOCK_SELECTOR: {
- struct uac_selector_unit_descriptor *d = p1;
- /* call recursively to retrieve the channel info */
- err = __check_input_term(state, d->baSourceID[0], term);
- if (err < 0)
- return err;
- term->type = UAC3_SELECTOR_UNIT << 16; /* virtual type */
- term->id = id;
- term->name = uac_selector_unit_iSelector(d);
- return 0;
- }
- case UAC1_PROCESSING_UNIT:
- /* UAC2_EFFECT_UNIT */
- if (protocol == UAC_VERSION_1)
- term->type = UAC3_PROCESSING_UNIT << 16; /* virtual type */
- else /* UAC_VERSION_2 */
- term->type = UAC3_EFFECT_UNIT << 16; /* virtual type */
- /* fall through */
- case UAC1_EXTENSION_UNIT:
- /* UAC2_PROCESSING_UNIT_V2 */
- if (protocol == UAC_VERSION_1 && !term->type)
- term->type = UAC3_EXTENSION_UNIT << 16; /* virtual type */
- else if (protocol == UAC_VERSION_2 && !term->type)
- term->type = UAC3_PROCESSING_UNIT << 16; /* virtual type */
- /* fall through */
- case UAC2_EXTENSION_UNIT_V2: {
- struct uac_processing_unit_descriptor *d = p1;
-
- if (protocol == UAC_VERSION_2 &&
- hdr[2] == UAC2_EFFECT_UNIT) {
- /* UAC2/UAC1 unit IDs overlap here in an
- * uncompatible way. Ignore this unit for now.
- */
- return 0;
- }
+ err = get_cluster_channels_v3(state, le16_to_cpu(d->wClusterDescrID));
+ if (err < 0)
+ return err;
+ term->channels = err;

- if (d->bNrInPins) {
- id = d->baSourceID[0];
- break; /* continue to parse */
- }
- if (!term->type)
- term->type = UAC3_EXTENSION_UNIT << 16; /* virtual type */
+ /* REVISIT: UAC3 IT doesn't have channels cfg */
+ term->chconfig = 0;

- term->channels = uac_processing_unit_bNrChannels(d);
- term->chconfig = uac_processing_unit_wChannelConfig(d, protocol);
- term->name = uac_processing_unit_iProcessing(d, protocol);
- return 0;
- }
- case UAC2_CLOCK_SOURCE: {
- struct uac_clock_source_descriptor *d = p1;
+ term->name = le16_to_cpu(d->wTerminalDescrStr);
+ return 0;
+}

- term->type = UAC3_CLOCK_SOURCE << 16; /* virtual type */
- term->id = id;
- term->name = d->iClockSource;
- return 0;
- }
- default:
- return -ENODEV;
- }
- } else { /* UAC_VERSION_3 */
- switch (hdr[2]) {
- case UAC_INPUT_TERMINAL: {
- struct uac3_input_terminal_descriptor *d = p1;
-
- /* call recursively to verify that the
- * referenced clock entity is valid */
- err = __check_input_term(state, d->bCSourceID, term);
- if (err < 0)
- return err;
+static int parse_term_mixer_unit(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id)
+{
+ struct uac_mixer_unit_descriptor *d = p1;
+ int protocol = state->mixer->protocol;
+ int err;

- /* save input term properties after recursion,
- * to ensure they are not overriden by the
- * recursion calls */
- term->id = id;
- term->type = le16_to_cpu(d->wTerminalType);
+ err = uac_mixer_unit_get_channels(state, d);
+ if (err <= 0)
+ return err;

- err = get_cluster_channels_v3(state, le16_to_cpu(d->wClusterDescrID));
- if (err < 0)
- return err;
- term->channels = err;
+ term->type = UAC3_MIXER_UNIT << 16; /* virtual type */
+ term->channels = err;
+ if (protocol != UAC_VERSION_3) {
+ term->chconfig = uac_mixer_unit_wChannelConfig(d, protocol);
+ term->name = uac_mixer_unit_iMixer(d);
+ }
+ return 0;
+}

- /* REVISIT: UAC3 IT doesn't have channels cfg */
- term->chconfig = 0;
+static int parse_term_selector_unit(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id)
+{
+ struct uac_selector_unit_descriptor *d = p1;
+ int err;

- term->name = le16_to_cpu(d->wTerminalDescrStr);
- return 0;
- }
- case UAC3_FEATURE_UNIT: {
- struct uac3_feature_unit_descriptor *d = p1;
+ /* call recursively to retrieve the channel info */
+ err = __check_input_term(state, d->baSourceID[0], term);
+ if (err < 0)
+ return err;
+ term->type = UAC3_SELECTOR_UNIT << 16; /* virtual type */
+ term->id = id;
+ if (state->mixer->protocol != UAC_VERSION_3)
+ term->name = uac_selector_unit_iSelector(d);
+ return 0;
+}

- id = d->bSourceID;
- break; /* continue to parse */
- }
- case UAC3_CLOCK_SOURCE: {
- struct uac3_clock_source_descriptor *d = p1;
+static int parse_term_proc_unit(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id, int vtype)
+{
+ struct uac_processing_unit_descriptor *d = p1;
+ int protocol = state->mixer->protocol;
+ int err;

- term->type = UAC3_CLOCK_SOURCE << 16; /* virtual type */
- term->id = id;
- term->name = le16_to_cpu(d->wClockSourceStr);
- return 0;
- }
- case UAC3_MIXER_UNIT: {
- struct uac_mixer_unit_descriptor *d = p1;
+ if (d->bNrInPins) {
+ /* call recursively to retrieve the channel info */
+ err = __check_input_term(state, d->baSourceID[0], term);
+ if (err < 0)
+ return err;
+ }

- err = uac_mixer_unit_get_channels(state, d);
- if (err <= 0)
- return err;
+ term->type = vtype << 16; /* virtual type */
+ term->id = id;

- term->channels = err;
- term->type = UAC3_MIXER_UNIT << 16; /* virtual type */
+ if (protocol == UAC_VERSION_3)
+ return 0;

- return 0;
- }
- case UAC3_SELECTOR_UNIT:
- case UAC3_CLOCK_SELECTOR: {
- struct uac_selector_unit_descriptor *d = p1;
- /* call recursively to retrieve the channel info */
- err = __check_input_term(state, d->baSourceID[0], term);
- if (err < 0)
- return err;
- term->type = UAC3_SELECTOR_UNIT << 16; /* virtual type */
- term->id = id;
- term->name = 0; /* TODO: UAC3 Class-specific strings */
+ if (!term->channels) {
+ term->channels = uac_processing_unit_bNrChannels(d);
+ term->chconfig = uac_processing_unit_wChannelConfig(d, protocol);
+ }
+ term->name = uac_processing_unit_iProcessing(d, protocol);
+ return 0;
+}

- return 0;
- }
- case UAC3_PROCESSING_UNIT: {
- struct uac_processing_unit_descriptor *d = p1;
+static int parse_term_uac2_clock_source(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id)
+{
+ struct uac_clock_source_descriptor *d = p1;

- if (!d->bNrInPins)
- return -EINVAL;
+ term->type = UAC3_CLOCK_SOURCE << 16; /* virtual type */
+ term->id = id;
+ term->name = d->iClockSource;
+ return 0;
+}

- /* call recursively to retrieve the channel info */
- err = __check_input_term(state, d->baSourceID[0], term);
- if (err < 0)
- return err;
+static int parse_term_uac3_clock_source(struct mixer_build *state,
+ struct usb_audio_term *term,
+ void *p1, int id)
+{
+ struct uac3_clock_source_descriptor *d = p1;
+
+ term->type = UAC3_CLOCK_SOURCE << 16; /* virtual type */
+ term->id = id;
+ term->name = le16_to_cpu(d->wClockSourceStr);
+ return 0;
+}

- term->type = UAC3_PROCESSING_UNIT << 16; /* virtual type */
- term->id = id;
- term->name = 0; /* TODO: UAC3 Class-specific strings */
+#define PTYPE(a, b) ((a) << 8 | (b))

- return 0;
- }
- default:
- return -ENODEV;
- }
+/*
+ * parse the source unit recursively until it reaches to a terminal
+ * or a branched unit.
+ */
+static int __check_input_term(struct mixer_build *state, int id,
+ struct usb_audio_term *term)
+{
+ int protocol = state->mixer->protocol;
+ void *p1;
+ unsigned char *hdr;
+
+ for (;;) {
+ /* a loop in the terminal chain? */
+ if (test_and_set_bit(id, state->termbitmap))
+ return -EINVAL;
+
+ p1 = find_audio_control_unit(state, id);
+ if (!p1)
+ break;
+ if (!snd_usb_validate_audio_desc(p1, protocol))
+ break; /* bad descriptor */
+
+ hdr = p1;
+ term->id = id;
+
+ switch (PTYPE(protocol, hdr[2])) {
+ case PTYPE(UAC_VERSION_1, UAC_FEATURE_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC_FEATURE_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_FEATURE_UNIT): {
+ /* the header is the same for all versions */
+ struct uac_feature_unit_descriptor *d = p1;
+
+ id = d->bSourceID;
+ break; /* continue to parse */
+ }
+ case PTYPE(UAC_VERSION_1, UAC_INPUT_TERMINAL):
+ return parse_term_uac1_iterm_unit(state, term, p1, id);
+ case PTYPE(UAC_VERSION_2, UAC_INPUT_TERMINAL):
+ return parse_term_uac2_iterm_unit(state, term, p1, id);
+ case PTYPE(UAC_VERSION_3, UAC_INPUT_TERMINAL):
+ return parse_term_uac3_iterm_unit(state, term, p1, id);
+ case PTYPE(UAC_VERSION_1, UAC_MIXER_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC_MIXER_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_MIXER_UNIT):
+ return parse_term_mixer_unit(state, term, p1, id);
+ case PTYPE(UAC_VERSION_1, UAC_SELECTOR_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC_SELECTOR_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC2_CLOCK_SELECTOR):
+ case PTYPE(UAC_VERSION_3, UAC3_SELECTOR_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_CLOCK_SELECTOR):
+ return parse_term_selector_unit(state, term, p1, id);
+ case PTYPE(UAC_VERSION_1, UAC1_PROCESSING_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC2_PROCESSING_UNIT_V2):
+ case PTYPE(UAC_VERSION_3, UAC3_PROCESSING_UNIT):
+ return parse_term_proc_unit(state, term, p1, id,
+ UAC3_PROCESSING_UNIT);
+ case PTYPE(UAC_VERSION_2, UAC2_EFFECT_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_EFFECT_UNIT):
+ return parse_term_proc_unit(state, term, p1, id,
+ UAC3_EFFECT_UNIT);
+ case PTYPE(UAC_VERSION_1, UAC1_EXTENSION_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC2_EXTENSION_UNIT_V2):
+ case PTYPE(UAC_VERSION_3, UAC3_EXTENSION_UNIT):
+ return parse_term_proc_unit(state, term, p1, id,
+ UAC3_EXTENSION_UNIT);
+ case PTYPE(UAC_VERSION_2, UAC2_CLOCK_SOURCE):
+ return parse_term_uac2_clock_source(state, term, p1, id);
+ case PTYPE(UAC_VERSION_3, UAC3_CLOCK_SOURCE):
+ return parse_term_uac3_clock_source(state, term, p1, id);
+ default:
+ return -ENODEV;
}
}
return -ENODEV;
@@ -1024,10 +1037,15 @@ static struct usb_feature_control_info audio_feature_info[] = {
{ UAC2_FU_PHASE_INVERTER, "Phase Inverter Control", USB_MIXER_BOOLEAN, -1 },
};

+static void usb_mixer_elem_info_free(struct usb_mixer_elem_info *cval)
+{
+ kfree(cval);
+}
+
/* private_free callback */
void snd_usb_mixer_elem_free(struct snd_kcontrol *kctl)
{
- kfree(kctl->private_data);
+ usb_mixer_elem_info_free(kctl->private_data);
kctl->private_data = NULL;
}

@@ -1550,7 +1568,7 @@ static void __build_feature_ctl(struct usb_mixer_interface *mixer,

ctl_info = get_feature_control_info(control);
if (!ctl_info) {
- kfree(cval);
+ usb_mixer_elem_info_free(cval);
return;
}
if (mixer->protocol == UAC_VERSION_1)
@@ -1583,7 +1601,7 @@ static void __build_feature_ctl(struct usb_mixer_interface *mixer,

if (!kctl) {
usb_audio_err(mixer->chip, "cannot malloc kcontrol\n");
- kfree(cval);
+ usb_mixer_elem_info_free(cval);
return;
}
kctl->private_free = snd_usb_mixer_elem_free;
@@ -1753,7 +1771,7 @@ static void build_connector_control(struct usb_mixer_interface *mixer,
kctl = snd_ctl_new1(&usb_connector_ctl_ro, cval);
if (!kctl) {
usb_audio_err(mixer->chip, "cannot malloc kcontrol\n");
- kfree(cval);
+ usb_mixer_elem_info_free(cval);
return;
}
get_connector_control_name(mixer, term, is_input, kctl->id.name,
@@ -1774,13 +1792,6 @@ static int parse_clock_source_unit(struct mixer_build *state, int unitid,
if (state->mixer->protocol != UAC_VERSION_2)
return -EINVAL;

- if (hdr->bLength != sizeof(*hdr)) {
- usb_audio_dbg(state->chip,
- "Bogus clock source descriptor length of %d, ignoring.\n",
- hdr->bLength);
- return 0;
- }
-
/*
* The only property of this unit we are interested in is the
* clock source validity. If that isn't readable, just bail out.
@@ -1806,7 +1817,7 @@ static int parse_clock_source_unit(struct mixer_build *state, int unitid,
kctl = snd_ctl_new1(&usb_bool_master_control_ctl_ro, cval);

if (!kctl) {
- kfree(cval);
+ usb_mixer_elem_info_free(cval);
return -ENOMEM;
}

@@ -1839,62 +1850,20 @@ static int parse_audio_feature_unit(struct mixer_build *state, int unitid,
__u8 *bmaControls;

if (state->mixer->protocol == UAC_VERSION_1) {
- if (hdr->bLength < 7) {
- usb_audio_err(state->chip,
- "unit %u: invalid UAC_FEATURE_UNIT descriptor\n",
- unitid);
- return -EINVAL;
- }
csize = hdr->bControlSize;
- if (!csize) {
- usb_audio_dbg(state->chip,
- "unit %u: invalid bControlSize == 0\n",
- unitid);
- return -EINVAL;
- }
channels = (hdr->bLength - 7) / csize - 1;
bmaControls = hdr->bmaControls;
- if (hdr->bLength < 7 + csize) {
- usb_audio_err(state->chip,
- "unit %u: invalid UAC_FEATURE_UNIT descriptor\n",
- unitid);
- return -EINVAL;
- }
} else if (state->mixer->protocol == UAC_VERSION_2) {
struct uac2_feature_unit_descriptor *ftr = _ftr;
- if (hdr->bLength < 6) {
- usb_audio_err(state->chip,
- "unit %u: invalid UAC_FEATURE_UNIT descriptor\n",
- unitid);
- return -EINVAL;
- }
csize = 4;
channels = (hdr->bLength - 6) / 4 - 1;
bmaControls = ftr->bmaControls;
- if (hdr->bLength < 6 + csize) {
- usb_audio_err(state->chip,
- "unit %u: invalid UAC_FEATURE_UNIT descriptor\n",
- unitid);
- return -EINVAL;
- }
} else { /* UAC_VERSION_3 */
struct uac3_feature_unit_descriptor *ftr = _ftr;

- if (hdr->bLength < 7) {
- usb_audio_err(state->chip,
- "unit %u: invalid UAC3_FEATURE_UNIT descriptor\n",
- unitid);
- return -EINVAL;
- }
csize = 4;
channels = (ftr->bLength - 7) / 4 - 1;
bmaControls = ftr->bmaControls;
- if (hdr->bLength < 7 + csize) {
- usb_audio_err(state->chip,
- "unit %u: invalid UAC3_FEATURE_UNIT descriptor\n",
- unitid);
- return -EINVAL;
- }
}

/* parse the source unit */
@@ -2068,7 +2037,7 @@ static void build_mixer_unit_ctl(struct mixer_build *state,
kctl = snd_ctl_new1(&usb_feature_unit_ctl, cval);
if (!kctl) {
usb_audio_err(state->chip, "cannot malloc kcontrol\n");
- kfree(cval);
+ usb_mixer_elem_info_free(cval);
return;
}
kctl->private_free = snd_usb_mixer_elem_free;
@@ -2094,15 +2063,11 @@ static int parse_audio_input_terminal(struct mixer_build *state, int unitid,

if (state->mixer->protocol == UAC_VERSION_2) {
struct uac2_input_terminal_descriptor *d_v2 = raw_desc;
- if (d_v2->bLength < sizeof(*d_v2))
- return -EINVAL;
control = UAC2_TE_CONNECTOR;
term_id = d_v2->bTerminalID;
bmctls = le16_to_cpu(d_v2->bmControls);
} else if (state->mixer->protocol == UAC_VERSION_3) {
struct uac3_input_terminal_descriptor *d_v3 = raw_desc;
- if (d_v3->bLength < sizeof(*d_v3))
- return -EINVAL;
control = UAC3_TE_INSERTION;
term_id = d_v3->bTerminalID;
bmctls = le32_to_cpu(d_v3->bmControls);
@@ -2364,18 +2329,7 @@ static int build_audio_procunit(struct mixer_build *state, int unitid,
const char *name = extension_unit ?
"Extension Unit" : "Processing Unit";

- if (desc->bLength < 13) {
- usb_audio_err(state->chip, "invalid %s descriptor (id %d)\n", name, unitid);
- return -EINVAL;
- }
-
num_ins = desc->bNrInPins;
- if (desc->bLength < 13 + num_ins ||
- desc->bLength < num_ins + uac_processing_unit_bControlSize(desc, state->mixer->protocol)) {
- usb_audio_err(state->chip, "invalid %s descriptor (id %d)\n", name, unitid);
- return -EINVAL;
- }
-
for (i = 0; i < num_ins; i++) {
err = parse_audio_unit(state, desc->baSourceID[i]);
if (err < 0)
@@ -2466,7 +2420,7 @@ static int build_audio_procunit(struct mixer_build *state, int unitid,

kctl = snd_ctl_new1(&mixer_procunit_ctl, cval);
if (!kctl) {
- kfree(cval);
+ usb_mixer_elem_info_free(cval);
return -ENOMEM;
}
kctl->private_free = snd_usb_mixer_elem_free;
@@ -2604,7 +2558,7 @@ static void usb_mixer_selector_elem_free(struct snd_kcontrol *kctl)
if (kctl->private_data) {
struct usb_mixer_elem_info *cval = kctl->private_data;
num_ins = cval->max;
- kfree(cval);
+ usb_mixer_elem_info_free(cval);
kctl->private_data = NULL;
}
if (kctl->private_value) {
@@ -2630,13 +2584,6 @@ static int parse_audio_selector_unit(struct mixer_build *state, int unitid,
const struct usbmix_name_map *map;
char **namelist;

- if (desc->bLength < 5 || !desc->bNrInPins ||
- desc->bLength < 5 + desc->bNrInPins) {
- usb_audio_err(state->chip,
- "invalid SELECTOR UNIT descriptor %d\n", unitid);
- return -EINVAL;
- }
-
for (i = 0; i < desc->bNrInPins; i++) {
err = parse_audio_unit(state, desc->baSourceID[i]);
if (err < 0)
@@ -2676,10 +2623,10 @@ static int parse_audio_selector_unit(struct mixer_build *state, int unitid,
break;
}

- namelist = kmalloc_array(desc->bNrInPins, sizeof(char *), GFP_KERNEL);
+ namelist = kcalloc(desc->bNrInPins, sizeof(char *), GFP_KERNEL);
if (!namelist) {
- kfree(cval);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto error_cval;
}
#define MAX_ITEM_NAME_LEN 64
for (i = 0; i < desc->bNrInPins; i++) {
@@ -2687,11 +2634,8 @@ static int parse_audio_selector_unit(struct mixer_build *state, int unitid,
len = 0;
namelist[i] = kmalloc(MAX_ITEM_NAME_LEN, GFP_KERNEL);
if (!namelist[i]) {
- while (i--)
- kfree(namelist[i]);
- kfree(namelist);
- kfree(cval);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto error_name;
}
len = check_mapped_selector_name(state, unitid, i, namelist[i],
MAX_ITEM_NAME_LEN);
@@ -2705,11 +2649,8 @@ static int parse_audio_selector_unit(struct mixer_build *state, int unitid,
kctl = snd_ctl_new1(&mixer_selectunit_ctl, cval);
if (! kctl) {
usb_audio_err(state->chip, "cannot malloc kcontrol\n");
- for (i = 0; i < desc->bNrInPins; i++)
- kfree(namelist[i]);
- kfree(namelist);
- kfree(cval);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto error_name;
}
kctl->private_value = (unsigned long)namelist;
kctl->private_free = usb_mixer_selector_elem_free;
@@ -2755,6 +2696,14 @@ static int parse_audio_selector_unit(struct mixer_build *state, int unitid,
usb_audio_dbg(state->chip, "[%d] SU [%s] items = %d\n",
cval->head.id, kctl->id.name, desc->bNrInPins);
return snd_usb_mixer_add_control(&cval->head, kctl);
+
+ error_name:
+ for (i = 0; i < desc->bNrInPins; i++)
+ kfree(namelist[i]);
+ kfree(namelist);
+ error_cval:
+ usb_mixer_elem_info_free(cval);
+ return err;
}

/*
@@ -2775,62 +2724,49 @@ static int parse_audio_unit(struct mixer_build *state, int unitid)
return -EINVAL;
}

- if (protocol == UAC_VERSION_1 || protocol == UAC_VERSION_2) {
- switch (p1[2]) {
- case UAC_INPUT_TERMINAL:
- return parse_audio_input_terminal(state, unitid, p1);
- case UAC_MIXER_UNIT:
- return parse_audio_mixer_unit(state, unitid, p1);
- case UAC2_CLOCK_SOURCE:
- return parse_clock_source_unit(state, unitid, p1);
- case UAC_SELECTOR_UNIT:
- case UAC2_CLOCK_SELECTOR:
- return parse_audio_selector_unit(state, unitid, p1);
- case UAC_FEATURE_UNIT:
- return parse_audio_feature_unit(state, unitid, p1);
- case UAC1_PROCESSING_UNIT:
- /* UAC2_EFFECT_UNIT has the same value */
- if (protocol == UAC_VERSION_1)
- return parse_audio_processing_unit(state, unitid, p1);
- else
- return 0; /* FIXME - effect units not implemented yet */
- case UAC1_EXTENSION_UNIT:
- /* UAC2_PROCESSING_UNIT_V2 has the same value */
- if (protocol == UAC_VERSION_1)
- return parse_audio_extension_unit(state, unitid, p1);
- else /* UAC_VERSION_2 */
- return parse_audio_processing_unit(state, unitid, p1);
- case UAC2_EXTENSION_UNIT_V2:
- return parse_audio_extension_unit(state, unitid, p1);
- default:
- usb_audio_err(state->chip,
- "unit %u: unexpected type 0x%02x\n", unitid, p1[2]);
- return -EINVAL;
- }
- } else { /* UAC_VERSION_3 */
- switch (p1[2]) {
- case UAC_INPUT_TERMINAL:
- return parse_audio_input_terminal(state, unitid, p1);
- case UAC3_MIXER_UNIT:
- return parse_audio_mixer_unit(state, unitid, p1);
- case UAC3_CLOCK_SOURCE:
- return parse_clock_source_unit(state, unitid, p1);
- case UAC3_SELECTOR_UNIT:
- case UAC3_CLOCK_SELECTOR:
- return parse_audio_selector_unit(state, unitid, p1);
- case UAC3_FEATURE_UNIT:
- return parse_audio_feature_unit(state, unitid, p1);
- case UAC3_EFFECT_UNIT:
- return 0; /* FIXME - effect units not implemented yet */
- case UAC3_PROCESSING_UNIT:
- return parse_audio_processing_unit(state, unitid, p1);
- case UAC3_EXTENSION_UNIT:
- return parse_audio_extension_unit(state, unitid, p1);
- default:
- usb_audio_err(state->chip,
- "unit %u: unexpected type 0x%02x\n", unitid, p1[2]);
- return -EINVAL;
- }
+ if (!snd_usb_validate_audio_desc(p1, protocol)) {
+ usb_audio_dbg(state->chip, "invalid unit %d\n", unitid);
+ return 0; /* skip invalid unit */
+ }
+
+ switch (PTYPE(protocol, p1[2])) {
+ case PTYPE(UAC_VERSION_1, UAC_INPUT_TERMINAL):
+ case PTYPE(UAC_VERSION_2, UAC_INPUT_TERMINAL):
+ case PTYPE(UAC_VERSION_3, UAC_INPUT_TERMINAL):
+ return parse_audio_input_terminal(state, unitid, p1);
+ case PTYPE(UAC_VERSION_1, UAC_MIXER_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC_MIXER_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_MIXER_UNIT):
+ return parse_audio_mixer_unit(state, unitid, p1);
+ case PTYPE(UAC_VERSION_2, UAC2_CLOCK_SOURCE):
+ case PTYPE(UAC_VERSION_3, UAC3_CLOCK_SOURCE):
+ return parse_clock_source_unit(state, unitid, p1);
+ case PTYPE(UAC_VERSION_1, UAC_SELECTOR_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC_SELECTOR_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_SELECTOR_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC2_CLOCK_SELECTOR):
+ case PTYPE(UAC_VERSION_3, UAC3_CLOCK_SELECTOR):
+ return parse_audio_selector_unit(state, unitid, p1);
+ case PTYPE(UAC_VERSION_1, UAC_FEATURE_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC_FEATURE_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_FEATURE_UNIT):
+ return parse_audio_feature_unit(state, unitid, p1);
+ case PTYPE(UAC_VERSION_1, UAC1_PROCESSING_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC2_PROCESSING_UNIT_V2):
+ case PTYPE(UAC_VERSION_3, UAC3_PROCESSING_UNIT):
+ return parse_audio_processing_unit(state, unitid, p1);
+ case PTYPE(UAC_VERSION_1, UAC1_EXTENSION_UNIT):
+ case PTYPE(UAC_VERSION_2, UAC2_EXTENSION_UNIT_V2):
+ case PTYPE(UAC_VERSION_3, UAC3_EXTENSION_UNIT):
+ return parse_audio_extension_unit(state, unitid, p1);
+ case PTYPE(UAC_VERSION_2, UAC2_EFFECT_UNIT):
+ case PTYPE(UAC_VERSION_3, UAC3_EFFECT_UNIT):
+ return 0; /* FIXME - effect units not implemented yet */
+ default:
+ usb_audio_err(state->chip,
+ "unit %u: unexpected type 0x%02x\n",
+ unitid, p1[2]);
+ return -EINVAL;
}
}

@@ -3145,11 +3081,12 @@ static int snd_usb_mixer_controls(struct usb_mixer_interface *mixer)
while ((p = snd_usb_find_csint_desc(mixer->hostif->extra,
mixer->hostif->extralen,
p, UAC_OUTPUT_TERMINAL)) != NULL) {
+ if (!snd_usb_validate_audio_desc(p, mixer->protocol))
+ continue; /* skip invalid descriptor */
+
if (mixer->protocol == UAC_VERSION_1) {
struct uac1_output_terminal_descriptor *desc = p;

- if (desc->bLength < sizeof(*desc))
- continue; /* invalid descriptor? */
/* mark terminal ID as visited */
set_bit(desc->bTerminalID, state.unitbitmap);
state.oterm.id = desc->bTerminalID;
@@ -3161,8 +3098,6 @@ static int snd_usb_mixer_controls(struct usb_mixer_interface *mixer)
} else if (mixer->protocol == UAC_VERSION_2) {
struct uac2_output_terminal_descriptor *desc = p;

- if (desc->bLength < sizeof(*desc))
- continue; /* invalid descriptor? */
/* mark terminal ID as visited */
set_bit(desc->bTerminalID, state.unitbitmap);
state.oterm.id = desc->bTerminalID;
@@ -3188,8 +3123,6 @@ static int snd_usb_mixer_controls(struct usb_mixer_interface *mixer)
} else { /* UAC_VERSION_3 */
struct uac3_output_terminal_descriptor *desc = p;

- if (desc->bLength < sizeof(*desc))
- continue; /* invalid descriptor? */
/* mark terminal ID as visited */
set_bit(desc->bTerminalID, state.unitbitmap);
state.oterm.id = desc->bTerminalID;
diff --git a/sound/usb/power.c b/sound/usb/power.c
index bd303a1ba1b7..606a2cb23eab 100644
--- a/sound/usb/power.c
+++ b/sound/usb/power.c
@@ -31,6 +31,8 @@ snd_usb_find_power_domain(struct usb_host_interface *ctrl_iface,
struct uac3_power_domain_descriptor *pd_desc = p;
int i;

+ if (!snd_usb_validate_audio_desc(p, UAC_VERSION_3))
+ continue;
for (i = 0; i < pd_desc->bNrEntities; i++) {
if (pd_desc->baEntityID[i] == id) {
pd->pd_id = pd_desc->bPowerDomainID;
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
index 059b70313f35..0bbe1201a6ac 100644
--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -248,6 +248,9 @@ static int create_yamaha_midi_quirk(struct snd_usb_audio *chip,
NULL, USB_MS_MIDI_OUT_JACK);
if (!injd && !outjd)
return -ENODEV;
+ if (!(injd && snd_usb_validate_midi_desc(injd)) ||
+ !(outjd && snd_usb_validate_midi_desc(outjd)))
+ return -ENODEV;
if (injd && (injd->bLength < 5 ||
(injd->bJackType != USB_MS_EMBEDDED &&
injd->bJackType != USB_MS_EXTERNAL)))
diff --git a/sound/usb/stream.c b/sound/usb/stream.c
index e852c7fd6109..a0649c8ae460 100644
--- a/sound/usb/stream.c
+++ b/sound/usb/stream.c
@@ -627,16 +627,14 @@ static int parse_uac_endpoint_attributes(struct snd_usb_audio *chip,
*/
static void *
snd_usb_find_input_terminal_descriptor(struct usb_host_interface *ctrl_iface,
- int terminal_id, bool uac23)
+ int terminal_id, int protocol)
{
struct uac2_input_terminal_descriptor *term = NULL;
- size_t minlen = uac23 ? sizeof(struct uac2_input_terminal_descriptor) :
- sizeof(struct uac_input_terminal_descriptor);

while ((term = snd_usb_find_csint_desc(ctrl_iface->extra,
ctrl_iface->extralen,
term, UAC_INPUT_TERMINAL))) {
- if (term->bLength < minlen)
+ if (!snd_usb_validate_audio_desc(term, protocol))
continue;
if (term->bTerminalID == terminal_id)
return term;
@@ -647,7 +645,7 @@ snd_usb_find_input_terminal_descriptor(struct usb_host_interface *ctrl_iface,

static void *
snd_usb_find_output_terminal_descriptor(struct usb_host_interface *ctrl_iface,
- int terminal_id)
+ int terminal_id, int protocol)
{
/* OK to use with both UAC2 and UAC3 */
struct uac2_output_terminal_descriptor *term = NULL;
@@ -655,8 +653,9 @@ snd_usb_find_output_terminal_descriptor(struct usb_host_interface *ctrl_iface,
while ((term = snd_usb_find_csint_desc(ctrl_iface->extra,
ctrl_iface->extralen,
term, UAC_OUTPUT_TERMINAL))) {
- if (term->bLength >= sizeof(*term) &&
- term->bTerminalID == terminal_id)
+ if (!snd_usb_validate_audio_desc(term, protocol))
+ continue;
+ if (term->bTerminalID == terminal_id)
return term;
}

@@ -731,7 +730,7 @@ snd_usb_get_audioformat_uac12(struct snd_usb_audio *chip,

iterm = snd_usb_find_input_terminal_descriptor(chip->ctrl_intf,
as->bTerminalLink,
- false);
+ protocol);
if (iterm) {
num_channels = iterm->bNrChannels;
chconfig = le16_to_cpu(iterm->wChannelConfig);
@@ -767,7 +766,7 @@ snd_usb_get_audioformat_uac12(struct snd_usb_audio *chip,
*/
input_term = snd_usb_find_input_terminal_descriptor(chip->ctrl_intf,
as->bTerminalLink,
- true);
+ protocol);
if (input_term) {
clock = input_term->bCSourceID;
if (!chconfig && (num_channels == input_term->bNrChannels))
@@ -776,7 +775,8 @@ snd_usb_get_audioformat_uac12(struct snd_usb_audio *chip,
}

output_term = snd_usb_find_output_terminal_descriptor(chip->ctrl_intf,
- as->bTerminalLink);
+ as->bTerminalLink,
+ protocol);
if (output_term) {
clock = output_term->bCSourceID;
goto found_clock;
@@ -1002,14 +1002,15 @@ snd_usb_get_audioformat_uac3(struct snd_usb_audio *chip,
*/
input_term = snd_usb_find_input_terminal_descriptor(chip->ctrl_intf,
as->bTerminalLink,
- true);
+ UAC_VERSION_3);
if (input_term) {
clock = input_term->bCSourceID;
goto found_clock;
}

output_term = snd_usb_find_output_terminal_descriptor(chip->ctrl_intf,
- as->bTerminalLink);
+ as->bTerminalLink,
+ UAC_VERSION_3);
if (output_term) {
clock = output_term->bCSourceID;
goto found_clock;
diff --git a/sound/usb/validate.c b/sound/usb/validate.c
new file mode 100644
index 000000000000..a5e584b60dcd
--- /dev/null
+++ b/sound/usb/validate.c
@@ -0,0 +1,332 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+//
+// Validation of USB-audio class descriptors
+//
+
+#include <linux/init.h>
+#include <linux/usb.h>
+#include <linux/usb/audio.h>
+#include <linux/usb/audio-v2.h>
+#include <linux/usb/audio-v3.h>
+#include <linux/usb/midi.h>
+#include "usbaudio.h"
+#include "helper.h"
+
+struct usb_desc_validator {
+ unsigned char protocol;
+ unsigned char type;
+ bool (*func)(const void *p, const struct usb_desc_validator *v);
+ size_t size;
+};
+
+#define UAC_VERSION_ALL (unsigned char)(-1)
+
+/* UAC1 only */
+static bool validate_uac1_header(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct uac1_ac_header_descriptor *d = p;
+
+ return d->bLength >= sizeof(*d) &&
+ d->bLength >= sizeof(*d) + d->bInCollection;
+}
+
+/* for mixer unit; covering all UACs */
+static bool validate_mixer_unit(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct uac_mixer_unit_descriptor *d = p;
+ size_t len;
+
+ if (d->bLength < sizeof(*d) || !d->bNrInPins)
+ return false;
+ len = sizeof(*d) + d->bNrInPins;
+ /* We can't determine the bitmap size only from this unit descriptor,
+ * so just check with the remaining length.
+ * The actual bitmap is checked at mixer unit parser.
+ */
+ switch (v->protocol) {
+ case UAC_VERSION_1:
+ default:
+ len += 2 + 1; /* wChannelConfig, iChannelNames */
+ /* bmControls[n*m] */
+ len += 1; /* iMixer */
+ break;
+ case UAC_VERSION_2:
+ len += 4 + 1; /* bmChannelConfig, iChannelNames */
+ /* bmMixerControls[n*m] */
+ len += 1 + 1; /* bmControls, iMixer */
+ break;
+ case UAC_VERSION_3:
+ len += 2; /* wClusterDescrID */
+ /* bmMixerControls[n*m] */
+ break;
+ }
+ return d->bLength >= len;
+}
+
+/* both for processing and extension units; covering all UACs */
+static bool validate_processing_unit(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct uac_processing_unit_descriptor *d = p;
+ const unsigned char *hdr = p;
+ size_t len, m;
+
+ if (d->bLength < sizeof(*d))
+ return false;
+ len = sizeof(*d) + d->bNrInPins;
+ if (d->bLength < len)
+ return false;
+ switch (v->protocol) {
+ case UAC_VERSION_1:
+ default:
+ /* bNrChannels, wChannelConfig, iChannelNames, bControlSize */
+ len += 1 + 2 + 1 + 1;
+ if (d->bLength < len) /* bControlSize */
+ return false;
+ m = hdr[len];
+ len += 1 + m + 1; /* bControlSize, bmControls, iProcessing */
+ break;
+ case UAC_VERSION_2:
+ /* bNrChannels, bmChannelConfig, iChannelNames */
+ len += 1 + 4 + 1;
+ if (v->type == UAC2_PROCESSING_UNIT_V2)
+ len += 2; /* bmControls -- 2 bytes for PU */
+ else
+ len += 1; /* bmControls -- 1 byte for EU */
+ len += 1; /* iProcessing */
+ break;
+ case UAC_VERSION_3:
+ /* wProcessingDescrStr, bmControls */
+ len += 2 + 4;
+ break;
+ }
+ if (d->bLength < len)
+ return false;
+
+ switch (v->protocol) {
+ case UAC_VERSION_1:
+ default:
+ if (v->type == UAC1_EXTENSION_UNIT)
+ return true; /* OK */
+ switch (d->wProcessType) {
+ case UAC_PROCESS_UP_DOWNMIX:
+ case UAC_PROCESS_DOLBY_PROLOGIC:
+ if (d->bLength < len + 1) /* bNrModes */
+ return false;
+ m = hdr[len];
+ len += 1 + m * 2; /* bNrModes, waModes(n) */
+ break;
+ default:
+ break;
+ }
+ break;
+ case UAC_VERSION_2:
+ if (v->type == UAC2_EXTENSION_UNIT_V2)
+ return true; /* OK */
+ switch (d->wProcessType) {
+ case UAC2_PROCESS_UP_DOWNMIX:
+ case UAC2_PROCESS_DOLBY_PROLOCIC: /* SiC! */
+ if (d->bLength < len + 1) /* bNrModes */
+ return false;
+ m = hdr[len];
+ len += 1 + m * 4; /* bNrModes, daModes(n) */
+ break;
+ default:
+ break;
+ }
+ break;
+ case UAC_VERSION_3:
+ if (v->type == UAC3_EXTENSION_UNIT) {
+ len += 2; /* wClusterDescrID */
+ break;
+ }
+ switch (d->wProcessType) {
+ case UAC3_PROCESS_UP_DOWNMIX:
+ if (d->bLength < len + 1) /* bNrModes */
+ return false;
+ m = hdr[len];
+ len += 1 + m * 2; /* bNrModes, waClusterDescrID(n) */
+ break;
+ case UAC3_PROCESS_MULTI_FUNCTION:
+ len += 2 + 4; /* wClusterDescrID, bmAlgorighms */
+ break;
+ default:
+ break;
+ }
+ break;
+ }
+ if (d->bLength < len)
+ return false;
+
+ return true;
+}
+
+/* both for selector and clock selector units; covering all UACs */
+static bool validate_selector_unit(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct uac_selector_unit_descriptor *d = p;
+ size_t len;
+
+ if (d->bLength < sizeof(*d))
+ return false;
+ len = sizeof(*d) + d->bNrInPins;
+ switch (v->protocol) {
+ case UAC_VERSION_1:
+ default:
+ len += 1; /* iSelector */
+ break;
+ case UAC_VERSION_2:
+ len += 1 + 1; /* bmControls, iSelector */
+ break;
+ case UAC_VERSION_3:
+ len += 4 + 2; /* bmControls, wSelectorDescrStr */
+ break;
+ }
+ return d->bLength >= len;
+}
+
+static bool validate_uac1_feature_unit(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct uac_feature_unit_descriptor *d = p;
+
+ if (d->bLength < sizeof(*d) || !d->bControlSize)
+ return false;
+ /* at least bmaControls(0) for master channel + iFeature */
+ return d->bLength >= sizeof(*d) + d->bControlSize + 1;
+}
+
+static bool validate_uac2_feature_unit(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct uac2_feature_unit_descriptor *d = p;
+
+ if (d->bLength < sizeof(*d))
+ return false;
+ /* at least bmaControls(0) for master channel + iFeature */
+ return d->bLength >= sizeof(*d) + 4 + 1;
+}
+
+static bool validate_uac3_feature_unit(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct uac3_feature_unit_descriptor *d = p;
+
+ if (d->bLength < sizeof(*d))
+ return false;
+ /* at least bmaControls(0) for master channel + wFeatureDescrStr */
+ return d->bLength >= sizeof(*d) + 4 + 2;
+}
+
+static bool validate_midi_out_jack(const void *p,
+ const struct usb_desc_validator *v)
+{
+ const struct usb_midi_out_jack_descriptor *d = p;
+
+ return d->bLength >= sizeof(*d) &&
+ d->bLength >= sizeof(*d) + d->bNrInputPins * 2;
+}
+
+#define FIXED(p, t, s) { .protocol = (p), .type = (t), .size = sizeof(s) }
+#define FUNC(p, t, f) { .protocol = (p), .type = (t), .func = (f) }
+
+static struct usb_desc_validator audio_validators[] = {
+ /* UAC1 */
+ FUNC(UAC_VERSION_1, UAC_HEADER, validate_uac1_header),
+ FIXED(UAC_VERSION_1, UAC_INPUT_TERMINAL,
+ struct uac_input_terminal_descriptor),
+ FIXED(UAC_VERSION_1, UAC_OUTPUT_TERMINAL,
+ struct uac1_output_terminal_descriptor),
+ FUNC(UAC_VERSION_1, UAC_MIXER_UNIT, validate_mixer_unit),
+ FUNC(UAC_VERSION_1, UAC_SELECTOR_UNIT, validate_selector_unit),
+ FUNC(UAC_VERSION_1, UAC_FEATURE_UNIT, validate_uac1_feature_unit),
+ FUNC(UAC_VERSION_1, UAC1_PROCESSING_UNIT, validate_processing_unit),
+ FUNC(UAC_VERSION_1, UAC1_EXTENSION_UNIT, validate_processing_unit),
+
+ /* UAC2 */
+ FIXED(UAC_VERSION_2, UAC_HEADER, struct uac2_ac_header_descriptor),
+ FIXED(UAC_VERSION_2, UAC_INPUT_TERMINAL,
+ struct uac2_input_terminal_descriptor),
+ FIXED(UAC_VERSION_2, UAC_OUTPUT_TERMINAL,
+ struct uac2_output_terminal_descriptor),
+ FUNC(UAC_VERSION_2, UAC_MIXER_UNIT, validate_mixer_unit),
+ FUNC(UAC_VERSION_2, UAC_SELECTOR_UNIT, validate_selector_unit),
+ FUNC(UAC_VERSION_2, UAC_FEATURE_UNIT, validate_uac2_feature_unit),
+ /* UAC_VERSION_2, UAC2_EFFECT_UNIT: not implemented yet */
+ FUNC(UAC_VERSION_2, UAC2_PROCESSING_UNIT_V2, validate_processing_unit),
+ FUNC(UAC_VERSION_2, UAC2_EXTENSION_UNIT_V2, validate_processing_unit),
+ FIXED(UAC_VERSION_2, UAC2_CLOCK_SOURCE,
+ struct uac_clock_source_descriptor),
+ FUNC(UAC_VERSION_2, UAC2_CLOCK_SELECTOR, validate_selector_unit),
+ FIXED(UAC_VERSION_2, UAC2_CLOCK_MULTIPLIER,
+ struct uac_clock_multiplier_descriptor),
+ /* UAC_VERSION_2, UAC2_SAMPLE_RATE_CONVERTER: not implemented yet */
+
+ /* UAC3 */
+ FIXED(UAC_VERSION_2, UAC_HEADER, struct uac3_ac_header_descriptor),
+ FIXED(UAC_VERSION_3, UAC_INPUT_TERMINAL,
+ struct uac3_input_terminal_descriptor),
+ FIXED(UAC_VERSION_3, UAC_OUTPUT_TERMINAL,
+ struct uac3_output_terminal_descriptor),
+ /* UAC_VERSION_3, UAC3_EXTENDED_TERMINAL: not implemented yet */
+ FUNC(UAC_VERSION_3, UAC3_MIXER_UNIT, validate_mixer_unit),
+ FUNC(UAC_VERSION_3, UAC3_SELECTOR_UNIT, validate_selector_unit),
+ FUNC(UAC_VERSION_3, UAC_FEATURE_UNIT, validate_uac3_feature_unit),
+ /* UAC_VERSION_3, UAC3_EFFECT_UNIT: not implemented yet */
+ FUNC(UAC_VERSION_3, UAC3_PROCESSING_UNIT, validate_processing_unit),
+ FUNC(UAC_VERSION_3, UAC3_EXTENSION_UNIT, validate_processing_unit),
+ FIXED(UAC_VERSION_3, UAC3_CLOCK_SOURCE,
+ struct uac3_clock_source_descriptor),
+ FUNC(UAC_VERSION_3, UAC3_CLOCK_SELECTOR, validate_selector_unit),
+ FIXED(UAC_VERSION_3, UAC3_CLOCK_MULTIPLIER,
+ struct uac3_clock_multiplier_descriptor),
+ /* UAC_VERSION_3, UAC3_SAMPLE_RATE_CONVERTER: not implemented yet */
+ /* UAC_VERSION_3, UAC3_CONNECTORS: not implemented yet */
+ { } /* terminator */
+};
+
+static struct usb_desc_validator midi_validators[] = {
+ FIXED(UAC_VERSION_ALL, USB_MS_HEADER,
+ struct usb_ms_header_descriptor),
+ FIXED(UAC_VERSION_ALL, USB_MS_MIDI_IN_JACK,
+ struct usb_midi_in_jack_descriptor),
+ FUNC(UAC_VERSION_ALL, USB_MS_MIDI_OUT_JACK,
+ validate_midi_out_jack),
+ { } /* terminator */
+};
+
+
+/* Validate the given unit descriptor, return true if it's OK */
+static bool validate_desc(unsigned char *hdr, int protocol,
+ const struct usb_desc_validator *v)
+{
+ if (hdr[1] != USB_DT_CS_INTERFACE)
+ return true; /* don't care */
+
+ for (; v->type; v++) {
+ if (v->type == hdr[2] &&
+ (v->protocol == UAC_VERSION_ALL ||
+ v->protocol == protocol)) {
+ if (v->func)
+ return v->func(hdr, v);
+ /* check for the fixed size */
+ return hdr[0] >= v->size;
+ }
+ }
+
+ return true; /* not matching, skip validation */
+}
+
+bool snd_usb_validate_audio_desc(void *p, int protocol)
+{
+ return validate_desc(p, protocol, audio_validators);
+}
+
+bool snd_usb_validate_midi_desc(void *p)
+{
+ return validate_desc(p, UAC_VERSION_1, midi_validators);
+}
+
diff --git a/tools/gpio/Makefile b/tools/gpio/Makefile
index 6ecdd1067826..1178d302757e 100644
--- a/tools/gpio/Makefile
+++ b/tools/gpio/Makefile
@@ -3,7 +3,11 @@ include ../scripts/Makefile.include

bindir ?= /usr/bin

-ifeq ($(srctree),)
+# This will work when gpio is built in tools env. where srctree
+# isn't set and when invoked from selftests build, where srctree
+# is set to ".". building_out_of_srctree is undefined for in srctree
+# builds
+ifndef building_out_of_srctree
srctree := $(patsubst %/,%,$(dir $(CURDIR)))
srctree := $(patsubst %/,%,$(dir $(srctree)))
endif
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6bd270a1e93e..7ec174361c54 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1618,7 +1618,7 @@ int hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
return 0;
}

-static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
+static int64_t hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
{
struct hists *hists = a->hists;
struct perf_hpp_fmt *fmt;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index f18113581cf0..b7691d4af521 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -637,7 +637,7 @@ bool map_groups__empty(struct map_groups *mg)

struct map_groups *map_groups__new(struct machine *machine)
{
- struct map_groups *mg = malloc(sizeof(*mg));
+ struct map_groups *mg = zalloc(sizeof(*mg));

if (mg != NULL)
map_groups__init(mg, machine);
diff --git a/tools/testing/selftests/bpf/test_tc_edt.sh b/tools/testing/selftests/bpf/test_tc_edt.sh
index f38567ef694b..daa7d1b8d309 100755
--- a/tools/testing/selftests/bpf/test_tc_edt.sh
+++ b/tools/testing/selftests/bpf/test_tc_edt.sh
@@ -59,7 +59,7 @@ ip netns exec ${NS_SRC} tc filter add dev veth_src egress \

# start the listener
ip netns exec ${NS_DST} bash -c \
- "nc -4 -l -s ${IP_DST} -p 9000 >/dev/null &"
+ "nc -4 -l -p 9000 >/dev/null &"
declare -i NC_PID=$!
sleep 1

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 4c285b6e1db8..1c8f194d6556 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -898,6 +898,114 @@ TEST_F(tls, nonblocking)
}
}

+static void
+test_mutliproc(struct __test_metadata *_metadata, struct _test_data_tls *self,
+ bool sendpg, unsigned int n_readers, unsigned int n_writers)
+{
+ const unsigned int n_children = n_readers + n_writers;
+ const size_t data = 6 * 1000 * 1000;
+ const size_t file_sz = data / 100;
+ size_t read_bias, write_bias;
+ int i, fd, child_id;
+ char buf[file_sz];
+ pid_t pid;
+
+ /* Only allow multiples for simplicity */
+ ASSERT_EQ(!(n_readers % n_writers) || !(n_writers % n_readers), true);
+ read_bias = n_writers / n_readers ?: 1;
+ write_bias = n_readers / n_writers ?: 1;
+
+ /* prep a file to send */
+ fd = open("/tmp/", O_TMPFILE | O_RDWR, 0600);
+ ASSERT_GE(fd, 0);
+
+ memset(buf, 0xac, file_sz);
+ ASSERT_EQ(write(fd, buf, file_sz), file_sz);
+
+ /* spawn children */
+ for (child_id = 0; child_id < n_children; child_id++) {
+ pid = fork();
+ ASSERT_NE(pid, -1);
+ if (!pid)
+ break;
+ }
+
+ /* parent waits for all children */
+ if (pid) {
+ for (i = 0; i < n_children; i++) {
+ int status;
+
+ wait(&status);
+ EXPECT_EQ(status, 0);
+ }
+
+ return;
+ }
+
+ /* Split threads for reading and writing */
+ if (child_id < n_readers) {
+ size_t left = data * read_bias;
+ char rb[8001];
+
+ while (left) {
+ int res;
+
+ res = recv(self->cfd, rb,
+ left > sizeof(rb) ? sizeof(rb) : left, 0);
+
+ EXPECT_GE(res, 0);
+ left -= res;
+ }
+ } else {
+ size_t left = data * write_bias;
+
+ while (left) {
+ int res;
+
+ ASSERT_EQ(lseek(fd, 0, SEEK_SET), 0);
+ if (sendpg)
+ res = sendfile(self->fd, fd, NULL,
+ left > file_sz ? file_sz : left);
+ else
+ res = send(self->fd, buf,
+ left > file_sz ? file_sz : left, 0);
+
+ EXPECT_GE(res, 0);
+ left -= res;
+ }
+ }
+}
+
+TEST_F(tls, mutliproc_even)
+{
+ test_mutliproc(_metadata, self, false, 6, 6);
+}
+
+TEST_F(tls, mutliproc_readers)
+{
+ test_mutliproc(_metadata, self, false, 4, 12);
+}
+
+TEST_F(tls, mutliproc_writers)
+{
+ test_mutliproc(_metadata, self, false, 10, 2);
+}
+
+TEST_F(tls, mutliproc_sendpage_even)
+{
+ test_mutliproc(_metadata, self, true, 6, 6);
+}
+
+TEST_F(tls, mutliproc_sendpage_readers)
+{
+ test_mutliproc(_metadata, self, true, 4, 12);
+}
+
+TEST_F(tls, mutliproc_sendpage_writers)
+{
+ test_mutliproc(_metadata, self, true, 10, 2);
+}
+
TEST_F(tls, control_msg)
{
if (self->notls)
diff --git a/tools/usb/usbip/libsrc/usbip_device_driver.c b/tools/usb/usbip/libsrc/usbip_device_driver.c
index 5a3726eb44ab..b237a43e6299 100644
--- a/tools/usb/usbip/libsrc/usbip_device_driver.c
+++ b/tools/usb/usbip/libsrc/usbip_device_driver.c
@@ -69,7 +69,7 @@ int read_usb_vudc_device(struct udev_device *sdev, struct usbip_usb_device *dev)
FILE *fd = NULL;
struct udev_device *plat;
const char *speed;
- int ret = 0;
+ size_t ret;

plat = udev_device_get_parent(sdev);
path = udev_device_get_syspath(plat);
@@ -79,8 +79,10 @@ int read_usb_vudc_device(struct udev_device *sdev, struct usbip_usb_device *dev)
if (!fd)
return -1;
ret = fread((char *) &descr, sizeof(descr), 1, fd);
- if (ret < 0)
+ if (ret != 1) {
+ err("Cannot read vudc device descr file: %s", strerror(errno));
goto err;
+ }
fclose(fd);

copy_descr_attr(dev, &descr, bDeviceClass);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c6a91b044d8d..9d4e03eddccf 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -50,6 +50,7 @@
#include <linux/bsearch.h>
#include <linux/io.h>
#include <linux/lockdep.h>
+#include <linux/kthread.h>

#include <asm/processor.h>
#include <asm/ioctl.h>
@@ -617,13 +618,31 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, int fd)

stat_data->kvm = kvm;
stat_data->offset = p->offset;
+ stat_data->mode = p->mode ? p->mode : 0644;
kvm->debugfs_stat_data[p - debugfs_entries] = stat_data;
- debugfs_create_file(p->name, 0644, kvm->debugfs_dentry,
+ debugfs_create_file(p->name, stat_data->mode, kvm->debugfs_dentry,
stat_data, stat_fops_per_vm[p->kind]);
}
return 0;
}

+/*
+ * Called after the VM is otherwise initialized, but just before adding it to
+ * the vm_list.
+ */
+int __weak kvm_arch_post_init_vm(struct kvm *kvm)
+{
+ return 0;
+}
+
+/*
+ * Called just after removing the VM from the vm_list, but before doing any
+ * other destruction.
+ */
+void __weak kvm_arch_pre_destroy_vm(struct kvm *kvm)
+{
+}
+
static struct kvm *kvm_create_vm(unsigned long type)
{
int r, i;
@@ -674,10 +693,14 @@ static struct kvm *kvm_create_vm(unsigned long type)
rcu_assign_pointer(kvm->buses[i],
kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL_ACCOUNT));
if (!kvm->buses[i])
- goto out_err;
+ goto out_err_no_mmu_notifier;
}

r = kvm_init_mmu_notifier(kvm);
+ if (r)
+ goto out_err_no_mmu_notifier;
+
+ r = kvm_arch_post_init_vm(kvm);
if (r)
goto out_err;

@@ -690,6 +713,11 @@ static struct kvm *kvm_create_vm(unsigned long type)
return kvm;

out_err:
+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+ if (kvm->mmu_notifier.ops)
+ mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
+#endif
+out_err_no_mmu_notifier:
cleanup_srcu_struct(&kvm->irq_srcu);
out_err_no_irq_srcu:
cleanup_srcu_struct(&kvm->srcu);
@@ -732,6 +760,8 @@ static void kvm_destroy_vm(struct kvm *kvm)
mutex_lock(&kvm_lock);
list_del(&kvm->vm_list);
mutex_unlock(&kvm_lock);
+ kvm_arch_pre_destroy_vm(kvm);
+
kvm_free_irq_routing(kvm);
for (i = 0; i < KVM_NR_BUSES; i++) {
struct kvm_io_bus *bus = kvm_get_bus(kvm, i);
@@ -3930,7 +3960,9 @@ static int kvm_debugfs_open(struct inode *inode, struct file *file,
if (!refcount_inc_not_zero(&stat_data->kvm->users_count))
return -ENOENT;

- if (simple_attr_open(inode, file, get, set, fmt)) {
+ if (simple_attr_open(inode, file, get,
+ stat_data->mode & S_IWUGO ? set : NULL,
+ fmt)) {
kvm_put_kvm(stat_data->kvm);
return -ENOMEM;
}
@@ -4178,7 +4210,8 @@ static void kvm_init_debug(void)

kvm_debugfs_num_entries = 0;
for (p = debugfs_entries; p->name; ++p, kvm_debugfs_num_entries++) {
- debugfs_create_file(p->name, 0644, kvm_debugfs_dir,
+ int mode = p->mode ? p->mode : 0644;
+ debugfs_create_file(p->name, mode, kvm_debugfs_dir,
(void *)(long)p->offset,
stat_fops[p->kind]);
}
@@ -4361,3 +4394,86 @@ void kvm_exit(void)
kvm_vfio_ops_exit();
}
EXPORT_SYMBOL_GPL(kvm_exit);
+
+struct kvm_vm_worker_thread_context {
+ struct kvm *kvm;
+ struct task_struct *parent;
+ struct completion init_done;
+ kvm_vm_thread_fn_t thread_fn;
+ uintptr_t data;
+ int err;
+};
+
+static int kvm_vm_worker_thread(void *context)
+{
+ /*
+ * The init_context is allocated on the stack of the parent thread, so
+ * we have to locally copy anything that is needed beyond initialization
+ */
+ struct kvm_vm_worker_thread_context *init_context = context;
+ struct kvm *kvm = init_context->kvm;
+ kvm_vm_thread_fn_t thread_fn = init_context->thread_fn;
+ uintptr_t data = init_context->data;
+ int err;
+
+ err = kthread_park(current);
+ /* kthread_park(current) is never supposed to return an error */
+ WARN_ON(err != 0);
+ if (err)
+ goto init_complete;
+
+ err = cgroup_attach_task_all(init_context->parent, current);
+ if (err) {
+ kvm_err("%s: cgroup_attach_task_all failed with err %d\n",
+ __func__, err);
+ goto init_complete;
+ }
+
+ set_user_nice(current, task_nice(init_context->parent));
+
+init_complete:
+ init_context->err = err;
+ complete(&init_context->init_done);
+ init_context = NULL;
+
+ if (err)
+ return err;
+
+ /* Wait to be woken up by the spawner before proceeding. */
+ kthread_parkme();
+
+ if (!kthread_should_stop())
+ err = thread_fn(kvm, data);
+
+ return err;
+}
+
+int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn,
+ uintptr_t data, const char *name,
+ struct task_struct **thread_ptr)
+{
+ struct kvm_vm_worker_thread_context init_context = {};
+ struct task_struct *thread;
+
+ *thread_ptr = NULL;
+ init_context.kvm = kvm;
+ init_context.parent = current;
+ init_context.thread_fn = thread_fn;
+ init_context.data = data;
+ init_completion(&init_context.init_done);
+
+ thread = kthread_run(kvm_vm_worker_thread, &init_context,
+ "%s-%d", name, task_pid_nr(current));
+ if (IS_ERR(thread))
+ return PTR_ERR(thread);
+
+ /* kthread_run is never supposed to return NULL */
+ WARN_ON(thread == NULL);
+
+ wait_for_completion(&init_context.init_done);
+
+ if (!init_context.err)
+ *thread_ptr = thread;
+
+ return init_context.err;
+}