Re: [PATCH] crypto: stm32 - remove crc32 and crc32c support

From: Eric Biggers
Date: Sun Jun 01 2025 - 15:02:13 EST


On Sun, Jun 01, 2025 at 11:00:31AM +0200, Ard Biesheuvel wrote:
> (cc Arnd)
>
> Hi Eric,
>
> On Sat, 31 May 2025 at 22:02, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> >
> > From: Eric Biggers <ebiggers@xxxxxxxxxx>
> >
> > Remove the crc32 and crc32c support from the stm32 driver, since there's
> > very little chance that it still has any relevance:
> >
> > - Now that nearly all users of crc32 and crc32c in the kernel use the
> > library functions instead of the crypto interface, and this driver
> > only supports the crypto interface, there are very few cases in which
> > this driver could plausibly still be used.
> >
> > - While the commit that added this driver quoted up to a 900% speedup
> > over crc32c-generic, this was likely a best-case scenario with long
> > lengths. Short lengths are commonly used, and this driver has a lot
> > of fixed overhead. It likely performs poorly on short lengths.
> >
> > - At the time that microbenchmark was done, there were multiple generic
> > implementations of CRC32C, and it's unclear which was used. It could
> > have been the bit-at-a-time one, which is really slow.
> >
> > - While the microbenchmark appears to have been done on an ARM Cortex-M7
> > CPU that doesn't support CRC or PMULL instructions, it's now 8 years
> > later and more CPUs have those instructions.
> >
>
> This IP appears to be used on two different SOCs:
> - one based on Cortex-M7, which is based on the ARM M (embedded)
> profile, whose ISA does not include CRC instructions, and does not
> have SIMD at all
> - one based on Cortex-A7, which does not implement CRC instructions
>
> What other SOCs based on other architecture revisions may or may not
> implement today is kind of irrelevant here: the question is whether we
> need to keep supporting h/w accelerated CRC on these particular
> platforms.
>
> I'd say M7 is dead as a doornail, so we can disregard that one, along
> with the speedup claim. The question is whether this IP is useful on
> A7 to anyone still running recent kernels on them.

Sure, IPs get reused and can show up in newer SoCs though. My observation was
just meant to emphasize that that's unlikely here.

> As you say, there are very few users left, as they have all moved to
> the library API. Combined with the fact that this is an unusual,
> synchronous, MMIO-based engine that needs to rely on spinlocks to
> protect its critical sections, and fall back to the software
> implementation if, e.g., crc32() is called from softirq context while
> an operation is in progress in task context, I tend to agree that we'd
> be better off just removing it.
>
> (note that even with two available CRC engines that could
> theoretically serve task context and softirq context in parallel, the
> existing logic managing the linked list appears flawed and may result
> in the driver grabbing the CRC engine that is locked and falling back
> to software while an unlocked one might be available)
>
> > - Originally this driver was completely broken: it calculated the wrong
> > CRC values, it wasn't thread-safe, it slept in atomic sections, and it
> > used the wrong context format. Use with ext4 or f2fs immediately
> > crashed the kernel with a BUG_ON. That strongly suggests that the
> > submission was based purely on the microbenchmark and not a real use
> > case. Furthermore, the fixes for these issues added significant
> > additional overhead to the driver, such as a spinlock. That calls
> > into question the possible performance benefit.
> >
>
> I wouldn't qualify an [uncontended] spinlock as 'significant
> additional overhead', tbh.

It's significant on short lengths, which are common. ext4 metadata_csum passes
a lot of 4-byte lengths to crc32c(), for example. Fiddling with a spin lock,
linked list, runtime pm, and hardware registers to process 4 bytes is an awful
lot of work for something that could have just been a few table lookups.
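
To make "a few table lookups" concrete, here is a minimal userspace sketch of a
byte-at-a-time, table-driven CRC32C. It is for illustration only and is not the
kernel's library code; the names crc32c_table, crc32c_init_table() and
crc32c_sw() are made up for this example. Like the kernel's crc32c(), the
update function does no pre- or post-inversion, so the caller supplies the seed
and inverts the result if the standard CRC32C value is wanted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; reflected Castagnoli polynomial for CRC32C. */
#define CRC32C_POLY_LE 0x82F63B78u

static uint32_t crc32c_table[256];

/* Build the 256-entry lookup table once, one entry per byte value. */
static void crc32c_init_table(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t crc = i;

		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY_LE : 0);
		crc32c_table[i] = crc;
	}
}

/*
 * Update 'crc' with 'len' bytes: one table lookup, one shift and two XORs
 * per byte.  A 4-byte input therefore costs four lookups, with no locking,
 * no list walk and no MMIO.
 */
static uint32_t crc32c_sw(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ crc32c_table[(crc ^ *p++) & 0xff];
	return crc;
}
```

With a seed of ~0 and a final inversion, this reproduces the standard CRC32C
check value 0xE3069283 for the ASCII string "123456789".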

> > - The driver may still be broken. For example, its update function can
> > fail. Many users are not prepared to handle errors. Unlike the
> > software CRC code there are also different code paths for serial vs.
> > parallel usage, which are unlikely to be tested. The software CRC
> > code is much less error-prone and much better tested.
> >
>
> The only failure mode appears to be that the devices may have been
> removed while the shash tfm is still in use. In this case, the driver
> should just use the existing software fallback rather than give up.

Right, that specific bug might be harmless, but it's still an example of an
issue that is not possible with software CRC code.

> > Support for this hardware could be added to arch/arm/lib/crc32.c in the
> > unlikely event that it would actually be useful. But this would need to
> > come with evidence that it's actually worthwhile, along with QEMU
> > support so that the driver can be tested.
> >
>
> I think it is fine to remove this driver solely on the basis that the
> crc32(c) shashes are no longer used (I could only find crc32c being
> used in btrfs, but that doesn't seem like a use case worth caring
> about on this hardware), and we can drop most of the motivation in the
> commit log, and summarize it along the lines of 'driver needs work but
> what's the point?'
>
>
> Acked-by: Ard Biesheuvel <ardb@xxxxxxxxxx>

Sure, but an alternate approach would be to wire up this driver to
arch/arm/lib/crc32.c. I just wanted to elaborate on why this driver seems
pretty useless, possibly even harmful, and is better off just removed for now.

- Eric