Re: [RFC PATCH 0/6] Deduplicating RISCV cmpxchg.h macros

From: Conor Dooley
Date: Tue Mar 21 2023 - 15:49:03 EST


On Tue, Mar 21, 2023 at 02:01:59PM -0300, Leonardo Bras Soares Passos wrote:
> On Tue, Mar 21, 2023 at 4:49 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> > On Tue, Mar 21, 2023 at 03:30:35AM -0300, Leonardo Brás wrote:
> > > On Sun, 2023-03-19 at 20:35 +0000, Conor Dooley wrote:
> > > > On Sat, Mar 18, 2023 at 05:00:54AM -0300, Leonardo Bras wrote:
> > > > > While studying riscv's cmpxchg.h file, I got really interested in
> > > > > understanding how RISCV asm implemented the different versions of
> > > > > {cmp,}xchg.
> > > > >
> > > > > When I understood the pattern, it made sense for me to remove the
> > > > > duplications and create macros to make it easier to understand what exactly
> > > > > changes between the versions: Instruction sufixes & barriers.
> > > > >
> > > > > I split those changes in 3 levels for each cmpxchg and xchg, resulting a
> > > > > total of 6 patches. I did this so it becomes easier to review and remove
> > > > > the last levels if desired, but I have no issue squashing them if it's
> > > > > better.
> > > > >
> > > > > Please provide comments.
> > > > >
> > > > > Thanks!
> > > > > Leo
> > > > >
> > > > > Leonardo Bras (6):
> > > > > riscv/cmpxchg: Deduplicate cmpxchg() asm functions
> > > > > riscv/cmpxchg: Deduplicate cmpxchg() macros
> > > > > riscv/cmpxchg: Deduplicate arch_cmpxchg() macros
> > > >
> > > > > riscv/cmpxchg: Deduplicate xchg() asm functions
> > > >
> > > > FWIW, this patch seems to break the build pretty badly:
> > > > https://patchwork.kernel.org/project/linux-riscv/patch/20230318080059.1109286-5-leobras@xxxxxxxxxx/
> > >
> > > Thanks for pointing out!
> > >
> > > It was an intermediary error:
> > > Sufix for amoswap on acquire version was "d.aqrl" instead of the
> > > correct".d.aqrl", and that caused the fail.
> > >
> > > I did not notice anything because the next commit made it more general, and thus
> > > removed this line of code. I will send a v2-RFC shortly.
> > >
> > > I see that patch 4/6 has 5 fails, but on each one of them I can see:
> > > "I: build output in /ci/workspace/[...]", or
> > > ""I: build output in /tmp/[...]".
> >
> > I don't push the built artifacts anywhere, just the brief logs -
> > although the "failed to build" log isn't very helpful other than telling
> > you the build broke.
> > That seems like a bug w.r.t. where tuxmake prints its output. I'll try
> > to fix that.
>
> Thanks for that :)

I've pushed what I think is a fix, the wrong log file was being grepped
for errors in the case of a failed build.

> > > I could not find any reference to where this is saved, though.
> > > Could you point where can I find the error output, just for the sake of further
> > > debugging?
> > >
> > > >
> > > > Patches 1 & 5 also add quite a lot of sparse issues (like 1000), but I
> > > > think that may be more of an artifact of the testing process as opposed
> > > > to something caused by this patchset.
> > >
> > > For those I can see the build output diffs. Both added error lines to
> > > conchuod/build_rv64_gcc_allmodconfig.
> > > I tried to mimic this with [make allmodconfig + gcc build with
> > > CROSS_COMPILE=riscv64-linux-gnu- ] but I could not get any error in any patch.
> >
> > If you can't replicate them, it's probably because they come from
> > sparse. I only really mentioned it here in case you went looking at the
> > other patches in the series and were wondering why things were so amiss.
>
> Oh, it makes sense.
>
> >
> > > For patch 1/6 it removes as much error lines (-) as it adds (+), and the error
> > > messages are mostly the same, apart for an line number.
> >
> > I don't see a line number difference, but rather an increase in the
> > number of times the same error lines have appeared in the output.
> > I don't find the single-line output from sparse to really be very
> > helpful, I should probably have a bit of a think how to present that
> > information better.
>
> Oh, I see.
> The number at the beginning relates to the number of times the error happens.
> Ok it makes sense to me now :)
>
> >
> > > For patch 5/6 it actually adds many more lines, but tracking (some of) the
> > > errors gave me no idea why.
> >
> > Probably just sparse being unhappy with some way the macros were
> > changed - but some of it ("Should it be static?" bits) look very much
> > like the patch causing things to be rebuilt only for the "after the
> > patch" build, but somehow not for the "before" build.
>
> Humm, not sure I could understand that last part:
> What I get is that the patch 5/6 is causing things to be rebuilt, and
> it was not like that on 1-4/6.
> Is that what you said?
> If so, and you are doing it as an incremental build, changing the
> macros in 5/6 should be triggering rebuilds, but it does not make
> sense to not be rebuilt in 1-4/6 as they change the same macros.

Right, it is an incremental build.
First it builds the tree with a patch applied, then it checks out HEAD~1
and builds that tree. Then it goes back to the tree with the patch
applied and builds it again. The output from builds 2 & 3 are compared
to see if any errors snuck in.
In theory, that should ensure that, as builds 2 & 3 have had the same
diff to the previous one just in opposite directions, the same files get
rebuilt - but I am a little worried that ccache gets involved sometimes
and leads to the before/after builds not being exactly the same.

They may very well be real issues as your refactor has caused the
casting in the macros to change or w/e. Not my area of expertise by any
stretch of the imagination, but the lkp sparse is out of date & doesn't
run any more so I figure it's better to be running the stuff, even if it
does sometimes result in a splurge of errors like this than forget about
poor aul sparse.

Cheers,
Conor.

Attachment: signature.asc
Description: PGP signature