Re: [PATCH 1/1] rseq.2: New man page for the rseq(2) API

From: Alejandro Colomar
Date: Wed Feb 15 2023 - 12:17:18 EST


Hi Mathieu,

On 2/15/23 18:09, Mathieu Desnoyers wrote:
> On 2023-02-14 20:20, G. Branden Robinson wrote:
>> [CC list violently trimmed; for those who remain, this is mostly man
>> page style issues]
>
> [ Gently added linux-man to CC list. ;-) ]

:-)

>
>>
>> At 2023-02-14T23:29:37+0100, Alejandro Colomar wrote:
>>> On 2/14/23 20:54, Mathieu Desnoyers wrote:
>>>> +per-thread data structure shared between kernel and user-space.
>>>
>>> This last 'user-space' is not adjectivated, so it should go without
>>> a hyphen, according to common English rules.
>>
>> +1
>
> done
>
>
>>
>> Also I like your coinage. "Adjectivated yeast" is reflexive and
>> tautological!
>>
>>>> +.RB ( "struct rseq" )
>>>
>>> We format types in italics, so this should be '.RI'.
>>
>> +1
>
> OK, so it's italics for both types and arguments.
>
> I will replace all the bold markers for "struct rseq" and "struct
> rseq_cs" to italic in the description (but not in the synopsis section
> and not in the code snippets).
>
>>
>>>> +Only one
>>>> +.BR rseq ()
>>>> +ABI can be registered per thread, so user-space libraries and
>>>> +applications must follow a user-space ABI defining how to share this
>>>> +resource.
>>>
>>> Please use semantic newlines. See man-pages(7):
>>>
>>> Use semantic newlines
>>> In the source of a manual page, new sentences should be started on new
>>> lines, long sentences should be split into lines at clause breaks (com‐
>>> mas, semicolons, colons, and so on), and long clauses should be split
>>> at phrase boundaries. This convention, sometimes known as "semantic
>>> newlines", makes it easier to see the effect of patches, which often
>>> operate at the level of individual sentences, clauses, or phrases.
>>
>> I think I've said this before, but, strictly, commas in particular can
>> separate things that are not clauses. Clauses have subjects and
>> predicates.
>>
>> Might it be better to say simply:
>>
>> Start each sentence on a new line. Split long sentences where
>> punctuated by commas, semicolons, and colons.
>>
>> With this there is not even any need to discuss "phrase boundaries".
>>
>
> I've modified to:
>
> Only one
> .BR rseq ()
> ABI can be registered per thread,
> so user-space libraries and applications must follow a user-space ABI
> defining how to share this resource.
>
> Hopefully that's correct.

Yes, that's good.

>
>
>>> In the above lines, that would mean breaking after the comma,
>>> and not leaving resource in a line of its own.
>>
>> The latter is inevitably going to happen from time to time simply due to
>> sentence length and structure and the line length used by one's text
>> editor. I don't think an "orphan word" (what typographers call this) is
>> symptomatic of anything in *roff source when filling is enabled.
>>
>>>> +The ABI defining how to share this resource between applications and
>>>> +libraries is defined by the C library.
>>>> +Allocation of the per-thread
>>>> +.BR rseq ()
>>>> +ABI and its registration to the kernel is handled by glibc since version
>>>> +2.35.
>>>> +.PP
>>>> +The
>>>> +.BR rseq ()
>>>> +ABI per-thread data structure contains a
>>>> +.I rseq_cs
>>>> +field which points to the currently executing critical section.
>>>
>>> currently-executing should probably use a hyphen
>>> (if I understood the line correctly).
>>
>> This is not the case, according to some style authorities. Dave Kemper
>> convinced me of this on the groff list.
>>
>> Here is one resource.
>>
>> https://www.editorgroup.com/blog/to-hyphenate-or-not-to-hyphenate/
>>
>>> See an interesting discussion in the groff@ mailing list:
>>> <https://lists.gnu.org/archive/html/groff/2022-10/msg00015.html>
>>
>> That's not _squarely_ on point, as none of "block", "device", or "based"
>> is an adverb. "Currently" is.
>
> Leaving unchanged based on this discussion.
>
>>
>>>> +For each thread, a single rseq critical section can run at any given
>>>> +point.
>>>> +Each critical section need to be implemented in assembly.
>>>
>>> needs?
>>
>> +1
>
> done.
>
>>
>>>> +.TP
>>>> +.B Structure alignment
>>>
>>> Let's remove the bold here. It's not necessary for marking a constant
>>> or something that needs bold. And the indentation is already making
>>> it stand out, so bold is a bit too much aggressive to the reader.
>>
>> I agree; if it wouldn't be styled in running text, it doesn't need
>> styling as a paragraph tag; it already stands out by dint of its
>> placement as a tag.
>>
>>>> +Its value should always be confirmed by reading the cpu_id field before
>>>
>>> cpu_id should be formatted (.I).
>>
>> +1
>
> done
>
>>
>>>> +user-space performs any side-effect
>>>> +(e.g. storing to memory).
>>>> +.IP
>>>> +This field is always guaranteed to hold a valid CPU number in the range
>>>> +[ 0 .. nr_possible_cpus - 1 ].
>>>
>>> Please use interval notation:
>>> [0, nr_possible_cpus)
>>> or
>>> [0, nr_possible_cpus - 1]
>>> whichever looks better to you.
>>>
>>> We did some consistency fix recently:
>>> <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=147a60d792a5db8f3cb93ea16eefb73e16c1fb91>
>>>
>>> Also, do we have a more standard way of saying nr_possible_cpus?
>>> Should we say nproc?
>
> nproc(1) means:
>
> Print the number of processing units available to the current
> process, which may be less than the number of online processors
>
> Which is the number of cpus currently available (AFAIU the result of the
> cpuset and sched affinity).
>
> What I really mean here is the maximum value for possible cpus which can
> be hotplugged into the system. So it's not the maximum number of
> possible CPUs per se, but rather the maximum enabled bit in the possible
> CPUs mask.
>
> Note that we could express this differently as well: rather than saying
> that it guarantees a value in the range [0, nr_possible_cpus - 1], we
> could say that the values are guaranteed to be part of the possible cpus
> mask, which would actually more accurate in case the possible cpus mask
> has a hole (it tends to happen with things like lxc containers nowadays).
>
> Do you agree that we should favor expressing this in terms of belonging
> to the possible cpumask set rather than a range starting from 0 ?

On 2/15/23 18:12, Mathieu Desnoyers wrote:
> Actually, the field may contain the value 0 even if 0 is not part of the
> possible cpumask. So forget what I just said about being guaranteed to
> be part of the possible cpus mask.
>
> Thoughts ?
>
> Thanks,
>
> Mathieu

I don't have a full understanding, so I will trust you for deciding what is
best. I'll try to understand it, but my kernel knowledge is rather limited :)

I suggest writing a detailed description, instead of (or complementary to it)
just using a range, since readers might wonder as I did, what nr_possible_cpus
is (it's not really described anywhere so far). With a worded description,
we can later improve it if we find it not clear enough, but should be enough
for an initial page.

Thanks!

Alex

>
> Thanks,
>
> Mathieu
>

--
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

Attachment: OpenPGP_signature
Description: OpenPGP digital signature