Re: [PATCH V14 01/10] acpi: apei: read ack upon ghes record consumption

From: Baicar, Tyler
Date: Thu Apr 13 2017 - 15:45:39 EST


On 4/11/2017 11:15 AM, Borislav Petkov wrote:
On Tue, Mar 28, 2017 at 01:30:31PM -0600, Tyler Baicar wrote:
A RAS (Reliability, Availability, Serviceability) controller
may be a separate processor running in parallel with OS
execution, and may generate error records for consumption by
the OS. If the RAS controller produces multiple error records,
then they may be overwritten before the OS has consumed them.

The Generic Hardware Error Source (GHES) v2 structure
introduces the capability for the OS to acknowledge the
consumption of the error record generated by the RAS
controller. A RAS controller supporting GHESv2 shall wait for
the acknowledgment before writing a new error record, thus
eliminating the race condition.

Add support for parsing of GHESv2 sub-tables as well.

Signed-off-by: Tyler Baicar <tbaicar@xxxxxxxxxxxxxx>
CC: Jonathan (Zhixiong) Zhang <zjzhang@xxxxxxxxxxxxxx>
Reviewed-by: James Morse <james.morse@xxxxxxx>
---
drivers/acpi/apei/ghes.c | 49 +++++++++++++++++++++++++++++++++++++++++++++---
drivers/acpi/apei/hest.c | 7 +++++--
include/acpi/ghes.h | 5 ++++-
3 files changed, 55 insertions(+), 6 deletions(-)
...

@@ -249,10 +254,18 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic)
ghes = kzalloc(sizeof(*ghes), GFP_KERNEL);
if (!ghes)
return ERR_PTR(-ENOMEM);
+
ghes->generic = generic;
+ if (IS_HEST_TYPE_GENERIC_V2(ghes)) {
+ rc = apei_map_generic_address(
+ &ghes->generic_v2->read_ack_register);
Yeah, that linebreak just to keep the 80-cols rule makes the code ugly
and hard to read.

Please put that mapping and unmapping in wrappers called
map_gen_v2(ghes) and unmap_gen_v2(ghes) or so, so that you can call them
wherever needed. Thus should make the flow a bit more understandable
what's going on and you won't have to repeat the unmapping lines in
ghes_fini().
Hello Boris,

Thank you for the feedback. I will make this change in the next version.
@@ -649,6 +669,23 @@ static void ghes_estatus_cache_add(
rcu_read_unlock();
}
+static int ghes_ack_error(struct acpi_hest_generic_v2 *generic_v2)
+{
+ int rc;
+ u64 val = 0;
+
+ rc = apei_read(&val, &generic_v2->read_ack_register);
+ if (rc)
+ return rc;
+ val &= generic_v2->read_ack_preserve <<
+ generic_v2->read_ack_register.bit_offset;
+ val |= generic_v2->read_ack_write <<
+ generic_v2->read_ack_register.bit_offset;
Yeah, let them stick out, it more readable this way. Line spacing is
helpful too:

...
rc = apei_read(&val, &generic_v2->read_ack_register);
if (rc)
return rc;

val &= generic_v2->read_ack_preserve << generic_v2->read_ack_register.bit_offset;
val |= generic_v2->read_ack_write << generic_v2->read_ack_register.bit_offset;

return apei_write(val, &generic_v2->read_ack_register);
}
I will make this change in the next version.

Thanks,
Tyler
+ rc = apei_write(val, &generic_v2->read_ack_register);
+
+ return rc;
+}
+
static int ghes_proc(struct ghes *ghes)
{
int rc;

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.