Re: [PATCH v2 0/9] Add support for Microsoft Surface System Aggregator Module

From: Hans de Goede
Date: Mon Dec 07 2020 - 03:50:45 EST


Hi,

On 12/6/20 4:58 PM, Maximilian Luz wrote:
> On 12/6/20 8:07 AM, Leon Romanovsky wrote:
>> On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote:
>>> Hello,
>>>
>>> Here is version two of the Surface System Aggregator Module (SAM/SSAM)
>>> driver series, adding initial support for the embedded controller on 5th
>>> and later generation Microsoft Surface devices. Initial support includes
>>> the ACPI interface to the controller, via which battery and thermal
>>> information is provided on some of these devices.
>>>
>>> The previous version and cover letter detailing what this series is
>>> about can be found at
>>>
>>>    https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@xxxxxxxxx/
>>>
>>> This patch-set can also be found at the following repository and
>>> reference, if you prefer to look at a kernel tree instead of these
>>> emails:
>>>
>>>    https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2
>>>
>>> Thank you all for the feedback to v1, I hope I have addressed all
>>> comments.
>>
>>
>> I think that it is too far fetched to attempt and expose UAPI headers
>> for some obscure char device that we are all know won't be around in
>> a couple of years from now due to the nature of how this embedded world
>> works.
>>
>> More on that, the whole purpose of proposed interface is to debug and
>> not intended to be used by any user space code.
>
> I believe this has already been extensively discussed. I want to focus
> more on the part below in this response:
>
>> Also the idea that you are creating new bus just for this device doesn't
>> really sound right. I recommend you to take a look on auxiliary bus and
>> use it or come with very strong justifications why it is not fit yet.
>
> I tend to agree that this is a valid concern to bring up, and adding a
> new bus is not something that should be done lightly.
>
> Let's ignore that this has been merged into -next after I've submitted
> this (and that I only recently became aware of this) for the time being.
> If I would see a clear benefit, I would not hesitate to switch the
> driver and subsystem over to this.
>
> What does concern me most, is the device/driver matching by string.
> Right now, this subsystem matches those via a device UID. This UID is
> directly tied to the EC functionality provided by the device. A bit of
> background to this:
>
> Requests sent to the EC contain an address, so to say. This consists of
>
>  - Target category (TC): Broad group of functionality, e.g. battery/AC,
>    thermal, HID input, ..., i.e. a subsystem of sorts.
>
>  - Target ID (TID): Some major device, e.g. the dual batteries on the
>    Surface Book 3 are addressed by target ID 1 and 2, some functionality
>    is only available at 2 and some only at 1. May be related to physical
>    parts of/locations on the device.
>
>  - Instance ID (IID): A device instance, e.g. for thermal sensors each
>    sensor is at TC=0x03 (thermal) and has a different instance ID.
>
> Those can be used to pretty much uniquely identify a sub-device on the
> EC.

Thank you for this explanation, that is going to be useful to know
when I get around to reviewing this set (although I guess that you
probably also have written this down in one of the commit msgs /
docs I did not check).

>
> Note the "pretty much". To truly make them unique we can add a function
> ID (FN). With that, we can for example match for TC=0x03, TID=*, IID=*,
> FN=0x00 to load a driver against all thermal sensors. And this is
> basically the device UID that the subsystem uses for matching (modulo
> domain for virtual devices, i.e. device hubs). Sure, we can use some
> string, but that then leads to having to come up with creative names
> once we need some driver specific data, e.g. in the battery driver [1]:
>
>     const struct auxiliary_device_id my_auxiliary_id_table[] = {
>         { .name = "surface_aggregator_registry.battery", .driver_data = x },
>         { .name = "surface_aggregator_registry.battery_sb3", .driver_data = y },
>         { },
>     }
>
> Arguably, not _that_ big of a deal.
>
> What worries me more is that this will block any path of auto-detecting
> devices on a more general/global level. Right now, we hard-code devices
> because we haven't found any way to detect them via some EC query yet
> [2] (FYI the node groups contain all devices that will eventually be
> added to the bus, which are already 11 devices on the Surface Book 3
> without taking missing thermal sensors into account; also they are
> spread across a bunch of subsystems, so not just platform). That's of
> course not an ideal solution and one that I hope we can eventually fix.
> If we can auto-detect devices, it's very likely that we know or can
> easily get to the device UID. A meaningful string is somewhat more
> difficult.
>
> This registry, which is loaded against a platform device that, from what
> we can tell differentiates the models for some driver bindings by
> Windows (that's speculation), is also the reason why we don't register
> client devices directly under the main module, so instead of a nice
> "surface_aggregator.<devicename>", you'll get
> "surface_aggregator_registry.<devicename>". And it may not end there.
>
> Something that's currently not implemented is support for thermal
> sensors on 7th generation devices. With thermal sensors, we can already
> detect which sensors, i.e. which IIDs, are present. Naturally, that's
> part of the EC-API for thermal devices (TC=0x03), so would warrant a
> master driver that registers the individual sensor drivers (that's a
> place where I'd argue that in a normal situation, the auxiliary bus
> makes sense). So with the auxiliary bus we'd now end up with devices
> with "surface_thermal.sensor" for the sensors as well as
> "surface_aggregator_registry.<devicename>", both of type ssam_device
> (which then would be a wrapper around auxiliary_device with UID stored
> in that wrapper). Note that they need to be of type ssam_device (or
> another wrapper around that) as they again need the reference to the
> controller device, their UID for access, etc. With a proper bus, device,
> and the UID for matching, we can just add the sensor devices to the bus
> again, as they will have a meaningful and guaranteed unique UID.
>
> From some reports I've seen it looks like thermal sensors may also be
> available separately on TID=0x01 as well as TID=0x02 on some devices,
> at which point I believe you'd need to introduce some IDA for ID
> allocation to not cause a clash with IDs. At least if you separate the
> base drivers for each TC, which I guess should be preferred due to
> code-reuse. Then again they might use different event registries so you
> may end up needing "surface_thermal.sensor_tc1" and
> "surface_thermal.sensor_tc2" as device names to differentiate those
> for driver loading. Or store the registry in software node properties
> when registering the device.
>
> I'm repeating myself here, but to me it looks cleaner to have a single
> bus type as opposed to spreading the same base auxiliary device type
> over several namespaces.
>
> Which then leads me to the question of how a function like
> "is_ssam_device()", i.e. a function testing if the device is of a given
> type, would be implemented without enforcing and testing against some
> part of the device name. Something that, again, doesn't look clean to
> me. Although the use of such a function could probably avoided, but that
> then feels like working around the auxiliary bus.
>
> Unfortunately, there are a couple more hypotheticals at play than I'd
> like to have (making this not an easy decision), but it's a reverse
> engineered driver so I guess that comes with the territory. All in all,


> I believe it's possible to do this (i.e. use the auxiliary bus), but, to
> me at least, the implementation using a discrete bus feels tidier and
> more true to the hardware (or virtual hardware anyway) behind this. I'm
> happy to hear any arguments against this though.

I agree, the whole setup with the TC + TID + IID feels like the functionality
is nicely (and cleanly) split into separate functions and as with other
busses using a bus + 1 device per function for this is a perfectly clean
way to handle this.

Note if in the future you do see benefit in switching the auxiliary bus
I have no problems with that. But atm I don't really see any benefits of
doing so, so then we would just be switching over for the sake of switching
over which does not seem productive.

Regards,

Hans