Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading

From: Iouri Tarassov
Date: Wed Mar 02 2022 - 13:49:20 EST


On 3/2/2022 3:53 AM, Wei Liu wrote:
> On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > +struct dxgglobal *dxgglobal;
> > > >
> > > > No, make this per-device, NEVER have a single device for your driver.
> > > > The Linux driver model makes it harder to do it this way than to do it
> > > > correctly. Do it correctly please and have no global structures like
> > > > this.
> > > >
> > >
> > > This may not be as big an issue as you thought. The device discovery is
> > > still done via the normal VMBus probing routine. For all intents and
> > > purposes the dxgglobal structure can be broken down into per device
> > > fields and a global structure which contains the protocol versioning
> > > information -- my understanding is there will always be a global
> > > structure to hold information related to the backend, regardless of how
> > > many devices there are.
> >
> > Then that is wrong and needs to be fixed. Drivers should almost never
> > have any global data, that is not how Linux drivers work. What happens
> > when you get a second device in your system for this? Major rework
> > would have to happen and the code will break. Handle that all now as it
> > takes less work to make this per-device than it does to have a global
> > variable.
> >
>
> It is perhaps easier to draw parallel from an existing driver. I feel
> like we're talking past each other.
>
> Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> units will be added to the list. I this the current code is following
> this model. dxgglobal fulfills the role of a list.
>
> Setting aside the question of whether it makes sense to keep a copy of
> the per-VM state in each device instance, I can see the code be changed
> to:
>
> struct mutex device_mutex; /* split out from dxgglobal */
> static LIST_HEAD(dxglist);
>
> /* Rename struct dxgglobal to struct dxgstate */
> struct dxgstate {
> struct list_head dxglist; /* link for dxglist */
> /* ... original fields sans device_mutex */
> }
>
> /*
> * Provide a bunch of helpers manipulate the list. Called in probe /
> * remove etc.
> */
> struct dxgstate *find_dxgstate(...);
> void remove_dxgstate(...);
> int add_dxgstate(...);
>
> This model is well understood and used in tree. It is just that it
> doesn't provide much value in doing this now since the list will only
> contain one element. I hope that you're not saying we cannot even use a
> per-module pointer to quickly get the data structure we want to use,
> right?
>
> Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> into the device object? I think that can be done too.
>
> The code can be changed as:
>
> /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
> struct dxgstate { ... };
>
> static int dxg_probe_vmbus(...) {
>
> /* probe successfully */
>
> struct dxgstate *state = kmalloc(...);
> /* Fill in dxgstate with information from backend */
>
> /* hdev->dev is the device object from the core driver framework */
> dev_set_drvdata(&hdev->dev, state);
> }
>
> static int dxg_remove_vmbus(...) {
> /* Normal stuff here ...*/
>
> struct dxgstate *state = dev_get_drvdata(...);
> dev_set_drvdata(..., NULL);
> kfree(state);
> }
>
> /* In all other functions */
> void do_things(...) {
> struct dxgstate *state = dev_get_drvdata(...);
>
> /* Use state in place of where dxgglobal was needed */
>
> }
>
> Iouri, notice this doesn't change anything regarding how userspace is
> designed. This is about how kernel organises its data.
>
> I hope what I wrote above can bring our understanding closer.
>
> Thanks,
> Wei.


I can certainly remove dxgglobal and keep the  pointer to the global
state in the device object.

This will require passing of the global pointer to all functions, which
need to access it.


Maybe my understanding of the Greg's suggestion was not correct. I
thought the suggestion was

to have multiple /dev/dxgN devices (one per virtual compute device).
This would change how the user mode

clients enumerate and communicate with compute devices.


Thanks

Iouri