Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading

From: Wei Liu
Date: Wed Mar 02 2022 - 15:34:47 EST


On Wed, Mar 02, 2022 at 10:49:15AM -0800, Iouri Tarassov wrote:
> On 3/2/2022 3:53 AM, Wei Liu wrote:
> > On Wed, Mar 02, 2022 at 08:53:15AM +0100, Greg KH wrote:
> > > On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > > > +struct dxgglobal *dxgglobal;
> > > > >
> > > > > No, make this per-device, NEVER have a single device for your driver.
> > > > > The Linux driver model makes it harder to do it this way than to do it
> > > > > correctly. Do it correctly please and have no global structures like
> > > > > this.
> > > > >
> > > >
> > > > This may not be as big an issue as you thought. The device discovery is
> > > > still done via the normal VMBus probing routine. For all intents and
> > > > purposes the dxgglobal structure can be broken down into per device
> > > > fields and a global structure which contains the protocol versioning
> > > > information -- my understanding is there will always be a global
> > > > structure to hold information related to the backend, regardless of how
> > > > many devices there are.
> > >
> > > Then that is wrong and needs to be fixed. Drivers should almost never
> > > have any global data, that is not how Linux drivers work. What happens
> > > when you get a second device in your system for this? Major rework
> > > would have to happen and the code will break. Handle that all now as it
> > > takes less work to make this per-device than it does to have a global
> > > variable.
> > >
> >
> > It is perhaps easier to draw parallel from an existing driver. I feel
> > like we're talking past each other.
> >
> > Let's look at drivers/iommu/intel/iommu.c. There are a bunch of lists
> > like `static LIST_HEAD(dmar_rmrr_units)`. During the probing phase, new
> > units will be added to the list. I this the current code is following
> > this model. dxgglobal fulfills the role of a list.
> >
> > Setting aside the question of whether it makes sense to keep a copy of
> > the per-VM state in each device instance, I can see the code be changed
> > to:
> >
> > struct mutex device_mutex; /* split out from dxgglobal */
> > static LIST_HEAD(dxglist);
> >
> > /* Rename struct dxgglobal to struct dxgstate */
> > struct dxgstate {
> > struct list_head dxglist; /* link for dxglist */
> > /* ... original fields sans device_mutex */
> > }
> >
> > /*
> > * Provide a bunch of helpers manipulate the list. Called in probe /
> > * remove etc.
> > */
> > struct dxgstate *find_dxgstate(...);
> > void remove_dxgstate(...);
> > int add_dxgstate(...);
> >
> > This model is well understood and used in tree. It is just that it
> > doesn't provide much value in doing this now since the list will only
> > contain one element. I hope that you're not saying we cannot even use a
> > per-module pointer to quickly get the data structure we want to use,
> > right?
> >
> > Are you suggesting Iouri use dev_set_drvdata to stash the dxgstate
> > into the device object? I think that can be done too.
> >
> > The code can be changed as:
> >
> > /* Rename struct dxgglobal to dxgstate and remove unneeded fields */
> > struct dxgstate { ... };
> >
> > static int dxg_probe_vmbus(...) {
> >
> > /* probe successfully */
> >
> > struct dxgstate *state = kmalloc(...);
> > /* Fill in dxgstate with information from backend */
> >
> > /* hdev->dev is the device object from the core driver framework */
> > dev_set_drvdata(&hdev->dev, state);
> > }
> >
> > static int dxg_remove_vmbus(...) {
> > /* Normal stuff here ...*/
> >
> > struct dxgstate *state = dev_get_drvdata(...);
> > dev_set_drvdata(..., NULL);
> > kfree(state);
> > }
> >
> > /* In all other functions */
> > void do_things(...) {
> > struct dxgstate *state = dev_get_drvdata(...);
> >
> > /* Use state in place of where dxgglobal was needed */
> >
> > }
> >
> > Iouri, notice this doesn't change anything regarding how userspace is
> > designed. This is about how kernel organises its data.
> >
> > I hope what I wrote above can bring our understanding closer.
> >
> > Thanks,
> > Wei.
>
>
> I can certainly remove dxgglobal and keep the  pointer to the global
> state in the device object.
>

No, no more global pointer needed. You just call dev_drv_setdata in the
place that you assign to the global pointer.

> This will require passing of the global pointer to all functions, which
> need to access it.
>

And in the place you need the global pointer, call dev_drv_getdata.

>
> Maybe my understanding of the Greg's suggestion was not correct. I
> thought the suggestion was
>
> to have multiple /dev/dxgN devices (one per virtual compute device).
> This would change how the user mode
>

No. You still have only one /dev/dxg here.

Wei.