* Greg KH <gregkh@xxxxxxx>:
> On Tue, Mar 04, 2008 at 10:18:28AM -0800, Jesse Barnes wrote:
> > On Monday, March 03, 2008 9:49 pm Greg KH wrote:
> > > On Sat, Mar 01, 2008 at 07:43:07AM -0700, Matthew Wilcox wrote:
> > > > On Fri, Feb 29, 2008 at 09:25:42PM -0800, Greg KH wrote:
> > > > > What is the guarantee that the names of these slots are correct and do
> > > > > not happen to be the same as the hotpluggable ones?
> > > >
> > > > That would be a bug -- and yes, bugs happen, and we have to deal with
> > > > them.
> > >
> > > My main concern is that BIOS vendors will not fix these bugs, as no
> > > other OS cares/does this kind of thing today. The amount of bad
> > > information out there might be quite large, and I think this was
> > > confirmed by some initial testing of IBM systems, right?
> > >
> > > OTOH, I'm not sure which is worse, bad data or no data.
> >
> > Yeah, but there's a flip side to this too: if no one uses the data,
> > no one will complain when it's wrong. If Linux starts making it easy
> > to see this stuff, there's a chance system vendors will start taking
> > an extra 5 min. before shipment to make sure that the BIOS info is up
> > to date...
>
> bad data is worse.
>
> And then there's the machines with duplicate slot names, how does this
> code handle PCI slots with that? I think some of the IBM machines had
> non-hotplug slots named the same as the hotplug slots, right?

At one point, I had some code in there to stick the names of the
slots into a linked list and walk through it to try and detect
duplicate slot names, but after a few iterations, it turned out
to be easier to refcount them for the cases I was dealing with.
[my machines did not have colliding names between hp and non-hp
slots, it was more like seeing the same SxFy object appear
multiple times in the namespace and trying to create them
multiple times.]
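
A minimal userspace sketch of the refcounting idea, for illustration
only. It assumes slots are keyed by name; `struct slot`, `slot_get()`
and `slot_put()` are hypothetical names, not the kobject or PCI slot
API, and the actual patch may be structured differently. Seeing the
same SxFy name a second time takes a reference instead of creating a
second object:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a named slot; not the kernel's data structure. */
struct slot {
	char name[16];		/* e.g. "S1F0"; longer names are truncated */
	int refcount;
	struct slot *next;
};

static struct slot *slots;	/* singly linked list of live slots */

/*
 * Look the name up. If it already exists, take another reference and
 * return the existing object; otherwise create it with one reference.
 */
static struct slot *slot_get(const char *name)
{
	struct slot *s;

	for (s = slots; s; s = s->next) {
		if (strcmp(s->name, name) == 0) {
			s->refcount++;
			return s;
		}
	}

	s = calloc(1, sizeof(*s));
	if (!s)
		return NULL;
	strncpy(s->name, name, sizeof(s->name) - 1);
	s->refcount = 1;
	s->next = slots;
	slots = s;
	return s;
}

/* Drop one reference; unlink and free the slot when the last one goes. */
static void slot_put(struct slot *s)
{
	struct slot **p;

	if (--s->refcount > 0)
		return;

	for (p = &slots; *p; p = &(*p)->next) {
		if (*p == s) {
			*p = s->next;
			break;
		}
	}
	free(s);
}
```

Note that this only merges repeated appearances of one object. It
does nothing to tell apart two distinct physical slots that happen to
report the same name.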
It wasn't the IBM machine that was breaking; it was Fujitsu. They
were returning an error code (the numerical value 1023) when I
called the _SUN method on a slot object that existed in the ACPI
namespace but was not present (as reported by the _STA method).
By the time I got that error report, I'd already dropped the
duplicate name detection code, and was letting the kobject
infrastructure warn about duplicate names because for my test
cases, refcounting was a better solution.
[Kenji-san from Fujitsu seemed to be ok with the progress I'd
made at the time; he can speak up if he's changed his mind ;)]
But even that is not the error case you're describing, where
there is clear name collision of two physical slots in the
machine, one being hotplug, the other non-hotplug.
Maybe I would have to add some duplicate name detection code back
in there but...

> This stuff needs a _lot_ of testing on a lot of different
> machines, and a sane way to fall back if there are errors to
> ensure that working machines don't break.
>
> And then there's the issue with userspace programs only expecting
> hotpluggable slots in the slots/ directory...

Yes -- totally agreed. And I'd like to see actual examples of
name collisions or userspace breakage to get a better idea of how
to handle real world problems rather than writing some crummy
code based on what my limited imagination can think of.
So how to get this test coverage? -mm? linux-next?
Thanks.
/ac