On Fri, Mar 20, 2009 at 09:33:18AM -0600, Allan Lyons wrote:
> On Fri, Mar 20, 2009 at 08:57:20AM +0100, Pierre Bourgin wrote:
> > Right now, the main problem is that adding HDAUDIO support from INF
> > files costs a lot: about 50 seconds of INF files parsing instead of
> > 20/30 seconds for PCI devices search only (for 2GB of drivers - 1800
> > .inf files to check) ... I have to figure out how to write more
> > efficient perl code ;-) but on the other hand, I have to parse a lot of
> > INF files line by line, so a lot of I/O.
> I think searching for and reading all of those INF files is what takes
> almost all of the time.
> > I'm also thinking about a cache of INF files (generated by the system
> > that hosts the unattended files and windows drivers) to speed up search
> > on the unattended-client side.
> There is more than one solution that would work. One solution I can think
> of is to sort all of the driver file directories into a new hierarchy that
> gives you the information that you need at install time. For example, if
> all the drivers were sorted in a tree such as /bus/vendor/device/... that
> would cut down the amount of work that you have to do at scan time. (I'm
> assuming that you only need those three items to identify the driver
> properly). By splitting the INF parsing to a separate program, the
> scanning part should be really fast.
> For example, the parsing program would sort all of the driver directories:
>   win_drivers/PCI/8086/24C5/1/...
>   win_drivers/PCI/8086/24C5/2/...
>   win_drivers/PCI/8086/C592/1/...
>   win_drivers/USB/12536/382/1/...
> You should probably allow multiple drivers under the same vendor/device
> and let Windows sort it out at install time. Having a few extra drivers
> probably doesn't hurt.
> Then the scan program just adds all of the driver directories under the
> appropriate bus/vendor/device if they exist and doesn't have to read ANY
> INF files at all.
> This probably uses more disk space since you would probably treat
> win_drivers as a temporary directory that would be rebuilt every time that
> you updated your driver set.
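
A minimal sketch of the scan step for this first solution, assuming the
tree has already been built by a separate parsing program. It is written in
Python purely as an illustration, not as the perl code of
search-win-drivers.pl; the function names and the temporary demo tree are
hypothetical. The point is that the scan only tests whether a directory
exists and never opens an INF file.

```python
import os
import tempfile


def driver_dirs_for(root, bus, vendor, device):
    """Return the driver directories under root/bus/vendor/device.

    An empty list means no driver was sorted into that slot.
    No INF file is read here; only directory entries are listed.
    """
    base = os.path.join(root, bus, vendor, device)
    if not os.path.isdir(base):
        return []
    return sorted(
        os.path.join(base, name)
        for name in os.listdir(base)
        if os.path.isdir(os.path.join(base, name))
    )


def demo():
    """Build a tiny throwaway tree and look up two devices in it."""
    with tempfile.TemporaryDirectory() as root:
        # Two candidate drivers for the same vendor/device pair.
        os.makedirs(os.path.join(root, "PCI", "8086", "24C5", "1"))
        os.makedirs(os.path.join(root, "PCI", "8086", "24C5", "2"))
        found = driver_dirs_for(root, "PCI", "8086", "24C5")
        missing = driver_dirs_for(root, "USB", "12536", "382")
        return [os.path.basename(p) for p in found], missing
```

Every numbered subdirectory is returned, so several drivers for the same
device are all handed over and Windows can choose among them at install time.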
This could be a nice solution from the point of view of the perl search
code, but people probably won't feel confident with such a reorganization:
it makes it more difficult to check what's happening, since it adds a
supplementary level of indirection (between what is chosen and the "real"
driver path).
You may also have to duplicate drivers several times, since some drivers
(like Intel chipset drivers) "match" several PCI devices; in such a case, a
given Windows driver will be present several times in the PCI/VENDOR/DEV/
tree, so it will also be duplicated into /c/drivers/ ... unless you check
that the driver in /c/drivers/1/ and the driver in /c/drivers/2/ are the
same ... a little weird.
> A second solution would be to use an index file rather than copying the
> driver files around. This is probably nicer. For this case, the index
> file could be a flat file with basically two fields. The first being the
> bus/vendor/device string used as a key and the second field is the path to
> the driver files. The scan program would then load this index into a
> hash. If the key exists in the hash, then we have the driver and the
> path. The drawback with this one is that it probably uses more memory
> since we load the entire index. (However, we probably would need more
> than 10000 drivers before we used more than 500k.)
That's what I had in mind when I talked about a "cache file" ... My
explanations were probably not as enlightening as yours ;-)
Note that it won't be a hash indexed by INF strings, but an array, since
several drivers may match a given INF string.
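
One possible way to reconcile the hash idea with several drivers per INF
string is to keep the hash but store a list of paths under each key. The
sketch below is Python used only as an illustration, not the perl code;
the tab-separated layout and the driver paths are assumptions made for the
example, not an agreed format.

```python
from collections import defaultdict


def load_index(lines):
    """Parse 'key<TAB>path' lines into {key: [path, ...]}.

    The same key may appear on several lines; every path is kept.
    """
    index = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, path = line.partition("\t")
        if not sep:
            continue  # skip malformed lines without a tab
        index[key].append(path)
    return index


# Hypothetical index content: two drivers match the same PCI device.
SAMPLE = [
    "PCI/8086/24C5\twin_drivers/audio/intel_a\n",
    "PCI/8086/24C5\twin_drivers/audio/intel_b\n",
    "USB/12536/382\twin_drivers/usb/some_device\n",
]

index = load_index(SAMPLE)
# .get() avoids creating an empty entry for devices that are not indexed.
print(index.get("PCI/8086/24C5", []))
print(index.get("PCI/8086/C592", []))
```

A lookup is then a single hash access per detected device, and a device
with no entry simply yields an empty list.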
To reduce the memory footprint, we could use perl's grep() function on the
index as flat ASCII content? But it would cost more I/O and time, since we
would have to
parse the index file for each INF string (corresponding to PCI/HDAUDIO/USB
Which format would you recommend for the index file? A simple text (ASCII)
file, or a dump of serialized perl data structures (thus "binary" content)?
Keep in mind that such an index file may be generated by any perl version,
since it is generated, for instance, by the system that hosts the Windows
driver files. Are perl data dumps portable across perl versions?
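
A plain text index sidesteps the portability question, because any perl
version (or any other tool) can read it. It also does not have to be parsed
once per INF string: if the client first collects all the strings it is
looking for, one pass over the file is enough. A minimal sketch of both
sides, in Python only for illustration; the function names, the
tab-separated layout and the paths are assumptions, not an existing format.

```python
import io


def write_index(entries, out):
    """Write (key, path) pairs as sorted 'key<TAB>path' text lines."""
    for key, path in sorted(entries):
        out.write(f"{key}\t{path}\n")


def scan_index(lines, wanted):
    """Read the index once and keep only the entries for wanted keys.

    Memory use depends on the number of matches, not on the index size.
    """
    wanted = set(wanted)
    matches = {}
    for line in lines:
        key, sep, path = line.rstrip("\n").partition("\t")
        if sep and key in wanted:
            matches.setdefault(key, []).append(path)
    return matches


# Server side: generate the index (an in-memory buffer stands in for a file).
buf = io.StringIO()
write_index(
    [
        ("USB/12536/382", "win_drivers/usb/some_device"),
        ("PCI/8086/24C5", "win_drivers/audio/intel_b"),
        ("PCI/8086/24C5", "win_drivers/audio/intel_a"),
    ],
    buf,
)

# Client side: one pass for all the devices found on the machine.
buf.seek(0)
found = scan_index(buf, ["PCI/8086/24C5", "PCI/8086/C592"])
print(found)
```

The trade-off compared with loading the whole index into a hash is that
the file is still read in full, but only the matching lines stay in memory.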
> My gut feeling is that the second solution is probably faster for the scan
> since there are fewer network filesystem reads. Yes, there is probably
> more data read, but it is all read in one shot rather than directory reads
> for every device in the system. Either of these two solutions would
> likely bring the scan time to less than a couple of seconds.
> Now that I have it all written out, I like the second solution better. It
> also wouldn't be much of a change to your code.
The other point for me is that search-win-drivers.pl has to be able to
deal with several driver paths, like win_drivers/DriverPacks/ and
win_drivers/my_specific_stuff/ instead of (only) win_drivers/: this will
help people to (re)generate indexes more quickly, since we can bet that the
content of win_drivers/DriverPacks/ does not change so quickly, compared to