From: Rajarshi G. <rx...@ps...> - 2005-11-09 00:23:10
On Tue, 2005-11-08 at 18:44 -0500, Rajarshi Guha wrote:
> On Tue, 2005-11-08 at 23:13 +0100, Egon Willighagen wrote:
>
> > > For example code to get the entries from a jar file (after which a bit
> > > of regexp'ing and dynamic instantiation gives us the required descriptor
> > > object) see http://blue.chem.psu.edu/~rajarshi/code/java/jarclass.java
> >
> > ah... right, sure. That's used for the plugins too...
> >
> > But... with the CDK lib we can't use that. We don't know the jar in which the
> > descriptors are... somethings this is cdk-qsar.jar, sometimes it's embedded
> > in an other jar...
>
> Hmm, thats a point. That would pretty much make scanning a jar file for
> descriptor classes impossible.
>
> However I would still like to move the descriptors to different packages
> as I had mentioned above
>
> > I though you actually meant that it was possible to list *all* classes
> > available from the system class path...
>
> Well, extending the above code would do that - though it would lead to a
> big performance hit (I think but I'll code it up and see).

OK, ran some timing tests (using the bash time command) on a T42p (1GB,
1.8GHz).

My class path has 55 jar files (all CDK dist jars, all CDK dependency
jars, plus some other assorted jars like Jmol, JMF etc).

Task                                        Time
-----                                       -----
Simply scanning each jar for                0.501 sec
entries containing
"org/openscience/cdk/qsar" and
"Descriptor.class"

Same as above, but now converting each      0.750 sec
matching jar entry using replace() and
replaceAll(), loading the class and
then collecting all classes that
implement the Descriptor interface

The above results seem decent to me - of course I don't know whether a
class path with 55 jars is small or not. But I would expect it to scale
linearly.
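
For anyone who wants to try this, the two timed cases can be sketched roughly
as below. This is only an illustration of the idea, not the jarclass.java code
linked earlier: the class name DescriptorScanner and its method names are made
up for the example, and the interface to test against is passed in as a
parameter so the sketch compiles without the CDK jars on the class path.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper; names are illustrative, not part of the CDK.
public class DescriptorScanner {

    // The two strings matched in the timing test.
    static final String PACKAGE_PATH = "org/openscience/cdk/qsar";
    static final String CLASS_SUFFIX = "Descriptor.class";

    // Case 1: decide from the entry name alone, without loading anything.
    static boolean isDescriptorEntry(String entryName) {
        return entryName.contains(PACKAGE_PATH) && entryName.contains(CLASS_SUFFIX);
    }

    // Turns "a/b/C.class" into "a.b.C".
    static String toClassName(String entryName) {
        String noSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return noSuffix.replace('/', '.');
    }

    // Scans every jar listed in the class path string for matching entries.
    static List<String> scan(String classPath) {
        List<String> names = new ArrayList<>();
        for (String element : classPath.split(File.pathSeparator)) {
            File file = new File(element);
            if (!element.endsWith(".jar") || !file.isFile()) {
                continue; // directories and missing jars are ignored here
            }
            try (JarFile jar = new JarFile(file)) {
                Enumeration<JarEntry> entries = jar.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    if (isDescriptorEntry(name)) {
                        names.add(toClassName(name));
                    }
                }
            } catch (IOException e) {
                // An unreadable jar is skipped rather than aborting the scan.
            }
        }
        return names;
    }

    // Case 2: load each candidate and keep concrete implementations of iface.
    static List<Class<?>> loadImplementing(List<String> classNames, Class<?> iface) {
        List<Class<?>> result = new ArrayList<>();
        ClassLoader loader = DescriptorScanner.class.getClassLoader();
        for (String name : classNames) {
            try {
                Class<?> c = Class.forName(name, false, loader);
                if (iface.isAssignableFrom(c) && !c.isInterface()) {
                    result.add(c);
                }
            } catch (ClassNotFoundException | LinkageError e) {
                // Classes that cannot be loaded are skipped.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String name : scan(System.getProperty("java.class.path"))) {
            System.out.println(name);
        }
    }
}
```

Running main with the CDK jars on the class path should print the candidate
class names; passing that list to loadImplementing() together with the
Descriptor interface corresponds to the second row of the table.
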
(Actually, running the program in a loop 10 times caused the second case
to drop to 0.256 sec on the last 5 runs, but clearly that's not a
realistic case.)

However, the thing is, I can scan my class path and get the classes for
all the CDK descriptors - whether they are in CDK jars or bundled in
some other jar.

Note that if we can move descriptors to their own package, we can get the
required entries from the jar files using one String.contains() call,
making it faster (case 1 becomes 0.375 sec).

In the case of user plugins, it would take a little longer: if we allow
users to place descriptor classes under arbitrary packages, we have to
load each class in the plugin jar and check its implemented interfaces.
However, since the plugin jar is specified beforehand (and if we suggest
the good practice of a descriptor plugin jar containing only
descriptors!), this will not be a huge problem in terms of time.

-------------------------------------------------------------------
Rajarshi Guha <rx...@ps...> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Alone, adj.: In bad company.
-- Ambrose Bierce, "The Devil's Dictionary"