From: E.L. W. (Egon) <eg...@sc...> - 2003-09-28 07:49:04
On Saturday 27 September 2003 23:07, Peter Murray-Rust wrote:
> At 20:23 27/09/2003 +0200, E.L. Willighagen (Egon) wrote:
> > On Saturday 27 September 2003 18:15, Peter Murray-Rust wrote:
> > > CDK is now becoming somewhat large and it is often non-trivial to add
> > > it to the classpath. After checking out today's CVS and building, it
> > > takes 34 Mbyte on disk and has 64 jar files. There are duplicate copies
> > > of many (e.g. 2 xerces, 2 xalan, etc). The dist contains 6 jar files
> > > (0.6 Mbytes) but requires some or all of cdk/jar with 28 jar files and
> > > 5.6 Mbytes.
>
> Thanks
>
> > I count 26 in cdk/jars + 9 cdk-*.jar's.
>
> The rest are in the docs directory. Agreed, they shouldn't get into the
> final distribution but they still have to be downloaded.

Right... forgot about those...

> > > At present I believe I have to use a minimum of 7 CDK jars:
> > > <property name="cdkCore.jar" value="${cdkdist.dir}/cdk-core.jar"/>
> > > <property name="cdkIo.jar" value="${cdkdist.dir}/cdk-io.jar"/>
> > > <property name="cdkStandard.jar" value="${cdkdist.dir}/cdk-standard.jar"/>
> > > <property name="cdkExtra.jar" value="${cdkdist.dir}/cdk-extra.jar"/>
> > >
> > > <property name="vecmath.jar" value="${cdklib.dir}/vecmath1.2-1.14.jar"/>
> > > <property name="jsx.jar" value="${cdklib.dir}/JSX1.0.7.4.jar"/>
> > > <property name="gnujaxp.jar"
> >
> > CDK from cvs no longer needs JSX.
>
> Thanks. The difficulty is that it's difficult to know what is and
> isn't used.

Indeed. Some classes have the required jar in the JavaDoc... but that's not
sufficient... We should have a system that formalizes this, and converts it
into a webpage, just like we generate the list of keywords...

> > > value="${cdklib.dir}/gnujaxp.jar"/> for a typical CDK application.
> > > There is a lot of scope for error. I particularly find that including
> > > additional XML parsers (e.g. gnujaxp) is likely to cause problems and
> > > having old xerces in the classpath is often a poor idea.
> >
> > Some time ago you sent me a patch which should make the CMLReader at least
> > work with the Java 1.4 XML parser too... So you could try without both
> > xerces and gnujaxp in the classpath...
>
> I thought that gnujaxp was used for Isotope...

If so, that would be a bug... there is no special reason to use gnujaxp
instead of any other XML parser for parsing the Isotope info...

> > > Similarly the latest distrib (20030909) is 11 Mbyte on disk, with 20
> > > jars requiring 4.5 Mbytes.
> > >
> > > My current concern is that I want to redistribute applications based on
> > > CDK. This means that the distribution has to be manageable by the
> > > recipient, perhaps with constraints over which I have no control.
> > >
> > > I appreciate that this is a tricky problem but the size and complexity
> > > make it difficult to redistribute applications using CDK. The 20 jars
> > > are likely to interact with the recipients' classpath in unpredictable
> > > ways.
> >
> > This comes down to making a proper script (BAT, or shell) that takes care
> > of *re*setting the classpath... i.e. not taking the content from the
> > CLASSPATH environment variable...
> >
> > Optionally, you could use the approach Jmol is using by putting all
> > required libraries into one application.jar and have people start it with
> > java -jar appl.jar... that would skip the $CLASSPATH altogether...
>
> I agree - I think this is a useful approach.
>
> > But you need to carefully check which libs/classes you need... and this
> > might be an iterative try-and-run procedure, starting with no libs at
> > all... lot of work, but will give a nice and clean jar...
>
> Well I normally start with no jars and add them gradually to the
> compilation classpath. But I don't know which of them are required for what
> and I sometimes find that at runtime I have to add more jars - are any
> classes loaded by name?

See above...

> > > For example are castor, xerces and xalan (2.5+ Mbytes) essential to
> > > using CDK?
> >
> > Xerces is: the isotope and atom type information used throughout CDK is
> > using XML config files... (which I hope to port to CML2 soon, btw). Xalan
> > should not really be there at this moment... (we don't use XSLT sheets to
> > transform CML into anything for output formats...)
> >
> > > I don't know what the best approach is. I suspect some classes are only
> > > used infrequently and perhaps could be specifically loaded on demand.
> > > It could be useful to cut out old versions of (say,
> > > xerces/xalan/batik/gnujaxp, etc.) although this is tedious work.
> >
> > Depending on what you need, you can do without a number of jars... e.g. if
> > you do not need SVG output, you can leave out all the batik*.jar...
> >
> > An alternative, btw, is the mechanism that Jmol uses... i.e. to specify
> > which CDK class you need, and then make a customized CDK jar... which is
> > the jmolcdk.jar found in Jmol cvs...
>
> It would be useful to have a brief indication of why these jars are
> included, which would make it easy to know whether they should be omitted.

Agreed. I'll send around an RFC with a proposal on how to formalize this.

Egon

-- 
PhD Molecular Representation in Chemometrics
Laboratory of Analytical Chemistry
http://www-cac.sci.kun.nl/people/egonw.html
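
As a rough illustration of the suggestion to try without xerces and gnujaxp on the classpath: since Java 1.4 the JRE bundles a JAXP implementation, so code that asks `SAXParserFactory` for a parser should get one without any extra jar. The sketch below is not CDK code. The class name `JaxpDefaultParserDemo`, the method `countElements`, and the sample XML are all hypothetical, and the XML does not reflect the real CDK isotope file format. It assumes a current JDK (Java 8 or later).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of CDK.
class JaxpDefaultParserDemo {

    // Made-up config snippet for illustration only; NOT the real CDK isotope format.
    static final String SAMPLE_XML =
        "<isotopes>"
      + "<isotope symbol=\"C\" massNumber=\"12\"/>"
      + "<isotope symbol=\"C\" massNumber=\"13\"/>"
      + "</isotopes>";

    // Counts start tags named elementName, using whatever SAX parser JAXP selects.
    static int countElements(String xml, final String elementName) {
        final int[] count = {0};
        try {
            // No parser class is named here; JAXP picks the implementation at runtime.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        if (qName.equals(elementName)) {
                            count[0]++;
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // Shows which parser implementation was actually selected.
        System.out.println("SAX factory: "
            + SAXParserFactory.newInstance().getClass().getName());
        System.out.println("isotope elements: "
            + countElements(SAMPLE_XML, "isotope"));
    }
}
```

Run with nothing but the compiled class on the classpath (for example `java -cp . JaxpDefaultParserDemo`), the first line of output shows which parser the JRE supplies by default. If a xerces or gnujaxp jar is added to the classpath, the factory lookup may select that implementation instead, which is the kind of interaction between parsers discussed in this thread.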