From: Keith R. B. <kbe...@bb...> - 2007-10-25 22:10:24
All - I'm working on a project that uses Maven for its build, distribution, etc. We want to use Aperture, so we need to create an aperture.pom file that defines Aperture's dependencies and where to find them. Many of these dependencies are already in public Maven repositories. For those that are not, we will need to put them in our private repository. I'm hoping to come up with something that is general enough so that any Aperture and Maven user can benefit from this. Then, the Aperture community can benefit, and we can benefit from the scrutiny and support of that community.

For those files already in public repositories, we merely need to specify their location (group, artifact ID, and version).

For those files not already in public repositories, we will need to copy them to a private repository (either an organization's or a user's repository). I'm thinking of writing a shell script that takes a Maven repository location as a parameter, and for each file in the lib directory:

* copies it to the right place in the repository
* renames it according to Maven conventions (artifact_id-version, e.g. myproject-2.1.jar)
* (possibly) generates the MD5 sum

----------------------

I've been examining the dependencies, and have some questions:

1) I tested building Aperture without each of the jar files in the lib directory, and several were not needed for a successful build. Are there any of these that can be retired from the lib directory to reduce the number of dependencies users need to have? Or are they all needed at runtime? Here is the list:

activation-1.0.2-upd2.jar
applewrapper-0.2.jar
bcmail-jdk14-132.jar
bcprov-jdk14-132.jar
fontbox-0.1.0-dev.jar
infSail.jar
jcl104-over-slf4j-1.3.0.jar
unionSail.jar
winlaf-0.5.1.jar

2) There are some cases where the jar file used by Aperture seems to be outdated or alpha/beta, and a newer version exists in Maven public repositories. (The public repository I examined was at http://repo1.maven.org/maven2/org/.) The cases below show our library name including its version and (to the best of my knowledge) the corresponding up to date public Maven versions:

activation-1.0.2-upd2.jar --> 1.1
fontbox-0.1.0-dev.jar --> 0.1.0 (does dev==beta here?)
org.apache.commons.codec_1.2.0.jar --> 1.3 (dated 11/2005)
org.apache.commons.httpclient_3.0.0.rc2.jar --> 3.1 (dated 8/2007)

3) I know that the Jacob software is used to parse Outlook files. The existence of the Jacob DLL means to me that this Outlook parsing will only work in Windows. Is that correct? Are there any other OS dependencies in Aperture? We're hoping that Aperture use and functionality will be OS-independent.

4) We want to use Aperture only for document parsing; we don't need its crawling functionality. Some of the dependency jar files are used only for crawling. Have you ever considered splitting the crawling and the parsing into separate artifacts to reduce the user's required dependencies? I presume you have, but found the cost to be prohibitive, or your other users just don't need it.

If you've read this far, I owe you a drink. ;) Seriously, I realize this message is long and appreciate your attention. Thanks for any help you can offer.

Regards,
Keith Bennett
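
For illustration, the copy, rename and MD5 steps described in this message could be sketched roughly as follows. This is only a sketch in Python rather than the shell script mentioned above. The function names `maven_jar_path` and `install_jar` are hypothetical, and the group ID, artifact ID and version for each jar would have to be supplied by hand, since nothing in the lib directory records them. It assumes the standard Maven 2 layout (group ID split on dots, then artifact ID, then version) and a checksum file named after the jar with `.md5` appended.

```python
import hashlib
import shutil
from pathlib import Path


def maven_jar_path(repo_root, group_id, artifact_id, version):
    """Return the location of a jar inside a Maven 2 style repository."""
    # Assumed layout: org.example -> org/example/<artifact>/<version>/
    group_dir = Path(repo_root).joinpath(*group_id.split("."))
    return group_dir / artifact_id / version / f"{artifact_id}-{version}.jar"


def install_jar(jar_file, repo_root, group_id, artifact_id, version, write_md5=True):
    """Copy jar_file into the repository under its Maven name.

    Optionally writes the MD5 sum next to it as <jar name>.md5.
    Returns the path of the copied jar.
    """
    target = maven_jar_path(repo_root, group_id, artifact_id, version)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(jar_file, target)
    if write_md5:
        digest = hashlib.md5(target.read_bytes()).hexdigest()
        Path(str(target) + ".md5").write_text(digest + "\n")
    return target
```

A call such as `install_jar("lib/winlaf-0.5.1.jar", "/path/to/repo", "org.example", "winlaf", "0.5.1")` would then place the jar at `org/example/winlaf/0.5.1/winlaf-0.5.1.jar` under the repository root; the group ID `org.example` here is a placeholder, not the library's real coordinates.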