From: Leo S. <leo...@df...> - 2009-01-26 13:42:22
Hi Daniel!

Good work!

In general: I proposed in December that fine-grained packages are the way to go, and I think we should document all these proposals and the decisions here:
http://aperture.wiki.sourceforge.net/ApertureInOSGi

So Antoni, Daniel, please read on and say what "the masterplan" is now, and then someone (= probably I) will change the ApertureInOSGi wiki page to show the masterplan.

Answers inline...

It was Dan...@em... who said at the right time 26.01.2009 11:08 the following words:
> Hi all,
>
> with the fixes provided by Antoni I managed to get the "bundelized"
> aperture to run in Smila.
>
Hooray!

> In Smila we should refactor our two existing aperture integration
> bundles into just one and also clean up the code and implement a
> ProcessingService instead of a pipelet (Aperture OSGi services are used
> now which "cries" to use DS)
>
No, I think you must not refactor these into one! From what I know about our architecture, refactoring the bundles would cause trouble:

* Aperture is separated into interfaces and implementations (framework <> implementation). Bundling it into one would give the wrong impression to other developers, who would then think that Aperture is a monolithic piece of ..., whereas "really" Aperture is a perfectly OSGi-conformant framework, similar to Eclipse extension points (= you would also not bundle all implementations of an extension point into the bundle defining the extension!).
* If there are different binary releases for OSGi and on SourceForge, this would cause disaster. We intentionally want to have ONE RELEASE as Java and OSGi versions; repackaging it for Eclipse would break this.

Did I miss something? Does this help?

> Here is a list of all the bundles (and their License) required to run
> "bundelized" aperture in Smila:
>
> com.drew.metadata_2.4.0.jar (Public Domain)
> javax.activation_1.1.1.jar (CDDL)
> javax.mail_1.4.1.jar (CDDL)
> jcl104-over-slf4j-1.5.0.jar (MIT)
> openrdf-sesame-2.2.1-onejar-osgi.jar (BSD)
> org.apache.poi_3.2.0.jar (Apache License 2.0)
> org.bouncycastle.bcmail_1.32.0.jar (MIT)
> org.bouncycastle.bcprovider_1.32.0.jar (MIT)
> org.fontbox_0.2.0.jar (BSD)
> org.htmlparser_1.6.0.jar (CPL 1.0)
> org.jempbox.xmp_0.2.0.jar (BSD)
> org.pdfbox_0.7.4.jar (BSD)
> org.semanticdesktop.aperture.safe_1.2.0.jar (BSD)
> org.semanticdesktop.aperture_1.2.0.jar (BSD)
> rdf2go.api-4.7.0.jar (BSD)
> rdf2go.impl.sesame22-4.7.0.jar (BSD)
> slf4j-api-1.5.0.jar (MIT)
> slf4j-jdk14-1.5.0.jar (MIT)
> com.sun.media.jai (Sun Binary Code License Agreement) required by
> PDFBox. Did not publish this bundle yet, as we can't use it in Smila.
>
> License wise, the bundles are all EPL compatible except for
> com.sun.media.jai.
>
Antoni is keeping a lookout on PDFBox because of that.

> 1) bundle org.semanticdesktop.aperture.safe_1.2.0.jar imports packages
> from org.pdfbox_0.7.4.jar which in turn imports packages from
> com.sun.media.jai. As the latter can't be provided by Smila (because of
> LGPL) the other two bundles cannot be started if these packages are
> missing!!! So we should separate the Extractors relying on PDFBox from
> the other Extractors (putting them in their own bundle).
>
Yep, for now this solves the issue.

> It seems to be a good approach in general, to provide the Extractors not
> in one bundle but on a "bundle per extractor" basis.
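To make point 1) concrete, here is a toy model of the resolution problem in plain Java. It is only a sketch: it does not use the OSGi API, and all bundle and package names in it ("aperture.api", "pdfbox.api", "jai.api", "extractor.pdf", ...) are made-up placeholders, not the real manifests. The assumption is simply that a bundle can only resolve if every package it imports is exported by some other bundle that resolves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of import/export resolution; NOT the OSGi API, names are placeholders. */
final class BundleResolutionSketch {

    static final class Bundle {
        final String name;
        final Set<String> exports;
        final Set<String> imports;

        Bundle(String name, List<String> exports, List<String> imports) {
            this.name = name;
            this.exports = new HashSet<>(exports);
            this.imports = new HashSet<>(imports);
        }
    }

    private static List<String> pkgs(String... names) {
        return Arrays.asList(names);
    }

    /** Repeatedly drops bundles with an unsatisfied import until nothing changes. */
    static Set<String> resolvable(Collection<Bundle> installed) {
        List<Bundle> alive = new ArrayList<>(installed);
        boolean changed = true;
        while (changed) {
            Set<String> exported = new HashSet<>();
            for (Bundle b : alive) {
                exported.addAll(b.exports);
            }
            changed = alive.removeIf(b -> !exported.containsAll(b.imports));
        }
        Set<String> names = new TreeSet<>();
        for (Bundle b : alive) {
            names.add(b.name);
        }
        return names;
    }

    /** All "safe" extractors in one bundle; the JAI packages are not installed. */
    static List<Bundle> monolithicLayout() {
        return Arrays.asList(
            new Bundle("aperture.core", pkgs("aperture.api"), pkgs()),
            new Bundle("aperture.safe", pkgs(), pkgs("aperture.api", "pdfbox.api", "poi.api")),
            new Bundle("pdfbox", pkgs("pdfbox.api"), pkgs("jai.api")),
            new Bundle("poi", pkgs("poi.api"), pkgs()));
    }

    /** One bundle per extractor; only the PDF extractor depends on PDFBox. */
    static List<Bundle> splitLayout() {
        return Arrays.asList(
            new Bundle("aperture.core", pkgs("aperture.api"), pkgs()),
            new Bundle("extractor.pdf", pkgs(), pkgs("aperture.api", "pdfbox.api")),
            new Bundle("extractor.office", pkgs(), pkgs("aperture.api", "poi.api")),
            new Bundle("pdfbox", pkgs("pdfbox.api"), pkgs("jai.api")),
            new Bundle("poi", pkgs("poi.api"), pkgs()));
    }

    public static void main(String[] args) {
        System.out.println(resolvable(monolithicLayout())); // [aperture.core, poi]
        System.out.println(resolvable(splitLayout()));      // [aperture.core, extractor.office, poi]
    }
}
```

In the monolithic layout the missing JAI packages take down PDFBox and, with it, the whole extractor bundle. In the split layout only the hypothetical PDF extractor bundle stays unresolved and the Office one still works, which is the argument for "bundle per extractor".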
I made this masterplan back last year, where I said:
>
> * one aperture core OSGi bundle
> * one OSGi bundle for each Extractor (only for extractors that depend
>   on "Eclipse-Friendly" 3rd party libs)
> * all remaining crawlers & subcrawlers & extractors into an extra
>   OSGi package "the rest"
>
> Antoni, we already prepared all the fine-grained-activators for this,
> so the task at hand is just to check the weird dependencies in the
> core OSGi bundle (lib/applewrapper, lib/aduna-commons-xml-2.0.jar)
> and move - one by one - the most useful extractors into individual
> OSGi bundles.
>
> Once we got some core Extractors out there, we can do a release and done.
> Can we get these running quick?
> * Excel, Jpg, Office, OpenDocument, Pdf, Plaintext, Powerpoint, RTF
> ... + all others that depend on POI
> (PDF will be a beast because we have no official release of PDFBox)

So we are halfway there - we are still missing the individual bundles for each extractor. A proper packaging must somehow be "one bundle per extractor" because of the 3rd party libs hassle.

At the moment we have "all safe extractors in one bundle", which we call "contrib". That is a bit weird, because it is NOT what we have in the aperture-contrib project, but anyway, it works (tm).

As nobody objected back then, I assume this is still the masterplan! Daniel? Antoni - should we change
http://aperture.wiki.sourceforge.net/ApertureInOSGi
to reflect what I said above?

> Even though the
> Licenses of the other 3rd party bundles are OK, this does NOT mean that
> the bundles will pass eclipse legal process ! One common problem is code
> provenance. So if all Extractors remain in one bundle
> org.semanticdesktop.aperture.safe_1.2.0.jar and just one 3rd party
> bundle used by one Extractor does not pass its CQ, Aperture can't be
> used in Smila until this CQ is resolved or the dependencies are removed.
> Finer grained bundles will allow us to use Aperture with a subset of
> available Extractors.
> Adding additional extractors when their CQs are completed.
>
Ha, in the beginning you said one bundle for the whole of Aperture, now you follow the track of "one bundle for each extractor" ;-)
I guess we are thinking in the same direction :-)

> 2) bundle org.semanticdesktop.aperture_1.2.0.jar contains 2 jar files
> + aduna-commons-xml.2.0.jar
> + applewrapper-0.2.jar
> We need to create CQs for both jars and according to
> http://aperture.wiki.sourceforge.net/Dependencies applewrapper-0.2.jar
> is LGPL !? Are there any alternatives ?
>
This is messed up, but I think Antoni fixed it today.

> 3) do we need all those bundles for just mimetype detection and
> extractors ? (e.g. sesame ?) Or could some dependencies be removed,
> perhaps also by finer grained bundles ?
>
Aperture is a SEMANTIC framework (as the S in SMILA :-), so we build on RDF = Sesame has to stay in or nothing will work in Aperture. Theoretically, it can be exchanged for Jena, because we are based on RDF2Go, but you don't want to look into their own private hell of ~10 MB of dependencies.

best
Leo

>
> Bye,
> Daniel
>

--
____________________________________________________
DI Leo Sauermann       http://www.dfki.de/~sauermann

Deutsches Forschungszentrum fuer
Kuenstliche Intelligenz DFKI GmbH
Trippstadter Strasse 122
P.O. Box 2080           Fon:   +49 631 20575-116
D-67663 Kaiserslautern  Fax:   +49 631 20575-102
Germany                 Mail:  leo...@df...

Geschaeftsfuehrung:
Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
____________________________________________________