From: Gunnar A. G. <gun...@df...> - 2006-10-27 17:18:05
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Well done! I can rewrite the appleaddressbook crawler to use rdf2go classes
and test it when I get back next Friday. To create the temp model the
crawler would also need to get a modelfactory somehow... hmm. Maybe this
could be exposed in the rdfcontainer interface?

- - Gunnar

Antoni Mylka wrote:
> So, more or less the first stage of migration seems to be nearing
> completion. Here is a short summary of what has been done, what remains
> to be done and what issues arose in the process.
>
> -----------------------------------------------------------------------------
>
> Translation issues:
>
> RDFContainer
>
> changed the interfaces to their RDF2Go equivalents
> getModel returns an RDFModel, not a generic Object
> getValueFactory returns a ValueFactory
>
> Creation of values and statements
>
> There are three ways to create values (URIs, Literals, BlankNodes and
> Statements):
> * using the model interface directly
> * using a value factory
> * using ModelUtil static methods (which are exactly the same as those in the
>   valueFactory, but accept a model as their first argument).
>
> The rest is more of a mechanical process:
> * changing imports
> * changing new URIImpl into URIImpl.createURIWithoutChecking
> * changing new LiteralImpl into ModelUtil.createLiteral, or
>   rdfContainer.getValueFactory().createLiteral
>
>
> -----------------------------------------------------------------------------
>
> Implementation independence issues:
>
> There are three goals I tried to pursue.
>
> 1. Core aperture should depend only on rdf2go.jar (that is,
> no class from the core aperture should use any concrete model
> implementation, since that would introduce a dependency on that
> particular implementation).
>
> 2. The tests may depend on ModelImplSesame (but not on org.openrdf ...
> classes)
>
> 3. From the concrete Node implementations only URIImpl can be used
> directly in aperture code.
> It is impossible to prevent it, since it
> causes chicken-and-egg type problems, e.g. with constructing
> rdfContainers. All other values are to be created with the aid of a
> model instance (either directly, through a valuefactory, or through
> ModelUtil).
>
> This is broken in:
>
> The AppleAddressBookCrawler. It uses a temporary model and invokes a
> createSimpleModel() method that uses ModelImplSesame. Originally it was
> in the RepositoryUtil class. I moved it to the AppleAddressBookCrawler to
> make the dependency issues more visible.
>
> The RDF2GoRDFContainer itself. If the user doesn't provide a model, it
> creates a default ModelImplSesame model. I did some searching around the
> code. The only place within core architecture classes where those
> constructors are used is the RDF2GoRDFContainer factory. The factory
> itself is used in other parts of the code but no class creates it. If
> removing this dependency is a concern, I would suggest the following
> solution.
>
> 1. Have RDF2GoRDFContainer accept a Model from outside. Don't provide
> any default implementation.
>
> 2. Ask Benjamin to create a ModelImplSesameFactory. (I'm actually
> surprised it isn't there).
>
> 3. Create a constructor for RDF2GoRDFContainerFactory that accepts an
> instance of the ModelFactory interface. (It is possible since, as I said,
> no aperture class creates instances of RDFContainerFactory, and the
> DEFAULT_FACTORY static field is never used in aperture).
>
> 4. Use the ModelFactory in newInstance and getRDFContainer
>
> 5. Remove the DEFAULT_FACTORY field.
>
> This might break applications that use aperture. It's hard for me to
> estimate what the costs of such a change would be
> (every RDFContainerFactory creation would need an
> instance of ModelFactory, DEFAULT_FACTORY couldn't be used).
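
A minimal sketch of what steps 1 to 5 above could look like, for illustration only. Every type in it (Model, ModelFactory, InMemoryModel, Container, ContainerFactory) is a hypothetical stand-in written for this example; none of them is the actual RDF2Go or Aperture class, and the real signatures may well differ. The point is only the shape of the dependency: the container factory receives a ModelFactory through its constructor, so the core code never names a concrete model implementation and there is no DEFAULT_FACTORY.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: all nested types are hypothetical stand-ins,
 * NOT the real RDF2Go or Aperture API.
 */
final class ModelFactoryInjectionSketch {

    /** Stand-in for an RDF model; it merely stores statement strings. */
    interface Model {
        void addStatement(String statement);
        int size();
    }

    /** Stand-in for the ModelFactory interface mentioned in step 3. */
    interface ModelFactory {
        Model createModel();
    }

    /** Plays the role of a concrete implementation living outside the core. */
    static final class InMemoryModel implements Model {
        private final List<String> statements = new ArrayList<>();
        @Override public void addStatement(String statement) { statements.add(statement); }
        @Override public int size() { return statements.size(); }
    }

    /** Step 1: the container gets its model from outside and has no default. */
    static final class Container {
        private final Model model;
        private final String describedUri;

        Container(Model model, String describedUri) {
            if (model == null) {
                throw new IllegalArgumentException("model must not be null");
            }
            this.model = model;
            this.describedUri = describedUri;
        }

        Model getModel() { return model; }
        String getDescribedUri() { return describedUri; }
    }

    /** Steps 3 to 5: the factory is built around a ModelFactory; no static default. */
    static final class ContainerFactory {
        private final ModelFactory modelFactory;

        ContainerFactory(ModelFactory modelFactory) {
            if (modelFactory == null) {
                throw new IllegalArgumentException("modelFactory must not be null");
            }
            this.modelFactory = modelFactory;
        }

        /** Step 4: every new container gets a fresh model from the injected factory. */
        Container newInstance(String describedUri) {
            return new Container(modelFactory.createModel(), describedUri);
        }
    }

    public static void main(String[] args) {
        // The application, not the core library, picks the concrete model.
        ContainerFactory factory = new ContainerFactory(InMemoryModel::new);
        Container container = factory.newInstance("urn:example:contact1");
        container.getModel().addStatement("urn:example:contact1 hasName \"Example\"");
        System.out.println(container.getDescribedUri() + " holds "
                + container.getModel().size() + " statement(s)");
    }
}
```

Under these assumptions, a crawler that needs a temporary model could be handed the same ModelFactory instead of calling a concrete constructor itself. The cost is the one noted above: every caller that creates a container factory has to supply a ModelFactory.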
>
> -----------------------------------------------------------------------------
>
> URI Validation
>
> Leo is right that a malformed URI is usually an indication of a bug on
> our side that should be found and removed. Nevertheless, if a schema for
> some input file format states that some elements from the input should
> be represented as URIs in the RDF output, we should be prepared for
> situations where an input file can contain arbitrary strings that should
> be interpreted as 'URIs'. The solution to ignore DataObjects with faulty
> URIs is simple and clear, but it cannot be implemented with java.net.URI,
> since a simple string without spaces is accepted. I do insist that we
> need a general way to validate URIs from the input... a checkURI method
> in the Model interface would be sufficient. It has been proposed
> on the RDF2Go devel list independently by me and by Mr. Richard Cyganiak.
>
> In reply to this discussion Max Volkel changed a single comment in the
> Model interface, from
>
> /** @return a new URI from the given String */
>
> to
>
> /** The model must create URIs it would accept itself.
> @return a new URI from the given String */
>
> This is clearly NOT a solution, because in the current implementation the
> validation is employed only when the log level is high enough. We can't
> build an application that will be robust only if debug log level is
> enabled...
>
> -------------------------------------------------------------------------------
>
> Other issues:
>
> ...outlook.OutlookResource
>
> Contains references to the ICAL ontology. May I switch it to the ICALTZD
> ontology?
>
> Untestable classes...
>
> ...outlook.TestOutlookCrawlAll
> ...outlook.TestOutlookCrawler
> ...addressbook.AppleAddressbookCrawlerTest
>
> Couldn't test them since I don't have Outlook and Apple Addressbook...
>
> -----------------------------------------------------------------------------
>
> Obsolete classes - had to be rewritten...
>
> RepositoryAccessData - replaced by ModelAccessData
>
> Since rdf2go doesn't support contexts directly, it would be up to the
> user of ModelAccessData to provide a model implementation that would
> use an appropriate context (if necessary)
>
> SesameRDFContainer - replaced by RDF2GoRDFContainer
> SesameRDFContainerFactory - replaced by RDF2GoRDFContainerFactory
>
> RepositoryUtil - methods from this class have been included in ModelUtil
>
> I have also rewritten the ConfigurationUtil.get/set domain boundaries
> and the VocabularyWriter, so they don't use SeRQL queries anymore.
>
> -----------------------------------------------------------------------------
> RDF2Go Bugs...
>
> Something's wrong with the reading part... I have sent an email to the
> RDF2Go devel mailing list. This exception occurs in the VocabularyWriter
> and in ThunderbirdCrawlerTest
>
> The ModelImplSesame constructor was wrong...
>
> There are deadlocks...
>
> There seems to be no implementation of ModelSet in the Sesame2 driver.
> Or maybe I don't understand how to use it... I couldn't find any
> documentation for the ModelFactory.getModelSet(Properties p). What
> properties could go there?
>
> Antoni Mylka
> ant...@df...
>
> _______________________________________________
> Aperture-devel mailing list
> Ape...@li...
> https://lists.sourceforge.net/lists/listinfo/aperture-devel

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFQk4DfD15aMgAOfcRAgECAKDhnYZ7laSbuqS/UzWf0/Z307vmRQCg4r9G
2otqsGuuAziRFuf0fxY/j5w=
=fB8+
-----END PGP SIGNATURE-----
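
The URI validation point in the quoted message (java.net.URI accepts a plain string without spaces) can be illustrated with a small sketch. The class and the helper names parsesWithJavaNetUri and isAbsoluteUri are hypothetical and exist only for this example; they are not part of RDF2Go, and requiring an absolute URI is just one possible stricter check, not the checkURI method proposed on the RDF2Go list.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Sketch only: hypothetical helpers, not part of RDF2Go or Aperture. */
final class UriCheckSketch {

    /** True if java.net.URI can parse the string at all. */
    static boolean parsesWithJavaNetUri(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /** Stricter variant: the string must parse AND carry a scheme. */
    static boolean isAbsoluteUri(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            return new URI(candidate).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = { "http://example.org/contact/1", "justAString", "has space" };
        for (String sample : samples) {
            System.out.println(sample
                    + " | parses: " + parsesWithJavaNetUri(sample)
                    + " | absolute: " + isAbsoluteUri(sample));
        }
    }
}
```

A bare word such as justAString parses because java.net.URI treats it as a relative reference; only the additional scheme check rejects it. Whether an absolute-URI requirement is the right rule for every input format is a separate question.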