From: Antoni M. <ant...@gm...> - 2009-08-31 17:56:24
Sean O'Connor wrote:
> Hi all,
> I am working on implementing a DataAccess backed by a SQL database.
> Since I like java more than sql, and I've gotten used to ORM (and
> grails/gorm), I'd like to try using hibernate.
>
> From what I can tell so far, here are the steps I need to take:
>
>  * HibernateAccessData
>      o class I create: extend AccessDataImpl
>          + is this reasonable? seems like a good starting point.
>          + put(id, key, value) and get(id, key) are the two
>            functions I should focus on, right?
>          + AggregationNode, UntouchedIterator,
>            AggregatedClosureIterator
>              # do I need to do something with these?
>          + idMap, referredIDMap, aggregatedIDMap
>              # is this just efficient lookups?
>              # If I am happy with hibernate and it's caches,
>                can I ignore/remove/...?

AccessData is actually more complex than a hashmap of hashmaps. For each
id it stores the following elements (conceptually):

 - a hashmap of strings (e.g. "lastModified" => "134523451")
 - a boolean touched/untouched flag. At the beginning all ids are
   untouched; whenever you call put(String,String,String) or
   get(String,String), the id is marked as 'touched'. There are two
   important use cases:
    - "have I seen it before?", e.g. so the web crawler doesn't get
      caught in circular links
    - "report all unseen ids as deleted", done by the CrawlerBase;
      without it no crawler will report deleted objects correctly
 - a set of "referredIDs"
    - used by the web crawler to store info about links coming out of a
      given website
    - if you don't use the web crawler, you can ignore this mechanism
      (leave an empty stub)
 - a set of aggregatedIDs
    - stores information about the aggregation between data objects
    - if we detect that a zip is unmodified, we can report everything
      inside as unmodified; otherwise we'd either have to crawl into the
      zip file each time, or else every 'untouched' data object would be
      reported as deleted (the same goes for unmodified emails with
      attachments).
    - you could experiment with ignoring this mechanism (leaving an empty
      stub), but then reporting unmodified zip files and emails would
      not work

>  * HibernateAccessor
>      o class I create: implement DataAccessor
>      o public DataObject getDataObject(...)
>      o public DataObject getDataObjectIfModified(...)
>      o Map params :: these are optional?
>      o RDFContainerFactory containerFactory =
>        getRDFContainerFactory(postHref);
>          + // copy/paste from DeliciousCrawler
>          + so I need get a ref to the containerFactory in my
>            crawler, and then pass it as needed?

You can leave params as null if your accessor doesn't use them. I assume
that your accessor would fetch some objects from the database via
Hibernate and convert them to RDF. The containerFactory is just meant to
give the accessor a place to store the data in. Make sure all the models
you use are opened before being passed to the accessor.

>  * Initialization?
>      o I am lost in OSGi.

I know what you mean. If you don't use OSGi you can ignore it. If you do,
you need to remember to:

 - implement an activator (use other activators as a template)
 - make sure that the correct manifest headers are generated; we use the
   maven-bundle-plugin, use other pom.xml files as a template
 - make sure all dependencies you use are available as OSGi bundles;
   either OSGi-enable them yourself or look at:
   http://www.springsource.com/repository
   http://download.eclipse.org/tools/orbit/downloads/

>      o Can I "take the easy way out?" -- some hardcoded
>        initialization to start with, for simplicity sake?

Don't know exactly what you mean. In a plain non-OSGi Java application
you can use your accessor directly

    DataAccessor acc = new HibernateDataAccessor(...);

or via a factory

    DataAccessorFactory fac = new HibernateDataAccessorFactory(...);
    DataAccessor acc = fac.get();

or add your factory to the default registry

    DataAccessorRegistry reg = new DefaultDataAccessorRegistry();
    reg.add(new HibernateDataAccessorFactory(...));
    DataAccessorFactory fac = reg.get("hibernate"); // or something similar
    DataAccessor acc = fac.get();

... depending on your use case.

>      o Perhaps all I have to worry about is initializing hibernate
>        in my app?

In general, the architectural goal of Aperture is to make it possible to
write an application that knows nothing about concrete implementations
and works only with URIs of data objects, data source types and URI
schemes. In such a case it would probably be best to initialize Hibernate
in a factory, pass the factory to a registry, and assume that there will
be a single instance of the factory inside the application. That's how
the Nepomuk Eclipse prototype worked. You are obviously free to deviate
from this if that's better for your application.

All kinds of comments welcome

Antoni Mylka
ant...@gm...
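
P.S. To make the per-id bookkeeping described above more concrete, here is a minimal in-memory sketch of the touched/untouched flag and the aggregatedIDs set. It is not Aperture's AccessDataImpl: the class AccessDataSketch and the methods aggregate, startCrawl, touchRecursively and untouchedIds are names made up for illustration, and a Hibernate-backed version would keep the same state in tables instead of maps.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model of the per-id state; all names here are hypothetical. */
class AccessDataSketch {

    /** Everything stored for one id. */
    private static final class IdState {
        final Map<String, String> values = new HashMap<>();
        final Set<String> aggregatedIds = new HashSet<>();
        boolean touched; // false until put() or get() is called for this id
    }

    private final Map<String, IdState> states = new HashMap<>();

    /** Stores a value and marks the id as touched. */
    void put(String id, String key, String value) {
        IdState s = states.computeIfAbsent(id, k -> new IdState());
        s.values.put(key, value);
        s.touched = true;
    }

    /** Returns the stored value (or null) and marks a known id as touched. */
    String get(String id, String key) {
        IdState s = states.get(id);
        if (s == null) {
            return null;
        }
        s.touched = true;
        return s.values.get(key);
    }

    /** Records that childId lives inside parentId (e.g. a file inside a zip). */
    void aggregate(String parentId, String childId) {
        states.computeIfAbsent(parentId, k -> new IdState()).aggregatedIds.add(childId);
        states.computeIfAbsent(childId, k -> new IdState());
    }

    /** Resets every flag to untouched at the beginning of a crawl. */
    void startCrawl() {
        for (IdState s : states.values()) {
            s.touched = false;
        }
    }

    /** Marks an id and everything aggregated under it as touched. */
    void touchRecursively(String id) {
        Set<String> visited = new HashSet<>(); // guards against aggregation cycles
        Deque<String> pending = new ArrayDeque<>();
        pending.push(id);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            IdState s = states.get(current);
            if (s == null || !visited.add(current)) {
                continue;
            }
            s.touched = true;
            pending.addAll(s.aggregatedIds);
        }
    }

    /** Ids never touched during the crawl; a crawler would report these as deleted. */
    Set<String> untouchedIds() {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, IdState> e : states.entrySet()) {
            if (!e.getValue().touched) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

In this sketch, a second crawl that finds a zip unmodified would call touchRecursively on the zip's id, so the entries inside it do not show up in untouchedIds() and are not reported as deleted.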