From: Bryan T. <br...@sy...> - 2010-09-26 01:58:58

Sean,

What follows is where I think the module divisions might be for a fine-grained mavenization. I've tried to give this in increasing dependency order. As you've noted, there are a variety of places where an API module needs to be defined to break a cycle. That is a lot of possible divisions. However, we do not need JARs for all of them. For example, there could be one JAR per concrete service implementation without having JARs for their abstract implementations. (A rough sketch of the corresponding parent POM and one example module POM is appended below.)

- commons-io, -util, -striterator, etc.
- counters (but concrete implementations of counter stores will be backed by an RW journal).
- store-api (IRawStore, IBigdataFederation, IResourceMetadata, etc).
- btree (everything under com.bigdata.btree).
- sparse (the key-value store, which is a dependency for journals and services).
- query (the query engine and non-RDF-specific query operators: com.bigdata.relation, com.bigdata.bop).
- services-api (APIs for the various services).
- services-async-api (asynchronous write pipeline APIs, which are used by the RDF bulk loader).
- rdf
- sail
- journal (WORM, RW, etc).
- journal-HA (journal + HA, which brings in jini and zookeeper).
- services-async-impl (used for distributing jobs between masters and clients and for asynchronous index writes in the bulk loader).
- services-tx, -mds, -ds, -lbs, -cs, etc. The abstract service implementations. The data service module would get com.bigdata.resources, which is the shard container stuff.
- query-peer. The version of the query engine which is federation-aware. The peer lives in the data service, but I could see cases where we might want to reuse this to query against the performance counters in the load balancer.
- services-jini-tx, -mds, -ds, -lbs, -cs, etc. The concrete implementations of the various services using jini.

There are definitely going to be explicit casts to AbstractJournal from the rdf/sail stuff, but I think that we can work through those. They are mainly for reporting out various interesting details.

Eventually, I think that both the load balancer and the metadata service might be decomposed or deconstructed. The load balancer could be split into a stateless piece which queries a service that aggregates and manages performance counters for the federation. The metadata service could be decomposed into a P2P protocol with the data services self-reporting the shards for which they have responsibility.

Thanks,
Bryan
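To make the layout concrete, here is a minimal sketch of what the aggregator (parent) POM might look like. The bigdata- artifactId prefix, the com.bigdata groupId, the version, and the one-module-per-division mapping are my assumptions, not settled names, and as noted above several of the service divisions could collapse into fewer JARs:

  <project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>

    <!-- Hypothetical coordinates; the real groupId/version would follow the existing build. -->
    <groupId>com.bigdata</groupId>
    <artifactId>bigdata-parent</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <packaging>pom</packaging>

    <!-- One child module per division, listed roughly in increasing dependency order. -->
    <modules>
      <module>bigdata-commons-io</module>
      <module>bigdata-commons-util</module>
      <module>bigdata-commons-striterator</module>
      <module>bigdata-counters</module>
      <module>bigdata-store-api</module>
      <module>bigdata-btree</module>
      <module>bigdata-sparse</module>
      <module>bigdata-query</module>
      <module>bigdata-services-api</module>
      <module>bigdata-services-async-api</module>
      <module>bigdata-rdf</module>
      <module>bigdata-sail</module>
      <module>bigdata-journal</module>
      <module>bigdata-journal-ha</module>
      <module>bigdata-services-async-impl</module>
      <module>bigdata-services-tx</module>
      <module>bigdata-services-mds</module>
      <module>bigdata-services-ds</module>
      <module>bigdata-services-lbs</module>
      <module>bigdata-services-cs</module>
      <module>bigdata-query-peer</module>
      <module>bigdata-services-jini</module>
    </modules>
  </project>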
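And as one example of the dependency direction, a sketch of what the journal-HA module might declare, assuming the artifact names from the parent sketch above. The point is that only this module pulls in jini and zookeeper, so the plain journal and everything below it stays free of them; the third-party coordinates shown are placeholders for whatever the current Ant build actually bundles:

  <project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>

    <parent>
      <groupId>com.bigdata</groupId>
      <artifactId>bigdata-parent</artifactId>
      <version>1.0.0-SNAPSHOT</version>
    </parent>

    <artifactId>bigdata-journal-ha</artifactId>

    <dependencies>
      <!-- The plain journal (WORM, RW) sits below HA in the dependency order. -->
      <dependency>
        <groupId>com.bigdata</groupId>
        <artifactId>bigdata-journal</artifactId>
        <version>${project.version}</version>
      </dependency>
      <!-- HA is what brings zookeeper into the build; version is a placeholder. -->
      <dependency>
        <groupId>org.apache.zookeeper</groupId>
        <artifactId>zookeeper</artifactId>
        <version>3.3.1</version>
      </dependency>
      <!-- jini coordinates are a placeholder for whichever distribution the project bundles. -->
      <dependency>
        <groupId>net.jini</groupId>
        <artifactId>jini-core</artifactId>
        <version>2.1</version>
      </dependency>
    </dependencies>
  </project>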