[cweb-CVS] bigdata-rdf/src/java/com/bigdata/rdf TripleStore.java, 1.31, 1.32

Update of /cvsroot/cweb/bigdata-rdf/src/java/com/bigdata/rdf
In directory sc8-pr-cvs4.sourceforge.net:/tmp/cvs-serv797/src/java/com/bigdata/rdf

Modified Files:
	TripleStore.java 
Log Message:
javadoc edits.


Index: TripleStore.java
===================================================================
RCS file: /cvsroot/cweb/bigdata-rdf/src/java/com/bigdata/rdf/TripleStore.java,v
retrieving revision 1.31
retrieving revision 1.32
diff -C2 -d -r1.31 -r1.32
*** TripleStore.java	18 Apr 2007 17:29:08 -0000	1.31
--- TripleStore.java	19 Apr 2007 13:22:31 -0000	1.32
***************
*** 105,110 ****
   * A triple store based on the <em>bigdata</em> architecture.
   * 
-  * @todo verify that re-loading the same data does not cause index writes.
-  * 
   * @todo Refactor to support transactions and concurrent load/query and test
   *       same.
--- 105,108 ----
***************
*** 153,170 ****
   *       where appropriate, so we need to assign identifiers to bnodes in a
   *       restart-safe manner even if we "forget" the term-id mapping.
!  *       
   * @todo modify the term identifier assignment mechanism to be compatible with
   *       the scale-out index partitions (32-bit unique within index partition
!  *       identified plus a restart-safe counter for each index partition). 
   * 
   * @todo Refactor to use a delegation mechanism so that we can run with or
   *       without partitioned indices? (All you have to do now is change the
   *       class that is being extended from Journal to MasterJournal and handle
!  *       some different initialization properties.)
   * 
   * @todo the only added cost for a quad store is the additional statement
   *       indices. There are only three more statement indices in a quad store.
   *       Since statement indices are so cheap, it is probably worth implementing
!  *       them now, even if only as a configuration option.
   * 
   * @todo verify read after commit (restart safe) for large data sets and test
--- 151,171 ----
   *       where appropriate, so we need to assign identifiers to bnodes in a
   *       restart-safe manner even if we "forget" the term-id mapping.
!  * 
   * @todo modify the term identifier assignment mechanism to be compatible with
   *       the scale-out index partitions (32-bit unique within index partition
!  *       identified plus a restart-safe counter for each index partition).
   * 
   * @todo Refactor to use a delegation mechanism so that we can run with or
   *       without partitioned indices? (All you have to do now is change the
   *       class that is being extended from Journal to MasterJournal and handle
!  *       some different initialization properties.) In fact, the "triple store"
!  *       should be a client that uses partitioned indices to talk to metadata
!  *       and data services.
   * 
   * @todo the only added cost for a quad store is the additional statement
   *       indices. There are only three more statement indices in a quad store.
   *       Since statement indices are so cheap, it is probably worth implementing
!  *       them now, even if only as a configuration option. (There may be reasons
!  *       to maintain both versions.)
   * 
   * @todo verify read after commit (restart safe) for large data sets and test
***************
*** 184,193 ****
   * 
   * @todo support metadata about the statement, e.g., whether or not it is an
!  *       inference.
!  * 
!  * @todo compute the MB/sec rate at which the store can load data and compare it
!  *       with the maximum transfer rate for the journal without the btree and
!  *       the maximum transfer rate to disk. this will tell us the overhead of
!  *       the btree implementation.
   * 
   * @todo Try a variant in which we have metadata linking statements and terms
--- 185,192 ----
   * 
   * @todo support metadata about the statement, e.g., whether or not it is an
!  *       inference.  consider that we may need to move the triple/quad ids into
!  *       the value in the statement indices since some key compression schemes
!  *       are not reversable (we depend on reversable keys to extract the term 
!  *       ids for a statement). 
   * 
   * @todo Try a variant in which we have metadata linking statements and terms
***************
*** 217,220 ****
--- 216,240 ----
   *       for more thought.
   * 
+  * @todo examine role for semi joins for a Sesame 2.x integration (quad store
+  *       with real query operators). semi-joins (join indices) can be declared
+  *       for various predicate combinations and then maintained. The
+  *       declarations can be part of the scale-out index metadata. The logic
+  *       that handles batch data load can also maintain the join indices. While
+  *       triggers could be used for this purpose, there would need to be a means
+  *       to aggregate and order the triggered events and then redistribute them
+  *       against the partitions of the join indices. If the logic is in the
+  *       client, then we need to make sure that newly declared join indices are
+  *       fully populated (e.g., clients are notified to start building the join
+  *       index and then we start the index build from existing data to remove
+  *       any chance that the join index would be incomplete - the index would be
+  *       ready as soon as the index build completes and client operations would
+  *       be in a maintenance role).
+  * 
+  * @todo provide option for closing aspects of the entire store vs just a single
+  *       context in a quad store. For example, in an open web and internet scale
+  *       kb it is unlikely that you would want to have all harvested ontologies
+  *       closed against all the data. however, that might make more sense in a
+  *       more controlled setting.
+  * 
   * @author <a href="mailto:tho...@us...">Bryan Thompson</a>
   * @version $Id$