From: Jack P. <jac...@gm...> - 2011-07-05 17:31:48
When one chooses the path to the journal, has anyone had any success choosing a path to one of those RAID networked terabyte stores?

Jack

On Tue, Jul 5, 2011 at 10:26 AM, Bryan Thompson <br...@sy...> wrote:
> This is a bigdata (R) release. This release is capable of loading 1B triples in
> under one hour on a 15 node cluster. JDK 1.6 is required.
>
> Bigdata(R) is a horizontally scaled open source architecture for indexed data
> with an emphasis on semantic web data architectures. Bigdata operates in both
> a single machine mode (Journal) and a cluster mode (Federation). The Journal
> provides fast scalable ACID indexed storage for very large data sets. The
> federation provides fast scalable shard-wise parallel indexed storage using
> dynamic sharding and shard-wise ACID updates. Both platforms support fully
> concurrent readers with snapshot isolation.
>
> Distributed processing offers greater throughput but does not reduce query or
> update latency. Choose the Journal when the anticipated scale and throughput
> requirements permit. Choose the Federation when the administrative and machine
> overhead associated with operating a cluster is an acceptable tradeoff to have
> essentially unlimited data scaling and throughput.
>
> See [1,2,8] for instructions on installing bigdata(R), [4] for the javadoc, and
> [3,5,6] for news, questions, and the latest developments. For more information
> about SYSTAP, LLC and bigdata, see [7].
>
> Starting with this release, we offer a WAR artifact [8] for easy installation of
> the Journal mode database. For custom development and cluster installations we
> recommend checking out the code from SVN using the tag for this release. The
> code will build automatically under eclipse. You can also build the code using
> the ant script. The cluster installer requires the use of the ant script. You
> can checkout this release from the following URL:
>
> https://bigdata.svn.sourceforge.net/svnroot/bigdata/branches/BIGDATA_RELEASE_1_0_0
>
> New features:
>
> - Single machine data storage to ~50B triples/quads (RWStore);
> - Simple embedded and/or webapp deployment (NanoSparqlServer);
> - Fast 100% native SPARQL 1.0 evaluation with lots of query optimizations;
>
> Feature summary:
>
> - Triples, quads, or triples with provenance (SIDs);
> - Fast RDFS+ inference and truth maintenance;
> - Clustered data storage is essentially unlimited;
> - Fast statement level provenance mode (SIDs).
>
> The road map [3] for the next releases includes:
>
> - High-volume analytic query and SPARQL 1.1 query, including aggregations;
> - Simplified deployment, configuration, and administration for clusters; and
> - High availability for the journal and the cluster.
>
> For more information, please see the following links:
>
> [1] https://sourceforge.net/apps/mediawiki/bigdata/index.php?title=Main_Page
> [2] https://sourceforge.net/apps/mediawiki/bigdata/index.php?title=GettingStarted
> [3] https://sourceforge.net/apps/mediawiki/bigdata/index.php?title=Roadmap
> [4] http://www.bigdata.com/bigdata/docs/api/
> [5] http://sourceforge.net/projects/bigdata/
> [6] http://www.bigdata.com/blog
> [7] http://www.systap.com/bigdata.htm
> [8] https://sourceforge.net/projects/bigdata/files/bigdata/
>
> About bigdata:
>
> Bigdata(r) is a horizontally-scaled, general purpose storage and computing fabric
> for ordered data (B+Trees), designed to operate on either a single server or a
> cluster of commodity hardware. Bigdata(r) uses dynamically partitioned key-range
> shards in order to remove any realistic scaling limits - in principle, bigdata(r)
> may be deployed on 10s, 100s, or even thousands of machines and new capacity may
> be added incrementally without requiring the full reload of all data. The bigdata(r)
> RDF database supports RDFS and OWL Lite reasoning, high-level query (SPARQL),
> and datum level provenance.