From: b. <no...@so...> - 2010-04-21 16:48:15
#72: Testing - Make changes to allow all tests of TestServiceStarter to pass
when run using both ant & eclipse
--------------------------------+-------------------------------------------
Reporter: btmurphy | Owner: btmurphy
Type: defect | Status: new
Priority: major | Milestone:
Component: Bigdata Federation | Version:
Keywords: |
--------------------------------+-------------------------------------------
Comment 34 of Trac issue #53 indicates that the tests
specified by com.bigdata.jini.start.TestServiceStarter do
not all pass when run using either ant or eclipse.
As noted in that comment, when run using eclipse, the
tests fail because a security policy file is not set on
the VM in which the tests run.
Additionally, because of changes made as part of changeset 2614,
AbstractFedZooTestCase now overrides the groups to join for the
TransactionServer that is started as part of the test. These
changes have resulted in test failures under the ant junit
task: the TransactionServer is not discovered because the
member groups of the lookup service started by ant are not
the same as the groups the TransactionServer is configured
to join.
Finally, besides the above issues, it appears that
TestServiceStarter.test_startServer will fail in a number of
ways related to how the Zookeeper client and server are
handled in that test. For example:
1. Two child znodes are created: a "physical services"
znode and a "master election" znode. But when the test attempts
to verify that the physicalServices znode was registered with
the Zookeeper server, it will fail because it asserts that
there is only one child znode when there are actually two.
2. When the "physical services" znode is created, an empty byte
array is passed for the data parameter, which causes the test
to fail with a RuntimeException when an attempt is made to
validate the serviceUUID by deserializing the data element
associated with that znode (see the sketch at the end of this
ticket description).
3. At the end of test_startServer, the test waits 20 seconds
for the physical services znode to be removed. But it appears
that the test will always fail here because that znode is not
removed until the AbstractZooFedTestCase.tearDown method is
called (which calls ProcessHelper.kill &
JiniFederation.shutdownNow), and tearDown is not invoked until
after the wait for znode removal has timed out and the test
has declared failure.
There are a number of changes that can be made to address the
issues described above.
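For point 2 above, a minimal sketch of the implied fix, assuming the
standard Apache ZooKeeper client API; the class, method, and path names
here are hypothetical, not the actual test code:

    import java.io.ByteArrayOutputStream;
    import java.io.ObjectOutputStream;
    import java.util.UUID;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs.Ids;
    import org.apache.zookeeper.ZooKeeper;

    public class PhysicalServicesZnode {
        // Serialize the service UUID into the znode data instead of writing an
        // empty byte[], so the later deserialization/validation step can succeed.
        static void createWithServiceUUID(ZooKeeper zk, String zpath, UUID serviceUUID)
                throws Exception {
            final ByteArrayOutputStream baos = new ByteArrayOutputStream();
            final ObjectOutputStream oos = new ObjectOutputStream(baos);
            oos.writeObject(serviceUUID); // java.util.UUID is Serializable
            oos.close();
            zk.create(zpath, baos.toByteArray(), // data: serialized UUID, not new byte[0]
                    Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }
    }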
--
Ticket URL: <http://sourceforge.net/apps/trac/bigdata/ticket/72>
bigdata® <http://www.bigdata.com/blog>
bigdata® is a scale-out storage and computing fabric supporting optional transactions, very high concurrency, and very high aggregate IO rates.
From: Bryan T. <br...@sy...> - 2010-04-14 20:24:56
This is a bigdata® snapshot release. This release is capable of loading 1B triples in under one hour on a 15 node cluster and has been used to load up to 13B triples on the same cluster. JDK 1.6 is required. See [1] for instructions on installing bigdata®, [2] for the javadoc, and [3] and [4] for news, questions, and the latest developments. For more information about SYSTAP, LLC and bigdata, see [5].

Please note that we recommend checking out the code from SVN using the tag for this release. The code will build automatically under eclipse. You can also build the code using the ant script. The cluster installer requires the use of the ant script. You can check out this release from the following URL:

https://bigdata.svn.sourceforge.net/svnroot/bigdata/tags/RELEASE_0.82b

This tag corresponds to revision 2604.

New features:

- Full transaction support in the SAIL.
- Installer for the Sesame workbench server.
- SPARQL query hints in the SAIL.
- Native execution of UNION queries (and better overall coverage for native rule execution in general).
- Significant performance gains in SPARQL query execution as measured against the LUBM and BSBM benchmarks. Using a commodity server, the LUBM U50 queries run in ~6 seconds using a branching factor of 64 and the RW store, and BSBM query performance is ~2300 query mixes per hour (QMpH) for 100M triples with 4 concurrent clients and the LIRS cache.
- A new "WORM" persistence store implementation supporting concurrent IO (DiskWORM). The WORMStrategy is a binary compatible replacement for the DiskOnlyStrategy which has the potential for concurrent IOs. People interested in a preview may enable this implementation on either new or existing persistence stores by changing the com.bigdata.journal.BufferMode to DiskWORM.
- A new "RW" persistence store implementation with limited history retention supporting concurrent IO, secure checksummed data, and a storage model that may significantly reduce storage requirements, in particular for skewed data.

The roadmap for the next release includes:

- Refactor of the dynamic sharding mechanism for higher performance;
- High availability for the journal;
- Query optimizations.

Our next milestone will include both high availability and a greatly simplified deployment, configuration, and administration process for the clustered database. Later this year we will be introducing support for RDF spatial and analytic query workloads.

For more information, please see the following links:

[1] http://bigdata.wiki.sourceforge.net/GettingStarted
[2] http://www.bigdata.com/bigdata/docs/api/
[3] http://sourceforge.net/projects/bigdata/
[4] http://www.bigdata.com/blog
[5] http://www.systap.com/bigdata.htm

About bigdata: Bigdata® is a horizontally-scaled, general purpose storage and computing fabric for ordered data (B+Trees), designed to operate on either a single server or a cluster of commodity hardware. Bigdata® uses dynamically partitioned key-range shards in order to remove any realistic scaling limits - in principle, bigdata® may be deployed on 10s, 100s, or even thousands of machines and new capacity may be added incrementally without requiring the full reload of all data. The bigdata® RDF database supports RDFS and OWL Lite reasoning, high-level query (SPARQL), and datum level provenance.
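A minimal sketch of enabling the DiskWORM preview described above,
assuming the Options constants and the Journal(Properties) constructor of
this era's API (the store file name is hypothetical):

    import java.util.Properties;
    import com.bigdata.journal.Journal;
    import com.bigdata.journal.Options;

    public class DiskWormPreview {
        public static void main(String[] args) {
            final Properties p = new Properties();
            // Select the new WORM strategy on a new or existing store; the
            // exact option constants are assumptions from the announcement.
            p.setProperty(Options.BUFFER_MODE, "DiskWORM");
            p.setProperty(Options.FILE, "preview.jnl"); // hypothetical file
            final Journal journal = new Journal(p);
            try {
                // ... use the journal ...
            } finally {
                journal.close();
            }
        }
    }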
From: Brian M. <btm...@gm...> - 2010-04-13 12:55:54
On Tue, Apr 13, 2010 at 8:08 AM, Bryan Thompson <br...@sy...> wrote:
> In preparation for our next release, can everyone with an issue on the tracker please review their issues and update their status if appropriate.

Trac issue #66 has patch files attached that probably should be reviewed by Bryan, Mike, or Martyn prior to merging to the trunk.

Brian M
From: Bryan T. <br...@sy...> - 2010-04-13 12:08:40
Hello,

In preparation for our next release, can everyone with an issue on the tracker please review their issues and update their status if appropriate.

Thanks,
Bryan
From: Bryan T. <br...@sy...> - 2010-04-12 14:42:23
Martyn points out that this may have been an uninterruptible Lock.lock() invocation in JournalTransactionService.SinglePhaseCommit. I think that a survey of the code would discover many places where uninterruptible lock acquisitions are attempted.

Bryan

> -----Original Message-----
> From: Bryan Thompson [mailto:br...@sy...]
> Sent: Monday, April 12, 2010 10:38 AM
> To: big...@li...
> Subject: [Bigdata-developers] Design patterns for shutdown of embedded services and in particular for the Journal.
>
> We currently have a deadlock which occasionally arises in Journal#shutdownNow(). That method closes down various services used by the journal (an executor service, the concurrency manager (which also shuts down the write service), and the local transaction manager). If there is a concurrent commit running, then the code will occasionally deadlock as illustrated below.
>
> I'd like to find someone with experience in shutdown patterns for embedded services who could help out on this shutdown pattern. Any volunteers?
>
> Bryan
>
> "pool-1-thread-560" daemon prio=10 tid=0x52e9ac00 nid=0x7ae6 waiting on condition [0x41869000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x88a1fe40> (a java.util.concurrent.FutureTask$Sync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:905)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1217)
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>     at com.bigdata.journal.JournalTransactionService.commitImpl(JournalTransactionService.java:300)
>     at com.bigdata.service.AbstractTransactionService.commit(AbstractTransactionService.java:1626)
>     at com.bigdata.journal.Journal.commit(Journal.java:726)
>
> "main" prio=10 tid=0x08f62c00 nid=0x6c29 waiting on condition [0xb7395000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x88a241b8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
>     at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
>     at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
>     at com.bigdata.service.AbstractTransactionService.shutdownNow(AbstractTransactionService.java:488)
>     at com.bigdata.journal.Journal$1.shutdownNow(Journal.java:201)
>     at com.bigdata.journal.Journal.shutdownNow(Journal.java:852)
>     - locked <0x86a5dc40> (a com.bigdata.journal.Journal)
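A minimal sketch of the contrast being discussed, using only
java.util.concurrent.locks (the class and method names are illustrative,
not bigdata code): Lock.lock() ignores interrupts, so a shutdown that
interrupts the parked thread cannot unblock it, while
Lock.lockInterruptibly() lets shutdownNow() break the wait.

    import java.util.concurrent.locks.ReentrantLock;

    public class InterruptibleCommit {

        private final ReentrantLock lock = new ReentrantLock();

        // Uninterruptible: a thread parked in lock() cannot be unblocked by
        // Thread.interrupt(), so shutdownNow() cannot break the wait.
        public void commitUninterruptibly() {
            lock.lock();
            try {
                // ... critical section ...
            } finally {
                lock.unlock();
            }
        }

        // Interruptible: an interrupt from shutdownNow() throws
        // InterruptedException here instead of parking forever.
        public void commitInterruptibly() throws InterruptedException {
            lock.lockInterruptibly();
            try {
                // ... critical section ...
            } finally {
                lock.unlock();
            }
        }
    }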
From: Bryan T. <br...@sy...> - 2010-04-12 14:38:23
We currently have a deadlock which occasionally arises in Journal#shutdownNow(). That method closes down various services used by the journal (an executor service, the concurrency manager (which also shuts down the write service), and the local transaction manager). If there is a concurrent commit running, then the code will occasionally deadlock as illustrated below.

I'd like to find someone with experience in shutdown patterns for embedded services who could help out on this shutdown pattern. Any volunteers?

Bryan

"pool-1-thread-560" daemon prio=10 tid=0x52e9ac00 nid=0x7ae6 waiting on condition [0x41869000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x88a1fe40> (a java.util.concurrent.FutureTask$Sync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:905)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1217)
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at com.bigdata.journal.JournalTransactionService.commitImpl(JournalTransactionService.java:300)
    at com.bigdata.service.AbstractTransactionService.commit(AbstractTransactionService.java:1626)
    at com.bigdata.journal.Journal.commit(Journal.java:726)

"main" prio=10 tid=0x08f62c00 nid=0x6c29 waiting on condition [0xb7395000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x88a241b8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
    at com.bigdata.service.AbstractTransactionService.shutdownNow(AbstractTransactionService.java:488)
    at com.bigdata.journal.Journal$1.shutdownNow(Journal.java:201)
    at com.bigdata.journal.Journal.shutdownNow(Journal.java:852)
    - locked <0x86a5dc40> (a com.bigdata.journal.Journal)
From: Bryan T. <br...@sy...> - 2010-04-02 21:06:10
All,

Hudson builds are now running automatically and should be mirroring the unit test results to [1] each time they run. Right now the test suites take 20m to run and we have a 97% success rate. Once we enable them all that will go up, but there seems to be some issue with held references for stores that do not get destroyed, so some database modes are not being tested at this time.

People doing commits: Please make an effort to check this [1] each day so you can see if you've broken something. Once we get to an "all green" state for the project I will enable email nags.

There are hooks available in hudson for code coverage tests and related stuff, so let's consider adding some of that into the build. However, my next goal with hudson is to introduce a separate series of performance tests so we can capture regression or improvement in different database modes over time as well.

Bryan

PS: You can actually access historical hudson builds using URIs patterned on [2]. However, no index is being generated for those historical builds (by Hudson). We could probably do that using a cron job on www.bigdata.com.

[1] http://www.bigdata.com/bigdata/hudson/lastSuccessful/report/
[2] http://www.bigdata.com/bigdata/hudson/builds/N/report/, where N is the build number.
From: Bryan T. <br...@sy...> - 2010-04-02 00:31:06
All,

We now have a limited set of junit tests running via hudson on each SVN commit at [1]. I am going to expand the test coverage (by uncommenting tests in the ant junit target), but we need to debug an apparent resource exhaustion when running the full test suite on the server (perhaps some junit tests are not releasing all references to journals, etc).

We also have plans to introduce a suite of performance tests. Since some performance tests can take as much as a day or more, this will probably run on a different machine from the normal CI builds.

Bryan

[1] http://www.systap.com/bigdata/hudson/lastSuccessful/report/
From: Bryan T. <br...@sy...> - 2010-03-24 20:50:53
I have merged the development branch [DEV_BRANCH_27_OCT_2009] into the trunk. This brings us to committed revision 2547.

At this time, we are preparing for a release from the trunk. People should restrict activities on the trunk to bug fixes. If you are doing any exploratory work, please create a private branch for your work.

Thanks,
Bryan

PS: I have some outstanding code changes in DEV_BRANCH_27_OCT_2009 dealing with changes to dynamic sharding. Rather than incorporate those into the trunk now and possibly break code for people who are using scale-out, I am planning to do a second merge from this branch to the trunk once I have tested these changes out on a cluster.
From: Bryan T. <br...@sy...> - 2010-03-23 17:26:23
All,

We should do something to improve our IRI compression in the reverse lexicon (ID2TERM) and potentially our IRI locality in the forward lexicon (TERM2ID). Right now things stand as follows:

TERM2ID: This index uses front coding (prefix compression), which works quite nicely. We could improve locality for web graph data by transforming URLs such that the domain part of the URI looks like "com.bigdata.www" rather than "www.bigdata.com". This would organize anything in "com.bigdata.blog" close to "com.bigdata.www" rather than close to "blog.foo.org". This transformation would only exist in the TERM2ID keys; the reverse index (ID2TERM) would store the untransformed URI. (A sketch of the transform follows this message.)

ID2TERM: The term identifiers are not well correlated with the term types (literals, uris, etc). The compression scheme here is also dead simple (basically, no compression). We would do better if we moved the flag bits which indicate the type (literal, uri, bnode, or statement identifier) into the high order bits so most leaves would only have a single type of value, e.g., all literals, all uris, etc. We could then do type-specific compression rather easily, handling URIs in one way, e.g., by segmenting them into a domain and a sequence of path names and coding those in a dictionary, etc. Likewise, by moving those type flag bits into the high order bits of the term identifier, each shard (after the initial shard) could be constrained to have only a single kind of data (e.g., only URIs, only literals, etc). That would probably improve access patterns as well.

Also, in terms of ID2TERM, there has been some off list discussion and we are inclined to introduce transparent "blobs" for long literals. Obviously this has some bearing on compression techniques since we would only expect to find literals of modest length inline.

Right now, if you load an ontology into the system all URIs with the same prefix will be assigned term identifiers which are relatively close to one another and most probably dense. This is even true in scale-out since term identifiers are assigned shard-wise by the TERM2ID index. It might be that we could do more to exploit this fact.

Thoughts?

Bryan
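A minimal sketch of that key-side transform (a hypothetical helper, not
bigdata API):

    // Reverse the host labels of a URI's authority so that keys for one
    // domain cluster together under front coding. Only the TERM2ID key is
    // transformed; ID2TERM would still store the original URI.
    public class HostKeyTransform {

        static String reverseHost(final String host) {
            final String[] labels = host.split("\\.");
            final StringBuilder sb = new StringBuilder(host.length());
            for (int i = labels.length - 1; i >= 0; i--) {
                sb.append(labels[i]);
                if (i > 0) sb.append('.');
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(reverseHost("www.bigdata.com"));  // => com.bigdata.www
            System.out.println(reverseHost("blog.bigdata.com")); // => com.bigdata.blog
        }
    }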
From: Martyn C. <ma...@sy...> - 2010-03-22 18:21:09
Hi Bryan,

Yep, this definitely makes sense for a number of reasons. Firstly, it'll be good to be able to use SVN incrementally in larger experimental projects; it'll remove the tension between wanting to check in incremental versions and reluctance to commit into the main trunk. This would also encourage developers to play with and share different strategies.

- Martyn

Bryan Thompson wrote:
> All,
>
> I would like to change to a model where we do development in the trunk, slow down commits to the trunk as we approach release points, and do experimental development in branches which people create and maintain for those purposes themselves. Release points will be tagged as branches for maintenance. I think that this is all pretty standard and I think that it will make it easier to manage versioning overall for the project and easier for people who want to get the "latest and greatest" from the trunk while keeping the trunk relatively stable.
>
> We are planning to have a release this month which includes various performance optimizations and full transaction support for the SAIL together with a preview of the new RWStore. This will be followed by another release next month which includes more query optimizations, a hardened version of the RWStore, and HA support for the Journal. People who are not directly working on those feature sets should coordinate and establish a branch for their development activities.
>
> My plan is to merge the current state of SVN in the development branch down to the trunk on Wednesday morning. For that reason, I would like people to reach a checkpoint with their commits by Tuesday evening. I will send out a notice once the merge to trunk is done. At that point, I would like people doing anything experimental to create a branch and do work within that branch and coordinate with me when they are ready to merge changes back into the trunk.
>
> Please let me know if you have any questions or if the timeline for this merge is problematic for you.
>
> Thanks,
>
> Bryan
From: Mike P. <mi...@sy...> - 2010-03-22 17:54:14
I think this is a much better system. I shall make sure to make my commits by tomorrow night at the latest, hopefully tonight.

-Mike

-----Original Message-----
From: Bryan Thompson [mailto:br...@sy...]
Sent: Monday, March 22, 2010 11:47 AM
To: big...@li...
Subject: [Bigdata-developers] next releases, merge to trunk, develop in branches

All,

I would like to change to a model where we do development in the trunk, slow down commits to the trunk as we approach release points, and do experimental development in branches which people create and maintain for those purposes themselves. Release points will be tagged as branches for maintenance. I think that this is all pretty standard and I think that it will make it easier to manage versioning overall for the project and easier for people who want to get the "latest and greatest" from the trunk while keeping the trunk relatively stable.

We are planning to have a release this month which includes various performance optimizations and full transaction support for the SAIL together with a preview of the new RWStore. This will be followed by another release next month which includes more query optimizations, a hardened version of the RWStore, and HA support for the Journal. People who are not directly working on those feature sets should coordinate and establish a branch for their development activities.

My plan is to merge the current state of SVN in the development branch down to the trunk on Wednesday morning. For that reason, I would like people to reach a checkpoint with their commits by Tuesday evening. I will send out a notice once the merge to trunk is done. At that point, I would like people doing anything experimental to create a branch and do work within that branch and coordinate with me when they are ready to merge changes back into the trunk.

Please let me know if you have any questions or if the timeline for this merge is problematic for you.

Thanks,

Bryan
From: Bryan T. <br...@sy...> - 2010-03-22 17:47:32
All,

I would like to change to a model where we do development in the trunk, slow down commits to the trunk as we approach release points, and do experimental development in branches which people create and maintain for those purposes themselves. Release points will be tagged as branches for maintenance. I think that this is all pretty standard and I think that it will make it easier to manage versioning overall for the project and easier for people who want to get the "latest and greatest" from the trunk while keeping the trunk relatively stable.

We are planning to have a release this month which includes various performance optimizations and full transaction support for the SAIL together with a preview of the new RWStore. This will be followed by another release next month which includes more query optimizations, a hardened version of the RWStore, and HA support for the Journal. People who are not directly working on those feature sets should coordinate and establish a branch for their development activities.

My plan is to merge the current state of SVN in the development branch down to the trunk on Wednesday morning. For that reason, I would like people to reach a checkpoint with their commits by Tuesday evening. I will send out a notice once the merge to trunk is done. At that point, I would like people doing anything experimental to create a branch and do work within that branch and coordinate with me when they are ready to merge changes back into the trunk.

Please let me know if you have any questions or if the timeline for this merge is problematic for you.

Thanks,

Bryan
From: Bryan T. <br...@sy...> - 2010-03-04 13:43:25
All,

I am getting reasonable performance out of the standalone server mode of the bigdata federation with dynamic sharding and all. Based on that, I am tempted to drop the LDS and EDS modes. Their historical role was to provide an incremental capability between the standalone journal (local persistence store) and the full-up distributed federation. LDS was the special case where the client was connected to a single data service instance w/o dynamic sharding (no metadata server / shard locator). EDS was the special case of a federation where all services ran inside of a single JVM.

Looking back on things, I think that these have more of a role for testing and the incremental evolution of the architecture than a role for deployment, so I am thinking of dropping them out over time. The LDS in particular has a bunch of cruft related to it designed to allow nested subquery joins to be issued with full locality; however, the pipeline join really obviates the need for this, and the logic to support LDS makes rule execution setup significantly more complex than it has to be otherwise. EDS is useful for some unit tests because we can directly inspect the state of two data services when writing tests for index partition moves, etc. However, I am increasingly doubtful that either of these modes should be supported for deployment, and LDS in particular I think should just go altogether.

The standalone federation mode differs from the clustered federation mode solely in how it is deployed. If you have a server class machine, rather than just a laptop, the standalone federation appears to be a reasonable way to prototype services.

On a related note, I am seeing significant data skew with the BSBM data in scale-out. It appears that the TERM2ID index split estimates based on the #of tuples in the shard do not work well for these data, which results in some very large shards (1-3G). I am going to finish the change set to split shards based on the size on the disk after a compacting merge rather than the #of tuples. This will fix the data skew problem and also greatly simplify the KB configuration. This is part of issue #20 [1].

Bryan

[1] https://sourceforge.net/apps/trac/bigdata/ticket/20
From: Bryan T. <br...@sy...> - 2010-03-04 00:44:21
Brian,
Following your excellent example, I have forwarded this on to the list.
I will see if I can adapt your LookupStarter to handle startup/shutdown of jini for the new ant tasks for managing a bigdata federation running on a single server.
Bryan
________________________________________
From: Brian Murphy [bri...@no...]
Sent: Wednesday, March 03, 2010 7:01 PM
To: Bryan Thompson
Cc: Gossard Sean (Nokia-S/Boston); Levine Brian (Nokia-S/Boston); Wharton Andrew (Nokia-S/Boston); Mcdonough Ryan (Nokia-S/Boston); Mike Personick; Martyn Cutcher; Morrison Bradley (Nokia-S/Boston); Additon Chris (Nokia-S/Boston)
Subject: Re: #53 Enhance build.xml to compile & package test source & allow one to run the junit tests from the command line or a script
Hi Bryan,
I've copied everyone else on this message because it may be of
general interest to anyone on the list who might end up working
with Jini.
Brian
ext Bryan Thompson wrote:
> Brian,
>
> If I follow the pattern in your modified configuration for the test suites, which looks like this:
>
> static private groups = new String[]{bigdata.fedname};
> static private locators = new LookupLocator[] {};
>
> within the context of the src/resources/config/standalone/bigdataStandalone.config file then the ServicesManagerServer is never assigned a ServiceID and things never get moving.
>
> However, if I setup the groups/locators as follows, then it works.
>
> static private groups = LookupDiscovery.ALL_GROUPS;
> static private locators = new LookupLocator[] {};
>
> Is the issue that the LUS needs to be started for the same value of "groups" that is specified in the configuration file?
Yes.
One thing to keep in mind though is that the lookup
service has two group-related config items, whereas all
other services have only one. Whereas Jini services
that are not a lookup service are told what group(s)
to find and join, the lookup service can be told not
only what group(s) to find and join, but more importantly,
what "member group(s) it is to belong to.
Because the lookup service is a Jini service itself, it
can be configured to find and register with (join) other
lookup services on the network; this is configured
similarly to how the other services are configured.
But the item that must be configured when starting a
lookup service that will cause other services to find
and join that lookup service is the "member group(s)".
As an example, if you look at the config file I committed
recently -- <baseDir>/src/resources/config/jini/serviceStarterAll.config --
you'll see the following:
private static reggieArgs0 =
    "com.sun.jini.reggie.initialLookupGroups=new String[]${groupsToJoin}";
private static reggieArgs1 =
    "com.sun.jini.reggie.initialLookupLocators=new LookupLocator[]${locsToJoin}";
private static reggieArgs2 =
    "com.sun.jini.reggie.initialMemberGroups=new String[]${memberGroups}";
The first two items above, which are used to configure the reggie
implementation of the lookup service, tell that lookup service
what other lookup services it should find and join. This, then,
is a way that lookup services can be chained.
The third item tells the lookup service the group(s) it should
announce itself as being a member of. That is, the value(s) of
that item is what other services will look for to join.
Thus, if a lookup service is started with group value(s) identical
to the group value(s) used to start the other services in the system,
but the set of "member" group(s) configured for that lookup service
does not contain at least one of the group names in the group set,
then the lookup service will not be discovered by those other
services. That is, there's no harm in setting the group(s) for the
lookup service, but it's the member group(s) that must be set
for the lookup service to be discovered.
The reason it works for what you used above -- that is --
static private groups = LookupDiscovery.ALL_GROUPS;
is because the way ALL_GROUPS is specified is that it will
discover any lookup service, regardless of its group membership.
Note that even if the lookup service is started with a group
membership value of NO_GROUPS, I believe ALL_GROUPS
will still discover it.
As a side note, we always regretted providing ALL_GROUPS
because although it resulted in an initial success, it almost
always caused people problems in the long run, because they
ultimately want to isolate their federations from other federations.
So if your federation is the only federation on the network, then
ALL_GROUPS will work. But if there's another Jini federation
on the network -- for example, one that Chris starts -- then you
may have a lot of undesirable cross talk.
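A minimal sketch of that isolation point, assuming the standard
net.jini.discovery API (the group name here is hypothetical):

    import java.io.IOException;
    import net.jini.discovery.LookupDiscovery;

    public class GroupScopedDiscovery {
        public static void main(String[] args) throws IOException {
            // Discover only lookup services that announce membership in the
            // federation's own group, rather than ALL_GROUPS, which discovers
            // every lookup service on the network and invites cross talk.
            final String[] groupsToJoin = new String[] { "myFederation" };
            final LookupDiscovery ld = new LookupDiscovery(groupsToJoin);
            // ... add a DiscoveryListener to be notified as lookup services
            // are discovered ...
            ld.terminate(); // release discovery resources when done
        }
    }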
One other note. For those on the list who are not as familiar with
the Jini config files as Bryan is, the tokens ${groupsToJoin},
${locsToJoin}, and ${memberGroups} in the config items above
take advantage of Java's automatic system property substitution
mechanism. Rather than having to manually modify each config
file when you want to set groups and/or member groups, the
goal is to provide a mechanism where those values are set in
a single place -- for example, Bigdata's current build.properties --
and then have those values automatically propagated down to
the various config files so they don't have to be touched by
the deployer.
Another possibility we probably need to consider is to exploit the
currently existing com.bigdata.jini.util.ConfigMath utility class to
achieve similar effects as above, rather than setting system properties.
This is something I was planning to investigate at some point.
Brian M
From: Brian M. <btm...@gm...> - 2010-03-01 23:06:42
Hi folks,

Trac issue #53 might be of general interest to some on this list, and since I'm not sure who is watching the issues that have been filed, or the checkins that have occurred, I thought I'd give folks a heads up. Changeset 2493 was the first attempt to address issue #53, which is a request for enhancements to the Bigdata build.xml ant script that would allow one to compile, package, and run the junit tests using ant. For more details, see the comments for issue #53 in trac. You can also review the diffs for changeset 2493 to see what was actually changed or added.

I hope folks find this of some use.

Regards,
Brian M
From: Bryan T. <br...@sy...> - 2010-02-27 15:14:45
Chris,

I've tried to answer your questions here [1].

Bryan

[1] https://sourceforge.net/apps/mediawiki/bigdata/index.php?title=StandaloneGuide

________________________________________
From: cha...@em... [cha...@em...]
Sent: Wednesday, February 24, 2010 5:42 PM
To: big...@li...
Subject: Re: [Bigdata-developers] Bigdata HA architecture design document (RFC)

Bryan,

In this thread you talk about the 2 standalone versions of bigdata. Could you provide more detail or references on their differences, including configuration?

Thanks,
Chris
From: Bryan T. <br...@sy...> - 2010-02-27 01:20:22
All,

I just figured out that some postings were delayed. Hopefully they will all come through now, and I will keep a watch for queued postings.

Bryan
From: Bryan T. <br...@sy...> - 2010-02-25 14:13:09
It looks like the difference was Node#getChild(int), not whether or not NIO was used for jeri/RMI. I have another run now with the same general shape for throughput. The total throughput is not as high, but that is because one of the clients was relatively less loaded initially for some unknown reason. However, the 2nd client is coming up to a roughly equal loading now. I will retry the run to verify things, but I think that NIO did not make a difference in throughput, which suggests that we might want to NOT use NIO by default until/unless we can prove a win with it.

Bryan

> -----Original Message-----
> From: Bryan Thompson [mailto:br...@sy...]
> Sent: Wednesday, February 24, 2010 8:29 PM
> To: big...@li...
> Subject: [Bigdata-developers] two changes, with a big difference
>
> Well, I made two changes today which have resulted in a whopping difference in the data load throughput for scale-out. The first was the Node#getChild(int) refactor. The second was a configuration change to turn off nio for jeri. I can't say which one is having the effect since I am testing with both changes, but throughput on the 16 node test cluster shot up to nearly 400k tps before falling back and leveling off at a bit above 310k tps, and climbing slowly.
>
> The performance drop off appears to be correlated to the onset of increased index segment builds and merges, so we can probably retain that performance by addressing the remaining issues discussed in [1] (basically, better scheduling of index segment builds and merges).
>
> I will have to go back and test with jeri nio enabled again and see if that accounts for the difference.
>
> If not, then the difference is entirely due to removing some synchronization around reading child nodes. If this is true, then getting rid of the synchronization on journal reads and in the cache should really boost throughput.
>
> Bryan
>
> [1] https://sourceforge.net/apps/trac/bigdata/ticket/20
From: Bryan T. <br...@sy...> - 2010-02-25 01:29:15
Well, I made two changes today which have resulted in a whopping difference in the data load throughput for scale-out. The first was the Node#getChild(int) refactor. The second was a configuration change to turn off nio for jeri. I can't say which one is having the effect since I am testing with both changes, but throughput on the 16 node test cluster shot up to nearly 400k tps before falling back and leveling off at a bit above 310k tps, and climbing slowly.

The performance drop off appears to be correlated to the onset of increased index segment builds and merges, so we can probably retain that performance by addressing the remaining issues discussed in [1] (basically, better scheduling of index segment builds and merges).

I will have to go back and test with jeri nio enabled again and see if that accounts for the difference.

If not, then the difference is entirely due to removing some synchronization around reading child nodes. If this is true, then getting rid of the synchronization on journal reads and in the cache should really boost throughput.

Bryan

[1] https://sourceforge.net/apps/trac/bigdata/ticket/20
From: <cha...@em...> - 2010-02-24 22:42:56
Bryan,

In this thread you talk about the 2 standalone versions of bigdata. Could you provide more detail or references on their differences, including configuration?

Thanks,
Chris
From: Bryan T. <br...@sy...> - 2010-02-24 17:58:45
Matt,
I think that I have a fix for the Node#getChild(int) hotspot. I am going to commit this and do some benchmarking on a server platform. If you have time, you could see if it helps out on your end. This does not address the concurrent disk read/write requests yet. That is one of the next things that I am going to look at.
Committed now.
Sending Documents and Settings/Bryan Thompson/workspace/bigdata-branch/bigdata/src/java/com/bigdata/btree/AbstractBTree.java
Sending Documents and Settings/Bryan Thompson/workspace/bigdata-branch/bigdata/src/java/com/bigdata/btree/IndexMetadata.java
Sending Documents and Settings/Bryan Thompson/workspace/bigdata-branch/bigdata/src/java/com/bigdata/btree/Node.java
Adding Documents and Settings/Bryan Thompson/workspace/bigdata-branch/bigdata/src/java/com/bigdata/util/concurrent/Computable.java
Adding Documents and Settings/Bryan Thompson/workspace/bigdata-branch/bigdata/src/java/com/bigdata/util/concurrent/Memoizer.java
Sending Documents and Settings/Bryan Thompson/workspace/bigdata-branch/bigdata-bsbm/build.properties
Sending Documents and Settings/Bryan Thompson/workspace/bigdata-branch/bigdata-bsbm/src/java/benchmark/bigdata/BigdataQueryDriver.java
Transmitting file data ...
Committed revision 2463.
Bryan
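The Computable and Memoizer classes added in this commit suggest the
memoizing-cache pattern from Goetz's Java Concurrency in Practice. A
minimal sketch of that pattern, assuming (not confirmed) that the
committed classes follow it; applied to Node#getChild(int), concurrent
readers of the same child would share one computation instead of
serializing on the parent:

    import java.util.concurrent.Callable;
    import java.util.concurrent.CancellationException;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import java.util.concurrent.FutureTask;

    interface Computable<A, V> {
        V compute(A arg) throws InterruptedException;
    }

    class Memoizer<A, V> implements Computable<A, V> {

        private final ConcurrentMap<A, Future<V>> cache =
                new ConcurrentHashMap<A, Future<V>>();
        private final Computable<A, V> c;

        Memoizer(final Computable<A, V> c) { this.c = c; }

        public V compute(final A arg) throws InterruptedException {
            while (true) {
                Future<V> f = cache.get(arg);
                if (f == null) {
                    final FutureTask<V> ft = new FutureTask<V>(new Callable<V>() {
                        public V call() throws InterruptedException {
                            return c.compute(arg);
                        }
                    });
                    f = cache.putIfAbsent(arg, ft); // one thread installs the task
                    if (f == null) { f = ft; ft.run(); } // that thread computes
                }
                try {
                    return f.get(); // all other threads wait on the same Future
                } catch (CancellationException e) {
                    cache.remove(arg, f); // allow retry after cancellation
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        }
    }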
From: Mike P. <mi...@sy...> - 2010-02-24 15:30:40
One thing to consider is how this choice would interact with the star join mechanism, especially if we start encoding RDF values into the BTree tuple value. If we are trying to retrieve information about a particular subject inside a particular set of contexts, then we'd have S and C bound and the CSPO index would be a more natural choice for star joins. If we use SPOC, then a star join on a particular subject would only work if C is unbound or filtered after the fact. If we used SPOC, we'd have to leave C unbound, at the risk of retrieving a large amount of information irrelevant to the query (outside of the set of contexts of interest).

________________________________
From: Bryan Thompson [mailto:br...@sy...]
Sent: Tuesday, February 23, 2010 1:22 PM
To: Matthew Roy; big...@li...
Subject: Re: [Bigdata-developers] CSPO or SPOC?

Matt,

This notion of "primary" is a bit misleading. There are two (well, maybe three) issues at stake. First, whether we use an alternative set of covering indices, e.g., in order to have SCPO as an index. Second, for high throughput writes in scale-out without transactional isolation, we can apply updates at the shards of the "primary" statement index and have eventually consistent updates at the secondary indices. In order for this to work we need to impose a constraint that all statements for a given "S" (or "C", depending on the application's information architecture) must be on the same shard in order to gain ACID guarantees without distributed locks for operations which inspect or update other statements for the same S (or C) during an update.

We currently maintain the following covering indices for the quads mode: SPOC, POCS, OCSP, CSPO, PCSO, SOPC.

For the SPOC index, the constraint that the "S" may not cross a shard boundary is useful if you believe that you might do validation during updates which cross contexts. However, it has less locality of reference within a context when compared with the CSPO index (maybe this is what you meant? That CSPO is better for reading off all statements for a context?). Likewise, if you believe that updates (and validation) would only occur within the same context, then the CSPO index with a "C" constraint would be sufficient. However, we can not atomically assemble a view across different contexts for the same subject with CSPO (which begs the question of whether or not this is a requirement for anyone's information architecture). Martyn's proposal of an SCPO index with an "S" constraint would be much the same as the SPOC index with an "S" constraint, except that it would have better locality within a given context. However, we would need to identify a different set of covering indices which included SCPO if we went that route.

One other thread that is relevant here is the notion of partially denormalizing the RDF Values into an index used for star-joins and filters based on values. I've been thinking about this in terms of denormalizing "small datatype values", e.g., not string literals, but xsd:int, xsd:long, xsd:float, xsd:double, etc. The star join would operate against this index (which I have been assuming was also the primary index) and could directly decode the value of the tuple into the appropriate xsd datatype value for filters and to materialize attribute values without indirection through the lexicon for datatyped attributes. It does seem like CSPO would do better for this last purpose than SPOC if your application is more likely to read within a given context (CSPO has better locality here) than to read within a given subject without regard to their context (SPOC has better locality here).

Concerning query performance, have you tried overriding SPOKeyOrder#isPrimaryKey() to return true for CSPO, and do you observe a performance benefit for query? Looking at the code, I see that there are a few hard coded assumptions that SPOC is the sole access path / primary index, but not that many (they appear to all be in SPORelation). We could probably parameterize this with an Option for the AbstractTripleStore and move the tests for SPOKeyOrder#isPrimaryIndex() onto the AbstractRelation (which would affect SPOIndexRemover, SPOIndexWriter, SPOIndexWriteProc, and the AsynchronousStatementBufferFactory).

I can definitely see how CSPO could work better for applications where context is king.

Bryan

________________________________
From: Matthew Roy [mailto:mr...@ca...]
Sent: Tuesday, February 23, 2010 11:47 AM
To: big...@li...
Subject: Re: [Bigdata-developers] CSPO or SPOC?

Coming from a system where the Context is the main unit of management for statements, CSPO feels like the correct primary index. One question would be what effect on addition/deletion efficiency does the primary index make? More specifically, if within a transaction additions/deletions usually occur with a high number of statements per context, does the proximity of the changed statements within the primary index help performance?

Matt

On 2/22/2010 6:38 PM, Bryan Thompson wrote:

I would like to solicit some input on the question of whether the primary index for the quad store should be SPOC (it is today) or CSPO. There has been some discussion on this issue in the past. I am raising the issue again in the light of discussions where an entire context corresponding to a relatively large collection of statements is to be dropped, e.g., wikipedia when mapped onto a single context, and when eventual consistency is being used for the secondary indices (that is, we handle conflict resolution on the primary statement index, e.g., SPOC, and then have a restart safe protocol guaranteeing eventual updates on the secondary statement indices).

I have come around to the opinion that mapping that much data onto a single context is generally wrong. The information would be more readily managed by mapping it onto a set of contexts corresponding to individual wikipedia entries, each of which was then associated with the source using statements about that context.

Thoughts?

Bryan
From: Bryan T. <br...@sy...> - 2010-02-23 20:22:40
Matt,

This notion of "primary" is a bit misleading. There are two (well, maybe three) issues at stake. First, whether we use an alternative set of covering indices, e.g., in order to have SCPO as an index. Second, for high throughput writes in scale-out without transactional isolation, we can apply updates at the shards of the "primary" statement index and have eventually consistent updates at the secondary indices. In order for this to work we need to impose a constraint that all statements for a given "S" (or "C", depending on the application's information architecture) must be on the same shard in order to gain ACID guarantees without distributed locks for operations which inspect or update other statements for the same S (or C) during an update.

We currently maintain the following covering indices for the quads mode: SPOC, POCS, OCSP, CSPO, PCSO, SOPC.

For the SPOC index, the constraint that the "S" may not cross a shard boundary is useful if you believe that you might do validation during updates which cross contexts. However, it has less locality of reference within a context when compared with the CSPO index (maybe this is what you meant? That CSPO is better for reading off all statements for a context?). Likewise, if you believe that updates (and validation) would only occur within the same context, then the CSPO index with a "C" constraint would be sufficient. However, we can not atomically assemble a view across different contexts for the same subject with CSPO (which begs the question of whether or not this is a requirement for anyone's information architecture). Martyn's proposal of an SCPO index with an "S" constraint would be much the same as the SPOC index with an "S" constraint, except that it would have better locality within a given context. However, we would need to identify a different set of covering indices which included SCPO if we went that route.

One other thread that is relevant here is the notion of partially denormalizing the RDF Values into an index used for star-joins and filters based on values. I've been thinking about this in terms of denormalizing "small datatype values", e.g., not string literals, but xsd:int, xsd:long, xsd:float, xsd:double, etc. The star join would operate against this index (which I have been assuming was also the primary index) and could directly decode the value of the tuple into the appropriate xsd datatype value for filters and to materialize attribute values without indirection through the lexicon for datatyped attributes. It does seem like CSPO would do better for this last purpose than SPOC if your application is more likely to read within a given context (CSPO has better locality here) than to read within a given subject without regard to their context (SPOC has better locality here).

Concerning query performance, have you tried overriding SPOKeyOrder#isPrimaryKey() to return true for CSPO, and do you observe a performance benefit for query? Looking at the code, I see that there are a few hard coded assumptions that SPOC is the sole access path / primary index, but not that many (they appear to all be in SPORelation). We could probably parameterize this with an Option for the AbstractTripleStore and move the tests for SPOKeyOrder#isPrimaryIndex() onto the AbstractRelation (which would affect SPOIndexRemover, SPOIndexWriter, SPOIndexWriteProc, and the AsynchronousStatementBufferFactory).

I can definitely see how CSPO could work better for applications where context is king.

Bryan

________________________________
From: Matthew Roy [mailto:mr...@ca...]
Sent: Tuesday, February 23, 2010 11:47 AM
To: big...@li...
Subject: Re: [Bigdata-developers] CSPO or SPOC?

Coming from a system where the Context is the main unit of management for statements, CSPO feels like the correct primary index. One question would be what effect on addition/deletion efficiency does the primary index make? More specifically, if within a transaction additions/deletions usually occur with a high number of statements per context, does the proximity of the changed statements within the primary index help performance?

Matt

On 2/22/2010 6:38 PM, Bryan Thompson wrote:

I would like to solicit some input on the question of whether the primary index for the quad store should be SPOC (it is today) or CSPO. There has been some discussion on this issue in the past. I am raising the issue again in the light of discussions where an entire context corresponding to a relatively large collection of statements is to be dropped, e.g., wikipedia when mapped onto a single context, and when eventual consistency is being used for the secondary indices (that is, we handle conflict resolution on the primary statement index, e.g., SPOC, and then have a restart safe protocol guaranteeing eventual updates on the secondary statement indices).

I have come around to the opinion that mapping that much data onto a single context is generally wrong. The information would be more readily managed by mapping it onto a set of contexts corresponding to individual wikipedia entries, each of which was then associated with the source using statements about that context.

Thoughts?

Bryan
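A hypothetical illustration of the locality argument in this thread (not
bigdata's actual key builder; fixed-width long term identifiers are an
assumption): with C in the leading position, every statement in a context
occupies one contiguous key range, so dropping a context or star-joining
within it is a single range scan.

    import java.nio.ByteBuffer;

    public class CspoKeyDemo {
        // Pack four (assumed) long term identifiers into a CSPO key.
        static byte[] cspoKey(long c, long s, long p, long o) {
            return ByteBuffer.allocate(32).putLong(c).putLong(s)
                    .putLong(p).putLong(o).array();
        }

        public static void main(String[] args) {
            // All statements in context c=7 fall in the half-open key range
            // [cspoKey(7,0,0,0), cspoKey(8,0,0,0)) of the CSPO index - one
            // contiguous scan. Under SPOC the same statements are scattered
            // across the index by subject.
            final byte[] fromKey = cspoKey(7, 0, 0, 0);
            final byte[] toKey = cspoKey(8, 0, 0, 0);
            System.out.println(fromKey.length + " / " + toKey.length); // 32 / 32
        }
    }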
From: Matthew R. <mr...@ca...> - 2010-02-23 16:47:47
Coming from a system where the Context is the main unit of management for statements, CSPO feels like the correct primary index. One question would be what effect on addition/deletion efficiency does the primary index make? More specifically, if within a transaction additions/deletions usually occur with a high number of statements per context, does the proximity of the changed statements within the primary index help performance?

Matt

On 2/22/2010 6:38 PM, Bryan Thompson wrote:
> I would like to solicit some input on the question of whether the primary index for the quad store should be SPOC (it is today) or CSPO. There has been some discussion on this issue in the past. I am raising the issue again in the light of discussions where an entire context corresponding to a relatively large collection of statements is to be dropped, e.g., wikipedia when mapped onto a single context, and when eventual consistency is being used for the secondary indices (that is, we handle conflict resolution on the primary statement index, e.g., SPOC, and then have a restart safe protocol guaranteeing eventual updates on the secondary statement indices).
>
> I have come around to the opinion that mapping that much data onto a single context is generally wrong. The information would be more readily managed by mapping it onto a set of contexts corresponding to individual wikipedia entries, each of which was then associated with the source using statements about that context.
>
> Thoughts?
>
> Bryan