This list is closed, nobody may subscribe to it.
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | | 19 | 8 | 25 | 16 | 77 | 131 | 76 | 30 | 7 | 3 | |
| 2011 | | | | | 2 | 2 | 16 | 3 | 1 | | 7 | 7 |
| 2012 | 10 | 1 | 8 | 6 | 1 | 3 | 1 | | 1 | | 8 | 2 |
| 2013 | 5 | 12 | 2 | 1 | 1 | 1 | 22 | 50 | 31 | 64 | 83 | 28 |
| 2014 | 31 | 18 | 27 | 39 | 45 | 15 | 6 | 27 | 6 | 67 | 70 | 1 |
| 2015 | 3 | 18 | 22 | 121 | 42 | 17 | 8 | 11 | 26 | 15 | 66 | 38 |
| 2016 | 14 | 59 | 28 | 44 | 21 | 12 | 9 | 11 | 4 | 2 | 1 | |
| 2017 | 20 | 7 | 4 | 18 | 7 | 3 | 13 | 2 | 4 | 9 | 2 | 5 |
| 2018 | | | | 2 | | | | | | | | |
| 2019 | | | 1 | | | | | | | | | |
From: Mike P. <mi...@sy...> - 2015-04-13 13:41:33

Thanks, I'll take a look at it.

--
Mike Personick
Managing Partner
Systap, LLC
www.systap.com
801-243-3678
skype: mike.personick

On Sun, Apr 12, 2015 at 11:17 AM, Jack Park <jac...@gm...> wrote:
> I created a gist here:
> https://gist.github.com/KnowledgeGarden/87ac9991cafc69d179e1
> [...]
From: Brad B. <be...@sy...> - 2015-04-12 23:58:39

Jack,

I saw Bryan's note regarding data to replicate the issue. That would definitely be helpful.

Regarding the journal size, the journal makes an initial allocation upon creation, which is configurable, and then fills it up over time. That is likely the explanation for the large initial size.

Thanks, --Brad

On Sun, Apr 12, 2015 at 1:17 PM, Jack Park <jac...@gm...> wrote:
> I created a gist here:
> https://gist.github.com/KnowledgeGarden/87ac9991cafc69d179e1
> [...]

--
Brad Bebee
Managing Partner
SYSTAP, LLC
e: be...@sy...
m: 202.642.7961
f: 571.367.5000
w: www.systap.com

Blazegraph™ <http://www.blazegraph.com> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics.

CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.
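Brad's explanation also accounts for the exact number Jack reported: 10,240 KB is 10 MB, which matches the journal's default initial extent. As a hedged sketch, the allocation is controlled by properties along these lines (names assumed from the `com.bigdata.journal.Options` interface; verify them against your release):

```properties
# Assumed property names -- check com.bigdata.journal.Options for your version.
# Initial on-disk allocation in bytes; 10485760 = 10 MB, the apparent default.
com.bigdata.journal.AbstractJournal.initialExtent=10485760
# The journal grows in increments beyond this soft limit as data is written.
com.bigdata.journal.AbstractJournal.maximumExtent=209715200
```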
From: Brad B. <be...@sy...> - 2015-04-12 23:55:40

Jack,

Thank you. Yes, if you have dependency conflicts you can try the tar.gz bundle [1] and choose the jar files that you need. If you are deploying within Tomcat, you can also try including the bundled jar at the container level rather than the webapp for classpath dependencies.

Thanks, --Brad

[1] http://sourceforge.net/projects/bigdata/files/bigdata/1.5.1/REL.bigdata-1.5.1.tgz/download

On Sun, Apr 12, 2015 at 4:45 PM, Jack Park <jac...@gm...> wrote:
> I dropped bigdata-bundled.jar into the OpenSherlock build. Booting now
> gives this issue:
>
> java.lang.NoSuchFieldError: LUCENE_3_6
> [...]

--
Brad Bebee
Managing Partner
SYSTAP, LLC
From: Jack P. <jac...@gm...> - 2015-04-12 20:45:57

I dropped bigdata-bundled.jar into the OpenSherlock build. Booting now gives this issue:

java.lang.NoSuchFieldError: LUCENE_3_6

which is occurring inside Lucene's Version class. There's an apparent collision, since I already have Lucene 4.9.0 in the build and classpath. I suspect this means that I must simply grab jars from the war rather than run the bundled jar.
From: Jack P. <jac...@gm...> - 2015-04-12 17:17:43

I created a gist here: https://gist.github.com/KnowledgeGarden/87ac9991cafc69d179e1

The gist shows two bodies of code:

My embedded driver code; rather spartan, but something to get started, by copying code found on the web and in blueprints unit tests.

My test code, copied from blueprints unit tests, which loads one of the simple blueprints property graphs and prints the vertex and edge lists.

Included is the output trace of the first test, which shows that it does, indeed, load the graph and print. But no journal document is created; the data folder is empty.

Included below that is the output trace where I hand-created the journal. It promptly ballooned to 10,240 KB, which, I presume, means that a graph was loading into it along with other stuff, but it apparently crashed while doing the graph import. The link into my code points to this line:

GraphMLReader.inputGraph(graph, example);

It is simply not clear to me what is going on here.

Thanks in advance for some ideas.

Jack
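One plausible reading of the empty data folder described above is that the embedded graph was never configured with a persistent buffer mode and journal file, so everything stayed in memory. A hedged sketch of the journal properties that control this (property names assumed from `com.bigdata.journal.Options`; the file path is a placeholder, and both should be verified against your release):

```properties
# Assumed property names -- verify against com.bigdata.journal.Options
# for your Blazegraph/bigdata release. Without an explicit persistent
# buffer mode and file, no journal appears on disk.
com.bigdata.journal.AbstractJournal.bufferMode=DiskRW
com.bigdata.journal.AbstractJournal.file=data/graph.jnl
```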
From: Brad B. <be...@sy...> - 2015-04-09 19:50:42

Daniel,

Thank you. You might try EasyRDF in PHP (http://www.easyrdf.org/docs/api/EasyRdf_Sparql_Client.html) with our NanoSparqlServer (NSS) SPARQL endpoint (http://wiki.blazegraph.com/wiki/index.php/NanoSparqlServer#REST_API). I've copied our developers and users lists to see if they have had any success with, or ideas about, good PHP libraries for the Blazegraph SPARQL endpoint.

Cheers, --Brad

On Thu, Apr 9, 2015 at 3:03 PM, Daniel Bankhead <la...@me...> wrote:
> Thanks Brad,
>
> I'm working on a platform that requires the storage of user data (such as
> emails and purchases). This information would need to be accessible via PHP
> and should be compatible with a REST API.
>
> Daniel Bankhead
>
> On Apr 9, 2015, at 2:31 PM, Brad Bebee <be...@sy...> wrote:
>
> Daniel,
>
> Great -- we're excited that you're getting started with Blazegraph. Can
> you describe a little more about your use case? We will be publishing
> an enhanced user's manual at the end of this month.
>
> Thanks, --Brad
>
> On Thu, Apr 9, 2015 at 2:21 PM, Daniel Bankhead <la...@me...> wrote:
>
>> Hello Blazegraph team,
>>
>> I'm interested in getting started with Blazegraph, but there is a huge
>> lack of code examples on your site. With a MySQL/XML background, how do I
>> get started with Blazegraph? I've scoured the web and StackOverflow
>> (http://stackoverflow.com/search?q=Blazegraph) to no avail.
>>
>> With this said, could you please provide a few code examples, preferably
>> with PHP, so that I may begin to use your software.
>>
>> Thanks,
>>
>> Daniel Bankhead

--
Brad Bebee
Managing Partner
SYSTAP, LLC
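Because the NSS endpoint is plain HTTP, any client language can query it. As a minimal illustration independent of PHP, here is a Java sketch that builds the GET query URL; the host, port, and `/bigdata/sparql` path are assumptions based on a default NSS deployment, so adjust them for your installation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/**
 * Minimal sketch of addressing the NanoSparqlServer REST API from plain Java.
 * The endpoint URL used in main() is an assumption for a default deployment.
 */
public class SparqlUrl {

    /** Build a GET URL of the form <endpoint>?query=<URL-encoded SPARQL>. */
    public static String queryUrl(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // Any HTTP client (PHP, curl, Java) can then GET this URL and
        // receive SPARQL results (e.g. JSON, selected via an Accept header).
        System.out.println(queryUrl(
                "http://localhost:9999/bigdata/sparql",
                "SELECT ?s WHERE { ?s ?p ?o }"));
    }
}
```

From PHP, the same URL can be fetched with EasyRdf's SPARQL client or a plain HTTP request; the REST API wiki page linked above documents the accepted parameters.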
From: Brad B. <be...@sy...> - 2015-04-09 13:28:02

Hello,

We have migrated the Blazegraph wiki to the GoogleLogin functionality due to Google's decision to end support for OpenID. This is now deployed to http://wiki.blazegraph.com/.

Unfortunately, if you have an existing wiki account and would like it linked to your new GoogleLogin account, you will need to follow the procedure below. If you just want to create a separate account, no additional steps are required.

Linking a legacy wiki account to GoogleLogin:

1. Create a new GoogleLogin account on the wiki. It must use a different name than your existing account.
2. Send an email to bla...@sy... with your new and old usernames.
3. We will confirm the linking of the accounts.

Thanks for choosing Blazegraph!

Thanks, --Brad

--
Brad Bebee
Managing Partner
SYSTAP, LLC
From: Bryan T. <br...@sy...> - 2015-04-07 19:58:07

You can also try renting EC2 instances and benchmarking against them.

Yes, the analytic mode will help. However, we do not yet do ORDER BY on the native heap. Just FYI. We will be delivering some changes for quads mode to improve the use of the native heap - it is currently not used for default graph access paths (where we impose the DISTINCT SPO constraint on each access path).

You have been in a bad environment with slow disk, heavy queries, and relatively little RAM. Even though the indices are clustered, they are not in key order on the disk. Thus an index scan still induces random IO. Improving your IOPS will make a huge difference. More RAM will help, but just drop an SSD in there and you should get a big win.

Thanks,
Bryan

On Fri, Apr 3, 2015 at 4:59 PM, Jim Balhoff <ba...@ne...> wrote:
> Hi Bryan,
>
> Thanks for your reply. I suppose I would characterize my queries as heavy,
> since many of them individually take longer than I would like, but I am not
> running Blazegraph on a great server at the moment.
> [...]
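Bryan notes that the analytic mode can be turned on "on a query by query basis." One way to do that (hint vocabulary assumed from the Blazegraph query-hints documentation; verify the names against your release) is an in-query hint:

```sparql
# Hedged sketch: per-query analytic mode via a query hint.
# hint:analytic is an assumption to check against your version's docs.
PREFIX hint: <http://www.bigdata.com/queryHints#>
SELECT DISTINCT ?s
WHERE {
  hint:Query hint:analytic "true" .
  ?s ?p ?o .
}
ORDER BY ?s
```

Note that, per the email above, ORDER BY is not yet executed on the native heap, so the sort itself would still consume JVM heap even in analytic mode.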
From: Brad B. <be...@sy...> - 2015-04-04 17:56:24

All,

Trac is back up. We will have another outage in the next 2 weeks due to the OpenID decommissioning.

Thanks, --Brad

On Fri, Apr 3, 2015 at 6:10 PM, Brad Bebee <be...@sy...> wrote:
> Trac will be down this weekend for an upgrade.
>
> trac.blazegraph.com
>
> Thanks, Brad

--
Brad Bebee
Managing Partner
SYSTAP, LLC
From: Brad B. <be...@sy...> - 2015-04-03 22:10:27

Trac will be down this weekend for an upgrade.

trac.blazegraph.com

Thanks, Brad

--
Brad Bebee
Managing Partner
SYSTAP, LLC
From: Jim B. <ba...@ne...> - 2015-04-03 20:59:41

Hi Bryan,

Thanks for your reply. I suppose I would characterize my queries as heavy, since many of them individually take longer than I would like, but I am not running Blazegraph on a great server at the moment. We do not have that many concurrent clients. I do a lot of queries that have a large result set that needs to be distinct and sorted, for paging through. It sounds like I should experiment more with the analytic query mode. But my current old server does not have much extra memory available. Would that be a prerequisite for the analytic mode making a difference? The old server has 8 GB memory, 6 GB allocated to the JVM (perhaps that is too high), with a very slow disk.

Best regards,
Jim

> On Apr 2, 2015, at 6:05 PM, Bryan Thompson <br...@sy...> wrote:
>
> Jim,
>
> The best way to size a machine is for a data set and workload. Always buy SSD.
> [...]
From: Bryan T. <br...@sy...> - 2015-04-02 22:05:51
|
Jim, The best way to size a machine is for a data set and workload. Always buy SSD. The historical guidance was to use relatively small heaps (4-8GB) and let the OS buffer the disk. The concept was to minimize the impact of GC pauses. However, some people are having good success using large heaps (112G) and the G1 garbage collector. We run data sets of that size on platforms as small as Mac minis. For query performance, faster CPU cores are good and more cores are good. This assumes that the IO system has high IOPS. Would you characterize your queries as lightweight of heavy? Is the query workload highly concurrent (lots of clients)? Is the working set required to answer those queries small or a large part of your data? These things effect the throughput you will observe for query. Query plan optimizations is also very important. If you have an expensive query, make sure that it is doing what you intend. Often the query can be improved. For our part, we are working to improve the query optimizer. One client recently reported a 2x improvement in 1.5.1 vs 1.2.x. We have a lot more optimizations in the pipeline. The analytic query mode is for larger intermediate solutions sets. If you run this kind of query, then turn it on. You can do this on a query by query basis. The jvm ergonomics automatically allow a certain amount of native memory allocation. You only need to explicitly specify this if you are running into limits with those native buffers. The other use of native buffers is for the write cache. This improves the bulk load rate, but it does not look like it is your primary concern. Thanks, Bryan PS: yes, the list is fine. On Thursday, April 2, 2015, Jim Balhoff <ba...@ne...> wrote: > Hi, > > I was wondering if you provided any guidance on hardware for different > sizes of databases. I have read through the performance articles on the > wiki, but am wondering if there are some more generalized guidelines that > could be stated. 
In my case, say I will have 150 million triples, and am > going to purchase a new system, how much memory is recommended? How much of > that memory should I give to the JVM via "-Xmx" vs. letting the OS use it > for caching the db? (I am also a little confused about whether I need to > specifically allocate some other amount to the JVM through > MaxDirectMemorySize, for analytic queries). I am only concerned with query > speed, not writes. > > Maybe there are too many special cases, but I was hoping there are some > minimum guidelines that could be determined. > > Side question: is it okay to post questions like this here? I find email > lists to be a lot more convenient than the Sourceforge forum, but I can > move it there if needed. > > Thank you, > Jim > > > ____________________________________________ > James P. Balhoff, Ph.D. > National Evolutionary Synthesis Center > 2024 West Main St., Suite A200 > Durham, NC 27705 > USA > > > > > > ------------------------------------------------------------------------------ > Dive into the World of Parallel Programming The Go Parallel Website, > sponsored > by Intel and developed in partnership with Slashdot Media, is your hub for > all > things parallel software development, from weekly thought leadership blogs > to > news, videos, case studies, tutorials and more. Take a look and join the > conversation now. http://goparallel.sourceforge.net/ > _______________________________________________ > Bigdata-developers mailing list > Big...@li... <javascript:;> > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > -- ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://blazegraph.com http://blog.bigdata.com <http://bigdata.com> http://mapgraph.io Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. 
MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
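[Editor's note] The heap-vs-OS-cache guidance in the reply above can be condensed into a starting point. The flags below are standard HotSpot options, but the specific values are hypothetical; size them against your own data set and workload, and only set MaxDirectMemorySize if the analytic mode or write cache actually hits the default native-buffer limit.

```shell
# Hypothetical starting point following the sizing guidance above.
# A modest heap (4-8GB) leaves RAM free for the OS to buffer the journal
# file; -XX:+UseG1GC is what people experimenting with larger heaps report
# using; -XX:MaxDirectMemorySize only matters if native buffers hit limits.
export JAVA_OPTS="-Xmx8g -XX:+UseG1GC -XX:MaxDirectMemorySize=4g"
echo "$JAVA_OPTS"
```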
From: Jim B. <ba...@ne...> - 2015-04-02 16:59:35
|
Hi, I was wondering if you provided any guidance on hardware for different sizes of databases. I have read through the performance articles on the wiki, but am wondering if there are some more generalized guidelines that could be stated. In my case, say I will have 150 million triples, and am going to purchase a new system, how much memory is recommended? How much of that memory should I give to the JVM via "-Xmx" vs. letting the OS use it for caching the db? (I am also a little confused about whether I need to specifically allocate some other amount to the JVM through MaxDirectMemorySize, for analytic queries). I am only concerned with query speed, not writes. Maybe there are too many special cases, but I was hoping there are some minimum guidelines that could be determined. Side question: is it okay to post questions like this here? I find email lists to be a lot more convenient than the Sourceforge forum, but I can move it there if needed. Thank you, Jim ____________________________________________ James P. Balhoff, Ph.D. National Evolutionary Synthesis Center 2024 West Main St., Suite A200 Durham, NC 27705 USA |
From: Bryan T. <br...@sy...> - 2015-03-26 10:55:38
|
Sounds like you have been making progress. If the leader is not ready, you might have inconsistent data in the journal root blocks on the servers. Try failing the leader and see if the other two meet. If they do, then redeploy the leader (remove the service directory, redeploy and restart the service). In general, once you figure out the configuration of the HAJournalServer, OS, and network I would do a from-scratch deploy to make sure that everything (deployment configuration and data) is the same on each node. If you have things set up correctly, the cluster should come online smoothly. The cluster is available for read and write as long as a majority (2 out of 3) of the nodes are available. If the cluster goes below a majority then you cannot read or write on it. This is because it can no longer be decided which services have a consensus around the state of the database. However, it is always possible to open the backing journal file in a non-clustered mode. The file format is identical. You can read all about the HA features at the HAJournalServer page of the wiki. http://wiki.bigdata.com/wiki/index.php/HAJournalServer There is also a complete description of the architecture for standalone, HA, and scale-out linked from the white papers section of our blog. http://blog.bigdata.com/ http://www.blazegraph.com/whitepapers/bigdata_architecture_whitepaper.pdf Thanks, Bryan On Thursday, March 26, 2015, Maximilian Brodhun < br...@su...> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dear Bryan, > > I'm happy to say that all nodes have now joined the quorum. Thanks a lot > for your help! > I have a problem that the leader has the status "notReady", but I will > figure out why. > > > A last question I have is, will it still be possible to have write > access if one node is down? > > Best regards, > > Max > > > Am 25.03.2015 um 16:57 schrieb Bryan Thompson: > > > > Ok. I encourage you to revert the files and just use environment > overrides. 
This is more consistent and systematic. Shut down and remove the installs. See if it comes up with the documented overrides. If not, send me the list of exported overrides and the status output. > > > > Make sure the services are shut down. Use jps to list java processes. Then deploy and start with those overrides. > > > > You could also remove the deployments to get as close as possible to a blank slate. > > > > Check your firewalls. > > > > Bryan > > > > On Mar 25, 2015 11:44 AM, "Maximilian Brodhun" < br...@su...> wrote: > > > > > > Oh, I'm sorry, this was an older configuration. In the current (also not working) configuration the port 3080 wasn't there. > > > > Thanks again for your quick answer. > > > > Max > > > > Am 25.03.2015 um 16:39 schrieb Bryan Thompson: > > > > > One of your join locators is wrong. It has port 3080. > > > > > You can achieve these overrides just by exporting the relevant variables as defined on the HAJournalServer page of the wiki. > > > > > Bryan > > > > > On Mar 25, 2015 11:32 AM, "Maximilian Brodhun" < br...@su...> wrote: > > > > > > > Sorry for bothering you so much. > > > > > Thanks for your help, but I still have problems. The 127.0.1.1 looks > also odd to me. > > > Maybe it is helpful to tell you what I have done so far. > > > > > I got the code for Version 1.5.1 by using git on one server. (Having
(Having > no other instances of blazegraph running) > > > After this I edit the config file > src/resources/HAJournal/HAJournal.config regarding this lines (marked red, > the others are like before) and copy it to the other server to having the > same configuration at all servers. > > > > > > > BIG DATA SECTION: > > > > > * private static fedname = > ConfigMath.getProperty("FEDNAME","tgRDFCluster"); > > > * private static rmiPort = > Integer.parseInt(ConfigMath.getProperty("RMI_PORT","3080")); > > > * private static haPort = > Integer.parseInt(ConfigMath.getProperty("HA_PORT","3090")); > > > * private static replicationFactor = > Integer.parseInt(ConfigMath.getProperty("REPLICATION_FACTOR","3")); > > > * private static logicalServiceId = > ConfigMath.getProperty("LOGICAL_SERVICE_ID","tgHA-1"); > > > * private static serviceId = null; > > > * private static fedDir = new > File(ConfigMath.getProperty("FED_DIR","."),fedname); > > > * private static serviceDir = new > File(fedDir,logicalServiceId+File.separator+"HAJournalServer"); > > > * private static dataDir = new > File(ConfigMath.getProperty("DATA_DIR",""+serviceDir)); > > > * private static haLogDir = new > File(ConfigMath.getProperty("HALOG_DIR",""+serviceDir+File.separator+"tgHALog")); > > > * private static snapshotDir = new > File(ConfigMath.getProperty("SNAPSHOT_DIR",""+serviceDir+File.separator+"snapshot")); > > > * private static snapshotPolicy = new > DefaultSnapshotPolicy(200/*hhmm*/,20/*percent*/); > > > * private static restorePolicy = new > DefaultRestorePolicy(ConfigMath.d2ms(7)); > > > * static private groups = > ConfigMath.getGroups(ConfigMath.getProperty("GROUPS",bigdata.fedname)); > > > * static private locators = > ConfigMath.getLocators(ConfigMath.getProperty("LOCATORS","jini://hostname1/,jini://hostname2/,jini://hostname3/")); > > > * static private leaseTimeout = ConfigMath.s2ms(100); // 20 > > > * static private sessionTimeout = (int)ConfigMath.s2ms(100); // 5 > > > * private static 
namespace = "tg"; > > > > > > > ZOOKEEPER SECTION: > > > > > * servers = ConfigMath.getProperty("ZK_SERVERS","hostname1:2081,hostname2:2081,hostname2:2081"); > > > > > Doing the ant deploy-artifact on all servers, I edited the script in dist/bigdata/bin/config.sh like I sent you before: > > > > > * export FEDNAME=tgRDFCluster > > > * export FED_DIR=/home/"username"/blazegraphCluster/data > > > * export LOGICAL_SERVICE_ID=tgHA-1 > > > * export LOCATORS="jini:/hostname1/,jini://hostname2/,jini://hostname3:3080/" > > > * export ZK_SERVERS="localhost:2181" > > > * export REPLICATION_FACTOR=3 > > > * export JETTY_PORT=7072 > > > * export GROUP_COMMIT=true > > > > > > > To let all hosts know each other and communicate, I opened the ports in the firewall and edited all etc/hosts files: > > > > > * 127.0.0.1 localhost > > > * 127.0.1.1 hostname.domain shortname (without domain) > > > * > > > * # The following lines are desirable for IPv6 capable hosts > > > * ::1 localhost ip6-localhost ip6-loopback > > > * ff02::1 ip6-allnodes > > > * ff02::2 ip6-allrouters > > > > > > > Hope you aren't bothered by this mail. > > > > > best regards, > > > > > Maximilian > > > > > Am 25.03.2015 um 12:01 schrieb Bryan Thompson: > > > > I see several different manners in which ip addresses are being expressed (host name, 10.x.x.x network, 127.x.x.x network). This would appear to be the status for the one host that can not reach the others. While it has discovered zookeeper and obtained RMI proxies for two other services (those 127.0.1.1 addresses look odd) the services are not responding at those addresses. This could be an ip configuration issue (I would try consistently using the private network ip addresses throughout) or a firewall issue. > > > > > > Sometimes you can wind up with bad proxy caches when bouncing the services and making configuration changes. 
If this happens the easiest thing to do is bring down all 3 services, wait a minute or two for zookeeper to definitively notice that the services are dead and expire the ephemeral znodes. Also, make sure that the services really are down using jps to list java processes and use netstat to make sure that lingering ports have been released. Then check the logs for each service on restart. Once you have a proper configuration, these steps are not necessary but sometimes they can help while attempting to converge on a correct installation. > > > > > > On Wednesday, March 25, 2015, Maximilian Brodhun < br...@su...> wrote: > > > > Thanks for the quick answer. > > > > I'm wondering because I put all hosts in etc/hosts and I can ping all servers by using the name, not the IP. > > > > > > I gave all servers a higher timeout but the problem still occurs. The status tab from blazegraph gives the following response: (maybe an RMI Problem?) 
> > > > Quorum Services > > > > * http://myhostname.de:7073/bigdata <http://textgrid-blazequorum.gwdg.de:7073/bigdata> : is not joined, pipelineOrder=0, writePipelineAddr=/10.254.1.6:3090, service=self, extendedRunState={server=Running, quorumService=SeekConsensus @ 0, haReady=-1, haStatus=NotReady, serviceId=791f97d3-6c12-4470-ac61-97d03c0cd43b, now=1427274241853} > > > > * Unable to reach service: Proxy[HAGlue,BasicInvocationHandler[BasicObjectEndpoint[704b0793-0285-4a26-ae9a-904a3fc3b5ee,TcpEndpoint[127.0.1.1:3080]]]] > > > > * Unable to reach service: Proxy[HAGlue,BasicInvocationHandler[BasicObjectEndpoint[f0c1c096-03d8-4d49-8093-30b50d605d8b,TcpEndpoint[127.0.1.1:3080]]]] > > > > Zookeeper > > > > tgHA-1(1 children) > > > > quorum(4 children) > com.bigdata.quorum.zk.QuorumTokenState{lastValidToken=67,currentToken=-1,replicationFactor=3} > > > > joined(2 children) > > > > joined0000000300 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} > > > > joined0000000301 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} > > > > member(3 children) > > > > member71ef114a-b872-470a-ac9b-0ff632aa0b59 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} > > > > member791f97d3-6c12-4470-ac61-97d03c0cd43b (Ephemeral237648408135794694) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=791f97d3-6c12-4470-ac61-97d03c0cd43b} > > > > member8deaf15c-776d-48d9-84d5-2157c56dbe48 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} > > > > pipeline(3 children) > > > > pipeline0000000401 (Ephemeral237648408135794694) com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=791f97d3-6c12-4470-ac61-97d03c0cd43b,addrSelf=/10.254.1.6:3090} > > > > pipeline0000000403 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48,addrSelf=/10.254.1.2:3090} > > > > pipeline0000000404 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59,addrSelf=/10.254.1.5:3090} > > > > votes(1 children) > > > > 0(2 children) > > > > vote0000000000 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} > > > > vote0000000001 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} > > > > > > > > > Am 24.03.2015 um 14:58 schrieb Bryan Thompson: > > > > > This could very easily be DNS. Also, java can have long timeouts (30-60 seconds) if reverse DNS is not properly configured. > > > > > > > You can use http://localhost:port/bigdata/status to see the detailed status (including zookeeper). This information is also available under the "status" tab of the workbench. > > > > > > > Thanks, > > > > > Bryan > > > > > > > ---- > > > > > Bryan Thompson > > > > > Chief Scientist & Founder > > > > > SYSTAP, LLC > > > > > 4501 Tower Road > > > > > Greensboro, NC 27410 > > > > > br...@sy...
> > > > > http://blazegraph.com > > > > > http://blog.bigdata.com <http://bigdata.com> > > > > > http://mapgraph.io > > > > > > > Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. > > > > > > > > > On Tue, Mar 24, 2015 at 9:54 AM, Maximilian Brodhun < > br...@su...
> wrote: > > > > > > > Dear All, > > > > > I'm very new to blazegraph but I changed to blazegraph because of the clustering possibilities. Unfortunately I'm having trouble with this. I want to cluster three nodes; all three become members in the quorum but only two join. > > > > > I discovered this with zooinspector. > > > > > > > The only difference between the three servers is that one of them doesn't have DNS; is that a problem? Maybe you can help me. > > > > > > > > > My config file looks like this (the same on every machine): > > > > > > > ## Configure basic environment variables. Obviously, you must use your own parameters for LOCATORS and ZK_SERVERS. > > > > > > > ## This will not override parameters in the environment. > > > > > > > > > # Name of the federation of services (controls the Apache River GROUPS). > > > > > > > > > if [ -z "${FEDNAME}" ]; then > > > > > > > export FEDNAME=tgRDFCluster > > > > > > > fi > > > > > > > > > # Path for local storage for this federation of services. > > > > > > > > > if [ -z "${FED_DIR}" ]; then > > > > > > > export FED_DIR=/home/tomcat-sesame/blazegraphCluster/data > > > > > > > fi > > > > > > > > > # Name of the replication cluster to which this HAJournalServer will belong. 
> > > > > > > > > if [ -z "${LOGICAL_SERVICE_ID}" ]; then > > > > > > > export LOGICAL_SERVICE_ID=tgHA-1 > > > > > > > fi > > > > > > > > > # Where to find the Apache River service registrars (can also use > > > > > multicast). > > > > > > > > > if [ -z "${LOCATORS}" ]; then > > > > > > > #Use for a HA1+ configuration > > > > > > > #export LOCATORS="jini://localhost/" > > > > > > > #HA3 example > > > > > > > export > > > > > LOCATORS="jini://textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/" > > > > > > > fi > > > > > > > > > # Where to find the Apache Zookeeper ensemble. 
> > > > > > > > > if [ -z "${ZK_SERVERS}" ] ; then > > > > > > > #Use for single node configuration > > > > > > > export ZK_SERVERS="localhost:2181" > > > > > > > #Use for a multiple ZK configuration > > > > > > > #export > ZK_SERVERS="bigdata15:2081,bigdata16:2081,bigdata17:2081" > > > > > > > fi > > > > > > > > > #Replication Factor (set to one for HA1) configuration > > > > > > > > > if [ -z "${REPLICATION_FACTOR}" ] ; then > > > > > > > #Use for a HA1 configuration > > > > > > > export REPLICATION_FACTOR=3 > > > > > > > #Use for a HA1+ configuration > > > > > > > #export REPLICATION_FACTOR=3 > > > > > > > fi > > > > > > > > > #Port for the NanoSparqlServer Jetty > > > > > > > > > if [ -z "${JETTY_PORT}" ] ; then > > > > > > > export JETTY_PORT=7070 > > > > > > > fi > > > > > > > > > #Group commit (true|false) > > > > > > > > > if [ -z "${GROUP_COMMIT}" ] ; then > > > > > > > export GROUP_COMMIT=true > > > > > > > > > > > > > > ------------------------------------------------------------------------------ > > > > > Dive into the World of Parallel Programming The Go Parallel > Website, sponsored > > > > > by Intel and developed in partnership with Slashdot Media, is > your hub for all > > > > > things parallel software development, from weekly thought > leadership blogs to > > > > > news, videos, case studies, tutorials and more. Take a look > and join the > > > > > conversation now. http://goparallel.sourceforge.net/ > > > > > _______________________________________________ > > > > > Bigdata-developers mailing list > > > > > Big...@li... 
> > > > > > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > > > > > > > > > > > > > > > > -- > > > > ---- > > > > Bryan Thompson > > > > Chief Scientist & Founder > > > > SYSTAP, LLC > > > > 4501 Tower Road > > > > Greensboro, NC 27410 > > > > br...@sy... > > > > http://blazegraph.com > > > > http://blog.bigdata.com <http://bigdata.com> > > > > http://mapgraph.io > > > > > > Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. > > > > > > > > > > > - -- > Maximilian Brodhun > > Abteilung Forschung und Entwicklung > Georg-August-Universität Göttingen > Niedersächsische Staats- und Universitätsbibliothek Göttingen > D-37070 Göttingen > > Papendiek 14 (Historisches Gebäude, Raum 2.409) > +49 551 39-4923 (Tel.) > > br...@su...
> > http://www.sub.uni-goettingen.de/ > http://www.rdd.sub.uni-goettingen.de/ > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1 > > iQIcBAEBAgAGBQJVE+FGAAoJEBDMOSiH8mYu1AQP/3t4YqHm+PAL+jGTf3wH8qry
UUy5kM1eTYRKrZ0+il6KLFaGsewkAjfy1ordWXq+WO80RmZo4yqepnO+pkUF8LvY
GE+6vUnzRM6ny+JznmPfIIrWND/qUA75alUvYCD2aPoF/MDosDpHYU9tZxQfUPIm
74Y2QyLsZvkRppSan7S0m3EOX0ZYbUCN+FfwwseEAr3bVo2l+DToTQyWazRltxnq
cj3p2iUUrxP6ojDzn7nVhZJCrnxvGluAgYSHsbD0TVRD1rKs32yZwRg1RtKs2/xh
105jV8in+LKIfOJ8nmlVtk0UWPPOecxHiMMpbHC4VlL6HHDJPvR0lzfiVKR90lp8
5sh4FAXgvDqsDOLJCdn2x7DyMrUWuNpkbcGFoHdZlrdHDE+soewmLCCdPxMwVJM2
nvvMuoA3hZ1bAiRjksNyQ0KzFfO31DNSCYDVfYTJw34vCz36xwZW+3dwnhh9oibT
cDfNQCu97b/UIjobMXvP+JMNtFiDdJmCIKFwXpzsF0HyLYS+5x++qDWzzNyY8wFR
Xj3sWx+N06mvaWiOnsn2i0j/lSjWPZS7eCKT/113wCoc083AhxXRhDY2bOUHWbz5
BjOccRkNtbmHl1e3IY4loIhJOh9xQznOXOYYTa4BZwWWnh1fYWQ5GmV5tafU7Ayy
TmImZmmEQTC05LncxLFt
=x6f1
-----END PGP SIGNATURE----- > > -- ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://blazegraph.com http://blog.bigdata.com <http://bigdata.com> http://mapgraph.io Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
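[Editor's note] The quorum arithmetic Bryan describes above (the cluster accepts reads and writes only while a strict majority of the replication factor is joined) can be sketched as below. These helper functions are illustrative, not part of Blazegraph.

```shell
# Illustrative quorum math for an HA cluster with replication factor k:
# a strict majority, floor(k/2)+1 joined services, is required before the
# quorum meets and the cluster will serve reads or writes.
majority_needed() {
  echo $(( $1 / 2 + 1 ))
}

quorum_meets() {
  # $1 = joined services, $2 = replication factor
  if [ "$1" -ge "$(majority_needed "$2")" ]; then echo yes; else echo no; fi
}

majority_needed 3   # 2: an HA3 cluster tolerates one failed node
quorum_meets 2 3    # yes
quorum_meets 1 3    # no: below a majority, neither reads nor writes
```

For HA3 this gives a majority of two, which answers Max's question above: the cluster stays writable with one node down, but not with two.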
From: Bryan T. <br...@sy...> - 2015-03-25 11:01:26
|
I see several different manners in which ip addresses are being expressed (host name, 10.x.x.x network, 127.x.x.x network). This would appear to be the status for the one host that can not reach the others. While it has discovered zookeeper and obtained RMI proxies for two other services (those 127.0.1.1 addresses look odd) the services are not responding at those addresses. This could be an ip configuration issue (I would try consistently using the private network ip addresses throughout) or a firewall issue. Sometimes you can wind up with bad proxy caches when bouncing the services and making configuration changes. If this happens the easiest thing to do is bring down all 3 services, wait a minute or two for zookeeper to definitively notice that the services are dead and expire the ephemeral znodes. Also, make sure that the services really are down using jps to list java processes and use netstat to make sure that lingering ports have been released. Then check the logs for each service on restart. Once you have a proper configuration, these steps are not necessary but sometimes they can help while attempting to converge on a correct installation. On Wednesday, March 25, 2015, Maximilian Brodhun < br...@su...> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Thanks for the quick answer. > I'm wondering because I put all hosts in etc/hosts and I can ping all > servers by using the name, not the IP. > > I gave all servers a higher timeout but the problem still occurs. The > status tab from blazegraph gives the following response: (maybe an RMI > Problem?) 
> > > Quorum Services > > * http://myhostname.de:7073/bigdata > <http://textgrid-blazequorum.gwdg.de:7073/bigdata> > <http://textgrid-blazequorum.gwdg.de:7073/bigdata> : is not joined, > pipelineOrder=0, writePipelineAddr=/10.254.1.6:3090, service=self, > extendedRunState={server=Running, quorumService=SeekConsensus @ 0, > haReady=-1, haStatus=NotReady, > serviceId=791f97d3-6c12-4470-ac61-97d03c0cd43b, now=1427274241853} > * Unable to reach service: > Proxy[HAGlue,BasicInvocationHandler[BasicObjectEndpoint[704b0793-0285-4a26-ae9a-904a3fc3b5ee,TcpEndpoint[127.0.1.1:3080 > ]]]] > * Unable to reach service: > Proxy[HAGlue,BasicInvocationHandler[BasicObjectEndpoint[f0c1c096-03d8-4d49-8093-30b50d605d8b,TcpEndpoint[127.0.1.1:3080 > ]]]] > > > Zookeeper > > tgHA-1(1 children) > quorum(4 children) > com.bigdata.quorum.zk.QuorumTokenState{lastValidToken=67,currentToken=-1,replicationFactor=3} > joined(2 children) > joined0000000300 (Ephemeral165590815907643397) > com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} > joined0000000301 (Ephemeral93533224253980679) > com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} > member(3 children) > member71ef114a-b872-470a-ac9b-0ff632aa0b59 > (Ephemeral93533224253980679) > com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} > member791f97d3-6c12-4470-ac61-97d03c0cd43b > (Ephemeral237648408135794694) > com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=791f97d3-6c12-4470-ac61-97d03c0cd43b} > member8deaf15c-776d-48d9-84d5-2157c56dbe48 > (Ephemeral165590815907643397) > com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} > pipeline(3 children) > pipeline0000000401 (Ephemeral237648408135794694) > com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=791f97d3-6c12-4470-ac61-97d03c0cd43b,addrSelf=/ > 10.254.1.6:3090} > pipeline0000000403 (Ephemeral165590815907643397) > 
com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48,addrSelf=/ > 10.254.1.2:3090} > pipeline0000000404 (Ephemeral93533224253980679) > com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59,addrSelf=/ > 10.254.1.5:3090} > votes(1 children) > 0(2 children) > vote0000000000 (Ephemeral165590815907643397) > com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} > vote0000000001 (Ephemeral93533224253980679) > com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} > > > > Am 24.03.2015 um 14:58 schrieb Bryan Thompson: > > This could very easily be DNS. Also, java can have long timeouts (30-60 > seconds) if reverse DNS is not properly configured. > > > > You can use http://localhost:port/bigdata/status to see the detailed > status (including zookeeper). This information is also available under > the "status" tab of the workbench. > > > > Thanks, > > Bryan > > > > ---- > > Bryan Thompson > > Chief Scientist & Founder > > SYSTAP, LLC > > 4501 Tower Road > > Greensboro, NC 27410 > > br...@sy... <javascript:_e(%7B%7D,'cvml','br...@sy...');> > <mailto:br...@sy...> > <javascript:_e(%7B%7D,'cvml','br...@sy...');> > > http://blazegraph.com > > http://blog.bigdata.com <http://bigdata.com> <http://bigdata.com> > > http://mapgraph.io > > > > Blazegraph™ <http://www.blazegraph.com/> <http://www.blazegraph.com/> > is our ultra high-performance graph database that supports both RDF/SPARQL > and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> > <http://www.systap.com/mapgraph> is our disruptive new technology to use > GPUs to accelerate data-parallel graph analytics. > > > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are > for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP. 
Any unauthorized review, use, disclosure, > dissemination or copying of this email or its contents or attachments is > prohibited. If you have received this communication in error, please notify > the sender by reply email and permanently delete all copies of the email > and its contents and attachments. > > > > > > On Tue, Mar 24, 2015 at 9:54 AM, Maximilian Brodhun < > br...@su... > <javascript:_e(%7B%7D,'cvml','br...@su...');> > <mailto:br...@su...> > <javascript:_e(%7B%7D,'cvml','br...@su...');>> wrote: > > > > > > Dear All, > > > > I'm very new to blazegraph but I changed to blazegraph cause of the > > clustering possibilities. I'm poor of having trouble with this. I want > > to cluster three nodes, all three become members in the quorum but only > > two join them. > > I discover this with zooinspector. > > > > The only difference between the three servers is that one of them > > doesn't have DNS is that a problem? Maybe on you can help me. > > > > > > My config file looks like this (the same on every machine): > > > > ## Configure basic environment variables. Obviously, you must use your > > own parameters for LOCATORS and ZK_SERVERS. > > > > ## This will not override parameters in the environment. > > > > > > # Name of the federation of services (controls the Apache River GROUPS). > > > > > > if [ -z "${FEDNAME}" ]; then > > > > export FEDNAME=tgRDFCluster > > > > fi > > > > > > # Path for local storage for this federation of services. > > > > > > if [ -z "${FED_DIR}" ]; then > > > > export FED_DIR=/home/tomcat-sesame/blazegraphCluster/data > > > > fi > > > > > > # Name of the replication cluster to which this HAJournalServer will > belong. > > > > > > if [ -z "${LOGICAL_SERVICE_ID}" ]; then > > > > export LOGICAL_SERVICE_ID=tgHA-1 > > > > fi > > > > > > # Where to find the Apache River service registrars (can also use > > multicast). 
> > > > > > if [ -z "${LOCATORS}" ]; then > > > > #Use for a HA1+ configuration > > > > #export LOCATORS="jini://localhost/" > > > > #HA3 example > > > > export > > LOCATORS="jini:// > textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/ > <http://textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/> > <http://textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/> > " > > > > fi > > > > > > # Where to find the Apache Zookeeper ensemble. > > > > > > if [ -z "${ZK_SERVERS}" ] ; then > > > > #Use for single node configuration > > > > export ZK_SERVERS="localhost:2181" > > > > #Use for a multiple ZK configuration > > > > #export ZK_SERVERS="bigdata15:2081,bigdata16:2081,bigdata17:2081" > > > > fi > > > > > > #Replication Factor (set to one for HA1) configuration > > > > > > if [ -z "${REPLICATION_FACTOR}" ] ; then > > > > #Use for a HA1 configuration > > > > export REPLICATION_FACTOR=3 > > > > #Use for a HA1+ configuration > > > > #export REPLICATION_FACTOR=3 > > > > fi > > > > > > #Port for the NanoSparqlServer Jetty > > > > > > if [ -z "${JETTY_PORT}" ] ; then > > > > export JETTY_PORT=7070 > > > > fi > > > > > > #Group commit (true|false) > > > > > > if [ -z "${GROUP_COMMIT}" ] ; then > > > > export GROUP_COMMIT=true > > > > > > > > > > > ------------------------------------------------------------------------------ > > Dive into the World of Parallel Programming The Go Parallel Website, > sponsored > > by Intel and developed in partnership with Slashdot Media, is your > hub for all > > things parallel software development, from weekly thought leadership > blogs to > > news, videos, case studies, tutorials and more. Take a look and join > the > > conversation now. http://goparallel.sourceforge.net/ > > _______________________________________________ > > Bigdata-developers mailing list > > Big...@li... 
> <javascript:_e(%7B%7D,'cvml','Big...@li...');> > <mailto:Big...@li...> > <javascript:_e(%7B%7D,'cvml','Big...@li...');> > > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > > > > > > - -- > Maximilian Brodhun > > Abteilung Forschung und Entwicklung > Georg-August-Universität Göttingen > Niedersächsische Staats- und Universitätsbibliothek Göttingen > D-37070 Göttingen > > Papendiek 14 (Historisches Gebäude, Raum 2.409) > +49 551 39-4923 (Tel.) > > br...@su... > <javascript:_e(%7B%7D,'cvml','br...@su...');> > > http://www.sub.uni-goettingende/ > http://www.rdd.sub.uni-goettingen.de/ > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1 > > iQIcBAEBAgAGBQJVEn0FAAoJEBDMOSiH8mYu01EP/ibqT/5sn3dcDHRQ0upDEq4r > dJh6IYUkQnCDJJ36lgozVSlEEKHAnLgMW8HcFWwA4ox74v/NJtvnvMz+mjDuExQ1 > xZbMQHGW19pXSJoiI2rWbi/j7cwdU46EUlAnAxf96UN/P4Srg3OPcqrvGo9Y6YC9 > 5Z+WmLMBZ+kVG9++Vhe0JCoqe6L8NkjVk/wvOwM5Qdh4Et99HLUA0tFBwsK+8cQK > EBrkjDsGmsvtJQGNVygN4EWDP0GfQPXU7XzPExOhKO/mPzFTRuyd0FAzXlSWWnqR > TA77bM9rqBCo3jmWyFme84gpggVeUWMNFyLEdJDKYIU3JU/wcf+5JxHqD5xMGSQ6 > Y2e4c9MNEkkernT8XedUdcttnNBvZqN4M5AILNr19uD5zz+RKpgsg+fNRvAhXvz2 > HFevFWG7+5M87b781z6gjcVvyPSsIOvNKdNljzWYSjOwO5f1hAEz0f2LuYJF9vzf > HaCQ7Gt1tCyGUn/UPteGhUOq6xWOqHZHAgwz0eGGMpxI/4vKNgPq8huObPXvJieV > gu3/RVupNsyNwyczeMDg6OSYyz791EEKfxkKhJ9rN3DpPbAdenCE4/7udpTR7R8c > AJjBttHQ2tXAFS1Tvp5ViM9lEFXT315yLhhWkHhtMt1RVygYPrVr8p3nZmpZ67iW > IwYYhOIpyrLoai1CY6Mm > =JnID > -----END PGP SIGNATURE----- > > -- ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://blazegraph.com http://blog.bigdata.com <http://bigdata.com> http://mapgraph.io Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. 
|
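The restart procedure Bryan describes (stop all three services, let zookeeper expire the ephemeral znodes, verify with jps/netstat, restart and check logs) can be sketched as a small script. This is only a sketch: the host names, the ssh access, and the `bigdataHA` service-script name are assumptions, not taken from the thread, and with `DRY_RUN=1` (the default) the commands are merely printed rather than executed.

```shell
#!/bin/sh
# Sketch of the HA3 restart procedure described above.
# Assumptions: hypothetical host names, passwordless ssh, and a "bigdataHA"
# start/stop script on each node. DRY_RUN=1 (default) only prints the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "${DRY_RUN}" = "1" ]; then
        echo "+ $*"                     # dry run: show what would be executed
    else
        "$@"
    fi
}

restart_ha3() {
    for host in "$@"; do
        run ssh "$host" bigdataHA stop  # bring all services down first
    done
    run sleep 120                       # give zookeeper time to expire the ephemeral znodes
    for host in "$@"; do
        run ssh "$host" jps             # confirm no java processes linger
        run ssh "$host" netstat -tlnp   # confirm the ports have been released
        run ssh "$host" bigdataHA start # restart, then check the service logs
    done
}

restart_ha3 ha-node1 ha-node2 ha-node3  # hypothetical host names
```

Run with `DRY_RUN=0` only once the printed command sequence matches your installation's actual service scripts.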
From: Maximilian B. <br...@su...> - 2015-03-25 09:18:37
|
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Thanks for the quick answer. I'm wondering cause I put all hosts in etc/hosts and I can ping all servers by using the name not the IP. I give all servers a higher timeout time but the problem still occurs. The status tab from blazegraph gives the following response: (maybe an RMI Problem?) Quorum Services * http://myhostname.de:7073/bigdata <http://textgrid-blazequorum.gwdg.de:7073/bigdata> : is not joined, pipelineOrder=0, writePipelineAddr=/10.254.1.6:3090, service=self, extendedRunState={server=Running, quorumService=SeekConsensus @ 0, haReady=-1, haStatus=NotReady, serviceId=791f97d3-6c12-4470-ac61-97d03c0cd43b, now=1427274241853} * Unable to reach service: Proxy[HAGlue,BasicInvocationHandler[BasicObjectEndpoint[704b0793-0285-4a26-ae9a-904a3fc3b5ee,TcpEndpoint[127.0.1.1:3080]]]] * Unable to reach service: Proxy[HAGlue,BasicInvocationHandler[BasicObjectEndpoint[f0c1c096-03d8-4d49-8093-30b50d605d8b,TcpEndpoint[127.0.1.1:3080]]]] Zookeeper tgHA-1(1 children) quorum(4 children) com.bigdata.quorum.zk.QuorumTokenState{lastValidToken=67,currentToken=-1,replicationFactor=3} joined(2 children) joined0000000300 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} joined0000000301 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} member(3 children) member71ef114a-b872-470a-ac9b-0ff632aa0b59 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} member791f97d3-6c12-4470-ac61-97d03c0cd43b (Ephemeral237648408135794694) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=791f97d3-6c12-4470-ac61-97d03c0cd43b} member8deaf15c-776d-48d9-84d5-2157c56dbe48 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} pipeline(3 children) pipeline0000000401 
(Ephemeral237648408135794694) com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=791f97d3-6c12-4470-ac61-97d03c0cd43b,addrSelf=/10.254.1.6:3090} pipeline0000000403 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48,addrSelf=/10.254.1.2:3090} pipeline0000000404 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumPipelineState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59,addrSelf=/10.254.1.5:3090} votes(1 children) 0(2 children) vote0000000000 (Ephemeral165590815907643397) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=8deaf15c-776d-48d9-84d5-2157c56dbe48} vote0000000001 (Ephemeral93533224253980679) com.bigdata.quorum.zk.QuorumServiceState{serviceUUID=71ef114a-b872-470a-ac9b-0ff632aa0b59} Am 24.03.2015 um 14:58 schrieb Bryan Thompson: > This could very easily be DNS. Also, java can have long timeouts (30-60 seconds) if reverse DNS is not properly configured. > > You can use http://localhost:port/bigdata/status to see the detailed status (including zookeeper). This information is also available under the "status" tab of the workbench. > > Thanks, > Bryan > > ---- > Bryan Thompson > Chief Scientist & Founder > SYSTAP, LLC > 4501 Tower Road > Greensboro, NC 27410 > br...@sy... <mailto:br...@sy...> > http://blazegraph.com > http://blog.bigdata.com <http://bigdata.com> > http://mapgraph.io > > Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. 
If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. > > > On Tue, Mar 24, 2015 at 9:54 AM, Maximilian Brodhun <br...@su... <mailto:br...@su...>> wrote: > > > Dear All, > > I'm very new to blazegraph but I changed to blazegraph cause of the > clustering possibilities. I'm poor of having trouble with this. I want > to cluster three nodes, all three become members in the quorum but only > two join them. > I discover this with zooinspector. > > The only difference between the three servers is that one of them > doesn't have DNS is that a problem? Maybe on you can help me. > > > My config file looks like this (the same on every machine): > > ## Configure basic environment variables. Obviously, you must use your > own parameters for LOCATORS and ZK_SERVERS. > > ## This will not override parameters in the environment. > > > # Name of the federation of services (controls the Apache River GROUPS). > > > if [ -z "${FEDNAME}" ]; then > > export FEDNAME=tgRDFCluster > > fi > > > # Path for local storage for this federation of services. > > > if [ -z "${FED_DIR}" ]; then > > export FED_DIR=/home/tomcat-sesame/blazegraphCluster/data > > fi > > > # Name of the replication cluster to which this HAJournalServer will belong. > > > if [ -z "${LOGICAL_SERVICE_ID}" ]; then > > export LOGICAL_SERVICE_ID=tgHA-1 > > fi > > > # Where to find the Apache River service registrars (can also use > multicast). > > > if [ -z "${LOCATORS}" ]; then > > #Use for a HA1+ configuration > > #export LOCATORS="jini://localhost/" > > #HA3 example > > export > LOCATORS="jini://textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/ <http://textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/>" > > fi > > > # Where to find the Apache Zookeeper ensemble. 
> > > if [ -z "${ZK_SERVERS}" ] ; then > > #Use for single node configuration > > export ZK_SERVERS="localhost:2181" > > #Use for a multiple ZK configuration > > #export ZK_SERVERS="bigdata15:2081,bigdata16:2081,bigdata17:2081" > > fi > > > #Replication Factor (set to one for HA1) configuration > > > if [ -z "${REPLICATION_FACTOR}" ] ; then > > #Use for a HA1 configuration > > export REPLICATION_FACTOR=3 > > #Use for a HA1+ configuration > > #export REPLICATION_FACTOR=3 > > fi > > > #Port for the NanoSparqlServer Jetty > > > if [ -z "${JETTY_PORT}" ] ; then > > export JETTY_PORT=7070 > > fi > > > #Group commit (true|false) > > > if [ -z "${GROUP_COMMIT}" ] ; then > > export GROUP_COMMIT=true > > > > > ------------------------------------------------------------------------------ > Dive into the World of Parallel Programming The Go Parallel Website, sponsored > by Intel and developed in partnership with Slashdot Media, is your hub for all > things parallel software development, from weekly thought leadership blogs to > news, videos, case studies, tutorials and more. Take a look and join the > conversation now. http://goparallel.sourceforge.net/ > _______________________________________________ > Bigdata-developers mailing list > Big...@li... <mailto:Big...@li...> > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > > - -- Maximilian Brodhun Abteilung Forschung und Entwicklung Georg-August-Universität Göttingen Niedersächsische Staats- und Universitätsbibliothek Göttingen D-37070 Göttingen Papendiek 14 (Historisches Gebäude, Raum 2.409) +49 551 39-4923 (Tel.) br...@su... 
http://www.sub.uni-goettingende/ http://www.rdd.sub.uni-goettingen.de/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJVEn0FAAoJEBDMOSiH8mYu01EP/ibqT/5sn3dcDHRQ0upDEq4r dJh6IYUkQnCDJJ36lgozVSlEEKHAnLgMW8HcFWwA4ox74v/NJtvnvMz+mjDuExQ1 xZbMQHGW19pXSJoiI2rWbi/j7cwdU46EUlAnAxf96UN/P4Srg3OPcqrvGo9Y6YC9 5Z+WmLMBZ+kVG9++Vhe0JCoqe6L8NkjVk/wvOwM5Qdh4Et99HLUA0tFBwsK+8cQK EBrkjDsGmsvtJQGNVygN4EWDP0GfQPXU7XzPExOhKO/mPzFTRuyd0FAzXlSWWnqR TA77bM9rqBCo3jmWyFme84gpggVeUWMNFyLEdJDKYIU3JU/wcf+5JxHqD5xMGSQ6 Y2e4c9MNEkkernT8XedUdcttnNBvZqN4M5AILNr19uD5zz+RKpgsg+fNRvAhXvz2 HFevFWG7+5M87b781z6gjcVvyPSsIOvNKdNljzWYSjOwO5f1hAEz0f2LuYJF9vzf HaCQ7Gt1tCyGUn/UPteGhUOq6xWOqHZHAgwz0eGGMpxI/4vKNgPq8huObPXvJieV gu3/RVupNsyNwyczeMDg6OSYyz791EEKfxkKhJ9rN3DpPbAdenCE4/7udpTR7R8c AJjBttHQ2tXAFS1Tvp5ViM9lEFXT315yLhhWkHhtMt1RVygYPrVr8p3nZmpZ67iW IwYYhOIpyrLoai1CY6Mm =JnID -----END PGP SIGNATURE----- |
From: Bryan T. <br...@sy...> - 2015-03-24 13:58:30
|
This could very easily be DNS. Also, java can have long timeouts (30-60 seconds) if reverse DNS is not properly configured. You can use http://localhost:port/bigdata/status to see the detailed status (including zookeeper). This information is also available under the "status" tab of the workbench. Thanks, Bryan ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://blazegraph.com http://blog.bigdata.com <http://bigdata.com> http://mapgraph.io Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. On Tue, Mar 24, 2015 at 9:54 AM, Maximilian Brodhun < br...@su...> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dear All, > > I'm very new to blazegraph but I changed to blazegraph cause of the > clustering possibilities. I'm poor of having trouble with this. I want > to cluster three nodes, all three become members in the quorum but only > two join them. > I discover this with zooinspector. > > The only difference between the three servers is that one of them > doesn't have DNS is that a problem? Maybe on you can help me. > > > My config file looks like this (the same on every machine): > > ## Configure basic environment variables. Obviously, you must use your > own parameters for LOCATORS and ZK_SERVERS. 
> > ## This will not override parameters in the environment. > > > # Name of the federation of services (controls the Apache River GROUPS). > > > if [ -z "${FEDNAME}" ]; then > > export FEDNAME=tgRDFCluster > > fi > > > # Path for local storage for this federation of services. > > > if [ -z "${FED_DIR}" ]; then > > export FED_DIR=/home/tomcat-sesame/blazegraphCluster/data > > fi > > > # Name of the replication cluster to which this HAJournalServer will > belong. > > > if [ -z "${LOGICAL_SERVICE_ID}" ]; then > > export LOGICAL_SERVICE_ID=tgHA-1 > > fi > > > # Where to find the Apache River service registrars (can also use > multicast). > > > if [ -z "${LOCATORS}" ]; then > > #Use for a HA1+ configuration > > #export LOCATORS="jini://localhost/" > > #HA3 example > > export > LOCATORS="jini:// > textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/ > " > > fi > > > # Where to find the Apache Zookeeper ensemble. > > > if [ -z "${ZK_SERVERS}" ] ; then > > #Use for single node configuration > > export ZK_SERVERS="localhost:2181" > > #Use for a multiple ZK configuration > > #export ZK_SERVERS="bigdata15:2081,bigdata16:2081,bigdata17:2081" > > fi > > > #Replication Factor (set to one for HA1) configuration > > > if [ -z "${REPLICATION_FACTOR}" ] ; then > > #Use for a HA1 configuration > > export REPLICATION_FACTOR=3 > > #Use for a HA1+ configuration > > #export REPLICATION_FACTOR=3 > > fi > > > #Port for the NanoSparqlServer Jetty > > > if [ -z "${JETTY_PORT}" ] ; then > > export JETTY_PORT=7070 > > fi > > > #Group commit (true|false) > > > if [ -z "${GROUP_COMMIT}" ] ; then > > export GROUP_COMMIT=true > > > - -- > Maximilian Brodhun > > Abteilung Forschung und Entwicklung > Georg-August-Universität Göttingen > Niedersächsische Staats- und Universitätsbibliothek Göttingen > D-37070 Göttingen > > Papendiek 14 (Historisches Gebäude, Raum 2.409) > +49 551 39-4923 (Tel.) > > br...@su... 
> > http://www.sub.uni-goettingende/ > http://www.rdd.sub.uni-goettingen.de/ > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1 > > iQIcBAEBAgAGBQJVEWyPAAoJEBDMOSiH8mYucJMP/0PgYrjY1It0HQ6H65S8Qi0P > /2mit6rDD7saQwVm0+f29ZVtteE3mWgGHCgsycthJvuVKiM5rJBvsa7cZGfA0DTy > SY7wS6sx/87J4U38dY7qqPbl846eOHvyIIfxU/IkK7p/UEyJd39V3Nu+TYGnEmzS > beEX6CijpORpbvivNeCv+Lgcb2yrLB5/AwTdURmH65v/nP7oopgMMz+yazznw+vh > UOtYtjMT/msgzl8pAr76W/wAvsiihk6DhzdAgBPCwAriMD3JUuAawXvuWNqRdYfZ > bGCgzdMSMCtGn22St7G2F2BqKLzhW6kY50LV/A0M4DfFzvJ5bZdARYsj1IDwkRL1 > KKv88tYDdN+w+i1H84a20PN5wI9AVsZCbvMFELfrc2wUsdOpXalbiFY6uagwzkzs > bolgHicuw7o63QvrhBMl2xPf/y7qCkMVjcGPClwhcUzHX3a8s6degjPptga/CZMg > 0YnUSK5vSSqPquLanfExza4BbrG7LhgmzKPGkPkoqhgURhQaBCZA+A6tw+8NSs9P > PYrkKwqEYNhug1LIif2BhUlOSwNhshPLmhn0/jt8HWIsnqDckFy4xhaLQYo3tG6+ > bUzLSQCc9z2fdV889Bwn87rHV/II5nk3aJKk1NgTu7C+6txpFj9GtKk1wGkDmf41 > 8TNMmsIgHLA8OQMXd++E > =7LK/ > -----END PGP SIGNATURE----- > > > > ------------------------------------------------------------------------------ > Dive into the World of Parallel Programming The Go Parallel Website, > sponsored > by Intel and developed in partnership with Slashdot Media, is your hub for > all > things parallel software development, from weekly thought leadership blogs > to > news, videos, case studies, tutorials and more. Take a look and join the > conversation now. http://goparallel.sourceforge.net/ > _______________________________________________ > Bigdata-developers mailing list > Big...@li... > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > |
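The `/bigdata/status` page Bryan points at can be polled from the shell to see quickly which nodes answer at all, which helps separate DNS/firewall problems from quorum problems. A minimal sketch: the host list is illustrative and the Jetty port 7070 comes from the configuration quoted in the thread.

```shell
#!/bin/sh
# Probe the NanoSparqlServer status page on each node.
# Port 7070 matches JETTY_PORT in the configuration above; hosts are examples.
check_status() {
    port=$1; shift
    for host in "$@"; do
        if curl -sf --max-time 3 "http://${host}:${port}/bigdata/status" > /dev/null; then
            echo "${host}: status page reachable"
        else
            echo "${host}: NOT reachable (check DNS/reverse DNS and firewall)"
        fi
    done
}

check_status 7070 localhost   # demo against the local node; add the other cluster hosts as needed
```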
From: Maximilian B. <br...@su...> - 2015-03-24 13:54:32
|
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Dear All, I'm very new to Blazegraph; I switched to Blazegraph because of its clustering capabilities, but I'm having trouble with them. I want to cluster three nodes: all three become members of the quorum, but only two join it. I discovered this with ZooInspector. The only difference between the three servers is that one of them doesn't have DNS. Is that a problem? Maybe one of you can help me. My config file looks like this (the same on every machine): ## Configure basic environment variables. Obviously, you must use your own parameters for LOCATORS and ZK_SERVERS. ## This will not override parameters in the environment. # Name of the federation of services (controls the Apache River GROUPS). if [ -z "${FEDNAME}" ]; then export FEDNAME=tgRDFCluster fi # Path for local storage for this federation of services. if [ -z "${FED_DIR}" ]; then export FED_DIR=/home/tomcat-sesame/blazegraphCluster/data fi # Name of the replication cluster to which this HAJournalServer will belong. if [ -z "${LOGICAL_SERVICE_ID}" ]; then export LOGICAL_SERVICE_ID=tgHA-1 fi # Where to find the Apache River service registrars (can also use multicast). if [ -z "${LOCATORS}" ]; then #Use for a HA1+ configuration #export LOCATORS="jini://localhost/" #HA3 example export LOCATORS="jini://textgrid-test1.gwdg.de/,jini://textgrid-test1.gwdg.de/,jini://141.5.102.206/" fi # Where to find the Apache Zookeeper ensemble. 
if [ -z "${ZK_SERVERS}" ] ; then #Use for single node configuration export ZK_SERVERS="localhost:2181" #Use for a multiple ZK configuration #export ZK_SERVERS="bigdata15:2081,bigdata16:2081,bigdata17:2081" fi #Replication Factor (set to one for HA1) configuration if [ -z "${REPLICATION_FACTOR}" ] ; then #Use for a HA1 configuration export REPLICATION_FACTOR=3 #Use for a HA1+ configuration #export REPLICATION_FACTOR=3 fi #Port for the NanoSparqlServer Jetty if [ -z "${JETTY_PORT}" ] ; then export JETTY_PORT=7070 fi #Group commit (true|false) if [ -z "${GROUP_COMMIT}" ] ; then export GROUP_COMMIT=true - -- Maximilian Brodhun Abteilung Forschung und Entwicklung Georg-August-Universität Göttingen Niedersächsische Staats- und Universitätsbibliothek Göttingen D-37070 Göttingen Papendiek 14 (Historisches Gebäude, Raum 2.409) +49 551 39-4923 (Tel.) br...@su... http://www.sub.uni-goettingende/ http://www.rdd.sub.uni-goettingen.de/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJVEWyPAAoJEBDMOSiH8mYucJMP/0PgYrjY1It0HQ6H65S8Qi0P /2mit6rDD7saQwVm0+f29ZVtteE3mWgGHCgsycthJvuVKiM5rJBvsa7cZGfA0DTy SY7wS6sx/87J4U38dY7qqPbl846eOHvyIIfxU/IkK7p/UEyJd39V3Nu+TYGnEmzS beEX6CijpORpbvivNeCv+Lgcb2yrLB5/AwTdURmH65v/nP7oopgMMz+yazznw+vh UOtYtjMT/msgzl8pAr76W/wAvsiihk6DhzdAgBPCwAriMD3JUuAawXvuWNqRdYfZ bGCgzdMSMCtGn22St7G2F2BqKLzhW6kY50LV/A0M4DfFzvJ5bZdARYsj1IDwkRL1 KKv88tYDdN+w+i1H84a20PN5wI9AVsZCbvMFELfrc2wUsdOpXalbiFY6uagwzkzs bolgHicuw7o63QvrhBMl2xPf/y7qCkMVjcGPClwhcUzHX3a8s6degjPptga/CZMg 0YnUSK5vSSqPquLanfExza4BbrG7LhgmzKPGkPkoqhgURhQaBCZA+A6tw+8NSs9P PYrkKwqEYNhug1LIif2BhUlOSwNhshPLmhn0/jt8HWIsnqDckFy4xhaLQYo3tG6+ bUzLSQCc9z2fdV889Bwn87rHV/II5nk3aJKk1NgTu7C+6txpFj9GtKk1wGkDmf41 8TNMmsIgHLA8OQMXd++E =7LK/ -----END PGP SIGNATURE----- |
From: Brad B. <be...@sy...> - 2015-03-21 00:58:52
|
See the full details: http://blog.blazegraph.com/?p=859. -- _______________ Brad Bebee Managing Partner SYSTAP, LLC e: be...@sy... m: 202.642.7961 f: 571.367.5000 w: www.systap.com Blazegraph™ <http://www.blazegraph.com> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics. |
From: Jim B. <ba...@ne...> - 2015-03-13 20:07:48
|
Hi Michael, Thank you, specifying the GRAPH inside the INSERT does seem to make it work. I had previously come to the (perhaps incorrect) understanding that WITH provided the graph for the updates, while USING would specify the graph for the WHERE query part. This part of the spec led me to that idea: "If the INSERT template specifies GRAPH blocks then these will be the graphs affected. Otherwise, the operation will be applied to the default graph, or, respectively, to the graph specified in the WITH clause, if one was specified. If no USING (NAMED) clause is present, then the pattern in the WHERE clause will be matched against the Graph Store, otherwise against the dataset specified by the USING (NAMED) clauses. The matches against the WHERE clause create bindings to be applied to the template for determining triples to be inserted (following the same rules as for DELETE/INSERT)." I am still a little confused because I submitted a different update that worked the way I thought (query on all graphs, insert into a particular graph). But the GRAPH syntax works just as well for me. Thanks! Jim > On Mar 12, 2015, at 5:04 AM, Michael Schmidt <ms...@me...> wrote: > > Jim, > > I had some thoughts on the scenario, here’s what I believe is going on. You are using WITH, which is defined as follows: > > "The WITH clause defines the graph that will be modified or matched against for any of the subsequent elements (in DELETE, INSERT, or WHERE clauses) if they do not specify a graph explicitly. If not provided, then the default graph of the Graph Store (or an explicitly declared dataset in the WHERE clause) will be assumed. 
That is, a WITH clause may be viewed as syntactic sugar for wrapping both the QuadPatterns in subsequent DELETE and INSERT clauses, and likewise the GroupGraphPattern in the subsequent WHERE clause into GRAPH patterns.” (see http://www.w3.org/TR/sparql11-update/) > > As I understand the standard, using a SELECT subquery introduces a new scope, so in query #2 your WHERE clause — namely, the inner SELECT query — applies to the *whole* graph, while in query #1 the pattern is evaluated against graph <http://kb.phenoscape.org/ic> only. Assuming that the matching triples are not (or not completely) residing in that graph, this would explain why the first query returns immediately (and, as I understand, would be expected behaviour). So if your intention is to match against all graphs, but insert into a specific graph, you might try out the following (skipping WITH, but specifying the GRAPH in the INSERT clause): > > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> > PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> > PREFIX owl: <http://www.w3.org/2002/07/owl#> > PREFIX dc: <http://purl.org/dc/elements/1.1/> > PREFIX ps: <http://purl.org/phenoscape/vocab.owl#> > PREFIX obo: <http://purl.obolibrary.org/obo/> > > INSERT { > GRAPH <http://kb.phenoscape.org/ic> { ?annotation ps:reflexive_subClassOf ?subsumer } > } > WHERE { > ?term ps:has_phenotypic_profile/rdf:type ?annotation . > ?annotation rdfs:subClassOf* ?subsumer . > FILTER(isIRI(?subsumer)) > } > > Does that one work in your scenario? > > Best, > Michael > > > > >> On 11 Mar 2015, at 21:55, Jim Balhoff <ba...@ne...> wrote: >> >>> On Mar 11, 2015, at 1:03 PM, Michael Schmidt <ms...@me...> wrote: >>> >>> Jim, >>> >>> first of all, I’d agree that the two queries are equivalent (unless I’m missing some typo here). 
>>> >>> We’ve recently worked on improvements related to the evaluation of arbitrary length path operators (in your case: the transitive closure calculation over subClassOf), see http://trac.bigdata.com/ticket/1003. The fix there may help in terms of performance for both of your queries — as you may have noticed from previous discussions, we’re currently preparing the next release (planned for the end of the week), which will include this fix. We’d be happy to get your feedback whether this fixes the memory/performance of your query once the release is out. >> >> After your message I tried the latest from the git master branch. It did change the performance of the update with the subquery: the memory increased more slowly, and although it did get close to maxing out, the update eventually completed, inserting 14,459,839 triples. >> >> >>> Regarding the problem with the update having no effect: would you mind sharing your curl call? Is your request following the guidelines documented at http://wiki.bigdata.com/wiki/index.php/NanoSparqlServer#UPDATE_.28SPARQL_1.1_UPDATE.29? >> >> I am submitting it like this: >> >> curl -X POST --data-binary @materialize_subsumer.rq --header "Content-Type:application/sparql-update" http://example.org/sparql >> >> This doesn't match the example on the wiki, but I think it is correct according to the protocol. And I think it worked for other updates. I tried pasting my update query into the web dashboard and it also returned immediately with zero mutations. This curl command did work for the version containing the subquery. >> >> So it does seem like there could be something funny going on with the first query. I don't have very much time to investigate this week but I will try to reproduce it with some other queries later. >> >> Thank you! >> Jim >> > |
From: Michael S. <ms...@me...> - 2015-03-12 09:04:31
|
Jim, I had some thoughts on the scenario, here’s what I believe is going on. You are using WITH, which is defined as follows:

"The WITH clause defines the graph that will be modified or matched against for any of the subsequent elements (in DELETE, INSERT, or WHERE clauses) if they do not specify a graph explicitly. If not provided, then the default graph of the Graph Store (or an explicitly declared dataset in the WHERE clause) will be assumed. That is, a WITH clause may be viewed as syntactic sugar for wrapping both the QuadPatterns in subsequent DELETE and INSERT clauses, and likewise the GroupGraphPattern in the subsequent WHERE clause into GRAPH patterns." (see http://www.w3.org/TR/sparql11-update/)

As I understand the standard, using a SELECT subquery introduces a new scope, so in query #2 your WHERE clause — namely, the inner SELECT query — applies to the *whole* graph, while in query #1 the pattern is evaluated against graph <http://kb.phenoscape.org/ic> only. Assuming that the matching triples are not (or not completely) residing in that graph, this would explain why the first query returns immediately (and, as I understand, would be expected behaviour). So if your intention is to match against all graphs, but insert into a specific graph, you might try out the following (skipping WITH, but specifying the GRAPH in the INSERT clause):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ps: <http://purl.org/phenoscape/vocab.owl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>

INSERT {
  GRAPH <http://kb.phenoscape.org/ic> { ?annotation ps:reflexive_subClassOf ?subsumer }
}
WHERE {
  ?term ps:has_phenotypic_profile/rdf:type ?annotation .
  ?annotation rdfs:subClassOf* ?subsumer .
  FILTER(isIRI(?subsumer))
}

Does that one work in your scenario?

Best,
Michael

> On 11 Mar 2015, at 21:55, Jim Balhoff <ba...@ne...> wrote:
>
>> On Mar 11, 2015, at 1:03 PM, Michael Schmidt <ms...@me...> wrote:
>>
>> Jim,
>>
>> first of all, I’d agree that the two queries are equivalent (unless I’m missing some typo here).
>>
>> We’ve recently worked on improvements related to the evaluation of arbitrary length path operators (in your case: the transitive closure calculation over subClassOf), see http://trac.bigdata.com/ticket/1003. The fix there may help in terms of performance for both of your queries — as you may have noticed from previous discussions, we’re currently preparing the next release (planned for the end of the week), which will include this fix. We’d be happy to get your feedback whether this fixes the memory/performance of your query once the release is out.
>
> After your message I tried the latest from the git master branch. It did change the performance of the update with the subquery: the memory increased more slowly, and although it did get close to maxing out, the update eventually completed, inserting 14,459,839 triples.
>
>> Regarding the problem with the update having no effect: would you mind sharing your curl call? Is your request following the guidelines documented at http://wiki.bigdata.com/wiki/index.php/NanoSparqlServer#UPDATE_.28SPARQL_1.1_UPDATE.29?
>
> I am submitting it like this:
>
> curl -X POST --data-binary @materialize_subsumer.rq --header "Content-Type:application/sparql-update" http://example.org/sparql
>
> This doesn't match the example on the wiki, but I think it is correct according to the protocol. And I think it worked for other updates. I tried pasting my update query into the web dashboard and it also returned immediately with zero mutations. This curl command did work for the version containing the subquery.
>
> So it does seem like there could be something funny going on with the first query. I don't have very much time to investigate this week but I will try to reproduce it with some other queries later.
>
> Thank you!
> Jim
|
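The scoping behaviour described above can be sketched with a toy named-graph store in plain Python. This is not Blazegraph and not a SPARQL engine; the graph IRIs, predicates, and triples are invented stand-ins that mirror the thread's scenario (the annotation data living outside the graph that WITH names):

```python
# Toy named-graph store: graph IRI -> set of (s, p, o) triples.
# The subClassOf data lives in a different graph than the one named
# in WITH, mirroring the situation in the thread (names are invented).
store = {
    "http://kb.phenoscape.org/data": {
        ("ex:annotation1", "rdfs:subClassOf", "ex:ClassA"),
    },
    "http://kb.phenoscape.org/ic": set(),  # INSERT target, initially empty
}

def match_subclassof(graph_iris):
    """Bindings for the pattern ?annotation rdfs:subClassOf ?subsumer,
    evaluated only against the listed graphs."""
    return {
        (s, o)
        for g in graph_iris
        for (s, p, o) in store[g]
        if p == "rdfs:subClassOf"
    }

# Query #1: WITH <.../ic> scopes the WHERE clause to that single graph,
# which holds no matching triples, hence "0 mutations".
scoped = match_subclassof(["http://kb.phenoscape.org/ic"])

# The rewrite above: no WITH, so WHERE sees the whole dataset; only the
# INSERT template names the target graph via an explicit GRAPH block.
union = match_subclassof(list(store))
for annotation, subsumer in union:
    store["http://kb.phenoscape.org/ic"].add(
        (annotation, "ps:reflexive_subClassOf", subsumer)
    )

print(len(scoped), len(union))  # 0 1
```

The same triple pattern finds nothing when scoped to the empty <.../ic> graph but one binding when evaluated over the union, which is exactly the difference between the WITH form and the GRAPH-in-INSERT form.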
From: Jim B. <ba...@ne...> - 2015-03-11 20:55:26
|
> On Mar 11, 2015, at 1:03 PM, Michael Schmidt <ms...@me...> wrote:
>
> Jim,
>
> first of all, I’d agree that the two queries are equivalent (unless I’m missing some typo here).
>
> We’ve recently worked on improvements related to the evaluation of arbitrary length path operators (in your case: the transitive closure calculation over subClassOf), see http://trac.bigdata.com/ticket/1003. The fix there may help in terms of performance for both of your queries — as you may have noticed from previous discussions, we’re currently preparing the next release (planned for the end of the week), which will include this fix. We’d be happy to get your feedback whether this fixes the memory/performance of your query once the release is out.

After your message I tried the latest from the git master branch. It did change the performance of the update with the subquery: the memory increased more slowly, and although it did get close to maxing out, the update eventually completed, inserting 14,459,839 triples.

> Regarding the problem with the update having no effect: would you mind sharing your curl call? Is your request following the guidelines documented at http://wiki.bigdata.com/wiki/index.php/NanoSparqlServer#UPDATE_.28SPARQL_1.1_UPDATE.29?

I am submitting it like this:

curl -X POST --data-binary @materialize_subsumer.rq --header "Content-Type:application/sparql-update" http://example.org/sparql

This doesn't match the example on the wiki, but I think it is correct according to the protocol. And I think it worked for other updates. I tried pasting my update query into the web dashboard and it also returned immediately with zero mutations. This curl command did work for the version containing the subquery.

So it does seem like there could be something funny going on with the first query. I don't have very much time to investigate this week but I will try to reproduce it with some other queries later.

Thank you!
Jim
|
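The curl invocation above uses the SPARQL 1.1 Protocol's direct-POST binding for updates (the update text as the request body, with Content-Type application/sparql-update). For reference, the same request can be issued from Python's standard library alone; the endpoint URL is a placeholder, as in the message:

```python
import urllib.request

def post_sparql_update(endpoint, update_text):
    """POST a SPARQL 1.1 Update body directly, equivalent to:
    curl -X POST --data-binary @file.rq \
         --header "Content-Type: application/sparql-update" <endpoint>
    Returns the HTTP status code."""
    req = urllib.request.Request(
        endpoint,
        data=update_text.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Usage would be, e.g., `post_sparql_update("http://example.org/sparql", open("materialize_subsumer.rq").read())`; a non-2xx status raises `urllib.error.HTTPError`.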
From: Michael S. <ms...@me...> - 2015-03-11 17:30:37
|
Jim,

first of all, I’d agree that the two queries are equivalent (unless I’m missing some typo here).

We’ve recently worked on improvements related to the evaluation of arbitrary length path operators (in your case: the transitive closure calculation over subClassOf), see http://trac.bigdata.com/ticket/1003. The fix there may help in terms of performance for both of your queries — as you may have noticed from previous discussions, we’re currently preparing the next release (planned for the end of the week), which will include this fix. We’d be happy to get your feedback whether this fixes the memory/performance of your query once the release is out.

Regarding the problem with the update having no effect: would you mind sharing your curl call? Is your request following the guidelines documented at http://wiki.bigdata.com/wiki/index.php/NanoSparqlServer#UPDATE_.28SPARQL_1.1_UPDATE.29?

Best,
Michael

> On 11 Mar 2015, at 17:37, Jim Balhoff <ba...@ne...> wrote:
>
> Hi,
>
> I am having a confusing issue with SPARQL update and wanted to verify my expectations. Submitting this update using curl returns immediately and reports 0 mutations:
>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> PREFIX owl: <http://www.w3.org/2002/07/owl#>
> PREFIX dc: <http://purl.org/dc/elements/1.1/>
> PREFIX ps: <http://purl.org/phenoscape/vocab.owl#>
> PREFIX obo: <http://purl.obolibrary.org/obo/>
> WITH <http://kb.phenoscape.org/ic>
> INSERT {
>   ?annotation ps:reflexive_subClassOf ?subsumer .
> }
> WHERE {
>   ?term ps:has_phenotypic_profile/rdf:type ?annotation .
>   ?annotation rdfs:subClassOf* ?subsumer .
>   FILTER(isIRI(?subsumer))
> }
>
> If I change the query into a SELECT using the same WHERE clause, I get many results. If I edit the update to put the WHERE contents into a subquery, it no longer returns immediately. It seems to be getting results, but I am not sure because it eventually runs out of memory before completing. This is the modified update with subquery:
>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> PREFIX owl: <http://www.w3.org/2002/07/owl#>
> PREFIX dc: <http://purl.org/dc/elements/1.1/>
> PREFIX ps: <http://purl.org/phenoscape/vocab.owl#>
> PREFIX obo: <http://purl.obolibrary.org/obo/>
> WITH <http://kb.phenoscape.org/ic>
> INSERT {
>   ?annotation ps:reflexive_subClassOf ?subsumer .
> }
> WHERE {
>   SELECT DISTINCT ?annotation ?subsumer WHERE {
>     ?term ps:has_phenotypic_profile/rdf:type ?annotation .
>     ?annotation rdfs:subClassOf* ?subsumer .
>     FILTER(isIRI(?subsumer))
>   }
> }
>
> Should these two queries have the same results? If I am getting no data inserted in the first case, might I be hitting a bug? Based on a SELECT query, the WHERE clause definitely matches content in the database.
>
> Thank you,
> Jim
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming The Go Parallel Website, sponsored by Intel and developed in partnership with Slashdot Media, is your hub for all things parallel software development, from weekly thought leadership blogs to news, videos, case studies, tutorials and more. Take a look and join the conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Bigdata-developers mailing list
> Big...@li...
> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
|
From: Jim B. <ba...@ne...> - 2015-03-11 16:37:14
|
Hi,

I am having a confusing issue with SPARQL update and wanted to verify my expectations. Submitting this update using curl returns immediately and reports 0 mutations:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ps: <http://purl.org/phenoscape/vocab.owl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
WITH <http://kb.phenoscape.org/ic>
INSERT {
  ?annotation ps:reflexive_subClassOf ?subsumer .
}
WHERE {
  ?term ps:has_phenotypic_profile/rdf:type ?annotation .
  ?annotation rdfs:subClassOf* ?subsumer .
  FILTER(isIRI(?subsumer))
}

If I change the query into a SELECT using the same WHERE clause, I get many results. If I edit the update to put the WHERE contents into a subquery, it no longer returns immediately. It seems to be getting results, but I am not sure because it eventually runs out of memory before completing. This is the modified update with subquery:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ps: <http://purl.org/phenoscape/vocab.owl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
WITH <http://kb.phenoscape.org/ic>
INSERT {
  ?annotation ps:reflexive_subClassOf ?subsumer .
}
WHERE {
  SELECT DISTINCT ?annotation ?subsumer WHERE {
    ?term ps:has_phenotypic_profile/rdf:type ?annotation .
    ?annotation rdfs:subClassOf* ?subsumer .
    FILTER(isIRI(?subsumer))
  }
}

Should these two queries have the same results? If I am getting no data inserted in the first case, might I be hitting a bug? Based on a SELECT query, the WHERE clause definitely matches content in the database.

Thank you,
Jim
|
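The WHERE clause above uses the arbitrary-length path `rdfs:subClassOf*`, i.e. for each annotation it enumerates the reflexive-transitive closure of subClassOf, which is what the update materializes as `ps:reflexive_subClassOf` triples. A minimal plain-Python sketch of that closure, over an assumed toy hierarchy (class names invented; the real data is the Phenoscape KB):

```python
from collections import deque

# Assumed toy hierarchy: class -> set of direct superclasses.
sub_class_of = {
    "A": {"B"},
    "B": {"C"},
    "C": set(),
}

def subsumers(cls):
    """All ?subsumer with cls rdfs:subClassOf* ?subsumer: the class
    itself (the * operator is reflexive) plus everything reachable
    over subClassOf edges, found by breadth-first search."""
    seen = {cls}
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

print(sorted(subsumers("A")))  # ['A', 'B', 'C']
```

Enumerating this closure for every annotation is why the update's result set, and hence its memory footprint, can grow so much larger than the raw triple count.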
From: Peter A. <ans...@gm...> - 2015-03-08 20:14:39
|
Hi Brad,

Congratulations, wikidata is a very useful resource, thanks for supporting it!

Cheers,
Peter

On 8 March 2015 at 01:06, Brad Bebee <be...@sy...> wrote:
> All,
>
> In case you haven't seen, Blazegraph has been selected to be the graph database platform for the Wikidata query service: https://lists.wikimedia.org/pipermail/wikidata-tech/2015-March/000740.html. We beat out Titan, Neo4j, GraphX, etc in their evaluation.
>
> We're super-psyched to be working with Wikidata and think it will be a great thing for Wikidata and Blazegraph. There's a spreadsheet link in the selection message, which has quite an interesting comparison of graph database platforms.
>
> Cheers, --Brad
>
> --
> _______________
> Brad Bebee
> Managing Partner
> SYSTAP, LLC
> e: be...@sy...
> m: 202.642.7961
> f: 571.367.5000
> w: www.systap.com
>
> Blazegraph™ <http://www.blazegraph.com> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics.
>
> CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.
|