Here is a plausible path to a DL integration. There was an interesting talk that touched on this at CSHALS.
See http://trac.bigdata.com/ticket/824 (Examine owlet/ELK integration)
Bryan
Hi Bryan,
I'm glad you enjoyed Hilmar's talk. I had considered implementing owlet as a custom SERVICE as described here http://wiki.bigdata.com/wiki/index.php/FederatedQuery#Custom_Services. I imagine the user specifying a graph in the config, from which the ontology could be loaded into memory. It should also provide a choice of reasoner, such as ELK, Hermit, JFact, etc.
I think it would be more efficient than the current version, since you wouldn't have to generate the filter, serialize the query, send it to another endpoint, and parse. But the nice thing about the current version is that it can work with any triplestore and any OWL API reasoner.
Alternatively I thought about "inverting" the current version to run as a server, so that you could call it as a remote SERVICE from a query to Bigdata. I wrote a little more about that here: https://github.com/phenoscape/owlet/wiki/Further-development-of-owlet
We have found owlet useful because we have very large, complex ontologies which we don't really use to auto-classify instances. The instance data is linked to ontology terms, but queries may involve complex DL descriptions taking advantage of the knowledge in the ontology.
Last edit: Jim Balhoff 2014-02-28
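The configuration Jim describes (an ontology graph plus a choice of reasoner) could start from a simple name-to-factory dispatch. The sketch below is purely illustrative: the class, the config key mentioned in the comment, and the string results are invented stand-ins; in a real integration the registry would return OWL API `OWLReasonerFactory` instances such as ELK's `ElkReasonerFactory`.

```java
import java.util.Map;
import java.util.function.Supplier;

public class ReasonerRegistry {
    // Maps a reasoner name from a hypothetical service config
    // (e.g. owlet.reasoner=elk) to a factory. The string results stand in
    // for OWL API OWLReasonerFactory instances so the sketch stays
    // self-contained; the ontology graph would be loaded separately.
    private static final Map<String, Supplier<String>> FACTORIES = Map.of(
            "elk", () -> "ElkReasonerFactory",
            "hermit", () -> "org.semanticweb.HermiT.ReasonerFactory",
            "jfact", () -> "JFactFactory");

    public static String create(String configuredName) {
        Supplier<String> factory = FACTORIES.get(configuredName.toLowerCase());
        if (factory == null) {
            throw new IllegalArgumentException("Unknown reasoner: " + configuredName);
        }
        return factory.get();
    }
}
```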
Jim, I found the approach quite exciting and I would very much like to see this feature available out of the box with bigdata. I spoke with Hilmar about visiting at his location and having a conversation about how best to proceed.
We do a lot of things through the SERVICE mechanism and it does offer quite a bit of flexibility, including the ability to hook and leverage updates. It could be used either to embed a reasoner or to reach out to a remote reasoner. The ASTOptimizer is another possible integration point.
I am curious how much RAM and CPU demand is imposed by the OWL reasoner. It could make sense to either embed the reasoner or have it be an external service, depending on the resource demand. I am also curious how the deployment model might interact with those decisions, for example if deployed with the HA cluster. It would be interesting to try some different configurations and observe the impact on query performance.
Bryan
The reasoner is pretty demanding of RAM and CPU, although ELK works extremely quickly compared to some of the others. But all the OWL API reasoners hold the ontology in memory. If you only want to query the ontology itself, I think for the vast majority of ontologies you wouldn't need too much RAM, maybe a couple of gigabytes (lots less for many ontologies). But in our case our dataset itself is made up of thousands of complex class expressions which we want to find via DL queries. Holding it all in ELK takes around 6-10 GB RAM at the moment.
An issue to keep in mind is that I don't think that any of the available reasoners can answer simultaneous queries. ELK is safe to be called from multiple threads, but it will block until it answers each query in order.
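The one-query-at-a-time behavior described above can be made explicit with a small wrapper that serializes calls to a shared reasoner. This is only a sketch: the class name is invented, and a `Function` stands in for the real reasoner so the example is self-contained; the same fair-lock pattern would apply around an actual ELK instance.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Sketch: serialize DL queries against a single shared reasoner.
// Callers on any thread are safe, but each query blocks until earlier
// queries finish, mirroring how a single ELK instance behaves.
public class DlQueryService {
    private final ReentrantLock lock = new ReentrantLock(true); // fair: answer in arrival order
    private final Function<String, List<String>> reasoner;

    public DlQueryService(Function<String, List<String>> reasoner) {
        this.reasoner = reasoner;
    }

    public List<String> query(String classExpression) {
        lock.lock();
        try {
            return reasoner.apply(classExpression);
        } finally {
            lock.unlock();
        }
    }
}
```

With an embedded reasoner, this lock is exactly where the high CPU use and blocking would show up inside the bigdata JVM.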
The memory is not that worrying since higher-memory machines could be used, but the high CPU utilization and single-threaded query answering both suggest that it might be better to host this outside of the bigdata JVM. In order to support concurrent query, we might want to start an "ELK farm" so you could at least load balance the DL queries against multiple ELK instances.
Bryan
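The "ELK farm" idea could begin as plain round-robin over several single-threaded reasoner instances. A minimal sketch, with hypothetical endpoint names; a smarter balancer could instead track which instances are currently busy.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: distribute DL queries across several ELK instances, each of
// which can only answer one query at a time. Round-robin is the
// simplest possible policy.
public class ElkFarm {
    private final List<String> endpoints; // e.g. URLs of reasoner servers
    private final AtomicInteger next = new AtomicInteger();

    public ElkFarm(List<String> endpoints) {
        this.endpoints = List.copyOf(endpoints);
    }

    public String pickEndpoint() {
        int i = Math.floorMod(next.getAndIncrement(), endpoints.size());
        return endpoints.get(i);
    }
}
```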
Hi there. Has any progress been made on this? I would really like to be able to use BlazeGraph with an ELK reasoner. I can't find any documentation on it, but maybe I'm missing something.
Hi Ramona - I have never created a tight integration. Currently in Phenoscape we preprocess queries containing Owlet expressions and then send the resulting query to Blazegraph. However, I think developing an integration as a Blazegraph custom service would be pretty straightforward; it would perhaps load the ontology from a graph at startup. If you get in touch directly I can describe a few different ways the two systems can interact. One of the limitations is that a single ELK instance can only process one query at a time.
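The preprocessing step mentioned above can be pictured as a textual rewrite: find an embedded class expression in the query, ask the reasoner which named terms satisfy it, and splice those answers back in before the query ever reaches Blazegraph. The `<<...>>` marker and the resolver below are simplified stand-ins for illustration, not owlet's actual grammar (owlet embeds Manchester-syntax expressions).

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the owlet idea: expand an embedded DL class expression into
// plain SPARQL using answers from an OWL reasoner, so that any SPARQL
// endpoint can run the rewritten query.
public class OwletSketch {
    private static final Pattern EXPRESSION = Pattern.compile("<<(.+?)>>");

    public static String expand(String sparql, Function<String, List<String>> resolver) {
        Matcher m = EXPRESSION.matcher(sparql);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Ask the reasoner which named terms satisfy the expression,
            // then inline them in place of the marker.
            List<String> terms = resolver.apply(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(String.join(" ", terms)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, `expand("VALUES ?x { <<part_of some head>> }", resolver)` would inline the IRIs of all matching terms, at the cost of one blocking reasoner call per embedded expression.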