|
From: Stefano E. C. <cam...@ya...> - 2004-09-16 19:10:57
|
Laurian Gridinoc wrote:
>hmmm, Gmail is making problems...
>
>about having URIs for statements, quads, named graphs; this doc. was
>released today:
>http://www.wiwiss.fu-berlin.de/suhl/bizer/ng4j/
>Interesting.
>
Thanks, I find it interesting too. Using it for Platypus?
In our project we ran into problems that NG4J seems able to solve.
Unfortunately, it appears to be at a very early stage:
"NG4J is an *experimental implementation* of the new syntaxes (TriX
<http://swdev.nokia.com/trix/TriX.html>, TriG
<http://www.wiwiss.fu-berlin.de/suhl/bizer/TriG/>) and query language
(TriQL <http://www.wiwiss.fu-berlin.de/suhl/bizer/TriQL/>) developed
within the Semantic Web Interest Group. The whole implementation and all
interfaces might change in future versions..."
I'd like to try it...
Bye Bye
Campa
>Laur
|
From: Stefano C. <cam...@ya...> - 2004-09-17 07:13:38
|
Laurian Gridinoc wrote:
>On Wed, 15 Sep 2004 07:16:15 +0200, Paolo Castagna
>
>>>>... we need an unique identifier for a statement
>>>>in the index and for reification.
>>>>   <subject uri> <predicate uri> <object uri>
>>>>Is there any problem if we use SHA1?
>>>>   sha1(<subject uri> <predicate uri> <object uri>)?
>>>>
>>>Very OK with me, I mentioned it in a discussion about reification:
>>>http://lists.w3.org/Archives/Public/www-rdf-interest/2004Aug/0194.html
>>>but no one argued back with me :(
>>>
>>So, it's decided: we'll use SHA1 ;)
>>I don't know how to tell in english... but:
>>"chi tace acconsente" ("silence gives consent") :)
>>
OK, I have added the following methods to the MetaManager class:

    /**
     * @return the triple ID
     */
    public static String getStmtId(final URI subject, final URI predicate,
            final URI object) throws NoSuchAlgorithmException {
        final String tbDigest =
                subject.toString() + predicate.toString() + object.toString();
        byte[] digest = getDigest(tbDigest.getBytes());
        Base64 base = new Base64();
        byte[] encoded = base.encode(digest);
        final String result = new String(encoded);
        return result;
    }

    private static byte[] getDigest(byte[] buffer)
            throws NoSuchAlgorithmException {
        // The algorithm is SHA1, so the variable is named accordingly.
        MessageDigest sha1 = MessageDigest.getInstance("SHA1");
        sha1.update(buffer);
        return sha1.digest();
    }

So we can calculate the ID of triples (tbd: literals).
I like to think of the ID as the URI of the triple. In fact, this is
what happens during reification.
In Platypus, when we create a reification we put the new resource (the
reified triple) in the "reifications" namespace, so the URI is
something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
I consider this "reifications:_086yskjf" to be the ID/URI of the triple;
is that right?
If a triple isn't reified, what is its ID/URI?
Maybe:
    namespace:_SHA1 (subjURI, predicateURI, objectURI/Literal)
Or directly:
    SHA1 (subjURI, predicateURI, objectURI/Literal)
In my opinion we are creating a new RDF resource by naming it, so it could
be in a fixed "installation-dependent" namespace.
I need this ID because I use it to "add" or "remove" triples from the
Lucene index.
Any ideas?
|
From: Laurian G. <la...@gm...> - 2004-09-17 12:31:02
|
On Fri, 17 Sep 2004 09:18:24 +0200, Stefano Campanini
<cam...@ya...> wrote:
> >On Wed, 15 Sep 2004 07:16:15 +0200, Paolo Castagna
> >>>>... we need an unique identifier for a statement
> >>>>in the index and for reification.
> >>>> <subject uri> <predicate uri> <object uri>
> >>>>Is there any problem if we use SHA1?
> >>>> sha1(<subject uri> <predicate uri> <object uri>)?
> Ok, I have added the follow methods in MetaManager class:
> /*
> *@return the Triple ID
> *
> */
> public static String getStmtId(final URI subject, final URI
> predicate, final URI object) throws NoSuchAlgorithmException {
> final String tbDigest =
> subject.toString()+predicate.toString()+object.toString();
> byte[] digest = getDigest(tbDigest.getBytes());
> Base64 base = new Base64();
> byte[] encoded = base.encode(digest);
> final String result = new String(encoded);
> return result;
> }
>
> So, we can calculate the ID of triples (tbd: literals ).
Instead of the sha1 of the string concatenation:
subject.toString()+predicate.toString()+object.toString()
I would propose the sha1 of the n-triple.toString() representing the
triple. I think this will also include the optional xsd datatype and
language, and it would be easier to describe and interoperate with.
Consider:
<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .
sha1("http://www.grapefruit.rohttp://purl.org/dc/terms/1.0/titleGrapefruit")
doesn't look nice, but:
sha1("<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
\"Grapefruit\" .") does. I would include the final . too, for ease of
script processing if ever needed by us or by third parties.
Using n-triples would also let us identify these statements by different URIs:
<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .
<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
"Grapefruit"@en .
<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
"Grapefruit"^^xsd:string .
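[Editor's sketch] The point above can be tried out with a few lines of
Java. This is only a minimal sketch, not the Platypus code: the class
name NTripleDigest and the helper sha1Hex are made up for illustration,
UTF-8 is an assumed encoding, and the example.org strings are
placeholders. It also shows one further reason to prefer the serialized
form, which is an observation added here rather than something stated in
the thread: plain concatenation has no delimiters, so two different
splits of the same characters produce the same input to the hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
class NTripleDigest {

    /** Returns the SHA-1 digest of one N-Triples line as lowercase hex. */
    static String sha1Hex(String nTripleLine) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            // Assumption: the line is hashed as UTF-8 bytes.
            byte[] digest = sha1.digest(nTripleLine.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sp = "<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> ";
        String plain = sp + "\"Grapefruit\" .";
        String english = sp + "\"Grapefruit\"@en .";

        // The two lines differ only in the language tag, yet get distinct IDs.
        System.out.println(sha1Hex(plain));
        System.out.println(sha1Hex(english));

        // Undelimited concatenation: different parts, identical hash input.
        String first = "http://example.org/a" + "bc";
        String second = "http://example.org/ab" + "c";
        System.out.println(first.equals(second)); // true
    }
}
```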
> I'd like thinking at the ID as a URI of the triple. In fact, this is
> what happen during the reification.
I have always thought of this sha1 result as a URI. I would use the
existing format for identifying resources by the sha1 sum of their
content, as used in P2P software:
urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
----------------^ base32 of the sha1 sum of the content; in our case, of
the n-triple
For base32 I am using com.bitzi.util.Base32:
URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);
You proposed base64; I would go for base32, so that in theory we would be
able to identify (in the future) individual RDF statements over P2P
networks :)
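[Editor's sketch] For readers without the Bitzi class at hand, the URN
construction above can be sketched in self-contained Java. The class
name StatementUrn and the methods base32 and urnFor are hypothetical;
the small encoder below is a stand-in written for this example (RFC 4648
alphabet, no padding) and is not the com.bitzi.util.Base32 source.
UTF-8 is an assumed encoding for the n-triple line.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
class StatementUrn {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    /** Unpadded Base32 using the RFC 4648 alphabet. */
    static String base32(byte[] data) {
        StringBuilder out = new StringBuilder();
        int buffer = 0; // accumulates input bits
        int bits = 0;   // number of bits in buffer not yet emitted
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 5) {
                out.append(ALPHABET.charAt((buffer >> (bits - 5)) & 0x1F));
                bits -= 5;
            }
        }
        if (bits > 0) {
            // Left-align the leftover bits in a final 5-bit group.
            out.append(ALPHABET.charAt((buffer << (5 - bits)) & 0x1F));
        }
        return out.toString();
    }

    /** Builds urn:sha1:<base32 of the SHA-1 of the given n-triple line>. */
    static URI urnFor(String nTripleLine) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(nTripleLine.getBytes(StandardCharsets.UTF_8));
            return new URI("urn", "sha1:" + base32(digest), null);
        } catch (NoSuchAlgorithmException | URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String line = "<http://www.grapefruit.ro> "
                + "<http://purl.org/dc/terms/1.0/title> \"Grapefruit\" .";
        System.out.println(urnFor(line));
    }
}
```

A SHA-1 digest is 160 bits, which is exactly 32 base32 characters, so
no padding is ever needed and the identifier contains only upper-case
letters and the digits 2 to 7.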
> In Platypus, when we create a refication we put the new resource (the
> refified triple) in the namespcace "reifications", so the URI is
> something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
> I consider this " reifications:_086yskjf" as the ID/URI of the triple,
> isn't it?
I wouldn't add semantics to the URI, in this case by using a scheme for
identifying reified statements. If your sha1(statement) is in the
system, then it is reified.
> If a triple isn't reificated? what is it's ID/URI?
The same, but if you don't have its sha1 ID in the index, you may assume
it is not reified, at least not in the system :)
It would be dangerous to change the ID of a statement just to mark
that it is reified, and it wouldn't be interoperable.
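[Editor's sketch] The idea that "reified" is a property of the index
rather than of the identifier can be illustrated as follows. This is an
assumption-laden sketch: ReificationIndex is a hypothetical class, and a
plain in-memory set stands in for the Lucene index mentioned in the
thread. The statement ID never changes; only its presence in the set
does.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the real index, for illustration only.
class ReificationIndex {
    private final Set<String> reifiedIds = new HashSet<>();

    /** Records that the statement with this ID has been reified. */
    void markReified(String statementId) {
        reifiedIds.add(statementId);
    }

    /** Removes the reification record; the statement ID itself is untouched. */
    void unmark(String statementId) {
        reifiedIds.remove(statementId);
    }

    /** A statement counts as reified only if its ID is present in the index. */
    boolean isReified(String statementId) {
        return reifiedIds.contains(statementId);
    }

    public static void main(String[] args) {
        ReificationIndex index = new ReificationIndex();
        String id = "urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB";
        System.out.println(index.isReified(id)); // false
        index.markReified(id);
        System.out.println(index.isReified(id)); // true
    }
}
```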
> May be:
> namespace:_SHA1 (subjURI, predicateURI, objectURI/Literal)
> Or directly
> SHA1 (subjURI, predicateURI, objectURI/Literal)
> In my opinion we are creating a new RDF resource naming it, so it could
> be in a fixed "installation depending" namespace.
> I need this ID, becouse I use it to "add" or "remove" triples from the
> Lucene index.
Cheers,
--
Laurian Gridinoc
Chief Developer
GRAPEFRUIT DESIGN
www.gd.ro
|
|
From: Stefano E. C. <cam...@ya...> - 2004-09-17 16:31:41
|
Laurian Gridinoc wrote:
>On Fri, 17 Sep 2004 09:18:24 +0200, Stefano Campanini
><cam...@ya...> wrote:
>
>
>>>On Wed, 15 Sep 2004 07:16:15 +0200, Paolo Castagna
>>>
>>>
>>>>>>... we need an unique identifier for a statement
>>>>>>in the index and for reification.
>>>>>> <subject uri> <predicate uri> <object uri>
>>>>>>Is there any problem if we use SHA1?
>>>>>> sha1(<subject uri> <predicate uri> <object uri>)?
>>>>>>
>>>>>>
>>Ok, I have added the follow methods in MetaManager class:
>>/*
>> *@return the Triple ID
>> *
>> */
>> public static String getStmtId(final URI subject, final URI
>>predicate, final URI object) throws NoSuchAlgorithmException {
>> final String tbDigest =
>>subject.toString()+predicate.toString()+object.toString();
>> byte[] digest = getDigest(tbDigest.getBytes());
>> Base64 base = new Base64();
>> byte[] encoded = base.encode(digest);
>> final String result = new String(encoded);
>> return result;
>> }
>>
>>So, we can calculate the ID of triples (tbd: literals ).
>>
>>
>
>Instead of sha1 of strings concatenation:
>subject.toString()+predicate.toString()+object.toString()
>I would propose sha1 of the n-triple.toString() representing the
>triple, I think this will include the optional xsd datatype and
>language too, and would be easyer to describe/interoperate.
>
>Consider:
>
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .
>
>sha1("http://www.grapefruit.rohttp://purl.org/dc/terms/1.0/titleGrapefruit")
>doesn't look nice but:
>sha1("<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
>\"Grapefruit\" .") does, I would include the final . too, for ease of
>script processing if ever needed by us or by third parties.
>
>also using n-triples would allow to identify by different URIs these statements:
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
>"Grapefruit"@en .
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
>"Grapefruit"^^xsd:string .
>
>
OK, it is better to use n-triple.toString().
>
>
>>I'd like thinking at the ID as a URI of the triple. In fact, this is
>>what happen during the reification.
>>
>>
>
>I always thought at this sha1 result as an URI, I would use the
>already used format of identification of resources by sha1 content sum
>used in P2P software:
>
>urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
>----------------^ base32 of sha1 sum of the content - in our case of
>the n-triple
>
>I using for base32: com.bitzi.util.Base32:
>URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);
>
>You proposed base64, I would go for base32, theoretically we would be
>able to identify (in the future) individual RDF statements over P2P
>networks :)
>
>
Ok, I like it.
>
>
>>In Platypus, when we create a refication we put the new resource (the
>>refified triple) in the namespcace "reifications", so the URI is
>>something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
>>I consider this " reifications:_086yskjf" as the ID/URI of the triple,
>>isn't it?
>>
>>
>
>I won't add semantics to the URI, in this case using a schema for
>identifying reified statements, if your sha1(statement) is in the
>system, then is reified.
>
>
>
>>If a triple isn't reificated? what is it's ID/URI?
>>
>>
>
>same, but if you don;t have its sha1 ID in the index, you may assume
>is not reified, at least not in the system :)
>
>It would be dangerous to change the ID of a statement just to mark
>that is reified, and won't be interoperable.
>
>
Yes...
Following the Platypus Way, we have to allow creating a page (XHTML)
that describes the statement, as for other resources. So we need a
namespace, or a better way (we have to invent it), to save the "index.rdf"
and the "index.html" describing it, as with other Wiki resources.
Do you think it is useful for the user to reify a triple and describe
it using a wiki page?
I propose the following:
* All Wiki triples are implicitly reified (only in the index), and
the URI of each is as in your proposal (new URI("urn", "sha1:" +
Base32.encode(sha1digest), null)) ...
* If a user wants to create a wiki page that describes a triple, we have
to save the RDF reification (s, o, p etc...) and the page. Where can we
save the page? For the moment they are saved under the "reification"
namespace/dir.
We add some semantics to the triple... yes... but... the user creates
it intentionally, using one particular Wiki installation. I find
adding these semantics not good, but not so bad either.
So I suggest allowing the user to save it in any other namespace
of their choosing.
What do you think, Laurian?
>>May be:
>>namespace:_SHA1 (subjURI, predicateURI, objectURI/Literal)
>>Or directly
>>SHA1 (subjURI, predicateURI, objectURI/Literal)
>>In my opinion we are creating a new RDF resource naming it, so it could
>>be in a fixed "installation depending" namespace.
>>I need this ID, becouse I use it to "add" or "remove" triples from the
>>Lucene index.
>>
>>
>
>Cheers,
>
>
These days I am starting to work on unique identifiers for triples, in the
ways you suggested above, to use them in the Lucene index.
Bye Bye
|
|
From: Laurian G. <la...@gm...> - 2004-09-20 09:53:20
|
Hello,
On Fri, 17 Sep 2004 18:35:46 +0200, Stefano Emilio Campanini
[...]
> >>In Platypus, when we create a refication we put the new resource (the
> >>refified triple) in the namespcace "reifications", so the URI is
> >>something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
> >>I consider this " reifications:_086yskjf" as the ID/URI of the triple,
> >>isn't it?
> >I won't add semantics to the URI, in this case using a schema for
> >identifying reified statements, if your sha1(statement) is in the
> >system, then is reified.
> >>If a triple isn't reificated? what is it's ID/URI?
> >same, but if you don;t have its sha1 ID in the index, you may assume
> >is not reified, at least not in the system :)
> >It would be dangerous to change the ID of a statement just to mark
> >that is reified, and won't be interoperable.
> Yes...
> Following the Playtipus Way we have to permit to create a page (XHTM)
> that describe the statement as for others resources. So, we need a
> namespace or a better way (We have to invent it) to save the "index.rdf"
> and the "index.html" describing it as other Wiki resources.
> Do you think it is useful for the user reificate a triple and describe
> it using a wiki page?
yes, we may consider the wiki page as a human-readable explanation of
the reification...
> I propose that way:
> * All Wiki triples are in an implicit way reificated (only in the index)
> and its URI is as your proposal (new URI("urn", "sha1:" +
> Base32.encode(sha1digest), null) ....
You mean they are implicitly identified by the proposed URI scheme...
> * If a user wants to create a wiki page that describe a triple we have
> to save RDF reification (s, o, p etc...) and the page. Where we can
> save the page? For the moment they are saved under "reification"
> namespace/dir.
Would be ok, for the sake of microcontent.
> We add some semantics to the triple .. yes ... but ... the user create
> it intentionally, using the one determined Wiki Installation. I found
> the adding of this semantics not well but not so bad.
> So, I suggest to permit the user to save it in any other namespace
> selecting it.
hmmm, why not `under' the current resource? :)
for /foo/bar we would have the reification wiki page under /foo/bar/reified?
Cheers,
--
Laurian Gridinoc
Chief Developer
GRAPEFRUIT DESIGN
www.gd.ro
|
|
From: Stefano C. <cam...@ya...> - 2004-09-22 11:56:09
|
Laurian Gridinoc wrote:
>I always thought at this sha1 result as an URI, I would use the
>already used format of identification of resources by sha1 content sum
>used in P2P software:
>
>urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
>----------------^ base32 of sha1 sum of the content - in our case of
>the n-triple
>
>I using for base32: com.bitzi.util.Base32:
>URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);
>
>You proposed base64, I would go for base32, theoretically we would be
>able to identify (in the future) individual RDF statements over P2P
>networks :)
>
>
Hi Laurian,
Where can I find an up-to-date copy of the com.bitzi.util.Base32 class?
I found it, but only in the context of other projects. I'd like to get the
original source.
It would be better to find code under the LGPL or a freer license. That way,
in the future, we can change to a "more free open source" Platypus license.
Thanks in advance.
Stefano
|
|
From: Laurian G. <la...@gm...> - 2004-09-22 13:05:38
|
On Wed, 22 Sep 2004 14:01:03 +0200, Stefano Campanini
<cam...@ya...> wrote:
> Laurian Gridinoc wrote:
>
> >I always thought at this sha1 result as an URI, I would use the
> >already used format of identification of resources by sha1 content sum
> >used in P2P software:
> >
> >urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
> >----------------^ base32 of sha1 sum of the content - in our case of
> >the n-triple
> >
> >I using for base32: com.bitzi.util.Base32:
> >URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);
> >
> >You proposed base64, I would go for base32, theoretically we would be
> >able to identify (in the future) individual RDF statements over P2P
> >networks :)
> >
> >
> Hi Laurian,
>
> Where I can find an up-to-date copy of the com.bitzi.util.Base32 class ?
> I found it, but in the context of other projects. I'd like to get the
> original source.
> It is better to find a LGPL o more free licensed code. In this manner,
> in the future, we can change to" a more free open source" Platypus license.
>
> Thanks in advance.
> Stefano
>
It is in the public domain; I'm digging now for the original...
http://bitzi.com/developer/code
The Java code is derived from bitcollider, which is in the public domain:
http://sourceforge.net/projects/bitcollider
and you have the Java version in CVS:
http://cvs.sourceforge.net/viewcvs.py/bitcollider/jbitprint/src/com/bitzi/util/Base32.java?rev=1.1&view=auto
labeled public domain
Cheers,
--
Laurian Gridinoc
Chief Developer
GRAPEFRUIT DESIGN
www.gd.ro
|