From: Stefano E. C. <cam...@ya...> - 2004-09-16 19:10:57

Laurian Gridinoc wrote:

> hmmm, Gmail is making problems...
>
> about having URIs for statements, quads, named graphs; this doc. was
> released today:
> http://www.wiwiss.fu-berlin.de/suhl/bizer/ng4j/
> Interesting.

Thanks, I find it interesting too. Using it for Platypus?
In our project we ran into problems that NG4J seems able to solve.
Unfortunately it seems to be at a very early stage....

"NG4J is an *experimental implementation* of the new syntaxes
(TriX <http://swdev.nokia.com/trix/TriX.html>,
TriG <http://www.wiwiss.fu-berlin.de/suhl/bizer/TriG/>) and query language
(TriQL <http://www.wiwiss.fu-berlin.de/suhl/bizer/TriQL/>) developed within
the Semantic Web Interest Group. The whole implementation and all
interfaces might change in future versions... "

I'd like to try it, ....

Bye Bye
Campa

> Laur

From: Stefano C. <cam...@ya...> - 2004-09-17 07:13:38

Laurian Gridinoc wrote:

> On Wed, 15 Sep 2004 07:16:15 +0200, Paolo Castagna
>
>>>> ... we need a unique identifier for a statement
>>>> in the index and for reification.
>>>>   <subject uri> <predicate uri> <object uri>
>>>> Is there any problem if we use SHA1?
>>>>   sha1(<subject uri> <predicate uri> <object uri>)?
>>>
>>> Very OK with me, I mentioned it in a discussion about reification:
>>> http://lists.w3.org/Archives/Public/www-rdf-interest/2004Aug/0194.html
>>> but no one argued back with me :(
>>
>> So, it's decided: we'll use SHA1 ;)
>> I don't know how to say it in English... but:
>> "chi tace acconsente" [silence gives consent] :)

OK, I have added the following methods to the MetaManager class:

    /**
     * @return the triple ID: the Base64-encoded SHA-1 digest of the
     *         concatenated subject, predicate and object URIs
     */
    public static String getStmtId(final URI subject, final URI predicate,
            final URI object) throws NoSuchAlgorithmException {
        final String tbDigest = subject.toString() + predicate.toString()
                + object.toString();
        // Note: getBytes() without an argument uses the platform default charset.
        byte[] digest = getDigest(tbDigest.getBytes());
        Base64 base = new Base64();
        byte[] encoded = base.encode(digest);
        return new String(encoded);
    }

    /** @return the SHA-1 digest of the given bytes */
    private static byte[] getDigest(byte[] buffer)
            throws NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA1");
        sha1.update(buffer);
        return sha1.digest();
    }

So, we can calculate the ID of triples (tbd: literals).

I'd like to think of the ID as the URI of the triple. In fact, this is
what happens during reification.
In Platypus, when we create a reification we put the new resource (the
reified triple) in the namespace "reifications", so the URI is
something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
I consider this "reifications:_086yskjf" to be the ID/URI of the triple,
isn't it?

And if a triple isn't reified, what is its ID/URI?
Maybe:
    namespace:_SHA1 (subjURI, predicateURI, objectURI/Literal)
Or directly:
    SHA1 (subjURI, predicateURI, objectURI/Literal)

In my opinion we are creating a new RDF resource by naming it, so it could
be in a fixed, installation-dependent namespace.
I need this ID because I use it to "add" or "remove" triples from the
Lucene index.

Any ideas?

From: Laurian G. <la...@gm...> - 2004-09-17 12:31:02

On Fri, 17 Sep 2004 09:18:24 +0200, Stefano Campanini <cam...@ya...> wrote:
> >On Wed, 15 Sep 2004 07:16:15 +0200, Paolo Castagna
> >>>>... we need a unique identifier for a statement
> >>>>in the index and for reification.
> >>>> <subject uri> <predicate uri> <object uri>
> >>>>Is there any problem if we use SHA1?
> >>>> sha1(<subject uri> <predicate uri> <object uri>)?

> OK, I have added the following methods to the MetaManager class:
> /*
>  *@return the Triple ID
>  *
>  */
> public static String getStmtId(final URI subject, final URI
> predicate, final URI object) throws NoSuchAlgorithmException {
>     final String tbDigest =
>         subject.toString()+predicate.toString()+object.toString();
>     byte[] digest = getDigest(tbDigest.getBytes());
>     Base64 base = new Base64();
>     byte[] encoded = base.encode(digest);
>     final String result = new String(encoded);
>     return result;
> }
>
> So, we can calculate the ID of triples (tbd: literals).

Instead of the sha1 of the string concatenation:
    subject.toString()+predicate.toString()+object.toString()
I would propose the sha1 of the n-triple.toString() representing the
triple. I think this will include the optional xsd datatype and
language too, and would be easier to describe and interoperate with.

Consider:

<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .

sha1("http://www.grapefruit.rohttp://purl.org/dc/terms/1.0/titleGrapefruit")
doesn't look nice, but:
sha1("<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> \"Grapefruit\" .")
does. I would include the final "." too, for ease of script processing
if ever needed by us or by third parties.

Using N-Triples would also let us identify these statements by different URIs:
<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .
<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit"@en .
<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit"^^xsd:string .

> I'd like to think of the ID as the URI of the triple. In fact, this is
> what happens during reification.

I always thought of this sha1 result as a URI. I would use the
existing format for identifying resources by the sha1 sum of their
content, as used in P2P software:

urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
----------------^ base32 of sha1 sum of the content - in our case of
the n-triple

I am using com.bitzi.util.Base32 for base32:
    URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);

You proposed base64; I would go for base32. Theoretically we would then
be able to identify (in the future) individual RDF statements over P2P
networks :)

> In Platypus, when we create a reification we put the new resource (the
> reified triple) in the namespace "reifications", so the URI is
> something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
> I consider this "reifications:_086yskjf" to be the ID/URI of the triple,
> isn't it?

I wouldn't add semantics to the URI (in this case, a scheme for
identifying reified statements). If your sha1(statement) is in the
system, then it is reified.

> And if a triple isn't reified, what is its ID/URI?

The same, but if you don't have its sha1 ID in the index, you may assume
it is not reified, at least not in the system :)

It would be dangerous to change the ID of a statement just to mark
that it is reified, and it wouldn't be interoperable.

> Maybe:
> namespace:_SHA1 (subjURI, predicateURI, objectURI/Literal)
> Or directly:
> SHA1 (subjURI, predicateURI, objectURI/Literal)
> In my opinion we are creating a new RDF resource by naming it, so it could
> be in a fixed, installation-dependent namespace.
> I need this ID because I use it to "add" or "remove" triples from the
> Lucene index.

Cheers,
--
Laurian Gridinoc
Chief Developer
GRAPEFRUIT DESIGN
www.gd.ro

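
The proposal above (SHA-1 of the N-Triples line, Base32-encoded, wrapped in a urn:sha1: URI) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name TripleUrn and its methods are hypothetical, the small Base32 encoder stands in for com.bitzi.util.Base32 (which is not reproduced here), UTF-8 is an assumed byte encoding, and the ID depends on the exact N-Triples serialization, including whitespace.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: statement ID = urn:sha1:<Base32(SHA-1(N-Triples line))>. */
final class TripleUrn {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    /** Unpadded Base32 (RFC 4648 alphabet); a 20-byte SHA-1 digest yields 32 characters. */
    static String base32(byte[] data) {
        StringBuilder out = new StringBuilder((data.length * 8 + 4) / 5);
        int buffer = 0; // bit accumulator; only the low 'bits' bits are meaningful
        int bits = 0;
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 5) {
                out.append(ALPHABET.charAt((buffer >> (bits - 5)) & 0x1F));
                bits -= 5;
            }
        }
        if (bits > 0) { // left-align the remaining bits in a final 5-bit group
            out.append(ALPHABET.charAt((buffer << (5 - bits)) & 0x1F));
        }
        return out.toString();
    }

    /** Assumption: the N-Triples line is hashed as UTF-8 bytes, final " ." included. */
    static URI statementUrn(String nTriplesLine)
            throws NoSuchAlgorithmException, URISyntaxException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] digest = sha1.digest(nTriplesLine.getBytes(StandardCharsets.UTF_8));
        return new URI("urn", "sha1:" + base32(digest), null);
    }

    public static void main(String[] args) throws Exception {
        String sp = "<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> ";
        // The plain and the language-tagged literal get different IDs.
        System.out.println(statementUrn(sp + "\"Grapefruit\" ."));
        System.out.println(statementUrn(sp + "\"Grapefruit\"@en ."));
    }
}
```

Because the whole serialized line is hashed, the plain, @en and ^^xsd:string variants of the same literal each get their own URN, which the bare concatenation of the three toString() values would not guarantee.
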
From: Stefano E. C. <cam...@ya...> - 2004-09-17 16:31:41

Laurian Gridinoc wrote:

>On Fri, 17 Sep 2004 09:18:24 +0200, Stefano Campanini
><cam...@ya...> wrote:
>
>>>On Wed, 15 Sep 2004 07:16:15 +0200, Paolo Castagna
>>>
>>>>>>... we need a unique identifier for a statement
>>>>>>in the index and for reification.
>>>>>> <subject uri> <predicate uri> <object uri>
>>>>>>Is there any problem if we use SHA1?
>>>>>> sha1(<subject uri> <predicate uri> <object uri>)?
>>>>>>
>>OK, I have added the following methods to the MetaManager class:
>>/*
>> *@return the Triple ID
>> *
>> */
>> public static String getStmtId(final URI subject, final URI
>>predicate, final URI object) throws NoSuchAlgorithmException {
>>     final String tbDigest =
>>         subject.toString()+predicate.toString()+object.toString();
>>     byte[] digest = getDigest(tbDigest.getBytes());
>>     Base64 base = new Base64();
>>     byte[] encoded = base.encode(digest);
>>     final String result = new String(encoded);
>>     return result;
>> }
>>
>>So, we can calculate the ID of triples (tbd: literals).
>>
>
>Instead of the sha1 of the string concatenation:
>subject.toString()+predicate.toString()+object.toString()
>I would propose the sha1 of the n-triple.toString() representing the
>triple. I think this will include the optional xsd datatype and
>language too, and would be easier to describe and interoperate with.
>
>Consider:
>
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .
>
>sha1("http://www.grapefruit.rohttp://purl.org/dc/terms/1.0/titleGrapefruit")
>doesn't look nice, but:
>sha1("<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
>\"Grapefruit\" .") does. I would include the final "." too, for ease of
>script processing if ever needed by us or by third parties.
>
>Using N-Triples would also let us identify these statements by different URIs:
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> "Grapefruit" .
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
>"Grapefruit"@en .
><http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title>
>"Grapefruit"^^xsd:string .
>

OK, it is better to use n-triple.toString().

>>I'd like to think of the ID as the URI of the triple. In fact, this is
>>what happens during reification.
>>
>
>I always thought of this sha1 result as a URI. I would use the
>existing format for identifying resources by the sha1 sum of their
>content, as used in P2P software:
>
>urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
>----------------^ base32 of sha1 sum of the content - in our case of
>the n-triple
>
>I am using com.bitzi.util.Base32 for base32:
>URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);
>
>You proposed base64; I would go for base32. Theoretically we would then
>be able to identify (in the future) individual RDF statements over P2P
>networks :)
>

OK, I like it.

>>In Platypus, when we create a reification we put the new resource (the
>>reified triple) in the namespace "reifications", so the URI is
>>something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
>>I consider this "reifications:_086yskjf" to be the ID/URI of the triple,
>>isn't it?
>>
>
>I wouldn't add semantics to the URI (in this case, a scheme for
>identifying reified statements). If your sha1(statement) is in the
>system, then it is reified.
>
>>And if a triple isn't reified, what is its ID/URI?
>>
>
>The same, but if you don't have its sha1 ID in the index, you may assume
>it is not reified, at least not in the system :)
>
>It would be dangerous to change the ID of a statement just to mark
>that it is reified, and it wouldn't be interoperable.
>

Yes...
Following the Platypus Way, we have to allow the user to create a page
(XHTML) that describes the statement, as for other resources. So we need a
namespace, or a better way (we have to invent it), to save the "index.rdf"
and the "index.html" describing it, as for other Wiki resources.
Do you think it is useful for the user to reify a triple and describe
it using a wiki page?

I propose this:
* All Wiki triples are implicitly reified (only in the index), and
  their URI follows your proposal (new URI("urn", "sha1:" +
  Base32.encode(sha1digest), null)) ....
* If a user wants to create a wiki page that describes a triple, we have
  to save the RDF reification (s, o, p etc...) and the page. Where can we
  save the page? For the moment they are saved under the "reification"
  namespace/dir.

We add some semantics to the triple .. yes ... but ... the user creates
it intentionally, using one particular Wiki installation. I find
adding these semantics not good, but not so bad.
So, I suggest letting the user save it in any other namespace by
selecting it.

What do you think, Laurian?

>>Maybe:
>>namespace:_SHA1 (subjURI, predicateURI, objectURI/Literal)
>>Or directly:
>>SHA1 (subjURI, predicateURI, objectURI/Literal)
>>In my opinion we are creating a new RDF resource by naming it, so it could
>>be in a fixed, installation-dependent namespace.
>>I need this ID because I use it to "add" or "remove" triples from the
>>Lucene index.
>>
>
>Cheers,
>

These days I am starting to work on unique identifiers for triples, in
the ways you suggested above, for use in the Lucene index.

Bye Bye

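
The two-level proposal above (every triple implicitly identified in the index by its URN, with a wiki page stored only when a user explicitly describes the triple) might be modelled like this. It is a sketch under assumptions, not Platypus code: StatementIndex and its methods are hypothetical names, plain HashMaps stand in for the Lucene index, and the page path shown is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the statement index (not the Lucene API). */
final class StatementIndex {
    private final Map<String, String> statementById = new HashMap<>();
    private final Map<String, String> pageById = new HashMap<>();

    /** Adds a statement under its ID; returns false if the ID was already present. */
    boolean add(String id, String nTriplesLine) {
        return statementById.putIfAbsent(id, nTriplesLine) == null;
    }

    /** Removes a statement and any wiki page attached to it. */
    boolean remove(String id) {
        pageById.remove(id);
        return statementById.remove(id) != null;
    }

    /** True when the statement ID is known to this installation. */
    boolean contains(String id) {
        return statementById.containsKey(id);
    }

    /** Records where the user-written page describing the statement is stored. */
    void attachPage(String id, String pagePath) {
        if (!contains(id)) {
            throw new IllegalArgumentException("unknown statement: " + id);
        }
        pageById.put(id, pagePath);
    }

    /** Returns the page path, or null when the statement has no page. */
    String pageFor(String id) {
        return pageById.get(id);
    }

    public static void main(String[] args) {
        StatementIndex index = new StatementIndex();
        String id = "urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB"; // example ID from the thread
        index.add(id, "<http://www.grapefruit.ro> <http://purl.org/dc/terms/1.0/title> \"Grapefruit\" .");
        index.attachPage(id, "reification/XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB/index.html"); // made-up path
        System.out.println(index.contains(id) + " " + index.pageFor(id));
    }
}
```

The statement ID never changes when a page is attached, so the identifier itself carries no "is reified" semantics; that information lives only in the second map.
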
From: Laurian G. <la...@gm...> - 2004-09-20 09:53:20

Hello,

On Fri, 17 Sep 2004 18:35:46 +0200, Stefano Emilio Campanini
[...]
> >>In Platypus, when we create a reification we put the new resource (the
> >>reified triple) in the namespace "reifications", so the URI is
> >>something like this: reifications:_086yskjf (reification+_CRC(s,p,o)).
> >>I consider this "reifications:_086yskjf" to be the ID/URI of the triple,
> >>isn't it?
> >I wouldn't add semantics to the URI (in this case, a scheme for
> >identifying reified statements). If your sha1(statement) is in the
> >system, then it is reified.
> >>And if a triple isn't reified, what is its ID/URI?
> >The same, but if you don't have its sha1 ID in the index, you may assume
> >it is not reified, at least not in the system :)
> >It would be dangerous to change the ID of a statement just to mark
> >that it is reified, and it wouldn't be interoperable.
> Yes...
> Following the Platypus Way, we have to allow the user to create a page
> (XHTML) that describes the statement, as for other resources. So we need a
> namespace, or a better way (we have to invent it), to save the "index.rdf"
> and the "index.html" describing it, as for other Wiki resources.
> Do you think it is useful for the user to reify a triple and describe
> it using a wiki page?

Yes, we may consider the wiki page as a human-readable explanation of
the reification...

> I propose this:
> * All Wiki triples are implicitly reified (only in the index), and
> their URI follows your proposal (new URI("urn", "sha1:" +
> Base32.encode(sha1digest), null)) ....

You mean they are implicitly identified by the proposed URI scheme...

> * If a user wants to create a wiki page that describes a triple, we have
> to save the RDF reification (s, o, p etc...) and the page. Where can we
> save the page? For the moment they are saved under the "reification"
> namespace/dir.

That would be OK, for the sake of microcontent.

> We add some semantics to the triple .. yes ... but ... the user creates
> it intentionally, using one particular Wiki installation. I find
> adding these semantics not good, but not so bad.
> So, I suggest letting the user save it in any other namespace by
> selecting it.

hmmm, why not `under' the current resource? :)
For /foo/bar, we would have the reification wiki page under /foo/bar/reified?

Cheers,
--
Laurian Gridinoc
Chief Developer
GRAPEFRUIT DESIGN
www.gd.ro

From: Stefano C. <cam...@ya...> - 2004-09-22 11:56:09

Laurian Gridinoc wrote:

>I always thought of this sha1 result as a URI. I would use the
>existing format for identifying resources by the sha1 sum of their
>content, as used in P2P software:
>
>urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
>----------------^ base32 of sha1 sum of the content - in our case of
>the n-triple
>
>I am using com.bitzi.util.Base32 for base32:
>URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);
>
>You proposed base64; I would go for base32. Theoretically we would then
>be able to identify (in the future) individual RDF statements over P2P
>networks :)
>

Hi Laurian,

Where can I find an up-to-date copy of the com.bitzi.util.Base32 class?
I found it, but only in the context of other projects. I'd like to get the
original source.
It would be better to find LGPL or more freely licensed code. That way,
in the future, we can switch Platypus to a "more free" open source license.

Thanks in advance.
Stefano

From: Laurian G. <la...@gm...> - 2004-09-22 13:05:38

On Wed, 22 Sep 2004 14:01:03 +0200, Stefano Campanini <cam...@ya...> wrote:
> Laurian Gridinoc wrote:
>
> >I always thought of this sha1 result as a URI. I would use the
> >existing format for identifying resources by the sha1 sum of their
> >content, as used in P2P software:
> >
> >urn:sha1:XSQCZ3UK3PPW6Y6HOVLTIX2QFMZ3TFFB
> >----------------^ base32 of sha1 sum of the content - in our case of
> >the n-triple
> >
> >I am using com.bitzi.util.Base32 for base32:
> >URI urn = new URI("urn", "sha1:" + Base32.encode(sha1digest), null);
> >
> >You proposed base64; I would go for base32. Theoretically we would then
> >be able to identify (in the future) individual RDF statements over P2P
> >networks :)
> >
>
> Hi Laurian,
>
> Where can I find an up-to-date copy of the com.bitzi.util.Base32 class?
> I found it, but only in the context of other projects. I'd like to get the
> original source.
> It would be better to find LGPL or more freely licensed code. That way,
> in the future, we can switch Platypus to a "more free" open source license.
>
> Thanks in advance.
> Stefano
>

It is in the public domain; I'm digging now for the original...
http://bitzi.com/developer/code

The Java code is derived from bitcollider, which is in the public domain:
http://sourceforge.net/projects/bitcollider

and you have the Java version in CVS:
http://cvs.sourceforge.net/viewcvs.py/bitcollider/jbitprint/src/com/bitzi/util/Base32.java?rev=1.1&view=auto
labeled public domain.

Cheers,
--
Laurian Gridinoc
Chief Developer
GRAPEFRUIT DESIGN
www.gd.ro