From: Patrick B. <pat...@jo...> - 2010-07-28 00:52:25
|
Hello eXist Developers!

I need the ability to set up some pretty advanced metadata for documents and for binary objects (perhaps even collections?). I spoke briefly with Adam, and he seemed to think a module was the wrong path and that eXist needed metadata functionality built into the core (Adam, please feel free to correct me if you don't feel I've properly represented you). This is a job I'm more than willing to undertake, but I think it would be best if I were directed by the core developers, since I want to make sure anything I write benefits everyone and not just my needs. Also, I think I could probably build it much faster if given a few pointers.

My needs are basically just the ability to efficiently store and index a schema-validated piece of XML associated with other objects in the system. I was also thinking this might be a good time to consider how this could be extended into native XLink functionality.

I've done a little work in the code base: I built a small addition to the Unix permissions to allow more granular permissions (which I still intend to contribute, but I haven't had time to fully test it, and now I'm thinking of changing it as I work on the metadata stuff). But a quick "here's where I would start" would be very helpful. Also, any tips to ensure efficiency would be appreciated. Lastly, links to reading material are always great.

Let me know what y'all think!

Cheers,

--
Patrick Bosek
Jorsek Software
Cell (585) 820 9634
Office (585) 239 6060
Jorsek.com |
From: Dan M. <dan...@gm...> - 2010-07-28 02:10:22
|
This might be a good idea, but there are several design trade-offs here.

First of all, perhaps we should define a little more precisely what we mean by metadata. Each person seems to have a slightly different definition.

Next comes the question of what metadata to add by default, where to put it (in the XML files or not in the XML files), and whether we should allow users to change this on a collection-by-collection basis.

For example, the current system keeps track of the following for each resource and each collection:

owner name
group name
created date-time
last updated date-time

Note that collections have these items too.

Since we frequently need to sync much of our data to Subversion, we also add the user-id that created the document and the user-id that last modified the document. But we put all this metadata at the end of each XML file for "administered items", as the ISO-11179 metadata registry spec calls them.

There are many trade-offs between storing system metadata in the XML documents and outside them. We do try to centralize some of these functions using a common XQuery module, but we have more work to do here.

We cannot use the eXist built-in metadata for users and timestamps, since it gets changed when we restore from a backup and so does not reflect the user and time that actually changed the data.

We would also like to do what Subversion does and have a collection timestamp that reflects the most recent update of any resource inside that collection. This would be very useful for doing sync operations between eXist systems and systems like Subversion.

One option might be to create a collection configuration standard that would automatically add a <metadata> tag to the end of each element that needs this metadata and keep it up to date.

I could also see a lot of other useful data that might be unimportant to other people. Things like "validated-by" metadata that holds the XML Schema name and version, and the timestamp at which a document was checked against a specific version of an XML Schema. Or a "published" date-time that shows when the document was published to an external public web server and who authorized the document to be published.

I have also tried to use eXist triggers to keep this metadata up to date, but my work on triggers has not been very successful and I don't have the background to debug why they sometimes do not fire.

I hope that gives us some ideas of where this can go. My only real suggestion is that we use the existing collection configuration files to change what metadata is tracked and where it is stored. It might be interesting to try to do this with just triggers and in-document XML data as a starting point.

- Dan

--
Dan McCreary
Semantic Solutions Architect
office: (952) 931-9198
cell: (612) 986-1552 |
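Dan's wish for a collection timestamp that reflects the most recent update of any resource inside it can be sketched in a few lines. This is only an illustration in Python, not eXist code: the `Collection` class, its fields, and `effective_last_modified` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass
class Collection:
    """Hypothetical in-memory stand-in for a database collection."""
    name: str
    # resource name -> last-modified time of that resource
    resources: Dict[str, datetime] = field(default_factory=dict)
    # child collection name -> child collection
    children: Dict[str, "Collection"] = field(default_factory=dict)

    def effective_last_modified(self) -> Optional[datetime]:
        """Latest modification time of any resource at or below this collection."""
        candidates = list(self.resources.values())
        for child in self.children.values():
            child_time = child.effective_last_modified()
            if child_time is not None:
                candidates.append(child_time)
        # An empty collection tree has no timestamp to report.
        return max(candidates) if candidates else None


# Example data (made up for illustration).
root = Collection("db")
docs = Collection("docs")
root.children["docs"] = docs
root.resources["index.xml"] = datetime(2010, 7, 1, 9, 0)
docs.resources["a.xml"] = datetime(2010, 7, 27, 19, 52)
docs.resources["b.xml"] = datetime(2010, 7, 20, 12, 0)

latest = root.effective_last_modified()
```

A real implementation would more likely update the parent timestamps at write time rather than walk the tree on every read; the recursive walk is used here only because it is the shortest way to state the rule.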
From: Adam R. <ad...@ex...> - 2010-07-28 07:31:37
|
One design option, in my opinion:

Existing metadata, i.e. file permissions and attributes, stays where it is. It's crucial for performance not to complicate this, as it needs to be interpreted for each document queried.

The new user-customisable metadata becomes a simple key/value store and is indexed on document id and unique key name. Each value follows XQuery datatype rules and may be either an atomic value or a sequence.

We then establish a metadata axis which is available in XQuery. From here we can use the standard predicates and update facilities of XQuery to maintain the metadata. The context of the metadata axis would always be the document root of the current node, as metadata is on a per-document level, not a per-node level.

e.g.

    doc("/db/abc.xml")/someNode[@value eq $some-value][metadata::someKey eq $some-metadata-value]

    update value doc("/db/abc.xml")/metadata::someKey with $new-metadata-value

or maybe even:

    doc("/db/abc.xml")/someNode[@value eq $some-value][metadata::entry['someKey'] eq $some-metadata-value]

    update value doc("/db/abc.xml")/metadata::entry['someKey'] with $new-metadata-value

How does that sound?

Cheers, Adam.

--
Adam Retter
eXist Developer
{ United Kingdom }
ad...@ex...
irc://irc.freenode.net/existdb |
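The storage side of Adam's proposal, a key/value store indexed on document id and key name whose values are either atomic or a sequence, can be modelled roughly as follows. This is a sketch in Python, not eXist code: `MetadataStore` and its methods are hypothetical, and a plain dictionary stands in for whatever index the core would actually use.

```python
from typing import Dict, List, Tuple, Union

# An "atomic" value is approximated by a few Python scalar types;
# a sequence is a list of them.
Atomic = Union[str, int, float, bool]
Value = Union[Atomic, List[Atomic]]


class MetadataStore:
    """Hypothetical per-document key/value metadata store."""

    def __init__(self) -> None:
        # Composite index: (document URI, key name) -> value
        self._entries: Dict[Tuple[str, str], Value] = {}

    def put(self, doc_uri: str, key: str, value: Value) -> None:
        """Set or overwrite one metadata value for a document."""
        self._entries[(doc_uri, key)] = value

    def get(self, doc_uri: str, key: str):
        """Return the value for a key, or None if it is not set."""
        return self._entries.get((doc_uri, key))

    def entries(self, doc_uri: str) -> Dict[str, Value]:
        """All key/value pairs attached to one document."""
        return {k: v for (d, k), v in self._entries.items() if d == doc_uri}

    def documents_where(self, key: str, expected: Value) -> List[str]:
        """Documents whose value for `key` equals `expected`."""
        return sorted(d for (d, k), v in self._entries.items()
                      if k == key and v == expected)


store = MetadataStore()
store.put("/db/abc.xml", "someKey", "draft")
store.put("/db/abc.xml", "keywords", ["xml", "metadata"])
store.put("/db/def.xml", "someKey", "published")

# Rough analogue of "update value ... with $new-metadata-value"
store.put("/db/abc.xml", "someKey", "published")
matches = store.documents_where("someKey", "published")
```

The lookup by (document, key) corresponds to the predicate form in the message, and `documents_where` corresponds to filtering documents on a metadata value. The linear scan in `documents_where` is a simplification; a real index would also be keyed on the value.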
From: Joe W. <jo...@gm...> - 2010-07-28 13:50:38
|
Very interesting discussion; I second all that Dan said about his use cases. Just a note on Adam's e-mail:

> The new user customisable metadata becomes a simple key value store
> and is indexed on document id and unique key name. Each value follows
> XQuery datatype rules and may be either an atomic value or a sequence.
>
> We then establish a metadata axis which is available in XQuery. From
> here we can use the standard predicates and update facilities of
> XQuery to maintain the metadata. The context of the metadata axis
> would always be the document root of the current node as metadata is
> on a per-document level, not a per-node level.

The metadata:: axis strikes me as similar in notion to the MarkLogic property:: axis, although the latter stores metadata as an XML node rather than as key/value pairs. See:

http://docs.marklogic.com/4.1doc/docapp.xqy#display.xqy?fname=http://pubs/4.1doc/xml/dev_guide/properties.xml

I raise this as an existing implementation for comparison.

Joe |
From: Patrick B. <pat...@jo...> - 2010-07-28 15:55:23
|
I really like the concept of implementing a metadata axis.

After reading over MarkLogic's implementation, I think it may be a very good idea to essentially mimic it. It looks like their system would accomplish most of the goals expressed here thus far. And I'm always a proponent of consistency.

Thoughts?

Cheers,

Patrick |
From: José M. F. G. <jm...@us...> - 2010-07-28 17:31:19
|
Hi everybody,

we can have a look at the different approaches in the filesystem world. In Mac OS X HFS+ and Windows NTFS there is the concept of forks, i.e. several named streams of bytes attached to a file. One of those named streams is the default one, i.e. the file content. On UNIX you have the extended file attributes concept (http://en.wikipedia.org/wiki/Extended_file_attributes), which is a key/value model.

The MarkLogic implementation is interesting and a good starting point, but I guess it does not allow attaching binary content as metadata, which is a limitation. For instance, a JPEG image stored in eXist could have some XML metadata describing its EXIF properties plus binary metadata such as a thumbnail, and SVG or XSL-FO documents could also have a thumbnail as binary metadata.

Cheers,
José María

--
"Violence is the last refuge of the incompetent" - Salvor Hardin in "Foundation" by Isaac Asimov
"Premature optimization is the root of all evil." - Donald Knuth

José María Fernández González
e-mail: jos...@gm... |
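The fork / named-stream model José describes, with one default stream for the content and further named streams for XML or binary metadata, might look like this in miniature. This is a Python sketch under stated assumptions: `StoredResource`, its method names, and the stream names `exif` and `thumbnail` are all hypothetical, and the byte strings are placeholders rather than real image data.

```python
from typing import Dict, List


class StoredResource:
    """Hypothetical resource with a default content stream plus named streams."""

    DEFAULT_STREAM = ""  # the unnamed stream holds the resource content itself

    def __init__(self, content: bytes) -> None:
        self._streams: Dict[str, bytes] = {self.DEFAULT_STREAM: content}

    def write_stream(self, name: str, data: bytes) -> None:
        """Attach or replace a named metadata stream."""
        if name == self.DEFAULT_STREAM:
            raise ValueError("metadata streams must have a non-empty name")
        self._streams[name] = data

    def read_stream(self, name: str = DEFAULT_STREAM) -> bytes:
        """Read the content (default) or one named metadata stream."""
        return self._streams[name]

    def stream_names(self) -> List[str]:
        """Names of the metadata streams, excluding the default content stream."""
        return sorted(n for n in self._streams if n != self.DEFAULT_STREAM)


# Placeholder bytes stand in for a JPEG image and its thumbnail.
image = StoredResource(b"<jpeg bytes>")
image.write_stream("exif", b"<exif><width>640</width></exif>")
image.write_stream("thumbnail", b"<thumbnail bytes>")
```

Because every stream is just bytes, XML and binary metadata are handled uniformly, which is the property José finds missing in an XML-only design.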
From: Adam R. <ad...@ex...> - 2010-07-28 17:40:17
|
> MarkLogic implementation is interesting and a good starting point, but I guess it does not allow attaching binary contents as metadata, which is a limitation. Some scenarios can be, for instance, a JPEG image stored in eXist could have some metadata in XML describing its EXIF properties

Accessing this metadata is possible at the moment via the image XQuery extension module.

> , and binary metadata like a thumbnail, and SVG or XSL-FO documents could also have as binary metadata a thumbnail.

Well, my proposal was to use the XQuery datatypes, so the user could set a metadata value to be xs:base64Binary data if desired. |
From: Dmitriy S. <sha...@gm...> - 2010-07-28 17:46:02
Attachments:
smime.p7s
|
On Wed, 2010-07-28 at 19:31 +0200, José María Fernández González wrote:
> we can have a look at the different approaches on filesystems world.
> In Mac OS X HFS+ and Windows NTFS there is the concept of forks, like
> several named streams of bytes attached to a file. One of those named
> streams is the default one, i.e. the file content. On UNIX you have
> the extended file attributes concept
> (http://en.wikipedia.org/wiki/Extended_file_attributes), and it is a
> key/value model.

+1 for two streams; it looks very close to Adam's proposal.

--
Cheers,
Dmitriy Shabanov |
From: James F. <jam...@ex...> - 2010-07-28 17:52:20
|
2010/7/28 Dmitriy Shabanov <sha...@gm...>:
> +1 for two streams, it looks very close to Adam's proposal.

I think this is a cool thing. A few thoughts:

* we should make it possible for metadata to show up as explicit embedded content (attributes or elements?) within a document stored in the database when we need to serialize the document.

* there may be versioning considerations to think through with metadata.

* I do feel that considering a new metadata axis is useful, but we need to be careful, e.g. is it being proposed that we will have a metadata namespace for this stuff?

my 2p/2e/2czk

J |
From: Thomas W. <tho...@gm...> - 2010-07-29 08:10:50
|
Adam,

I like the idea of a metadata axis. Your proposal for name/value pairs is simple, and it can accommodate most of the variations we may need.

2010/7/28 Adam Retter <ad...@ex...>
> > MarkLogic implementation is interesting and a good starting point,
> > but I guess it does not allow attaching binary contents as metadata, which
> > is a limitation. Some scenarios can be, for instance, a JPEG image stored in
> > eXist could have some metadata in XML describing its EXIF properties
>
> Accessing this metadata at the moment is possible via the image xquery
> extension module.

Actually, the existing image module gives very limited metadata: height, width and image type. I would really love to have all EXIF properties of an image as searchable metadata fields.

> > , and binary metadata like a thumbnail, and SVG or XSL-FO documents could
> > also have as binary metadata a thumbnail.
>
> Well my proposal was to use the XQuery datatypes, so then the user could
> set metadata to be base64binary data if desired.

What would the format of the metadata be if we request all available pairs? Maybe an element with all metadata pairs as attributes.

Thomas |
From: Evgeny G. <gaz...@gm...> - 2010-07-29 08:32:12
|
What about allowing any kind of key/value pair in the metadata, with full-text indexing of the values?

We would be able to add authors, descriptions, keywords, linked docs, and more, for any kind of XML or binary document.

A value could be an XML fragment, or all of the metadata could be one XML fragment, either with a fixed schema like

    <metadata>
        <property name="foo" value="bla-bla"/>
        <property name="foo1">
            <!-- any XML or base64 binary fragment here -->
        </property>
    </metadata>

or with a flexible schema:

    <metadata>
        <foo>bla-bla</foo>
        <foo1>
            <!-- any XML or base64 binary fragment here -->
        </foo1>
    </metadata>

The metadata:: axis would return the "metadata" element in both cases.

--
Evgeny |
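To make the difference between the two layouts concrete, here is a small Python sketch that renders the same simple key/value pairs both ways using the standard library's `xml.etree.ElementTree`. The function names are hypothetical, and only string values are handled; nested XML or base64 values from the message are left out for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def to_fixed_schema(pairs: Dict[str, str]) -> ET.Element:
    """Fixed layout: every entry is a <property name=".." value=".."/> element."""
    root = ET.Element("metadata")
    for name, value in pairs.items():
        ET.SubElement(root, "property", {"name": name, "value": value})
    return root


def to_flexible_schema(pairs: Dict[str, str]) -> ET.Element:
    """Flexible layout: the key itself becomes the element name.

    Note: this only works when each key is a valid XML element name.
    """
    root = ET.Element("metadata")
    for name, value in pairs.items():
        child = ET.SubElement(root, name)
        child.text = value
    return root


pairs = {"foo": "bla-bla", "author": "Evgeny"}
fixed = to_fixed_schema(pairs)
flexible = to_flexible_schema(pairs)

# Looking up the same key in each layout:
fixed_value = fixed.find("property[@name='foo']").get("value")
flexible_value = flexible.find("foo").text
```

The fixed layout is easier to validate against one schema and accepts arbitrary key strings; the flexible layout gives shorter paths at the cost of restricting keys to legal element names.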
From: Patrick B. <pat...@jo...> - 2010-07-29 15:30:29
|
It looks like there is a lot of thought around using key/value pairs. What value does key/value provide over using stored XML (similar to MarkLogic)? Conversely, I can see a lot of value in being able to store more complex structures, as they can be simple key/value pairs at their simplest and more complex for users who have that need.

Also, which method is going to be easier to index with Lucene?

Cheers,

Patrick |
From: James F. <jam...@ex...> - 2010-07-29 15:36:56
|
On 29 July 2010 16:27, Patrick Bosek <pat...@jo...> wrote:
> It looks like there is a lot of thought around using key/value pairs. What
> value does key/value provide over using a stored XML (similar to MarkLogic)?
> Because on the inverse of that, I can see a lot of value in being able to
> store more complex structures, as they can be simple key/value pairs at
> their simplest and more complex for users who have that need.

I think it's safe to say that attributes do the same job as key/value pairs. I think introducing a new axis is a bit too much for this kind of thing; we could just inject eXist metadata attributes.

This means that when data is backed up, archived or extracted outside of eXist, it can retain its information.

> Also, which method is going to be easier to index with lucene?

Good point, and another reason why we should introduce metadata into the natural XML structure instead of 'something else'.

James Fuller |
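James's alternative, injecting metadata as attributes so that it travels with the document when it leaves the database, could be sketched like this. Python's `xml.etree.ElementTree` is used purely for illustration; the namespace URI, the `md` prefix and the function name are hypothetical and were not proposed in the thread.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical namespace, chosen only so injected attributes cannot
# collide with the document's own attributes.
MD_NS = "http://example.org/exist-metadata"


def serialize_with_metadata(xml_text: str, pairs: Dict[str, str]) -> str:
    """Return the document with metadata injected as root-element attributes."""
    ET.register_namespace("md", MD_NS)
    root = ET.fromstring(xml_text)
    for key, value in pairs.items():
        # ElementTree addresses namespaced attributes as "{uri}local-name".
        root.set(f"{{{MD_NS}}}{key}", value)
    return ET.tostring(root, encoding="unicode")


exported = serialize_with_metadata(
    "<report><body/></report>",
    {"published": "2010-07-29"},
)

# The metadata survives a round trip outside the database.
reparsed = ET.fromstring(exported)
published = reparsed.get(f"{{{MD_NS}}}published")
```

The trade-off is visible even in this toy version: attribute values are plain strings, so structured or binary metadata would need an element-based encoding instead.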
From: Adam R. <ad...@ex...> - 2010-07-30 09:39:10
|
> It looks like there is a lot of thought around using key/value pairs. What
> value does key/value provide over using a stored XML (similar to MarkLogic)?
> Because on the inverse of that, I can see a lot of value in being able to
> store more complex structures, as they can be simple key/value pairs at
> their simplest and more complex for users who have that need.

When I said key/value, I meant that the value would be an XML Schema datatype: this could be an atomic value, a sequence of values or a node. But in terms of storage it would be encoded as a value of the key. |
From: Wolfgang M. <wol...@ex...> - 2010-07-29 18:37:09
|
> It looks like there is a lot of thought around using key/value pairs. What
> value does key/value provide over using a stored XML (similar to MarkLogic)?
> Because on the inverse of that, I can see a lot of value in being able to
> store more complex structures, as they can be simple key/value pairs at
> their simplest and more complex for users who have that need.

I would prefer to use the appropriate standards for storing metadata, for example, MODS for bibliographic information related to a document. So basically I would not put any restrictions on the XML structure which can be stored as metadata.

I don't like the idea of a property:: axis. If the metadata is an ordinary document, you can just use standard XPath, e.g.

    metadata("/db/my/doc.xml")//mods:titleInfo

Wolfgang |
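Wolfgang's idea, that the metadata is itself an ordinary XML document reachable through a `metadata()` function and queried with plain XPath, can be imitated outside eXist as follows. This Python sketch uses a dictionary as a stand-in for the database and a hypothetical `metadata` helper; the sample MODS record is invented, though the namespace URI is the standard MODS v3 one.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Stand-in for the database: document URI -> its metadata document.
# The record below is made-up sample data.
METADATA_DOCS = {
    "/db/my/doc.xml": """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo>
    <mods:title>Sample title</mods:title>
  </mods:titleInfo>
  <mods:name><mods:namePart>Sample Author</mods:namePart></mods:name>
</mods:mods>""",
}


def metadata(doc_uri: str) -> ET.Element:
    """Hypothetical analogue of metadata(): return the metadata document root."""
    return ET.fromstring(METADATA_DOCS[doc_uri])


# Rough equivalent of metadata("/db/my/doc.xml")//mods:titleInfo
title_infos = metadata("/db/my/doc.xml").findall(".//mods:titleInfo", MODS_NS)
titles = [t.findtext("mods:title", namespaces=MODS_NS) for t in title_infos]
```

Nothing here is specific to metadata once the document is in hand, which is the point of the proposal: existing query and indexing machinery applies unchanged.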
From: Adam R. <ad...@ex...> - 2010-07-30 09:41:26
|
> I don't like the idea of a property:: axis. If the metadata is an
> ordinary document, you can just use standard XPath, e.g.
> metadata("/db/my/doc.xml")//mods:titleInfo.

In my vision, this is no different from having an axis! Surely it is better to keep the existing doc and collection functions, and to be able to access the metadata naturally in a query by using metadata::

That way you can slowly and easily introduce metadata into existing queries. |
From: Adam R. <ad...@ex...> - 2010-07-30 09:37:40
|
> I like the idea of metadata axis. Your proposal for name/value pairs is
> simple and it can accommodate most of variations we may need.

Keep in mind that a value would be either an atomic value or a sequence, of an XML Schema datatype.

>> Accessing this metadata at the moment is possible via the image xquery
>> extension module.
>
> Actually the the existing image module gives vary limited metadata: hight,
> width and image type.
> It will really love to have all EXIF properties of an image as searchable
> metadata fields.

Well, we could very easily integrate the JpegParser from Apache Tika into the Image metadata function...

> What will be the format of the metadata if we request all available pairs?
> May be an element with all metadata pairs as attributes.

I am not sure at the moment; some sort of serialized XML format perhaps, e.g.

    /metadata::*

    <metadata xmlns="http://exist-db.org/metadata">
        <key name="someKey">
            <value type="xs:string">hello world</value>
        </key>
        <key name="otherValue">
            <value type="node()">
                <a xmlns=""><b/></a>
            </value>
        </key>
    </metadata>

--
Adam Retter
eXist Developer
{ United Kingdom }
ad...@ex...
irc://irc.freenode.net/existdb |
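A consumer of the serialized format Adam sketches would need to turn each typed `<value>` back into a native value. The following Python sketch shows one way to do that for a few atomic types; the `CASTS` table, the `parse_metadata` function and the extra `count` key are assumptions added for illustration, and `node()` values are deliberately not handled.

```python
import xml.etree.ElementTree as ET

NS = {"md": "http://exist-db.org/metadata"}

# Sample input in the shape shown in the message; the "count" key is
# an invented addition to exercise a second datatype.
SERIALIZED = """<metadata xmlns="http://exist-db.org/metadata">
  <key name="someKey">
    <value type="xs:string">hello world</value>
  </key>
  <key name="count">
    <value type="xs:integer">3</value>
  </key>
  <key name="tags">
    <value type="xs:string">a</value>
    <value type="xs:string">b</value>
  </key>
</metadata>"""

# Assumed mapping from the type attribute to a Python conversion.
CASTS = {
    "xs:string": str,
    "xs:integer": int,
    "xs:boolean": lambda text: text in ("true", "1"),
}


def parse_metadata(xml_text: str) -> dict:
    """Convert the serialized form into {key: atomic value or list of values}."""
    root = ET.fromstring(xml_text)
    result = {}
    for key in root.findall("md:key", NS):
        values = [CASTS[v.get("type")](v.text or "")
                  for v in key.findall("md:value", NS)]
        # One <value> is treated as an atomic value, several as a sequence.
        result[key.get("name")] = values[0] if len(values) == 1 else values
    return result


parsed = parse_metadata(SERIALIZED)
```

Representing a sequence as repeated `<value>` children is a guess at how the format would extend; the message only shows single values.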
From: James F. <jam...@ex...> - 2010-07-30 10:03:55
|
On 30 July 2010 11:37, Adam Retter <ad...@ex...> wrote:
>> I like the idea of metadata axis. Your proposal for name/value pairs is
>> simple and it can accommodate most of variations we may need.

On second thought, I think an axis is maybe undermining things a little ... whereas I completely see the benefit in a standard axis to deal with time, a metadata axis seems to be turning things on their head; we can just as easily inject attributes/elements to do the same thing.

We need to keep in mind that when data is backed up / archived / extracted outside the system, the metadata moves with it.

J |
From: Dmitriy S. <sha...@gm...> - 2010-07-30 10:18:39
|
On Fri, Jul 30, 2010 at 3:03 PM, James Fuller <jam...@ex...> wrote:
> on 2nd thought I think axis is maybe undermining things a little ...
> whereas I completely see the benefit in a standard axis to deal with
> time ... metadata seems to be turning things on their head; we can
> just as easily inject attributes/elements to do the same thing.
>
> we need to keep in mind that when data is backed up / archived /
> extracted outside the system that metadata moves with it.

Keep things as simple as possible and try to use what already exists ... so, -1 for the metadata axis. Yes, it is simple, but it breaks the second rule.

+1 for the MODS way.

--
Dmitriy Shabanov |
From: Adam R. <ad...@ex...> - 2010-07-30 10:41:26
|
> we need to keep in mind that when data is backed up / archived /
> extracted outside the system that metadata moves with it.

Completely agree with you. But, as was mentioned earlier in terms of thinking of NTFS streams, when the backup is done you could have, for example, doc1.xml in the backup and also doc1.xmd (containing the metadata).

--
Adam Retter

eXist Developer
{ United Kingdom }
ad...@ex...
irc://irc.freenode.net/existdb
|
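The sidecar naming in the backup example (doc1.xml alongside doc1.xmd) can be sketched as a small path helper. This is an illustration only, assuming the metadata file simply replaces the resource's extension with `.xmd`; the function name `sidecar_name` is hypothetical and nothing here reflects how eXist's backup actually works.

```python
from pathlib import PurePosixPath


def sidecar_name(resource_path, suffix=".xmd"):
    """Return the metadata sidecar path for a resource.

    Hypothetical naming rule: swap the resource's extension for `suffix`,
    as in the doc1.xml -> doc1.xmd example.
    """
    return str(PurePosixPath(resource_path).with_suffix(suffix))


print(sidecar_name("doc1.xml"))
print(sidecar_name("/db/docs/doc1.xml"))
```

One consequence of replacing the extension rather than appending to it is that doc1.xml and doc1.jpg in the same collection would map to the same sidecar name, so a real scheme would need to account for that.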
From: Patrick B. <pat...@jo...> - 2010-07-30 16:25:33
|
So, it seems like we're fairly agreed that a new axis is not the way to go. I would tend to agree that this is the right decision. So the only question left is whether to use key/value pairs or XML fragments.

I like Wolfgang's method of accessing metadata; it seems very consistent and intuitive. Unless I'm missing something, I'm still having a hard time seeing the value in choosing key/value over XML fragments, since under Wolfgang's suggestion anyone who wanted a key/value type metadata set could simply access the values as

metadata("/db/doc.xml")/keyOne
metadata("/db/doc.xml")/keyTwo

This may not allow us to store binary data natively in metadata fields. But I would contend that in 99% of cases metadata is being used to aid in searching for or classifying another resource in the database.

As a side note, I'm planning to begin working on implementing this around the 2nd week in August.

Cheers,

Patrick

On Fri, Jul 30, 2010 at 6:41 AM, Adam Retter <ad...@ex...> wrote:
> > we need to keep in mind that when data is backed up / archived /
> > extracted outside the system that metadata moves with it.
>
> Completely agree with you. But as was mentioned earlier in terms of
> thinking of ntfs streams, when the backup is done as well as for
> example doc1.xml being in the backup you could also have doc1.xmd
> (containing the metadata)
>
> _______________________________________________
> Exist-development mailing list
> Exi...@li...
> https://lists.sourceforge.net/lists/listinfo/exist-development

--
Patrick Bosek
Jorsek Software
Cell (585) 820 9634
Office (585) 239 6060
Jorsek.com
|
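The argument that key/value access is just a special case of an XML fragment can be illustrated outside eXist. The sketch below is an assumption-laden analogy, not the proposed `metadata()` function: the wrapper element `<metadata>`, the sample values, and the helper `lookup` are all hypothetical; only the key names `keyOne` and `keyTwo` come from the message.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata fragment for /db/doc.xml; each key is a child element.
FRAGMENT = """
<metadata>
    <keyOne>first value</keyOne>
    <keyTwo>second value</keyTwo>
</metadata>
"""


def lookup(fragment_xml, key):
    """Return the text of the child element named `key`, or None if absent.

    Rough analogue of the path step metadata("/db/doc.xml")/keyOne.
    """
    root = ET.fromstring(fragment_xml)
    child = root.find(key)
    return None if child is None else child.text


print(lookup(FRAGMENT, "keyOne"))
print(lookup(FRAGMENT, "keyTwo"))
```

Because each key is simply a child element, the same fragment could later hold nested structure without changing how flat keys are read.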
From: Dan M. <dan...@gm...> - 2010-07-30 16:49:32
|
I am not sure I understand, so here is an example use case for photography images.

We could create a trigger under /db/images so that all images with an image type (.jpg, .png, .gif, etc.) that are added to specific collections, such as:

/db/images/foo.jpg

would have an XML fragment automatically created that might contain information about the image like this <http://www.exif.org/samples/canon-ixus.html>, but in XML format.

Can we control where and/or how we store the metadata? If we want the metadata to be backed up, it might be in the same collection:

/db/images/foo.jpg.meta

or

/db/images/foo.jpg.meta.xml

or optionally, perhaps it should be put in a system folder:

/db/system/meta/db/images/foo.jpg.meta

or

/db/meta/db/images/foo.jpg.meta

Does this make sense?

- Dan

--
Dan McCreary
Semantic Solutions Architect
office: (952) 931-9198
cell: (612) 986-1552
|
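The four candidate storage locations listed in the message follow a simple pattern, which the sketch below spells out. It is only an illustration: the function name `metadata_locations` and the layout labels are hypothetical, and the paths are built purely from the examples given for /db/images/foo.jpg.

```python
def metadata_locations(resource_path):
    """Return the candidate metadata paths for a resource, keyed by layout.

    Layout labels are hypothetical; the path patterns mirror the four
    options listed for /db/images/foo.jpg.
    """
    return {
        # Stored next to the resource, so it travels with the collection.
        "sibling": f"{resource_path}.meta",
        "sibling_xml": f"{resource_path}.meta.xml",
        # Stored in a parallel tree that mirrors the resource's full path.
        "system": f"/db/system/meta{resource_path}.meta",
        "top_level": f"/db/meta{resource_path}.meta",
    }


for layout, path in metadata_locations("/db/images/foo.jpg").items():
    print(layout, path)
```

The sibling layouts keep metadata inside the collection being backed up, while the mirrored-tree layouts keep user collections free of extra resources at the cost of a second tree to keep in sync.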
From: Dmitriy S. <sha...@gm...> - 2010-07-31 03:53:19
|
On Fri, Jul 30, 2010 at 9:49 PM, Dan McCreary <dan...@gm...> wrote:
> I am not sure I understand so here is an example use case for photography
> images.
>
> We could create a trigger under /db/images so that all images that are
> added to specific collections that have an image type (.jpg, .png. gif etc.)
> such as:
>
> /db/images/foo.jpg
>
> would have an XML fragment automatically created that might contain
> information about the image like this<http://www.exif.org/samples/canon-ixus.html>but in XML format?
>
> Can we control where and/or how we store the metadata? If we want the
> metadata to be backed up it might be in the same collection:
>
> /db/images/foo.jpg.meta
>

As I understand it: metadata("/db/images/foo.jpg")/keyOne

My question: is it going to be stored under /system/metadata/db/images/foo.jpg.xml?

--
Dmitriy Shabanov
|
From: Patrick B. <pat...@jo...> - 2010-08-02 19:33:16
|
I'm not sure where it will be stored. This is actually something I was hoping one of the core developers could advise me on (Wolf?).

Cheers,

Patrick

--
Patrick Bosek
Jorsek Software
Cell (585) 820 9634
Office (585) 239 6060
Jorsek.com
|