Here is a proposal for the format of various URIs:
Event:
http://www.ishafoundation.org/archives/event/<event id>
Session:
http://www.ishafoundation.org/archives/session/<event id>:<session id for event>
Range of blocks/paragraphs within a session:
http://www.ishafoundation.org/archives/block_range/<event id>:<session id for event>[<start block id for session>-<end block id for session>]
Block/Paragraph:
http://www.ishafoundation.org/archives/block/<event id>:<session id for event>:<block id for session>
Range of characters within block:
http://www.ishafoundation.org/archives/char_range/<event id>:<session id for event>:<block id for session>[<start offset>-<end offset>]
Transcript Markup:
http://www.ishafoundation.org/archives/transcript_markup/<event id>:<session id for event>:<markup id for session's transcript>
The ids are generally low positive integers. However, the event id may be a bit more descriptive, so it may contain some letters as well. The ids are also not ordered, so id 5 may come after id 10.
Some examples:
Inner Engineering Retreat in June 2007:
http://www.ishafoundation.org/archives/event/ier200706
The second session at the retreat (note - the actual id does not have to be "2"):
http://www.ishafoundation.org/archives/session/ier200706:2
The first two blocks of the second session (note - because the ids are not ordered, you could have a range like "[8-1]"):
http://www.ishafoundation.org/archives/block_range/ier200706:2[1-2]
The fourth block of the second session
http://www.ishafoundation.org/archives/block/ier200706:2:4
The second sentence (consisting of 100 chars) of the fourth block:
http://www.ishafoundation.org/archives/char_range/ier200706:2:4[58-158]
A transcript markup (id 12) for the second session:
http://www.ishafoundation.org/archives/transcript_markup/ier200706:2:12
=== end of examples ===
I don't think we need to use these URIs in the transcript xml and markups xml files themselves. However, they might be used in bookmarking or when displaying results of search queries.
We could have something which is able to resolve these URIs and display the appropriate XML/HTML to the user.
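As a rough illustration of such a resolver, here is a minimal Python sketch that only parses the colon-delimited URIs proposed above into their parts. The function name `parse_archive_uri`, the `PATTERNS` table, and the assumption that ids are plain alphanumerics are all hypothetical; looking up and rendering the XML/HTML is left out.

```python
import re

BASE = "http://www.ishafoundation.org/archives/"
ID = r"[A-Za-z0-9]+"  # assumption: ids are plain alphanumerics

# One pattern per entity type, following the colon-delimited proposal.
PATTERNS = {
    "event": rf"(?P<event>{ID})",
    "session": rf"(?P<event>{ID}):(?P<session>{ID})",
    "block_range": rf"(?P<event>{ID}):(?P<session>{ID})\[(?P<start>{ID})-(?P<end>{ID})\]",
    "block": rf"(?P<event>{ID}):(?P<session>{ID}):(?P<block>{ID})",
    "char_range": rf"(?P<event>{ID}):(?P<session>{ID}):(?P<block>{ID})\[(?P<start>\d+)-(?P<end>\d+)\]",
    "transcript_markup": rf"(?P<event>{ID}):(?P<session>{ID}):(?P<markup>{ID})",
}


def parse_archive_uri(uri):
    """Split an archive URI into its entity type and ids (all kept as strings)."""
    if not uri.startswith(BASE):
        raise ValueError(f"not an archive URI: {uri!r}")
    kind, _, rest = uri[len(BASE):].partition("/")
    pattern = PATTERNS.get(kind)
    if pattern is None:
        raise ValueError(f"unknown entity type: {kind!r}")
    match = re.fullmatch(pattern, rest)
    if match is None:
        raise ValueError(f"malformed {kind} URI: {uri!r}")
    return {"type": kind, **match.groupdict()}
```

For example, `parse_archive_uri("http://www.ishafoundation.org/archives/block/ier200706:2:4")` gives the type "block" with event "ier200706", session "2" and block "4".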
Thanks, Mark
The session is mandatory, right? (Many events will have just one session, e.g. a talk.) So in these cases we'd just have a single session.
I'm not so sure about the use of the colon in these URIs. I think the XML spec reserves a special meaning for colons (something to do with namespace stuff). I know that these aren't namespaces - but better to avoid any possible confusion and stick to standard naming conventions where possible...
How about using slashes instead?
And again using ids prefixed with letters to conform to the XML spec.
Modified Examples
Event:
http://www.ishafoundation.org/archives/event/<event id>
Session:
http://www.ishafoundation.org/archives/session/<event id>/<session id for event>
Range of blocks/paragraphs within a session:
http://www.ishafoundation.org/archives/block_range/<event id>/<session id for event>[<start block id for session>-<end block id for session>]
Block/Paragraph:
http://www.ishafoundation.org/archives/block/<event id>/<session id for event>/<block id for session>
Range of characters within block:
http://www.ishafoundation.org/archives/char_range/<event id>/<session id for event>/<block id for session>[<start offset>-<end offset>]
Transcript Markup:
http://www.ishafoundation.org/archives/transcript_markup/<event id>/<session id for event>/<markup id for session's transcript>
Some examples:
Inner Engineering Retreat in June 2007:
http://www.ishafoundation.org/archives/event/e2008073
(I wouldn't bother embedding any event metadata into the event id, other than possibly the year - sometimes certain info is unknown, and sometimes there is not enough info to differentiate between 2 events, e.g. 2 talks on the same day. I've thought about this before - it's simpler to just assign a numerical id.)
The second session at the retreat (note - the actual id does not have to be "2"):
http://www.ishafoundation.org/archives/session/e2008073/s2
The first two blocks of the second session (note - because the ids are not ordered, you could have a range like "[b8-b1]"):
http://www.ishafoundation.org/archives/block_range/e2008073/s2[b1-b2]
The fourth block of the second session
http://www.ishafoundation.org/archives/block/e2008073/s2/b4
The second sentence (consisting of 100 chars) of the fourth block:
http://www.ishafoundation.org/archives/char_range/e2008073/s2/b4[58-158]
A transcript markup (id m12) for the second session:
http://www.ishafoundation.org/archives/transcript_markup/e2008073/s2/m12
I think the URIs are a bit clearer now - you can see what they mean at a glance.
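To make the slash-delimited variant concrete, here is a small Python sketch that builds a few of these URIs and rejects ids that do not start with a letter. The helper names and the exact id rule (a letter followed by letters or digits) are assumptions for illustration, a simplified stand-in for the XML naming rule rather than the full spec.

```python
import re

BASE = "http://www.ishafoundation.org/archives"

# Assumption: a local id is a letter followed by letters or digits,
# so that it can also serve as an XML id attribute value.
LOCAL_ID = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def _check(local_id):
    """Return the id unchanged, or raise if it does not start with a letter."""
    if not LOCAL_ID.fullmatch(local_id):
        raise ValueError(f"invalid local id: {local_id!r}")
    return local_id


def session_uri(event, session):
    return f"{BASE}/session/{_check(event)}/{_check(session)}"


def block_uri(event, session, block):
    return f"{BASE}/block/{_check(event)}/{_check(session)}/{_check(block)}"


def block_range_uri(event, session, start, end):
    return (f"{BASE}/block_range/{_check(event)}/{_check(session)}"
            f"[{_check(start)}-{_check(end)}]")
```

With this, `block_uri("e2008073", "s2", "b4")` reproduces the block example above, while a bare integer id such as "2" is rejected.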
Pranams
Swami Kevala
Yes - I agree slashes aren't good
I think that format is also quite nice:
e2008073s1b28
How about using hyphens to make it more readable e.g.
e2008073-s1-b28
I think that's a bit nicer
I think we should have the "s1" solution for one session events. Seems clearest and easiest to me.
I agree that it's a bit rubbish that you can't have an integer as an ID in XML - but what to do!
OK, agreed. Hyphens to delimit local ids are better. We can enforce that local ids cannot contain hyphens to make this safe.
Also agree with the "s1" solution.
However, I'm not so sure about using hyphens because that conflicts with the range hyphen. How about underscore?
Generally, I don't think we need to talk about URIs, since the UUID says it all and is much more compact than the whole URI. From the UUID we can work out whether it's an event, session, block range etc.
Here's another idea:
d<year><month><day>_<event id within day>_<session id within event>_<block id within session>
For example:
d20070302_e2_s3_b4
might represent the 4th block of the third session of the second event starting on 2 March 2007.
The date represents the start date of the event (so it doesn't matter if the event spans multiple days).
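A quick Python sketch of how such an id could be taken apart, just to check that the format is unambiguous. The function name `parse_dated_id` is made up, and it assumes the trailing components are optional and always appear in the order event, session, block.

```python
import re
from datetime import date


def parse_dated_id(text):
    """Parse ids such as 'd20070302_e2_s3_b4' into start date and local ids."""
    head, *rest = text.split("_")
    if not re.fullmatch(r"d\d{8}", head):
        raise ValueError(f"bad date component: {head!r}")
    # date() itself rejects impossible dates such as month 13.
    start = date(int(head[1:5]), int(head[5:7]), int(head[7:9]))
    if len(rest) > 3:
        raise ValueError(f"too many components: {text!r}")
    result = {"start_date": start, "event": None, "session": None, "block": None}
    # Components are positional: event (e...), session (s...), block (b...).
    for name, prefix, part in zip(("event", "session", "block"), "esb", rest):
        if len(part) < 2 or not part.startswith(prefix):
            raise ValueError(f"expected {name} id starting with {prefix!r}: {part!r}")
        result[name] = part
    return result
```

For "d20070302_e2_s3_b4" this yields the start date 2 March 2007 with event "e2", session "s3" and block "b4"; for "d20070302_e2" the session and block come back as None.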
Or could even go further with: y2007_m03_d02_e2_s3_b4
For example, "y2007_m03" by itself may represent all the events in March 2007. I imagine a query could define a set of these UUIDs to filter the results. For example, the set {"y2007_m12", "y2008_m01"} might be used to only return results for those two months.
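The filtering idea could work roughly like the Python sketch below, which treats each UUID in the query set as a prefix. The function names and the sample ids are invented for illustration; matching stops at an underscore boundary so that a shorter prefix cannot match a longer, unrelated component.

```python
def matches_any(full_id, prefixes):
    """True if full_id equals a prefix or continues it at an underscore boundary."""
    return any(
        full_id == prefix or full_id.startswith(prefix + "_")
        for prefix in prefixes
    )


def filter_ids(ids, prefixes):
    """Keep only the ids covered by at least one prefix, preserving order."""
    return [full_id for full_id in ids if matches_any(full_id, prefixes)]


# Hypothetical ids, one in each of three different months.
sample = [
    "y2007_m12_d25_e1",
    "y2008_m01_d03_e2_s1",
    "y2007_m03_d02_e2_s3_b4",
]
selected = filter_ids(sample, {"y2007_m12", "y2008_m01"})
```

Here `selected` holds the first two sample ids and drops the March 2007 one.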
Although, having said that - colons are actually used in the URI already - "http:" !!
But I still don't like your colons for some reason!
What do you think about using slashes instead?
Yeah, I'm not convinced about colons or slashes. I wanted to have a standard for building UUIDs for each entity. Both colons and slashes have other meanings in URIs.
I found this:
http://webservices.xml.com/pub/a/ws/2005/02/23/salz.html
though maybe it's not that helpful.
It seems strange that we cannot use integers for ids. That's what databases tend to do, and it's worked pretty well for them. However, I agree that it makes the URIs easier to read. Could even do something like:
The fourth block of the second session
http://www.ishafoundation.org/archives/block/e2008073s2b4
So, that block has local id "b4" and uuid "e2008073s2b4".
I think that looks quite nice.
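One thing worth checking is that a delimiter-free uuid can still be split back into local ids. The Python sketch below does that under the assumption that every local id is a single prefix letter (e, s, b) followed by digits only; `parse_compact_id` is a hypothetical name. If local ids were allowed to contain other letters, this split would no longer be reliable.

```python
import re

# Assumption: each local id is one prefix letter followed by digits only.
COMPACT = re.compile(r"(?P<event>e\d+)(?:(?P<session>s\d+)(?P<block>b\d+)?)?")


def parse_compact_id(uuid):
    """Split e.g. 'e2008073s2b4' into its event, session and block ids."""
    match = COMPACT.fullmatch(uuid)
    if match is None:
        raise ValueError(f"not a compact id: {uuid!r}")
    return match.groupdict()  # missing components are None
```

So "e2008073s2b4" splits into "e2008073", "s2" and "b4", and a bare "e2008073" gives an event with no session or block.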
Although I don't really like having the "b" and "m" prefixes everywhere (in the id attributes), I suppose it is safer in case we want to define two types of entities within an xml file. Each type would have its own leading letter.
For single-session events, I think we still need the s component, otherwise the uuid will clash with the event uuid. Maybe that's OK because the URI contains the "session" instead of the "event" component.
So the options are:
http://www.ishafoundation.org/archives/session/e2008073
http://www.ishafoundation.org/archives/session/e2008073s (so we instantly know it's the only session)
http://www.ishafoundation.org/archives/session/e2008073s1 (fully compatible with naming of multi-session events)