Thread: [Refdb-users] Re: nitpicking (element ordering in risx)
From: <Jus...@UL...> - 2003-11-24 08:39:23
Hi,

May I join into the discussion:

"Markus Hoenicka" <mar...@mh...> wrote on Fri, 21 Nov 2003 23:52:31 +0100:

> Bruce D'Arcus writes:
> > I've been working on some input templates for risx for emacs. The idea
> > is to make entering data much easier than it currently is, since ris is
> > a rather abstract data model. So, for each variable, the user is
> > prompted for the input with unambiguous prompts: "book title" and so
> > forth.
> ...
> You may then select a suitable mode (M-x ris-mode or M-x psgml-mode or
> whatever you use to edit XML) if you need specific editing support.

Indeed, I am not sure what an input template would provide over and
above psgml-mode. Except: In my local copy of risx.dtd, I made the
citekey and the type #REQUIRED so that psgml-mode prompts for them.

Even better, one could convert the DTD to RELAX NG and use it with
nxml-mode which does XML validation.

Here are some questions and comments about risx.dtd:

- There is no editor element. How do I distinguish between authors and
  editors, say, for a book?

- I think there should be a way to provide a <url> for a conference
  article that's been published in a proceedings volume. This belongs
  into the <part>, I think.

- This may have been answered elsewhere, but I can't remember right
  now: How are full vs. abbreviated journal titles handled by RefDB?

- It would be nice to provide an event address for conferences (which
  is usually different from the publisher's or organization's
  address).

- I think we need a mechanism to protect individual characters against
  automatic case conversion of titles by style sheets (like {G}aussian
  and {HMM} in BibTeX).

- In <libinfo>, why isn't <reprint> optional? Also, the only allowed
  content is <date>, not even #CDATA is permitted. This contradicts
  the RIS specification of the RP field, doesn't it?

- Is there really no established standard for representing references
  in XML? By TEI, for example, or something linked to the Dublin
  Core... And what is the relation of RIS/RISX/RefDB to Bibliofile?

Another issue: I'd like the "master" representation of my references
to be an XML file rather than a data base because that allows me to
choose and change the format as desired. To make this practical, it
would really help if RefDB could optionally identify references by the
citekey instead of by the numeric ID. This would allow me to store
references from different sources in the same RefDB database, and to
update some of them by re-importing manually maintained XML files.

Cheers,
Justus

--
Justus H. Piater, Ph.D.          http://www.montefiore.ulg.ac.be/~piater/
Institut Montefiore, B28         Phone: +32-4-366-2279
Université de Liège, Belgium     Fax:   +32-4-366-2620
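Editorial sketch: the "master XML file" idea above depends on every entry carrying a unique citekey. A minimal Python illustration of indexing a risx file by citekey follows. The element and attribute names (ris, entry, citekey, part, publication, title) are taken from the chapter template posted in this thread; the sample data and the function name index_by_citekey are made up for illustration and are not part of RefDB.

```python
import xml.etree.ElementTree as ET

# Illustrative risx fragment; names follow the template in this thread,
# the bibliographic data itself is invented.
RISX = """<ris>
  <entry type="BOOK" citekey="Doe2001">
    <publication><title type="full">An Example Book</title></publication>
  </entry>
  <entry type="CHAP" citekey="Roe2002">
    <part><title>An Example Chapter</title></part>
  </entry>
</ris>"""


def index_by_citekey(risx_text):
    """Map each citekey to its <entry> element, rejecting duplicates."""
    index = {}
    for entry in ET.fromstring(risx_text).iter("entry"):
        key = entry.get("citekey")
        if key is None:
            raise ValueError("entry without citekey cannot be tracked")
        if key in index:
            raise ValueError("duplicate citekey: " + key)
        index[key] = entry
    return index


index = index_by_citekey(RISX)
print(sorted(index))
```

A hand-maintained file checked this way can be merged with files from other sources as long as the citekeys do not collide.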
From: Marc H. <mar...@en...> - 2003-11-25 10:02:01
> In any
> case I don't understand the advantage of maintaining your references
> outside of a database. You can retrieve them from the database as XML
> files and update them any time. You can even retrieve the full
> database as XML periodically and check this into CVS. But the point of
> a reference manager is to make the references accessible by simple
> queries. If you maintain your data outside the database, you're back
> to grep, or you overload your brain with things a database was
> designed to remember.

IMHO, doing this is also interesting in order to test/debug refdb. XML
as the "master" copy buys you some reliability, which does not prevent
using all the nifty features of a ("slave") database. The price to
pay, of course, is about frequently updating the slave. I suspect this
can be highly automated, though.
From: Markus H. <mar...@mh...> - 2003-11-26 21:09:37
Marc Herbert writes:
> IMHO, doing this is also interesting in order to test/debug refdb. XML
> as the "master" copy buys you some reliability, which does not prevent
> using all the nifty features of a ("slave") database. The price to
> pay, of course, is about frequently updating the slave. I suspect this
> can be highly automated, though.

Yes, the non-interactive mode of refdbc is great for scripting. Or you
hack your own Perl script using the RefDBClient module. BTW I use a
shell script to debug RefDB, and "make test" of the RefDBClient module
is also helpful for this purpose.

regards,
Markus

--
Markus Hoenicka
mar...@ca...
(Spam-protected email: replace the quadrupeds with "mhoenicka")
http://www.mhoenicka.de
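Editorial sketch: one way to automate the "update the slave" step is to work out which entries changed since the last sync, so that only those need re-importing. The Python below is an illustration under assumptions, not RefDB code: it compares a master risx file with the copy that was last imported and lists the citekeys that are new or modified. The actual upload (through refdbc or the RefDBClient module) is left out on purpose, since its invocation is not given in this thread.

```python
import hashlib
import xml.etree.ElementTree as ET


def fingerprints(risx_text):
    """Return {citekey: SHA-256 digest of the serialized <entry>}."""
    result = {}
    for entry in ET.fromstring(risx_text).iter("entry"):
        data = ET.tostring(entry, encoding="utf-8")
        result[entry.get("citekey")] = hashlib.sha256(data).hexdigest()
    return result


def entries_to_resync(master_text, last_synced_text):
    """Citekeys that are new or changed in the master since the last sync."""
    old = fingerprints(last_synced_text)
    new = fingerprints(master_text)
    return sorted(key for key, digest in new.items() if old.get(key) != digest)


# Invented sample data: "b" was edited and "c" was added after the last sync.
LAST = (
    '<ris><entry citekey="a"><part><title>One</title></part></entry>'
    '<entry citekey="b"><part><title>Two</title></part></entry></ris>'
)
MASTER = (
    '<ris><entry citekey="a"><part><title>One</title></part></entry>'
    '<entry citekey="b"><part><title>Two, revised</title></part></entry>'
    '<entry citekey="c"><part><title>Three</title></part></entry></ris>'
)

print(entries_to_resync(MASTER, LAST))
```

Note that this comparison is byte-based, so a pure whitespace change inside an entry also counts as a modification; that errs on the side of re-importing too much rather than too little.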
From: Bruce D'A. <bd...@fa...> - 2003-11-24 10:16:14
On Nov 24, 2003, at 3:39 AM, Justus H. Piater wrote:

> May I join into the discussion:

Sure!

> Indeed, I am not sure what an input template would provide over and
> above psgml-mode.

It gets to your later question about editor. It allows you to present
to the user a more context-specific and unambiguous prompt for entry.
For example, I have created a template called book-edited, which has a
prompt to enter "editor." When the user enters that, it goes in the
"author" field. I will post later so you can see for yourself.

> Even better, one could convert the DTD to RELAX NG and use it with
> nxml-mode which does XML validation.

I've been known to argue this. IMHO, Relax NG is a big leap beyond DTDs
in both elegance and power, and nxml mode is simply excellent. But
Markus very much wants to maintain compatibility with SGML, so has been
more reluctant. Still, it's trivial to convert RISX to RNG with Trang.

I'll let Markus answer some of your detail questions.

> - Is there really no established standard for representing references
>   in XML? By TEI, for example, or something linked to the Dublin
>   Core...

No. My opinion is the new MODS schema from the Library of Congress is
the most likely to provide this, however.

> And what is the relation of RIS/RISX/RefDB to Bibliofile?

Markus is involved in development discussions of Bibliofile, as am I.
It is an xslt-based formatting engine designed to be
DTD/schema-independent, both with respect to bib data and document
format. No one has done this before, so it's bleeding edge, and Markus
is still skeptical it will work adequately. I really hope it does, and
that RefDB can use it as its formatting engine too. Absent that,
bibliofile also includes a style specification DTD that can handle
MODS, and which Markus is interested in too.

> Another issue: I'd like the "master" representation of my references
> to be an XML file rather than a data base because that allows me to
> choose and change the format as desired. To make this practical, it
> would really help if RefDB could optionally identify references by the
> citekey instead of by the numeric ID. This would allow me to store
> references from different sources in the same RefDB database, and to
> update some of them by re-importing manually maintained XML files.

I'm not following here. RefDB stores the citekey; what more do you
need it to do?

Bruce
From: Markus H. <mar...@mh...> - 2003-11-24 21:50:28
Bruce D'Arcus writes:
> > And what is the relation of RIS/RISX/RefDB to Bibliofile?
>
> Markus is involved in development discussions of Bibliofile, as am I.
> It is an xslt-based formatting engine designed to be
> DTD/schema-independent, both with respect to bib data and document
> format. No one has done this before, so it's bleeding edge, and Markus
> is still skeptical it will work adequately.

xslt is said to be a complete programming language so it will work
eventually, given that someone puts enough sweat in it. I'm skeptical
whether it is wise to do the raw->cooked transformation in xslt. As
we're not looking at anything specific to XML (in the case of RefDB,
the data to convert are not XML, but SQL query results), any language
with a better signal-to-noise ratio than xslt will provide better and
easier to maintain programs (think C or Perl). For me it is simply a
matter of choosing the right tool for the right job. You can use Emacs
as a web server (google for it, it's fun!) but no one would claim this
stacks up well against Apache.

> I really hope it does, and
> that RefDB can use it as its formatting engine too.

I strive for compatibility at the stylesheet level.

regards,
Markus

--
Markus Hoenicka
mar...@ca...
(Spam-protected email: replace the quadrupeds with "mhoenicka")
http://www.mhoenicka.de
From: Bruce D'A. <bd...@fa...> - 2003-11-24 16:47:40
Attachments:
risx.rnc.tar.gz
I've attached the Relax NG version of RISX.

Looking into this, I can't at the moment see how to properly code a
book with an editor. Markus, I thought you had added a way to code
"role"?

Anyway, the template package I am talking about is here:

http://emacs-template.sourceforge.net/index.html

Here's a template for a book chapter. I've not included the document
declaration as I use nxml mode, and RNG has no such concept.

<?xml version="1.0" encoding="utf-8"?>
<ris>
  <entry type="CHAP" citekey="(>>>citekey<<<)">
    <part>
      <title>(>>>title<<<)</title>
      <author>
        <lastname>(>>>authorlast<<<)</lastname>
        <firstname>(>>>authorfirst<<<)</firstname>
      </author>
    </part>
    <publication>
      <title type="full">(>>>ctitle<<<)</title>
      <author>
        <lastname>(>>>cauthorlast<<<)</lastname>
        <firstname>(>>>cauthorfirst<<<)</firstname>
      </author>
      <pubinfo>
        <pubdate type="primary">
          <date><year>(>>>year<<<)</year></date>
        </pubdate>
        <startpage>(>>>start<<<)</startpage>
        <endpage>(>>>end<<<)</endpage>
        <city>(>>>city<<<)</city>
        <publisher>(>>>publisher<<<)</publisher>
      </pubinfo>
    </publication>
    <contents>
      <keyword>(>>>POINT<<<)(>>>MARK<<<)</keyword>
    </contents>
  </entry>
</ris>
>>>TEMPLATE-DEFINITION-SECTION<<<
("citekey" "Citation Key: ")
("title" "Title: ")
("subtitle" "Subtitle: ")
("authorfirst" "Author Firstname: ")
("authorlast" "Author Lastname: ")
("ctitle" "Book Title: ")
("cauthorfirst" "Editor Firstname: ")
("cauthorlast" "Editor Lastname: ")
("year" "Publication Year: ")
("publisher" "Publisher: ")
("city" "City: ")
("start" "Start Page: ")
("end" "End Page: ")

Bruce
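Editorial sketch: to make the mechanism of the template above concrete, here is a small Python imitation of what the expansion step does: each (>>>name<<<) placeholder is replaced by the answer the user gave to the matching prompt. This is not the Emacs template package itself. The function name expand is hypothetical, and treating POINT and MARK as cursor markers that produce no text is an assumption about that package.

```python
import re
from xml.sax.saxutils import escape

PLACEHOLDER = re.compile(r"\(>>>(\w+)<<<\)")


def expand(template, values):
    """Replace each (>>>name<<<) with values[name], XML-escaping the text."""
    def substitute(match):
        name = match.group(1)
        if name in ("POINT", "MARK"):
            # Assumed to mark cursor positions only; they insert no text.
            return ""
        # Also escape double quotes, because citekey sits in an attribute.
        return escape(values[name], {'"': "&quot;"})
    return PLACEHOLDER.sub(substitute, template)


# Shortened version of the chapter template, with invented answers.
TEMPLATE = (
    '<entry type="CHAP" citekey="(>>>citekey<<<)">'
    "<part><title>(>>>title<<<)</title></part>"
    "<contents><keyword>(>>>POINT<<<)(>>>MARK<<<)</keyword></contents>"
    "</entry>"
)

result = expand(TEMPLATE, {"citekey": "Doe2003", "title": "Cats & Dogs"})
print(result)
```

The prompts ("Editor Lastname: " and so on) only change what the user is asked; the answer still lands in the generic author element, which is why the editor/author distinction is lost in the resulting risx.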
From: Markus H. <mar...@mh...> - 2003-11-24 21:50:31
Bruce D'Arcus writes:
> Markus, I thought you had added a way to code "role"?

I'm sure I did. I'm afraid it went in after 0.9.3. The CVS version
should have it in any case.

regards,
Markus

--
Markus Hoenicka
mar...@ca...
(Spam-protected email: replace the quadrupeds with "mhoenicka")
http://www.mhoenicka.de
From: Markus H. <mar...@mh...> - 2003-11-24 21:50:24
Justus H. Piater writes:
> Hi,
>
> May I join into the discussion:

Sure, anytime.

> Indeed, I am not sure what an input template would provide over and
> above psgml-mode. Except: In my local copy of risx.dtd, I made the
> citekey and the type #REQUIRED so that psgml-mode prompts for them.

This is not a bad idea for entering data from scratch. However, I
didn't want to force this upon users because refdbd can create useful
citekeys if none are provided. Some people may prefer this convenience.

> Even better, one could convert the DTD to RELAX NG and use it with
> nxml-mode which does XML validation.

I'm reluctant to go down this path as currently few tools support
RELAX NG. This is supposed to change eventually.

Before answering the detailed questions below, let me briefly mention
that the risx.dtd was not designed from scratch. Rather it is an XML
representation of the RIS tagged format with all its strengths and
(almost) all its weaknesses. The main purpose of this dtd is to have a
target for SGML/XML transformations. It was by no means written as a
replacement for any serious XML bibliography DTD like e.g. MODS.

> Here are some questions and comments about risx.dtd:
>
> - There is no editor element. How do I distinguish between authors and
>   editors, say, for a book?

I've attempted to get some logic into the RIS author and title levels.
A book author is supposed to be encoded as AU/A1 whereas an editor is
an ED/A2. This does not translate well to the part/publication/set
distinction used in risx. Recent versions of risx have a role attribute
attached to the author element. However, this may not yet be honored
correctly during formatting bibliographies.

> - I think there should be a way to provide a <url> for a conference
>   article that's been published in a proceedings volume. This belongs
>   into the <part>, I think.

Question to Bruce (he knows MODS better than me): can MODS do this?
> - This may have been answered elsewhere, but I can't remember right
>   now: How are full vs. abbreviated journal titles handled by RefDB?

Journal titles are kept in a separate table. All references encoding
articles from this journal can use either the full or abbreviated name
depending on what the bibliography style requires. There are fallbacks
if one of the titles is missing. Currently there is no convenience
command to maintain the journal title list. If you have references
containing only one type of title, it is best to retrieve them, add the
missing title, and update the references.

> - It would be nice to provide an event address for conferences (which
>   is usually different from the publisher's or organization's
>   address).

Bruce: can MODS do this?

> - I think we need a mechanism to protect individual characters against
>   automatic case conversion of titles by style sheets (like {G}aussian
>   and {HMM} in BibTeX).

This would only be a problem if there are styles that force all
lowercase. citestylex.dtd supports this but I don't know whether any
real-life journal requires this. The other possibilities (all caps or
keep as is) would work ok if you supply the titles in proper mixed
case. If this is not sufficient we'll have a problem.

> - In <libinfo>, why isn't <reprint> optional? Also, the only allowed
>   content is <date>, not even #CDATA is permitted. This contradicts
>   the RIS specification of the RP field, doesn't it?

I think you run a reference manager mainly to keep track of your
offprints/electronic copies. The reprint status is kind of essential to
do this. But if this bothers people I won't have a problem changing
this element to optional. BTW the RIS spec says the RP field can
contain one of three status notes, one of which (ON REQUEST) may be
followed by a date. This is exactly what the reprint element
represents: The attribute encodes the status, the optional child is the
date.
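Editorial sketch: the "status plus optional date" structure of the RP field described above can be illustrated with a small Python parser. This is not RefDB code. The three status strings and the parenthesized MM/DD/YY date after ON REQUEST follow the usual description of the RIS format, but treat the exact date layout as an assumption; the function name parse_rp is made up.

```python
import re
from datetime import datetime

# One of three status notes; only ON REQUEST may be followed by a date.
RP_PATTERN = re.compile(
    r"^(IN FILE|NOT IN FILE|ON REQUEST)(?:\s*\((\d{2}/\d{2}/\d{2})\))?$"
)


def parse_rp(value):
    """Split an RP value into (status, date or None)."""
    match = RP_PATTERN.match(value.strip())
    if match is None:
        raise ValueError("unrecognized RP value: %r" % value)
    status, raw_date = match.groups()
    if raw_date is not None and status != "ON REQUEST":
        raise ValueError("only ON REQUEST may carry a date")
    when = datetime.strptime(raw_date, "%m/%d/%y").date() if raw_date else None
    return status, when


print(parse_rp("IN FILE"))
print(parse_rp("ON REQUEST (11/24/03)"))
```

The returned pair maps directly onto the element described above: the status would go into the attribute of reprint and the date, if present, into its date child.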
>
> - Is there really no established standard for representing references
>   in XML? By TEI, for example, or something linked to the Dublin
>   Core... And what is the relation of RIS/RISX/RefDB to Bibliofile?

The nice thing about standards is that there are so many to choose from
(don't know whom to attribute this to). I'm still evaluating whether
MODS is suitable for the purposes of a reference manager/bibliography
tool. It was clearly not designed for this purpose but it probably gets
closest.

> Another issue: I'd like the "master" representation of my references
> to be an XML file rather than a data base because that allows me to
> choose and change the format as desired. To make this practical, it
> would really help if RefDB could optionally identify references by the
> citekey instead of by the numeric ID. This would allow me to store
> references from different sources in the same RefDB database, and to
> update some of them by re-importing manually maintained XML files.

I don't know what you mean by "choosing and changing the format"? Do
you want to move your data back and forth between different DTDs?
Otherwise XML is plain text that you can (with a few exceptions) format
any way you want, even if it originates from a database. In any case I
don't understand the advantage of maintaining your references outside
of a database. You can retrieve them from the database as XML files and
update them any time. You can even retrieve the full database as XML
periodically and check this into CVS. But the point of a reference
manager is to make the references accessible by simple queries. If you
maintain your data outside the database, you're back to grep, or you
overload your brain with things a database was designed to remember.

The ID and citekey are mostly interchangeable. Both must be unique in a
database. The ID is automatically created by the database engine and as
such comes for free. But you can of course update references by
providing only the citekey but no ID. The risx import routine always
checks the citekey first, then the ID.

regards,
Markus

--
Markus Hoenicka
mar...@ca...
(Spam-protected email: replace the quadrupeds with "mhoenicka")
http://www.mhoenicka.de
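Editorial sketch: the "citekey first, then ID" lookup described above can be pictured with a toy in-memory store in Python. This is not the refdbd import routine. ReferenceStore and its methods are hypothetical, and falling back to the ID only when no citekey matches is one reading of the description, not a confirmed detail.

```python
class ReferenceStore:
    """Toy in-memory stand-in for a reference database (hypothetical)."""

    def __init__(self):
        self._by_id = {}
        self._id_by_citekey = {}
        self._next_id = 1

    def add(self, citekey, data):
        """Store a new reference; the numeric ID is assigned automatically."""
        if citekey in self._id_by_citekey:
            raise ValueError("citekey already in use: " + citekey)
        ref_id = self._next_id
        self._next_id += 1
        self._by_id[ref_id] = {"citekey": citekey, **data}
        self._id_by_citekey[citekey] = ref_id
        return ref_id

    def update(self, data, citekey=None, ref_id=None):
        """Update a reference, matching on citekey first, then on ID."""
        target = None
        if citekey is not None:
            target = self._id_by_citekey.get(citekey)
        if target is None and ref_id is not None and ref_id in self._by_id:
            target = ref_id
        if target is None:
            raise KeyError("no matching reference")
        self._by_id[target].update(data)
        return target

    def get(self, ref_id):
        return dict(self._by_id[ref_id])


store = ReferenceStore()
first = store.add("Doe2001", {"title": "Old title"})
# Re-import of a hand-edited entry that carries only its citekey:
updated = store.update({"title": "New title"}, citekey="Doe2001")
print(updated, store.get(first)["title"])
```

With this lookup order, a manually maintained XML file never needs to know the numeric IDs, which is the workflow asked for earlier in the thread.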
From: Bruce D'A. <bd...@fa...> - 2003-11-24 22:14:22
On Monday, November 24, 2003, at 04:35 PM, Markus Hoenicka wrote:

>> Even better, one could convert the DTD to RELAX NG and use it with
>> nxml-mode which does XML validation.
>
> I'm reluctant to go down this path as currently few tools support
> RELAX NG. This is supposed to change eventually.

Well, both libxml and emacs now support Relax NG, so I'm not sure how
serious a limitation this is now.

>> - I think there should be a way to provide a <url> for a conference
>>   article that's been published in a proceedings volume. This belongs
>>   into the <part>, I think.
>
> Question to Bruce (he knows MODS better than me): can MODS do this?

Yes. MODS has a location element, with type attributes for "physical"
and "electronic."

In RISX, you represent this example like:

- part = conference article
- publication = conference proceedings

In MODS, it is:

conference article
  relatedItem "host" = conference proceedings

The location element can go in either level in MODS.

>> - It would be nice to provide an event address for conferences (which
>>   is usually different from the publisher's or organization's
>>   address).
>
> Bruce: can MODS do this?

I'm not sure exactly what he's wanting here. Maybe an example would
help? MODS does have a "place" element that might do this.

Bruce
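Editorial sketch: the part/publication versus relatedItem "host" mapping above can be made concrete by building a small MODS record in Python. This is an illustration under assumptions: the titles and URL are invented, the helper names are hypothetical, and the link is stored in a url child of location as in published MODS 3.x schemas rather than through the type attribute mentioned in the message.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"


def q(tag):
    """Qualify a tag name with the MODS namespace."""
    return "{%s}%s" % (MODS_NS, tag)


def add_title(parent, text):
    """Append <titleInfo><title>text</title></titleInfo> to parent."""
    title_info = ET.SubElement(parent, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = text


def mods_conference_article(article_title, proceedings_title, url):
    """Article at the top level, proceedings as its 'host' related item."""
    mods = ET.Element(q("mods"))
    add_title(mods, article_title)
    # The link to the article itself sits at the article level.
    location = ET.SubElement(mods, q("location"))
    ET.SubElement(location, q("url")).text = url
    host = ET.SubElement(mods, q("relatedItem"), {"type": "host"})
    add_title(host, proceedings_title)
    return mods


record = mods_conference_article(
    "An Example Talk",
    "Proceedings of an Example Conference",
    "http://example.org/talk.pdf",
)
print(ET.tostring(record, encoding="unicode"))
```

In risx terms, the top-level title corresponds to the part and the relatedItem to the publication; putting the location under the relatedItem instead would attach the link to the proceedings volume.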