
biblatex "hyphenation" and "language" fields

Nick Bart
2012-11-21
2012-12-13
  • Nick Bart

    Nick Bart - 2012-11-21

    biblatex2xml converts biblatex's "language" fields into "<language>foo</language>".
    In biblatex, however, the "language" field is supposed to contain a description of the work's content, such as "English and Latin". It's the "hyphenation" field that contains localization information used for hyphenation and possible case conversions of the bibliography entry.

    Since CSL expects localization information under "language", it seems that it's biblatex's "hyphenation" field that should be converted to MODS's "<language>foo</language>".

    What I'm not sure about, then, is what biblatex's "language" field should be converted to ... Ideas?

     

    Last edit: Nick Bart 2012-11-21
  • Nick Bart

    Nick Bart - 2012-12-12

    I think I figured this one out:

    biblatex’s hyphenation field (“The identifier must be a language name known to the babel package.”) should be converted to

    <language>
    <languageTerm type="code" authority="iso639-2b">eng</languageTerm>
    <languageTerm type="text">English</languageTerm>
    </language>
    

    or

    <language>
    <languageTerm type="code" authority="iso639-2b">fre</languageTerm>
    <languageTerm type="text">French</languageTerm>
    </language>
    

    etc.

    citeproc-hs could probably be made to do the right thing with the type="text" lines only.

    Just <language>foo</language>, however, does not validate against mods-3-4.xsd.

    This is important for pandoc/citeproc-hs to get conversion to title case for English in some styles and prevent this conversion for other languages.

    biblatex’s language field (“The language(s) of the work.”) should be converted literally (or by expanding biblatex’s localization keys [yes, it's that complicated]) to:

    <note type="language">English</note>
    

    or

    <note type="language">French</note>
    

    or

    <note type="language">Greek and English</note>
    

    etc.

    (see http://www.loc.gov/standards/mods/mods-notes.html and http://www.loc.gov/marc/bibliographic/bd546.html)

    To me, this seems less urgent than the hyphenation field.

     

    Last edit: Nick Bart 2012-12-12
  • Nick Bart

    Nick Bart - 2012-12-13

    OK, more research shows that the above was not quite right yet.

    Actually, MODS seems to have a well-documented way of specifying both the language(s) of the content and the language of the entry/record/metadata.

    From http://www.loc.gov/standards/mods/userguide/language.html:

    Element <language>. Definition: “A designation of the language in which the content of a resource is expressed.”

    So the content of biblatex’s language field, which describes the language(s) of the content (specified literally or as localization keys, and separated by “ and ”), should go into MODS’s <language> element after all.

    The <language> element should be repeated when the content is in more than one language:

    biblatex’s

    language = {English and French}
    

    should be converted to

    <language>
      <languageTerm type="text">English</languageTerm>
      <languageTerm type="code" authority="iso639-2b">eng</languageTerm>
    </language>
    <language>
      <languageTerm type="text">French</languageTerm>
      <languageTerm type="code" authority="iso639-2b">fre</languageTerm>
    </language>
    

    (with either code, text or both).
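
    A minimal sketch of this splitting in Python, not biblatex2xml's actual code. The function name and the small lookup table are hypothetical, the MODS namespace is left out for brevity, and a real converter would need a fuller ISO 639-2/B list and would have to expand biblatex's localization keys:

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: language name -> ISO 639-2/B code.
# Only a few entries are shown for illustration.
ISO639_2B = {
    "english": "eng",
    "french": "fre",
    "german": "ger",
    "greek": "gre",
    "latin": "lat",
}

def language_field_to_mods(value):
    """Build one MODS <language> element per item of a biblatex language list."""
    elements = []
    # biblatex separates list items with the literal keyword " and ".
    for name in value.split(" and "):
        name = name.strip()
        if not name:
            continue
        lang = ET.Element("language")
        text = ET.SubElement(lang, "languageTerm", {"type": "text"})
        text.text = name
        # Add a code term only when the name is in the table.
        code = ISO639_2B.get(name.lower())
        if code is not None:
            term = ET.SubElement(
                lang, "languageTerm", {"type": "code", "authority": "iso639-2b"}
            )
            term.text = code
        elements.append(lang)
    return elements

if __name__ == "__main__":
    for element in language_field_to_mods("English and French"):
        print(ET.tostring(element, encoding="unicode"))
```

    Names missing from the table still get a type="text" term, which matches the "either code, text or both" option.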

    Also from http://www.loc.gov/standards/mods/userguide/language.html:

    “The subelement <languageOfCataloging> under <recordInfo> is used to give the language of the metadata in the record as a whole. It designates the language of the metadata record, while this <language> element designates the language of the resource.”

    biblatex's hyphenation field (biblatex manual: “The language of the bibliography entry”) contains exactly one language name, in textual form, which must be known to the babel package. It should therefore go to MODS's <languageOfCataloging> inside <recordInfo>, like this:

    <recordInfo>
    <languageOfCataloging>
    <languageTerm type="code" authority="iso639-2b">fre</languageTerm>
    <languageTerm type="text">French</languageTerm>
    </languageOfCataloging>
    </recordInfo>
    

    (with either code, text or both). A simple solution would be to just copy the content of the biblatex hyphenation field, as text.
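
    A rough sketch of that mapping in Python, again not biblatex2xml's actual code. The function name and the table of babel names are hypothetical and far from complete, and the MODS namespace is omitted. Unknown names fall back to the "simple solution" of copying the field content as text:

```python
import xml.etree.ElementTree as ET

# Hypothetical table: babel language name -> (ISO 639-2/B code, English name).
BABEL_TO_ISO = {
    "english": ("eng", "English"),
    "american": ("eng", "English"),
    "british": ("eng", "English"),
    "french": ("fre", "French"),
    "german": ("ger", "German"),
    "ngerman": ("ger", "German"),
}

def hyphenation_to_record_info(hyphenation):
    """Wrap a biblatex hyphenation value in <recordInfo><languageOfCataloging>."""
    record_info = ET.Element("recordInfo")
    cataloging = ET.SubElement(record_info, "languageOfCataloging")
    key = hyphenation.strip().lower()
    if key in BABEL_TO_ISO:
        code, name = BABEL_TO_ISO[key]
        term = ET.SubElement(
            cataloging, "languageTerm", {"type": "code", "authority": "iso639-2b"}
        )
        term.text = code
    else:
        # Fallback: copy the field content unchanged, as text only.
        name = hyphenation.strip()
    text = ET.SubElement(cataloging, "languageTerm", {"type": "text"})
    text.text = name
    return record_info

if __name__ == "__main__":
    print(ET.tostring(hyphenation_to_record_info("french"), encoding="unicode"))
```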

    The next challenge is to convince citeproc-hs to accept this as input. It will then need to put <languageOfCataloging> into CSL’s “language” field (to get the format of the bibliography entry right) and, for lack of anything better, probably add <language> to the note field. But that's a different story ...
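
    To illustrate the reader side, here is a small Python sketch of the mapping. It is not citeproc-hs code and does not reflect its API. The function name, the plain dict standing in for CSL fields, and the "Language: " note prefix are all assumptions, and the input is assumed to be a namespace-free <mods> record:

```python
import xml.etree.ElementTree as ET

def mods_languages_to_csl(mods_xml):
    """Map the two MODS language elements onto CSL-like 'language' and 'note' fields."""
    root = ET.fromstring(mods_xml)
    fields = {}
    # Language of the entry: <recordInfo><languageOfCataloging>, passed through as is.
    cataloging = root.find(
        "recordInfo/languageOfCataloging/languageTerm[@type='text']"
    )
    if cataloging is not None and cataloging.text:
        fields["language"] = cataloging.text
    # Language(s) of the content: top-level <language> elements, joined into a note.
    content = [
        term.text
        for term in root.findall("language/languageTerm[@type='text']")
        if term.text
    ]
    if content:
        fields["note"] = "Language: " + " and ".join(content)
    return fields

SAMPLE = """
<mods>
  <language><languageTerm type="text">English</languageTerm></language>
  <language><languageTerm type="text">French</languageTerm></language>
  <recordInfo>
    <languageOfCataloging>
      <languageTerm type="text">French</languageTerm>
    </languageOfCataloging>
  </recordInfo>
</mods>
"""

if __name__ == "__main__":
    print(mods_languages_to_csl(SAMPLE))
```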

     

    Last edit: Nick Bart 2013-01-11
