#500 Allow note within sourceDesc

Milestone: AMBER
Status: closed-rejected
Owner: Fabio Ciotti
Labels: None
Priority: 5(default)
Updated: 2014-07-01
Created: 2014-03-08
Creator: Lou Burnard
Private: No

The fact that sourceDesc is mandatory within the header, even when there is no source because the header is describing a born digital object, has long been one of those charming eccentricities we have learned to cope with. However, the fact that the compulsory comment "there is no source" has to be supplied either within a <p> or an <ab> is annoying, since those elements are often used for other purposes within a document. I would like to change the content model of sourceDesc from its current value (effectively, one or more model.biblLike or model.pLike) to include as a third option one or more model.noteLike, so that the compulsory comment could be given within a <note>. One possible alternative might be to permit an empty sourceDesc with the specific meaning of "no source" but that seems a bit too radical and would probably confuse many existing processes.
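
The difference between the current and the proposed content model can be illustrated with a rough Python sketch. It is not the TEI schema itself: the class memberships below are small illustrative subsets chosen for the example, the function name `source_desc_is_valid` is hypothetical, and the TEI namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Illustrative subsets only (assumption): the real TEI classes have more members.
MODEL_BIBL_LIKE = {"bibl", "biblStruct", "biblFull", "listBibl", "msDesc"}
MODEL_P_LIKE = {"p", "ab"}
MODEL_NOTE_LIKE = {"note"}


def source_desc_is_valid(xml_text, allow_note_like=False):
    """Check a <sourceDesc> against a simplified 'one or more of one class' model.

    With allow_note_like=True the proposed third alternative
    (one or more model.noteLike) is accepted as well.
    """
    elem = ET.fromstring(xml_text)
    children = list(elem)
    if not children:
        # An empty <sourceDesc/> satisfies none of the alternatives.
        return False
    # Element-only content: no stray text directly inside <sourceDesc>.
    loose_text = [elem.text or ""] + [child.tail or "" for child in children]
    if any(chunk.strip() for chunk in loose_text):
        return False
    alternatives = [MODEL_BIBL_LIKE, MODEL_P_LIKE]
    if allow_note_like:
        alternatives.append(MODEL_NOTE_LIKE)
    tags = [child.tag for child in children]
    return any(all(tag in group for tag in tags) for group in alternatives)


with_p = "<sourceDesc><p>Born digital</p></sourceDesc>"
with_note = "<sourceDesc><note>No source: born digital</note></sourceDesc>"

print(source_desc_is_valid(with_p))                           # current model
print(source_desc_is_valid(with_note))                        # current model
print(source_desc_is_valid(with_note, allow_note_like=True))  # proposed model
```

Under these assumptions the `<note>` form is rejected by the current model and accepted by the proposed one, while an empty `<sourceDesc/>` is rejected by both.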

Discussion

  • "...is annoying, since those elements are often used for other purposes within a document". sorry, I don't get this? why does having to use <p> inside sourceDesc affect what you do elsewhere?

    personally, I'd rather have a model where it was text() |model.pLike | model.biblLike. putting the comment inside a <note> seems as redundant as wrapping it in a <p>

     
  • Lou Burnard
    2014-03-08

    You might well want to enforce tighter constraints on the <p>s in the body of a text than on those in its header. For example, your ODD might define it as containing just s+. This seems less probable for <note> ; and if it did occur, you could use the @type attribute (not available on <p>) to distinguish in-text notes from others.

    I didn't suggest making sourceDesc mixed-content, because that would have automatically made it possible for it to be empty, which seems (as I said) a stretch too far.

    Seems fairly obvious to me that a remark like "there isn't one" inside a <sourceDesc> is more like an annotation than some content!

     
  • I am glad to see you agreeing that <p> should have @type....

    Re mixed content, why is <sourceDesc/> any worse than <sourceDesc><p/></sourceDesc> or
    <sourceDesc><p><note/></p></sourceDesc>, both of which are perfectly legal? Allowing for an empty sourceDesc seems really rather sensible to me.

    I agree that "there isn't one" is marginally better inside a <note> than a <p>, but only in a theoretical world which one seldom encounters outside the works of Stanislaw Lem.

     
    Last edit: Sebastian Rahtz 2014-03-08
  • Lou Burnard
    2014-03-08

    Because it is not something we currently allow. The fact that we currently allow some silly things is not relevant. Nor are the works of S. Lem.

     
  • eh? but we do allow an empty description, here as in many similar places. the prose is quite forgiving.

     
  • Lou Burnard
    2014-03-08

    My reading of the content model is that <sourceDesc/> is invalid. And Mr oXyGen agrees with me.

     
  • sorry, just meant that an empty <p> has the same effect as an empty <sourceDesc>

     
  • Martin Holmes
    2014-03-08

    I don't like the idea of an empty <sourceDesc> because it's not really clear what it means -- there's no source, or the encoder is lazy? I know the same issue is there with <sourceDesc><p/></sourceDesc>, although we could do something with Schematron to prevent that.

    Lou's argument seems to be that <p> is inconvenient because it's likely to be constrained in the schema for reasons to do with its use in <text>. But it's difficult to imagine that <p> would be constrained in such a way as to disallow plain text content, isn't it? What is a <p> if it has no text in it?

     
    Last edit: Lou Burnard 2014-03-08
  • Lou Burnard
    2014-03-08

    I gave my use case for wanting to avoid <p> within <sourceDesc> at the start of this thread. Here it is again. It is not an imaginary one: it is one I have met several times in real life. Suppose you are developing a corpus which you are carefully segmenting end-to-end using <s> elements. You want to ensure that all the text in your corpus is properly segmented and that there is no stray text floating around outside the <s> elements, which are also grouped into <p> elements. So you want to define p as containing s+ -- and you do NOT want to permit mixed content. That's perfectly kosher and TEI conformant, since you are subsetting the default definition. But the <p>s in your header don't follow this rule. Hence my desire to avoid using <p> in the header, and hence my suggestion of using <note> instead. Isn't <note> supposed to be global anyway?

     
  • If you cut down the content model of <p> generally to s+, you'll also potentially screw up all of abstract, application, availability, cRefPattern, calendar, change, correction, editionStmt, editorialDecl, encodingDesc, handNote, hyphenation, interpretation, licence, normalization, prefixDef, projectDesc, publicationStmt, quotation, refsDecl, samplingDecl, scriptNote, segmentation, seriesStmt, stdVals, styleDefDecl and typeNote in the header. If you want to apply that check to your day-to-day <p>, that's what god gave you schematron for. Don't get me wrong, I have no special issue with <note> inside <sourceDesc>, but it looks like a symptom of a more general possible problem, viz. the use of <p> throughout the header.

    Sure, allow <note> anywhere <p> can occur in the header.

     
  • Lou Burnard
    2014-03-09

    Actually, avoiding p in the header is not as difficult as it
    looks. The elements you list fall into four groups:

    a) model.pLike is only provided as an alternative to more precise
    element content. Recommendation: use more precise element
    content. (editorialDecl, encodingDesc, publicationStmt, editionStmt,
    seriesStmt)

    b) model.pLike is possible because the content is
    macro.specialPara. But this offers many alternatives to using p, not
    least plain text, list, note etc. (change, handNote,
    licence, scriptNote, typeNote)

    c) model.pLike is provided as optional content, but the element's real
    work is done by its attributes with the content repeating or glossing
    their meaning. Recommendation: leave them empty. But why not give them
    macro.specialPara? (calendar, cRefPattern, prefixDef, refsDecl, stdVals,
    styleDefDecl)

    d) model.pLike is actually useful because the element contains
    (potentially) extended documentation of some aspect or other of the
    encoding. However, there's no good reason that I can see why
    macro.specialPara wouldn't work just as well. (correction,
    hyphenation, interpretation, normalization, projectDesc, quotation,
    samplingDecl, segmentation)

     
  • this sounds like a productive line of thinking. normalize all the p-like usage in the header to follow a common pattern.

     
  • James Cummings
    2014-03-10

    tidying up the way paragraphs are used in the header makes sense. I've nothing against the addition of model.noteLike in sourceDesc (and the other places model.pLike is given as an alternative to a more structured form).

    I find the initial argument for the use-case unconvincing though... putting a <p>Born digital</p> is hardly that strenuous. I also worry that this might further dilute our concept of note. Why is note better than, say, ab?

     
  • Lou Burnard
    2014-03-10

    It's not a question of strenuousness, obvs. But yes, a policy of saying "wherever it says model.pLike I will use <ab> in the header and not <p>" would sort of work -- but at the price of making <ab> unusable for other non-documentary reasons in the body, which is what I am trying to avoid.

    How does it dilute the use of <note> to use it for a note about the content of an element (to say that it's not there/inapplicable)? Surely it's more of a dilution to use <p> or <ab> to provide this sort of meta comment -- the source of the document is not in fact the phrase "born digital"!

     
  • Martin Holmes
    2014-03-10

    I think Sebastian is absolutely right: the issue here is the requirement to constrain <p> to having only <s> in specific contexts, and that's best done with Schematron. So the use-case is backwards: there's no problem with the header assuming the idiosyncratic constraints in <text> are done with Schematron.

    Having said that, I also have no objection to <note> anywhere <p> can appear. I generally like to see more stuff available in more places. :-)

     
    Last edit: Martin Holmes 2014-03-10
  • Lou Burnard
    2014-03-10

    I am not so sure. If it were a matter of constraining <p> within <front> or some special kind of <div> I'd have no hesitation about saying schematron is the way to go. But I do think there's a plausible case for regarding the constituent <p>s of a TEI Header as ontologically distinct from those of a <text>. The late Jean Veronis made the same argument back in the nineties, at which time I pooh-poohed it, but now I am changing my mind!

    And I do think we ought to do better than <p>born digital</p>

     
    Last edit: Lou Burnard 2014-03-10
  • I think you're arguing yourself, Lou, into wanting a new p-like element entirely for the header, which isn't any of the usual suspects, as you want to be able to constrain them in the <text>. Who's to say the next person won't want to constrain <note> as you wish to control <p>?

    if we used XSD schema this would all be easy...

     
  • Lou Burnard
    2014-03-10

    Well, yes indeed JV's proposal was for a distinct header-P and that does make some sort of sense. However, in the case of <sourceDesc>, I am proposing that <note> would do all you might want such an element for, and no-one seems to be disagreeing with that.

    It would also be easy if we had a full(er) implementation of pure ODD!

     
  • James Cummings
    2014-03-10

    We did have a feature request raised once that asked for a <para> in the header instead of <p>. This would work, of course, if we weren't using the class system to provide the same content for element X in the header as we do for it inside <text>. But yes... if we can add say <para> to model.pLike and add something general that removes <p> from model.pLike when it has ancestor::teiHeader or a schematron warning, then I'd go for that.

     
  • James Cummings
    2014-05-19

    • assigned_to: Fabio Ciotti
     
  • James Cummings
    2014-05-19

    Assigning to Fabio Ciotti to summarise and report to Council on options.

     
  • Martin Holmes
    2014-07-01

    • status: open --> closed-rejected
     
  • Martin Holmes
    2014-07-01

    Rejected by Council 2014-07-01.
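
For readers weighing the two positions in the thread, the "constrain <p> to s+ only inside <text>" idea can be sketched roughly in Python. This stands in for the Schematron rule mentioned in the discussion and is not an actual Schematron or ODD implementation: the sample document, the function names and the `scope` parameter are all invented for illustration, and the TEI namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document: a header <p> plus two body paragraphs.
SAMPLE = """<TEI>
  <teiHeader>
    <fileDesc>
      <sourceDesc><p>Born digital</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p><s>First sentence.</s> <s>Second sentence.</s></p>
      <p><s>Segmented.</s> stray text</p>
    </body>
  </text>
</TEI>"""


def is_fully_segmented(p):
    """True if <p> holds one or more <s> children and no text outside them."""
    children = list(p)
    if not children or any(child.tag != "s" for child in children):
        return False
    loose_text = [p.text or ""] + [child.tail or "" for child in children]
    return all(not chunk.strip() for chunk in loose_text)


def unsegmented_paragraphs(root, scope=None):
    """List the text of each <p> that breaks the s+ rule.

    scope=None checks the whole document; scope="text" checks only
    paragraphs below the <text> child of the root element.
    """
    start = root if scope is None else root.find(scope)
    if start is None:
        return []
    return ["".join(p.itertext()) for p in start.iter("p")
            if not is_fully_segmented(p)]


root = ET.fromstring(SAMPLE)
print(unsegmented_paragraphs(root))                # whole document
print(unsegmented_paragraphs(root, scope="text"))  # <text> only
```

Checked over the whole document, the header's `<p>Born digital</p>` is reported alongside the body paragraph with stray text, which is the clash described in the use case. Restricted to `<text>`, only the body paragraph is reported, which is the effect a context-specific Schematron rule would aim for.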