Text Encoding Initiative / Feature Requests / #556 Allow <hi> to be contained by <m>

Martin Holmes - 2015-05-27

Would <seg> not be better than <hi> for paleographic annotation?

Amir Zeldes - 2015-05-27

Are you suggesting to put a <seg> in the <m> and then do <hi>? Like this?

<w>walk<m><seg><hi rend="bold">ing</hi></seg></m></w>

This would be possible, but it raises the question why we need <seg> within <m> but not within <w>. If a word or part of a word can be highlighted, shouldn't a morpheme or part of a morpheme be eligible for highlighting as well? The suggestion is to allow this, which seems intuitive to me:

<w>walk<m><hi rend="bold">ing</hi></m></w>

We use <m> rather than <seg> for principled reasons (-ing is a morpheme), and having both seems redundant (though it does validate).
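To make the comparison concrete, here is a minimal sketch, using only Python's standard library, that parses the two encodings quoted above and shows that they carry the same text and differ only in the extra <seg> layer. The helper names `plain_text` and `path_to_hi` are made up for this illustration; nothing here validates against the TEI schema.

```python
import xml.etree.ElementTree as ET

# The two encodings quoted in the comment above.
WITH_SEG = '<w>walk<m><seg><hi rend="bold">ing</hi></seg></m></w>'
DIRECT = '<w>walk<m><hi rend="bold">ing</hi></m></w>'


def plain_text(xml):
    """Return the character data of a fragment with all markup removed."""
    return "".join(ET.fromstring(xml).itertext())


def path_to_hi(xml):
    """Return the element names from the root down to the first <hi>."""

    def walk(node, trail):
        trail = trail + [node.tag]
        if node.tag == "hi":
            return trail
        for child in node:
            found = walk(child, trail)
            if found:
                return found
        return []

    return walk(ET.fromstring(xml), [])


if __name__ == "__main__":
    for label, xml in (("with <seg>", WITH_SEG), ("direct", DIRECT)):
        print(label, plain_text(xml), " > ".join(path_to_hi(xml)))
```

Both fragments are well-formed and yield the same text; whether each one validates depends on the schema's content model for <m>, which is what this request is about.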

Martin Holmes - 2015-05-27

It's not very elegant, I agree. I don't think I've seen a project where a morphological hierarchy (<m>s etc.) is mixed up with a typographic transcriptional hierarchy (bold, italic, all that).

But remember you can put @rend on <seg> directly:

<w>walk<m><seg rend="bold">ing</seg></m></w>

and you could distinguish this use of <seg> from linguistic annotation using its @type attribute.
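As a rough sketch of how that distinction could be used downstream, the following standard-library Python picks out only the <seg> elements carrying a given @type. The value "typographic" is an assumption invented for this example, not an established TEI or EpiDoc convention, and `typographic_segs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: "typographic" is a made-up @type value for this illustration.
SAMPLE = '<w>walk<m><seg type="typographic" rend="bold">ing</seg></m></w>'


def typographic_segs(xml, seg_type="typographic"):
    """Return (rend, text) pairs for every <seg> whose @type matches seg_type."""
    root = ET.fromstring(xml)
    return [
        (seg.get("rend"), "".join(seg.itertext()))
        for seg in root.iter("seg")
        if seg.get("type") == seg_type
    ]


if __name__ == "__main__":
    print(typographic_segs(SAMPLE))
```

A <seg> used for linguistic annotation would carry a different @type (or none) and would be skipped by this filter.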

Caroline T. Schroeder - 2015-05-28

I appreciate your pointing us to existing ways to use the TEI tagset.

We are trying to be as synchronous as possible with the usage practices for the EpiDoc subset of the TEI. Our corpus is in Coptic, a language that is really structured by morphemes not words. The only other existing TEI corpus in Coptic is the papyri.info set of Coptic documentary papyri and ostraca. The guidelines and usage practices for EpiDoc are to use <hi> for the purposes we are describing (subscript, superscript, color, etc.). For reference:
http://www.stoa.org/epidoc/gl/latest/trans-charactershighlighted.html
http://www.stoa.org/epidoc/gl/latest/trans-raisedlowered.html
http://www.stoa.org/epidoc/gl/latest/trans-tallorsmall.html

To be as interoperable as possible with an existing usage, we are requesting the ability to use <hi> within <m> in the same way that the guidelines dictate <hi> usage within <w>. Our other choices are to follow your suggestion and use <seg> throughout our entire corpus, which would not be compatible with the papyri.info corpus; or to use <hi> whenever we don't have an annotation and <seg> whenever we do, but that seems a little strange and internally inconsistent. And again, it is not interoperable with papyri.info.

If the TEI agrees to the change, we will then petition EpiDoc for this change as well. We are taking this approach because we think allowing <hi> to be nested inside <m> is a less significant change than asking EpiDoc to change their <seg> and <hi> usages and Guidelines.

Thanks for the consideration.

- BODARD Gabriel - 2015-05-28
  
  In haste: if the TEI agrees to this change (which I would be in favour of, and can discuss further later if needed) then the EpiDoc schema will inherit it from the TEI schema, so no separate petitioning will be needed!
  
  (Indeed, if the TEI accept this change soon, then the EpiDoc ODD might implement it before it is available in the TEI schema, on the understanding that it will become canonical within a few months.)
  
Martin Holmes - 2015-05-28

I think I should hand this over to EpiDoc experts at this point -- I'll ask Hugh and Gabby to take a look. Thanks for your patience!

Martin Holmes - 2015-05-28

The potential issues I would expect to be raised by Council would be along the lines of: does this open a door to a huge cascade of requests for similar inline-level elements (<emph>, <soCalled>, <mentioned>, etc.) inside <m>? To which a possible reply would be that we already allow a huge variety of non-linguistic stuff (linebreaks, pagebreaks, forme works, <space>) inside <m>, so how is this different?

- BODARD Gabriel - 2015-05-28
  
  I agree that the precedent you mention above is already set by the inclusion of various inline elements within <m>, but perhaps even more so by the inclusion of many more inline elements within <w>. Surely a morpheme should be no more restricted in what it can contain than a word, given that they are parallel and in some languages almost equivalent concepts. In any case, anything that can appear inside a word can, surely, by definition also appear inside a morpheme, given that if you're marking morphemes then all words are entirely made up of morphemes?
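
The argument can be illustrated with a toy content-model check in standard-library Python. The two tables below are deliberately tiny, hypothetical stand-ins, not the real TEI definitions of <w> and <m>; they only encode what this thread states (<w> may contain <hi>, and <m> may already contain <seg> and line breaks), and differ in whether <m> may contain <hi>.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified content models: NOT the real TEI schema.
ALLOWED_BEFORE = {"w": {"m", "hi", "seg", "lb"}, "m": {"seg", "lb"}}
ALLOWED_AFTER = {"w": {"m", "hi", "seg", "lb"}, "m": {"hi", "seg", "lb"}}

FRAGMENT = '<w>walk<m><hi rend="bold">ing</hi></m></w>'


def violations(xml, allowed):
    """List (parent, child) pairs that the toy model does not permit."""
    bad = []
    for parent in ET.fromstring(xml).iter():
        permitted = allowed.get(parent.tag)
        if permitted is None:
            continue  # element not covered by the toy model
        for child in parent:
            if child.tag not in permitted:
                bad.append((parent.tag, child.tag))
    return bad


if __name__ == "__main__":
    print("before:", violations(FRAGMENT, ALLOWED_BEFORE))
    print("after: ", violations(FRAGMENT, ALLOWED_AFTER))
```

Under the "before" table the fragment is rejected only because of the <m>/<hi> pair; once <m> accepts what <w> already accepts, the same fragment passes unchanged.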
  
Caroline T. Schroeder - 2015-05-28

Gabby, you stated it better than I did. Thank you. Thanks for everyone's careful consideration.

Hugh A. Cayless - 2015-05-28

Will implement.

- BODARD Gabriel - 2015-05-28
  
  Hugh, if you're going to implement this in the TEI ODD, should we also sneak it into the EpiDoc ODD this week? (Happy to do that, with documentation, if you like.)
  
Hugh A. Cayless - 2015-05-28

assigned_to: Hugh A. Cayless
- Caroline T. Schroeder - 2015-05-28
  
  Thanks, Hugh!
  
Amir Zeldes - 2015-05-28

Just chiming in to say this all makes sense to me, and I prefer it to <seg> since then we're staying consistent with everything else where <m> behaves much like <w>

Caroline T. Schroeder - 2015-06-25

Many thanks for everyone's deliberation. Does this discussion mean that this request has been approved officially?

- Hugh A. Cayless - 2015-06-25
  
  It has. I will be implementing it soon.
  
  [feature-requests:#556]
  http://sourceforge.net/p/tei/feature-requests/556 Allow <hi> to be
  contained by <m>
  
  Status: open
  Group: AMBER
  Created: Wed May 27, 2015 04:23 AM UTC by Caroline T. Schroeder
  Last Updated: Thu May 28, 2015 04:04 PM UTC
  Owner: Hugh A. Cayless
  
  Request to change the TEI to allow <m> to contain <hi>. This is
  essential for allowing paleographical annotations of letters in languages
  whose linguistic elements are segmented on the morpheme (<m>) level,
  below the word (<w>) level. Currently, since <m> does not contain <hi>,
  any paleographic annotation using hi@rend on a character or sequence of
  characters within a morpheme (<m>) does not validate.
  
Caroline T. Schroeder - 2015-06-25

Great news. Thank you!

- BODARD Gabriel - 2015-06-26
  
  Just to note, Carrie, that this has already been implemented in the latest EpiDoc release (try validating against http://www.stoa.org/epidoc/schema/latest/tei-epidoc.rng and see if it works for you), in anticipation of forthcoming TEI compliance...
  
Caroline T. Schroeder - 2015-06-26

Thank you! It seems to be validating. Much appreciated.

Hugh A. Cayless - 2015-06-27

status: open --> closed-fixed

Hugh A. Cayless - 2015-06-27

Done in r13281.
