1. The element <colloc> is defined as containing a "collocate", but this term isn't defined in the Guidelines or in widely available dictionaries.
2. <colloc type="prep"> is unclear: does this mean that the value of <colloc> goes before what's in <orth>, or that what's in <orth> goes before this collocate? According to people who know French better than me, the example is supposed to mean the latter, but I would not have guessed that from reading the markup. This example needs a bit of glossing.
3. Perhaps we should add to the element definition recommended values for @type, explaining that "prep" means that what's in <orth> goes before this collocate.
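For reference, the markup in question looks roughly like the sketch below. Only <orth>, <colloc>, and the type="prep" value come from the example under discussion; the surrounding <entry>, <form>, and <gramGrp> wrappers are my assumption about its context, not a verbatim copy of the Guidelines.

```xml
<entry>
  <form>
    <orth>médire</orth>
  </form>
  <gramGrp>
    <!-- type="prep" is the value whose meaning is unclear:
         nothing here says whether "de" precedes or follows "médire" -->
    <colloc type="prep">de</colloc>
  </gramGrp>
</entry>
```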
I agree that the explanation of uses for this element is somewhat deficient, not to say gnomic. I am also puzzled by how to interpret the @type=prep example, which I read quite differently from you: as indicating that "médire" is to be followed by the preposition "de", rather than, say, the preposition "à". Clearly we need an expert to tell us which is right, so I am assigning this ticket to Laurent!
yep, "prep" stands for preposition; "collocate" is a common term in corpus linguistics and lexicography, even if dictionaries don't list it :-)
But let me add that putting "prep" here like this is a bit strange for someone who is responsible for ISO DCR, Laurent... ;-)
At the Paris meeting in November 2011 we accepted this ticket and assigned it to Laurent. Given how much time has elapsed, I would implement it myself, but I'm still not clear how to do so.
I would like to see a brief gloss of "collocate" for those who don't work in corpus linguistics, and I would like to see (2) and (3) resolved.
I must admit this has escaped my radar, and it deserves a little thinking before we go any further. First, we need to make sure we agree on the actual notion we want to express. Would "any word or sequence of words that significantly occur in conjunction with the head word" do the trick? We also need to see whether we actually want to say something about @type: "preposition", "support verb", "argument", "qualia"? There are quite a few possibilities there, which we should want to leave open. The last issue may be to indicate frequency information... but this should come after the TEI conf. I think a contribution about this has been submitted.
For (1) in the original ticket, as a definition of collocate, your phrase sounds to me like a collocation, though I think of a collocation as not having any relation to a particular headword.
Anyway, I would reword your definition of collocate as "any sequence of words that co-occur with the headword with significant frequency".
For (2) and (3), I don't know enough about collocates to agree or disagree with Laurent's options.
Your rewording is fine!
Okay, (1) is now solved at revision 11848.
Still need to solve (2) and (3).
Looking at this for the first time, it strikes me that the <colloc> element doesn't seem to provide any way to encode how the collocation is formed -- for instance, does the word in the <colloc> element come before or after the headword? Wouldn't some kind of <phr>-based structure be better for showing the headword in the context of a collocation?
Example @types would be fine, but (as on many other occasions) I would suggest not recommending any specific values, because at that point the Guidelines would go beyond their aim, and fail. They aren't meant to be guidelines for dictionary writing, are they?
Martin, the ordering could be indicated by @type more or less directly, e.g. "object" or "_ of NP", etc. -- I suggest leaving that to the particular system (which every serious dictionary is).
That's right: while it's not explained in the element definition for <colloc> (perhaps an oversight that we should fix), in section 9.3.2 (#DITPGR) it gives an example using @type. Confusion over how to interpret this led to questions (2) and (3) in this ticket.
Per discussion at Council meeting, we agreed to remove @type on <colloc> from the example in section 9.3.2 and in the spec for <colloc> because it's misleading.
Will <colloc> be in att.typed, though? Or are you breaking backwards compatibility?
We didn't discuss explicitly, but I wasn't planning to change its class. Therefore people would be free to continue using @type on their own if they like, whether they used it to indicate a type of collocate, the part of speech of the content of the element, or something else.
Great, thanks. It would be bad to see @type go altogether.
Implemented Council decision at https://sourceforge.net/p/tei/code/12722/ .