Lou Burnard - 2014-01-05

The English-Polish FreeDict example above doesn't seem to me to set a particularly good precedent. If you were doing this project again, surely you'd use DCR ("data category registry", Kevin) pointers instead?


<tagUsage gi="pos">
<list n="values" type="bulleted">
<item xml:id="tag_N" ana="FreeDict_ontology.xml#f_pos_noun">N</item>
<item ana="FreeDict_ontology.xml#f_pos_noun">N Comp</item>
<item xml:id="tag_V" ana="FreeDict_ontology.xml#f_pos_verb">V</item>
<item ana="FreeDict_ontology.xml#f_pos_verb">V Mod</item>
<item ana="FreeDict_ontology.xml#f_pos_verb">V Phras</item>


Why do some items have xml:id values but not others?

Why is the list of type "bulleted" rather than (say) "valueList"?

The Guidelines suggest instead storing your POS values as <category> elements within a <taxonomy>. (Maybe we should add <category> to att.datcat.)
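
Something along these lines, say. This is only a rough sketch: the xml:id values and the <catDesc> wording are invented for illustration and are not taken from the FreeDict files, and I'm assuming the <classDecl> lives in the header of the same document as the entries.


<classDecl>
<taxonomy xml:id="pos">
<!-- one category per part-of-speech value; the identifiers are hypothetical -->
<category xml:id="pos_noun">
<catDesc>noun</catDesc>
</category>
<category xml:id="pos_verb">
<catDesc>verb</catDesc>
</category>
</taxonomy>
</classDecl>


An entry could then point at a category with something like <pos ana="#pos_noun">N</pos>, rather than pointing out to a separate ontology file.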

In any case, I can see the logic of defining these things here. So now we have three possibilities: document them with a <valList> inside your ODD; document them with a <taxonomy> in your corpus header; document them inside <tagUsage> inside your header. Oh, and then there's the possibility of that new other special element for <standoff> too...

Last edit: Kevin Hawkins 2014-01-05