
#1 XML output as an option

open
nobody
5
2009-07-16
2009-07-16
Lou Burnard
No

It would be very nice to get output from the tagger in pure XML format, rather than having to post-process it.
I would suggest using something like the TEI or EAGLES recommendation for the specific XML format, but that is not so important.
Something like
<w pos="foo" lemma="bar">foobar</w>
or
<w><pos>foo</pos><lemma>bar</lemma><form>foobar</form></w>

would be really useful. Input for taggers is increasingly going to come in XML, so it would be useful to generate output in XML too.
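
To make the two shapes concrete, here is a minimal Python sketch that builds each of them with the standard library's xml.etree.ElementTree. The function names and the (form, pos, lemma) arguments are made up for illustration; they are not part of the tagger.

```python
import xml.etree.ElementTree as ET


def word_as_attributes(form, pos, lemma):
    """Build the attribute-style element: <w pos=".." lemma="..">form</w>."""
    w = ET.Element("w", {"pos": pos, "lemma": lemma})
    w.text = form
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(w, encoding="unicode")


def word_as_elements(form, pos, lemma):
    """Build the element-style variant with <pos>, <lemma> and <form> children."""
    w = ET.Element("w")
    for tag, value in (("pos", pos), ("lemma", lemma), ("form", form)):
        ET.SubElement(w, tag).text = value
    return ET.tostring(w, encoding="unicode")


if __name__ == "__main__":
    print(word_as_attributes("foobar", "foo", "bar"))
    print(word_as_elements("foobar", "foo", "bar"))
```

The attribute form is more compact, while the element form leaves room for further annotation per word. Either way the serializer takes care of escaping, so the result can be parsed back without post-processing.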

You also need to be sure that you don't generate invalid XML in the content, e.g. by escaping ampersands and angle brackets.
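
As a rough sketch of the escaping step, assuming the tagger currently emits one tab-separated "form, pos, lemma" line per token (that layout is an assumption, as is the function name), the Python standard library's xml.sax.saxutils can do the work:

```python
from xml.sax.saxutils import escape, quoteattr


def tagged_line_to_xml(line):
    """Convert one 'form<TAB>pos<TAB>lemma' line into a <w> element.

    The three-column input layout is assumed for illustration only.
    """
    form, pos, lemma = line.rstrip("\n").split("\t")
    # quoteattr() escapes the value and adds the surrounding quotes;
    # escape() handles &, < and > in element content.
    return "<w pos={} lemma={}>{}</w>".format(
        quoteattr(pos), quoteattr(lemma), escape(form)
    )


if __name__ == "__main__":
    print(tagged_line_to_xml("AT&T\tNP\tAT&T"))
    # prints: <w pos="NP" lemma="AT&amp;T">AT&amp;T</w>
```

Attribute values need the same care as element content, which is why the lemma is escaped as well as the form.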

