From: Matthieu D. <mat...@en...> - 2012-08-30 14:18:15
Hi,

The CWB and TreeTagger corpus formats (XML + tabulated words) are not yet supported in TXM. The next release will support them, but for now I suggest you use the XML/w or the XML-TXM format. The main idea is to replace each word line with a "w" element. For instance:

* With the XML/w import module:

    word1 prop11 prop12

  gives

    <w p1="prop11" p2="prop12">word1</w>

* With the XML-TXM import module:

    word1 prop11 prop12

  gives

    <w id="1">
      <txm:form>word1</txm:form>
      <txm:ana type="p1" resp="SM">prop11</txm:ana>
      <txm:ana type="p2" resp="SM">prop12</txm:ana>
    </w>

More information about the XML-TXM format is available here:
http://sourceforge.net/apps/mediawiki/txm/index.php?title=Xml-txm-tei

If you need help converting your corpus, we can help you.

Textometry Team

On 30/08/2012 16:03, Saifollah Mollaei wrote:
> Hi
>
> How can I import a corpus encoded by cwb-encode tool into TXM?
> Or, how can I encode a tagged corpus (e.g. by TreeTagger) using TXM?
>
> Regards

--
Matthieu Decorde, mat...@en...
http://textometrie.ens-lyon.fr - Engineer - Equipex Matrice
ENS de Lyon/CNRS - ICAR UMR5191, Institut de Linguistique Française
15, parvis René Descartes 69342 Lyon BP7000 Cedex, tel. +33(0)43737
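
The XML/w conversion described in this message (one tabulated word line becomes one "w" element) could be scripted along the following lines. This is only a rough sketch and not a TXM tool: the function names `word_line_to_w` and `convert_lines` are hypothetical, the attribute names `p1` and `p2` simply mirror the example in the message, and it assumes that columns are tab-separated, that word forms are raw (not already XML-escaped) text, and that any line starting with "<" is a structural tag to be copied unchanged.

```python
from xml.sax.saxutils import escape, quoteattr


def word_line_to_w(line, prop_names=("p1", "p2")):
    """Turn one tab-separated word line into an XML/w style <w> element.

    Hypothetical helper: the first column is taken as the word form and the
    remaining columns are mapped, in order, onto the names in prop_names.
    """
    form, *props = line.rstrip("\n").split("\t")
    attrs = "".join(
        f" {name}={quoteattr(value)}" for name, value in zip(prop_names, props)
    )
    return f"<w{attrs}>{escape(form)}</w>"


def convert_lines(lines, prop_names=("p1", "p2")):
    """Convert word lines; copy structural tags and blank lines unchanged.

    Assumption: only structural XML lines start with "<".
    """
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("<"):
            yield stripped
        else:
            yield word_line_to_w(line, prop_names)


if __name__ == "__main__":
    sample = ["<s>", "word1\tprop11\tprop12", "</s>"]
    print("\n".join(convert_lines(sample)))
```

The output would still need to be wrapped in a single root element to be well-formed XML, and the attribute names should be adapted to whatever the columns of the corpus actually contain (for TreeTagger output, typically a part-of-speech tag and a lemma).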