<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to Home</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>Recent changes to Home</description><atom:link href="https://sourceforge.net/p/wn-toolkit/wiki/Home/feed" rel="self"/><language>en</language><lastBuildDate>Tue, 28 Jan 2014 07:40:54 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/wn-toolkit/wiki/Home/feed" rel="self" type="application/rss+xml"/><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v28
+++ v29
@@ -10,7 +10,7 @@

 [1]:https://www.researchgate.net/publication/259874613_WN-Toolkit_Automatic_generation_of_WordNets_following_the_expand_model?ev=prf_pub

-You can also take a look tot the [presentation][2].
+You can also take a look at the [presentation][2].
 [2]:https://docs.google.com/presentation/d/1hh2b9ttdTkIi6Xl7UaEf4dPHHf8VeCOeVqcEFI9DgpI/pub?start=false&amp;amp;loop=false&amp;amp;delayms=3000#slide=id.g2a2c33c54_046

 #Pre-requisites
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Tue, 28 Jan 2014 07:40:54 -0000</pubDate><guid>https://sourceforge.net32eec32976cfc862f10673bad59b11b71fd09d30</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v27
+++ v28
@@ -9,6 +9,9 @@
 Oliver, A. (2014) *WN-Toolkit: Automatic generation of WordNets following the expand model* Proceedings of the 7th International Global WordNet Conference. January 25-29, 2014. Tartu. Estonia. pp. 7-15. ISBN: 978-9949-32-492-7 [pdf][1]

 [1]:https://www.researchgate.net/publication/259874613_WN-Toolkit_Automatic_generation_of_WordNets_following_the_expand_model?ev=prf_pub
+
+You can also take a look tot the [presentation][2].
+[2]:https://docs.google.com/presentation/d/1hh2b9ttdTkIi6Xl7UaEf4dPHHf8VeCOeVqcEFI9DgpI/pub?start=false&amp;amp;loop=false&amp;amp;delayms=3000#slide=id.g2a2c33c54_046

 #Pre-requisites

&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 21:55:34 -0000</pubDate><guid>https://sourceforge.net40844d2a3feafbea723aae2f3d929ebb366f0231</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v26
+++ v27
@@ -4,7 +4,7 @@

 #Introduction

-The WN-Toolkit is a set of programs for the creation of WordNets following the expand model. The toolkit is distributes under a free software license. If you use this toolkit in your work, please cite:
+The WN-Toolkit is a set of programs for the creation of WordNets following the expand model. The toolkit is distributed under a free software license. If you use this toolkit in your work, please cite:

 Oliver, A. (2014) *WN-Toolkit: Automatic generation of WordNets following the expand model* Proceedings of the 7th International Global WordNet Conference. January 25-29, 2014. Tartu. Estonia. pp. 7-15. ISBN: 978-9949-32-492-7 [pdf][1]

@@ -20,11 +20,11 @@

 ##Programs

-In this section the programs of the tools are presented. Some basic instructions for their use are also provided.
+In this section the programs of the Toolkit are presented. Some basic instructions for their use are also provided.

 ###Miscellaneous tools

-In the **0_pwn30-monosemic** directory of the distribution we can find a program called **createmonosemicwordlist.py**. This program extracts the list of monosemic variants (that is, the variants associated to only one synset) from the data.adj, data.adv, data.noun, data.verb, index.adj, index.adv, index.noun and index.verb files of the Princeton WordNet. The program creates 3 files: one for all monosemic variants, one for those written in lower-case (-min.txt) and one for those written with the first letter in upper-case (-maj):
+In the **0_pwn30-monosemic** directory of the distribution we can find a program called **createmonosemicwordlist.py**. This program extracts the list of monosemic variants (that is, the variants associated with only one synset) from the data.adj, data.adv, data.noun, data.verb, index.adj, index.adv, index.noun and index.verb files of the Princeton WordNet. The program creates 3 files: one for all monosemic variants, one for those written in lower-case (-min.txt) and one for those written with the first letter in upper-case (-maj.txt).

 The program has the **-h** option to show the arguments:

@@ -97,13 +97,14 @@
 13449450-n canvi_climàtic
 ~~~~~~~~~~

-We offer a variation of **wndictionary.py** called **wndictionary-normalizecaps.py** that implements a simple strategy to normalize the capitalization of the entries. This is necessary for some dictionaries where all the entries start with upper cap letters. This strategy only works for languages with similar capitalization rules as English (only proper names are written with upper cap letters). This program needs an additional argument:
+We offer a variation of **wndictionary.py** called **wndictionary-normalizecaps.py** that implements a simple strategy to normalize the capitalization of the entries. This is necessary for some dictionaries where all the entries start with upper case letters. This strategy only works for languages with capitalization rules similar to those of English (only proper names are written with upper case letters). This program needs an additional argument:

 ~~~~~~~~~~
   -n FILE, --normalize FILE
                         The English wn-data-eng.tab file from OMW
 ~~~~~~~~~~

+OMW: Open Multilingual WordNet

 Along with **wndictionary.py** we distribute some programs for the creation of bilingual dictionaries with the required format from some freely available dictionaries:
@@ -210,7 +211,7 @@
 es|c veure|v tan|r còmode|a ser|v recte|a .|c
 ~~~~~~~~~~

-Then the program uses a simple word-alignment stategy to get the target language variants, as:
+Then the program uses a simple word-alignment strategy to get the target language variants, as:

 ~~~~~~~~~~
 00048475-r ara
@@ -227,7 +228,7 @@

 We also provide the **ukbtosenses.py** program to transform the Freeling+UKB tagged corpus into a suitable format for **synset-word-alignment.py**. 2 arguments must be provided: the path and name of the input file and the path and name of the output file.

-In the *Resources* section we provide several corpus pre-processed for the use of this strategy. They are fully described in that section.
+In the **Resources** section we provide several pre-processed corpora for use with this strategy. They are fully described in that section.

 ###Evaluation tools

@@ -275,7 +276,7 @@
 00176150-a favorable   propici auspicious  auguring favorable circumstances and good luck
 ~~~~~~~~~~

-That is: synset, evaluated target-language variant, reference target language variants, Englisg variants, English definition
+That is: synset, evaluated target-language variant, reference target language variants, English variants, English definition

 The nonevaluated.txt output offers similar information, but lacks the reference target language variants.
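
The evaluated output described above can be read into named fields with a short Python sketch. It is an illustration only and not part of the WN-Toolkit: the names `EvaluatedEntry` and `parse_evaluated_line` are hypothetical, and the tab separator between the five fields is an assumption.

```python
from collections import namedtuple

# Hypothetical record type; field order follows the description above.
EvaluatedEntry = namedtuple(
    "EvaluatedEntry",
    "synset evaluated_variant reference_variants english_variants english_definition",
)


def parse_evaluated_line(line):
    """Split one evaluated line into its five fields.

    Assumption: the fields are separated by tab characters.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 5:
        raise ValueError("expected 5 tab-separated fields, got %d" % len(fields))
    return EvaluatedEntry(*fields)
```

A line that does not have exactly five fields raises `ValueError`, so lines from nonevaluated.txt (which lack the reference variants) would need a separate record type.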

@@ -395,7 +396,7 @@
 acclimate  v   aclimatar
 ~~~~~~~~~~

-If your language is not listed, you can create the dictionaris with the programs distributed with the WN-Toolkit.
+If your language is not listed, you can create the dictionaries with the programs distributed with the WN-Toolkit.

 ###Parallel corpora

@@ -608,7 +609,7 @@
 ablecer|v el|c código|n aduanero|a comunitario|a
 ~~~~~~~~~~

-These files should be splitted for the use of the WN-Toolkit. The easiest way to do so is using the Linux command **cut**. As the program need the sense-tagged English corpus, we can get it doing, for example:
+These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:

 ~~~~~~~~~~
 cut -f 2 DGT-TM-preprocess-simpletagged-en-es.txt &gt; DGT-TM-senses-eng.txt
@@ -619,6 +620,8 @@
 ~~~~~~~~~~
 cut -f 4 DGT-TM-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
 ~~~~~~~~~~
+
+If you're working with Windows, try the CoreUtils for Windows.
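
Where **cut** is not available, the same column extraction can be sketched in Python. This is a minimal sketch and not part of the WN-Toolkit: the function name `cut_field` is hypothetical, and it assumes UTF-8 text with tab-separated fields, as in the examples above.

```python
def cut_field(src_path, dst_path, field):
    """Write the 1-based tab-separated column `field` of src_path to dst_path.

    Assumptions: both files are UTF-8 text and fields are separated by tabs.
    """
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            columns = line.rstrip("\r\n").split("\t")
            # The slice is empty when the line has fewer fields than requested,
            # so such lines produce an empty output line.
            dst.write("".join(columns[field - 1:field]) + "\n")


# Example calls mirroring the two cut commands above:
# cut_field("DGT-TM-preprocess-simpletagged-en-es.txt", "DGT-TM-senses-eng.txt", 2)
# cut_field("DGT-TM-preprocess-simpletagged-en-es.txt", "DGT-TM-tagged-cat.txt", 4)
```

Unlike **cut**, this sketch writes an empty line whenever a line has fewer fields than the one requested.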

 ####EMEA-03 Corpus
@@ -686,14 +689,16 @@
 These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:

 ~~~~~~~~~~
-cut -f 2 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-senses-eng.txt
+cut -f 2 en-es-preprocess-simpletagged-en-es.txt &gt; EMEA-senses-eng.txt
 ~~~~~~~~~~

 and we also need the target language tagged corpus using simple tags:

 ~~~~~~~~~~
-cut -f 4 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
-~~~~~~~~~~
+cut -f 4 en-es-preprocess-simpletagged-en-es.txt &gt; EMEA-tagged-cat.txt
+~~~~~~~~~~
+
+If you're working with Windows, try the CoreUtils for Windows.

 ####UNCorpus

@@ -771,15 +776,16 @@
 These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:

 ~~~~~~~~~~
-cut -f 2 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-senses-eng.txt
+cut -f 2 en-es-preprocess-simpletagged-en-es.txt &gt; UNCorpus-senses-eng.txt
 ~~~~~~~~~~

 and we also need the target language tagged corpus using simple tags:

 ~~~~~~~~~~
-cut -f 4 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
-~~~~~~~~~~
-
+cut -f 4 en-es-preprocess-simpletagged-en-es.txt &gt; UNCorpus-tagged-cat.txt
+~~~~~~~~~~
+
+If you're working with Windows, try the CoreUtils for Windows.

&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 21:48:36 -0000</pubDate><guid>https://sourceforge.net652966db886aeedb6920b7cf6bfbbdad6276f847</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 15:54:10 -0000</pubDate><guid>https://sourceforge.nete41b44e2bf70667fcac468e8e96836f363d1f4bb</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v24
+++ v25
@@ -620,6 +620,7 @@
 cut -f 4 DGT-TM-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
 ~~~~~~~~~~

+
 ####EMEA-03 Corpus

 We offer a pre-processed version of this corpus for several languages:
@@ -657,7 +658,8 @@

 ~~~~~~~~~~
 Abilify is a medicine containing the active substance aripiprazole.    Abilify
- 02604760-v a 00612160-n 02701210-v the 00035465-a 05921123-n aripiprazole .   Abilify es un medicamento que contiene el principio activo aripiprazol.
+ 02604760-v a 00612160-n 02701210-v the 00035465-a 05921123-n aripiprazole .   
+Abilify es un medicamento que contiene el principio activo aripiprazol.
 ~~~~~~~~~~

 Some of these corpora have been further preprocessed and the tagged version (using simple tags) of the target language text has been added. This has been done for the following languages:
@@ -675,7 +677,8 @@

 ~~~~~~~~~~
 Abilify is a medicine containing the active substance aripiprazole.    Abilify
- 02604760-v a 00612160-n 02701210-v the 00035465-a 05921123-n aripiprazole .   Abilify es un medicamento que contiene el principio activo aripiprazol. abilify
+ 02604760-v a 00612160-n 02701210-v the 00035465-a 05921123-n aripiprazole .   
+Abilify es un medicamento que contiene el principio activo aripiprazol.    abilify
 |n ser|v uno|c medicamento|n que|c contener|v el|c principio|n activo|a aripipr
 azol|n .|c
 ~~~~~~~~~~
@@ -692,5 +695,94 @@
 cut -f 4 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
 ~~~~~~~~~~

+####UNCorpus
+
+
+We offer a pre-processed version of this corpus for several languages:
+
+├── ar-en-preprocess.txt.gz
+├── en-es-preprocess.txt.gz
+├── en-fr-preprocess.txt.gz
+├── en-ru-preprocess.txt.gz
+└── en-zh-preprocess.txt.gz
+
+
+
+These files have several fields separated by tabs:
+
+* The English text.
+* The English text where sense-tagged words are replaced by their PWN synset. Word sense disambiguation and tagging were performed with Freeling + UKB.
+* The corresponding target language text.
+
+Here we can see an example:
+
+~~~~~~~~~~
+Adopted at the 81st plenary meeting, on 4 December 2000, on the recommendation 
+of the Committee (A/55/602/Add.2 and Corr.1, para. 94),The draft resolution rec
+ommended in the report was sponsored in the Committee by: Bolivia, Cuba, El Sal
+vador, Ghana and Honduras. by a recorded vote of 106 to 1, with 67 abstentions,
+ as follows:   02381726-v at the 02194255-a 00528167-a 08307589-n , on ??] , o
+n the 06694540-n of the 08324514-n ( A/55/602/Add.2 and Corr.1 , 13671310-n . 9
+4 ) , The 13377268-n 06511874-n 00882948-v in the 06681551-n 02445925-v 0221994
+0-v in the 08324514-n by Fd 08852843-n , 08750334-n , 08738272-n , 08946187-n a
+nd 08737716-n . by a 01000214-v 00183505-n of 106 to 1 , with 67 04882622-n , a
+s 02346895-v Fd    Aprobada en la 81a. sesión plenaria, celebrada el 4 de diciembr
+e de 2000, por recomendación de la Comisión (A/55/602/Add.2, párr. 94),El proye
+cto de resolución recomendado en el informe fue patrocinado en la Comisión por 
+los países siguientes: Bolivia, Cuba, El Salvador, Ghana y Honduras. en votació
+n registrada de 106 votos contra uno y 67 abstenciones, como sigue:
+~~~~~~~~~~
+
+Some of these corpora have been further preprocessed and the tagged version (using simple tags) of the target language text has been added. This has been done for the following languages:
+
+
+├── en-es-preprocess-simpletagged.txt.gz
+├── en-fr-preprocess-simpletagged.txt.gz
+├── en-ru-preprocess-simpletagged.txt.gz
+
+
+
+Here we can see an example:
+
+~~~~~~~~~~
+Adopted at the 81st plenary meeting, on 4 December 2000, on the recommendation 
+of the Committee (A/55/602/Add.2 and Corr.1, para. 94),The draft resolution rec
+ommended in the report was sponsored in the Committee by: Bolivia, Cuba, El Sal
+vador, Ghana and Honduras. by a recorded vote of 106 to 1, with 67 abstentions,
+ as follows:   02381726-v at the 02194255-a 00528167-a 08307589-n , on ??] , o
+n the 06694540-n of the 08324514-n ( A/55/602/Add.2 and Corr.1 , 13671310-n . 9
+4 ) , The 13377268-n 06511874-n 00882948-v in the 06681551-n 02445925-v 0221994
+0-v in the 08324514-n by Fd 08852843-n , 08750334-n , 08738272-n , 08946187-n a
+nd 08737716-n . by a 01000214-v 00183505-n of 106 to 1 , with 67 04882622-n , a
+s 02346895-v Fd    Aprobada en la 81a. sesión plenaria, celebrada el 4 de diciembr
+e de 2000, por recomendación de la Comisión (A/55/602/Add.2, párr. 94),El proye
+cto de resolución recomendado en el informe fue patrocinado en la Comisión por 
+los países siguientes: Bolivia, Cuba, El Salvador, Ghana y Honduras. en votació
+n registrada de 106 votos contra uno y 67 abstenciones, como sigue:    aprobar
+|v en|c el|c 81a|c .|c sesión|n plenario|a ,|c celebrar|v el|c [??|c ,|c por|c 
+recomendación|n de|c el|c comisión|n (|c A/55/602/Add.2|c ,|c párr.|n 94|c )|c 
+,|c el|n proyecto|n de|c resolución|n recomendar|v en|c el|c informe|n ser|v pa
+trocinar|v en|c el|c comisión|n por|c el|c país|n siguiente|a |c bolivia|n ,|c 
+cuba|n ,|c el_salvador|n ,|c ghana|n y|c honduras|n .|c en|c votación|n registr
+ar|v de|c 106|c voto|n contra|c 1|c y|c 67|c abstención|n ,|c como|c seguir|v |
+c
+~~~~~~~~~~
+
+These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:
+
+~~~~~~~~~~
+cut -f 2 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-senses-eng.txt
+~~~~~~~~~~
+
+and we also need the target language tagged corpus using simple tags:
+
+~~~~~~~~~~
+cut -f 4 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
+~~~~~~~~~~
+
+
+
+
+
 [[members limit=20]]
 [[download_button]]
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 15:32:18 -0000</pubDate><guid>https://sourceforge.net154d13794a0deaae906db9b032ca1b2a812703d3</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v23
+++ v24
@@ -620,5 +620,77 @@
 cut -f 4 DGT-TM-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
 ~~~~~~~~~~

+####EMEA-03 Corpus
+
+We offer a pre-processed version of this corpus for several languages:
+
+├── bg-en-preprocess.txt.gz
+├── cs-en-preprocess.txt.gz
+├── da-en-preprocess.txt.gz
+├── de-en-preprocess.txt.gz
+├── el-en-preprocess.txt.gz
+├── en-es-preprocess.txt.gz
+├── en-et-preprocess.txt.gz
+├── en-fi-preprocess.txt.gz
+├── en-fr-preprocess.txt.gz
+├── en-hu-preprocess.txt.gz
+├── en-it-preprocess.txt.gz
+├── en-lt-preprocess.txt.gz
+├── en-lv-preprocess.txt.gz
+├── en-mt-preprocess.txt.gz
+├── en-nl-preprocess.txt.gz
+├── en-pl-preprocess.txt.gz
+├── en-pt-preprocess.txt.gz
+├── en-ro-preprocess.txt.gz
+├── en-sk-preprocess.txt.gz
+├── en-sl-preprocess.txt.gz
+└── en-sv-preprocess.txt.gz
+
+
+These files have several fields separated by tabs:
+
+* The English text.
+* The English text where sense-tagged words are replaced by their PWN synset. Word sense disambiguation and tagging were performed with Freeling + UKB.
+* The corresponding target language text.
+
+Here we can see an example:
+
+~~~~~~~~~~
+Abilify is a medicine containing the active substance aripiprazole.    Abilify
+ 02604760-v a 00612160-n 02701210-v the 00035465-a 05921123-n aripiprazole .   Abilify es un medicamento que contiene el principio activo aripiprazol.
+~~~~~~~~~~
+
+Some of these corpora have been further preprocessed and the tagged version (using simple tags) of the target language text has been added. This has been done for the following languages:
+
+
+├── de-en-preprocess-simpletagged.txt.gz
+├── en-es-preprocess-simpletagged.txt.gz
+├── en-fr-preprocess-simpletagged.txt.gz
+├── en-it-preprocess-simpletagged.txt.gz
+├── en-nl-preprocess-sipletagged.txt.gz
+├── en-pt-preprocess-simpletagged.txt.gz
+
+
+Here we can see an example:
+
+~~~~~~~~~~
+Abilify is a medicine containing the active substance aripiprazole.    Abilify
+ 02604760-v a 00612160-n 02701210-v the 00035465-a 05921123-n aripiprazole .   Abilify es un medicamento que contiene el principio activo aripiprazol. abilify
+|n ser|v uno|c medicamento|n que|c contener|v el|c principio|n activo|a aripipr
+azol|n .|c
+~~~~~~~~~~
+
+These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:
+
+~~~~~~~~~~
+cut -f 2 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-senses-eng.txt
+~~~~~~~~~~
+
+and we also need the target language tagged corpus using simple tags:
+
+~~~~~~~~~~
+cut -f 4 en-es-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-cat.txt
+~~~~~~~~~~
+
 [[members limit=20]]
 [[download_button]]
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 15:23:18 -0000</pubDate><guid>https://sourceforge.net156c5dc6755b12ea2a367068f65538915a1f9f4b</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v22
+++ v23
@@ -558,12 +558,14 @@

 These files have several fields separated by tabs:

-* The English text
+* The English text.
 * The English text where sense-tagged words are replaced by their PWN synset. Word sense disambiguation and tagging were performed with Freeling + UKB.
-* The corresponding target language text
-
-~~~~~~~~~~
-CCorrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending 
+* The corresponding target language text.
+
+Here we can see an example:
+
+~~~~~~~~~~
+Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending 
 Regulation (EEC) No 2454/93 laying down provisions for the implementation of Co
 uncil Regulation (EEC) No 2913/92 establishing the Community Customs Code  
 06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 0
@@ -585,6 +587,7 @@
 ├── DGT-TM-preprocess-simpletagged-en-nl.txt.gz
 ├── DGT-TM-preprocess-simpletagged-en-pt.txt.gz

+Here we can see an example:

 ~~~~~~~~~~
 Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending 
@@ -603,7 +606,6 @@
 e|c fijar|v determinar|v disposición|n de|c aplicación|n de|c el|c reglamento|n
  (|c cee|n )|c no|r 2913/92|c de|c el|c consejo|n ,|c por|c el|c que|c se|c est
 ablecer|v el|c código|n aduanero|a comunitario|a
-
 ~~~~~~~~~~

 These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 15:15:58 -0000</pubDate><guid>https://sourceforge.neta3eeb0dceed54f0cf135988641271d526a2bb2ab</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v21
+++ v22
@@ -563,16 +563,16 @@
 * The corresponding target language text

 ~~~~~~~~~~
-Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending 
-Regulation (EEC) No 2454/93 laying down provisions for the implementation of 
-Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    
-06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 
-0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 
-01057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 
-2913/92 01647229-v the Community_Customs_Code  Corrección de errores del 
-Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica 
-el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones 
-de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece 
+CCorrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending 
+Regulation (EEC) No 2454/93 laying down provisions for the implementation of Co
+uncil Regulation (EEC) No 2913/92 establishing the Community Customs Code  
+06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 0
+0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 01
+057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 291
+3/92 01647229-v the Community_Customs_Code Corrección de errores del Regla
+mento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica el R
+eglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones de a
+plicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece 
 el código aduanero comunitario
 ~~~~~~~~~~

@@ -587,21 +587,23 @@

 ~~~~~~~~~~
-Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending
- Regulation (EEC) No 2454/93 laying down provisions for the implementation of 
-Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    
-06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 
-0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 
-01 057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n
- 2913/92 01647229-v the Community_Customs_Code Corrección de errores del 
-Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica 
-el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones 
-de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece 
-el código aduanero comunitario corrección|n de|c error|n de|c el|c reglamento|n 
-(|c ue|n )|c no|r 177/2010|c de|c el|c comisión|n ,|c de|c [??|c ,|c que|c modificar|v 
-el|c reglamento|n (|c cee|n )|c no|r 2454/93|c ,|c por|c el|c que|c se|c fijar|v determinar|v disposición|n de|c aplicación|n de|c el|c reglamento|n (|c cee|n )|c 
-no|r 2913/92|c de|c el|c consejo|n ,|c por|c el|c que|c se|c establecer|v el|c 
-código|n aduanero|a comunitario|a
+Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending 
+Regulation (EEC) No 2454/93 laying down provisions for the implementation of Co
+uncil Regulation (EEC) No 2913/92 establishing the Community Customs Code  
+06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 0
+0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 01
+057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 291
+3/92 01647229-v the Community_Customs_Code Corrección de errores del Regla
+mento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica el R
+eglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones de a
+plicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece 
+el código aduanero comunitario corrección|n de|c error|n de|c el|c reglamento|
+n (|c ue|n )|c no|r 177/2010|c de|c el|c comisión|n ,|c de|c [??|c ,|c que|c mo
+dificar|v el|c reglamento|n (|c cee|n )|c no|r 2454/93|c ,|c por|c el|c que|c s
+e|c fijar|v determinar|v disposición|n de|c aplicación|n de|c el|c reglamento|n
+ (|c cee|n )|c no|r 2913/92|c de|c el|c consejo|n ,|c por|c el|c que|c se|c est
+ablecer|v el|c código|n aduanero|a comunitario|a
+
 ~~~~~~~~~~

 These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 15:11:06 -0000</pubDate><guid>https://sourceforge.net941da086c192901a8f034063aa93c968d9533673</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v20
+++ v21
@@ -563,7 +563,17 @@
 * The corresponding target language text

 ~~~~~~~~~~
-Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending Regulation (EEC) No 2454/93 laying down provisions for the implementation of Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 01057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 2913/92 01647229-v the Community_Customs_Code    Corrección de errores del Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece el código aduanero comunitario
+Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending 
+Regulation (EEC) No 2454/93 laying down provisions for the implementation of 
+Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    
+06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 
+0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 
+01057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 
+2913/92 01647229-v the Community_Customs_Code  Corrección de errores del 
+Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica 
+el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones 
+de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece 
+el código aduanero comunitario
 ~~~~~~~~~~

 Some of these corpora have been further preprocessed and the tagged version (using simple tags) of the target language text has been added. This has been done for the following languages:
@@ -577,10 +587,34 @@

 ~~~~~~~~~~
-Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending Regulation (EEC) No 2454/93 laying down provisions for the implementation of Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 01 057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 2913/92 01647229-v the Community_Customs_Code   Corrección de errores del Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece el código aduanero comunitario  corrección|n de|c error|n de|c el|c reglamento|n (|c ue|n )|c no|r 177/2010|c de|c el|c comisión|n ,|c de|c [??|c ,|c que|c modificar|v el|c reglamento|n (|c cee|n )|c no|r 2454/93|c ,|c por|c el|c que|c se|c fijar|v determinar|v disposición|n de|c aplicación|n de|c el|c reglamento|n (|c cee|n )|c no|r 2913/92|c de|c el|c consejo|n ,|c por|c el|c que|c se|c establecer|v el|c código|n aduanero|a comunitario|a
-~~~~~~~~~~
-
-These files should be splitted for the use of the WN-Toolkit. The easiest way to do so is using the Linux command **cut**.
+Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending
+ Regulation (EEC) No 2454/93 laying down provisions for the implementation of 
+Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    
+06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 
+0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 
+01 057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n
+ 2913/92 01647229-v the Community_Customs_Code Corrección de errores del 
+Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica 
+el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones 
+de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece 
+el código aduanero comunitario corrección|n de|c error|n de|c el|c reglamento|n 
+(|c ue|n )|c no|r 177/2010|c de|c el|c comisión|n ,|c de|c [??|c ,|c que|c modificar|v 
+el|c reglamento|n (|c cee|n )|c no|r 2454/93|c ,|c por|c el|c que|c se|c fijar|v determinar|v disposición|n de|c aplicación|n de|c el|c reglamento|n (|c cee|n )|c 
+no|r 2913/92|c de|c el|c consejo|n ,|c por|c el|c que|c se|c establecer|v el|c 
+código|n aduanero|a comunitario|a
+~~~~~~~~~~
+
+These files should be split for use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**. As the program needs the sense-tagged English corpus, we can get it with, for example:
+
+~~~~~~~~~~
+cut -f 2 DGT-TM-preprocess-simpletagged-en-es.txt &gt; DGT-TM-senses-eng.txt
+~~~~~~~~~~
+
+We also need the target-language corpus tagged with simple tags:
+
+~~~~~~~~~~
+cut -f 4 DGT-TM-preprocess-simpletagged-en-es.txt &gt; DGT-TM-tagged-spa.txt
+~~~~~~~~~~
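
The corpora are distributed as `.txt.gz` files, so decompression and splitting can be combined in one pass. The following is a minimal sketch, not part of the WN-Toolkit: `extract_field` is a hypothetical helper name, the output file names are illustrative, and it assumes `gzip`, `cut`, `tee` and `wc` are available. Each call prints the number of lines it wrote.

```shell
# Hypothetical helper (not shipped with the WN-Toolkit).
# $1: gzipped tab-separated corpus, $2: field number, $3: output file.
# Writes the chosen field to the output file and prints the line count.
extract_field() {
    gzip -dc "$1" | cut -f "$2" | tee "$3" | wc -l
}

corpus=DGT-TM-preprocess-simpletagged-en-es.txt.gz
if [ -f "$corpus" ]; then
    # Field 2: sense-tagged English text
    extract_field "$corpus" 2 DGT-TM-senses-eng.txt
    # Field 4: target-language text tagged with simple tags
    extract_field "$corpus" 4 DGT-TM-tagged-spa.txt
fi
```

Both calls should print the same number, since each line of the corpus holds one aligned segment; a mismatch would suggest a damaged or truncated file.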

 [[members limit=20]]
 [[download_button]]
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 15:08:07 -0000</pubDate><guid>https://sourceforge.net6391d5d988553965854cc737c68d237a92833376</guid></item><item><title>Home modified by Antoni Oliver</title><link>https://sourceforge.net/p/wn-toolkit/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v19
+++ v20
@@ -352,7 +352,7 @@
 ├── wiktionary-en-te.txt
 └── wiktionary-en-zh.txt

-#Wikipedia dictionaries
+####Wikipedia dictionaries

 We also offer several dictionaries created from Wikipedia (2012-08-05 dump):

@@ -530,8 +530,57 @@
 └── pwgc30-tagged-spa.txt

-
-
+####DGT_TM-release2013
+
+We offer a pre-processed version of this corpus for several languages:
+
+├── DGT-TM-preprocess-en-bg.txt.gz
+├── DGT-TM-preprocess-en-cs.txt.gz
+├── DGT-TM-preprocess-en-da.txt.gz
+├── DGT-TM-preprocess-en-de.txt.gz
+├── DGT-TM-preprocess-en-el.txt.gz
+├── DGT-TM-preprocess-en-es.txt.gz
+├── DGT-TM-preprocess-en-et.txt.gz
+├── DGT-TM-preprocess-en-fi.txt.gz
+├── DGT-TM-preprocess-en-fr.txt.gz
+├── DGT-TM-preprocess-en-hu.txt.gz
+├── DGT-TM-preprocess-en-it.txt.gz
+├── DGT-TM-preprocess-en-lt.txt.gz
+├── DGT-TM-preprocess-en-lv.txt.gz
+├── DGT-TM-preprocess-en-mt.txt.gz
+├── DGT-TM-preprocess-en-nl.txt.gz
+├── DGT-TM-preprocess-en-pl.txt.gz
+├── DGT-TM-preprocess-en-pt.txt.gz
+├── DGT-TM-preprocess-en-ro.txt.gz
+├── DGT-TM-preprocess-en-sk.txt.gz
+├── DGT-TM-preprocess-en-sl.txt.gz
+└── DGT-TM-preprocess-en-sv.txt.gz
+
+These files have several tab-separated fields:
+
+* The English text
+* The English text where sense-tagged words are replaced by their PWN synset. Word sense disambiguation and tagging were performed with Freeling + UKB.
+* The corresponding target language text
+
+~~~~~~~~~~
+Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending Regulation (EEC) No 2454/93 laying down provisions for the implementation of Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 01057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 2913/92 01647229-v the Community_Customs_Code    Corrección de errores del Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece el código aduanero comunitario
+~~~~~~~~~~
+
+Some of these corpora have been further preprocessed, and a tagged version (using simple tags) of the target language text has been added as a fourth field. This has been done for the following languages:
+
+├── DGT-TM-preprocess-simpletagged-en-de.txt.gz
+├── DGT-TM-preprocess-simpletagged-en-es.txt.gz
+├── DGT-TM-preprocess-simpletagged-en-fr.txt.gz
+├── DGT-TM-preprocess-simpletagged-en-it.txt.gz
+├── DGT-TM-preprocess-simpletagged-en-nl.txt.gz
+└── DGT-TM-preprocess-simpletagged-en-pt.txt.gz
+
+
+~~~~~~~~~~
+Corrigendum to Commission Regulation (EU) No 177/2010 of 2 March 2010 amending Regulation (EEC) No 2454/93 laying down provisions for the implementation of Council Regulation (EEC) No 2913/92 establishing the Community Customs Code    06769578-n to Commission_Regulation ( 08173515-n ) 07205104-n 177/2010 of ??] 0 0205885-v 06664051-n ( 08173515-n ) 07205104-n 2454/93 01494310-v 00096089-r 01 057200-n for the 00044150-n of Council_Regulation ( 08173515-n ) 07205104-n 2913/92 01647229-v the Community_Customs_Code   Corrección de errores del Reglamento (UE) no 177/2010 de la Comisión, de 2 de marzo de 2010, que modifica el Reglamento (CEE) no 2454/93, por el que se fijan determinadas disposiciones de aplicación del Reglamento (CEE) no 2913/92 del Consejo, por el que se establece el código aduanero comunitario  corrección|n de|c error|n de|c el|c reglamento|n (|c ue|n )|c no|r 177/2010|c de|c el|c comisión|n ,|c de|c [??|c ,|c que|c modificar|v el|c reglamento|n (|c cee|n )|c no|r 2454/93|c ,|c por|c el|c que|c se|c fijar|v determinar|v disposición|n de|c aplicación|n de|c el|c reglamento|n (|c cee|n )|c no|r 2913/92|c de|c el|c consejo|n ,|c por|c el|c que|c se|c establecer|v el|c código|n aduanero|a comunitario|a
+~~~~~~~~~~
+
+These files should be split before use with the WN-Toolkit. The easiest way to do so is with the Linux command **cut**.

 [[members limit=20]]
 [[download_button]]
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Antoni Oliver</dc:creator><pubDate>Mon, 27 Jan 2014 13:50:52 -0000</pubDate><guid>https://sourceforge.netdf0fdb68833a22674b27f62f8c982fbd3229b977</guid></item></channel></rss>