Hello: In the patterns you should use the same tags as your tagger. If your tagger uses PPER, you should use PPER in the POS patterns. Remember that you can use regular-expression wildcards to shorten or group patterns. Best regards Antoni Antoni Oliver González Estudis d'Arts i Humanitats Director del màster en Traducció i tecnologies aoliverg@uoc.edu ResearchGate https://www.researchgate.net/profile/Antoni_Oliver2 / Twitter https://twitter.com/aoliverg?lang=en / Linkedin https://www.linkedin.com/in/antonioliver/...
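As a rough illustration of how a regular-expression wildcard can group several tags into one pattern slot, here is a minimal Python sketch. It is not TBXTools code and does not reproduce its pattern syntax: the function name `matches_pattern`, the list-of-regexes representation, and the sample tags (ADJA, NN, NE, PPER, taken as examples of a German-style tagset) are assumptions made for the example.

```python
import re

def matches_pattern(tags, pattern):
    """Return True if every tag fully matches the regex in the same position.

    `pattern` is a list of regular expressions, one per token (hypothetical
    representation, chosen only for this illustration).
    """
    if len(tags) != len(pattern):
        return False
    return all(re.fullmatch(p, t) is not None for p, t in zip(pattern, tags))

# One pattern covers both "adjective + common noun" and "adjective + proper noun":
# "ADJ.*" matches any tag starting with ADJ, "N[NE]" matches NN or NE.
pattern = ["ADJ.*", "N[NE]"]

print(matches_pattern(["ADJA", "NN"], pattern))  # True
print(matches_pattern(["ADJA", "NE"], pattern))  # True
print(matches_pattern(["PPER", "NN"], pattern))  # False
```

Without the wildcards, the same coverage would need one pattern per tag combination, which is why grouping is worth it when the tagset is fine-grained.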
Hello: You can use any tagger, but the POS patterns may need to be changed if the tagger uses a different tagset. The format for a tagged corpus should be as described: each token is represented as word_form|lemma|tag, and tokens are separated by spaces. Remember that we have moved our repository to GitHub: https://github.com/aoliverg/TBXTools Best regards Antoni Oliver
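To make the word_form|lemma|tag format concrete, the following Python sketch reads one tagged line into tokens. It is an illustration only, not part of TBXTools: the names `Token` and `parse_tagged_line` and the sample sentence with its tags are assumptions.

```python
from typing import List, NamedTuple

class Token(NamedTuple):
    form: str
    lemma: str
    tag: str

def parse_tagged_line(line: str) -> List[Token]:
    """Split a line of space-separated word_form|lemma|tag tokens."""
    tokens = []
    for chunk in line.split():
        fields = chunk.split("|")
        if len(fields) != 3:
            # A token that is not exactly form|lemma|tag does not follow the format.
            raise ValueError(f"malformed token: {chunk!r}")
        tokens.append(Token(*fields))
    return tokens

# Hypothetical sample line; the tags depend entirely on the tagger used.
for token in parse_tagged_line("the|the|DT houses|house|NNS"):
    print(token.form, token.lemma, token.tag)
```

Running a check like this over a corpus before extraction is a cheap way to catch tagger output that was not converted correctly.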
Hello: Sorry for the delay in my answer. The Freeling API connects with Freeling to tag the text and puts the output in this special format. You can use any tagger and adapt its output to the same format. Please note that the POS tags may differ from one tagger to another, so the POS patterns should be changed accordingly. Please also remember that the project has moved to GitHub, so the latest versions will be available only there: https://github.com/aoliverg/TBXTools Best regards Antoni
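Adapting another tagger's output could look roughly like the Python sketch below. It assumes the tagger yields (word form, lemma, tag) triples per sentence; the function name `to_tagged_line` and the sample data are hypothetical, and this is not the Freeling API or any TBXTools function. Because the format uses spaces and "|" as separators, the sketch rejects fields that contain them rather than guessing how to escape them.

```python
from typing import Iterable, Tuple

def to_tagged_line(triples: Iterable[Tuple[str, str, str]]) -> str:
    """Join (form, lemma, tag) triples into one word_form|lemma|tag line."""
    chunks = []
    for triple in triples:
        for field in triple:
            # Empty fields, whitespace, or '|' would make the line ambiguous.
            if not field or "|" in field or any(ch.isspace() for ch in field):
                raise ValueError(f"field cannot be written safely: {field!r}")
        chunks.append("|".join(triple))
    return " ".join(chunks)

# Hypothetical tagger output for one sentence.
sentence = [("the", "the", "DT"), ("houses", "house", "NNS")]
print(to_tagged_line(sentence))  # the|the|DT houses|house|NNS
```

The tags are passed through unchanged, so the POS patterns still have to be written against the tagset of whichever tagger produced them.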
Hello: Sorry for the delay in my answer. I'm currently working on the new version, and I'm moving the repository to GitHub: https://github.com/aoliverg/TBXTools In the coming weeks there will be new versions and documentation. Best regards Antoni
Training scripts
Adding MTUOC3 directory
Remove prova1 and prova2 directories in preprocess
Preprocess scripts