Thank you for your answer, Dr. Antoni. But how can I use regular-expression wildcards to shorten or group patterns? Could you provide a simple example of how to use them, since they aren't mentioned in the documentation? I am looking forward to your response. Best regards, Bahgat Ahmed
Hello: In the patterns you should use the same tags as your tagger. If your tagger uses PPER, you should use PPER in the POS patterns. Remember that you can use wildcards from regular expressions to shorten or group patterns. Best regards Antoni Antoni Oliver González Estudis d'Arts i Humanitats Director del màster en Traducció i tecnologies aoliverg@uoc.edu ResearchGate https://www.researchgate.net/profile/Antoni_Oliver2 / Twitter https://twitter.com/aoliverg?lang=en / Linkedin https://www.linkedin.com/in/antonioliver/...
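As an illustration of the wildcard idea mentioned above, here is a minimal Python sketch. It assumes, purely for illustration, that a POS pattern is a space-separated sequence of tag expressions and that each expression is matched against a whole tag as a regular expression. The helper `matches_pattern` and the Penn-style tags are hypothetical; they are not taken from the TBXTools code or documentation, so the exact pattern syntax should be checked there.

```python
import re

def matches_pattern(pattern, tags):
    """Hypothetical helper: True if each tag fully matches the regex
    at the same position in the space-separated pattern."""
    parts = pattern.split()
    if len(parts) != len(tags):
        return False
    return all(re.fullmatch(p, t) is not None for p, t in zip(parts, tags))

# "NN.*" groups NN, NNS, NNP and NNPS into a single expression,
# so one pattern replaces four separate ones.
print(matches_pattern("JJ NN.*", ["JJ", "NNS"]))
print(matches_pattern("JJ NN.*", ["JJ", "VBZ"]))

# Alternation with "|" groups tags that share no common prefix.
print(matches_pattern("(JJ|NN.*) NN.*", ["NNP", "NN"]))
```

With a tagger that emits PPER instead of PRP, the same idea applies: the expressions in the pattern simply have to be written against that tagger's tag names.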
Thank you for your answer, Dr. Antoni. I am very sorry for my late follow-up question. So do you mean that if the tagger uses a different tagset than the ones you mentioned in your code (TBXTools, Freeling, or Conll) formats, the POS patterns must be modified? For example, if the tagger tags the "personal pronoun" by this tag "PPER", while the corresponding Conll tag is "PRP" should I replace any "PPER" tag with "PRP" tag for TBXTools to work correctly? So he|he|PPER must become ---> he|he|PRP ?...
Hello: You can use any tagger, BUT the POS patterns may need to be changed if the tagger uses a different tagset. The format of the tagged corpus should be as described: each token is represented as word_form|lemma|tag, and tokens are separated by spaces. Remember that we have moved our repository to GitHub: https://github.com/aoliverg/TBXTools Best regards Antoni Oliver
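To make the format above concrete, here is a small Python sketch that reads one line of a tagged corpus and checks that every token has exactly three fields. The function name `parse_tagged_line` and the sample sentence are made up for illustration; they are not part of TBXTools.

```python
def parse_tagged_line(line):
    """Split a line of space-separated word_form|lemma|tag tokens
    into a list of (word_form, lemma, tag) tuples."""
    tokens = []
    for item in line.split():
        parts = item.split("|")
        if len(parts) != 3:
            # A token with a missing field or an extra "|" is rejected.
            raise ValueError(f"malformed token: {item!r}")
        tokens.append(tuple(parts))
    return tokens

# Illustrative sentence, not taken from the JRC control corpus.
line = "he|he|PRP reads|read|VBZ books|book|NNS"
for word_form, lemma, tag in parse_tagged_line(line):
    print(word_form, lemma, tag)
```

A check like this can catch formatting problems (for example a word form that itself contains "|") before the corpus is loaded.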
Thank you very much for your answer, Dr. Antoni. I have another question, please. Here are the details: I did what you said. Moreover, I have experimented with different taggers and lemmatizers. I tested them against your sample tagged corpus "corpus-control-JRC-tagged-eng.txt" and compared the ratio of true to fake terms and the ratio of fake terms to all terms extracted. I used your terminology file "JRC-control-evaluation-terms2g3g-eng.txt" to get the true terms...
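The two ratios described in this post can be computed along the following lines. This is only a sketch: it assumes the extracted candidates and the reference terms are available as plain strings and that a candidate counts as "true" when it matches a reference term exactly. The function name `evaluate` and the sample terms are invented for illustration.

```python
def evaluate(candidates, reference):
    """Return (true_to_fake, fake_to_all) for a set of extracted candidates."""
    candidates = set(candidates)
    reference = set(reference)
    true_terms = candidates & reference   # candidates found in the reference list
    fake_terms = candidates - reference   # candidates not in the reference list
    # Guard against division by zero when there are no fake terms or no candidates.
    true_to_fake = len(true_terms) / len(fake_terms) if fake_terms else float("inf")
    fake_to_all = len(fake_terms) / len(candidates) if candidates else 0.0
    return true_to_fake, fake_to_all

# Invented sample data: three of the four candidates are in the reference.
extracted = ["member state", "european union", "the following", "legal basis"]
reference = ["member state", "european union", "legal basis", "internal market"]
print(evaluate(extracted, reference))
```

Comparing these two numbers across taggers only makes sense if the same POS patterns, adapted to each tagset, are used in every run.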
Hello: Sorry for the delay in my answer. The Freeling API connects with Freeling to tag the text and puts the output in this special format. You can use any tagger and adapt its output to the same format. Please note that the POS tags may differ from one tagger to another, so the POS patterns should be changed accordingly. Please also remember that the project has moved to GitHub, so the latest versions will be available only there: https://github.com/aoliverg/TBXTools Best regards Antoni
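As a rough idea of what "adapting the output" can look like, the sketch below converts a vertical, tab-separated tagger output (one token per row with word form, lemma and tag, and a blank row between sentences) into lines of word_form|lemma|tag tokens separated by spaces. The input layout is an assumption about a typical tagger, and `vertical_to_lines` is a hypothetical helper, not a TBXTools function.

```python
def vertical_to_lines(text):
    """Convert 'word<TAB>lemma<TAB>tag' rows into one line per sentence,
    with tokens written as word_form|lemma|tag and separated by spaces."""
    lines, current = [], []
    for row in text.splitlines():
        if not row.strip():
            # A blank row closes the current sentence.
            if current:
                lines.append(" ".join(current))
                current = []
            continue
        word, lemma, tag = row.split("\t")
        current.append(f"{word}|{lemma}|{tag}")
    if current:
        lines.append(" ".join(current))
    return lines

# Assumed tagger output for two short sentences.
sample = "he\the\tPRP\nreads\tread\tVBZ\n\nbooks\tbook\tNNS\n"
for line in vertical_to_lines(sample):
    print(line)
```

The tags are copied through unchanged, so the POS patterns still have to be written for whatever tagset the tagger uses.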
Freeling API functionalities
Thanks Antoni, much appreciated
Hello: Sorry for the delay in my answer. I'm currently working on the new version, and I'm moving the repository to GitHub: https://github.com/aoliverg/TBXTools In the coming weeks there will be new versions and the documentation. Best regards Antoni
New release
Hello: Did you also install the Freeling API? If you experience problems with the connection between TBXTools and Freeling, you can tag your corpus with Freeling, adapt the format of the corpus, and load the tagged corpus directly into TBXTools. I'm afraid I'm not able to help with the current version of TBXTools. I'm about to release a new version, and it will be fully documented. Best regards Antoni
Windows Freeling issue
Hello Antoni: I hope you had a wonderful holiday. I already have my own Python 3 environment installed on my computer: Anaconda, which has its own editor, Jupyter Notebook, and can run any Python file.
Hello Antoni: I hope you had a wonderful holiday. I already have several interpreters installed on my computer, such as Python 3 and Jupyter Notebook.
Hello Mohamed: Sorry for the delay in my answer. I'm on holiday until January 7th. Do you have a Python interpreter installed on your computer? You need a Python 3 interpreter. You can download it for free from www.python.org. Best regards Antoni
Any help, please?
Start using TBXTools