Suffix with dash

Help
hfarroukh
2012-05-15
2013-06-03
  • hfarroukh
    2012-05-15

    Hello,
    After searching a lot and reading the manual, I am still having trouble with what should be a simple suffix.

    I’d like to do the following:

    SFX Ezâ Y 2
    SFX Ezâ   0     -e      
    SFX Ezâ   0     -ye     

    Expected Result sample:
    OK: dast-e, zânu-ye
    NOK: dast-ye, zânu-e

    It doesn’t work, and I do not understand how to solve this simple case by compounding or other means.

    Thanks a lot.

     

  • Anonymous
    2012-05-15

    You have to add - to WORDCHARS. Here is an example that works:

    dast.dic
    1
    dast/E
    zânu/E

    dast.aff
    SET UTF-8
    WORDCHARS -

    SFX E Y 2
    SFX E   0     -e      
    SFX E   0     -ye     
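
    Note that the two rules above carry no condition, so both endings attach to every stem flagged E (dast-ye and zânu-e would be accepted too). A minimal sketch of one way to restrict them, using the optional condition field at the end of an SFX line. The vowel set below is an assumption taken from the alphabet given later in this thread and may need adjusting:

    ```
    SET UTF-8
    WORDCHARS -

    # Flag E: -e after a stem ending in a consonant, -ye after a vowel.
    # The vowel list [aâeêiou] is a guess, not a confirmed rule.
    SFX E Y 2
    SFX E   0     -e      [^aâeêiou]
    SFX E   0     -ye     [aâeêiou]
    ```

    With dast/E and zânu/E in the .dic file, this should accept dast-e and zânu-ye and reject dast-ye and zânu-e.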

     
  • hfarroukh
    2012-05-15

    Thank you very much for your fast reply. It doesn't work, and it seems that I didn't understand the concept behind this spell checking.
    What I want is to use hunspell as an add-on in Firefox for spell checking Persian written in a transcription script. That is, I write Persian with Latin characters according to some defined rules. Here is the alphabet (capital letters are also used):
    a, â, b, c, d, e, ê, f, g, h, i, j, ĵ, k, l, m, n, o, p, q, r, s, ŝ, t, u, v, w, x, y, z.
    It is similar to Esperanto.
    I installed the add-on for Persian and simply replaced its .dic and .aff files with the ones I created.
    First of all, the diacritic letters are not handled correctly, and your solution doesn't work either. Maybe I'm doing something completely wrong. Let's make it simple:
    dic-File fa.dic


    SET ISO8859-3
    dast/Ezâ
    zânu/Ezâ
    manê/Ezâ

    Aff-File fa.aff

    SET ISO8859-3
    WORDCHARS -

    #Ezâfe connection
    SFX Ezâ Y 3
    SFX Ezâ   0     -e      
    SFX Ezâ   ê     -e       ê
    SFX Ezâ   0     -ye     

    Do I have to take other things into account? Thanks.

     

  • Anonymous
    2012-05-16

    I saved your example files. Then I tested them with the hunspell program, like this: hunspell -d fa

    It resulted in an error:
    error: line 1: missing or bad word count in the dic file
    Hash Manager Error : 4

    This error occurs because the first line in the .dic file is not a number. The number should be (approximately) the number of stems you have in the .dic file, in this example 3. I replaced SET ISO8859-3 with 3 in fa.dic, and hunspell started without problems. When I wrote dast-e, it accepted the word.
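
    For reference, a sketch of fa.dic after that change. Only the first line differs from the file posted above; the SET line belongs in fa.aff only:

    ```
    3
    dast/Ezâ
    zânu/Ezâ
    manê/Ezâ
    ```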

    Hope this helps! Here is a man page that tells more about the mysteries of the hunspell dictionaries ;) http://www.manpagez.com/man/4/hunspell/

     
  • hfarroukh
    2012-05-16

    You are so kind. I appreciate your help very much. I'm going on a trip and will implement your solution next week.
    Once again thank you very much.

    As I also had problems with diacritic letters, I looked at the Esperanto files. There, these letters were represented as:
    ¼ = ĵ
    þ = ŝ

    Is this necessary, or can I save the file in Unicode format?
    Thanks a lot.
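
    On the encoding question: ¼ and þ are how the ISO-8859-3 bytes for ĵ and ŝ appear when a file is viewed as ISO-8859-1, so the Esperanto files most likely contain the real letters in that encoding. The working example earlier in this thread declares UTF-8 instead. A sketch of that alternative for the top of fa.aff, assuming both fa.aff and fa.dic are saved as UTF-8 (not tested in Firefox):

    ```
    # Must match the encoding in which both files are actually saved.
    SET UTF-8
    WORDCHARS -
    ```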

     
  • hfarroukh
    2012-05-16

    Please let me add another question. How can I use suffixes with wildcards?
    The goal is that any suffix beginning with a vowel changes ow to av when the word ends in ow:
    peyrow + i > peyravi
    peyrow + ân > peyravân
    row + am > ravam
    etc.

    I think I should somehow use the following feature, but I cannot find any example:
    COMPOUNDRULE compound_pattern
    Define custom compound patterns with a regex-like syntax. The first COMPOUNDRULE is a
    header with the number of the following COMPOUNDRULE definitions. Compound patterns consist
    compound flags, parentheses, star and question mark meta characters. A flag followed by a ‘*’
    matches a word sequence of 0 or more matches of words signed with this compound flag. A flag
    followed by a ‘?’ matches a word sequence of 0 or 1 matches of a word signed with this compound flag. See tests/compound*.* examples.

    Thanks.
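
    COMPOUNDRULE describes how whole words may be joined into compounds, so it is probably not the right tool here. The stripping field of an ordinary SFX rule may be enough: it removes characters from the end of the stem before the suffix is added. A minimal sketch to add to the existing .aff file, where the flag V is a hypothetical name and each vowel-initial suffix is listed explicitly (there does not appear to be a wildcard for "any suffix beginning with a vowel"):

    ```
    # Flag V (hypothetical): for stems ending in "ow".
    # Fields: flag, characters to strip, characters to add, condition.
    SFX V Y 3
    SFX V   ow    avi     ow
    SFX V   ow    avân    ow
    SFX V   ow    avam    ow
    ```

    The matching .dic entries would then carry the flag:

    ```
    2
    peyrow/V
    row/V
    ```

    With these rules, peyrow/V should yield peyravi, peyravân and peyravam, and row/V should yield ravi, ravân and ravam, while the bare stems peyrow and row stay valid.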